Golang ORM SQL and other sequel fun
I keep answering this question in a variety of different forums, so I wanted to just capture this as a blog post. Every few weeks/months, it seems someone asks about what ORM to use or why GORM is bad on the golang subreddit.
I have never used GORM but it generally takes an approach I’m not a fan of but I’ll touch on that in a bit. Let’s start with the basics.
What is an ORM?
ORM stands for an Object-Relational Mapping. In the simplest terms, it maps an object, or in golang this would be a struct to a relational table of some kind. Typical examples would be postgres, mysql, sqlite etc. Technically it doesn’t need to be a SQL database but it’s the most common use case.
It allows the developer to easily represent data manipulation in code without have to fall back on pure SQL. It also, in theory, allows you to more easily change from one database to another one without having to deal with the caveats of the SQL language differences and so on.
Approaches to ORMs
There are essentially two approaches to building ORMs and they diverge on their views on what should be the source of truth. Do you want your code to define the data layer, or do you want to define a data layer that your code uses?
- Code First Approach (CFA)
- Data First Approach (DFA)
(I’m not sure if these are the correct technical terms, but so I don’t have to type these 100 times over, I’ll use their short form of CFA and DFA.
Type Safety
This is a very opinionated view, but unless you have some type safety that you are gaining. This can be true of both approaches (CFA and DFA) there is little point to using an ORM. If all your ORM does is act as a glorified string builder, you might as well just write out your query as a string and use positional parameters (“?, $” etc) to avoid SQL injection and secure your code base.
The big advantage of type safety is that if you apply changes and additional columns are added, or data types are changed, you should be catching those at compile time not at runtime. Tests help but they’re not perfect or all encompassing. Any references to that column that don’t match the type should error out and force you to address it.
Why the Hate
This isn’t always due to the framework as much as the developer using it. Of course that also tends to end up with another developer picking up the mistakes that were done by generations of developers and struggling to decipher where the issue is.
One of the big issues with ORMs is that they make things too easy, they make it so developers don’t need to learn how SQL works (a required skill IMO) and they also tend to generate some fairly hideous queries.
- Has Many, One to Many etc.
There’s been more than a few times where I’ve seen a data model annotated with this pattern, which itself is fine. It simply says that a customer has multiple shipping addresses. Perfectly fine. He also has multiple orders, also fine.
The issue now comes when you need to update the Customer object. If you’re not careful and you need to update the user’s name, let’s say, instead of simply doing something like.
UPDATE customer set first_name = '' where customer_id = ?You end up with a massive read and update, which reads the customer, all addresses, entire order history and does an update across all entities.
- Black box element
This is somewhat related to the above but there is a tendency of things supposed to “just work” and not really understanding the underlying layer below it. Why setup indices, why a query is slow or even why a query suddenly changed from release A to B for no reason except that your ORM had an update and decided this new way was better than the latter.
This tends to be more true of the CFA approach but still something to look out for.
Code-First-Approach
This model lets you define what the database layer looks like in code. When deploying the app, part of the deployment involves the application applying a migration to the database to reflect the schema changes and update the database accordingly.
gorm is an example of that. If you’re coming from other ecosystems, django and RoR (Ruby on Rails) follow this pattern. They tend to get you into trouble, particularly if you have a database that is used by multiple services. You cannot have multiple code bases changing the same datastore and not expect any issues. If you have a small application that owns its database and nothing else will ever touch it but your app, you could get away with this approach. You are also heavily relying on the app to do the right thing.
So, a quick disclaimer. I generally strongly dislike this approach. I’ll reference a few choices here but I have more limited experience with these. If you do want to use CFA feel free to explore these further. They are what I would consider the standard choices for CFA.
Here are a few examples that follow that pattern.
Ent - Entity framework
In general it feels like Ent is appreciated a lot more in go. If I had to do a CFA approach, I think I would use Ent. The project is very well documented and also defines the schema and data type in code instead of doing reflection via tags like gorm.
There is a sort of code generation pattern but you’re still defining the schema in code rather than SQL so it is still a CFA pattern.
The generated code and syntax is also significantly more readable and easier to use, in my view.
u, err := client.User.
Query().
Where(user.Name("a8m")).
Only(ctx)
if err != nil {
return nil, fmt.Errorf("failed querying user: %w", err)
}GORM - is a pretty popular choice. It has a tendency to use struct tags for configurations, which are unnecessary and slow. Error handling is a bit odd returning a struct rather than an error that contains an error.
It also had a tendency to make it far too difficult to write your own queries. So you are either loading the object via:Query: https://gorm.io/docs/query.html
db.First(&user, 10) // limited ability to control the query generatedOR
SQL Builder: https://gorm.io/docs/sql_builder.html
db.Raw("SELECT id, name, age FROM users WHERE name = ?", "jinzhu").Scan(&result)Query isn’t very flexible and the SQL Builder removes any benefits, since you’re basically back to writing raw queries.
It also has a large user base. Give it a go and see if you like it. I have not personally used it to any great extent and have my biases and reservations around it based on the choices that were made.
The query above is fairly clear and easy to interpret what would be returned. It also returns a normal go error instead of its own specialized type.
Data-First-Approach
Data first approach, sometimes referred to as schema first approach, takes the opposite view. The source of truth is the database itself, or a schema file that defines the table structure, field types, etc.
I much prefer this approach, since the database is really the entity that rules this domain. No matter what your code things the schema should be, if the DB says otherwise, your code will break. So I really think it’s much better to connect to a DB or parse a SQL file to determine the data model for your ORM. There are a few options here that I HAVE spent some time on.
1. SQLBoiler - Honorary mention. The project is on maintenance mode and no longer used but I have used it in the past and enjoyed the experience overall. It’s worth a look at, though I would not use it beyond historical mention.
Jet - is a DFA where you point it to a live database and that is uses it to generate code. Then you have enums, tables and models generated and you can create your queries programmatically. This can vary from very verbose to terse depending on your preferences but you have to represent your query in code. The downside is that the documentation is a bit sparse, but can be found on the github’s wiki.
Example below:
SELECT(
Captures.Status,
Captures.ContextID,
Captures.Username,
Captures.RequestData,
Captures.StartTime,
Captures.EndTime,
).WHERE(Captures.CaptureID.EQ(UUID(captureId)).
Query(conn, &model)Bob - is a very nice alternative. It has a code generator and a decent level of documentation. You point it to a live database and it generates code based on a given configuration. It supports all the standard OSS databases you likely care about.
You can write raw query and use the generated code base. See examples below:
//raw query
psql.Select(
sm.Columns("id", "name"),
sm.From("users"),
sm.Where(psql.Quote("id").In(psql.Arg(100, 200, 300))),
)
//generated code
models.Query().One(ctx, db)userView.Query().All(ctx, db)
models.FindJet(ctx, db, 10).All()The syntax is similar to gorm which I really dislike. It’s nice that it has a bit better syntax for the Raw query but you lose the benefits of type safety when you use the raw query builder. The generated code feels again more black magic on what is being done. It’s a nice idea but not my preference.
SQLC - Is a VERY fast tool. It scans a database schema file and generates your code as defined by your configuration file. You can add additional tags and other customizations fairly easily. Supports multiple engines, works with both PGX and PG driver. It’s also very easy to use and setup.
The workflow involves creating a set of queries and having the SQLC generator parse those and create code to match. Example can be seen below and here.
-- name: ListAuthors :many
SELECT * FROM authors
ORDER BY name;
The downside is that as soon as soon as you want to filter or add more dynamic elements, you have to go back to the code. You either need a new variation say, ListAuthorsInNorthAmerica defined where you filter by location as well. Or you want filtered by location and ordered by books sold. It gets a bit messy and you constantly have to go back to the SQL and modify it, OR.. You define these place holders.
For example:
WHERE ( email = sqlc.narg(email) OR sqlc.narg(email) IS NULL )essentially every query will end up with an additional where clause. If the user sets the email arg then it’ll filter by email otherwise it’ll return true and pass through. This still requires foresight in knowing all possible filters you want to filter by. It’s really unfortunate, I wish SQLC allowed to expand on the original query.
You can export the Query generated as a public variable that you can pull into squirrel or goqu and get it to work but then you’re back to essentially using a string builder and losing the benefits of an ORM. I really, really want to use SQLC and the queries I do write work flawlessly. I have had little trouble getting almost any query to work. I just find it too limiting and detrimental to my work flow so I had to move on.
You can also if you don’t want to be explicit about which columns you want to query use this pattern.
SELECT(Captures.AllColumns).
WHERE(Captures.CaptureID.EQ(UUID(captureId)).
Query(conn, &model)Jet is, at this point, my preferred choice. I do tend to run into features that are not yet supported. Arrays are a big one that would be nice to add. The CLI configuration is a bit limited so you likely will end up writing a code generator to add your own customization. It’s not too bad but it is a bit awkward when a simple config file should suffice.
It also has some limitations with cross schema generation. For example, my current work project has the following schemas: tasks, audit, storage. The tasks schema has a table that references storage [cross schema foreign key], so the code generator needs to be executed first against storage then another time against the rest of the schemas; otherwise the code is unable to find the correct references.
Again, not a deal break but it just needs a bit more hand-holding and massaging to get to work properly.
Database Migrations
This topic isn’t really the intended topic of this post, but I wanted to mention some of the defacto names. Partly because many of DFA require you to have some solutions figured out. Some, like SQLC, actually require a schema.sql file to be generated. Here are a few popular options to choose from.
- Dbmate - My current tool of choice. You write your DDL changes, and rollback code the tool lets you apply migrations, rollbacks, etc. It also has the ability to be imported as a golang library, which I find invaluable for integration testing. I usually have a test that also performs the rollback operation and ensures at least the latest DDL can successfully both be applied and undone.
- Goose - Is pretty similar to dbmate. If I were to start a new project, I may consider using goose but the syntax for the SQLs is different enough to make it too annoying to really switch for no particular gains. Same concept. Create SQL, rollback SQL, import as a library, etc.
It does seem to be a bit more feature rich than dbmate, providing out of order migrations, for example. - Atlas - I have the least amount of experience with Atlas. It provides both SQL and code-driven migration.
Query Builders:
I wanted to reference the query builds I talked about. I won’t dive into any of these but their life purpose is to make it easier for you to generate a SQL query. They all typically have limited if any type safety but are sometimes useful when your ORM fails to accomplish your desired task.
- Squirrel—https://github.com/Masterminds/squirrel
- Goqu - https://github.com/doug-martin/goqu
- SQLX— https://github.com/jmoiron/sqlx
I’ll throw this on here as well as an honorable mention. It doesn’t do any type safety which in my view doesn’t qualify it as an ORM. Though it might be of use to some. - Bun - https://bun.uptrace.dev/ - Bun is a bit odd it provides a lot of tooling around it like fixtures, migration, distributed tracing but no type safety. Seems like such a huge oversight to me. It might be worth looking into if you’re still looking for the right tool for your project.
Final Verdict
Naturally, this is a very biased opinion. As much as I would like to constantly try different technologies to see which is best; at some point we all need to move on and write actual code. My current favorite is Jet which is what I’m using in my production code. I have a soft spot for SQLC but while it has a lot of promise, the lack of dynamic behavior is a bit unfortunate.
I current code base uses Jet and I have sqlx available for when Jet fails to provide me a path forward. All my Jet generated data models have the “db” tag added to them, which makes them compatible with SQLX if I need to fallback on that library instead.
There is no perfect solution like all things, but hopefully this provides some useful readings and references of the DB ecosystem in golang.
Please let me know if there’s anything I’ve missed or would like me to expand on. I think this is a mostly fair comparison. I tried to call out my biases when I expanded on any points I made. Feel free to disagree with me and use the links provided to explore the various tooling and make your own informed choices.
Thank you for reading. :-)
Accreditations
Since I’m too lazy to do art and AI just steels it anyways, I borrowed my gopher from this source.