Before I left Turbulent back in 2018, I had started to become vocal about the pitfalls of using ORMs for any project more complex than a simple blog. I had started to develop this notion that as project complexity increases, the time saved by using an ORM goes down, until you reach a point where using an ORM actually costs you time rather than saving it.
At Audiokinetic, I recently started an internal project that tapped into my old reflexes as a backend web developer. Being in charge of choosing the tech stack, I had two main concerns guiding me:
- a non-trivial amount of C# code from a previous project must be re-used
- it must be easy for new developers to hop onto the project and familiarize themselves with it
Given those two requirements, I decided to go with a very vanilla tech stack: ASP.NET Core 6 for the server, and React/Redux for the client. Data would be stored in PostgreSQL, and we would use Entity Framework to bridge the .NET app with it. While I did remember my negative feelings toward ORMs, I theorized that perhaps I was unjustly projecting the drawbacks of the PHP ORMs onto ORMs in general. Perhaps a “production-grade” ORM developed by a major player would change my mind.
Boy, was I wrong! I regret nothing about the above tech stack EXCEPT the Entity Framework bridge. Let’s go over why.
ORMs couple all your code to the database schema
In an MVC or MVVM architecture, you’ll end up with a lot of different modules that operate on data stored in the database. When you use an ORM, the data model classes have dual roles:
- They represent the relational database schema as object-oriented structures
- They act as the main data representation for your application modules.
This dual role results in your modules being coupled to the database schema. Want to change a relationship from one-to-many to many-to-many due to changing requirements? Gotta update all your modules using those entities. Want to break out a set of members into a separate table? Have fun going through all modules again!
This is often encapsulated in code that looks like this:
var listOfEntities = myORM->FindAllX(...);
var entityProcessor = DI->GetService<EntityProcessor>();
var result = entityProcessor->Process(listOfEntities);
EntityProcessor here is coupled to the database schema. There’s an entire generation of programmers raised on the MVC architecture who just don’t know this coupling is bad; to them, it’s just a fact of life. And it doesn’t stop at the server side either. APIs tend to return JSON representations of your ORM data model, which means all consumers of the API on the client side are also coupled to the database schema!
Just say no! Cut the dependency at the source. Just let the EntityProcessor define the reduced data model it must operate on, and write a transformation function that converts the DB model to the EntityProcessor representation:
var listOfEntities = myORM->FindAllX(...);
var processorDataset = EntityProcessor::CreateDataset(listOfEntities);
var entityProcessor = DI->GetService<EntityProcessor>();
var result = entityProcessor->Process(processorDataset);
Here I’ve chosen to make the data transformation function a static function, but that’s just to illustrate that it’s best to keep it separate from the EntityProcessor module itself. It could just as well be a separate service module.
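To make this a bit more concrete, here is a minimal sketch in Python (picked only because it is short to read; the idea is the same in C#). Every name in it is hypothetical: UserEntity stands in for whatever class your ORM generates, ProcessorRow is the reduced model that the processor owns, and create_dataset plays the role of the transformation function.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UserEntity:
    """Hypothetical stand-in for an ORM-generated class mirroring a table."""
    id: int
    first_name: str
    last_name: str
    password_hash: str
    created_at: str


@dataclass(frozen=True)
class ProcessorRow:
    """The only data shape EntityProcessor knows about."""
    user_id: int
    display_name: str


def create_dataset(entities: Iterable[UserEntity]) -> list[ProcessorRow]:
    """The single place that knows both the DB model and the processor model."""
    return [
        ProcessorRow(user_id=e.id, display_name=f"{e.first_name} {e.last_name}")
        for e in entities
    ]


class EntityProcessor:
    def process(self, dataset: list[ProcessorRow]) -> dict[int, str]:
        # Works purely on ProcessorRow, so a schema change never reaches this code.
        return {row.user_id: row.display_name.upper() for row in dataset}
```

If first_name and last_name later move to another table, only create_dataset has to change; EntityProcessor and everything downstream of it stay untouched.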
But wait! Don’t you end up with a function that’s coupled to BOTH the database and the EntityProcessor data models? Yes. But the transformation function is much smaller, much easier to understand, Does Only One Thing, and is wrapped in trivial unit tests (you are writing unit tests, right?). Which brings me to my second point…
ORMs make unit-testing needlessly difficult
Any module coupled with the ORM’s data model is inherently more difficult to put under unit tests. This is especially true if the module actually uses the ORM to perform queries and updates. But even when that’s not the case, it’s still a hassle.
If you don’t believe me, just read this article from the official Entity Framework documentation: https://learn.microsoft.com/en-us/ef/core/testing/choosing-a-testing-strategy. Here, Microsoft is basically throwing their hands up and saying “just don’t bother with test doubles, it’s more trouble than it’s worth. Just test against the real database”.
OK sure, but if all my modules are coupled to the ORM, that means I can’t test any of my modules without a real database? What’s the point of using an ORM then? I thought it was to abstract away databases?
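For contrast, here is roughly what a test looks like when the module only depends on its own reduced data model. This is a sketch with made-up names (OrderLine, order_total_cents), written in Python for brevity; the point is simply that there is no database, no ORM context and no test double anywhere in sight.

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderLine:
    """Hypothetical reduced model owned by the module under test."""
    quantity: int
    unit_price_cents: int


def order_total_cents(lines: list[OrderLine]) -> int:
    """Module under test: it has no idea a database exists."""
    return sum(line.quantity * line.unit_price_cents for line in lines)


class OrderTotalTest(unittest.TestCase):
    def test_sums_all_lines(self):
        # Test data is built from plain values, no fixtures or connections.
        lines = [OrderLine(2, 500), OrderLine(1, 250)]
        self.assertEqual(order_total_cents(lines), 1250)

    def test_empty_order_is_zero(self):
        self.assertEqual(order_total_cents([]), 0)
```

Saved as a file, this runs with python -m unittest. The real database is then only needed for the code that actually talks to it.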
The database abstraction layer is a lie
Even if you use an ORM, you are still optimizing for your DBMS of choice. Some ORMs only work with a specific DBMS. Others, like Entity Framework, support multiple DBMSs, but some features don’t work the same across them, or don’t work at all. Most notably, foreign key constraints are enforced differently, or not enforced at all, by in-memory databases. ORMs are the very definition of a leaky abstraction.
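You don’t even need an ORM to see this kind of leak. The sketch below is an analogous illustration using SQLite through Python’s standard sqlite3 module, not Entity Framework’s in-memory provider: the same schema and the same insert are accepted or rejected depending on whether SQLite’s foreign_keys pragma is switched on. The helper name try_orphan_insert is mine.

```python
import sqlite3


def try_orphan_insert(enforce_foreign_keys: bool) -> bool:
    """Return True if a child row pointing at a missing parent is accepted."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    setting = "ON" if enforce_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {setting}")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, "
        "parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    try:
        # There is no parent with id 42, so this row is an orphan.
        conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 42)")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
```

With enforcement off the orphan row goes in silently; with it on, the insert raises an integrity error. Code that passes its tests against one configuration can therefore fail against another, whatever abstraction sits on top.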
What is the alternative?
In my mind, the two things that ORMs do really well are expressing queries using constructs of the native programming language you’re working in, and managing database connection pools. So let’s just extract those parts:
using My.ORM.Query;
string sql = select({"field1", "field2"})
.from("MyTable")
.where(or(equal("field1", "value1"), equal("field2", "value2"))); // lisp-style, could be expressed differently if you don't like it
ORM::iterator results = ORM::executeQuery(sql);
var processorDataset = EntityProcessor::CreateDataset(results);
// ... rest is the same
Again, since you’re supposed to decouple the modules from the database schema using a simple transformation function, why not have this transformation function operate on the raw database result iterator instead of an ORM data model? It’ll be faster, you’ll save memory by not having to pre-load the entire dataset, and ONLY that transformation function needs to be tested using a real database.
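Here is a runnable approximation of the pseudocode above, again in Python and again only a sketch. The query builder select_where_any is a toy I made up for the example, SQLite stands in for the real DBMS, and the cursor returned by execute plays the role of the raw result iterator.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class ProcessorRow:
    field1: str
    field2: str


def select_where_any(
    table: str, columns: list[str], filters: dict[str, str]
) -> tuple[str, list[str]]:
    """Toy query builder (hypothetical): returns SQL text plus bound parameters.

    Table and column names must come from trusted code; values are always bound.
    """
    where = " OR ".join(f"{col} = ?" for col in filters)
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {where}"
    return sql, list(filters.values())


def create_dataset(rows: Iterable[tuple]) -> Iterator[ProcessorRow]:
    """Transformation function: consumes raw rows lazily, one at a time."""
    for field1, field2 in rows:
        yield ProcessorRow(field1, field2)


def load_demo_rows() -> list[ProcessorRow]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (field1 TEXT, field2 TEXT)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?)",
        [("value1", "a"), ("b", "value2"), ("c", "d")],
    )
    sql, params = select_where_any(
        "MyTable", ["field1", "field2"], {"field1": "value1", "field2": "value2"}
    )
    cursor = conn.execute(sql, params)  # the cursor is itself a row iterator
    try:
        # list() is only here so the demo can close the connection afterwards;
        # real code could keep streaming from the generator instead.
        return list(create_dataset(cursor))
    finally:
        conn.close()
```

Because create_dataset is a generator, rows are converted as they are read from the cursor rather than being loaded into a full object graph first.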
Whenever you have to change your database schema, there are only two things you need to change:
- Repository services that generate the SQL queries you need for the transformation functions
- The transformation functions themselves.
Clean!