Friday, June 22, 2012

When Should I Use An ORM?

I think like everyone, I go through the same journey whenever I find out about a new technology..

Huh? –> This is really cool –> I use it everywhere –> Hmm, sometimes it’s not so great

Remember when people were writing websites with XSLT transforms? Yes, exactly. XML is great for storing a data structure as a string, but you really don’t want to be coding your application’s business logic with it.

I’ve gone through a similar journey with Object Relational Mapping tools. After hand-coding my DALs, then code generating them, ORMs seemed like the answer to all my problems. I became an enthusiastic user of NHibernate through a number of large enterprise application builds. Even today I would still use an ORM for most classes of enterprise application.

However there are some scenarios where ORMs are best avoided. Let me introduce my easy to use, ‘when to use an ORM’ chart.

When_to_use_an_ORM

It’s got two axis, ‘Model Complexity’ and ‘Throughput’. The X-axis, Model Complexity, describes the complexity of your domain model; how many entities you have and how complex their relationships are. ORMs excel at mapping complex models between your domain and your database. If you have this kind of model, using an ORM can significantly speed up and simplify your development time and you’d be a fool not to use one.

The problem with ORMs is that they are a leaky abstraction. You can’t really use them and not understand how they are communicating with your relational model. The mapping can be complex and you have to have a good grasp of both your relational database, how it responds to SQL requests, and how your ORM comes to generate both the relational schema and the SQL that talks to it. Thinking of ORMs as a way to avoid getting to grips with SQL, tables, and indexes will only lead to pain and suffering. Their benefit is that they automate the grunt work and save you the boring task of writing all that tedious CRUD column to property mapping code.

The Y-axis in the chart, Throughput, describes the transactional throughput of your system. At very high levels, hundreds of transactions per second, you need hard-core DBA foo to get out of the deadlocked hell where you will inevitably find yourself. When you need this kind of scalability you can’t treat your ORM as anything other than a very leaky abstraction. You will have to tweak both the schema and the SQL it generates. At very high levels you’ll need Ayende level NHibernate skills to avoid grinding to a halt.

If you have a simple model, but very high throughput, experience tells me that an ORM is more trouble than it’s worth. You’ll end up spending so much time fine tuning your relational model and your SQL that it simply acts as an unwanted obfuscation layer. In fact, at the top end of scalability you should question the choice of a relational ACID model entirely and consider an eventually-consistent event based architecture.

Similarly, if your model is simple and you don’t have high throughput, you might be better off using a simple data mapper like SimpleData.

So, to sum up, ORMs are great, but think twice before using one where you have a simple model and high throughput.

13 comments:

Harry McIntyre said...

Asking the wrong question I reckon. Should you even be using a database?

Forget about NoSQL, the future is NoDB! Of course the solutions are fairly experimental:

Periodic RAM serialization http://nuget.org/packages/rambase

Prevalence pattern based
http://nuget.org/packages/LiveDomain.Core
http://nuget.org/packages/DevMagicFake
http://nuget.org/packages/Kiwi.Prevalence
http://nuget.org/packages/Sourcery (my own contribution to the field)

Given that you can rent a server with 64GB of RAM for £100 a month these days, you could get away with some running pretty decent sized systems.

Rob Kent said...

This is consistent with my recent experience. I would add two other caveats for ORMs:

1. The advent of Linq has made it easy for developers to write something like:

Customers.Where(c => c.Orders.Where(o => o.OrderDate > otherDate).Join(Products.Where(...))... ad infinitum.

Linq has made it easy to write bad Sql. It is also possible that the people writing the Linq don't even know Sql or why it might be bad.

2. In 90% of situations, the ORM is doing much more work than is necessary for the UI action. I have recently had to maintain some old Web Forms apps that use straight ADO with stored procs. My first impression was how much faster the page requests are than our recent MVC+ORM apps. I can only conclude that is because they are just fetching the few columns of data they actually need to display a page and not the complete object graph. Or maybe the repository setup is more expensive than we assume?

Rob Kent said...

Forgot to add:

3. During recent performance tuning on an MVC app we noticed a problem with Views constructed with partial Views created by RenderAction.

The root action queried the repository to get its data, and then each child action did the same. In some cases, the child actions were repeating expensive queries with different parameters, so you have stuff like:

Child action A: Users.Where(u => u.Lastname.StartsWith("A"))

Child action B: Users.Where(u => u.Lastname.StartsWith("B"))

This problem is exacerbated by the Sprint way of working where different developers pick up work items and code them in isolation. There is no holisitic view of the changes or the code base.

Andrei Rinea said...

There are times when you just can't increase the throughput any more with an ORM. ADO.NET to the rescue then.

Also good luck with full text search and other similar things..

Anonymous said...

Kent, The problem is not the ORM. You can select fields just like SQL. If someone does "Select * from ..." is that SQL's fault?

I have not used LINQ enough, but I do have the same feeling you do. I wonder what it is actually doing.

I look at what peopled do with ORM and say.. would do that with SQL? No? Then why are you doing it with an ORM.

BTW, I did a contract at Sprint about 10 years ago. Some really good and smart people there.

Anonymous said...

ORM should not imply automatic database schema generation, nor auto sql generation. ORMs such as NHibernate and Entity Framework that attempt to completely abstract away the database are flawed. You could argue that this is an anti-pattern, that should be avoided at all costs. I have never liked ceding so much power and flexibility to an ORM tool.

IMO, the best alternative is to use an ORM that supports stored procedures. Unfortunately, not many do that very well. I have used iBatis.NET (now MyBatis.NET) for over 5 years and it has proven to be the best and most reliable ORM out there. I get all the benefits of stored procedures AND a rich domain model of business objects...at the cost of maintaining rather straightforward XML mapping files. Sure, this is "boring" and "mundane" work, but its a one-time cost during the entire lifecycle of your system and it's codebase.

Anonymous said...

"The problem with ORMs is that they are a leaky abstraction."

You shouldn't abstract your ORM anyway.

Anonymous said...

"I get all the benefits of stored procedures " What benefits? In 2012, SPs hold little value except for database vendors and dbas.

Unknown said...

"What benefits? In 2012, SPs hold little value except for database vendors and dbas."

This is pretty obvious. I just rewrote and entire website that ran on a combination of ColdFusion and ASP.NET in MVC3. In many cases they used stored procedures but a few they used dynamic SQL mixed in the code. I can't tell you how many hours were saved for each section written with a stored procedure, but trust me this is a huge benefit. Oh and by the way I use Entity Framework with the stored procedures as well as linq to entities. EF with SPs works fine for my purposes. This is the article I used to learn how: http://blogs.msdn.com/b/diego/archive/2012/01/10/how-to-execute-stored-procedures-sqlquery-in-the-dbcontext-api.aspx

I'm also starting to feel like Linq to Entities seems like more trouble than it's worth to figure out 1)how to do some query I know how to do with my eyes closed and 2)once I figure out the syntax, figuring out what the hell it's doing behind the scenes.

I do like the objects though, so maybe stick with the SPs?

Lukas Eder said...

I really like the objectivity of this post, specifically with respect to the model complexity / throughput diagram. This diagram puts ORMs to a clear quadrant of usefulness where they really belong and where they are really good at.

At the same time, if vertical scalability can be guaranteed, it shows that you're probably best off knowing your database and your SQL very well, staying in full control of the SQL that is generated by your stack - if operating in the high-throughput area.

Unfortunately, I don't know the .NET ecosystem well enough, but in the Java world, jOOQ (http://www.jooq.org) covers relational aspects pretty well modelling SQL as a domain-specific language in Java. Unlike LINQ-to-SQL, however, jOOQ focuses on generating SQL instead of providing a more or less uniform querying API for all data stores, which leads to another leaky abstraction on a language level:
http://blog.jooq.org/2013/07/01/linq-and-java/

Dan Howard said...

Try persism - nice and simple. http://persism.sourceforge.net

Anonymous said...

People always forget, that you aren't forced to chose between white or black, but you can combine them as you like.
My general rule is: start simple! Use the ORM for everything, in the 95% of the cases it will create good enough queries and you don't have to worry about them.
If you find a performance problem with a query, don't throw away everything (black or white), but dig down to the problem, most of the times it's only a matter of ORM tuning. In very few cases you will write your own sql query and ORM can handle it too! In critical unique cases you can use a SP or a view (high performances).
But don't fall in the trap "I can write sql better", sure that you can, the question is, do you really want to write all the sql?
And don't fall in the trap "SP have always better performances", sure that they have, but very seldom you need a ferrari and it's very expensive (business logic in the DB it's not testable!)
There is an old proverb: "if all you have is a hammer, everything looks like a nail". And its meaning is, don't use the same tool for every task, but have different tools and use the most suitable one for that task!

Unknown said...

Take a look at what Stackoverflow do. They use
.Net MVC with EF most of the time but switch to Dapper when performance is an issue.