3 Things I Wish I Knew When I Started Using Entity Framework

Entity Framework (EF) is Microsoft’s object-relational-mapping tool (ORM) that allows programmers to easily map database objects to C# object models. What attracts developers to Entity Framework is the automatic mapping of SQL tables to C# model classes, a feature that removes a lot of the tedious boiler plate code that developers would otherwise have to write.

The quickest way to see why developers love Entity Framework is with an example. Without Entity Framework, the code needed to SELECT some data from SQL looks something like this:

With Entity Framework and LINQ (Language Integrated Query), I can instead write something like this:

As a developer I want to be writing code like the latter example — look at how clear and concise it is! However, when I first started using Entity Framework I quickly came across specific scenarios where the ORM was not working like I would have expected. Below are my top 3 insights about Entity Framework that I had to learn the hard way.

1. Select() is your friend

When I first started using Entity Framework I was enamored by the idea of never having to write boring SQL connection and mapping code again. During that time, I was frequently writing quick concise LINQ queries like this all over the place:

All things were great for a while until I moved my code to production, where there was a lot more data, and my webpage started taking a very long time to return results. What was the problem?

My customer summary webpage was only displaying 4 columns from the Customer table to display summary information about customers: “State”, “Name”, “Address”, and “Email”. However, my Customer table had 50 columns in it. I foolishly thought since I was only using 4 columns from the table, Entity Framework would be smart enough to only bring those 4 columns of data back from the SQL server. Turns out I was wrong — what really was running behind the scenes was this:

Egad! I was running a SELECT *, which all SQL practitioners know is a big no-no both for performance as well as future maintainability of your queries!

One of the issues (features?) with Entity Framework is that by default the queries all run as SELECT *. Instead of bringing back all of the columns in a table, I figured I could limit my results by using Select() to choose which columns I wanted to return:

And there was much rejoicing amongst the DBAs.

2. Lazy Loading can be great or terrible

Lazy Loading is the idea that data isn’t pulled from the database until it is needed (hence the term “lazy”). This is a great feature if used correctly since it minimizes the amount of data that is transferred over the network. However, if you use lazy loading (which is often the default) unknowingly, you might be querying the SQL server more than is necessary.

Let’s say our Customer and Order models looks like this, where every Customer has many associated Orders:

If we write a LINQ query to bring back our Customer data and only use Customer data in our view, SQL just executes 1 query and things are efficient:

Controller:

View:

Result:

Bringing back fields exclusive to the Customer model.

SQL Trace:

One “SQL:BatchCompleted” means only a single query ran.

As you can see, Entity Framework only ran one query against the database and we are getting good performance. Very nice!

If we go ahead and change our View to display some Order data though:

View:

Result:

Now EF nows to bring dollar amounts back from our Order model.

SQL Trace:

The Order query is executed once for every customer — this could result in horrible performance.

Because of Lazy Loading, entity framework runs a query to retrieve an order for each customer! In my example data with 5 Customers with 1 Order each, Entity Framework runs 6 queries: 1 query to retrieve our Customers, and then 5 queries to retrieve orders for each of customers! Imagine if we were display more than 5 customer records — our SQL server would be getting slammed with queries!

The solution to this is to Eager Load the data. This means all of the data will be brought back in our initial query via a SQL JOIN. In order to do this, we just use the Include() method to retrieve all of the data up front:

Controller:

Trace:

EF generated SQL might be ugly, but it is efficient in this case.

Entity Framework only executes 1 query. This causes a little bit more upfront overhead since it transfers the Order data all at once, which would be a waste if we didn’t use that Order data in our View, but if we do plan to use Order data in our View, then using Include() reduces the number of queries that will run against the SQL server.

3. Sometimes it’s OK to not use Entity Framework LINQ queries

By default, Entity Framework generates some pretty interesting queries (interesting at least in the sense that no normal human being would structure a query like this). For example, in SQL I could write:

And the query generated by Entity Framework might look something like this:

Extents? Parameters? Aliases? What’s all of this extra junk in my query? Although the latter SQL statement works, it isn’t the prettiest on the eyes. And this is the example of simple query. Once you start writing complex LINQ queries using Entity Framework that involve the likes of Join(), GroupBy(), etc… the raw SQL that gets generated can get hairy and inefficient. In those cases, I’ve learned that it is perfectly to rely on good old t-sql stored procedures:

At the end of the day, we are using Entity Framework to make our lives easier. If a LINQ query gets too complicated or is generating inefficient queries, it is always fine to fall back on an old standby.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Leave a Reply

Your email address will not be published. Required fields are marked *