Someone recently told me that they don’t need to worry about SQL injection because they are using an ORM.
ORMs don’t automatically prevent SQL injection
Object-relational mapping (ORM) tools allow developers to easily access an application’s data layer without having to write lots of redundant code.
Most ORMs will safely parameterize certain types of queries. The following examples use Entity Framework and SQL Server, but these examples should apply to most other major ORMs and RDBMSs).
Our LINQ query making it easy to access our data layer:
And then the SQL query that our ORM generated.
You’ll notice the generated SQL query is using sp_executesql that has parameterized our input variable value “TFly37”. In this instance we can say the ORM did a good job following best practices in preventing a successful SQL injection attack.
But while ORMs may prevent some SQL injection attempts, there is no guarantee that they will prevent all injection attempts.
What follows are examples of when ORMs can allow for successful injection attacks.
Programatically assembled SQL
ORMs often provide the ability for developers to map the results of adhoc SQL queries to models. Adhoc queries mean potential injection vulnerabilities.
Consider the following:
Entity Framework, or any ORM for that matter, won’t be able to recognize a parameter concatenated directly into the query string.
Now hopefully the developer has some really strong input validation on the “username” parameter, but the fact still stands: this query is injectable and the ORM will happily execute it.
Does the logic needed in your app already exist as a stored procedure?
Excellent! Let’s be DRY (don’t-repeat-yourself) and call our procedure directly from our ORM:
So in this example, the ORM and developer have done everything right. So where’s the injection vulnerability?
Now let me be clear: this injection vulnerability is not the ORM’s fault.
However, it is the developer’s fault if he is blindly using a procedure without knowing what it does. It’s also the DBA’s fault if she provided an injectable procedure assuming the developer would sanitize any input values in the app.
Security is hard. Everyone needs to work together and be responsible for doing everything within their means to secure their applications.
Technically this isn’t an example of SQL injection.
But that’s why the title of this post says “2.5” instead of “3”.
In this example I’m using a dynamic LINQ to access my data:
If we pass the value
\"OR1==1ORUserName==\" we will have the ORM convert it to the following query:
Injection takes many forms and doesn’t just come straight from SQL. It’s important to follow best practices in all languages and tools to prevent security incidents.
Entity Framework (EF) is Microsoft’s object-relational-mapping tool (ORM) that allows programmers to easily map database objects to C# object models. What attracts developers to Entity Framework is the automatic mapping of SQL tables to C# model classes, a feature that removes a lot of the tedious boiler plate code that developers would otherwise have to write.
The quickest way to see why developers love Entity Framework is with an example. Without Entity Framework, the code needed to SELECT some data from SQL looks something like this:
As a developer I want to be writing code like the latter example — look at how clear and concise it is! However, when I first started using Entity Framework I quickly came across specific scenarios where the ORM was not working like I would have expected. Below are my top 3 insights about Entity Framework that I had to learn the hard way.
1. Select() is your friend
When I first started using Entity Framework I was enamored by the idea of never having to write boring SQL connection and mapping code again. During that time, I was frequently writing quick concise LINQ queries like this all over the place:
All things were great for a while until I moved my code to production, where there was a lot more data, and my webpage started taking a very long time to return results. What was the problem?
My customer summary webpage was only displaying 4 columns from the Customer table to display summary information about customers: “State”, “Name”, “Address”, and “Email”. However, my Customer table had 50 columns in it. I foolishly thought since I was only using 4 columns from the table, Entity Framework would be smart enough to only bring those 4 columns of data back from the SQL server. Turns out I was wrong — what really was running behind the scenes was this:
Egad! I was running a SELECT *, which all SQL practitioners know is a big no-no both for performance as well as future maintainability of your queries!
One of the issues (features?) with Entity Framework is that by default the queries all run as SELECT *. Instead of bringing back all of the columns in a table, I figured I could limit my results by using Select() to choose which columns I wanted to return:
And there was much rejoicing amongst the DBAs.
2. Lazy Loading can be great or terrible
Lazy Loading is the idea that data isn’t pulled from the database until it is needed (hence the term “lazy”). This is a great feature if used correctly since it minimizes the amount of data that is transferred over the network. However, if you use lazy loading (which is often the default) unknowingly, you might be querying the SQL server more than is necessary.
Let’s say our Customer and Order models looks like this, where every Customer has many associated Orders:
Because of Lazy Loading, entity framework runs a query to retrieve an order for each customer! In my example data with 5 Customers with 1 Order each, Entity Framework runs 6 queries: 1 query to retrieve our Customers, and then 5 queries to retrieve orders for each of customers! Imagine if we were display more than 5 customer records — our SQL server would be getting slammed with queries!
The solution to this is to Eager Load the data. This means all of the data will be brought back in our initial query via a SQL JOIN. In order to do this, we just use the Include() method to retrieve all of the data up front:
Entity Framework only executes 1 query. This causes a little bit more upfront overhead since it transfers the Order data all at once, which would be a waste if we didn’t use that Order data in our View, but if we do plan to use Order data in our View, then using Include() reduces the number of queries that will run against the SQL server.
3. Sometimes it’s OK to not use Entity Framework LINQ queries
By default, Entity Framework generates some pretty interesting queries (interesting at least in the sense that no normal human being would structure a query like this). For example, in SQL I could write:
And the query generated by Entity Framework might look something like this:
Extents? Parameters? Aliases? What’s all of this extra junk in my query? Although the latter SQL statement works, it isn’t the prettiest on the eyes. And this is the example of simple query. Once you start writing complex LINQ queries using Entity Framework that involve the likes of Join(), GroupBy(), etc… the raw SQL that gets generated can get hairy and inefficient. In those cases, I’ve learned that it is perfectly to rely on good old t-sql stored procedures:
At the end of the day, we are using Entity Framework to make our lives easier. If a LINQ query gets too complicated or is generating inefficient queries, it is always fine to fall back on an old standby.