Are Stored Procedures Faster Than Stand-Alone Queries?

Published on: 2019-10-15

Watch this week’s episode on YouTube.

A few months ago I was presenting for a user group when someone asked the following question:

Does a query embedded in a stored procedure execute faster than that same query submitted to SQL Server as a stand alone statement?

The room was pretty evenly split on the answer: some thought the stored procedures will always perform faster while others thought it wouldn’t really matter.

In short, the answer is that the query optimizer will treat a query defined in a stored procedure exactly the same as a query submitted on its own.

Let’s talk about why.

Start with a Plan

While submitting an “EXEC <stored procedure>” statement to SQL Server may require fewer packets of network traffic than submitting the several hundred (thousands?) lines that make up the query embedded in the procedure itself, that is where the efficiencies of a stored procedure end*.

*NOTE: There are certain SQL Server performance features, like temporary object cachingnatively compiled stored procedures for optimized tables, etc… that will improve the performance of a stored procedure over an ad hoc query. However in my experience, most people aren’t utilizing these types of features so it’s a moot point.

After receiving the query, SQL Server’s query optimizer looks at these two submitted queries exactly the same. It will check to see if a cached plan already exists for either query (and if one does, it will use that), otherwise it will send both queries through the optimization process to find a suitable execution plan. If the standalone query and the query defined in the stored procedure are exactly the same, and all other conditions on the server are exactly the same at the time of execution, SQL Server will generate the same plans for both queries.

To prove this point, let’s look at the following query’s plan as well as the plan for a stored procedure containing that same query:

CREATE OR ALTER PROCEDURE dbo.USP_GetUpVotes
	@UserId int
AS
SELECT  
    COUNT(*) AS UpVotes 
FROM 
	dbo.Posts p
    INNER JOIN Votes v
		ON v.PostId = p.Id 
WHERE 
    p.OwnerUserId = @UserId
	and VoteTypeId = 2
ORDER BY UpVotes DESC


EXEC dbo.USP_GetUpVotes 23
DECLARE @UserId int = 23

SELECT 
    COUNT(*) AS UpVotes 
FROM 
	dbo.Posts p
    INNER JOIN Votes v
		ON v.PostId = p.Id 
WHERE 
    p.OwnerUserId = @UserId
	and VoteTypeId = 2
ORDER BY UpVotes DESC
Execution Plan for both stored procedure and ad hoc query
I didn’t include a screenshot of the second plan because it is identical.

As you can see, the optimizer generates identical plans for both the standalone query and the stored procedure. In the eyes of SQL Server, both of these queries will be executed in exactly the same way.

But I Swear My Stored Procedures Run Faster!

I think that a lot of the confusion for thinking that stored procedures execute faster comes from caching.

As I wrote about a little while back, SQL Server is very particular about needing every little detail about a query to be exactly the same in order for it to reuse its cached plan. This includes things like white space and case sensitivity.

It is much less likely that a query inside of a stored procedure will change compared to a query that is embedded in code. Because of this, it’s probably more likely that your stored procedure plans are being ran from cached plans while your individually submitted query texts may not be utilizing the cache. Because of this, the stored procedure may in fact be executing faster because it was able to reuse a cached plan. But this is not a fair comparison – if both plans would pull from the cache, or if both plans had to generate new execution plans, they would both have the same execution performance.

So does it matter if I use stored procedures or not?

So while in the majority of cases a standalone query will perform just as quickly as that same query embedded in a store procedure I still think it’s better to use stored procedures when possible.

First, embedding your query inside of a stored procedure increases the likelihood that SQL Server will reuse that query’s cached execution plan as explained above.

Secondly, using stored procedures is cleaner for organization, storing all of your database logic in one location: the database itself.

Finally, and most importantly, using stored procedures gives your DBA better insight into your queries. Storing a query inside of a stored procedure means your DBA can easily access and analyze it, offering suggestions and advice on how to fix it in case it is performing poorly. If your queries are all embedded in your apps instead, it makes it harder for the DBA to see those queries, reducing the likelihood that they will be able to help you fix your performance issues in a timely manner.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Joining on NULLs

Published on: 2019-03-26

Watch this week’s episode on YouTube

It’s important to be aware of columns that allow NULL values since SQL Server may handle NULLs differently than you might expect.

Today I want to look at what things to consider when joining on columns containing NULL values.

Natural, Composite, NULLable keys

Let’s pretend we have an Account table containing the accounts of various users and an AccountType table describing the different types of accounts:

Account and AccountType tables

These tables have the unfortunate design characteristics of:

  1. They use a natural, composite key of YearOpened and AccountType
  2. NULL is the valid default for AccountType

Not that either of the above attributes are outright bad, just that we need to handle them appropriately. For example, if we want to bring back a description of each user’s account, we might write a query with an inner join like this:

SELECT
    a.UserId,
    at.YearOpened,
    at.AccountType,
    at.Description
FROM
    dbo.Account a
    INNER JOIN dbo.AccountType at
        ON a.YearOpened = at.YearOpened
        AND a.AccountType = at.AccountType

Only to discover the rows with NULLs are not present:

Inner join with no NULLs

Joining on NULLs

Since it’s not possible to join on NULL values in SQL Server like you might expect, we need to be creative to achieve the results we want.

One option is to make our AccountType column NOT NULL and set some other default value. Another option is to create a new column that will act as a surrogate key to join on instead.

Both of the above options would fix the problem at the source, but what about if we can only make changes to our queries?

One common approach is to convert the NULLs to some other non-NULL value using a function like COALESCE or ISNULL:

SELECT
    a.UserId,
    at.YearOpened,
    at.AccountType,
    at.Description
FROM
    dbo.Account a
    INNER JOIN dbo.AccountType at
        ON a.YearOpened = at.YearOpened
        AND ISNULL(a.AccountType,'`') = ISNULL(at.AccountType,'`')
Correct results with ISNULL

While this returns the results we want, there are two major issues with this approach:

  1. In the above example we converted NULLs to the ` character. If we had a valid ` character in our data, we would get logically incorrect joins.
  2. Our query can no longer perform index seeks.

The first issue isn’t a huge deal if you can guarantee the character you are replacing NULLs with will never appear in the column of data.

The second issue is more important since ISNULL prevents your query from being SARGable and will cause poor performance on large tables of data.

SARG Killer

Those Compute Scalar operators are forcing SQL Server to Scan the indexes and compute a value for every row.

A More Efficient Solution

If using a function like ISNULL hurts the performance of our queries, what can we do instead?

SELECT
    a.UserId,
    at.YearOpened,
    at.AccountType,
    at.Description
FROM
    dbo.Account a
    INNER JOIN dbo.AccountType at
        ON a.YearOpened = at.YearOpened
        AND (a.AccountType = at.AccountType OR (a.AccountType IS NULL AND at.AccountType IS NULL))
Correct results with efficiency

This produces the same exact results while allowing SQL Server to Seek when possible and avoid costly row by row computations:

There are no seeks here since I don’t have any additional filters, but the lack of Compute Scalar operators should be enough to prove the point.

While there are a few more variations that can achieve the same results using different execution plans (writing a query that joins non-nulls and unioning it with a query that selects only the nulls, using a computed column to convert the NULLs to non-null values, etc…) the key to good performance is to choose a solution that will not force SQL Server to compute values for every single row.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Optimizing for Ad Hoc Workloads

Published on: 2019-01-22

Watch this week’s episode on YouTube.

The execution plan cache is a great feature: after SQL Server goes through the effort of generating a query plan, SQL Servers saves that plan in the plan cache to be reused again at a later date.

One downside to SQL Server caching almost all plans by default is that some of those plans won’t ever get reused. Those single use plans will exist in the plan cache, inefficiently tying up a piece of the server’s memory.

Today I want to look at a feature that will keep these one-time use plans out of the plan cache.

Plan Stubs

Instead of filling the execution plan cache with plans that will never get reused, the optimize for ad hoc workloads option will cache a plan stub instead of the full plan. The plan stub is significantly smaller in size and is only replaced with the full execution plan when SQL Server recognizes that the same query has executed multiple times.

This reduces the amount of size one-time queries take up in t he cache, allowing more reusable plans to remain in the cache for longer periods of time.

Enabling this server-level feature is as easy as (a database scoped versions :

sp_configure 'show advanced options',1
GO
reconfigure
GO	
sp_configure 'optimize for ad hoc workloads',1
GO
reconfigure
go

Once enabled you can watch the plan stub take up less space in the cache:

-- Run each of these queries once
DECLARE @Username varchar = 'A'
SELECT UserName 
FROM IndexDemos.dbo.[User] 
WHERE UserName like @Username+'%';
GO DECLARE @Username varchar = 'B' SELECT UserName FROM IndexDemos.dbo.[User] WHERE UserName like @Username+'%';
GO SELECT cp.cacheobjtype, cp.objtype, cp.plan_handle, cp.size_in_bytes, qp.query_plan, st.text FROM sys.dm_exec_cached_plans cp CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp INNER JOIN sys.dm_exec_query_stats qs ON cp.plan_handle = qs.plan_handle CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st WHERE st.text like 'DECLARE @Username varchar =%';
424 bytes each, these plan stubs are tiny!

Now if we run our second query filtering on UserName LIKE ‘B%’ again and then check the plan cache, we’ll notice the stub is replaced with an actual compiled plan:

This super simple compiled plan takes up significantly more space. Multiple by several thousand user queries and your plan cache will be quickly filling up.

The downside to plan stubs is that they add some cpu load  to our server: each query gets compiled twice before it gets reused from cache.  However, since plan stubs reduce the size of our plan cache, this allows more reusable queries to be cached for longer periods of time.

Great! All my cache problems will be solved

Not necessarily.

If your workload truly involves lots of ad hoc queries (like many analysts all working on different problems or dynamic SQL that’s generating completely different statements on every execution), enabling Optimize for Ad hoc Workloads may be your best option (Kimberly Tripp also has a great alternative: clearing single use plans automatically on a schedule).

However, often times single-use query plans have a more nefarious origin: unparameterized queries. In this case, enabling Optimize for Ad hoc Workloads may not negatively impact your server, but it certainly won’t help. Why? Because those original queries will still be getting generated.

Brent Ozar has a good overview of why this happens, but the short answer is to force parameterization on your queries. When you enable force parameterization, SQL Server will not automatically parameterize your queries if they aren’t already, reducing the number of one off query plans in your cache.

Whether you are dealing with too many single use queries on your server or some other problem, just remember to find the root cause of the problem instead of just treating the symptoms.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!