SQL Server 2019 Feature Power Rankings

Published on: 2019-10-29

Watch me comment on all of the features in this week’s video.

With the release of SQL Server 2019 imminent, I thought it’d be fun to rank which features I am most looking forward to in the new release.

(Also, I needed a lighter blogging week since I’m busy finishing preparing for my two sessions at PASS Summit next week – hope to see you there!).

SQL Server 2019 Feature rankings

I decided to rank these features on two axes: Excitement and Priority

Excitement is easy to describe: how excited I am about using these features. In my case, excitement directly correlates with performance and developer usability improvements. That doesn’t mean “Low Excitement” features aren’t beneficial; on the contrary, many are great, they just don’t top my list (it wouldn’t be fun to have a quadrant with everything in the top right).

Priority is how quickly I’ll work on implementing or tuning these features. The truth is that some of these features will work automatically once a SQL Server instance is upgraded, while some will require extra work (ie. query rewriting, hardware config). Once again, “Low Priority” features aren’t bad, they just won’t be the features that I focus on first.

Finally, these rankings are based on Microsoft’s descriptions of these features and what little tinkering I’ve done with pre-releases of SQL Server 2019. As far as I know, this chart will totally change once I start using these features regularly in production environments.

And here are my rankings in list form in case that’s more your style:

High Excitement, High Priority

  • Scalar function inlining
  • Memory grant feedback
  • sys.dm_exec_query_plan_stats
  • Accelerated Database Recovery
  • Table Variable deferred compilation

High Excitement, Low Priority

  • Big Data Clusters
  • Polybase all the things
  • Enhancements to running on Windows, Linux, and containers

Low Excitement, High Priority

  • Batch mode on rowstore indexes
  • Index encrypted columns
  • Optimize for sequential key
  • Useful truncation error messages

Low Excitement, Low Priority

  • New graph functions
  • Java language extension

What are you most excited for in 2019? What features did I miss? Disagree with where something should be ranked? Let me know in the comments below.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Are Stored Procedures Faster Than Stand-Alone Queries?

Published on: 2019-10-15

Watch this week’s episode on YouTube.

A few months ago I was presenting for a user group when someone asked the following question:

Does a query embedded in a stored procedure execute faster than that same query submitted to SQL Server as a stand alone statement?

The room was pretty evenly split on the answer: some thought the stored procedures will always perform faster while others thought it wouldn’t really matter.

In short, the answer is that the query optimizer will treat a query defined in a stored procedure exactly the same as a query submitted on its own.

Let’s talk about why.

Start with a Plan

While submitting an “EXEC <stored procedure>” statement to SQL Server may require fewer packets of network traffic than submitting the several hundred (thousands?) lines that make up the query embedded in the procedure itself, that is where the efficiencies of a stored procedure end*.

*NOTE: There are certain SQL Server performance features, like temporary object cachingnatively compiled stored procedures for optimized tables, etc… that will improve the performance of a stored procedure over an ad hoc query. However in my experience, most people aren’t utilizing these types of features so it’s a moot point.

After receiving the query, SQL Server’s query optimizer looks at these two submitted queries exactly the same. It will check to see if a cached plan already exists for either query (and if one does, it will use that), otherwise it will send both queries through the optimization process to find a suitable execution plan. If the standalone query and the query defined in the stored procedure are exactly the same, and all other conditions on the server are exactly the same at the time of execution, SQL Server will generate the same plans for both queries.

To prove this point, let’s look at the following query’s plan as well as the plan for a stored procedure containing that same query:

CREATE OR ALTER PROCEDURE dbo.USP_GetUpVotes
	@UserId int
AS
SELECT  
    COUNT(*) AS UpVotes 
FROM 
	dbo.Posts p
    INNER JOIN Votes v
		ON v.PostId = p.Id 
WHERE 
    p.OwnerUserId = @UserId
	and VoteTypeId = 2
ORDER BY UpVotes DESC


EXEC dbo.USP_GetUpVotes 23
DECLARE @UserId int = 23

SELECT 
    COUNT(*) AS UpVotes 
FROM 
	dbo.Posts p
    INNER JOIN Votes v
		ON v.PostId = p.Id 
WHERE 
    p.OwnerUserId = @UserId
	and VoteTypeId = 2
ORDER BY UpVotes DESC
Execution Plan for both stored procedure and ad hoc query
I didn’t include a screenshot of the second plan because it is identical.

As you can see, the optimizer generates identical plans for both the standalone query and the stored procedure. In the eyes of SQL Server, both of these queries will be executed in exactly the same way.

But I Swear My Stored Procedures Run Faster!

I think that a lot of the confusion for thinking that stored procedures execute faster comes from caching.

As I wrote about a little while back, SQL Server is very particular about needing every little detail about a query to be exactly the same in order for it to reuse its cached plan. This includes things like white space and case sensitivity.

It is much less likely that a query inside of a stored procedure will change compared to a query that is embedded in code. Because of this, it’s probably more likely that your stored procedure plans are being ran from cached plans while your individually submitted query texts may not be utilizing the cache. Because of this, the stored procedure may in fact be executing faster because it was able to reuse a cached plan. But this is not a fair comparison – if both plans would pull from the cache, or if both plans had to generate new execution plans, they would both have the same execution performance.

So does it matter if I use stored procedures or not?

So while in the majority of cases a standalone query will perform just as quickly as that same query embedded in a store procedure I still think it’s better to use stored procedures when possible.

First, embedding your query inside of a stored procedure increases the likelihood that SQL Server will reuse that query’s cached execution plan as explained above.

Secondly, using stored procedures is cleaner for organization, storing all of your database logic in one location: the database itself.

Finally, and most importantly, using stored procedures gives your DBA better insight into your queries. Storing a query inside of a stored procedure means your DBA can easily access and analyze it, offering suggestions and advice on how to fix it in case it is performing poorly. If your queries are all embedded in your apps instead, it makes it harder for the DBA to see those queries, reducing the likelihood that they will be able to help you fix your performance issues in a timely manner.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Whitespace, Letter Case, and Other Things That Prevent Plan Reuse

Published on: 2019-09-03

Watch this week’s episode on YouTube.

Last week’s post briefly mentioned that SQL server may not reuse a query plan from cache if there is a small difference in whitespace.

In addition to differences in whitespace, SQL Server won’t reuse query plans for a variety of reasons. Today’s post will cover some of the more common scenarios of when a plan won’t be reused.

Viewing the Cache

One way to view what’s in SQL Server’s query plan cache is by querying the sys.db_exec_query_stats DMV. Combined with the sys.dm_exec_sql_text and sys.dm_exec_query_plan functions, we can get an idea of what’s in the cache and how often it’s getting reused:

SELECT
	s.sql_handle,
	s.plan_generation_num,
	s.plan_handle,
	s.execution_count,
	s.query_hash,
	s.query_plan_hash,
	t.text,
	p.query_plan
FROM 
	sys.dm_exec_query_stats s
	CROSS APPLY sys.dm_exec_sql_text(sql_handle) t
	CROSS APPLY sys.dm_exec_query_plan(plan_handle) p

Now that we know how to view information about the query plan cache, let’s run DBCC FREEPROCCACHE to clear it out so we can start with a clean slate (warning: probably a bad idea to run this in production).

Patterns That Prevent Plan Reuse

With our plan cache empty, let’s run some similar queries and see if SQL Server decides to reuse plans, or insert nearly duplicate entries into the plan cache.

Whitespace

If we run these two queries:

SELECT TOP 100 DisplayName FROM dbo.Users;
GO
SELECT TOP 100 DisplayName FROM     dbo.Users;
GO

And then go check our query plan cache:

Two separate plans due to whitespace

We’ll notice that SQL Server compiled two separate plans for what is essentially the same exact query (the only difference being the extra spaces before the table name). What this means is that if your queries are not exactly the same, or if you are in the habit of highlighting queries in SSMS before you run them (and accidently select some extra white space before/after your query), SQL Server will generate a brand new plan for you even though it has a compiled plan for a nearly identical query already in cache.

Don’t think this can happen to you? While filming the video for this blog post, I accidently made this exact mistake even though I had reviewed my code and I was intentionally trying to show you what not to do. It’s easy to slip up. Check out the blooper at 2:25 if interested.

Letter Case

Differences in letter casing also prevent cached plan reuse:

SELECT TOP 100 DisplayName FROM dbo.Users;
GO
select TOP 100 DisplayName from dbo.Users;
GO
Different case

In this case, only our SELECT and FROM keywords have changed from upper to lower case, but in the eyes of SQL Server these queries are now different and it will generate plans individually for them.

Comments

Want to improve the clarity of your business logic with a comment? Great! But watch out if running the query with and without comments because SQL Server will consider these queries to be different:

SELECT TOP 100 DisplayName FROM dbo.Users;
GO
/* This query returns all user names */
SELECT TOP 100 DisplayName FROM dbo.Users;
GO
Comments

Schema Names

Running the following two queries:

SELECT TOP 100 DisplayName FROM dbo.Users;
GO
SELECT TOP 100 DisplayName FROM Users;
GO
Unspecified schemas

These two queries create two separate entries in the plan cache. The problem here is that every user can have a different default schema, so if you don’t prefix a table name with a schema, SQL Server won’t be able to guarantee that each execution of the query should run for the same exact table, forcing it to generate new plans and prevent reuse.

SET Options

If you have two queries running with two different configurations of SET options, the queries will also generate separate plans:

SET ANSI_PADDING ON;
GO
SELECT TOP 100 DisplayName FROM dbo.Users;
GO
SET ANSI_PADDING OFF;
GO
SELECT TOP 100 DisplayName FROM dbo.Users;
GO
Two plans for different set options

Simple Parameterization Datatypes

If a query is simple enough for SQL Server to parameterize it on its own, we’ll see the parameterize query usage count increase since SQL Server can reuse the query:

SELECT DisplayName FROM dbo.Users WHERE Id = 3;
GO
SELECT DisplayName FROM dbo.Users WHERE Id = 4;
GO
Two queries, one cached entry

However, you’ll notice that SQL Server defined the parameterized value as tinyint. If our query suddenly sends in a value larger than 255:

SELECT DisplayName FROM dbo.Users WHERE Id = 256;
GO
New @smallint parameter

We’ll see a new plan generated and stored in cache.

What is particularly interesting is that if we have differences in whitespace or comments in queries that can be simply parameterized, SQL Server will actually correct for this and be able to reuse the same plan even though queries without simple parameterization would fail to reuse the same plan:

SELECT DisplayName FROM dbo.Users WHERE Id = 3;
GO
SELECT DisplayName FROM dbo.Users WHERE Id =         3;
GO
SELECT DisplayName FROM dbo.Users WHERE             Id = 3;
GO
SELECT DisplayName /*comment*/ FROM dbo.Users WHERE             Id = 3;
GO
All four queries, one cached plan entry

Be Consistent

In conclusion, SQL Server is very particular about the differences in your submitted queries, no matter how small those differences may be. If you want SQL Server to be able to reuse plans from its cache, you need to make sure those queries are exactly the same.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!