12 Ways To Rewrite SQL Queries for Better Performance

Published on: 2019-05-28

Watch this week’s video on YouTube. Thanks to you, we just crossed the 2k subscriber mark!

Over the past several week’s I’ve been exploring ways to rewrite queries to improve execution performance.

I learned a lot of these techniques over time from trial an error, attending presentations, reading blog posts, speaking to other dbas and developers, etc… but never knew of a good resource that summarized these techniques in one place.

This post will be a quick round-up of everything I’ve covered so far, as well as 8 additional techniques that I use occasionally but don’t necessarily require a full detailed post to explain them.

Why Rewrite Queries?

I often find myself working in environments where modifying indexes or changing server settings is out of the question when performance tuning. I usually run into these scenarios when dealing with:

  • Vendor databases
  • “Fragile” systems
  • Not enough disk space
  • Limited tooling/ad hoc analysis
  • Features limited by security software

While solving the root cause of a performance problem is always preferable, sometimes the only way I’m able to fix problems in these environments is by rewriting the queries.

I decided to write this summary post because it is a resource I would have loved to have when starting out. Sometimes it can be easy to get “writer’s block” when trying to think of ways to rewrite a SQL query, so hopefully this list of techniques can provide ideas and get your creative juices flowing.

So, without further ado, here is a list of 12 techniques in no particular order that you can use to rewrite your queries to change their performance.

12 Ways to Refactor a Query to Change Performance

1. Window functions vs GROUP BY

Sometimes window functions rely a little too much on tempdb and blocking operators to accomplish what you ask of them. While using them is always my first choice because of their simple syntax, if they perform poorly you can usually rewrite them as an old-fashioned GROUP BY to achieve better performance.

2. Correlated subqueries vs derived tables

Many people like using correlated subqueries because the logic is often easy to understand, however switching to derived table queries often produces better performance due to their set-based nature.

3. IN vs UNION ALL

When filtering rows of data on multiple values in tables with skewed distributions and non-covering indexes, writing your logic into multiple statements joined with UNION ALLs can sometimes generate more efficient execution plans than just using IN or ORs.

4. Temporary Staging Tables

Sometimes the query optimizer struggles to generate an efficient execution plan for complex queries. Breaking a complex query into multiple steps that utilize temporary staging tables can provide SQL Server with more information about your data. They also cause you to write simpler queries which can cause the optimizer to generate more efficient execution plans as well as allow it to reuse result sets more easily.

5. Forcing Table Join Orders

Sometimes outdated statistics and other insufficient information can cause the SQL Server query optimizer to join tables in a less than ideal sequence. Adam Machanic has a fantastic presentation on forcing table join order with blocking operators without having to resort to join hints.

6. DISTINCT with few unique values

Using the DISTINCT operator is not always the fastest way to return the unique values in a dataset. In particular, Paul White uses recursive CTEs to return distinct values on large datasets with relatively few unique values. This is a great example of solving a problem using a very creative solution.

7. Eliminate UDFs

UDFs often cause poor query performance due to forcing serial plans and causing inaccurate estimates. One way to possibly improve the performance of queries that call UDFs is to try and inline the UDF logic directly into the main query. With SQL Server 2019 this will be something that happens automatically in a lot of cases, but as Brent Ozar points out you might occasionally have to manually inline a UDF’s functionality to get the best performance.

8. Create UDFs

Sometimes a poorly configured server will parallelize queries too frequently and cause poorer performance than their serially equivalent plan. In those cases, putting the troublesome query logic into a scalar or multi-statement table-valued function might improve performance since they will force that part of the plan to run serially. Definitely not a best practice, but it is one way to force serial plans when you can’t change the cost threshold for parallelism.

9. Data Compression

Not only does data compression save space, but on certain workloads it can actually improve performance. Since compressed data can be stored in fewer pages, read disk speeds are improved, but maybe more importantly the compressed data allows more to be stored in SQL Server’s buffer pool, increasing the potential for SQL Server to reuse data already in memory.

10. Indexed Views

When you can’t add new indexes to existing tables, you might be able to get away with creating a view on those tables and indexing the view instead. This works great for vendor databases where you can’t touch any of the existing objects.

11. Switch cardinality estimators

The newer cardinality estimator introduced in SQL Server 2014 improves the performance of many queries. However, in some specific cases it can make queries perform more slowly. In those cases, a simple query hint is all you need to force SQL Server to change back to the legacy cardinality estimator.

12. Copy the data

If you can’t get better performance by rewriting a query, you can always copy the data you need to a new table in a location where you CAN create indexes and do whatever other helpful transformations you need to do ahead of time.

…And more

By no means is this list exhaustive. There are so many ways to rewrite queries, and not all of them will work all the time.

The key is to think about what the query optimizer knows about your data and why it’s choosing the plan it is. Once you understand what it’s doing, you can start getting creative with various query rewrites that address that issue.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Temporary Staging Tables

Published on: 2019-05-07

Watch this week’s episode on YouTube.

SQL Server Spool operators are a mixed bag. On one hand, they can negatively impact performance when writing data to disk in tempdb. On the other hand, they allow filtered and transformed result sets to be temporarily staged, making it easier for that data to be reused again during that query execution.

The problem with the latter scenario is that SQL Server doesn’t always decide to use a spool; often it’s happy to re-read (and re-process) the same data repeatedly. When this happens, one option you have is to explicitly create your own temporary staging table that will help SQL Server cache data it needs to reuse.

This post is a continuation of my series to document ways of refactoring queries for improved performance. I’ll be using the StackOverflow 2014 data dump for these examples if you want to play along at home.

No spools

Let’s start by looking at the following query:

WITH January2010Badges AS ( 
	SELECT 
		UserId,
		Name,
		Date
	FROM 
		dbo.Badges 
	WHERE 
		Date >= '2010-01-01' 
		AND Date <= '2010-02-01' 
), Next10PopularQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM January2010Badges WHERE Name = 'Popular Question' ORDER BY Date OFFSET 10 ROWS) t
), Next10NotableQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM January2010Badges WHERE Name = 'Notable Question' ORDER BY Date OFFSET 10 ROWS) t
), Next10StellarQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM January2010Badges WHERE Name = 'Stellar Question' ORDER BY Date OFFSET 10 ROWS) t
)
SELECT UserId, Name FROM Next10PopularQuestions 
UNION ALL 
SELECT UserId, Name FROM Next10NotableQuestions
UNION ALL 
SELECT UserId, Name FROM Next10StellarQuestions 

Note: This is not necessarily the most efficient way to write this query, but it makes for a good demo.

This query is returning offset results for different badges from one month of data in the dbo.Badges table. While the query is using a CTE to make the logic easy to understand (i.e. filter the data to just January 2010 results and then calculate our offsets based on those results), SQL Server isn’t actually saving the results of our January2010Badges expression in tempdb to get reused. If we view the execution plan, we’ll see it reading from our dbo.Badges clustered index three times:

Three clustered index scans
Table 'Badges'. Scan count 27, logical reads 151137, ...

That means every time SQL Server needs to run our offset logic in each “Next10…” expression, it needs to rescan the entire clustered index to first filter on the Date column and then the Name column. This results in about 150,000 logical reads.

Divide and Conquer

One potential solution would be to add a nonclustered index that would allow SQL Server to avoid scanning the entire clustered index three times. But since this series is about improving performance without adding permanent indexes (since sometimes you are stuck in scenarios where you can’t easily add or modify an index), we’ll look at mimicking a spool operation ourselves.

We’ll use a temporary table to stage our filtered January 2010 results so SQL Server doesn’t have to scan the clustered index each time it needs to perform logic on that subset of data. For years I’ve referred to this technique as “temporary staging tables” or “faking spools”, but at a recent SQL Saturday Jeff Moden told me he refers to it as “Divide and Conquer”. I think that’s a great name, so I’ll use it going forward. Thanks Jeff!

First let’s divide our query so that we insert our January 2010 data into its own temporary table:

DROP TABLE IF EXISTS #January2010Badges;
CREATE TABLE #January2010Badges
(
	UserId int,
	Name nvarchar(40),
	Date datetime
	CONSTRAINT PK_NameDateUserId PRIMARY KEY CLUSTERED (Name,Date,UserId)
);

INSERT INTO #January2010Badges
SELECT
	UserId,
	Name,
	Date
FROM 
	dbo.Badges
WHERE 
	Date >= '2010-01-01' 
	AND Date <= '2010-02-01'; 

You’ll notice I added a clustered primary key which will index the data in an order that will make filtering easier.

Next, we conquer by changing the rest of our query to read from our newly created temp table:

WITH Next10PopularQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM #January2010Badges WHERE Name = 'Popular Question' ORDER BY Date OFFSET 10 ROWS) t
), Next10NotableQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM #January2010Badges WHERE Name = 'Notable Question' ORDER BY Date OFFSET 10 ROWS) t
), Next10StellarQuestions AS ( 
	SELECT TOP 10 * FROM (SELECT UserId, Name, Date FROM #January2010Badges WHERE Name = 'Stellar Question' ORDER BY Date OFFSET 10 ROWS) t
)
SELECT UserId, Name FROM Next10PopularQuestions 
UNION ALL 
SELECT UserId, Name FROM Next10NotableQuestions 
UNION ALL 
SELECT UserId, Name FROM Next10StellarQuestions 

Running this all together, we get the following plans and logical read counts:

Clustered index seeks
Table 'Badges'. Scan count 9, logical reads 50379, ...

(42317 rows affected)

(20 rows affected)
Table '#January2010Badges______________________________00000000003B'. Scan count 3, logical reads 12, ...

In this version of the query, SQL Server scans the clustered index a single time and saves that data to a temporary table. In the subsequent SELECTs, it seeks from this much smaller temporary table instead of going back to the clustered index, reducing the total amount of reads to 50379 + 12 = 50392: about a third of what the original query was doing.

Temporary Staged Data

At the end of day, you can hope that SQL Server creates a spool to temporarily stage or data, or you can be explicit about it and do it yourself. Either option is going to increase usage on your tempdb database, but at least by defining the temporary table yourself you can customize and index it to achieve maximum reuse and performance for your queries.

It’s important to note that this is not a technique you want to abuse: writing and reading too much data from tempdb can cause contention problems that can make you worse off than having allowed SQL Server to scan your clustered index three times. However, when implemented sparingly and for good reasons, this technique can greatly improve the performance of certain queries.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

IN vs UNION ALL

Published on: 2019-04-30

Watch this week’s episode on YouTube.

When you need to filter query results on multiple values, you probably use an IN() statement or multiple predicates separated by ORs:

WHERE Col1 IN ('A','B','C')

or

WHERE Col1 = 'A' OR Col1 = 'B' OR Col1 = 'C'

While SQL Server will generate the same query plan for either syntax, there is another technique you can try that can sometimes can improve performance under certain conditions: UNION ALL.

This post is a continuation of my series to document ways of refactoring queries for improved performance. I’ll be using the StackOverflow 2014 data dump for these examples if you want to play along at home.

Lookups and Scans

Let’s say we have the following index on our dbo.Badges table:

CREATE NONCLUSTERED INDEX [IX_Badges] ON [dbo].[Badges] ([Name]) INCLUDE ([UserId]);

Next let’s run these two separate queries:

/* Query 1 */
SELECT 
    Name, UserId, Date 
FROM 
    dbo.Badges 
WHERE 
    Name = 'Benefactor'
OPTION(MAXDOP 1)

/* Query 2 */
SELECT 
    Name, UserId, Date 
FROM 
    dbo.Badges 
WHERE 
    Name = 'Research Assistant'
OPTION(MAXDOP 1)

Note I’m enforcing MAXDOP 1 here to remove any performance differences due to parallelism in these demos.

The nonclustered index doesn’t cover these queries – while SQL Server can seek the index for the Name predicate in the WHERE clause, it can’t retrieve all the columns in the SELECT from the index alone. This leaves SQL Server with a tough choice to make:

  1. Does it scan the whole clustered index to return all the required columns for the rows requested?
  2. Does it seek to the matching records in the nonclustered index and then perform a key lookup to retrieve the remaining data?

So, what does SQL Server decide to do?

Execution plans

For Query 1, SQL Server thinks that reading the entire clustered index and returning only the rows where Name = 'Benefactor' is the best option.

SQL Server takes a different approach for Query 2 however, using the non-covering nonclustered indexes to find the records with Name = 'Research Assistant' and then going to look up the Date values in the clustered index via a Key Lookup

The reason SQL server chooses these two different plans is because it thinks it will be faster to return smaller number of records with a Seek + Key Lookup approach (“Research Assistant”, 127 rows), but faster to return a larger number of records with a Scan (“Benefactor”, 17935 rows).

Kimberly Tripp has an excellent post that defines where this “tipping point” from a key lookup to a clustered index scan typically occurs, but the important thing to keep in mind for this post is that we can sometimes use SQL Server’s ability to switch between these two approaches to our advantage.

Combining Queries with IN

So, what plan does SQL Server generate when we combine our two queries into one?

SELECT 
    Name, UserId, Date 
FROM 
    dbo.Badges 
WHERE 
    Name IN ('Benefactor','Research Assistant')
OPTION(MAXDOP 1)
Key Lookup

Interestingly enough SQL Server decides to retrieve the requested rows from the nonclustered index and then go lookup the remaining Date column in the clustered index.

If we look at the page reads (SET STATISTICS IO ON;) we’ll see SQL Server had to read 85500 pages to return the data requested:

(18062 rows affected)
Table 'Badges'. Scan count 2, logical reads 85500, physical reads 20, read-ahead reads 33103, ...

Without correcting our index to include the Date column, is there some way we can achieve the same results with better performance?

UNION ALL

In this case it’s possible to rewrite our query logic to use UNION ALL instead of IN/ORs:

SELECT 
    Name,UserId,Date 
FROM 
    dbo.Badges 
WHERE 
    Name = 'Benefactor' 
UNION ALL
SELECT 
    Name,UserId,Date 
FROM 
    dbo.Badges 
WHERE 
    Name = 'Research Assistant'
OPTION(MAXDOP 1)
UNION ALL execution plan

We get the same exact results through a hybrid execution plan.

In this case, our plan mirrors what SQL Server did when running our original two queries separately:

  • The rows where Name = 'Benefactor' are returned by scanning the clustered index.
  • The nonclustered index is seeked with clustered index lookups for the Name = 'Research Assistant' records.

Looking at the IO statistics for this UNION ALL query:

(18062 rows affected)
Table 'Badges'. Scan count 2, logical reads 50120, physical reads 6, read-ahead reads 49649, ...

Even though this query reads the whole clustered index to get the Benefactor rows, the total number of logical reads is still smaller than the seek/key lookup pattern seen in the combined query with IN(). This UNION ALL version gives SQL Server the ability to build a hybrid execution plan, combining two different techniques to generate a plan with fewer overall reads.

IN or UNION ALL?

There’s no way to know for sure without trying each variation.

But if you have a slow performing query that is filtering on multiple values within a column, it might be worth trying to get SQL Server to use a different plan by rewriting the query.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!