12 Ways To Rewrite SQL Queries for Better Performance

Watch this week's video on YouTube

Over the past several week's I've been exploring ways to rewrite queries to improve execution performance.

I learned a lot of these techniques over time from trial an error, attending presentations, reading blog posts, speaking to other dbas and developers, etc... but never knew of a good resource that summarized these techniques in one place.

This post will be a quick round-up of everything I've covered so far, as well as 8 additional techniques that I use occasionally but don't necessarily require a full detailed post to explain them.

Why Rewrite Queries?

I often find myself working in environments where modifying indexes or changing server settings is out of the question when performance tuning. I usually run into these scenarios when dealing with:

  • Vendor databases
  • "Fragile" systems
  • Not enough disk space
  • Limited tooling/ad hoc analysis
  • Features limited by security software

While solving the root cause of a performance problem is always preferable, sometimes the only way I'm able to fix problems in these environments is by rewriting the queries.

I decided to write this summary post because it is a resource I would have loved to have when starting out. Sometimes it can be easy to get "writer's block" when trying to think of ways to rewrite a SQL query, so hopefully this list of techniques can provide ideas and get your creative juices flowing.

So, without further ado, here is a list of 12 techniques in no particular order that you can use to rewrite your queries to change their performance.

12 Ways to Refactor a Query to Change Performance

1. Window functions vs GROUP BY

Sometimes window functions rely a little too much on tempdb and blocking operators to accomplish what you ask of them. While using them is always my first choice because of their simple syntax, if they perform poorly you can usually rewrite them as an old-fashioned GROUP BY to achieve better performance.

2. Correlated subqueries vs derived tables

Many people like using correlated subqueries because the logic is often easy to understand, however switching to derived table queries often produces better performance due to their set-based nature.

3. IN vs UNION ALL

When filtering rows of data on multiple values in tables with skewed distributions and non-covering indexes, writing your logic into multiple statements joined with UNION ALLs can sometimes generate more efficient execution plans than just using IN or ORs.

4. Temporary Staging Tables

Sometimes the query optimizer struggles to generate an efficient execution plan for complex queries. Breaking a complex query into multiple steps that utilize temporary staging tables can provide SQL Server with more information about your data. They also cause you to write simpler queries which can cause the optimizer to generate more efficient execution plans as well as allow it to reuse result sets more easily.

5. Forcing Table Join Orders

Sometimes outdated statistics and other insufficient information can cause the SQL Server query optimizer to join tables in a less than ideal sequence. Adam Machanic has a fantastic presentation on forcing table join order with blocking operators without having to resort to join hints.

6. DISTINCT with few unique values

Using the DISTINCT operator is not always the fastest way to return the unique values in a dataset. In particular, Paul White uses recursive CTEs to return distinct values on large datasets with relatively few unique values. This is a great example of solving a problem using a very creative solution.

7. Eliminate UDFs

UDFs often cause poor query performance due to forcing serial plans and causing inaccurate estimates. One way to possibly improve the performance of queries that call UDFs is to try and inline the UDF logic directly into the main query. With SQL Server 2019 this will be something that happens automatically in a lot of cases, but as Brent Ozar points out you might occasionally have to manually inline a UDF's functionality to get the best performance.

8. Create UDFs

Sometimes a poorly configured server will parallelize queries too frequently and cause poorer performance than their serially equivalent plan. In those cases, putting the troublesome query logic into a scalar or multi-statement table-valued function might improve performance since they will force that part of the plan to run serially. Definitely not a best practice, but it is one way to force serial plans when you can't change the cost threshold for parallelism.

9. Data Compression

Not only does data compression save space, but on certain workloads it can actually improve performance. Since compressed data can be stored in fewer pages, read disk speeds are improved, but maybe more importantly the compressed data allows more to be stored in SQL Server's buffer pool, increasing the potential for SQL Server to reuse data already in memory.

10. Indexed Views

When you can't add new indexes to existing tables, you might be able to get away with creating a view on those tables and indexing the view instead. This works great for vendor databases where you can't touch any of the existing objects.

11. Switch cardinality estimators

The newer cardinality estimator introduced in SQL Server 2014 improves the performance of many queries. However, in some specific cases it can make queries perform more slowly. In those cases, a simple query hint is all you need to force SQL Server to change back to the legacy cardinality estimator.

12. Copy the data

If you can't get better performance by rewriting a query, you can always copy the data you need to a new table in a location where you CAN create indexes and do whatever other helpful transformations you need to do ahead of time.

...And more

By no means is this list exhaustive. There are so many ways to rewrite queries, and not all of them will work all the time.

The key is to think about what the query optimizer knows about your data and why it's choosing the plan it is. Once you understand what it's doing, you can start getting creative with various query rewrites that address that issue.

Pinal Dave Helps Me Fix My Performance Tuning Problems

Watch this week's video on YouTube

This week I was fortunate enough to film a video in collaboration with Pinal Dave, the SQL Authority himself.  Pinal is creative, hilarious, and kind; making this video with him was A BLAST!

Although the video is a little tongue in cheek, Pinal's recommendations are very real: I've encountered plenty of scenarios where these solutions fixed slow queries.  Will these recommendations fix the problem in every situation?  Of course not, but they are a great place to start.

Instead of creating a text version of the concepts covered in the video (you should really watch it), I thought it would be fun to do a behind-the-scenes narrative of how the video came together because it is unlike any other project I've done before.

The Idea

After agreeing to make a video together, we tossed around a few ideas.  Because we live in different time zones, we thought it would be a fun to do something where I kept waking Pinal up in the middle of the night.

We iterated over what SQL Server examples to use (originally the second example was going to show my queries running out of space because autogrowth being turned off).  We also ended up adding another example after my wife suggested that having it build to three scenarios instead of two would be funnier - I agree!

Asynchronous Filming

You've probably already figured it out, but I didn't really wake Pinal up in the video (honestly, I think midnight would be too early to wake him up anyway; in our back and forth emails, I was seeing responses from him that were in the 1-2am range).

I filmed a preliminary version of my parts of the video, very roughly edited them together, and sent it over to Pinal.

He then filmed his segments, giving me lots of great footage (I'm not sure if it was ad-libbed or not, but I was dying of laughter when watching through his clips).

Then I re-filmed my parts to try to match his dialog as closely as possible.  Re-filming my parts also allowed me to self-edit and not ramble as much.

Everything Else

After that, it was just the usual process of editing, color correction, audio processing, etc...

I'm happy with how it turned out, especially given all of the technical challenges we had with filming separately.

Major thanks again to Pinal for being supportive and willing to make a fun SQL Server video.  Enjoy!

How To Make Your Queries Perform Like They Used To

chad-kirchoff-202730

Photo by Chad Kirchoff on Unsplash

Watch this week's video on YouTube

In the ideal world, you fully test how your SQL Server will handle upgrading to the latest version.  You're able to catch and mitigate any performance surprises early on.

In the real world, you might not be able to perform a good test before upgrading.  You only learn about poor performance once you upgrade in production.

Does this scenario sound familiar?  It's what I've been thinking about recently as I have to deal with query performance issues after upgrading to SQL Server 2016.

Coming from SQL Server 2012, I know the new cardnality estimator added in 2014 is a major source of the problems I'm experiencing.  While the new cardinality estimator improves performance of some queries, it has also made made some of my queries take hours to run instead of seconds.

Long-term, the solution is to revisit these queries, their stats, and the execution plans being generated for them to see what can be rewritten for better performance.

But ain't nobody got time for that (at least when facing performance crippling production issues).

Short-term, put a band-aid on to stop the bleeding.

I could change the compatibility mode of the database to revert back to SQL Server 2012 (before the new cardinality estimator was introduced), but then I miss out on being able to use all of the new SQL Server 2016 improvements just because a few queries are misbehaving.

I could enable trace flag 9481 to have my queries use the old cardinality estimator, however as a developer I probably don't have access to play with trace flags (and for good reason).

Starting with 2016 SP1, what I can do is use the legacy cardinality estimator query hint:

OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

This hint is great because it doesn't require developers to have any special permissions.  It also allows SQL to use the old cardinality estimator for poor performing queries only - the rest of the server gets to benefit from the improvements brought on by the new cardinality estimator.

With time, I can revisit the queries that are using this hint to try to improve their performance, but in the mean time it's a great way to fix regressed query performance due to the new cardinality estimator.