
Should You Use Index Hints?

Watch this week's video on YouTube

One of the things that the SQL Server query optimizer does is determine how to retrieve the data requested by your query.

Usually it does a pretty good job, which is great because if it didn't, we'd be spending most of our days programming sorting and joining algorithms instead of having fun actually working with our data.

Sometimes the query optimizer has a lapse in judgement and creates a less-than-efficient plan, requiring us to step in and save the day.

Index Hints Give You Control

One way to "fix" a poor performing plan is to use an index hint.  While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the  query optimizer to use the index specified in the hint to retrieve the data (hence, it's really more of a "command" than a "hint").

Sometimes when I feel like I'm losing control I like using an index hint to show SQL Server who's boss.  I will also occasionally use index hints when debugging poorly performing queries because they allow me to confirm whether an alternate index would improve performance without having to overhaul my code or change any other settings.

...But Sometimes That's Too Much Power

While I like using index hints for short-term debugging scenarios, that's about the only time they should be used because they can create some pretty undesirable outcomes.

For example, let's say I have this nice simple query and index here:

CREATE INDEX IX_OwnerUserId_CreationDate_Includes
ON dbo.Posts (OwnerUserId, CreationDate) INCLUDE (AcceptedAnswerId, ClosedDate, CommentCount, FavoriteCount, LastActivityDate);

SELECT
    OwnerUserId,
    AcceptedAnswerId
FROM
    dbo.Posts
WHERE
    OwnerUserId < 1000

This index was specifically created for a different query running on the Posts table, but it will also get used by the simple query above.

Executing this query without any hints causes SQL Server to use that index anyway (since it's a pretty good index for the query), and we get decent performance: only 1002 logical reads.

I wish all of my execution plans were this simple.
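
If you want to check the logical read count yourself, turning on I/O statistics for the session before running the query is the easiest way (a minimal sketch; the exact numbers depend on your copy of the Posts table):

-- Report I/O statistics (including logical reads) for each statement in this session
SET STATISTICS IO ON;

SELECT
    OwnerUserId,
    AcceptedAnswerId
FROM
    dbo.Posts
WHERE
    OwnerUserId < 1000

-- The Messages tab then shows something like:
-- Table 'Posts'. Scan count 1, logical reads 1002, physical reads 0, ...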

Let's pretend we don't trust the SQL Server optimizer to always choose this index, so instead we force it to use that index by adding a hint:

SELECT
    OwnerUserId,
    AcceptedAnswerId
FROM
    dbo.Posts WITH (INDEX(IX_OwnerUserId_CreationDate_Includes))
WHERE
    OwnerUserId < 1000

With this hint, the query performs exactly the same: 1002 logical reads, a good index seek, etc...

But what happens if in the future a better index gets added to the table?

CREATE INDEX IX_OwnerUserId_AcceptedAnswerId_Includes
ON dbo.Posts (OwnerUserId, AcceptedAnswerId) INCLUDE (LastEditorUserId, ParentId);

If we run the query WITHOUT the index hint, we'll see that SQL Server actually chooses this new index because it's smaller and we can get the data we need in only 522 logical reads:

This execution plan looks the same, but you'll notice the smaller, more data-dense index is being used.

If we had let SQL Server do its job, it would have given us a great performing query!  Instead, we decided to intervene and hint (i.e. force) it into using a sub-optimal index.

Things Can Get Worse

The above example is pretty benign - sure, without the hint SQL Server would have read about half as many pages, but the difference in this scenario isn't drastic.

What could be disastrous is if, because of the hint, the query optimizer builds a totally different plan that isn't nearly as efficient.  Or if one day someone drops the hinted index, causing the hinted query to fail outright:

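If you want to see that failure for yourself, dropping the hinted index and re-running the hinted query reproduces it (a quick sketch; obviously don't try this anywhere that index matters):

-- Drop the index the query hint depends on
DROP INDEX IX_OwnerUserId_CreationDate_Includes ON dbo.Posts;

-- The hinted query now errors out instead of falling back to another index
SELECT
    OwnerUserId,
    AcceptedAnswerId
FROM
    dbo.Posts WITH (INDEX(IX_OwnerUserId_CreationDate_Includes))
WHERE
    OwnerUserId < 1000

-- Fails with an error along the lines of:
-- Msg 308: Index 'IX_OwnerUserId_CreationDate_Includes' on table 'dbo.Posts'
-- (specified in the FROM clause) does not exist.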

Index hints can be nice to use in the short term for investigating, testing, and debugging.  However, they are almost never the correct long-term solution for fixing query performance.

Instead, it's better to look for the root cause of a poorly performing query: maybe you need to update the statistics on an index or determine whether the cardinality estimator being used is not ideal.  You might also benefit from rewriting a terribly written query.

Any of these options will likely give you a better, more flexible long-term solution than forcing SQL Server to use the same hard-coded, potentially sub-optimal index forever.
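
For illustration, a couple of those root-cause checks might look like this (a hedged sketch using the index from the earlier examples; the USE HINT option requires SQL Server 2016 SP1 or newer):

-- Refresh the statistics behind the index so the optimizer has current estimates
UPDATE STATISTICS dbo.Posts IX_OwnerUserId_CreationDate_Includes WITH FULLSCAN;

-- Test whether the legacy cardinality estimator produces a better plan for this query
SELECT
    OwnerUserId,
    AcceptedAnswerId
FROM
    dbo.Posts
WHERE
    OwnerUserId < 1000
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'))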

Pinal Dave Helps Me Fix My Performance Tuning Problems

Watch this week's video on YouTube

This week I was fortunate enough to film a video in collaboration with Pinal Dave, the SQL Authority himself.  Pinal is creative, hilarious, and kind; making this video with him was A BLAST!

Although the video is a little tongue in cheek, Pinal's recommendations are very real: I've encountered plenty of scenarios where these solutions fixed slow queries.  Will these recommendations fix the problem in every situation?  Of course not, but they are a great place to start.

Instead of creating a text version of the concepts covered in the video (you should really watch it), I thought it would be fun to do a behind-the-scenes narrative of how the video came together because it is unlike any other project I've done before.

The Idea

After agreeing to make a video together, we tossed around a few ideas.  Because we live in different time zones, we thought it would be fun to do something where I kept waking Pinal up in the middle of the night.

We iterated over what SQL Server examples to use (originally the second example was going to show my queries running out of space because autogrowth was turned off).  We also ended up adding another example after my wife suggested that having it build to three scenarios instead of two would be funnier - I agree!

Asynchronous Filming

You've probably already figured it out, but I didn't really wake Pinal up in the video (honestly, I think midnight would be too early to wake him up anyway; in our back and forth emails, I was seeing responses from him that were in the 1-2am range).

I filmed a preliminary version of my parts of the video, very roughly edited them together, and sent it over to Pinal.

He then filmed his segments, giving me lots of great footage (I'm not sure if it was ad-libbed or not, but I was dying of laughter when watching through his clips).

Then I re-filmed my parts to try to match his dialog as closely as possible.  Re-filming my parts also allowed me to self-edit and not ramble as much.

Everything Else

After that, it was just the usual process of editing, color correction, audio processing, etc...

I'm happy with how it turned out, especially given all of the technical challenges we had with filming separately.

Major thanks again to Pinal for being supportive and willing to make a fun SQL Server video.  Enjoy!

T-SQL Tuesday #104 Roundup

This month's T-SQL Tuesday topic asked "What code would you hate to live without?" Turns out you like using scripts and code to automate boring, repetitive, and error-prone tasks.

Thank you to everyone who participated; I was nervous that July holidays and summer vacations would stunt turnout, but we wound up with 42 posts!

Watch tsqltuesday.com for next month's topic and consider signing up to host.

Watch this week's video on YouTube

Without further ado, here are this month's entries sorted in random order:

  • Stuart Moore shares the history behind needing to automate restore testing and writing the SqlAutoRestores PowerShell module to help.  Nowadays his commands are found in dbatools.  Great example of how a project can evolve through the community.
  • Arthur Daniels shares his script to identify the key and included columns of indexes in a given table.
  • Glenn Berry shares his DMV Diagnostic Queries and the story behind how he started developing them back in 2006.
  • Jason Brimhall links to multiple scripts he's shared in the past as well as a new script for remotely auditing server access to catch infiltrators red-handed.
  • Doug Purnell talks about how he uses database snapshots and shares some code for how he manages them.
  • Jay Robinson shares two C# extensions (shout to my fellow devs!): one to check an enum for a value and a second to cleanly handle the lengthy DBNull.Value syntax.
  • Drew Furgiuele shares how he scripts out his indexes to re-apply after snapshot replication.  He then writes very similar functionality using PowerShell in only 6 lines!
  • Tim Weigel shares which community scripts he uses regularly, as well as sharing his own scripts around replication, stored procedure execution information, and file manipulation.
  • Hugo Kornelis submitted two posts.  The first post shares sp_metasearch which helps with performing impact analysis and the second post follows up with an enhancement he's made to Ola Hallengren's database maintenance scripts to skip backing up BizTalk databases.
  • Andy Mallon shares his comprehensive script for checking database, file, data, log, etc... sizes.  Great explanations of his reasoning for writing the queries the way he did.
  • Dan Clemens shares his database search script with a switch that includes searching across agent jobs.
  • Jess Pomfret wrote a script that shows compression stats for database objects.  Wanting to run it against a whole instance (or across multiple servers), she wrote a dbatools command to automate the process.
  • Kenneth Fisher shows us how he organizes his toolbox using an SSMS solution.
  • Rob Farley shares code he's written to demonstrate the pain of using NOLOCK.
  • Steve Jones shares a procedure from Microsoft that he uses for transferring logins and passwords between instances.
  • Kevin Hill shares two scripts he uses for finding low-hanging index optimization fruit: one that finds queries performing heap or clustered index scans, and another that returns the top 5 missing indexes per database.
  • Michael Villegas learned that Azure SQL doesn't allow you to graphically show user roles and permissions, so he wrote a script to query those details (works for on-premise SQL Server as well).
  • Nate Johnson shares scripts that identify if tables are being replicated, whether SSRS subscriptions executed, and how much space certain objects and files are consuming.
  • William Andrus shares how he uses his search script to find similarly named fields or all instances of a piece of text within a database.
  • Bert Wagner (me!) shares his template for generating dynamic, table-driven code, making queries more adaptable to future changes.
  • Rudy Rodarte shows us a script he uses for iterating over a date range when executing date-based queries.
  • Brent Ozar admits he can't live without sp_Blitz, but this month he shares a script for checking how much plan cache history exists on a server.
  • Jeff Mlakar offers a solution for taking all databases on an instance offline (and then back online) again.
  • Erik Darling offers a solution for constructing dynamic SQL so that his MAX variables don't get truncated.  He also links to a script for printing long strings in SSMS.
  • Chrissy LeMaire takes the hard work out of instance to instance migrations by sharing her single-line dbatools command that will do it all for you.  She also shares how dbachecks automates manual checklist work.
  • Glenda Gable mentions two procedures, one that is a high-performance cursor rewrite and one that is a robust log shipping solution.
  • Aaron Bertrand shows us how he discovers undocumented SQL Server features by comparing new builds to the previous versions.
  • Ryan Desmond writes about his post-install configuration process and shares code he runs to customize Ola Hallengren's maintenance scripts for his environments.
  • Josh Simar shares his database file size code that is optimized for "very large databases" that span multiple files and filegroups.
  • Sander Stad discusses the importance of sharing code and offers a few dbatools commands that he's contributed to or authored around backup testing, log shipping, and SQL Server Agent manipulation.
  • Andy Levy wrote an SSMS snippet to generate a cursor.  Before you chew him out though, he has some really good use cases for needing to use them.
  • Andy Yun reveals what's in his T-SQL toolbox and explains his organization strategies for 10+ years of scripts he's collected.
  • Eduardo Pivaral shares a script he uses to output query results into an HTML table, making it easy to copy into an email.
  • Raul Gonzalez shows us a versatile script for searching database tables and returning information on attributes such as column name, size, key definitions, and more.
  • Matthew McGiffen wanted to find the most expensive queries on an instance using Query Store instead of the traditional DMVs, so he wrote a script to do just that.
  • Daniel Hutmacher shares his beefed up version of sp_help.  Includes ASCII art dependency graphs and database search.
  • Christian Gräfe provides a function he wrote for padding the left-side of a value with zeros.
  • Adrian Buckman shares his SQLUndercover Inspector HTML reporting tool, as well as scripts that help with altering availability groups, checking for running jobs, and auditing failed logins.
  • Louis Davidson shares his technique for using relative positioning in date tables to make querying custom periods (e.g. your company's fiscal month) easier.
  • Lance England shares a PowerShell script to automate generating upsert merge statements for his ETLs.

Building Dynamic Table-Driven Queries

T-SQL Tuesday logo

This post is a response to this month's T-SQL Tuesday #104 prompt by me! T-SQL Tuesday is a way for SQL Server bloggers to share ideas about different database and professional topics every month.

This month's topic is asking what code would you hate to live without?


Watch this week's video on YouTube

When given the choice between working on new projects versus maintaining old ones, I'm always more excited to work on something new.

That means that when I build something that is going to be used for years to come, I try to build it so that it will require as little maintenance as possible in the future.

One technique I use for minimizing maintenance is making my queries dynamic.  Dynamic queries, while not right for every situation, do one thing really well: they allow you to modify functionality without needing a complete rewrite when your data changes.  The way I look at it, it's much easier to add rules and logic to rows in a table than to modify a table's columns or structure.

To show you what I mean, let's say I want to write a query selecting data from model.sys.database_permissions:

SELECT class
      ,class_desc
      ,major_id
      ,minor_id
      ,grantee_principal_id
      ,grantor_principal_id
      ,type
      ,permission_name
      ,state
      ,state_desc
  FROM model.sys.database_permissions

Writing the query as above is pretty simple, but it isn't flexible in case the table structure changes in the future or if we want to programmatically write some conditions.

Instead of hardcoding the query as above, here is a general pattern I use for writing dynamic table-driven queries.  SQL Server has the handy views sys.all_views and sys.all_columns that show information about what columns are stored in each table/view:

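For example, you can peek at exactly what those two views know about our target view with a quick metadata query (a minimal sketch):

-- List the columns that sys.all_views/sys.all_columns report for database_permissions
SELECT
    o.[name] AS view_name,
    c.[name] AS column_name,
    c.column_id
FROM
    sys.all_views o
    INNER JOIN sys.all_columns c
        ON o.object_id = c.object_id
WHERE
    o.[name] = 'database_permissions'
ORDER BY
    c.column_id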

Using these two views, I can use this dynamic SQL pattern to build the same exact query as above:

-- Declare some variables up front
DECLARE 
    @FullQuery nvarchar(max),
    @Columns nvarchar(max),
    @ObjectName nvarchar(128)

-- Build our SELECT statement and schema+table name
SELECT 
    @Columns = COALESCE(@Columns + ', ', '') + '[' + c.[name] + ']',
    @ObjectName = QUOTENAME(s.name) + '.' + QUOTENAME(o.name)
FROM 
    sys.all_views o  
    INNER JOIN sys.schemas s
        ON o.schema_id = s.schema_id
    INNER JOIN sys.all_columns c
        ON o.object_id = c.object_id
WHERE 
    o.[name] = 'database_permissions'
ORDER BY
    c.column_id 

-- Put all of the pieces together and execute
SET @FullQuery = 'SELECT ' + @Columns + ' FROM ' + @ObjectName

EXEC(@FullQuery)

The way this works is that I build my SELECT statement as a string based on the values stored in the sys.all_columns view.  If a column is ever added to this view, my dynamic code will handle it (I wouldn't expect this view to change much in future versions of SQL Server, but in other real-world tables I can regularly expect changing data).

Yes, writing certain queries dynamically like this means more up front work.  It also means some queries won't run to their full potential (not necessarily reusing plans, not tuning every individual query, needing to be thoughtful about SQL injection attacks, etc...).  There are A LOT of downsides to building queries dynamically like this.
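
One way to blunt the SQL injection concern in particular is to run the built-up string through sp_executesql so that any filter values travel as real parameters instead of being concatenated into the string (a sketch built on the pattern above; the grantee_principal_id filter and its value are a hypothetical addition just to show the idea):

-- Same dynamically built column list and object name as before, but with a
-- parameterized WHERE clause so user-supplied values never become part of the SQL text
SET @FullQuery = 'SELECT ' + @Columns + ' FROM ' + @ObjectName
    + ' WHERE [grantee_principal_id] = @GranteePrincipalId';

EXEC sp_executesql
    @FullQuery,
    N'@GranteePrincipalId int',
    @GranteePrincipalId = 1;

Identifiers still have to come from trusted metadata (which is what the QUOTENAME calls above are for), but at least the values never get a chance to rewrite the query.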

But dynamically built queries make my systems flexible and drastically reduce the amount of work I have to do down the road.  In the next few weeks I hope to go into this type of dynamically built, table-driven process in more detail (so you should see the pattern in the example above get reused soon!).

Code You Would Hate To Live Without (T-SQL Tuesday #104 Invitation)

T-SQL Tuesday logo

The recent news about Microsoft acquiring GitHub has me thinking about how amazing it is for us to be part of today's online code community.

Before modern online programming communities, finding good code samples or sharing your own code was challenging.  Forums and email lists (if searchable) were good, but beyond that you had to rely on books, coworkers, and maybe a local meetup of like-minded individuals to help you work through your programming problems.

Watch this week's video on YouTube

Today, accessing and using code from the internet is second nature - I almost always first look online to see if a good solution already exists.  At the very least, searching blogs, GitHub, and StackOverflow for existing code is a great way to generate ideas.

For this month's T-SQL Tuesday, I want you to write about code you've written that you would hate to live without.

Maybe you built a maintenance script to free up disk space, wrote a query to gather system stats for monitoring, or coded some PowerShell to clean up string data.  Your work doesn't need to be completely original either - maybe you've improved the code in some open source project to better solve the problem for your particular situation.

There's probably someone out there in the world who is experiencing the same problem that you have already solved; let's make their life a little easier by sharing.

And don't worry if your code isn't perfect - just explain how your solution works and if you are aware of any caveats.  If it's not an exact solution for someone else's problem, at the very least it may help them generate some ideas.

Finally, here's a reminder of the official rules for T-SQL Tuesday:

  1. Publish your contribution on Tuesday, July 10, 2018. Let's use the "it's Tuesday somewhere" rule.
  2. Include the T-SQL Tuesday Logo and have it link to this post.
  3. Please comment below with a link to your post (trackbacks/pingbacks should work too but...comments ensure I don't miss your post)
  4. Tweet about your post using #tsql2sday.
  5. If you'd like to host in the future, contact Adam Machanic.