Visualizing Nested Loops Joins And Understanding Their Implications

Published on: 2018-12-11

Watch this week’s episode on YouTube.

What Physical Join Operators Tell Us

Everyone has their own method of reading an execution plan when performance tuning a slow SQL query.  One of the first things I like to look at is which join operators are being used (Nested Loops, Merge Join, or Hash Match):

These three little icons may not seem like the most obvious place to begin troubleshooting a slow query, but with larger plans especially I like starting with a quick glance at the join operators because they allow you to infer a lot about what SQL Server thinks about your data.

This will be a three-part series where we’ll learn how each join algorithm works and what it can reveal about our upstream execution plan operators.

Nested Loops Join

Nested loops joins work like this: SQL Server takes the first value from our first table (our “outer” table – by default SQL Server decides for us which table of the two this will be), and compares it to every value in our second “inner” table to see if they match. 

Once every inner value has been checked, SQL Server moves to the next value in the outer table and the process repeats until every value from our outer table has been compared to every value in our inner table.
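
To make that comparison pattern concrete, here is a minimal Python sketch of the naive algorithm described above.  It is only an illustration, not SQL Server's implementation: the function name `nested_loops_join`, the list-based inputs, and the comparison counter are all assumptions made up for this example.

```python
def nested_loops_join(outer_rows, inner_rows):
    """Naive nested loops join on equal values (illustrative sketch only)."""
    matches = []
    comparisons = 0
    for outer_value in outer_rows:        # take each value from the outer input
        for inner_value in inner_rows:    # compare it to every value in the inner input
            comparisons += 1
            if outer_value == inner_value:
                matches.append((outer_value, inner_value))
    return matches, comparisons


# Hypothetical sample data: 3 outer rows, 6 unsorted inner rows
outer = [1, 3, 5]
inner = [5, 2, 3, 3, 9, 1]

matches, comparisons = nested_loops_join(outer, inner)
print(matches)      # [(1, 1), (3, 3), (3, 3), (5, 5)]
print(comparisons)  # 18 (3 outer rows x 6 inner rows)
```

Every outer row is checked against every inner row, so the number of comparisons is always the product of the two row counts.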

This description is a worst-case example of the performance of a nested loops join.  Several optimizations exist that can make the join more efficient.  For example, if the inner table join values are sorted (because of an index you created or a spool that SQL Server created), SQL Server can process the rows much faster:

In the above animation, SQL Server has the inner input data sorted, allowing it to seek directly to the rows it needs, reducing the total number of comparisons that need to be made.
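
The same idea can be sketched in Python.  This is a rough analogy under stated assumptions, not how SQL Server actually works: a binary search from the standard library's `bisect` module stands in for an index seek, and the function name `nested_loops_join_sorted_inner` and the `rows_touched` counter are hypothetical names for this example.

```python
from bisect import bisect_left


def nested_loops_join_sorted_inner(outer_rows, sorted_inner_rows):
    """Nested loops join that seeks into a sorted inner input (illustrative sketch only)."""
    matches = []
    rows_touched = 0  # inner rows actually read after each seek
    for outer_value in outer_rows:
        # "Seek": binary search to the first inner row >= outer_value
        position = bisect_left(sorted_inner_rows, outer_value)
        # Read forward only while the inner rows still match
        while (position < len(sorted_inner_rows)
               and sorted_inner_rows[position] == outer_value):
            rows_touched += 1
            matches.append((outer_value, sorted_inner_rows[position]))
            position += 1
    return matches, rows_touched


# Same hypothetical data as a naive join would use, but with the inner input sorted
outer = [1, 3, 5]
sorted_inner = sorted([5, 2, 3, 3, 9, 1])  # [1, 2, 3, 3, 5, 9]

seek_matches, seek_rows_touched = nested_loops_join_sorted_inner(outer, sorted_inner)
print(seek_matches)       # [(1, 1), (3, 3), (3, 3), (5, 5)]
print(seek_rows_touched)  # 4
```

The counter ignores the handful of steps each binary search takes, but the point stands: only the matching inner rows get read, instead of all six inner rows being scanned for each of the three outer rows.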

For more in-depth explanations of the internals and optimizations of nested loops joins, I recommend reading this post by Craig Freedman as well as Hugo Kornelis’s reference on nested loops.

What Do Nested Loops Joins Reveal?

Knowing the internals of how a nested loops join works allows us to infer what the optimizer thinks about our data and the join’s upstream operators, helping us focus our performance tuning efforts. 

Here are a few scenarios to consider the next time you see a nested loops join being used in your execution plan:

  • Nested loops joins are CPU intensive; at worst, every row in the outer input needs to be compared to every row in the inner input, and this can take some time.  This means when you see a nested loops join, SQL Server probably thinks that one of the two inputs is relatively small.
    • … and if one of the inputs is relatively small, great!  If instead you see upstream operators that are moving large amounts of data, you may have an estimation problem going on in this area of the plan and may need to update stats/add indexes/refactor the query to have SQL Server provide better estimates (and maybe a more appropriate join).
  • Nested loops sometimes accompany RID or key lookups.  I always check for one of these because they often leave room for some performance improvements:
    • If a RID lookup exists, it’s usually easy enough to add a clustered index to that underlying table to squeeze out some extra performance.
    • If either a RID or key lookup exists, I always check what columns are being returned to see if a smaller index could be used instead (by adding a column as a key or included column of an existing index) or if the query can be refactored to not bring back those columns (e.g. get rid of the SELECT *).
  • Nested loops joins do not require data to be sorted on input.  However, performance can improve with a sorted inner data source (see animation above), and SQL Server might choose a more efficient operator if the inputs are both sorted. 
    • At the very least, nested loops joins make me think to check whether the input data is unsorted because of some upstream transformations or because of missing indexes.
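
As a rough illustration of the first point in the list above, the worst-case comparison count is just the product of the two input sizes.  The helper name `worst_case_comparisons` and the row counts below are made up for this sketch; they are not measurements from a real plan.

```python
def worst_case_comparisons(outer_row_count, inner_row_count):
    """Worst case for a nested loops join: every outer row meets every inner row."""
    return outer_row_count * inner_row_count


# Hypothetical input sizes: (outer rows, inner rows)
scenarios = [(10, 1_000_000), (1_000, 1_000_000), (1_000_000, 1_000_000)]

for outer_count, inner_count in scenarios:
    total = worst_case_comparisons(outer_count, inner_count)
    print(f"{outer_count:>9,} x {inner_count:,} = {total:,} comparisons")
```

With one small input the total stays manageable, but it balloons quickly when both inputs are large, which is why a nested loops join sitting on top of two big inputs is worth a second look at the estimates.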

So while nested loops in your plans will always require more investigation, looking at them and the operators around them can provide some good insight into what SQL Server thinks about your data.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Two Words For Faster Scalar Functions

Published on: 2018-12-04

Watch this week’s episode on YouTube.

WITH SCHEMABINDING optimizations were added all the way back in SQL Server 2005.  So why bother talking about them in 2018?

Because no one is taking advantage of them!

Ok, maybe that’s a little unfair.  I’m sure some people take advantage of the optimizations, but most code I see posted online that could benefit doesn’t include it.  So let’s talk about an easy way for some of our non-data-accessing scalar functions to get a performance boost.


When you create a function or view, you can add the WITH SCHEMABINDING option to prevent any database objects that the view/function uses from being modified.  This is a pretty cool feature which prevents you from making a table or column change that would cause a view/function to break.

And while that’s pretty cool functionality on its own, what’s even better is that the SQL Server optimizer can do some pretty cool things when it knows one of your non-data-accessing scalar functions is schema bound.

For example, let’s say we have these two functions.  You’ll notice the second one includes the WITH SCHEMABINDING syntax:

When SQL Server executes a function, by default it checks whether the function has any database objects it needs to access.  For our example functions above, this is a waste of time since neither function accesses any data.

The WITH SCHEMABINDING option forces SQL Server to take note at function creation time whether any data access is occurring.  This allows SQL Server to skip that check at run time, allowing the function to execute significantly faster:

Take a look at those CPU times: 1594ms vs 62ms!  SQL Server saves a LOT of time by not having to verify the underlying data sources (for a more in-depth explanation of how SQL Server checks whether a function accesses data, I highly recommend reading this StackOverflow answer by Paul White).

WITH SCHEMABINDING also offers a performance optimization for queries that would normally need to implement Halloween Protection.  Halloween Protection essentially prevents SQL Server from modifying the same records more than once and is usually implemented by the addition of a spool operator to the execution plan:

In this example, you’ll notice our non-schema-bound function introduces a Table Spool while our second schema-bound function forgoes this addition since SQL Server knows there won’t be any potential for conflict.

In conclusion, if you have a non-data-accessing scalar function you should always add WITH SCHEMABINDING to reap the benefits of SQL Server’s optimizations when available.


Bert & Pinal Troubleshoot a Slow Performing SQL Server

Published on: 2018-11-13

What happens when you run into performance tuning expert Pinal Dave?  You talk about how to troubleshoot a slow performing SQL Server, of course!

This week is all about the video, so if you can’t view it above, head on over to my YouTube channel to watch it there.

After watching, be sure to check out Pinal’s free performance diagnostic scripts over at
