Visualizing Hash Match Join Internals And Understanding Their Implications

This post is part 3 in a series about physical join operators (be sure to check out part 1 - nested loops joins, and part 2 - merge joins).


Hash Match joins are the dependable workhorses of physical join operators.

While Nested Loops joins become impractical when the inputs are large, and Merge Joins require that the input data are sorted, a Hash Match will join any two data inputs you throw at it (as long as the join has an equality predicate and you have enough space in tempdb).

The base hash match algorithm has two phases that work like this:

[Figure: Hash match join animation]

During the first "Build" phase, SQL Server builds an in-memory hash table from one of the inputs (typically the smaller of the two).  The hashes are calculated based on the join keys of the input data and then stored along with the row in the hash table under that hash bucket.  Most of the time there is only 1 row of data per hash bucket except when:

  1. There are rows with duplicate join keys.
  2. The hashing function produces a collision and totally different join keys receive the same hash (uncommon but possible).

Once the hash table is built, SQL Server begins the "Probe" phase.  During this second phase, SQL Server calculates the join key hash for each row in the second input, and checks to see if it exists in the hash table created in the first build phase.  If it finds a match for that hash, it then verifies if the join keys between the row(s) in the hash table and the row from the second table actually match (it needs to perform this verification due to potential hash collisions).
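
The two phases described above can be sketched in a few lines of Python. This is only an illustration of the general build/probe idea, not SQL Server's actual implementation: the function name `hash_match_join`, the `num_buckets` parameter, and the sample `customers`/`orders` rows are all made up for this example, and Python's built-in `hash()` stands in for whatever hashing function the engine really uses.

```python
from collections import defaultdict

def hash_match_join(build_rows, probe_rows, build_key, probe_key, num_buckets=8):
    """Inner join two lists of dicts using a build phase and a probe phase."""
    # Build phase: place every build row into the bucket chosen by its key's hash.
    buckets = defaultdict(list)
    for row in build_rows:
        bucket_id = hash(row[build_key]) % num_buckets
        buckets[bucket_id].append(row)

    # Probe phase: hash each probe row's key and look only in the matching bucket.
    results = []
    for probe_row in probe_rows:
        bucket_id = hash(probe_row[probe_key]) % num_buckets
        for build_row in buckets.get(bucket_id, []):
            # Verify the real key values: a bucket can hold duplicate keys
            # as well as different keys that collided on the same hash.
            if build_row[build_key] == probe_row[probe_key]:
                results.append({**build_row, **probe_row})
    return results

# Hypothetical sample data. In CPython small integers hash to themselves,
# so keys 1 and 9 land in the same bucket when there are 8 buckets.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
    {"customer_id": 9, "name": "Cy"},
]
orders = [
    {"order_id": 100, "cust_id": 1},
    {"order_id": 101, "cust_id": 9},
    {"order_id": 102, "cust_id": 1},
    {"order_id": 103, "cust_id": 5},  # no matching customer
]

joined = hash_match_join(customers, orders, "customer_id", "cust_id")
for row in joined:
    print(row["name"], row["order_id"])
```

Customers 1 and 9 share a bucket here, which is why the equality check inside the probe loop matters: without it, order 100 would be joined to both Ann and Cy.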

A common variation on this hash match algorithm occurs when the build phase cannot create a hash table that can be fully stored in memory:

[Figure: Hash match join with tempdb spill animation]

This happens when the data is larger than what can be stored in memory, or when SQL Server grants the hash match join less memory than it needs.

When SQL Server doesn't have enough memory to store the build phase hash table, it proceeds by keeping some of the buckets in memory, while spilling the other buckets to tempdb.

During the probe phase, SQL Server joins the rows of data from the second input to buckets from the build phase that are in memory. If the bucket that the row potentially matches isn't currently in memory, SQL Server writes that row to tempdb for later comparison. 

Once the matches for one bucket are complete, SQL Server clears that data from memory and loads the next bucket(s) into memory. It then compares the second input's rows (currently residing in tempdb) with the new in-memory buckets.
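
To make the spill behavior more concrete, here is a rough Python simulation of the steps above. It is a simplification under several assumptions: plain dictionaries (`spilled_build`, `spilled_probe`) stand in for tempdb, the hypothetical `partitions_in_memory` parameter stands in for the memory grant, and the real operator is considerably more sophisticated than this.

```python
from collections import defaultdict

def hash_join_with_spill(build_rows, probe_rows, num_partitions=4, partitions_in_memory=1):
    """Join (key, payload) tuples while keeping only some partitions in memory."""
    in_memory_ids = set(range(partitions_in_memory))
    memory = defaultdict(lambda: defaultdict(list))  # partition -> key -> build payloads
    spilled_build = defaultdict(list)                # stand-in for tempdb
    spilled_probe = defaultdict(list)                # stand-in for tempdb

    # Build phase: keep some partitions in memory and spill the rest.
    for key, payload in build_rows:
        pid = hash(key) % num_partitions
        if pid in in_memory_ids:
            memory[pid][key].append(payload)
        else:
            spilled_build[pid].append((key, payload))

    # Probe phase: join against in-memory partitions, set other rows aside.
    results = []
    for key, payload in probe_rows:
        pid = hash(key) % num_partitions
        if pid in in_memory_ids:
            for build_payload in memory[pid].get(key, []):
                results.append((key, build_payload, payload))
        else:
            spilled_probe[pid].append((key, payload))

    # Clear memory, then load each spilled partition in turn and compare it
    # with the probe rows that were set aside for that same partition.
    memory.clear()
    for pid, rows in spilled_build.items():
        table = defaultdict(list)
        for key, payload in rows:
            table[key].append(payload)
        for key, payload in spilled_probe.get(pid, []):
            for build_payload in table.get(key, []):
                results.append((key, build_payload, payload))

    build_spills = sum(len(rows) for rows in spilled_build.values())
    probe_spills = sum(len(rows) for rows in spilled_probe.values())
    return results, build_spills, probe_spills

# Hypothetical inputs: eight build rows, six probe rows (key 10 has no match).
build = [(k, f"b{k}") for k in range(8)]
probe = [(0, "p-a"), (1, "p-b"), (2, "p-c"), (3, "p-d"), (3, "p-e"), (10, "p-f")]

results, build_spills, probe_spills = hash_join_with_spill(build, probe)
print(len(results), "joined rows;", build_spills, "build rows and", probe_spills, "probe rows spilled")
```

With only one of four partitions held in memory, most rows on both sides get written out and read back a second time. That extra round trip is the cost a spill adds on top of an otherwise in-memory join.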

As with every physical join operator in this series, there are way more details about the hash match operator in Hugo Kornelis's reference on hash matches.

What Do Hash Match Joins Reveal?

Knowing the internals of how a hash match join works allows us to infer what the optimizer thinks about our data and the join's upstream operators, helping us focus our performance tuning efforts. 

Here are a few scenarios to consider the next time you see a hash match join being used in your execution plan:

  • While hash match joins are able to join huge sets of data, building the hash table from the first input is a blocking operation that will prevent downstream operators from executing until the build completes. Due to this, I always check to see if there is an easy way to convert a hash match to either a nested loops or merge join.  Sometimes that won't be possible (too many rows for nested loops or unsorted data for merge joins) but it's always worth checking if a simple index change or improved estimates from a statistics update would cause SQL Server to pick a non-blocking join operator instead.

  • Hash match joins are great for large joins. Because they can spill to tempdb, they can join datasets that are too large to be practical for the nested loops or merge join operators.

    • Seeing a hash match join operator means SQL Server thinks the upstream inputs are big.  If we know our inputs shouldn't be that big, then it's worth checking if we have a stats/estimation problem that is causing SQL Server to choose a hash match join incorrectly.
  • When executed in memory, hash match joins are fairly efficient. Problems arise when the build phase spills to tempdb.

    • If I notice the little yellow triangle indicating that the join is spilling to tempdb, I take a look to see why. If the data is larger than the server's available memory, there's not much that can be done. But if the memory grant seems unusually small, we probably have another statistics problem that is giving the SQL Server optimizer estimates that are too low.
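
The blocking behavior mentioned in the first scenario above can be illustrated with Python generators. This is a toy model rather than how SQL Server executes plans: `traced` and `streaming_hash_join` are hypothetical helpers, and the `log` list exists only to record the order in which rows are read and emitted.

```python
def traced(name, rows, log):
    """Yield rows from an input, logging each read (hypothetical helper)."""
    for row in rows:
        log.append(f"read {name} {row}")
        yield row

def streaming_hash_join(build_input, probe_input, log):
    """Generator-based hash join over (key, payload) tuples."""
    table = {}
    # Build phase: the whole build input is consumed before any row is output.
    for key, payload in build_input:
        table.setdefault(key, []).append(payload)
    log.append("build complete")
    # Probe phase: probe rows stream through one at a time.
    for key, payload in probe_input:
        for build_payload in table.get(key, []):
            log.append(f"emit {key}")
            yield (key, build_payload, payload)

log = []
build = traced("build", [(1, "a"), (2, "b"), (3, "c")], log)
probe = traced("probe", [(3, "x"), (1, "y")], log)

join = streaming_hash_join(build, probe, log)
first_row = next(join)  # ask the join for just one output row

for entry in log:
    print(entry)
```

Asking for a single output row forces all three build rows to be read first, while only one probe row is touched. An operator consuming this join's output would likewise have to wait for the entire build input before it receives anything.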