OR vs UNION ALL – Is One Better For Performance?

Today I want to show you a trick that could make your queries run faster.

It won’t always work, but when it does, everyone will be impressed with your performance tuning prowess. Let’s go!

Watch this week’s episode on YouTube.

Our Skewed Data

Let’s create a table and insert some data.
Notice the heavily skewed value distribution.  Also notice how we have a clustered index and a very skimpy nonclustered index:
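Here is a minimal sketch of what such a setup could look like. Only Col3 is named in this post; the table name dbo.SkewedData, the other columns, the index name, and the row counts are assumptions made for illustration.

```sql
-- Sketch only: table name, column types, and row counts are assumptions.
-- DROP TABLE IF EXISTS requires SQL Server 2016 or later.
DROP TABLE IF EXISTS dbo.SkewedData;

CREATE TABLE dbo.SkewedData
(
    Id   int IDENTITY(1,1) NOT NULL,
    Col1 int NOT NULL,
    Col2 int NOT NULL,
    Col3 int NOT NULL,
    CONSTRAINT PK_SkewedData PRIMARY KEY CLUSTERED (Id)
);

-- One very common value: up to a million rows where Col3 = 1
INSERT INTO dbo.SkewedData (Col1, Col2, Col3)
SELECT TOP (1000000) 1, 1, 1
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

-- A few rare values: one row each for Col3 = 2, 3, and 4
INSERT INTO dbo.SkewedData (Col1, Col2, Col3)
VALUES (2, 2, 2), (3, 3, 3), (4, 4, 4);

-- The "skimpy" nonclustered index: Col3 only, so reading Col1 or Col2
-- through this index requires a key lookup into the clustered index
CREATE NONCLUSTERED INDEX IX_SkewedData_Col3 ON dbo.SkewedData (Col3);
```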

If we write a query that filters on one of the low-occurrence values in Col3, SQL Server will perform an index seek with a key lookup (since our skimpy nonclustered index doesn’t cover all of the columns in our SELECT):
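A query of that shape might look like the following. It assumes a hypothetical table dbo.SkewedData with columns Col1, Col2, and Col3, a nonclustered index on Col3 alone, and a rare value of 2 in Col3.

```sql
-- Assumes the hypothetical dbo.SkewedData table described above.
-- Col3 = 2 is assumed to match only a handful of rows.
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 2;
```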

If we then add an OR to our WHERE clause and filter on another low-occurrence value in Col3, SQL Server changes how it wants to retrieve results:
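As a sketch, the OR version could look like this (same assumptions: a hypothetical dbo.SkewedData table where 2 and 3 are both rare values in Col3):

```sql
-- Assumes the hypothetical dbo.SkewedData table; 2 and 3 are rare values.
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 2
   OR Col3 = 3;
```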

Suddenly those key lookups become too expensive for SQL Server and the query optimizer thinks it’ll be faster to just scan the entire clustered index.

In general this makes sense; SQL Server tries to pick plans that are good enough in most scenarios, and in general I think it chooses wisely.

However, sometimes SQL Server doesn’t pick great plans.  Sometimes the plans it picks are downright terrible.

If we encountered a similar scenario in the real-world where our tables had more columns, more rows, and larger datatypes, having SQL Server switch from a seek to a scan could kill performance.

So what can we do?


The first thing that comes to mind is to modify or add some indexes.

But maybe our (real-world) table already has too many indexes.  Or maybe we are working with a data source where we can’t modify our indexes.

We could also use the FORCESEEK hint, but I don’t like using hints as permanent solutions because they feel dirty (and are likely to do unexpected things as your data changes).
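For reference, this is roughly what the hinted version would look like. The table and values are the same hypothetical ones (dbo.SkewedData, rare values 2 and 3); FORCESEEK itself is a documented table hint.

```sql
-- Assumes the hypothetical dbo.SkewedData table.
-- FORCESEEK tells the optimizer to use only index seeks to access this table.
SELECT Col1, Col2, Col3
FROM dbo.SkewedData WITH (FORCESEEK)
WHERE Col3 = 2
   OR Col3 = 3;
```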

One Solution: UNION ALL

One solution that a lot of people overlook is rewriting the query so that it uses UNION ALLs instead of ORs.

A lot of the time it’s pretty easy to refactor the query into multiple SELECT statements with UNION ALLs while remaining logically the same and returning the same results:
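A sketch of the rewrite, again against a hypothetical dbo.SkewedData table. Note that the two versions only return the same rows because the predicates are mutually exclusive: a row can’t have Col3 = 2 and Col3 = 3 at the same time. If a row could match both branches, UNION ALL would return it twice.

```sql
-- Assumes the hypothetical dbo.SkewedData table.
-- Each branch filters on a single value, so each can use its own index seek.
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 2

UNION ALL

SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 3;
```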

Sure, the query is uglier and will be a bigger pain to maintain if you need to make changes in the future, but sometimes we have to suffer for query performance.

But does our UNION ALL query perform better?

Well, the plan shows seeks, but as Erik Darling recently pointed out, seeks aren’t always a good thing.

So let’s compare the reads of the OR query versus the UNION ALL query using SET STATISTICS IO ON:
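One way to run that comparison is sketched below, using the same hypothetical dbo.SkewedData table. The logical reads for each statement show up in the Messages tab; the actual numbers depend entirely on your data, so none are shown here.

```sql
-- Assumes the hypothetical dbo.SkewedData table.
SET STATISTICS IO ON;

-- OR version
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 2
   OR Col3 = 3;

-- UNION ALL version
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 2
UNION ALL
SELECT Col1, Col2, Col3
FROM dbo.SkewedData
WHERE Col3 = 3;

SET STATISTICS IO OFF;
```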

So in this case, tricking SQL Server into picking a different plan by using UNION ALLs gave us a performance boost. The difference in reads isn’t that large in the above scenario, but I’ve had this trick take my queries from minutes to seconds in the real world.

So the next time you are experiencing poor performance from a query with OR operators in it, try rewriting it using UNION ALLs.

It’s not always going to fix your performance problem but you won’t know until you give it a try.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Inverted Polygons? How to Troubleshoot SQL Server’s Left Hand Rule

Last week we looked at how easy it is to import GeoJSON data into SQL Server’s geography datatype.

Sometimes your source data won’t be perfectly formatted for SQL Server’s spatial datatypes though.

Today we’ll examine what to do when our geographical polygon is showing us inverted results.

Watch this week’s vlog on my YouTube channel.

Colorado Is A Rectangle

If you look at the state of Colorado on a map, you’ll notice its border is pretty much a rectangle.

Roughly marking the lat/long coordinates of the state’s four corners will give you a polygon comprised of the following points:

Or in GeoJSON format (set equal to a SQL variable) you might represent this data like so:
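As a sketch, here are approximate corner coordinates (37° to 41° N, 102.05° to 109.05° W) listed clockwise starting from the northwest corner. The variable name @Colorado and the rounding are my own choices; remember that GeoJSON lists each point as [longitude, latitude].

```sql
-- Approximate corners of Colorado, clockwise from the northwest corner.
-- GeoJSON point order is [longitude, latitude].
DECLARE @Colorado nvarchar(max) = N'{
  "type": "Polygon",
  "coordinates": [[
    [-109.05, 41.00],
    [-102.05, 41.00],
    [-102.05, 37.00],
    [-109.05, 37.00],
    [-109.05, 41.00]
  ]]
}';

-- Quick sanity check that the variable holds well-formed JSON
SELECT ISJSON(@Colorado) AS IsValidJson;
```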

Note: four points + one extra point that is a repeat of our first point. This last repeated point lets us know that we have a closed polygon since it ends at the same point where it began.

Viewing Our Colorado Polygon

Converting this array of points to the SQL Server geography datatype is pretty straightforward:
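One possible way to do the conversion is sketched below: parse the ring with OPENJSON(), stitch the points into well-known text, and hand that to STPolyFromText. This assumes SQL Server 2017 or later (for STRING_AGG), a database compatibility level of at least 130 (for OPENJSON), and the approximate clockwise coordinates used for illustration here.

```sql
-- Approximate Colorado corners, clockwise, as [longitude, latitude]
DECLARE @Colorado nvarchar(max) =
    N'{"type":"Polygon","coordinates":[[[-109.05,41],[-102.05,41],[-102.05,37],[-109.05,37],[-109.05,41]]]}';

DECLARE @Wkt nvarchar(max);

-- Build 'POLYGON((long lat, long lat, ...))', keeping the points in array order
SELECT @Wkt = N'POLYGON(('
    + STRING_AGG(
          CAST(JSON_VALUE([value], '$[0]') + N' ' + JSON_VALUE([value], '$[1]') AS nvarchar(max)),
          N', ')
      WITHIN GROUP (ORDER BY CAST([key] AS int))
    + N'))'
FROM OPENJSON(@Colorado, '$.coordinates[0]');

-- 4326 is the SRID for WGS 84 lat/long coordinates
SELECT geography::STPolyFromText(@Wkt, 4326) AS ColoradoShape;
```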

We can then take a look at SQL Server Management Studio’s Spatial Results tab and see our polygon of Colorado drawn on a map.  You might notice something looks a little funny with this picture though:


Discerning eyes might notice that SQL Server didn’t shade in the area inside of the polygon – it instead shaded in everything in the world EXCEPT for the interior of our polygon.

If this is the first time you’ve encountered this behavior then you’re probably confused by it. I know I was.

The Left-Hand/Right-Hand Rules

There is a logical explanation though for why SQL Server is seemingly shading in the wrong part of our polygon.

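SQL Server’s geography datatype follows the “left-hand rule” when determining which side of the polygon should be shaded. The GeoJSON specification (RFC 7946), by contrast, describes ring orientation in terms of a “right-hand rule” (counterclockwise exterior rings), and the original 2008 GeoJSON spec didn’t fix a winding order at all, so GeoJSON files in the wild aren’t guaranteed to list their points in the direction SQL Server expects.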

The left-hand rule works like this: imagine you are walking the path of the polygon. Whatever is to the left of the line you are walking is what is considered the “interior” of that polygon.

So if we draw arrows that point in the direction that the coordinates are listed in our GeoJSON, you’ll notice we are making our polygon in a clockwise direction:

If you imagine yourself walking along this line in the direction specified, you’ll quickly see why SQL Server shades the “outside” of the polygon: following the left-hand rule, everything except for the state of Colorado is considered the interior of our polygon shape.

Reversing Polygon Direction

So the problem here is that our polygon data was encoded in a different direction than the SQL Server geography datatype expects.

One way to fix this is to correct our source data by reordering the points so that the polygon is drawn in a counter-clockwise direction:
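A sketch of the reordered ring, using approximate corner coordinates of my own choosing: start at the northwest corner, then go to the southwest, southeast, and northeast corners before closing the ring.

```sql
-- Approximate Colorado corners listed counter-clockwise (NW, SW, SE, NE, NW).
-- Well-known text uses "longitude latitude" order for each point.
SELECT geography::STPolyFromText(
    'POLYGON((-109.05 41, -109.05 37, -102.05 37, -102.05 41, -109.05 41))',
    4326) AS ColoradoShape;
```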

This is pretty easy to do with a polygon that only has five points, but this would be a huge pain for a polygon with hundreds or thousands of points.

So how do we solve this in a more efficient manner?

Easy, use SQL Server’s ReorientObject() function.

ReorientObject() does what we did manually above – it manipulates the order of our polygon’s points so that it changes the direction in which the polygon is drawn.
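A minimal sketch of the call, using an approximate clockwise ring for Colorado (the coordinates and variable name are illustrative). Comparing STArea() before and after is one way to confirm the flip without looking at the map: the inverted polygon covers nearly the whole globe, so its area should be far larger.

```sql
-- Clockwise ring, so SQL Server treats everything outside Colorado as the interior
DECLARE @Clockwise geography = geography::STPolyFromText(
    'POLYGON((-109.05 41, -102.05 41, -102.05 37, -109.05 37, -109.05 41))',
    4326);

SELECT
    @Clockwise.STArea()                   AS AreaBeforeSqMeters,
    @Clockwise.ReorientObject().STArea()  AS AreaAfterSqMeters,
    @Clockwise.ReorientObject()           AS ColoradoShape;
```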

Note: SQL Server uses a different order when reversing the points using ReorientObject() than the way we reversed them above. The end result ends up being the same however.

Regardless of which method you choose to use, the results are the same: our polygon of Colorado is now drawn in the correct direction and the Spatial Results tab visually confirms this for us:

Importing GeoJSON Earthquake Data Into SQL Server

A significant portion of Yellowstone National Park sits on top of a supervolcano.  Although it’s not likely to erupt any time soon, the park is constantly monitored for geological events like earthquakes.

This week I want to take a look at how you can import this earthquake data, encoded in GeoJSON format, into SQL Server in order to be able to analyze it using SQL Server’s spatial functions.

Watch this week’s post on YouTube! I really enjoyed making all of the overlays for this episode.


The source for the data we’ll be using is the 30-day earthquake feed from the USGS.  This data is encoded in the GeoJSON format, a specification that makes it easy to share spatial data via JSON.  To get an idea of how it looks, here’s an extract:

The key thing we’ll be examining in this data is the “features” array: it contains one feature object for each earthquake that’s been recorded in the past 30 days.  You can see the “geometry” child object contains lat/long coordinates that we’ll be importing into SQL Server.
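To make that structure concrete, here is a trimmed-down sample in the general shape of the USGS feed, held in a SQL variable. The property names (mag, place, time) and the [longitude, latitude, depth] coordinate order follow the feed’s format, but every value below is a made-up placeholder, not real earthquake data.

```sql
-- Placeholder data only: two fake "earthquakes" in the shape of the USGS feed
DECLARE @EarthquakeJson nvarchar(max) = N'{
  "type": "FeatureCollection",
  "features": [
    { "type": "Feature",
      "properties": { "mag": 1.2, "place": "placeholder place A", "time": 1500000000000 },
      "geometry": { "type": "Point", "coordinates": [-110.5, 44.6, 5.0] },
      "id": "sample1" },
    { "type": "Feature",
      "properties": { "mag": 2.3, "place": "placeholder place B", "time": 1500000060000 },
      "geometry": { "type": "Point", "coordinates": [-149.9, 61.2, 10.0] },
      "id": "sample2" }
  ]
}';

-- One row per element of the "features" array
SELECT
    [key]                                        AS FeatureIndex,
    JSON_VALUE([value], '$.properties.place')    AS Place,
    JSON_QUERY([value], '$.geometry.coordinates') AS Coordinates
FROM OPENJSON(@EarthquakeJson, '$.features');
```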

If you want the same 30-day GeoJSON extract we’ll be using in all of the following demo code, you can download it here.

Importing GeoJSON into SQL Server

There’s no out of the box way to import GeoJSON data into SQL Server.

However, using SQL Server’s JSON functions we can build our own solution pretty easily.

First, let’s create a table where we can store all of our earthquake data:

Then, let’s use the OPENJSON() function to parse our JSON and insert it into our table:

We use OPENJSON() to parse our JSON hierarchy and then concatenate together the lat and long values into the well-known text format so we can use it with SQL Server’s spatial function STPointFromText:
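Putting those steps together, a sketch might look like this. The table name dbo.EarthquakeData and the Place column come from this post; the other column names, the data types, and the two sample features (placeholder values, not real earthquakes) are assumptions. It requires database compatibility level 130 or higher for OPENJSON().

```sql
-- Hypothetical table definition; only the table name and Place are from the post
IF OBJECT_ID('dbo.EarthquakeData') IS NULL
    CREATE TABLE dbo.EarthquakeData
    (
        Id          int IDENTITY(1,1) PRIMARY KEY,
        EventDate   datetime2(0),
        Magnitude   decimal(4,2),
        Place       nvarchar(300),
        Coordinates geography
    );

-- Placeholder data in the shape of the USGS feed
DECLARE @EarthquakeJson nvarchar(max) = N'{"type":"FeatureCollection","features":[
  {"type":"Feature","properties":{"mag":1.2,"place":"placeholder place A","time":1500000000000},
   "geometry":{"type":"Point","coordinates":[-110.5,44.6,5.0]}},
  {"type":"Feature","properties":{"mag":2.3,"place":"placeholder place B","time":1500000060000},
   "geometry":{"type":"Point","coordinates":[-149.9,61.2,10.0]}}]}';

INSERT INTO dbo.EarthquakeData (EventDate, Magnitude, Place, Coordinates)
SELECT
    -- "time" is milliseconds since 1970-01-01 UTC
    DATEADD(second, CAST(f.EventTimeMs / 1000 AS int), '19700101'),
    f.Magnitude,
    f.Place,
    -- Well-known text for a point is 'POINT(longitude latitude)'
    geography::STPointFromText(
        N'POINT(' + f.Longitude + N' ' + f.Latitude + N')', 4326)
FROM OPENJSON(@EarthquakeJson, '$.features')
WITH
(
    EventTimeMs bigint        '$.properties.time',
    Magnitude   decimal(4,2)  '$.properties.mag',
    Place       nvarchar(300) '$.properties.place',
    Longitude   nvarchar(30)  '$.geometry.coordinates[0]',
    Latitude    nvarchar(30)  '$.geometry.coordinates[1]'
) AS f;
```

Reading the coordinates as strings keeps them exactly as they appear in the JSON, which avoids any float-to-text formatting surprises when building the well-known text.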

What results is our earthquake data all nicely parsed out into our dbo.EarthquakeData table:

What about Yellowstone?

The above data includes earthquakes from around the world. Since we only want to examine earthquakes in Yellowstone, we’ll need to filter the data down.

There’s a handy Place column in the data that we could probably add a LIKE '%yellowstone%' filter to, but this is a post about spatial data in SQL, so we can do better!

The Wyoming State Geological Survey website has Shapefiles for the boundary of Yellowstone National Park.  Since we are practicing our GeoJSON import skills, I converted the Shapefiles to GeoJSON using an online converter and the resulting data looks like this:

You can download the full park boundary GeoJSON file here.

Just like before, we’ll use SQL Server’s OPENJSON() function to parse our GeoJSON data into a well-known text POLYGON.

First we create our table:

And then populate it, this time using the STPolyFromText spatial function:
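A sketch of both steps follows. The table name dbo.ParkBoundaries and STPolyFromText come from this post; the column names are assumptions, and the boundary is a crude placeholder rectangle around the park rather than the real boundary file. It assumes SQL Server 2017 or later for STRING_AGG.

```sql
-- Hypothetical table definition; only the table name is from the post
IF OBJECT_ID('dbo.ParkBoundaries') IS NULL
    CREATE TABLE dbo.ParkBoundaries
    (
        Id           int IDENTITY(1,1) PRIMARY KEY,
        ParkName     nvarchar(100),
        ParkBoundary geography
    );

-- Placeholder rectangle (counter-clockwise), NOT the real park boundary
DECLARE @ParkJson nvarchar(max) = N'{"type":"FeatureCollection","features":[
  {"type":"Feature","properties":{"name":"Yellowstone (placeholder rectangle)"},
   "geometry":{"type":"Polygon","coordinates":[[
     [-111.16,45.11],[-111.16,44.13],[-109.83,44.13],[-109.83,45.11],[-111.16,45.11]]]}}]}';

INSERT INTO dbo.ParkBoundaries (ParkName, ParkBoundary)
SELECT
    JSON_VALUE(f.[value], '$.properties.name'),
    geography::STPolyFromText(w.Wkt, 4326)
FROM OPENJSON(@ParkJson, '$.features') AS f
CROSS APPLY
(
    -- Build 'POLYGON((long lat, ...))' from the outer ring, in array order
    SELECT N'POLYGON(('
        + STRING_AGG(
              CAST(JSON_VALUE(p.[value], '$[0]') + N' ' + JSON_VALUE(p.[value], '$[1]') AS nvarchar(max)),
              N', ')
          WITHIN GROUP (ORDER BY CAST(p.[key] AS int))
        + N'))' AS Wkt
    FROM OPENJSON(f.[value], '$.geometry.coordinates[0]') AS p
) AS w;
```

This only handles a single-ring Polygon geometry. A converted Shapefile may well produce a MultiPolygon or a polygon with holes, which would need extra handling.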

Filtering our data

Now we have two tables: dbo.EarthquakeData and dbo.ParkBoundaries. What we want to do is select only the earthquake data points that fall within the boundaries of Yellowstone National Park.

This is easy to do using the STIntersects spatial function, which returns a “1” for any rows where one geography instance (our lat/long earthquake coordinate) intersects another geography instance (our park boundary):
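Here is a small, self-contained sketch of STIntersects in action. To keep it runnable on its own it uses a variable and a table variable instead of the two tables; the rectangle and both points are made-up placeholders.

```sql
-- Placeholder rectangle around the park (counter-clockwise), not the real boundary
DECLARE @Park geography = geography::STPolyFromText(
    'POLYGON((-111.16 45.11, -111.16 44.13, -109.83 44.13, -109.83 45.11, -111.16 45.11))',
    4326);

DECLARE @Quakes TABLE (Place nvarchar(300), Coordinates geography);

-- geography::Point takes (latitude, longitude, SRID)
INSERT INTO @Quakes (Place, Coordinates)
VALUES
    (N'placeholder point inside the rectangle',  geography::Point(44.6, -110.5, 4326)),
    (N'placeholder point outside the rectangle', geography::Point(61.2, -149.9, 4326));

-- Keep only the points that intersect the polygon
SELECT Place
FROM @Quakes
WHERE Coordinates.STIntersects(@Park) = 1;
```

Against the real tables the same predicate would go in a join condition between dbo.EarthquakeData and dbo.ParkBoundaries, comparing each earthquake’s geography column to the park’s boundary column (whatever those columns are named in your tables).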

The rest is up to you

So all it takes to import GeoJSON data into SQL Server is knowing how to use SQL Server’s JSON functions.

Once geographical data is imported into geography data types, SQL Server’s spatial functions offer lots of flexibility for how to efficiently slice and dice the data.
