Ignoring NULLs with FIRST_VALUE

Published on: 2018-08-28

Watch this week’s episode on YouTube.

The SQL Server FIRST_VALUE function makes it easy to return the “first value in an ordered set of values.”

The problem is that if that first value happens to be a NULL, there is no easy, built-in way to skip it.

While a UserVoice item exists to add the ability to ignore nulls (go vote!), today, we’re going accomplish that end result with some alternative queries.

The Setup

Here’s the example data we’ll be skipping nulls on:

We’ve got a an integer identity column, two groups of rows, and NULLs that are sprinkled into otherwise unsuspecting integer values.

If we write a query that uses the FIRST_VALUE function, you’ll notice that our NULL gets chosen in group two – not quite what we want:

Let’s look at two queries that will help us get the number 6 into that FirstValue1 column for the second group.

The Contenders

“The Derived FIRST_VALUE”

First up is still the FIRST_VALUE function, but inside of a derived table:

By filtering out NULLs in our derived table query, FIRST_VALUE returns the first non-null value like we want.  We then join that back to the original data and all is right again.

“The Triple Join”

Our second attempt at this query sends us back to the dark ages of SQL Server 2008 before the FIRST_VALUE function existed:

We perform a triple join, with the critical element being our derived table which gets the MIN Id for each group of rows where Value1 IS NOT NULL.  Once we have the minimum Id for each group, we join back in the original data and produce the same final result:

The Performance

Both of the above queries produce the same output – which one should you use in your production code?

Well, the “Derived FIRST_VALUE” query has a lower relative cost than the “Triple Join” query, maybe it’s better?

This isn’t a real-world execution plan though – surely we never scan heaps our production environments.

Let’s add a quick clustered index and see if that changes anything:

Okay, a closer match up but the “Derived FIRST_VALUE” query still appears to have a slight edge.

If we SET STATISTICS IO ON though we start to see a different story:

With only 8 rows of data, our “Derived FIRST_VALUE” query sure is performing a lot of reads.

What if we increase the size of our sample dataset?

And now check our plans and stats IO:

WOW that’s a lot of reads in the “Derived FIRST_VALUE” query.

Conclusion

Besides sharing some solutions, the point I tried to make above is that DON’T TRUST CODE YOU FIND ON THE INTERNET (or in books, or copied from colleagues, etc…)

Both of the above queries will return the first value without NULLs.  But they probably won’t perform exactly the same as they did on my examples above.

Copy the above code for sure – but test it out. See what works better on your specific server configuration, data size, and indexes.  Maybe both queries are terrible and you need a third, better way of doing it (if you write one, let me know!) – but please, please, please, always test your code.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

5 thoughts on “Ignoring NULLs with FIRST_VALUE”

  1. Hi Bert,

    I ran in to these type of questions before and solved it in the FIRST_VALUE function itself by adding an additional ORDE BY condition which is sorting the NULL values to the bottom.

    DROP TABLE IF EXISTS ##DATA
    GO
    CREATE TABLE ##Data
    (
    Id int IDENTITY(0,1),
    GroupId int,
    Value1 int
    );
    GO
    INSERT INTO ##Data VALUES (1,1)
    INSERT INTO ##Data VALUES (1,1)
    INSERT INTO ##Data VALUES (1,3)
    INSERT INTO ##Data VALUES (2,NULL)
    INSERT INTO ##Data VALUES (2,NULL)
    INSERT INTO ##Data VALUES (2,6)
    INSERT INTO ##Data VALUES (2,4)
    INSERT INTO ##Data VALUES (2,5);
    GO

    SELECT
    Id,
    GroupId,
    Value1,
    FIRST_VALUE(Value1) OVER (PARTITION BY GroupId ORDER BY IIF(VALUE1 IS NULL, 1, 0) ASC, ID ASC) AS FirstValue1
    FROM
    ##Data
    ORDER BY ID

    This is working really well.

    Best regards,

    Harry

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.