JSON Support Is The Best New Developer Feature in SQL 2016 — Part 4: Performance Comparisons

Published on: 2017-02-21

This is the fourth article in my series about learning how to use SQL Server 2016’s new JSON functions. If you haven’t already, you can read Part 1 — Parsing JSON, Part 2 — Creating JSON, and Part 3 — Updating, Adding, and Deleting JSON.


Additional performance comparisons available in an updated post.

We’ve finally come to my favorite part of analyzing any new software feature: performance testing. SQL Server 2016’s new JSON functions are great for parsing JSON data, creating JSON data, and modifying JSON data, but are they efficient?

Today we’ll examine three areas of SQL Server JSON performance:

  1. How to maximize performance for the new SQL Server JSON functions
  2. How the new SQL Server JSON functions compare against what was previously available in SQL Server
  3. How the new SQL JSON functions compare against Newtonsoft’s Json.NET

Maximizing SQL Server JSON Function Performance

I wanted to use a sizable data set in order to test the performance of the new JSON functions in SQL Server 2016. I found arthurkao’s car year/make/model data on GitHub and decided this ~20k element JSON array would be perfect for performance testing purposes. For my tests I’ll be using both the original JSON string as well as a SQL table that I created from the original JSON array:

Unlike XML in SQL Server (which is stored in it’s own datatype), JSON in SQL Server 2016 is stored as an NVARCHAR. This means instead of needing to use special indexes, we can use indexes that we are already familiar with.

To maximize performance, we can use Microsoft’s recommendation of adding a computed column for one of the JSON properties and then indexing that computed column:

Non-persisted computed columns (like in the example above) do not take up any additional space in the table. You can verify this for yourself by running sp_spaceused 'dbo.Cars' before and after adding the non-persisted column to the table.

Having a computed column doesn’t add any performance to our query on its own but it does now allow us to add an index to our parsed/computed JSON property.

Having the computed column doesn’t improve performance — we are still seeing a Table Scan

The clustered index that we add next stores pointers to each parsed/computed value causing the table not to take up any space and only causes the SQL engine to recompute the columns when the index needs to be rebuilt:

And the resulting execution plan now shows both queries (the one using JSON_VALUE() in the WHERE clause directly as well the one calling our computed column) using index seeks to find the data we are looking for:

Yay index seeks!

Overall, adding computed columns to our table adds no overhead in terms of storage space and allows us to then add indexes on JSON properties which improve performance significantly.

SQL Server 2016 JSON vs SQL Server pre-2016 JSON

As I’ve mentioned before, the best option for processing JSON data in SQL Server before 2016 was by using Phil Factor’s amazing JSON parsing function. Although the function works well, it is limited by what SQL Server functionality was available at the time and therefore wasn’t all that efficient.

The above query should work for getting the data we need. I’m abusing what the parseJSON function was probably built to do (I don’t think it was intended to parse ~20k element JSON arrays), and I’ll be honest I waited 10 minutes before killing the query. Basically, trying to parse this much data in SQL before 2016 just wasn’t possible (unless you wrote CLR).

Compared to the following queries which is using our indexed computed column SQL Server 2016 is able to return all of the results to us in 1 ms:

SQL Server 2016 JSON vs Newtonsoft’s Json.NET

In cases like the above where parsing JSON in SQL Server was never an option, my preferred method has always been to parse data in C#. In particular, Newtonsoft’s Json.NET is the standard for high performance JSON parsing, so let’s take a look at how SQL Server 2016 compares to that.

The following code shows 6 tests I ran in SQL Server 2016:

And then the same tests in a C# console app using Json.Net:

And the results compared side by side:

Essentially, it seems like Json.Net beats SQL Server 2016 on larger JSON manipulations, both are equal with small JSON objects, and SQL Server 2016 has the advantage at filtering JSON data when indexes are used.

Conclusion

SQL Server 2016 is excellent at working with JSON. Even though Json.NET beats SQL Server 2016 at working with large JSON objects (on the magnitude of milliseconds), SQL Server is equally fast on smaller objects and is advantageous when JSON data needs to be filtered or searched.

I look forward to using the SQL Server 2016 JSON functions more in the future, especially in instances where network I/O benefits me to process JSON on the SQL Server or when working with applications that cannot process JSON data, like SQL Server Reporting Services.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

How SQL 2016’s JSON Functionality Will Make You Want To Use JSON

Published on: 2017-02-14

This post is a response to this month’s T-SQL Tuesday prompt. T-SQL Tuesday was created by Adam Machanic and is a way for SQL users to share ideas about interesting topics. This month’s topic is Fixing Old Problems with Shiny New Toys.


Why I love JSON

I know that lots of users in the SQL community cringe when they hear about JSON, NoSQL, and other non-relational database storage options. I get it. Lots of times a string storage format compromises the many benefits SQL has to offer: indexing, data integrity, simple querying, oversight into schema changes by developers, etc… I’ve seen (and written) some queries that perform like molasses in winter solely due them being in a non-sargable string format.

Yet, with all of those downsides I still love JSON. I think there’s a happy middle ground where DBAs and developers can agree on when it is appropriate to use JSON. Would I ever want to use JSON in a data warehouse that is being heavily queried for analytics? No. Would I also want to use JSON where performance is extremely important? No. Would I want to use it in an application where data or schema integrity is extremely important? No.

However, I love using JSON when capturing dynamically structured user input, especially if the database is being used only to persist the data instead of analyze it. I love having JSON in the database because it so easily serializes/deserializes between JSON and my C# models. If I want low latency to my webapp, I also might write the JSON from the app to the database and then have it get queued to transform into normalized data by an ETL later on. These reason in particular are what make me incredibly excited for the new JSON functionality in SQL Server 2016.

Okay, I guess there are some valid uses for JSON. What are my options for using JSON in SQL Server?

Before SQL Server 2016 was released there was no support for JSON in SQL Server. None. Zilch. Zero. The best thing we had was Phil Factor’s amazing JSON parsing user defined function or writing your own CLR.

Phil’s function is truly amazing and it cleverly uses some built-in SQL functionality to deserialize almost any type of JSON string you can throw at it. It does however have downsides — the function has a few edge case bugs (try adding a space before any “:” in your JSON), it does not allow for easy querying of the JSON, and the function needs to be added to any database you want to use it on.

I’m not trying to downplay Phil’s function — I love it and use it in a lot of my code — but the fact of the matter is that it is a very clever workaround for something that Microsoft wasn’t supporting at the time.

So what did JSON parsing look like in the pre-2016 world? Let’s take a look at some example data I generated using http://www.json-generator.com/:

Basically we have a JSON string that represents five users and some of their attributes (note: I would argue that storing this much data about an individual would be better suited to fit into some structured tables, but alas this was the type of data that was easy to generate for example purposes).

If I wanted to parse out each user’s name to use for an infrequently ran report or for ETL purposes, this is what my query would have to look like:

There are a couple funny things we have to do in pre-2016 SQL to parse this JSON:

  1. We have to add the parseJson() function to our database
  2. Since there are multiple key’s named “NAME”, we need to add a LAG() function to help identify the property’s key so we can ensure we have the correct one.
  3. Nested queries required since windowed functions can’t run in the WHERE clause.

Overall this code works and it gets the job done. However we need to ensure our server has the parseJson( function added and filtering our JSON data to pull out the properties we are interested in takes a little bit of work. Additionally, the parseJson() function does do some heavy processing which affects our overall performance:

The Shiny New Toys in SQL Server 2016: JSON Functions

SQL Server 2016 includes many new JSON parsing functions that are available for us to use. Writing a query to return the same result set as above is now as simple as this:

The OPENJSON() function allows us to write an XPath query to filter the JSON Users object into five separate rows of data, one per user:

Then, we use the JSON_VALUE() function and XPath once again to filter out just the “name” property. Overall, this code is much simpler to write, performs more consistently (the issue with a space character before a “:” is handled correctly), and performs much quicker:

I love the syntax of the new SQL JSON parsing functions: it’s easy to remember, easy to use in real life scenarios, and is very fast. This is one new feature of SQL Server 2016 that definitely makes my life easier and makes any other ways of parsing out JSON data obsolete.


If you are interested in learning more about JSON in SQL server, I’ve been blogging the past few weeks about how to use all of SQL’s new JSON functions.

There’s also one bug that I found with how SQL handles JSON deletes that I submitted to Microsoft Connect if you want to upvote and see JSON continue to evolve on SQL Server.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

JSON Support Is The Best New Developer Feature in SQL 2016 — Part 3: Updating, Adding, and Deleting…

Published on: 2017-02-07

This is the third article in my series about learning how to use SQL Server 2016’s new JSON functions. If you haven’t already, you can read Part 1 — Parsing JSON and Part 2 — Creating JSON.


So far we’ve looked at how to parse existing JSON objects and how to create new JSON objects. Today I want to look at the easy ways to modify JSON objects as well as the (mostly) easy ways to delete elements from a JSON object.

SQL Server 2016 offers us the new JSON_MODIFY() function for updating existing JSON strings. It’s pretty simple to use for replacing existing values in a JSON string:

The JSON_MODIFY() function works similar to a SQL REPLACE() function: the first argument specifies what data we are modifying, the second argument selects which property we are going to replace via XPath syntax, and the third argument specifies what we are replacing the value with. Pretty easy!

Replacing values with JSON is simple. Adding new values into existing JSON is also fairly simple using JSON_MODIFY():

Replacing data and adding new data is pretty easy with JSON_MODIFY(). This new SQL Server 2016 function ain’t no two trick pony though — it allows deletion of data as well!

Deleting properties in a JSON object is fairly straightforward: all you have to do is run the function with the same arguments as our modification example, except this time passing in NULL as our replacement value:

Up to this point, JSON_MODIFY() has worked great for modifying, adding to, and deleting properties from our JSON data. However, there is one serious flaw with JSON_MODIFY() and that’s deleting JSON array values — instead of deleting the value from the array and then shifting the rest of the array over, it simply replaces the array value with a NULL:

I hate that solution. FOR XML PATH() to rebuild or JSON array? Ugly, ugly, ugly. I have been impressed with all of the new JSON functionality in SQL Server 2016 except for the deletion of array elements.

Deleting properties should be the same as deleting array elements with JSON_MODIFY(): the property and array element should be completely removed from the JSON object, not just replaced with a NULL. I opened a Microsoft Connect issue for this bug here, please vote for it if you want to see this problem fixed as well: https://connect.microsoft.com/SQLServer/feedback/details/3120404

Microsoft’s been pretty good about fixing bugs lately, so let’s hope this gets fixed in SQL Server vNext!

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!