When Is It Appropriate To Store JSON in SQL Server?

Published on: 2017-03-14

Who needs a relational database when everything can be stored in a JSON string?

Every once in a while I hear of some technologist say that relational databases are dead; instead, a non-table based NoSQL storage format is the way of the future. SQL Server 2016 introduced JSON functionality, making it possible for some “non-SQL” data storage to make its way into the traditionally tabled-based SQL Server.

Does this mean all data in SQL Server going forward should be stored in long JSON strings? No, that would be a terrible idea. There are instances when storing JSON in SQL Server is a good choice though. In this post I want to create recommendations for when data should be stored as JSON and when it shouldn’t.

Databases Should Not Be Entirely Comprised Of JSON

The screenshot below is an example of what I think some developers would do if they were given free reign in SQL Server 2016:

Here we have an application database (“InventoryApp”) that consists of only a single table (“dbo.Data”) with three JSON NVARCHAR(MAX) columns to represent all of the data required by the app. Relationships exist between Sales, Purchases, and Customers but these are not defined on the database side.

If you are from the world of relational-SQL, you might not believe that anyone would design such a database structure. Believe me though, this is a realistic scenario. Entire companies (eg. Firebase: https://firebase.google.com/) build their services around abstracting the database layer away from developers, essentially storing entire tables or databases in large JSON strings.

Many developers like storing data this way because it is easy to deserialize JSON strings into objects in their programming languages to use in their apps. They like the fact that with JSON they can have an infinitely changing storage schema (just add new keys, values, and arrays!) so if they need a new field for their app, they can just add it in, serialize the object to a JSON string, and store it again in the database.

Obviously, going completely “NoSQL” might make short term development easier/quicker, but using SQL Server 2016 to only store data this way is a travesty: there’s no way to use many of SQL Server’s amazing performance, schema definition and validation, and security features.

So when is it appropriate to store JSON in SQL Server?

Appropriate Use Case #1: Error Logging

Errors happen. When they do, it’s nice to be able to go back and look at the error message to see what happened.

The problem is that the structure of error messages isn’t always consistent. Sometimes only the value of a single property will help identify the cause of failure. Other times, something more complex fails and it would be nice to have all of the values of a complex object available to make troubleshooting easier.

This is where JSON steps in: in most programming languages, it is easy to convert error messages and run time values to a JSON object on error. And since error messages and data values change in structure depending on where they occur, it’s easy to dynamically turn any type of object into JSON data.

This data is perfect to store in SQL to be looked at later. None of these ideas are new — nvarchar(max) has been in SQL for a while now, and so programmers everywhere have been storing error information in that datatype.

With SQL Server 2016, it is now easier to examine and parse the error information directly in SQL Server Management Studio with the variety of JSON parsing functions available. No longer do programmers have to copy the code into some different tool — they can easily do it in SSMS.

Appropriate Use Case #2: Piloting Ideas

Most large workplaces have controls in place that prevent developers from making changes in production. In general this is a Good Idea™.

However, controls are sometimes too restrictive. For example, due to security restrictions, lack of server space, company politics, etc… developers are sometimes stuck developing in production. It’s an unfortunate fact of life. In those scenarios, developers have to go through hell if they have to elevate each database structure change every time they want to test something in production.

JSON to the rescue! An nvarchar(max) column in a table can have its JSON data be easily added to and modified to fit more data than it was originally intended to hold. All without any database structure change requests.

Now this is not an ideal situation. In fact, it’s a scenario that can add a lot of technical debt to the application long-term if not planned for.

However, if a “flexible” JSON column is built with eventual conversion to a traditional table structure in mind from the start, it’s actually simple for a developer to transition an entirely JSON storage structure to a relational format later on. They key here is that the developer needs to have this conversion planned from day one.

Appropriate Use Case #3: Non-Analytical Data

Analytical data is SQL Server’s bread and butter. Need to store lots of data and be able to query against it all day long? No problem, there are a plethora of performance tuning options to make your queries run fast and efficiently.

However, sometimes not all data needs to be analyzed. Often an app might need to save some session data to a database temporarily — why bother creating all of the maintenance overhead of strict database schemas if the data will never be queried for analytical purposes? Another example might be a website’s dynamically created user profile settings. You can build normalized table(s) to store all of that data, but then you will be writing programming logic to normalize and denormalize your data out of the app.

If this data will not have to be searched, then why bother adding all of the overhead? Keep it in JSON and be done with.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

The Ultimate SQL Server JSON Cheat Sheet

Published on: 2017-03-07

Examples for handling JSON data in SQL Server 2016+

This post is a reference of my examples for processing JSON data in SQL Server. For more detailed explanations of these functions, please see my post series on JSON in SQL Server 2016:

Additionally, the complete reference for SQL JSON handling can be found at MSDN: https://msdn.microsoft.com/en-us/library/dn921897.aspx


Parsing JSON

Getting string JSON data into a SQL readable form.

ISJSON()

Checks to see if the input string is valid JSON.

JSON_VALUE()

Extracts a specific scalar string value from a JSON string using JSON path expressions.

Strict vs. Lax mode

If the JSON path cannot be found, determines if the function should return a NULL or an error message.

JSON_QUERY()

Returns a JSON fragment for the specified JSON path.

This is useful to help filter an array and then extract values with JSON_VALUE():

OPEN_JSON()

Returns a SQL result set for the specified JSON path. The result set includes columns identifying the datatypes of the parsed data.

Creating JSON

Creating JSON data from either strings or result sets.

FOR JSON AUTO

Automatically creates a JSON string from a SELECT statement. Quick and dirty.

FOR JSON PATH

Formats a SQL query into a JSON string, allowing the user to define structure and formatting.

Modifying JSON

Updating, adding to, and deleting from JSON data.

JSON_MODIFY()

Allows the user to update properties and values, add properties and values, and delete properties and values (the delete is unintuitive, see below).

Modify:

Add:

Delete property:

Delete from array (this is not intuitive, see my Microsoft Connect item to fix this: https://connect.microsoft.com/SQLServer/feedback/details/3120404/sql-modify-json-null-delete-is-not-consistent-between-properties-and-arrays)

SQL JSON Performance Tuning

SQL JSON functions are already fast. Adding computed columns and indexes makes them extremely fast.

Computed Column JSON Indexes

JSON indexes are simply regular indexes on computed columns.

Add a computed column:

Add an index to our computed column:

Performance test:

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Book Notes: The Art of Readable Code by Dustin Boswell

Published on: 2017-02-28

I highlight and take notes when I read nonfiction books. Once I finish a book, I format and edit my notes so that I can easily remind myself of what I learned without having to reread the book. These notes are not a substitute for reading the book, they only serve as a reminder of key concepts.


The fewer lines a variable is in scope, the shorter its name is allowed to be. Variables used in larger scopes must be less ambiguous so they are allowed to have longer names.

Don’t use project specific abbreviations — someone new to the project won’t know what they mean. “So our rule of thumb is: would a new teammate understand what the name means? If so, then it’s probably okay.”

“The clearest way to name a limit is to put max_ or min_ in front of the thing being limited.”

“In general, adding words like is, has, can, or should can make booleans more clear.”

“The best names are ones that can’t be misconstrued — the person reading your code will understand it the way you meant it, and no other way.”

“Everyone prefers to read code that’s aesthetically pleasing. By “formatting” your code in a consistent, meaningful way, you make it easier and faster to read.”

“The purpose of commenting is to help the reader know as much as the writer did.”

More comments aren’t necessarily better — the comments take up valuable screen real estate, so if there is a comment in the code “it better be worth it”.

“Don’t Comment Bad Names — Fix the Names Instead”

“Instead of minimizing the number of lines, a better metric is to minimize the time needed for someone to understand it.”

“The simplest way to break down an expression is to introduce an extra variable that captures a smaller subexpression. This extra variable is sometimes called an “explaining variable” because it helps explain what the subexpression means.”

“One technique [to clean up code] is to see if you can solve the problem the “opposite” way. Depending on the situation you’re in, this could mean iterating through arrays in reverse or filling in some data structure backward rather than forward.”

In general, keep scopes short and finish tasks as quickly as possible.

“The more places a variable is manipulated, the harder it is to reason about its current value.”

“The advice for this chapter is to aggressively identify and extract unrelated subproblems. Here’s what we mean: Look at a given function or block of code, and ask yourself, “What is the high-level goal of this code?” For each line of code, ask, “Is it working directly to that goal? Or is it solving an unrelated subproblem needed to meet it?” If enough lines are solving an unrelated subproblem, extract that code into a separate function.”

“Code that does multiple things at once is harder to understand. A single block of code might be initializing new objects, cleansing data, parsing inputs, and applying business logic, all at the same time.”

Break code up into small, easy to understand fragments. This makes understanding easier and reduces potential for bugs.

“do only one task at a time.”

“When explaining a complex idea to someone, it’s easy to confuse them with all the little details. It’s a valuable skill to be able to explain an idea “in plain English,” so that someone less knowledgeable than you can understand. It requires distilling an idea down to the most important concepts. Doing this not only helps the other person understand but also helps you think about your own ideas more clearly.”

“This chapter discussed the simple technique of describing your program in plain English and using that description to help you write more natural code. This technique is deceptively simple, but very powerful.”

“Knowing when not to code is possibly the most important skill a programmer can learn. Every line of code you write is a line that has to be tested and maintained. By reusing libraries or eliminating features, you can save time and keep your codebase lean and mean.”

“When you start a project, it’s natural to get excited and think of all the cool features you’ll want to implement. But programmers tend to overestimate how many features are truly essential to their project. A lot of features go unfinished or unused or just complicate the application.”

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!