As a developer my favorite new feature of SQL Server 2016 is JSON support.
I love JSON in SQL because I already love JSON everywhere outside of SQL: it uses much less space than XML for serializing data, it's what most apps are now using for API communication, and when web developing I love that it is already valid JavaScript (no need to deserialize!).
I had this same type of excitement for XML in SQL Server, but after using it the excitement quickly turned into disappointment: having to constantly use the XML datatype was inconvenient (when most XML data I used was already stored in nvarchar(max) columns) and I never found the syntax of OPENXML() to be that intuitive.
Everything I've done with JSON in SQL Server 2016 so far has been great. I've already been storing persistent JSON in SQL, so being able to manipulate JSON within SQL is even better. In this series of posts I will go over the various functionalities of using JSON in SQL Server 2016:
- Part 1 — Parsing JSON
- Part 2 — Creating JSON
- Part 3 — Updating, Adding, and Deleting JSON
- Part 4 — JSON Performance Comparison
Part 1 — Parsing JSON
What is JSON?
JavaScript Object Notation (JSON) " is a lightweight data-interchange format."
My simple, mostly caveat-free* explanation is that it is a format for storing object data in JavaScript. It's lightweight and easy to read, so it's used in lots of applications that aren't just JavaScript (although it's especially easy to consume in JavaScript because it is JavaScript*).
*Caveats? See http://stackoverflow.com/a/383699
So what's JSON look like? The JSON below represents the current inventory of cars in my garage. It shows I have two cars as well as some of their attributes:
{
"Cars": [{
"Make": "Volkswagen",
"Model": {
"Base": "Golf",
"Trim": "GL"
},
"Year": 2003,
"PurchaseDate": "2006-10-05T00:00:00.000Z"
}, {
"Make": "Subaru",
"Model": {
"Base": "Impreza",
"Trim": "Premium"
},
"Year": 2016,
"PurchaseDate": "2015-08-18T00:00:00.000Z"
}]
}
Strict versus Lax mode
For any of the SQL JSON functions (OPENJSON()
, JSON_VALUE()
, JSON_QUERY()
, andJSON_MODIFY()
) you can specify whether invalid JSON paths will return NULL
or an error. The default value is lax
, which will return a NULL
for non-existing JSON paths, whereas strict
will return an error message.
-- Lax (default: function will return an error if invalid JSON path specified
SELECT JSON_VALUE('{ "Color" : "Red" }', '$.Shape') --lax is the default, so you don't need to be explicitly state it
-- Output: NULL
SELECT JSON_VALUE('{ "Color" : "Red" }', 'lax $.Shape')
-- Output: NULL
-- Strict: function will return an error if invalid JSON path specified
SELECT JSON_VALUE('{ "Color" : "Red" }', 'strict $.Shape')
-- Output: Property cannot be found on the specified JSON path.
The lax
modifier is helpful when writing queries that check to see if values exist in a JSON object while the strict
modifier works great for error checking and and validation.
ISJSON()
A simple function for verifying whether an inputted string is valid JSON. This is great to use in order to validate JSON formatting before running any of remaining functions in this post.
SELECT ISJSON('{ "Color" : "Blue" }') -- Returns 1, valid
-- Output: 1
SELECT ISJSON('{ "Color" : Blue }') -- Returns 0, invalid, missing quotes
-- Output: 0
SELECT ISJSON('{ "Number" : 1 }') -- Returns 1, valid, numbers are allowed
-- Output: 1
SELECT ISJSON('{ "PurchaseDate" : "2015-08-18T00:00:00.000Z" }') -- Returns 1, valid, dates are just strings in ISO 8601 date format https://en.wikipedia.org/wiki/ISO_8601
-- Output: 1
SELECT ISJSON('{ "PurchaseDate" : 2015-08-18 }') -- Returns 0, invalid
-- Output: 0
JSON_VALUE()
Extracts a scalar value from a JSON string. This function needs to be able to parse the value, so it will not parse out complex objects like arrays.
-- See https://gist.github.com/bertwagner/356bf47732b9e35d2156daa943e049e9 for a formatted version of this JSON
DECLARE @garage nvarchar(1000) = N'{ "Cars": [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }] }'
SELECT JSON_VALUE(@garage, '$.Cars[0].Make') -- Return the make of the first car in our array
-- Output: Volkswagen
SELECT CAST(JSON_VALUE(@garage, '$.Cars[0].PurchaseDate') as datetime2) -- Return the Purchase Date of the first car in our array and convert it into a DateTime2 datatype
-- Output: 2006-10-05 00:00:00.0000000
SELECT JSON_VALUE(@garage, '$.Cars') -- This returns NULL because the values of Cars is an array instead of a simple object
-- Output: NULL
SELECT JSON_VALUE(@garage, '$.Cars[1].Model') -- This is also invalid because JSON_VALUE cannot return an array...only scalar values allowed!
-- Output: NULL
SELECT JSON_VALUE(@garage, '$.Cars[1].Model.Base') -- Much better
-- Output: Impreza
JSON_VALUE()
is great for accessing operational data that might be using JSON to store multiple attributes for a single entry.
JSON_QUERY()
JSON_QUERY()
is meant to work for all of the datatypes that JSON_VALUE()
doesn't know how to return: basically JSON_QUERY()
returns JSON string representations of complex JSON objects like arrays.
-- See https://gist.github.com/bertwagner/356bf47732b9e35d2156daa943e049e9 for a formatted version of this JSON
DECLARE @garage nvarchar(1000) = N'{ "Cars": [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }] }'
-- This returns NULL because the values of Cars is an array instead of a simple object
SELECT JSON_VALUE(@garage, '$.Cars')
-- Output: NULL
-- Using JSON_QUERY() however returns the JSON string representation of our array object
SELECT JSON_QUERY(@garage, '$.Cars')
-- Output: [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }]
-- This instance of JSON_VALUE() correctly returns a singular scalar value
SELECT JSON_VALUE(@garage, '$.Cars[0].Make')
-- Output: Volkswagen
-- Using JSON_QUERY will not work for returning scalar values - it only will return JSON strings for complex objects
SELECT JSON_QUERY(@garage, '$.Cars[0].Make')
-- Output: NULL
It's possible to use JSON_QUERY()
along with JSON_VALUE()
to essentially extract any type of data from JSON, whether it's a simple or complex object datatype.
-- See https://gist.github.com/bertwagner/356bf47732b9e35d2156daa943e049e9 for a formatted version of this JSON
DECLARE @garage nvarchar(1000) = N'{ "Cars": [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }] }'
-- We use JSON_QUERY to get the JSON representation of the Cars array
SELECT JSON_QUERY(@garage, '$.Cars')
-- Output: [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }]
-- If we combine it with JSON_VALUE we can then pull out specific scalar values
SELECT JSON_VALUE(JSON_QUERY(@garage, '$.Cars') , '$[0].Make')
-- Output: Volkswagen
OPENJSON()
While JSON_VALUE()
extracts singular scalar values and JSON_QUERY()
extracts singular JSON strings, the OPENJSON()
function extracts result sets from a JSON string. In addition to the extracted value, OPENJSON()
outputs the order of JSON objects as well as their datatypes. OPENJSON()
will also output string representations of JSON arrays instead of just displaying NULL
, similar to JSON_QUERY()
.
-- See https://gist.github.com/bertwagner/356bf47732b9e35d2156daa943e049e9 for a formatted version of this JSON
DECLARE @garage nvarchar(1000) = N'{ "Cars": [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }] }'
SELECT * FROM OPENJSON(@garage, '$.Cars') -- Displaying the values of our "Cars" array. We additionally get the order of the JSON objects outputted in the "key" column and the JSON object datatype in the "type" column
/* Output:
key value type
------ ------------------------------------------------------------------------------------------------------------------------------------ ----
0 { "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" } 5
1 { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" } 5
*/
SELECT * FROM OPENJSON(@garage, '$.Cars[0]') -- Specifying the first element in our JSON array. JSON arrays are zero-index based
/* Output:
key value type
---------------- ------------------------------------- ----
Make Volkswagen 1
Model { "Base": "Golf", "Trim": "GL" } 5
Year 2003 2
PurchaseDate 2006-10-05T00:00:00.000Z 1
*/
SELECT * FROM OPENJSON(@garage, '$.Cars[0].Model') -- Pulling the Model property from the first element in our Cars array
/* Output:
key value type
------- ------- ----
Base Golf 1
Trim GL 1
*/
The flexibility of OPENJSON()
makes it possible to extract any values from JSON data, especially when combining OPENJSON()
with JSON_VALUE()
. The examples below show how to parse out a scalar value from complex JSON objects (like arrays). Note that using the WITH
option gives us a lot more flexibility with how we can format our output result.
-- See https://gist.github.com/bertwagner/356bf47732b9e35d2156daa943e049e9 for a formatted version of this JSON
DECLARE @garage nvarchar(1000) = N'{ "Cars": [{ "Make": "Volkswagen", "Model": { "Base": "Golf", "Trim": "GL" }, "Year": 2003, "PurchaseDate": "2006-10-05T00:00:00.000Z" }, { "Make": "Subaru", "Model": { "Base": "Impreza", "Trim": "Premium" }, "Year": 2016, "PurchaseDate": "2015-08-18T00:00:00.000Z" }] }'
-- Here we retrieve the Make of each vehicle in our Cars array
SELECT JSON_VALUE(value, '$.Make') FROM OPENJSON(@garage, '$.Cars')
/* Output:
------------
Volkswagen
Subaru
*/
-- Parsing and converting some JSON dates to SQL DateTime2
SELECT CAST(JSON_VALUE(value, '$.PurchaseDate') as datetime2) FROM OPENJSON(@garage, '$.Cars')
/* Output:
---------------------------
2006-10-05 00:00:00.0000000
2015-08-18 00:00:00.0000000
*/
-- We can also format the output schema of a JSON string using the WITH option. This is especially cool because we can bring up values from sub-arrays (see Model.Base and Model.Trim) to our top-level row result
SELECT * FROM OPENJSON(@garage, '$.Cars')
WITH (Make varchar(20) 'strict $.Make',
ModelBase nvarchar(100) '$.Model.Base',
ModelTrim nvarchar(100) '$.Model.Trim',
Year int '$.Year',
PurchaseDate datetime2 '$.PurchaseDate')
/* Output:
Make ModelBase Year PurchaseDate
-------------- ----------- ----------- ---------------------------
Volkswagen Golf 2003 2006-10-05 00:00:00.0000000
Subaru Impreza 2016 2015-08-18 00:00:00.0000000
*/
These JSON functions should help you parse any JSON data you encounter in SQL server (as long as it's valid and stored as nvarchar). Stay tuned over the next few weeks as we look at other SQL JSON functions that will help us create data, modify data, and compare SQL's JSON performance to other languages.