Extracting JSON Values Longer Than 4000 Characters

Published on: 2018-09-18

A while back I built an automated process that parses JSON strings into a relational format.

Up until recently this process had been working great: my output table had all of the data I was expecting, neatly parsed into the correct rows and columns.

Last week I noticed an error in the output table however.  One row that was supposed to have a nicely parsed JSON value for a particular column had an ugly NULL instead.

Truncated?

First I checked my source JSON string – it had the “FiveThousandAs” property I was looking for:

So the source data was fine.

I checked the table column I was inserting into as well and confirmed it was defined as nvarchar(max), so no problem there.

The last thing I checked was the query I was using:

If I run that on it’s own, I reproduce the NULL I was seeing inserted into my table:

JSON_VALUE is limiting

After a little bit more research, I discovered that the return type for JSON_VALUE is limited to 4000 characters.   Since JSON_VALUE is in lax mode by default, if the output has more than 4000 characters, it fails silently.

To force an error in future code I could use SELECT JSON_VALUE(@json, 'strict $.FiveThousandAs')  so at least I would be notified immediately of an problem with my  query/data (via failure).

Although strict mode will notify me of issues sooner, it still doesn’t help me extract all of the data from my JSON property.

(Side note: I couldn’t define my nvarchar(max) column as NOT NULL because for some rows the value could be NULL, but in the future I might consider adding additional database validation with a check constraint).

OPENJSON

The solution to reading the entire 5000 character value from my JSON property is to use OPENJSON:

My insert query needed to be slightly refactored, but now I’m able to return any length value (as long as it’s under 2gb).

In hindsight, I should have used OPENJSON() from the start: not only is it capable of parsing the full length values from JSON strings, but it performs significantly faster than any of the other SQL Server JSON functions.

As a best practice, I think I’m going to use OPENJSON by default for any JSON queries to avoid problems like this in the future.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Why Is My VARCHAR(MAX) Variable Getting Truncated?

Published on: 2018-05-15

Watch this week’s episode on YouTube.

Sometimes SQL Server doesn’t do what you tell it to do.

Normally that’s ok – SQL is a declarative language after all, so we’re supposed to tell it what we want it to do, not how we want it done.

And while that’s fine for most querying needs, it can become really frustrating when SQL Server decides to completely disregard what you explicitly asked it to do.

Why Is My VARCHAR(MAX) Truncated to 8000 Characters?

A prime example of this is when you declare a variable as VARCHAR(MAX) because you want to assign a long string to it.  Storing values longer than 8000 characters long is the whole point of VARCHAR(MAX), right?

If we look at the above query, I would expect my variable @dynamicQuery to be 8001 characters long; it should be 8000 letter ‘a’s followed by a single letter ‘b’.  8001 characters total, stored in a VARCHAR(MAX) defined variable.

But does SQL Server actually store all 8001 characters like we explicitly asked it to?

No:

First we can see that the LEN() of our variable is only 8000 – not 8001 – characters long!

Copying and pasting our resulting value into a new query window also shows us that there is no character ‘b’ at position 8001 like we expected.

The Miserly SQL Server

The reason this happens is that SQL Server doesn’t want to store something as VARCHAR(MAX) if none of the variable’s components are defined as VARCHAR(MAX).  I guess it doesn’t want to store something in a less efficient way if there’s no need for it.

However, this logic is flawed since we clearly DO want to store more than 8000 characters.  So what can we do?

Make Something VARCHAR(MAX)

Seriously, that’s it.  You can do something like CAST the single character ‘b’ as VARCHAR(MAX) and your @dynamicQuery variable will now contain 8001 characters:

But casting a single character as VARCHAR(MAX) isn’t very intuitive.

Instead, I recommend casting a blank as VARCHAR(MAX) and prefixing it to the start of your variable string.  Leave yourself a comment for the future and hopefully you’ll remember why this superfluous looking piece of code is needed:

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!