Data with Bert logo

The 5 Scariest Moments for a SQL Server Developer

Watch this week's video on YouTube

While families and friends are scaring each other this Halloween week with stories of ghosts and ghouls, I thought it'd be way scarier to talk about truncate tables and source control.

Accidental Data Deletion

You've spent all morning loading millions of records into your tables.  Some of that data was painstakingly loaded by hand from flat files, and some lookup values you even authored manually because "you'll only have to do this once."

You are jumping around between SSMS windows, running this query here and that query there, pressing F5 like there's no tomorrow. And then right before you press to F5 to finish all of your work...

...you realize that you ran the whole tab worth of code instead of the one statement you meant to highlight.  "Commands completed successfully" flashes in your Message window, and in horror you discover you accidentally ran a truncate table statement!

Ok so this is pretty bad.

You know backups are being taken but…can the DBA restore the most recent data you spent all morning working on?  You tell her about your misfortune ("...yeah one of my coworkers accidentally deleted the data…") and hold your breath.

After a few minutes she gets back to you saying everything is back up and working.  Saved by good backup practices...this time!

Problems With code You Inherited

You receive an instant message from your DBA.

DBA: "Hey , how are you doing?

"You: "Good how are you?"

DBA: "Look, the instance XYZ has been at 100% CPU utilization for the past hour.  I see you have that has been running that whole time….and it's using a scalar function.

You: "…"

You: "Sorry.  I am just running this job for Bill since he left a few weeks ago.  Obviously there are some problems with the way he was writing some of this queries and I haven't had a chance to look them over yet."

DBA: "That's ok.  I know you would never write a scalar function that totally destroys the server.  Anyway, I was able to easily rewrite the query into a table-function that should run a lot more efficiently.  Is it ok if I kill your original query and run this new one instead?"

You: "Yeah sure, that'd be great.  Thank you!"

Missing Index Details

You've been having a good day: no outages, no annoying customer requests, and the only thing keeping you from going home is to finish tuning the query you are working on.

You execute your query after making all necessary changes and…nothing.  The executing query icon just keeps spinning, and spinning, and spinning…

You go grab some water, and by that time the query has finished.  You switch over to the Execution plan tab and…you see SQL Server's Missing Index hint recommending that you create an index.

Now don't be confused, this text looks green and friendly.  You might be tempted to follow the recommendation and add the missing index it recommends - but don't! 

That siren song of easy query tuning will steer you straight into the rocks known as unnecessary index maintenance. 

Just before you execute Microsoft's missing index details script, you realize that if you add some include columns and reorder a couple of the key columns, you'll have an index that will be applicable to all of your table's queries.

You avoid getting ship wrecked and create an index that useful for more than just this one query - phew!

Email Subjects That Read: "Does anyone know why none of my queries are running?"

Last night's data load failed.  "No big deal," you assure yourself. "I'll just run it now - no one will know the difference."

You kick-off your ETL queries and decide to go make some oatmeal.  At your office's kitchen you run into a coworker you haven't seen in a while, and you get to talking about work, life, that weekend's football game that was lost in overtime…that's been happening a lot this season, huh?

On the way back to your desk, one of your customers stops you to ask some questions, your manager needs to discuss project planning with you, a new employee is introduced, etc…

By this point the office is busy with activity.  You finally make it back to your desk MUCH later than you originally intended.  What were you doing again?

You can't remember so you decide to check your email.  You notice a chain emails asking "Does anyone know what's going on with server ABC?" Things are running so SLOW!  I can't run even the simplest queries!"

OH CRAP you think.  You look at your load process queries and they are still running with no end in sight.  Do you kill the queries and wait for the hour-long rollback to complete or…

...just then your queries finish.  A minute later you receive another email saying "seems like everything is working fine now." 

Another bullet dodged.

Where Did My Code Go?

Working on this team project has been a fun experience for you - responsibilities have been shared and you and your coworkers have been delivering on deadlines ahead of schedule.  You've never been able to write queries and make progress to the end product as quickly as this.

You sit down to start working and remember that you had to debug a query from yesterday afternoon.  You go and open the stored procedure in dev and…… it's not the same query anymore.

Sure, parts of it are the same, but there are some extra parameters and some of the logic has changed.  What the heck…?

"Did anyone mess with USP_GetProducts since yesterday?" you ask your alley of teammates.

"Uhh…I updated it this morning to make it work for my business logic requirement," said intern Sam.  "Is something wrong?"

"Well, it's totally different from the way I had it when I left yesterday evening.  I bet we got out of sync and the merge conflict wasn't resolved correctly" you say.

"Oh yeah, I got some message about merging.  I just clicked the 'keep my version' button," says Sam.

"No big deal," you say with confidence.  "I can pull my version out of source control.  Come on over and I can show you how we can merge both of our queries together."

Source control saves the day.

Happy Halloween.

Unexpected Shorthand Date Implicit Conversions

I do my best work in the mornings.  Evenings are pretty good too once I get a second wind.

Late afternoon are my nemesis for getting any serious technical or creative work done.  Usually I reserve that time for responding to emails, writing documentation, and brewing coffee.

Some afternoons I can't help myself though and end up getting myself into trouble.

What is THAT!?

It all started when I was troubleshooting an existing query that was having issues.  During the process of trying to understand what the query was doing, I happened to look at the execution plan:

CREATE TABLE ##DatetimeTest ( SomeField varchar(50) NULL, CreateDatetime datetime);
INSERT INTO ##DatetimeTest VALUES ('asdf',GETDATE());

DECLARE @Today datetime = GETDATE();
SELECT DISTINCT
    *
FROM 
    ##DatetimeTest
WHERE
    CreateDatetime >= @Today-30;

Simplified, but you get the point.

When I hovered over the Table Scan, the Predicate section caught my eye.  Specifically, I wondered "Why is 1900-01-31 showing up? I don't have that anywhere in my query!"

(fun side story: the real query I was working on was dealing with user-defined datatypes, something I don't have experience with, so I thought those had something to do with the problems I was experiencing.  I latched onto this 1900 date as the potential cause but it ended up being a red herring!)

Help!

Like I mentioned, late afternoons are not when I do my best work.  I couldn't figure out why SQL Server was converting my -30 to January 31, 1900.

Intrigued and having no clue what was going on, I decided to post the question with the #sqlhelp hashtag on Twitter.  Fortunately for me, Aaron Bertrand, Jason Leiser, and Thomas LaRock all came to the rescue with ideas and answers - thanks guys!

Implicit Conversion

In hindsight, the answer is obvious: the -30 implicitly converts to a datetime (the return type of my @Today variable), in this case 30 days after the start of the minimum datetime value, 1900-01-01.

This makes perfect sense: SQL Server needs to do some math and in order to do so it first needs to make sure both datatypes in the equation match.  Since int readily converts to datetime but not the other way around, SQL Server was just doing its job.

Future Problems

As I mentioned earlier, this int to datetime conversion wasn't the actual issue with my query - in my drowsy state I mistook it as being the source of my problem.

And while it wasn't a problem this time, it can become a problem in the future.

Aaron has an excellent article on the problems with shorthand date math, but the most relevant future issue with my query is: what if someone in the future decides to update all datetimes to datetime2s (datetime2 being Microsoft's recommended datatype for new work)?

If we update to a datetime2s and run the query again:

ALTER TABLE ##DatetimeTest
ALTER COLUMN CreateDatetime datetime2;

DECLARE @Today datetime2 = GETDATE();
SELECT DISTINCT
    *
FROM 
    ##DatetimeTest
WHERE
    CreateDatetime >= @Today-30;

Everything is broken

AHHH!  While SQL Server had no problem converting our previous code between datetime and int, it's not so happy about converting datetime2.

Morals

In the end, the above scenario had nothing to do with the actual problem I had on hand (which had to do with some operator precedence confusion).

To avoid future confusion and problems it's still better to refactor the code to be explicit with what you want to do by using the DATEADD() function:

DECLARE @Today datetime = GETDATE();
SELECT DISTINCT
    *
FROM 
    ##DatetimeTest
WHERE
    CreateDatetime >= DATEADD(day, -30, @Today);

Conditionally Returning Rows Based On Query Results

Watch this week's video on YouTube

While I normally prefer formatting my query results in a downstream app/reporting layer, sometimes I can't get around adding some business formatting logic to my SQL queries.

Today I want to show you four different ways to conditionally output rows of data based on a SQL query.

Setting Up The Base Query

You've probably worked with a query that results in either 0 or 1 rows being returned:

DECLARE @CurrentDatetime datetime = getdate();
SELECT
    1 AS AreSecondsDivisbleBy2,
    @CurrentDatetime AS CurrentDatetime
WHERE
    DATEPART(second,@CurrentDatetime) % 2 = 0

If you put that query into a derived table and add some IIF() logic on top of it, you now have a situation where you your result may contain a row with one of two distinct values or no rows at all:

If you put that query into a derived table and add some IIF() logic on top of it, you now have a situation where you your result may contain a row with one of two distinct values or no rows at all:

DECLARE @CurrentDatetime datetime = GETDATE();
SELECT
    IIF(DATEPART(second,d.CurrentDatetime) % 3 = 0, 1,0) AS AreSecondsDivisbleBy3And2,
    d.CurrentDatetime
FROM
    (
    SELECT
        1 AS AreSecondsDivisbleBy2,
        @CurrentDatetime AS CurrentDatetime
    WHERE
        DATEPART(second,@CurrentDatetime) % 2 = 0
    ) d

(Note: if you ever need to know whether the seconds part of the current time is divisible by 3 and 2, use SELECT IIF(DATEPART(second,GETDATE()) % 6 = 0,1,0) and not this monstrosity of a query  I'm creating to demonstrate a scenario when you dependent derived table logic).

Sometimes we may want to force certain scenarios based on the output of the query above. Here are few common patterns that I find myself doing when needing to meet certain business requirements of queries to return data conditionally.

Always Return 1 Row

Let's say we are happy with getting a result value of 0 or 1 for AreSecondDivisbleBy3And2, but want to additionally return some other value when our derived table returns no rows. We can accomplish this with a UNION ALL and some sorting:

DECLARE @CurrentDatetime datetime = GETDATE();
SELECT TOP 1
    AreSecondsDivisbleBy3And2,
    CurrentDatetime
FROM
    (
    SELECT
        IIF(DATEPART(second,CurrentDatetime) % 3 = 0, 1,0) AS AreSecondsDivisbleBy3And2,
        d.CurrentDatetime,
        0 AS OrderPrecedence
    FROM
        (
        SELECT
            1 AS AreSecondsDivisbleBy2,
            @CurrentDatetime AS CurrentDatetime
        WHERE
            DATEPART(second,@CurrentDatetime) % 2 = 0
    ) d
    UNION ALL
    SELECT
        -1 AS AreSecondsDivisbleBy3And2,
        @CurrentDatetime AS CurrentDatetime,
        1 AS OrderPrecedence
    ) p
ORDER BY
     OrderPrecedence

We can limit our query to return TOP 1 and then add an OrderPrecedence column to determine which query result row to return. If our original query has a row of data, it will be returned because of it's OrderPrecedence. If our original query returns 0 rows, our fall back UNION ALL default value of -1 will be returned.

Return 1 row when value is 1, 0 rows when value is 0

What about a situation where we want to return a row when AreSecondsDivisbleBy3And2 is equal to 1, but no row when it is equal to 0?

The IIF function and CASE statements work great for conditionally returning a value in a SELECT, but how can we conditionally return a row?  Here's one approach:

DECLARE @CurrentDatetime datetime = GETDATE();
SELECT
     AreSecondsDivisibleBy3And2,
    CurrentDatetime
FROM
    (
    SELECT
        IIF(DATEPART(second,d.CurrentDatetime) % 3 = 0, 1,0) AS AreSecondsDivisibleBy3And2,
        CurrentDatetime
    FROM
        (
        SELECT
            1 AS AreSecondsDivisbleBy2,
            @CurrentDatetime AS CurrentDatetime
        WHERE
            DATEPART(second,@CurrentDatetime) % 2 = 0
        ) d
    ) d1
    INNER JOIN
    (
    SELECT
        0 AS ValueToNotReturn
    ) d2
        ON d1.AreSecondsDivisibleBy3And2 != d2.ValueToNotReturn

In this scenario, we return a row when our value is 1, but do not return a row when the value is 0 or our derived table doesn't return any rows.

We use an INNER JOIN to filter out the row value that we want to return 0 rows.

Return 1 row when null, and 0 rows when the value is 1 or 0

In this scenario we want to return no rows when AreSecondsDivisibleBy3And2=1 and a row when its not.

I've never had a real-world use for this one, but it's essentially a combination of the first two solutions.

SELECT
    AreSecondsDivisibleBy3And2,
    CurrentDatetime
FROM
    (
    SELECT TOP 1
        AreSecondsDivisibleBy3And2,
        CurrentDatetime
    FROM
        (
        SELECT
            IIF(DATEPART(second,d.CurrentDatetime) % 3 = 0, 1,0) AS AreSecondsDivisibleBy3And2,
            d.CurrentDateTime,
            0 AS OrderPrecedence
        FROM
            (
            SELECT
                1 AS AreSecondsDivisbleBy2,
                @CurrentDatetime AS CurrentDatetime
            WHERE
                DATEPART(second,@CurrentDatetime) % 2 = 0
            ) d
        UNION ALL
        SELECT
            -1 AS AreSecondsDivisibleBy3And2,
            @CurrentDatetime AS CurrentDatetime,
            1 AS OrderPrecedence
        )u
        ORDER BY
             OrderPrecedence
    ) d1
    INNER JOIN
    (
    SELECT
        -1 AS ValueToNotReturn
    ) d2
        ON d1.AreSecondsDivisibleBy3And2 = d2.ValueToNotReturn

This one is by far the most difficult one to logic through, but it's a pure reversal of return no rows when rows are present and return a row when no rows are present.

Always return 0 rows

I'm not exactly sure what the business case for never returning any rows would be, but this one is pretty simple: just add a condition that will always evaluate to false:

DECLARE @CurrentDatetime datetime = GETDATE();
SELECT
    IIF(DATEPART(second,d.CurrentDatetime) % 3 = 0, 1,0) AS AreSecondsDivisibleBy3And2
FROM
    (
    SELECT
        1 AS AreSecondsDivisbleBy2,
        @CurrentDatetime AS CurrentDatetime
    WHERE
        DATEPART(second,@CurrentDatetime) % 2 = 0
    ) d
WHERE
    1=0

Since 1=0 will never be true, your query will never return any results regardless of what kind of logic is happening in your SELECT or derived table.

Shortchanged with International Money in SQL Server

Watch this week's video on YouTube

Imagine you have to perform some salary analysis for your employer International Mega Corp.

The data you have to work with looks something like this:

DROP TABLE IF EXISTS ##InternationalMegaCorpSalaries;
GO
CREATE TABLE ##InternationalMegaCorpSalaries
(
    EmployeeId int IDENTITY,
    Salary nvarchar(10),
    Country nvarchar(20),
);
INSERT INTO ##InternationalMegaCorpSalaries VALUES ('1,000.00','United States');
INSERT INTO ##InternationalMegaCorpSalaries VALUES ('8 789,37','Sweden');
INSERT INTO ##InternationalMegaCorpSalaries VALUES ('6.274,26','Turkey');
INSERT INTO ##InternationalMegaCorpSalaries VALUES ('1000.00','United States');

2018-09-21_12-48-56

Why are the salaries stored as nvarchar and formatted with commas, spaces, and periods?

Great question!  Someone wanted to make sure these amounts would look good in the UI so storing the formatted values in the database would be the way to go...

Pretty for the UI, not really great for needing to do analysis on.

Thanks to Zanoni Labuschagne, one of the subscribers to my YouTube channel, for recommending this topic!

CONVERT!

I'm a firm believer that money values should always be stored in the decimal datatype.  I can't think of a time where I wouldn't care about the precision and accuracy of money.

So let's try converting our salaries to decimal(10,2):

SELECT 
    EmployeeId,
    TRY_CONVERT(decimal(10,2),Salary) AS Salary,
    Country 
FROM ##InternationalMegaCorpSalaries

image-3

Well, that didn't work.  Maybe converting to floats will work as a quick fix?

SELECT 
    EmployeeId,
    TRY_CONVERT(float,Salary) AS Salary,
    Country 
FROM ##InternationalMegaCorpSalaries

image-4

Nope that didn't work either

(sidenote: I'm OK with that though - I don't think float should ever be used for storing money.  If you want to see a quick example of why float math is problematic take a look at this (and for more detail read about it here):

DECLARE 
    @Num1 float = .15,
    @Num2 float = .15,
    @Num3 float = .1,
    @Num4 float = .2
-- Not equal
SELECT IIF(@Num1+@Num2 = @Num3+@Num4,1,0)

Ok so those didn't work.  What if we try converting to the money datatype - that should work for being able to read these money formats right?

SELECT 
    EmployeeId,
    TRY_CONVERT(money,Salary) AS Salary,
    Country 
FROM ##InternationalMegaCorpSalaries

image

50% correct...!  Our employees in Turkey are being seriously underpaid with conversion though.  I'm kind of glad to not have to rely on this solution though since the money datatype has its own fair share of problems as well.

Time to get Cultured

SQL Server's TRY_PARSE function might be able to help us, but first we need to create a relationship between each country's money formatting and it's culture code:

ALTER TABLE ##InternationalMegaCorpSalaries
ADD CultureCode AS CASE Country 
                        WHEN 'United States' THEN 'en-US'
                        WHEN 'Sweden' THEN 'sv-SE'
                        WHEN 'Turkey' THEN 'Tr-TR' END

And finally our SELECT query:

SELECT 
    EmployeeId,
    TRY_PARSE(Salary AS DECIMAL(10,2) USING CultureCode) AS Salary,
    Country 
FROM ##InternationalMegaCorpSalaries

image-1

Success!  Our salary values are now perfectly converted into the decimal datatype without the need for any ugly REPLACE(), SUBSTRING(), or other string parsing functions.

While this carefully curated demo correctly converted all of our values, it's important to always test that the culture value you choose correctly formats your string formatted number.  For example, Wikipedia leads me to believe that the  Danes write their numbers like "6 338,70" SQL Server's culture definition doesn't convert this correctly:

SELECT TRY_PARSE('6 338,70' AS DECIMAL(10,2) USING 'da-dk')

image-2

In those instances, you may need to substitute another culture code to get the correct conversion to occur.

Extracting JSON Values Longer Than 4000 Characters

Watch this week's video on YouTube

A while back I built an automated process that parses JSON strings into a relational format.

Up until recently this process had been working great: my output table had all of the data I was expecting, neatly parsed into the correct rows and columns.

Last week I noticed an error in the output table however.  One row that was supposed to have a nicely parsed JSON value for a particular column had an ugly NULL instead.

Truncated?

First I checked my source JSON string - it had the "FiveThousandAs" property I was looking for:

DECLARE @json nvarchar(max) = N'{
    "Id" : 1,
    "FiveThousandAs" : "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "FourAs" : "aaaa"
}';

So the source data was fine.

I checked the table column I was inserting into as well and confirmed it was defined as nvarchar(max), so no problem there.

The last thing I checked was the query I was using:

SELECT JSON_VALUE(@json, '$.FiveThousandAs')

If I run that on it's own, I reproduce the NULL I was seeing inserted into my table:

2018-09-15_18-26-27

JSON_VALUE is limiting

After a little bit more research, I discovered that the return type for JSON_VALUE is limited to 4000 characters.   Since JSON_VALUE is in lax mode by default, if the output has more than 4000 characters, it fails silently.

To force an error in future code I could use SELECT JSON_VALUE(@json, 'strict \$.FiveThousandAs')  so at least I would be notified immediately of an problem with my  query/data (via failure).

Although strict mode will notify me of issues sooner, it still doesn't help me extract all of the data from my JSON property.

(Side note: I couldn't define my nvarchar(max) column as NOT NULL because for some rows the value could be NULL, but in the future I might consider adding additional database validation with a check constraint).

OPENJSON

The solution to reading the entire 5000 character value from my JSON property is to use OPENJSON:

SELECT * 
FROM OPENJSON(@json) 
WITH (FiveThousandAs nvarchar(max) '$.FiveThousandAs')

My insert query needed to be slightly refactored, but now I'm able to return any length value (as long as it's under 2gb).

In hindsight, I should have used OPENJSON() from the start: not only is it capable of parsing the full length values from JSON strings, but it performs significantly faster than any of the other SQL Server JSON functions.

As a best practice, I think I'm going to use OPENJSON by default for any JSON queries to avoid problems like this in the future.