Data with Bert logo

Will Technology Eliminate Your Tech Job?

9f4ed-1lh0mvkliatliiikt0vlyow

This post is a response to this month's T-SQL Tuesday prompt. T-SQL Tuesday was created by Adam Machanic and is a way for SQL users to share ideas about interesting topics. This month's topic is The Times They Are A-Changing.


I think everyone's had the same fear at some point in their career: "Am I going to lose my job because of X?" X can be a variety of things — company reorganizations, positions being outsourced, robotic automation, new software advancements, etc…

I think the answer to this question depends 100% on the type of individual you are and nothing to do with what your job actually is (was).

Being a Linchpin

Seth Godin discusses the concept of a Linchpin in his same-titled book. A Linchpin is someone who is so good at what they do that they become indispensable to their organization. Linchpins are the kind of people who are self-motivated and are able to consistently deliver quality work. They are integral to the operation of a business, even if they don't get all of the glamour of having VP or Director in their title.

And why are Linchpins always guaranteed jobs? In one scenario, Linchpins will outgrow their role and be promoted or find a better job. They are always learning and growing in addition to delivering, and so this is the natural procession. In the alternate scenario, if the Linchpin has to lose his or her current job (ie. think company buyouts where entire departments close), they will either 1) become promoted to elsewhere in the company because management recognizes their great skills or 2) they will have no problem finding work elsewhere, especially with great recommendations from their former employer.

The Cloud, SaaS, PaaS, and other technologies

The past few years have seen many new technologies come into the SQL professional's workspace. Administrators now have the ability to manage their server instances online in the cloud and use new features and functionality that weren't previously available in local-network only instances. Developers also have new tools to interact with cloud instance, but also have totally new functionality available to them from a variety of online services.

As of now, I think most of these new advancements augment our current technology instead of replace it. I think this means that some professionals will choose to not learn about them or how to use them. And it's really easy to justify not learning them — it can be hard for some to find the time to learn something that they can't immediately use.

However, some professionals will be excited and will learn about these new technologies. Even if their environments don't need to use cloud platforms and other new features, they will find small areas in their environment that can use these technologies so they start getting experience using them. Worst case, even if it's not possible to modify something existing with these new tools, these professionals will create sandboxes for themselves and learn to use some of these technologies anyway. By doing this, they will be more confident in using these tools when the time necessitates that they be used.

When it's time to be promoted or to switch jobs, which of the two professionals is more likely to get hired — the one who knows only his or her old technology really well, or the professional who has taken the time to learn these new features even if they didn't have to use them in their old environment?

Is my role of business intelligence developer going to disappear?

I'm a professional learner. Officially I'm a business intelligence developer, but unofficially I also am a web developer, manager, DBA, and electrical engineer. I don't pretend that I am an expert in all of those unofficial capacities (or even the official one!), but I do continually try to improve myself in all of those roles.

Do I worry about having new technologies replace my current job role? No. I do think the tools I use today will be outdated and replaced at some point in the future though.

I imagine some future version of SSRS will be able to generate the majority of the reporting needed for my database based off metadata. Data will continue to evolve and live in environments other than just SQL Server, making my need for SSIS less important — I'll have to learn other ways to transform data, whether through C#, Python, some cloud querying tool, or all of the above. I'll have to get used to not only using data from databases and flat files, but also mixing in data from APIs and cloud storage. Some of this data will be relational but a lot of it will not.

And all of that sounds exciting! Learning new ways of working with data is a thrill because it means I won't get bored working on the same thing year after year. Sure, 10 years from now new technologies will replace my current job — fortunately for me though, by that point I'll be working with those new technologies.

[Video] JSON Usage and Performance in SQL Server 2016

Using JSON because you are lazy is not a good excuse!

Last night I had the privilege to present to the Ohio North SQL Server User Group about JSON in SQL Server 2016. There was a great crowd present (they laughed at all of my terrible jokes so how can they not be great!?) and I had a wonderful time sharing what I know about JSON.

Below you can find my video recording of the presentation as well as the slides and demo code.

Also worth highlighting is OnTopReplica, an open source piece of software I used that basically does picture-in-picture display of another window on your desktop. You can hear everyone get excited by it at 18 minutes and 30 seconds into the video.

Enjoy the resources below :). These as well as resources from other past presentations are available at https://bertwagner.com/presentations/ .

Video

https://youtu.be/r2xFIsdSJ2Q

Slides

https://www.slideshare.net/BertWagner/json-usage-and-performance-in-sql-server-2016

Demo Scripts

https://bertwagner.com/presentations/JSON%20in%20SQL%20Server%202016%20-%20Bert%20Wagner.zip

Who Stuck These Letters In My DateTimes?

7341d-1tspfzbxx6-gyfexbhclupq

Parsing, creating, and modifying JSON in SQL Server 2016 is really easy. JSON dates and times are not.

Coming from a predominantly SQL background, the JSON DateTime format took some getting used to, especially when it came to converting SQL datetimes to JSON and vice versa.

The remainder of this post will get you well on your way to working with JSON date times in SQL Server.

Breakdown of JSON date/time

In SQL Server, datetime2's format is defined as follows:

YYYY-MM-DD hh:mm:ss[.fractional seconds]

JSON date time strings are defined like:

YYYY-MM-DDTHH:mm:ss.sssZ

Honestly, they look pretty similar. However, there are few key differences:

  • JSON separates the date and time portion of the string with the letter T
  • The Z is optional and indicates that the datetime is in UTC (if the Z is left off, JavaScript defaults to UTC). You can also specify a different timezone by replacing the Z with a + or  along with HH:mm (ie. -05:00 for Eastern Standard Time)
  • The precision of SQL's datetime2 goes out to 7 decimal places, in JSON and JavaScript it only goes out to 3 places, so truncation may occur.

Now that we know the key differences between SQL datetime2 and JSON date time strings, let's explore common transformations when working with JSON data in SQL.

Parsing JSON date time into SQL datetime2

The most common operation I perform with these new JSON functions is parsing, so let's start with those. Let's see how we can parse the date/times from JSON using SQL Server 2016's JSON_VALUE() function:

DECLARE @jsonData nvarchar(max) = N'{ "createDate" : "2017-03-28T12:45:00.000Z" }'

-- SQL's JSON_VALUE() will read in the JSON date time as a string
SELECT JSON_VALUE(@jsonData, '$.createDate')
-- Output: 2017-03-28T12:45:00Z

-- If we want to read it in as a SQL datetime2, we need to use a CONVERT() (or a CAST())
SELECT CONVERT(datetime2, JSON_VALUE(@jsonData, '$.createDate'))
-- Output: 2017-03-28 12:45:00.0000000

-- 7 zeroes after the decimal? Our source only had 3 zeroes!
-- Since JSON/JavaScript times have decimal precision to only 3 places, we need to make
-- the precision of datetime2 match
SELECT CONVERT(datetime2(3), JSON_VALUE(@jsonData, '$.createDate'))
-- Output: 2017-03-28 12:45:00.000

-- So now we are returning our UTC date time from JSON, but what if we need to convert it to a different time zone?
-- Using SQL Server 2016's AT TIME ZONE with CONVERT() will allow us to do that easily.
-- To get a full list of time zone names, you can use SELECT * FROM sys.time_zone_info
SELECT CONVERT(datetime2(3), JSON_VALUE(@jsonData, '$.createDate')) AT TIME ZONE 'Eastern Standard Time'
-- Output: 2017-03-28 12:45:00.000 -04:00

-- What if we just need to grab the date?  Pretty easy, just CONVERT() to date
SELECT CONVERT(date, JSON_VALUE(@jsonData, '$.createDate'))
-- Output: 2017-03-28

--Same with just the time, just remember to use a precision value of 3
SELECT CONVERT(time(3), JSON_VALUE(@jsonData, '$.createDate'))
-- Output: 12:45:00.000

Inserting SQL datetime2 into JSON

Taking date/time data out of JSON and into SQL was pretty easy. What about going the opposite direction and inserting SQL date/time data into JSON?

DECLARE @sqlData datetime2 = '2017-03-28 12:45:00.1234567'

-- Let's first try the simplest SQL to JSON conversion first using FOR JSON PATH
SELECT @sqlData as SQLDateTime2 FOR JSON PATH
-- Output: [{"SQLDateTime2":"2017-03-28T12:45:00"}]

-- Honestly that's not too bad!
-- The datetime gets created in the YYYY-MM-DDTHH:MM:SS.fffffff format
-- Although this is pretty much what we need, what if we want to be explicit and specify that we are in UTC?
-- Just add the AT TIME ZONE modifier and we will get our JSON "Z" indicating UTC
SELECT @sqlData AT TIME ZONE 'UTC' AS SQLDateTime2 FOR JSON PATH
-- Output: [{"SQLDateTime2":"2017-03-28T12:45:00.1234567Z"}]

-- And if we provide a different time zone offset, the JSON is formatted correctly with the +/-HH:MM suffix:
SELECT @sqlData AT TIME ZONE 'Eastern Standard Time' AS SQLDateTime2 FOR JSON PATH
-- Output: [{"SQLDateTime2":"2017-03-28T12:45:00.1234567-04:00"}]

-- You might notice that there are 7 fractional second decimal places in all of the above examples.
-- Although out of JSON spec, this is ok!

-- What if we just want to insert the date?  Just specify with a SQL CONVERT()
SELECT CONVERT(date, @sqlData) as SQLDateTime2 FOR JSON PATH
-- Output: [{"SQLDateTime2":"2017-03-28"}]

-- And the same goes with the time portion
SELECT CONVERT(time, @sqlData) as SQLDateTime2 FOR JSON PATH
-- Output: [{"SQLDateTime2":"12:45:00.1234567"}]

Modifying JSON date time with SQL

So we've seen how easy it is to parse and create JSON date/time strings, but what about modifying JSON data?

DECLARE @sqlDate datetime2 = '2017-03-28 12:45:00.1234567'



DECLARE @jsonData nvarchar(max) = N'{ "createDate" : "2017-03-28T12:45:00.000Z" }'
        ,@newDate datetime2(3) = '2017-03-28T12:48:00.123Z'

-- Let's start out modifying our data by replacing the value completely
SELECT JSON_VALUE(@jsonData, '$.createDate')

-- If we want to pass in a perfectly formatted JSON string, then it's pretty easy
SELECT JSON_MODIFY(@jsonData, '$.createDate', '2017-03-28T12:48:00.123Z')
-- Output: { "createDate" : "2017-03-28T12:48:00.123Z" }

-- If we want to pass in a SQL datetime2 value, say like what we have stored in @newDate, then things get a little messy.
-- The JSON_MODIFY function requires the third argument to be the nvarchar datatype.  This means
-- we need to get our SQL datetime2 into a valid JSON string first.  

-- If we use FOR JSON PATH to create the JSON date from the SQL datetime2, things get ugly because 
-- FOR JSON PATH always creates a property : value combination
SELECT JSON_MODIFY(@jsonData, '$.createDate', (SELECT @newDate as newDate FOR JSON PATH))
-- Output: { "createDate" : [{"newDate":"2017-03-28T12:48:00.123"}] }

-- In order to only pass the JSON datetime into the value for the "createDate" property, we need to 
-- use the CONVERT style number 127 to convert our dateTime to a JSON format
SELECT JSON_MODIFY(@jsonData, '$.createDate', (SELECT CONVERT(nvarchar, @newDate, 127)))
-- Output: { "createDate" : "2017-03-28T12:48:00.123" }

-- But what happened to our "Z" indicating UTC?  
-- We of course need to specify the AT TIME ZONE again:
SELECT JSON_MODIFY(@jsonData, '$.createDate', (SELECT CONVERT(nvarchar, @newDate AT TIME ZONE 'UTC', 127)))
--Output: { "createDate" : "2017-03-28T12:48:00.123Z" }

Overall, working with JSON dates/times is really easy using SQL Server 2016's new JSON functions. Microsoft could have done a really bad job not following the ECMA standards, but they did a great job crossing their T's and placing their Z's.

C#'s foreach ruined my afternoon

"Forest Fire" by CIFOR is licensed under CC BY-NC-ND 2.0

The other afternoon I ran into some nightmarish debugging with the following code:

private static void StartThreads()
{
    var values = new List<int>() { 1, 2, 3 };
    var threads = new List<Thread>();

    foreach (var value in values)
    {
        Thread t = new Thread(() => Run(value));
        threads.Add(t);
        t.Start();
    }

    // Wait for all threads to complete
    foreach (Thread t in threads)
        t.Join();
}

private static void Run(int value)
{
    Console.Write(value.ToString());
}

(I know, I know, I wish I could be using TPL but in this case I couldn't)

On my local machine, the code above ran and gave me my expected console output of 123 (your results may vary depending on what order the threads run in).

When I ran this code on my server however, the output was 333.

begin pulling out hair

Long story short, after a couple hours of investigation I figured out that the way a foreach loop works under the hood in C# ≥ 5.0, which is what I run on my local machine, works differently than a foreach loop in C# < 5.0, which is what I had on my server.

From the C# 4.0 spec, a foreach loop is really a while loop behind the scenes, meaning the code above really translates into something like this:

private static void StartThreads()
{
    var values = new List<int>() { 1, 2, 3 };
    var threads = new List<Thread>();
    IEnumerator<int> e = ((IEnumerable<int>)values).GetEnumerator();

    try
    {
        int v; // DECLARED OUTSIDE OF THE LOOP!!!
        while (e.MoveNext())
        {
            v = (int)(int)e.Current;
            Thread t = new Thread(() => Run(v));
            threads.Add(t);
            t.Start();
        }
    }
    finally
    {
        if (e != null) ((IDisposable)e).Dispose();
    }

    // Wait for all threads to complete
    foreach (Thread t in threads)
        t.Join();
}

The important thing to note in the above code is that int v gets declared outside of the while loop.

In the C# 5.0 spec, that int v gets declared inside the loop (causing it to get recreated with every iteration):

private static void StartThreads()
{
    var values = new List<int>() { 1, 2, 3 };
    var threads = new List<Thread>();
    IEnumerator<int> e = ((IEnumerable<int>)values).GetEnumerator();

    try
    {
        while (e.MoveNext())
        {
            int v; // C# 5.0 DECLARED INSIDE THE LOOP
            v = (int)(int)e.Current;
            Thread t = new Thread(() => Run(v));
            threads.Add(t);
            t.Start();
        }
    }
    finally
    {
        if (e != null) ((IDisposable)e).Dispose();
    }

    // Wait for all threads to complete
    foreach (Thread t in threads)
        t.Join();
}

Because my local machine and server were running different versions of .NET, the same exact code was producing totally different results.

Eventually I found Eric Lippert's article about the matter. Since I'm still fairly new to the world of .NET, I wasn't around for the big debate that took place in his comment's section regarding which should be the correct implementation. However, it is interesting to note that the C# devs decided to switch the logic on how the foreach loop operates so late in the game.

My eventual workaround for the .NET 3.5/C# 4.0 server was to assign the int to a newly created variable inside the foreach:

foreach (var value in values)
{
    var tempValue = value; // THE FIX
    Thread t = new Thread(() => Run(tempValue));
    threads.Add(t);
    t.Start();
}

As frustrating it may be to debug problems like this, it is nice to learn a little bit more of the language's history and idiosyncrasies.

When Is It Appropriate To Store JSON in SQL Server?

Who needs a relational database when everything can be stored in a JSON string?

Every once in a while I hear of some technologist say that relational databases are dead; instead, a non-table based NoSQL storage format is the way of the future. SQL Server 2016 introduced JSON functionality, making it possible for some "non-SQL" data storage to make its way into the traditionally tabled-based SQL Server.

Does this mean all data in SQL Server going forward should be stored in long JSON strings? No, that would be a terrible idea. There are instances when storing JSON in SQL Server is a good choice though. In this post I want to create recommendations for when data should be stored as JSON and when it shouldn't.

Databases Should Not Be Entirely Comprised Of JSON

The screenshot below is an example of what I think some developers would do if they were given free reign in SQL Server 2016:

bb208-1zzw7xinj5wmotojplta2lw

Here we have an application database ("InventoryApp") that consists of only a single table ("dbo.Data") with three JSON NVARCHAR(MAX) columns to represent all of the data required by the app. Relationships exist between Sales, Purchases, and Customers but these are not defined on the database side.

If you are from the world of relational-SQL, you might not believe that anyone would design such a database structure. Believe me though, this is a realistic scenario. Entire companies (eg. Firebase: https://firebase.google.com/) build their services around abstracting the database layer away from developers, essentially storing entire tables or databases in large JSON strings.

Many developers like storing data this way because it is easy to deserialize JSON strings into objects in their programming languages to use in their apps. They like the fact that with JSON they can have an infinitely changing storage schema (just add new keys, values, and arrays!) so if they need a new field for their app, they can just add it in, serialize the object to a JSON string, and store it again in the database.

Obviously, going completely "NoSQL" might make short term development easier/quicker, but using SQL Server 2016 to only store data this way is a travesty: there's no way to use many of SQL Server's amazing performance, schema definition and validation, and security features.

So when is it appropriate to store JSON in SQL Server?

Appropriate Use Case #1: Error Logging

Errors happen. When they do, it's nice to be able to go back and look at the error message to see what happened.

The problem is that the structure of error messages isn't always consistent. Sometimes only the value of a single property will help identify the cause of failure. Other times, something more complex fails and it would be nice to have all of the values of a complex object available to make troubleshooting easier.

This is where JSON steps in: in most programming languages, it is easy to convert error messages and run time values to a JSON object on error. And since error messages and data values change in structure depending on where they occur, it's easy to dynamically turn any type of object into JSON data.

This data is perfect to store in SQL to be looked at later. None of these ideas are new — nvarchar(max) has been in SQL for a while now, and so programmers everywhere have been storing error information in that datatype.

With SQL Server 2016, it is now easier to examine and parse the error information directly in SQL Server Management Studio with the variety of JSON parsing functions available. No longer do programmers have to copy the code into some different tool — they can easily do it in SSMS.

Appropriate Use Case #2: Piloting Ideas

Most large workplaces have controls in place that prevent developers from making changes in production. In general this is a Good Idea™.

However, controls are sometimes too restrictive. For example, due to security restrictions, lack of server space, company politics, etc… developers are sometimes stuck developing in production. It's an unfortunate fact of life. In those scenarios, developers have to go through hell if they have to elevate each database structure change every time they want to test something in production.

JSON to the rescue! An nvarchar(max) column in a table can have its JSON data be easily added to and modified to fit more data than it was originally intended to hold. All without any database structure change requests.

Now this is not an ideal situation. In fact, it's a scenario that can add a lot of technical debt to the application long-term if not planned for.

However, if a "flexible" JSON column is built with eventual conversion to a traditional table structure in mind from the start, it's actually simple for a developer to transition an entirely JSON storage structure to a relational format later on. They key here is that the developer needs to have this conversion planned from day one.

Appropriate Use Case #3: Non-Analytical Data

Analytical data is SQL Server's bread and butter. Need to store lots of data and be able to query against it all day long? No problem, there are a plethora of performance tuning options to make your queries run fast and efficiently.

However, sometimes not all data needs to be analyzed. Often an app might need to save some session data to a database temporarily — why bother creating all of the maintenance overhead of strict database schemas if the data will never be queried for analytical purposes? Another example might be a website's dynamically created user profile settings. You can build normalized table(s) to store all of that data, but then you will be writing programming logic to normalize and denormalize your data out of the app.

If this data will not have to be searched, then why bother adding all of the overhead? Keep it in JSON and be done with.