What Does The GO Command Do?

Published on: 2018-05-29

Watch this week’s episode on YouTube.

When I started working with T-SQL, I thought the GO command was optional, kind of like semicolons.  It appeared in plenty of SSMS generated scripts, but it seemed like I never had to add it to any queries of my own:

SELECT 1 as Col1 INTO #Test1;
GO

-- ...or....

SELECT 1 as Col1 INTO #Test2;

-- both seem to work equally well!

Turns out that GO isn’t T-SQL at all, but a command that allows apps (like SSMS) to send batches of queries to SQL Server.  It also turns out it has some more functionality than I originally thought.

So when should you use GO?

Using GO When Required

Just like semicolons not being entirely optional (ever try to run a CTE after a statement without a semicolon?), GO isn’t completely optional either.

Some operations require that GO appear immediately after them:

DROP PROCEDURE IF EXISTS dbo.BatchTest;
CREATE PROCEDURE dbo.BatchTest
AS
BEGIN
	SELECT 1 as Col1
END;

The above script will fail with the error message “‘CREATE/ALTER PROCEDURE’ must be the first statement in a query batch.”

There are multiple commands (“CREATE DEFAULT, CREATE FUNCTION, CREATE PROCEDURE, CREATE RULE, CREATE SCHEMA, CREATE TRIGGER, and CREATE VIEW”) that require being the first statement of a batch, so using GO is required if you are going to try running other statements as part of your script.

So how do you run that CREATE PROCEDURE statement after first checking and dropping that procedure?  Just add GO so that CREATE PROCEDURE is the first statement of the batch:

DROP PROCEDURE IF EXISTS dbo.BatchTest;
GO

CREATE PROCEDURE dbo.BatchTest
AS
BEGIN
	SELECT 1 as Col1
END;

Executing Commands Multiple Times

So the previous example was one where SQL Server required me to type two extra characters to run certain commands.  Boo.  What about something actually useful I can do with GO?

Sometimes you may want to run a statement more than once.  You can do that by being trigger happy with the F5 key or your mouse button, but you don’t want to do that 10,000 times, do you?

That’s where GO shines.  Simply add an integer after GO and SQL Server will execute that batch of statements however many times you specified.  For example, the following code will insert 10,000 rows into a table:

INSERT INTO dbo.TestData VALUES (1,2,3)
GO 10000

This is really useful when creating test data or simulating workloads.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Converting JSON to SQL Server CREATE TABLE Statements

Published on: 2018-05-22

Watch this week’s episode on YouTube.

Tedious, repetitive tasks are the bane of any lazy programmer.  I know, because I am one.

One such repetitive task that I find comparable to counting grains of rice is building database layouts from JSON data sources.

While some online services exist that will parse JSON objects into database structures, I don’t like using them because I don’t trust the people running those sites with my data.  Nothing personal against them, I just don’t want to be passing my data through their servers.

My solution to this problem was to write a query that will parse my unfamiliar JSON documents into a series of CREATE TABLE statements.

Automatically Generating A SQL Database Schema From JSON

You can always get the most recent version of the query from GitHub, but I’ll post the current version below so that it’s easier to explain in this post:

/*
This code takes a JSON input string and automatically generates
SQL Server CREATE TABLE statements to make it easier
to convert serialized data into a database schema.

It is not perfect, but should provide a decent starting point when starting
to work with new JSON files.

A blog post with more information can be found at https://bertwagner.com/2018/05/22/converting-json-to-sql-server-create-table-statements/
*/
SET NOCOUNT ON;

DECLARE 
	@JsonData nvarchar(max) = '
		{
			"Id" : 1,
			"IsActive":true,
			"Ratio": 1.25,
			"ActivityArray":[true,false,true],
			"People" : ["Jim","Joan","John","Jeff"],
			"Places" : [{"State":"Connecticut", "Capitol":"Hartford", "IsExpensive":true},{"State":"Ohio","Capitol":"Columbus","MajorCities":["Cleveland","Cincinnati"]}],
			"Thing" : { "Type":"Foo", "Value" : "Bar" },
			"Created_At":"2018-04-18T21:25:48Z"
		}',
	@RootTableName nvarchar(4000) = N'AppInstance',
	@Schema nvarchar(128) = N'dbo',
	@DefaultStringPadding smallint = 20;

DROP TABLE IF EXISTS ##parsedJson;
WITH jsonRoot AS (
	SELECT 
		0 as parentLevel, 
		CONVERT(nvarchar(4000),NULL) COLLATE Latin1_General_BIN2 as parentTableName, 
		0 AS [level], 
		[type] ,
		@RootTableName COLLATE Latin1_General_BIN2 AS TableName,
		[key] COLLATE Latin1_General_BIN2 as ColumnName,
		[value],
		ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColumnSequence
	FROM 
		OPENJSON(@JsonData, '$')
	UNION ALL
	SELECT 
		jsonRoot.[level] as parentLevel, 
		CONVERT(nvarchar(4000),jsonRoot.TableName) COLLATE Latin1_General_BIN2, 
		jsonRoot.[level]+1, 
		d.[type],
		CASE WHEN jsonRoot.[type] IN (4,5) THEN CONVERT(nvarchar(4000),jsonRoot.ColumnName) ELSE jsonRoot.TableName END COLLATE Latin1_General_BIN2,
		CASE WHEN jsonRoot.[type] IN (4) THEN jsonRoot.ColumnName ELSE d.[key] END,
		d.[value],
		ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColumnSequence
	FROM 
		jsonRoot
		CROSS APPLY OPENJSON(jsonRoot.[value], '$') d
	WHERE 
		jsonRoot.[type] IN (4,5) 
), IdRows AS (
	SELECT 
		-2 as parentLevel,
		null as parentTableName,
		-1 as [level],
		null as [type],
		TableName as Tablename,
		TableName+'Id' as columnName, 
		null as [value],
		0 as columnsequence
	FROM 
		(SELECT DISTINCT tablename FROM jsonRoot) j
), FKRows AS (
	SELECT 
		DISTINCT -1 as parentLevel,
		null as parentTableName,
		-1 as [level],
		null as [type],
		TableName as Tablename,
		parentTableName+'Id' as columnName, 
		null as [value],
		0 as columnsequence
	FROM 
		(SELECT DISTINCT tableName,parentTableName FROM jsonRoot) j
	WHERE 
		parentTableName is not null
)
SELECT 
	*,
	CASE [type]
		WHEN 1 THEN 
			CASE WHEN TRY_CONVERT(datetime2, [value], 127) IS NULL THEN 'nvarchar' ELSE 'datetime2' END
		WHEN 2 THEN 
			CASE WHEN TRY_CONVERT(int, [value]) IS NULL THEN 'float' ELSE 'int' END
		WHEN 3 THEN 
			'bit'
		END COLLATE Latin1_General_BIN2 AS DataType,
	CASE [type]
		WHEN 1 THEN 
			CASE WHEN TRY_CONVERT(datetime2, [value], 127) IS NULL THEN MAX(LEN([value])) OVER (PARTITION BY TableName, ColumnName) + @DefaultStringPadding ELSE NULL END
		WHEN 2 THEN 
			NULL
		WHEN 3 THEN 
			NULL
		END AS DataTypePrecision
INTO ##parsedJson
FROM jsonRoot
WHERE 
	[type] in (1,2,3)
UNION ALL SELECT IdRows.parentLevel, IdRows.parentTableName, IdRows.[level], IdRows.[type], IdRows.TableName, IdRows.ColumnName, IdRows.[value], -10 AS ColumnSequence, 'int IDENTITY(1,1) PRIMARY KEY' as datatype, null as datatypeprecision FROM IdRows 
UNION ALL SELECT FKRows.parentLevel, FKRows.parentTableName, FKRows.[level], FKRows.[type], FKRows.TableName, FKRows.ColumnName, FKRows.[value], -9 AS ColumnSequence, 'int' as datatype, null as datatypeprecision FROM FKRows 

-- For debugging:
-- SELECT * FROM ##parsedJson ORDER BY ParentLevel, level, tablename, columnsequence

DECLARE @CreateStatements nvarchar(max);

SELECT
	@CreateStatements = COALESCE(@CreateStatements + CHAR(13) + CHAR(13), '') + 
	'CREATE TABLE ' + @Schema + '.' + TableName + CHAR(13) + '(' + CHAR(13) +
		STRING_AGG( ColumnName + ' ' + DataType + ISNULL('('+CAST(DataTypePrecision AS nvarchar(20))+')','') +  CASE WHEN DataType like '%PRIMARY KEY%' THEN '' ELSE ' NULL' END, ','+CHAR(13)) WITHIN GROUP (ORDER BY ColumnSequence) 
	+ CHAR(13)+')'
FROM
	(SELECT DISTINCT 
		j.TableName, 
		j.ColumnName,
		MAX(j.ColumnSequence) AS ColumnSequence, 
		j.DataType, 
		j.DataTypePrecision, 
		j.[level] 
	FROM 
		##parsedJson j
		CROSS APPLY (SELECT TOP 1 ParentTableName + 'Id' AS ColumnName FROM ##parsedJson p WHERE j.TableName = p.TableName ) p
	GROUP BY
		j.TableName, j.ColumnName,p.ColumnName, j.DataType, j.DataTypePrecision, j.[level] 
	) j
GROUP BY
	TableName


PRINT @CreateStatements;


In the variables section, we can define our input JSON document string as well as define things like a root table name and default database schema name.

There is also a string padding variable.  This padding variable’s value is added to the max value length found in each column being generated, giving each column a little bit more breathing room.

Next in the script is the recursive CTE that parses the JSON string.  The OPENJSON() function in SQL Server makes this part relatively easy since some of the work of determining datatypes is already done for you.

I’ve taken the liberty to convert all strings to nvarchar types, numbers to either floats or ints, booleans to bits, and datetime strings to datetime2s.

Two additional CTE expressions add an integer IDENTITY PRIMARY KEY column to each table as well as a column referencing the parent table if applicable (our foreign key column).

Finally, a little bit of dynamic SQL pieces together all of these components to generate our CREATE TABLE scripts.

Limitations

I created this code with a lot of assumptions about my (unfamiliar) JSON data sets.  For the purpose of roughly building out tables from large JSON files, I don’t need the results to be perfect and production-ready; I just want the results to be mostly correct so the vast majority of tedious table creation work is automated.

With that disclaimer made, here are a few things to be aware of:

  • Sometimes there will be duplicate column names generated because of naming – just delete one.
  • While foreign key columns exist, the foreign key constraints don’t.
  • This code uses STRING_AGG.  I’ll leave it up to you to convert to STUFF and FOR XML PATH if you need to run it in versions prior to 2017.

Summary

This script is far from perfect.  But it has eliminated the need for me to build out these tables and columns from scratch.  Sure, the output sometimes needs a tweak or too, but for my purposes I’m happy with how it turned out.  I hope it helps you eliminate some boring table creation work too.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Why Is My VARCHAR(MAX) Variable Getting Truncated?

Published on: 2018-05-15

Watch this week’s episode on YouTube.

Sometimes SQL Server doesn’t do what you tell it to do.

Normally that’s ok – SQL is a declarative language after all, so we’re supposed to tell it what we want it to do, not how we want it done.

And while that’s fine for most querying needs, it can become really frustrating when SQL Server decides to completely disregard what you explicitly asked it to do.

Why Is My VARCHAR(MAX) Truncated to 8000 Characters?

A prime example of this is when you declare a variable as VARCHAR(MAX) because you want to assign a long string to it.  Storing values longer than 8000 characters long is the whole point of VARCHAR(MAX), right?

DECLARE @dynamicQuery VARCHAR(MAX);

SET @dynamicQuery = REPLICATE('a',8000) + 'b'

SELECT @dynamicQuery as dynamicQueryValue, LEN(@dynamicQuery) AS dynamicQueryLength

If we look at the above query, I would expect my variable @dynamicQuery to be 8001 characters long; it should be 8000 letter ‘a’s followed by a single letter ‘b’.  8001 characters total, stored in a VARCHAR(MAX) defined variable.

But does SQL Server actually store all 8001 characters like we explicitly asked it to?

No:

First we can see that the LEN() of our variable is only 8000 – not 8001 – characters long!

Copying and pasting our resulting value into a new query window also shows us that there is no character ‘b’ at position 8001 like we expected.

The Miserly SQL Server

The reason this happens is that SQL Server doesn’t want to store something as VARCHAR(MAX) if none of the variable’s components are defined as VARCHAR(MAX).  I guess it doesn’t want to store something in a less efficient way if there’s no need for it.

However, this logic is flawed since we clearly DO want to store more than 8000 characters.  So what can we do?

Make Something VARCHAR(MAX)

Seriously, that’s it.  You can do something like CAST the single character ‘b’ as VARCHAR(MAX) and your @dynamicQuery variable will now contain 8001 characters:

But casting a single character as VARCHAR(MAX) isn’t very intuitive.

Instead, I recommend casting a blank as VARCHAR(MAX) and prefixing it to the start of your variable string.  Leave yourself a comment for the future and hopefully you’ll remember why this superfluous looking piece of code is needed:

-- using CAST('') to force SQL to define
-- as varchar(MAX)
SET @dynamicQuery =  CAST('' AS varchar(MAX))
	+ REPLICATE('a',8000)+ 'b'

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!