FizzBuzz and Multiple Recursive Member CTEs

Published on: 2018-08-21

Watch this week’s episode on YouTube.

Last week I needed to write a recursive common table expression.  I’ve written them before, but it’s been a while and needed to visit the documentation to reference the syntax.

Instead of going straight to the examples, I decided to read into some of the details (since skipping the details really hurt me in last week’s post) and noticed this line that I had never seen before:

“Multiple…recursive members can be defined” – what????

I never knew you could have multiple recursive member statements in a CTE.  Heck, I didn’t even know what having multiple recursive members could do.

Since the documentation doesn’t talk about them beyond the one highlighted line above, I decided to create some examples to see if I could get them to work.

FizzBuzz

FizzBuzz is a programming puzzle that asks the solver to write a program that will list the numbers 1 to 100, displaying the word “Fizz” for any numbers that are a multiple of 3, “Buzz” for any multiples of 5, and “FizzBuzz” for any multiples of 3 and 5.

I decided to try and implement the FizzBuzz problem as both a single and multiple member CTE to see how the solutions would differ.

The Basic Recursive CTE

To start out, I decided to write a CTE that lists all numbers 0 to 100:

WITH c AS
(
    -- anchor member
    SELECT
        0 AS RowNumber
    UNION ALL
    -- recursive member
    SELECT
        c.RowNumber + 1
    FROM
        c /* the result of our last iteration */
    WHERE 
        RowNumber < 100
)

SELECT * 
FROM c;

The first SELECT statement in the CTE definition is known as the “anchor” member.  This query runs a single time and acts as the initial result that the recursive query acts on.

The second SELECT statement in the CTE definition is known as the “recursive” member.  This statement executes on the results of the previous execution (or on the results of the anchor member for the first iteration).

The recursive member will execute over and over again as long as it is still producing results.  Since our recursive statement is just adding 1 to the previous result, our recursive query would run forever – which is why we add the WHERE condition stop it from executing once we reach 100.

Our final SELECT statement returns the results of our recursive CTE, providing us with a neat list of numbers from 0 to 100:

Single Recursive Member CTE for FizzBuzz

Now that our basic recursive CTE is working, let’s make it solve FizzBuzz.  Here is our updated code:

WITH c AS
(
    SELECT
        0 AS RowNumber,
        'FizzBuzz' AS FizzOrBuzz
    UNION ALL
    SELECT
        c.RowNumber + 1,
        CASE 
            WHEN (c.RowNumber + 1) % 15 = 0 THEN 'FizzBuzz' 
            WHEN (c.RowNumber + 1) % 3 = 0 THEN 'Fizz' 
            WHEN (c.RowNumber + 1) % 5 = 0 THEN 'Buzz' 
            ELSE NULL 
        END
    FROM
        c
    WHERE 
        RowNumber < 100
)

SELECT
    RowNumber,
    FizzOrBuzz
FROM C
ORDER BY RowNumber;

First, we add a second column to our results to display the word “Fizz”, “Buzz”, or “FuzzBuzz”.

In the anchor member, we defaulted this value to “FizzBuzz”.  In our recursive member, we added a CASE statement to display the correct word.  The modulo operator (%) checks to see if the current row divided by 3, 5, or 15 results in a remainder – if the remainder is 0 then we know we found a multiple of that number.

This solution is pretty easy to read and provides the expected output for our FizzBuzz puzzle:

Multiple Recurisve Member CTE for FizzBuzz

Alright the moment we’ve been waiting for – the multiple recursive member CTE:

WITH c AS
(
    SELECT
        0 AS RowNumber,
        'FizzBuzz' AS FizzOrBuzz
    UNION ALL
    /* All rows not Fizz or Buzz or FizzBuzz */
    SELECT
        c.RowNumber + 1,
        NULL AS FizzOrBuzz
    FROM
        c
    WHERE
        c.RowNumber+1 <= 100
        AND (c.RowNumber+1)%3<>0
        AND (c.RowNumber+1)%5<>0
    UNION ALL
    /* Fizz rows */
    SELECT
        c.RowNumber + 3,
        CAST('Fizz' AS VARCHAR(8)) AS FizzOrBuzz
    FROM
        c
    WHERE 
        c.RowNumber+3 <= 100
        and FizzOrBuzz in ('Fizz','FizzBuzz')
    UNION ALL
    /* Buzz rows */
    SELECT
        c.RowNumber + 5,
        'Buzz' AS FizzOrBuzz
    FROM
        c
    WHERE
        c.RowNumber+5 <= 100
        and FizzOrBuzz in ('Buzz','FizzBuzz')
)

SELECT
    RowNumber,
    STRING_AGG(FizzOrBuzz,'') AS FizzOrBuzz
FROM C
GROUP BY
    RowNumber
ORDER BY RowNumber

You’ll notice we have 3 recursive members: the first generates all rows up to 100 that are not multiples of 3 or 5, the second generates all rows that are multiples of 3, and the third statement generates all rows that are multiple of 5.

If we were to run SELECT * FROM c; after only making the mentioned changes, you’ll notice that it looks like things are mostly working, but that we have duplicates (and incorrect labeling) for rows that are multiples of 3 and 5:

The way I decided to fix that is by adding a STRING_AGG() function to the final SELECT statement, concatenating the outputs of rows with the same RowNumber. With that addition, our multiple recursive member CTE FizzBuzz solution is complete.

One thing to be aware of in the above solution: each of the recursive member statements will execute on the previous results of ANY recursive member statement, so we add the conditions “…and FizzOrBuzz in …” to force each recursive statement to run only on the output from its own previous result.  This feels like cheating a little bit, but it was the only way I could solve the problem I had defined.

Practical Examples and Further Reading

I had a hard time coming up with a practical uses for multiple recursive member CTEs.

I searched online for some examples but it doesn’t seem like many people have written about the topic.  One exception I did find was an article by Itzik Ben-Gan where he uses them to solve Lord of the Rings family trees (heh).

Honestly though, as excited as I was initially to learn that doing this is possible, I don’t know if/when I’ll ever use it.  I’m hoping I encounter a problem one day that can make use of multiple recursive statements, but who knows if that will ever happen.

If you have used multiple recursive member CTEs to solve a real-world problem before, leave me a comment – I’d love to hear about the scenario you used it in.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Converting JSON to SQL Server CREATE TABLE Statements

Published on: 2018-05-22

Watch this week’s episode on YouTube.

Tedious, repetitive tasks are the bane of any lazy programmer.  I know, because I am one.

One such repetitive task that I find comparable to counting grains of rice is building database layouts from JSON data sources.

While some online services exist that will parse JSON objects into database structures, I don’t like using them because I don’t trust the people running those sites with my data.  Nothing personal against them, I just don’t want to be passing my data through their servers.

My solution to this problem was to write a query that will parse my unfamiliar JSON documents into a series of CREATE TABLE statements.

Automatically Generating A SQL Database Schema From JSON

You can always get the most recent version of the query from GitHub, but I’ll post the current version below so that it’s easier to explain in this post:

/*
This code takes a JSON input string and automatically generates
SQL Server CREATE TABLE statements to make it easier
to convert serialized data into a database schema.

It is not perfect, but should provide a decent starting point when starting
to work with new JSON files.

A blog post with more information can be found at https://bertwagner.com/2018/05/22/converting-json-to-sql-server-create-table-statements/
*/
SET NOCOUNT ON;

DECLARE 
	@JsonData nvarchar(max) = '
		{
			"Id" : 1,
			"IsActive":true,
			"Ratio": 1.25,
			"ActivityArray":[true,false,true],
			"People" : ["Jim","Joan","John","Jeff"],
			"Places" : [{"State":"Connecticut", "Capitol":"Hartford", "IsExpensive":true},{"State":"Ohio","Capitol":"Columbus","MajorCities":["Cleveland","Cincinnati"]}],
			"Thing" : { "Type":"Foo", "Value" : "Bar" },
			"Created_At":"2018-04-18T21:25:48Z"
		}',
	@RootTableName nvarchar(4000) = N'AppInstance',
	@Schema nvarchar(128) = N'dbo',
	@DefaultStringPadding smallint = 20;

DROP TABLE IF EXISTS ##parsedJson;
WITH jsonRoot AS (
	SELECT 
		0 as parentLevel, 
		CONVERT(nvarchar(4000),NULL) COLLATE Latin1_General_BIN2 as parentTableName, 
		0 AS [level], 
		[type] ,
		@RootTableName COLLATE Latin1_General_BIN2 AS TableName,
		[key] COLLATE Latin1_General_BIN2 as ColumnName,
		[value],
		ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColumnSequence
	FROM 
		OPENJSON(@JsonData, '$')
	UNION ALL
	SELECT 
		jsonRoot.[level] as parentLevel, 
		CONVERT(nvarchar(4000),jsonRoot.TableName) COLLATE Latin1_General_BIN2, 
		jsonRoot.[level]+1, 
		d.[type],
		CASE WHEN jsonRoot.[type] IN (4,5) THEN CONVERT(nvarchar(4000),jsonRoot.ColumnName) ELSE jsonRoot.TableName END COLLATE Latin1_General_BIN2,
		CASE WHEN jsonRoot.[type] IN (4) THEN jsonRoot.ColumnName ELSE d.[key] END,
		d.[value],
		ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS ColumnSequence
	FROM 
		jsonRoot
		CROSS APPLY OPENJSON(jsonRoot.[value], '$') d
	WHERE 
		jsonRoot.[type] IN (4,5) 
), IdRows AS (
	SELECT 
		-2 as parentLevel,
		null as parentTableName,
		-1 as [level],
		null as [type],
		TableName as Tablename,
		TableName+'Id' as columnName, 
		null as [value],
		0 as columnsequence
	FROM 
		(SELECT DISTINCT tablename FROM jsonRoot) j
), FKRows AS (
	SELECT 
		DISTINCT -1 as parentLevel,
		null as parentTableName,
		-1 as [level],
		null as [type],
		TableName as Tablename,
		parentTableName+'Id' as columnName, 
		null as [value],
		0 as columnsequence
	FROM 
		(SELECT DISTINCT tableName,parentTableName FROM jsonRoot) j
	WHERE 
		parentTableName is not null
)
SELECT 
	*,
	CASE [type]
		WHEN 1 THEN 
			CASE WHEN TRY_CONVERT(datetime2, [value], 127) IS NULL THEN 'nvarchar' ELSE 'datetime2' END
		WHEN 2 THEN 
			CASE WHEN TRY_CONVERT(int, [value]) IS NULL THEN 'float' ELSE 'int' END
		WHEN 3 THEN 
			'bit'
		END COLLATE Latin1_General_BIN2 AS DataType,
	CASE [type]
		WHEN 1 THEN 
			CASE WHEN TRY_CONVERT(datetime2, [value], 127) IS NULL THEN MAX(LEN([value])) OVER (PARTITION BY TableName, ColumnName) + @DefaultStringPadding ELSE NULL END
		WHEN 2 THEN 
			NULL
		WHEN 3 THEN 
			NULL
		END AS DataTypePrecision
INTO ##parsedJson
FROM jsonRoot
WHERE 
	[type] in (1,2,3)
UNION ALL SELECT IdRows.parentLevel, IdRows.parentTableName, IdRows.[level], IdRows.[type], IdRows.TableName, IdRows.ColumnName, IdRows.[value], -10 AS ColumnSequence, 'int IDENTITY(1,1) PRIMARY KEY' as datatype, null as datatypeprecision FROM IdRows 
UNION ALL SELECT FKRows.parentLevel, FKRows.parentTableName, FKRows.[level], FKRows.[type], FKRows.TableName, FKRows.ColumnName, FKRows.[value], -9 AS ColumnSequence, 'int' as datatype, null as datatypeprecision FROM FKRows 

-- For debugging:
-- SELECT * FROM ##parsedJson ORDER BY ParentLevel, level, tablename, columnsequence

DECLARE @CreateStatements nvarchar(max);

SELECT
	@CreateStatements = COALESCE(@CreateStatements + CHAR(13) + CHAR(13), '') + 
	'CREATE TABLE ' + @Schema + '.' + TableName + CHAR(13) + '(' + CHAR(13) +
		STRING_AGG( ColumnName + ' ' + DataType + ISNULL('('+CAST(DataTypePrecision AS nvarchar(20))+')','') +  CASE WHEN DataType like '%PRIMARY KEY%' THEN '' ELSE ' NULL' END, ','+CHAR(13)) WITHIN GROUP (ORDER BY ColumnSequence) 
	+ CHAR(13)+')'
FROM
	(SELECT DISTINCT 
		j.TableName, 
		j.ColumnName,
		MAX(j.ColumnSequence) AS ColumnSequence, 
		j.DataType, 
		j.DataTypePrecision, 
		j.[level] 
	FROM 
		##parsedJson j
		CROSS APPLY (SELECT TOP 1 ParentTableName + 'Id' AS ColumnName FROM ##parsedJson p WHERE j.TableName = p.TableName ) p
	GROUP BY
		j.TableName, j.ColumnName,p.ColumnName, j.DataType, j.DataTypePrecision, j.[level] 
	) j
GROUP BY
	TableName


PRINT @CreateStatements;


In the variables section, we can define our input JSON document string as well as define things like a root table name and default database schema name.

There is also a string padding variable.  This padding variable’s value is added to the max value length found in each column being generated, giving each column a little bit more breathing room.

Next in the script is the recursive CTE that parses the JSON string.  The OPENJSON() function in SQL Server makes this part relatively easy since some of the work of determining datatypes is already done for you.

I’ve taken the liberty to convert all strings to nvarchar types, numbers to either floats or ints, booleans to bits, and datetime strings to datetime2s.

Two additional CTE expressions add an integer IDENTITY PRIMARY KEY column to each table as well as a column referencing the parent table if applicable (our foreign key column).

Finally, a little bit of dynamic SQL pieces together all of these components to generate our CREATE TABLE scripts.

Limitations

I created this code with a lot of assumptions about my (unfamiliar) JSON data sets.  For the purpose of roughly building out tables from large JSON files, I don’t need the results to be perfect and production-ready; I just want the results to be mostly correct so the vast majority of tedious table creation work is automated.

With that disclaimer made, here are a few things to be aware of:

  • Sometimes there will be duplicate column names generated because of naming – just delete one.
  • While foreign key columns exist, the foreign key constraints don’t.
  • This code uses STRING_AGG.  I’ll leave it up to you to convert to STUFF and FOR XML PATH if you need to run it in versions prior to 2017.

Summary

This script is far from perfect.  But it has eliminated the need for me to build out these tables and columns from scratch.  Sure, the output sometimes needs a tweak or too, but for my purposes I’m happy with how it turned out.  I hope it helps you eliminate some boring table creation work too.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!