T-SQL Documentation Generator

Published on: 2019-01-08

This post is a response to this month’s T-SQL Tuesday #110 prompt by Garry Bargsley.  T-SQL Tuesday is a way for the SQL Server community to share ideas about different database and professional topics every month.

This month’s topic asks to share how we automate certain processes.


I’m a fan of keeping documentation close to the code. I prefer writing my documentation directly above a procedure, function, or view definition because that’s where it will be most beneficial to myself and other developers.

Not to mention that’s the only place where the documentation has any chance of staying up to date when changes to the code are made.

What drives me crazy though is making a copy of that documentation somewhere else, into a different format. You know, like when someone without database access needs you to send them a description of all of the procedures for a project. Or if you are writing end-user documentation for your functions and views.

Not only is creating a copy of the documentation tedious, but there is no chance that it will stay up to date with future code changes.

So today I want to share how I automate some of my documentation generation directly from my code.

C# XML Style Documentation in T-SQL

C# uses XML to document objects directly in the code:

/// <summary>
/// Retrieves the details for a user.
/// </summary>
/// <param name="id">The internal id of the user.</param>
/// <returns>A user object.</returns>
public User GetUserDetails(int id)
{
    User user = ...
    return user;
}

I like this format: the documentation is directly next to the code and it is structured as XML, making it easy to parse for other uses (eg. use a static document generator to create end-user documentation directly from these comments).

This format is easily transferable to T-SQL:

/*
<documentation>
  <author>Bert</author>
  <summary>Retrieves the details for a user.</summary>
  <param name="@UserId">The internal id of the user.</param>
  <returns>The username, user's full name, and join date</returns>
</documentation>
*/
CREATE PROCEDURE dbo.USP_SelectUserDetails
       @UserId int
AS
BEGIN
    SELECT Username, FullName, JoinDate FROM dbo.[User] WHERE Id = @UserId
END
GO
 
 
/*
<documentation>
  <author>Bert</author>
  <summary>Returns the value 'A'.</summary>
  <param name="@AnyNumber">Can be any number.  Will be ignored.</param>
  <param name="@AnotherNumber">A different number.  Will also be ignored.</param>
  <returns>The value 'A'.</returns>
</documentation>
*/
CREATE FUNCTION dbo.UDF_SelectA
(
    @AnyNumber int,
    @AnotherNumber int
)
RETURNS char(1)
AS
BEGIN
       RETURN 'A';
END
GO

Sure, this might not be as visually appealing as the traditional starred comment block, but I’ve wrestled with parsing enough free formatted text that I don’t mind a little extra structure in my comments.

Querying the Documentation

Now that our T-SQL object documentation has some structure, it’s pretty easy to query and extract those XML comments:

WITH DocumentationDefintions AS (
SELECT
    SCHEMA_NAME(o.schema_id) as schema_name,
    o.name as object_name,
    o.create_date,
    o.modify_date,
    CAST(SUBSTRING(m.definition,CHARINDEX('<documentation>',m.definition),CHARINDEX('</documentation>',m.definition)+LEN('</documentation>')-CHARINDEX('<documentation>',m.definition)) AS XML) AS Documentation,
    p.parameter_id as parameter_order,
    p.name as parameter_name,
    t.name as parameter_type,
    p.max_length,
    p.precision,
    p.scale,
    p.is_output
FROM
    sys.objects o
    INNER JOIN sys.sql_modules m
        ON o.object_id = m.object_id
    LEFT JOIN sys.parameters p
        ON o.object_id = p.object_id
    INNER JOIN sys.types t
        ON p.system_type_id = t.system_type_id
WHERE 
    o.type in ('P','FN','IF','TF')
)
SELECT
    d.schema_name,
    d.object_name,
    d.parameter_name,
    d.parameter_type,
    t.c.value('author[1]','varchar(100)') as Author,
    t.c.value('summary[1]','varchar(max)') as Summary,
    t.c.value('returns[1]','varchar(max)') as Returns,
    p.c.value('@name','varchar(100)') as DocumentedParamName,
    p.c.value('.','varchar(100)') as ParamDescription
FROM
    DocumentationDefintions d 
    OUTER APPLY d.Documentation.nodes('/documentation') as t(c) 
    OUTER APPLY d.Documentation.nodes('/documentation/param') as p(c)
WHERE
    p.c.value('@name','varchar(100)') IS NULL -- objects that don't have documentation
    OR p.c.value('@name','varchar(100)') = d.parameter_name -- joining our documented parms with the actual ones
ORDER BY
    d.schema_name,
    d.object_name,
    d.parameter_order

This query pulls the parameters of our procedures and functions from sys.parameters and joins them with what we documented in our XML documentation. This gives us some nicely formatted documentation as well as visibility into what objects haven’t been documented yet:

Only the Beginning

At this point, our procedure and function documentation is easily accessible via query. We can use this to dump the information into an Excel file for a project manager, or schedule a job to generate some static HTML documentation directly from the source every night.

This can be extended even further depending on your needs, but at least this is an automated starting point for generating further documentation directly from the T-SQL source.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Faking Temporal Tables with Triggers

Published on: 2018-09-11

This post is a response to this month’s T-SQL Tuesday #106 prompt by Steve Jones.  T-SQL Tuesday is a way for the SQL Server community to share ideas about different database and professional topics every month.

This month’s topic asks to share our experiences with triggers in SQL Server.


Triggers are something that I rarely use.  I don’t shy away from them because of some horrible experience I’ve had, but rather I rarely have a good need for using them.

The one exception is when I need a poor man’s temporal table.

Temporal Table <3

When temporal tables were added in SQL Server 2016 I was quick to embrace them.

A lot of the data problems I work on benefit from being able to view what data looked like at a certain point back in time, so the easy setup and queriability of temporal tables was something that I immediately loved.

No System Versioning For You

Sometimes I can’t use temporal tables though, like when I’m forced to work on an older version of SQL Server.

Now, this isn’t a huge issue; I can still write queries on those servers to achieve the same result as I would get with temporal tables.

But temporal tables have made me spoiled.  They are easy to use and I like having SQL Server manage my data for me automatically.

Fake Temporal Tables With Triggers

I don’t want to have to manage my own operational versus historical data and write complicated queries for “point-in-time” analysis, so I decided to fake temporal table functionality using triggers.

Creating the base table and history table are pretty similar to that of a temporal table, just without all of the fancy PERIOD and GENERATED ALWAYS syntax:

CREATE TABLE dbo.Birds  
(   
 Id INT IDENTITY PRIMARY KEY,
 BirdName varchar(50),
 SightingCount int,
 SysStartTime datetime2 DEFAULT SYSUTCDATETIME(),
 SysEndTime datetime2 DEFAULT '9999-12-31 23:59:59.9999999'  
);
GO
CREATE TABLE dbo.BirdsHistory
(   
 Id int,
 BirdName varchar(50),
 SightingCount int,
 SysStartTime datetime2,
 SysEndTime datetime2  
) WITH (DATA_COMPRESSION = PAGE);
GO
CREATE CLUSTERED INDEX CL_Id ON dbo.BirdsHistory (Id);
GO

The single UPDATE,DELETE trigger is really where the magic happens though.  Everytime a row is updated or deleted, the trigger inserts the previous row of data into our history table with correct datetimes:

CREATE TRIGGER TemporalFaking ON dbo.Birds
AFTER UPDATE, DELETE
AS
BEGIN
SET NOCOUNT ON;

DECLARE @CurrentDateTime datetime2 = SYSUTCDATETIME();

/* Update start times for newly updated data */
UPDATE b
SET
       SysStartTime = @CurrentDateTime
FROM
    dbo.Birds b
    INNER JOIN inserted i
        ON b.Id = i.Id

/* Grab the SysStartTime from dbo.Birds
   Insert into dbo.BirdsHistory */
INSERT INTO dbo.BirdsHistory
SELECT d.Id, d.BirdName, d.SightingCount,d.SysStartTime,ISNULL(b.SysStartTime,@CurrentDateTime)
FROM
       dbo.Birds b
       RIGHT JOIN deleted d
              ON b.Id = d.Id
END
GO

The important aspect to this trigger is that we always join our dbo.Birds table to our inserted and deleted tables based on the primary key, which is the Id column in this case.

If you try to insert/update/delete data from the dbo.Birds table, the dbo.BirdsHistory table will be updated exactly like a regular temporal table would:

/* inserts */
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Blue Jay',1);
GO
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Cardinal',1);
GO
BEGIN TRANSACTION
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Canada Goose',1)
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Nuthatch',1)
COMMIT
GO
BEGIN TRANSACTION
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Dodo',1)
INSERT INTO dbo.Birds (BirdName, SightingCount) VALUES ('Ivory Billed Woodpecker',1)
ROLLBACK
GO

/* updates */
UPDATE dbo.Birds SET SightingCount = SightingCount+1 WHERE id = 1;
GO
UPDATE dbo.Birds SET SightingCount = SightingCount+1 WHERE id in (2,3);
GO
BEGIN TRANSACTION
UPDATE dbo.Birds SET SightingCount = SightingCount+1 WHERE id =4;
GO
ROLLBACK

/* deletes */

DELETE FROM dbo.Birds WHERE id = 1;
GO
DELETE FROM dbo.Birds WHERE id in (2,3);
GO
BEGIN TRANSACTION
UPDATE dbo.Birds SET SightingCount = SightingCount+1 WHERE id =4;
GO
ROLLBACK

If you run each of those batches one at a time and check both tables, you’ll see how the dbo.BirdsHistory table keeps track of all of our data changes.

Now seeing what our dbo.Birds data looked like at a certain point-in-time isn’t quite as easy as a system versioned table in SQL Server 2016, but it’s not bad:

DECLARE @SYSTEM_TIME datetime2 = '2018-09-07 16:30:11';
SELECT * 
FROM
    (
    SELECT * FROM dbo.Birds
    UNION ALL
    SELECT * FROM dbo.BirdsHistory
    ) FakeTemporal
WHERE 
    @SYSTEM_TIME >= SysStartTime 
    AND @SYSTEM_TIME < SysEndTime;

Real Performance

One reason many people loath triggers is due to their potential for bad performance (particular when many triggers get chained together).

I wanted to see how this trigger solution compares to an actual temporal table.  While searching for good ways to test this difference, I found that Randolph West has done some testing on trigger-based temporal tables.  While our solutions are different, I like their performance testing methodology: view the transaction log records for real temporal tables and compare them to those of the trigger-based temporal tables.

I’ll let you read the details of how to do the comparison test in their blog post but I’ll just summarize the results of my test: the trigger based version is almost the same as a real system versioned temporal table.

Because of how I handle updating the SysStartTime column in my dbo.Birds table, I get one more transaction than a true temporal table:

You could make the trigger solution work identical to the true temporal table (as Randolph does) if you are willing to make application code changes to populate the SysStartTime column on insert into dbo.Birds.

Conclusion

For my purposes, the trigger-based temporal table solution has a happy ending.  It works for the functionality that I need it for and prevents me from having to manage a history table through some other process.

If you decide to use this in your own pre-2016 instances, just be sure to test the functionality you need; while it works great for the purposes that I use temporal tables for, your results may vary if you need additional functionality (preventing truncates on the history table, defining a retention period for the history, etc… are all features not implemented in the examples above).

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!


T-SQL Tuesday #104 Roundup

Published on: 2018-07-17

This month’s T-SQL Tuesday topic asked “What code would you hate to live without?” Turns out you like using script and code to automate boring, repetitive, and error-prone tasks.

Thank you to everyone who participated; I was nervous that July holidays and summer vacations would stunt turnout, however we wound up with 42 posts!

Watch tsqltuesday.com for next month’s topic and consider signing up to host.

Watch this T-SQL Tuesday Roundup on YouTube.

Without further ado, here are this month’s entries sorted in random order:

  • Stuart Moore shares the history behind needing to automate restore testing and writing the SqlAutoRestores PowerShell module to help.  Nowadays his commands are found in dbatools.  Great example of how a project can evolve through the community.
  • Arthur Daniels shares his script to identify the key and included columns of indexes in a given table.
  • Glenn Berry shares his DMV Diagnostic Queries and the story behind how he started developing them back in 2006.
  • Jason Brimhall links to multiple scripts he’s shared in the past as well as a new script for remotely auditing server access to catch infilitraters red-handed.
  • Doug Purnell talks about how he uses database snapshots and shares some code for how he manages them.
  • Jay Robinson shares two C# extensions (shout to my fellow devs!): one to check an enum for a value and a second to cleanly handle the lengthy DBNull.Value syntax.
  • Drew Furgiuele shares how he scripts out his indexes to re-apply after snapshot replication.  He then writes very similar functionality using PowerShell in only 6 lines!
  • Tim Weigel shares which community scripts he uses regularly, as well as sharing his own scripts around replication, stored procedure execution information, and file manipulation.
  • Hugo Kornelis submitted two posts.  The first post shares sp_metasearch which helps with performing impact analysis and the second post follows up with an enhancement he’s made to Ola Hallengren’s database maintenance scripts to ignore backup BizTalk databases.
  • Andy Mallon shares his comprehensive script for checking database, file, data, log, etc… sizes.  Great explanations of his reasoning for writing the queries the way he did.
  • Dan Clemens shares his database search script with a switch that includes searching across agent jobs.
  • Jess Pomfret wrote a script that shows compression stats for database objects.  Wanting to run it against a whole instance (or across mulitple servers), she wrote a dbatools command to automate the process.
  • Kenneth Fisher shows us how he organizes his toolbox using an SSMS solution.
  • Rob Farley shares code he’s written to demonstrate the pain of using NOLOCK.
  • Steve Jones shares a procedure from Microsoft that he uses for transferring logins and passwords between instances.
  • Kevin Hill shares two scripts he uses for finding low-hanging index optimization fruit: one that finds queries performing heap or clustered index scans, and another that returns the top 5 missing indexes per database.
  • Michael Villegas learned that Azure SQL doesn’t allow you to graphically show user roles and permissions, so he wrote a script to query those details (works for on-premise SQL Server as well).
  • Nate Johnson shares scripts that identify if tables are being replicated, whether SSRS subscriptions executed, and how much space certain objects and files are consuming.
  • William Andrus shares how he uses his search script to find similarly named fields or all instances of a piece of text within a database.
  • Bert Wagner (me!) I share my template for generating dynamic table-driven code, making queries more adaptable to future changes.
  • Rudy Rodarte shows us a script he uses for iterating over a date range to use for executing date based queries.
  • Brent Ozar admits he can’t live without sp_Blitz, but this month he shares a script for checking how much plan cache history exists on a server.
  • Jeff Mlakar offers a solution for taking all databases on an instance offline (and then back online) again.
  • Erik Darling offers a solution for constructing dynamic SQL so that his MAX variables don’t get truncated.  He also links to a script for printing long strings in SSMS.
  • Chrissy LeMaire takes the hard work out of instance to instance migrations by sharing her single-line dbatools command that will do it all for you.  She also shares how dbachecks automates manual checklist work.
  • Glenda Gable mentions two procedures, one that is a high performance cursor rewrite and one  that is a robust log shipping solution.
  • Aaron Bertrand shows us how he discovers undocumented SQL Server features by comparing new builds to the previous versions.
  • Ryan Desmond writes about his post-install confirguration process and shares code he runs to customize Ola Hallengren’s maintenance scripts for his environments.
  • Josh Simar shares his database file size code that is optimized for “very large databases” that span multiple files and filegroups.
  • Sander Stad discusses the importance of sharing code and offers a few dbatools commands that he’s contributed to or authored around backup testing, log shipping, and SQL Server Agent manipulation.
  • Andy Levy wrote an SSMS snippet to generate a cursor.  Before you chew him out though, he has some really good uses cases for needing to use them.
  • Andy Yun reveals what’s in his T-SQL toolbox and explains his organization strategies for 10+ years of scripts he’s collected.
  • Eduardo Pivaral shares a script he uses to output query results into an HTML table, making it easy to copy into an email.
  • Raul Gonzalez shows us a versatile script for searching database tables and returning information on attributes such as column name, size, key definitions, and more.
  • Matthew McGiffen wanted to find the most expensive queries on an instance using Query Store instead of the traditional DMVs, so he wrote a script to do just that.
  • Daniel Hutmacher shares his beefed up version of sp_help.  Includes ASCII art dependency graphs and database search.
  • Christian Gräfe provides a function he wrote for padding the left-side of a value with zeros.
  • Adrian Buckman  shares his SQLUndercover Inspector HTML reporting tool, as well as scripts for helping to alter AG groups, checking for running jobs, and auditing failed logins.
  • Louis Davidson shares his technique for using relative positioning in date tables to make querying custom periods (eg. your company’s fiscal month) easier.
  • Lance England shares a PowerShell script to automate generating upsert merge statements for his ETLs.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!