Contributing to Community

Published on: 2018-05-08

This post is a response to this month’s T-SQL Tuesday #102 prompt by Riley Major. T-SQL Tuesday is a way for SQL Server users to share ideas about different database and professional topics every month.

The prompt I’ve chosen to write about this month is how and why I got started contributing to the SQL Server community.


About a year ago, I was determined to improve my presentation skills.  I knew that in order to do that I needed to get more practice speaking.

I was already at my max for presenting at local user groups, conferences, etc., because at some point it becomes too cost- and time-prohibitive to travel to more events.  As an alternative, I decided that if I couldn’t get more practice by speaking in person, I could at least film myself presenting.

And I figured if I’m already filming myself presenting, I might as well put a little extra polish on it and make the content available for others to watch.

And that is how I started filming weekly videos about SQL Server.

SQL Server Videos

There are already plenty of great SQL Server presentations on YouTube, spanning a plethora of topics from a variety of experts who know way more about SQL Server than me.

Whenever I want to learn about a SQL Server topic, I search for something like “SQL Server backups” or “SQL Server columnstore indexes” on YouTube.  There are plenty of great recorded presentations, virtual chapter screencasts, Q&As, and other tutorials for learning almost any topic you can imagine.

However, sometimes I’m not in the mood to watch in-depth, hour-long presentations.  Sometimes I want to watch short, informative, entertaining SQL videos on a regular schedule – and this is where I saw a gap in programming.

So what better way to get what you want than by scratching your own itch?  I figured if I want to watch that type of SQL Server video, then I’m sure other people out there want to watch those same kinds of short SQL videos too.

Bert, the Director

When I was a kid, I wanted to be a movie maker.  In particular, I was entranced by special effects, so I made movies with friends that involved plenty of lightsabers, explosions, and green screen effects all throughout middle school and high school.

So while making SQL videos wasn’t going to be totally new territory, I sure was unprepared for all of the initial work involved.

For the first three months, I was spending 15-20 hours per week writing, creating demos, shooting, editing, publishing, and marketing my videos.  Over time I’ve cut this process down to 8-10 hours a week, a more manageable amount of work that I can mostly get done on weekend mornings before the rest of the house wakes up.

Results

Making videos about SQL Server has been an amazing experience.  Not only do I personally feel fulfilled creating something week after week that improves my own skills, but it’s rewarding to receive positive feedback via comments, messages, and emails that I’m also helping others become better SQL developers.

Contributing has also made me appreciate how amazing the #sqlfamily community truly is.  Everyone I talk to is wonderful and supportive, and everyone I meet wants to see one another succeed.

Your Turn

If you aren’t already, I hope you consider contributing to the community.  Whether it be via blog posts, code contributions, presenting, tweeting, or making videos, giving back to the SQL Server community will grow your own skills and allow you to meet some really great people.

It can be scary putting yourself out there publicly, but don’t let that stop you.  If you give it your best then the SQL Server community won’t let you down.


Is It Possible To Conditionally Index JSON Data?

Published on: 2018-05-01

Check out this week’s episode on YouTube.

Recently I received a great question from an attendee at one of my sessions on JSON (what’s up Nam!):

At first glance it sounds like a filtered index question, and ultimately it is, but because of some of the intricacies involved in the response I thought it would make for a good blog post.

The Problem: Schema On Read

Imagine I have a central table that keeps track of warnings and errors for my burrito ordering app:

DROP TABLE IF EXISTS dbo.BurritoAppLog;
GO

CREATE TABLE dbo.BurritoAppLog 
( 
	Id int IDENTITY PRIMARY KEY,
	ErrorDetails nvarchar(1000)
); 
GO 

-- Each batch inserts two warning rows; GO 4000 reruns the batch 4,000 times (8,000 warning rows total)
INSERT INTO dbo.BurritoAppLog VALUES (N'{"Type":"Warning", "MessageId": 100, "Severity": "High", "Information":"Running low on steak." }'); 
INSERT INTO dbo.BurritoAppLog VALUES (N'{"Type":"Warning", "MessageId": 50, "Severity": "Low", "Information":"Running low on queso." }');
GO 4000
-- Each batch inserts two error rows; GO 100 reruns the batch 100 times (200 error rows total)
INSERT INTO dbo.BurritoAppLog VALUES (N'{"Type":"Error", "MessageId": 10, "User":"Bert", "ErrorMessage":"Lettuce not available." }'); 
INSERT INTO dbo.BurritoAppLog VALUES (N'{"Type":"Error", "MessageId": 20, "User":"Jim", "ErrorMessage":"Cannot wrap burrito with quadruple meat." }'); 
GO 100

Now imagine wanting to generate a report of only the rows that are errors.

Obviously, you’d want to index this data for faster query performance.  Adding a non-clustered index on a non-persisted computed column of our JSON “Type” property will accomplish that:

ALTER TABLE dbo.BurritoAppLog 
ADD ErrorType AS JSON_VALUE(ErrorDetails, '$.Type');

ALTER TABLE dbo.BurritoAppLog 
ADD MessageId AS JSON_VALUE(ErrorDetails, '$.MessageId');

CREATE INDEX IX_ErrorType ON dbo.BurritoAppLog (ErrorType) INCLUDE (MessageId);

SELECT MessageId FROM dbo.BurritoAppLog WHERE ErrorType = 'Error'

And that works great.  Except that error entries in our table make up only 2.5% of our total rows.  Assuming we’ll never need to query WHERE ErrorType = 'Error', er, rather WHERE ErrorType = 'Warning', this index is using a lot of unnecessary space.
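
If you want to sanity check that ratio yourself, a quick count by the computed ErrorType column (just a sketch against the sample data loaded above) shows roughly 8,000 warning rows and 200 error rows:

-- Count rows by message type using the computed ErrorType column
SELECT ErrorType, COUNT(*) AS RowsOfType
FROM dbo.BurritoAppLog
GROUP BY ErrorType;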

So what if we create a filtered index instead?

Filtered JSON Indexes…

A filtered index should benefit us significantly here: it should save us space (since it won’t include all of those warning rows) and it should make our INSERT queries into this table faster since the index won’t need to be maintained for our non-“Error” rows.

So let’s create a filtered index:

CREATE INDEX FX_ErrorType ON dbo.BurritoAppLog (ErrorType) INCLUDE (MessageId) WHERE ErrorType = 'Error'

Oh.

So I guess we can’t create a filtered index where the filter is on a computed column.  Maybe SQL Server won’t mind if we persist the computed column?

DROP INDEX IX_ErrorType ON dbo.BurritoAppLog

ALTER TABLE dbo.BurritoAppLog
DROP COLUMN ErrorType;

ALTER TABLE dbo.BurritoAppLog 
ADD ErrorType AS JSON_VALUE(ErrorDetails, '$.Type') PERSISTED;

CREATE INDEX FX_ErrorType ON dbo.BurritoAppLog (ErrorType) INCLUDE (MessageId) WHERE ErrorType = 'Error'

NOOOOOOPPPPEEEE.  Same error message.

The issue is that SQL Server does not like computed columns, persisted or not, in a filtered index’s WHERE clause.  It’s one of the many limitations of filtered indexes (Aaron Bertrand has a great post outlining many of the shortcomings).

Computed Column Filtered Index Workaround

What is a performance-minded, space-cautious, JSON-loving developer supposed to do?

One workaround to get our filtered index would be to parse our ErrorType property into its own table column on insert:

ALTER TABLE dbo.BurritoAppLog 
ADD PermanentErrorType varchar(10);

UPDATE dbo.BurritoAppLog SET PermanentErrorType = JSON_VALUE(ErrorDetails, '$.Type');
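
The UPDATE above only backfills rows that already exist.  To truly parse the value “on insert,” new rows need to populate PermanentErrorType as well; one option (a sketch with made-up sample values) is to parse the JSON up front and supply it as part of the INSERT:

-- Hypothetical new log entry: parse the Type property once and store it alongside the raw JSON
DECLARE @ErrorDetails nvarchar(1000) = N'{"Type":"Error", "MessageId": 30, "User":"Nam", "ErrorMessage":"Guacamole costs extra." }';

INSERT INTO dbo.BurritoAppLog (ErrorDetails, PermanentErrorType)
VALUES (@ErrorDetails, JSON_VALUE(@ErrorDetails, '$.Type'));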

With our PermanentErrorType column in place, we have no problem generating our filtered index:

CREATE INDEX FX_PermanentErrorType ON dbo.BurritoAppLog (PermanentErrorType) INCLUDE (MessageId) WHERE PermanentErrorType = 'Error'

If we compare the sizes of our nonclustered index to our filtered index, you’ll immediately see that the filtered index is significantly smaller:
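
If you want to check the numbers yourself, a query along these lines (a sketch using sys.dm_db_partition_stats; note that IX_ErrorType was dropped earlier, so you’d need to recreate it to see both indexes side by side) reports how many kilobytes each index on the table uses:

-- Report the size of every index on dbo.BurritoAppLog in KB
SELECT i.name AS IndexName,
       SUM(s.used_page_count) * 8 AS IndexSizeKB
FROM sys.indexes i
JOIN sys.dm_db_partition_stats s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
WHERE i.object_id = OBJECT_ID('dbo.BurritoAppLog')
GROUP BY i.name;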

However, our table size is now slightly larger because of the added table column.
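
One quick way to quantify that overhead is the built-in sp_spaceused procedure, run before and after adding the PermanentErrorType column:

-- Reports reserved, data, and index sizes for the whole table
EXEC sp_spaceused 'dbo.BurritoAppLog';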

Conclusion

So what do you do if you run into this situation?  Well, if the ratio of undesired records to desired records is large, like in the example above, you might want to make a permanent column to include in your filtered index – the size/performance benefit is certainly there.  This does mean that your table size will be larger (additional column), but performance will be faster if your queries are able to use the smaller filtered index.
