SQL and NOT SQL: Best of 2017

Photo by Rick Meyers on Unsplash

With only a few days left in 2017, I thought it would be fun to do a year in review post.  Below you’ll find some of my top 5 favorites in a variety of SQL and non-SQL related categories.  Hope you enjoy and I’ll see you with new content in 2018!

Thanks for reading and thanks for watching on YouTube all throughout 2017!

Top 5 Blog Posts

“Top 5” is totally subjective here.  These aren’t necessarily ordered by view counts, shares, or anything like that.  These just happen to be my personal favorites from this year.

Top 5 Vlogs

You guys watched over 500 hours of my  videos this year, thank you!

I think the content of this last video is good, but I don’t actually like anything else about it.  Why did I bother including it then?

It’s the first SQL Server vlog I made this year.  I’m a little embarrassed by how bad it is, but I keep it up as motivation for myself to see how much I’ve improved since I started filming videos.

If you ever think you want to start doing something – just start doing it.  Keeping track of progress and watching how you evolve is extremely rewarding.

Top 5 Posts That Never Got Written

I keep a list of post ideas.  Here are 5 ideas I didn’t get to this year.

I’m not ruling out ever writing them, but don’t hold your breath.  If you are ambitious, feel free to steal them for yourself – just let me know when you do because I’d love to read them!

  • “ZORK! in SQL” – I actually think it would be really fun to program one of my favorite text based games in SQL Server.  Don’t get eaten by a grue!
  • “How To Fly Under Your DBA’s Radar” – this could really go either way: how to do sneaky things without your DBA knowing OR how to be a good SQL developer and not get in trouble with your DBA.
  • “Geohashing in SQL Server” Geohashes are really cool.  It’d be fun to write about how to create them in SQL Server (probably a CLR, but it might be able to be done with some crazy t-sql).
  • “A SQL Magic Trick” – from the age of 12-18 I worked as a magician.  Sometimes I dream of teaching a SQL concept via a card trick.  Don’t rule this one out.
  • “Alexa DROP DATABASE” – write an Alexa skill to manage your Azure SQL instance.  I know this is technically feasible, I don’t know how useful this would be.

Top 5 Tweets/Instagrams

I’m not a huge social media guy to begin with, but I do like sharing photos.

More drifting

A post shared by Bert Wagner (@bertwagner) on

Top 5 Catch-All

These are some of the random things that helped me get through the year.

  • Red Bird Coffee.  This is premium coffee at affordable prices.  The Ethiopian Aricha is the best coffee I’ve ever had – tastes like red wine and chocolate.
  • Vulfpeck’s The Beautiful Game – I listened to this album more than any other to get into a working groove.  So funky.
  • Pinpoint: How GPS Is Changing Technology, Culture, and Our Minds – This was probably my favorite book of the year.  If you ever wonder how GPS works, or the implications of a more connected world, this book was absolutely fascinating.
  • LED Lighting Strips – I put these behind my computer monitors to create some nice lighting for filming, but I’ve found myself leaving them on all the time because they add a nice contrasting back light to my screens.
  • BONUS! Mechanical Keyboard – This thing is inexpensive, but amazing.  I didn’t realize what I was missing out until I started typing on it.  The sound of the clacking keys brought on an immediate flashback to the 1990s when I last had a mechanical keyboard.  I don’t know if it allows me to type faster like many users claim, but I am definitely happier typing on it. CLACK! CLACK! CLACK!

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

How To Create Multi-Object JSON Arrays in SQL Server

Recently I was discussing with Peter Saverman whether it would be possible to take some database tables that look like this:

And output them so that the Cars and Toys data would map to a multi-object JSON array like so:

Prefer visuals instead of text?  You can watch this week’s post on my YouTube channel.

Why would you ever need this?

If you are coming from a pure SQL background, at this point you might be wondering  why you would ever want create an object array that contains mixed object types.  Well, from an application development standpoint this type of scenario can be fairly common.

In a database, it makes sense to divide Home and Car and Toy into separate tables.  Sure, we could probably combine the latter two with some normalization, but imagine we will have many different types of entities that will be more difficult to normalize – sometimes it just makes sense to store this information separately.

Not to mention that performing analytical type queries across many rows of data will typically be much faster stored in this three table format.

The three table layout, while organized from a database standpoint, might not be the best way to organize the data in an object-oriented application.  Usually in a transaction oriented application, we want our data to all be together as one entity.  This is why NoSQL is all the rage among app developers.  Having all of your related data all together makes it easy to manage, move, update, etc…  This is where the array of multi-type objects comes in – it’d be pretty easy to use this structure as an array of dynamic or inherited objects inside of our application.

Why not just combine these Car and Toy entities in app?

Reading the data into the app through multiple queries and mapping that data to objects is usually the first way you would try doing something like this.

However, depending on many different variables, like the size of the data, the number of requests, the speed of the network, the hardware the app is running on, etc… mapping your data from multiple queries might not be the most efficient way to go.

On the other hand, if you have a big beefy SQL Server available that can do those transformations for you, and you are willing to pay for the processing time on an $8k/core enterprise licensed machine, then performing all of the these transformations on your SQL Server is the way to go.

The solution

UPDATE: Jovan Popovic suggested an even cleaner solution using CONCAT_WS.  See the update at the bottom of this post.

First, here’s the data if you want to play along at home:

And here’s the query that does all of the transforming:

There are a couple of key elements that make this work.

CROSS APPLY

When using FOR JSON PATH , ALL rows and columns from that result set will get converted to a single JSON string.

This creates a problem if, for example, you want to have a column for your JSON string and a separate column for something like a foreign key (in our case, HomeId).  Or if you want to generate multiple JSON strings filtered on a foreign key.

The way I chose to get around this is to use CROSS APPLY with a join back to our Home table – this way we get our JSON string for either Cars or Toys created but then output it along with some additional columns.

WITHOUT_ARRAY_WRAPPER

When using FOR JSON PATH to turn a result set into a JSON string, SQL Server will automatically add square brackets around the JSON output as if it were an array.

This is a problem in our scenario because when we use FOR JSON PATH to turn the Car and Toy table into JSON strings, we eventually want to combine them together into the same array instead of two separate arrays.  The solution to this is using the WITHOUT_ARRAY_WRAPPER option to output the JSON string without the square brackets.

Conclusion

Your individual scenario and results may vary.  This solution was to solve a specific scenario in a specific environment.

Is it the right way to go about solving your performance problems all of the time? No.  But offloading these transformations onto SQL Server is an option to keep in mind.

Just remember – always test to make sure your performance changes are actually helping.

UPDATED Solution Using CONCAT_WS:

This solution recommended by Jovan Popovic is even easier than above.  It requires using CONCAT_WS, which is available starting in SQL Server 2017 (the above solution requires STRING_AGG which is also in 2017, but it could be rewritten using FOR XML string aggregation if necessary for earlier versions)

 

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Reducing the Number of Reads in Your Queries

“A-Bridged” – reducing reads – get it?  I’m sorry you’re reading this.  Photo by Jamie Street on Unsplash

In last week’s post, we went over how one of best ways to improve query performance was to reduce the number of reads that your query has to do.

Less reads typically means faster query performance – so how can you reduce the number of reads SQL Server is required to make?

Prefer a visual?  Watch this week’s content on my YouTube channel.

Write more selective queries

Writing more selective queries can be done in a  few different ways.

For starters, if you don’t need every column of data, don’t use SELECT *.  Depending on the size of your rows, providing only the columns you need might allow SQL Server to use an index  that is narrower and/or denser.  An index that is narrow (fewer columns) or dense (more records per page) allows SQL Server to return the same amount of data you need in fewer pages.

The same thing goes for being more selective in your ON and WHERE clauses.  If you specify no WHERE conditions, SQL Server will return every single page in your table.  If you can filter down the data to exactly what you need, SQL Server can return only the data you need. This reduces logical reads and improves speed.

Finally, do you have large object (LOB) data (eg. varchar(max), image, etc..) on the tables in your query? Do you actually need it in your final result set? No?  Then don’t include it as part of your reads!  This could mean creating an index on the columns you do need or putting your LOB data in a separate table and only joining to it when you need it.

Fix suboptimal execution plans

It’s possible that your query is already as selective as it can be.  Maybe you are getting too many reads because SQL Server is generating and using a suboptimal execution plan.

A big tip off that this might be happening is if your cardinality estimates are out of whack a.k.a. the estimated vs. actual row counts have a large difference between them:

If SQL Server thinks it only is going to read 1 row of data, but instead needs to read way more rows of data, it might choose a poor execution plan which results in more reads.

You might get a suboptimal execution plan like above for a variety of reasons, but here are the most common ones I see:

If you had a query that previously ran fine but doesn’t anymore, you might be able to utilize Query Store to help identify why SQL Server started generating suboptimal plans.

The key is to get a good execution plan so that you aren’t performing unnecessary reads.

But what if you aren’t encountering any of the problems above and performance is still slow due to high numbers of reads?  Simply…

Add an index

A cup of coffee and a shot of espresso might have the same caffeine content – espresso is just more caffeine dense, just like the data stored in a narrow index.  Photo by Mike Marquez on Unsplash

If you have an existing table that has many columns and you only need a subset of them for your query, then consider adding an index for those columns.

Indexes are copies of your data stored in a different order with generally fewer columns.  If SQL Server is able to get all of the information it needs from a narrow index, it will do that instead of reading the full table/clustered index.

A copy of the data that has fewer columns will have greater page density (or the amount of data that fits on each page).  If SQL Server can get all of the data it needs by reading fewer, denser pages then your query will run faster.

Don’t just go adding indexes willy nilly though.  You may already have an index that almost contains all of the columns your query needs.  Look at a table’s existing indexes first and see if any of them are close to what you need.  Usually, you’ll be better off adding an included column or two to an existing index instead creating a whole brand new index.  And you’ll save on disk space by not creating duplicate indexes either.

Reduce index fragmentation

So indexes are great for reducing reads because they allow us to store only the data that is needed for a specific query (both as key columns and included columns).  Fewer columns = greater density = fewer reads necessary.

However, indexes can become  fragmented.  There’s internal fragmentation, which causes less data to be stored on a page than what is possible, and external fragmentation which causes the pages to be stored out of logical order on the disk.

Internal fragmentation is problematic because it reduces page density, causing SQL Server to have to read more pages in order to get all of the data it needs.

External fragmentation is problematic, especially for spinning disk hard drives, because SQL Server needs to read from all over the disk to get the data it needs.

In general, reorganizing or rebuilding an index are the typical ways you want to fix a fragmented index.

To slow down future fragmentation, you can test out different fill factors to try and prevent page splits from fragmenting your indexes.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!