The Forgotten Fourth SQL Server Recovery Model

SQL Server recovery models define how database transactions are written to the transaction log.  Understanding these models is critical for backup and recovery purposes, as well as for understanding how logging behavior impacts query performance.

Today we’ll examine the differences between SQL Server’s three official recovery models, as well as an unofficial “fourth” recovery model that won’t help with backup and recovery, but will help with the performance of certain processes.
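If you want to follow along, you can check each database’s current recovery model through sys.databases and switch models with ALTER DATABASE (the database name below is a placeholder):

    -- See the recovery model of every database on the instance
    SELECT name, recovery_model_desc
    FROM sys.databases;

    -- Switch a database's recovery model (FULL, SIMPLE, or BULK_LOGGED)
    ALTER DATABASE YourDatabase SET RECOVERY FULL;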

Full Recovery

Full recovery is the only recovery model that can potentially save all of your data when something bad happens (NOTE: “potentially” because if you aren’t taking frequent enough backups, or aren’t testing those backups, you might still experience data loss).

Under the full recovery model, every transaction is written to the transaction log first and then persisted to the actual database.  This means that if something disastrous happens to your server, as long as the change made its way into the transaction log AND your transaction log is readable AND your previous full/differential/log backups can be restored, you shouldn’t experience any data loss (there are a lot of assumptions baked into that statement though, so don’t use this post as your only data loss prevention guide).

From a performance standpoint, full recovery is the slowest of the bunch because every transaction needs to be fully logged, and that creates overhead.  It might be a good fit for your OLTP databases, but maybe not so much for your analytical staging databases (assuming you can recreate that data).
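One more thing to remember with full recovery: the log only gets truncated when you back it up, so regular log backups are part of the deal.  A minimal sketch, assuming placeholder database and backup paths:

    -- A full backup starts the backup chain
    BACKUP DATABASE YourDatabase
    TO DISK = 'D:\Backups\YourDatabase_full.bak';

    -- Frequent log backups keep the log trimmed and enable point-in-time restore
    BACKUP LOG YourDatabase
    TO DISK = 'D:\Backups\YourDatabase_log.trn';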

Simple Recovery

While some people incorrectly believe that simple recovery means no writing to the transaction log (need proof that a database in simple recovery still writes to the transaction log?  Try running ROLLBACK TRANSACTION after a huge delete), it actually means that log space is marked for reuse as soon as SQL Server is done using it and the data has made its way to disk.
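If you want to run that proof yourself, the test looks something like this (dbo.SomeBigTable is a made-up table name; run it in a database set to simple recovery):

    BEGIN TRANSACTION;

    DELETE FROM dbo.SomeBigTable;  -- every deleted row still gets logged

    ROLLBACK TRANSACTION;          -- ...which is exactly why this rollback works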

Since log space is reclaimed regularly, your overall log size can stay smaller because that space is constantly reused.  Additionally, you don’t have to worry about backing up the transaction log (in fact, log backups aren’t even possible in simple recovery).

No persistence of the transaction log means you won’t be able to recover all of your data in the case of a server failure though.  This is generally OK if you are using simple recovery in databases where it’s easy to recreate any data since your last full backup (e.g. staging data where you can easily redo the transactions that were lost).

Simple recovery also minimally logs as many operations as possible (bulk inserts, SELECT INTO, etc.), making throughput faster.  This works well in staging databases and for ETLs where data is in flux and can be easily recreated.

Bulk-Logged Recovery

If Goldilocks thinks the full recovery model has too much logging, and the simple model not enough logging, then she’ll find the amount of logging in the bulk-logged recovery model to be just right.

Under bulk-logged, most transactions are fully logged, allowing for data restoration of those fully logged transactions if the need arises.  Bulk operations, however, are minimally logged, allowing for better performance of things like bulk inserts (at the cost of point-in-time restore for those operations).

While restorations under the bulk-logged recovery model aren’t as flexible as under full recovery (e.g. if a log backup contains any bulk transactions, you have to restore that whole log backup instead of stopping at a specific point in time), it does allow full logging for when most people need it and minimal logging for when they don’t.  Meaning for certain situations you can have your cake and eat it too!
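A common pattern here is to flip from full to bulk-logged just for the duration of a big load, then flip back and take a log backup.  A sketch, with placeholder database, table, and file names:

    ALTER DATABASE YourDatabase SET RECOVERY BULK_LOGGED;

    -- TABLOCK helps qualify the load for minimal logging
    BULK INSERT dbo.SomeStagingTable
    FROM 'D:\Loads\data.csv'
    WITH (TABLOCK);

    ALTER DATABASE YourDatabase SET RECOVERY FULL;

    -- Take a log backup so point-in-time restore works again going forward
    BACKUP LOG YourDatabase
    TO DISK = 'D:\Backups\YourDatabase_log.trn';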

The Fourth Unofficial Recovery Model: In-Memory SCHEMA_ONLY Durability

The SCHEMA_ONLY durability setting on a memory optimized table isn’t a recovery model.  But it does behave a little bit like a recovery model in the sense that it defines how operations against your memory optimized table interact with your transaction log:

They don’t.  Well, almost.

And that’s the beauty of it, at least from a performance standpoint.  If you are willing to trade the ability to recover your data for performance, then SCHEMA_ONLY durability fits the bill – so long, transaction log overhead!

So while none of the official recovery models allow you to prevent writing to the transaction log, the SCHEMA_ONLY durability setting does!
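In case you haven’t seen the syntax before, here’s roughly what that looks like (all names below are made up; note the database needs a memory-optimized filegroup even though SCHEMA_ONLY tables never persist their data):

    -- One-time setup: a memory-optimized filegroup is required
    ALTER DATABASE YourDatabase
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

    ALTER DATABASE YourDatabase
    ADD FILE (NAME = 'imoltp_container', FILENAME = 'D:\Data\imoltp_container')
    TO FILEGROUP imoltp_fg;

    -- SCHEMA_ONLY durability: the table definition survives a restart, the data does not
    CREATE TABLE dbo.FastStaging
    (
        Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
        Payload NVARCHAR(4000) NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);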


In-Memory OLTP: A Case Study


When In-Memory OLTP was first released in SQL Server 2014, I was excited to start using it.  All I could think was “my queries are going to run so FAST!”

Well, I never got around to implementing In-Memory OLTP.  Besides having an incompatible version of SQL Server at the time, the in-memory features had too many limitations for my specific use cases.

Fast forward a few years, and I’d still done nothing with In-Memory OLTP.  Nothing, that is, until I saw Erin Stellato present at our Northern Ohio SQL Server User Group a few weeks ago – her presentation inspired me to take a look at In-Memory OLTP again to see if I could use it.

Use case: Improving ETL staging loads

After being refreshed on the ins and outs of in-memory SQL Server, I wanted to see if I could apply some of the techniques to one of my ETLs.

The ETL consists of two major steps:

  1. Shred documents into row/column data and then dump that data into a staging table.
  2. Delete some of the documents from the staging table.

In the real world, there’s a step 1.5 that does some processing of the data, but it’s not relevant to these in-memory OLTP demos.

So step one was to create my staging tables.  The memory optimized table is called “NewStage1” and the traditional disk-based table is called “OldStage1”.
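The exact table definitions aren’t reproduced here, but a sketch looks something like this (the column names are stand-ins; what matters is the WITH clause on NewStage1):

    -- Memory optimized version: SCHEMA_ONLY means no transaction log writes, no data durability
    CREATE TABLE dbo.NewStage1
    (
        Id BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
        DocumentId INT NOT NULL,
        ColValue NVARCHAR(1000) NULL
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

    -- Traditional disk-based version with the same columns
    CREATE TABLE dbo.OldStage1
    (
        Id BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        DocumentId INT NOT NULL,
        ColValue NVARCHAR(1000) NULL
    );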

Few things to keep in mind:

  • The tables have the same columns and datatypes, with the only difference being that the NewStage1 table is memory optimized.
  • My database is using simple recovery so I am able to perform minimal logging/bulk operations on my disk-based table.
  • Additionally, I’m using  the SCHEMA_ONLY durability setting.  This gives me outstanding performance because there is no writing to the transaction log!  However, this means if I lose my in-memory data for any reason (crash, restart, corruption, etc…) I am completely out of luck.  This is fine for my staging data scenario since I can easily recreate the data if necessary.

Inserting and deleting data

Next I’m going to create procedures for inserting data into and deleting data from both my new and old staging tables.
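I won’t reproduce the procedures in full, but a sketch of the two delete procedures looks something like this (the procedure names and chunk size are assumptions; the insert procedures follow the same natively compiled vs. interpreted split):

    -- Natively compiled delete: compiled to machine code at creation time
    CREATE PROCEDURE dbo.Delete_NewStage1
        @DocumentId INT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
        -- No transaction log writes for a SCHEMA_ONLY table
        DELETE FROM dbo.NewStage1
        WHERE DocumentId = @DocumentId;
    END;
    GO

    -- Interpreted delete against the disk-based table: chunked so the log doesn't fill up
    CREATE PROCEDURE dbo.Delete_OldStage1
        @DocumentId INT
    AS
    BEGIN
        WHILE 1 = 1
        BEGIN
            DELETE TOP (500000)
            FROM dbo.OldStage1
            WHERE DocumentId = @DocumentId;

            IF @@ROWCOUNT = 0 BREAK;
        END;
    END;
    GO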

Few more things to note:

  • My new procedures are natively compiled: SQL Server compiles them to machine code up front, so at run time they can just execute without any extra steps.  The procedures that target my old disk-based tables have to go through the usual query interpretation process at execution time.
  • In the old delete procedure, I delete data in chunks so my transaction log doesn’t fill up.  In the new version of the procedure, I don’t have to worry about this because, as I mentioned earlier, my memory optimized table doesn’t use the transaction log.

Let’s simulate a load

It’s time to see if all of this fancy in-memory stuff is actually worth all of the restrictions.

In my load, I’m going to mimic loading three documents with around 3 million rows each.  Then, I’m going to delete the second document from each table.
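The load script boils down to something like this sketch (the cross join is just a cheap row generator standing in for the real document shredding):

    DECLARE @doc INT = 1;

    WHILE @doc <= 3
    BEGIN
        -- ~3 million generated rows per "document"
        INSERT INTO dbo.NewStage1 (DocumentId, ColValue)
        SELECT TOP (3000000) @doc, N'shredded document data'
        FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

        INSERT INTO dbo.OldStage1 (DocumentId, ColValue)
        SELECT TOP (3000000) @doc, N'shredded document data'
        FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

        SET @doc += 1;
    END;

    -- Delete the second document from each staging table
    EXEC dbo.Delete_NewStage1 @DocumentId = 2;
    EXEC dbo.Delete_OldStage1 @DocumentId = 2;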

The in-memory version should have a significant advantage because:

  1. The natively compiled procedure is precompiled (shouldn’t be a huge deal here since we are doing everything in a single INSERT INTO…SELECT).
  2. The in-memory table inserts/deletes don’t have to write to the transaction log (this should be huge!)

Results

                          Disk-based    In-Memory
    INSERT 3 documents    65 sec        6 sec
    DELETE 1 document     46 sec        0 sec
    Total time            111 sec       6 sec

In-memory completed the same work in 95% less time, making it roughly 18x (1750%) faster overall.

The results speak for themselves.  In this particular example, in-memory blows the disk-based solution out of the water.

Obviously there are downsides to in-memory (like consuming a lot of memory), but if you are going for pure speed, there’s nothing faster.

Warning! I am not you.

And you are not me.

While in-memory works great for my ETL scenario, there are many requirements and limitations.  It’s not going to work in every situation.  Be sure you understand the in-memory durability options to prevent any potential data loss, and try it out for yourself!  You might be surprised by the performance gains you’ll see.


3 Essential Tools For The SQL Server Developer

This post is a response to this month’s T-SQL Tuesday #101 prompt by Jens Vestergaard. T-SQL Tuesday is a way for SQL Server bloggers to share ideas about different database and professional topics every month.

This month’s topic is about what essential SQL Server related tools you use on a regular basis.


SQL Server Management Studio is an excellent tool for my day-to-day SQL Server development needs.

However, sometimes I need to do things besides writing queries and managing server objects.  Below are the three tools I use most often when working with SQL Server.


1. WinMerge

Often I need to compare the bodies of two stored procedures, table definitions, etc… to find differences.

While there are some built-in tools for doing difference comparisons in Visual Studio and SSMS source control plugins, I prefer using the third-party open-source tool WinMerge.

It’s a pretty straightforward difference checker, highlighting lines where the data between two files differs.

It has some other merge functions available in it, but honestly I keep it simple and use it to just look for differences between two pieces of text.

2. OnTopReplica

When on a single display, screen real estate is at a premium.  This is especially true if you are forced to use a projector that’s limited to 1024×768 resolution…

OnTopReplica to the rescue!  This nifty open-source tool allows you to select a window and keep it open on top of all other windows.

This is great for when I want to reference some piece of code or text on screen while working in another window.

In addition to forcing a window to stay on top, it allows you to crop and resize that window so only the relevant parts are visible.

The OnTopReplica view is live too – that means it’s great to use as a magnifier on your SSMS result sets when presenting (instead of constantly having to zoom in and out with ZoomIt).

Look at those beautifully zoomed in results!

3. ScreenToGif

Sometimes explaining concepts with pictures is hard.  For example, wouldn’t that last screenshot be way better if it was animated?

ScreenToGif is an open-source screen capture tool that does an excellent job compressing your recorded videos into gif animations.  It also allows editing individual frames, so you can add text, graphics, and keyboard shortcut overlays.
