In-Memory OLTP: A Case Study

Published on: 2018-04-17

Watch this week’s episode on YouTube.

When In-Memory OLTP was first released in SQL Server 2014, I was excited to start using it.  All I could think was “my queries are going to run so FAST!

Well, I never got around to implementing In-Memory OLTP.  Besides having an incompatible version of SQL Server at the time, the in-memory features had too many limitations for my specific use-cases.

Fast forward a few years, and I’ve done nothing with In-Memory OLTP.  Nothing that is until I saw Erin Stellato present at our Northern Ohio SQL Server User Group a few weeks ago – her presentation inspired me to take a look at In-Memory OLTP again to see if I could use it.

Use case: Improving ETL staging loads

After being refreshed on the ins and outs of in-memory SQL Server, I wanted to see if I could apply some of the techniques to one of my etls.

The ETL consists of two major steps:

  1. Shred documents into row/column data and then dump that data into a staging table.
  2. Delete some of the documents from the staging table.

In the real world, there’s a step 1.5 that does some processing of the data, but it’s not relevant to these in-memory OLTP demos.

So step one was to create my staging tables.  The memory optimized table is called “NewStage1” and the traditional disked based tabled is called “OldStage1”:

Few things to keep in mind:

  • The tables have the same columns and datatypes, with the only difference being that the NewStage1 table is memory optimized.
  • My database is using simple recovery so I am able to perform minimal logging/bulk operations on my disk-based table.
  • Additionally, I’m using  the SCHEMA_ONLY durability setting.  This gives me outstanding performance because there is no writing to the transaction log!  However, this means if I lose my in-memory data for any reason (crash, restart, corruption, etc…) I am completely out of luck.  This is fine for my staging data scenario since I can easily recreate the data if necessary.

Inserting and deleting data

Next I’m going to create procedures for inserting and deleting my data into both my new and old staging tables:

Few more things to note:

  • My new procedures are natively compiled: SQL Server compiles them up front so at run time it can just execute without any extra steps.  The procedures that target my old disk-based tables will have to compile every time.
  • In the old delete procedure, I am deleting data in chunks so my transaction log doesn’t get full.  In the new version of the procedure, I don’t have to worry about this because, as I mentioned earlier, my memory optimized table doesn’t have to use the transaction log.

Let’s simulate a load

It’s time to see if all of this fancy in-memory stuff is actually worth all of the restrictions.

In my load, I’m going to mimic loading three documents with around 3 million rows each.  Then, I’m going to delete the second document from each table:

The in-memory version should have a significant advantage because:

  1. The natively compiled procedure is precompiled (shouldn’t be a huge deal here since we are doing everything in a single INSERT INTO…SELECT).
  2. The in-memory table inserts/deletes don’t have to write to the transaction log (this should be huge!)

Results

Disk-based In-Memory
INSERT 3 documents 65 sec 6 sec
DELETE 1 document 46 sec 0 sec
Total time 111 sec 6 sec
Difference -95% slower 1750% faster

The results speak for themselves.  In this particular example, in-memory destroys the disk-based solution out of the water.

Obviously there are downsides to in-memory (like consuming a lot of memory) but if you are going for pure speed, there’s nothing faster.

Warning! I am not you.

And you are not me.

While in-memory works great for my ETL scenario, there are many requirements and limitations.  It’s not going to work in every scenario.  Be sure you understand the in-memory durability options to prevent any potential data loss and try it out for yourself!  You might be surprised by the performance gains you’ll see.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

2 thoughts on “In-Memory OLTP: A Case Study”

  1. Hi Bert,

    I’m thrilled to see other blog about In-Memory OLTP – welcome to the club!
    Having blogged extensively about this feature for the last two years, I have some feedback for you.

    1. Parallelism – other than SELECTs using Interop, all DML for memory-optimized tables is single threaded. While your workload generated fantastic performance numbers, it’s not real-world.

    Please see the following posts:
    http://www.nikoport.com/2018/01/20/parallelism-in-hekaton-in-memory-oltp/

    http://nedotter.com/archive/2016/05/in-memory-oltp-optimizing-data-load/

    Basically, you must roll your own parallelism, i.e. create concurrent connections for inserting to ETL staging tables, otherwise performance is not great.

    2. Concurrency – Your post doesn’t specify how many concurrent connections you tested with, but I’m guessing it was single threaded, and if so, this is a common mistake. It would lead you to believe that your workload can only benefit if you use native compilation, which is not correct.

    The In-Memory feature is capable of scaling extremely high levels of concurrent activity, which on-disk workloads simply cannot match, and testing single-threaded (especially with Interop) is not likely to display the potential performance benefits of this awesome feature.

    Best of luck in your evaluation of In-Memory OLTP!

  2. How about Update operation actually I am making a realtime leadeboard in which my procedure calculates points of user to obtain total points and current rank in realtime basis and update their score and rank…

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.