Multiple Identity Inserts

Published on: 2019-06-25

Watch this week’s video on YouTube.

This week I want to share something that surprised me about using SQL Server’s SET IDENTITY_INSERT statement.

I started with two tables with identity columns defined:

CREATE TABLE dbo.[User]
(
	Id int identity,
	UserName varchar(40)
);
CREATE TABLE dbo.StupidQuestions
(
	Id bigint identity,
	UserId int,
	Question varchar(400)
);


INSERT INTO dbo.[User] (UserName) VALUES ('Jim');
INSERT INTO dbo.[User] (UserName) VALUES ('Jane');
INSERT INTO dbo.[User] (UserName) VALUES ('Jin');
INSERT INTO dbo.[User] (UserName) VALUES ('Joyce');

INSERT INTO dbo.StupidQuestions (UserId,Question) VALUES (1,'Is smooth peanut butter better than chunky?');
INSERT INTO dbo.StupidQuestions (UserId,Question) VALUES (1,'Do I really need to backup my production databases?');
INSERT INTO dbo.StupidQuestions (UserId,Question) VALUES (2,'How to grant developers SA access?');
INSERT INTO dbo.StupidQuestions (UserId,Question) VALUES (3,'I''m getting an error about not being able to add any more indexes to my table - how do I increase the limit?');
INSERT INTO dbo.StupidQuestions (UserId,Question) VALUES (4,'How can I include more than 32 columns in my index key?');
GO

I wanted to copy the data from these two tables into two other tables:

CREATE TABLE dbo.User_DEV
(
	Id int identity,
	UserName varchar(40)
);

CREATE TABLE dbo.StupidQuestions_DEV
(
	Id bigint identity,
	UserId int,
	Question varchar(400)
);

This would allow me to safely test some changes on these _DEV table copies without breaking my original tables.

The next step was to write a couple of INSERT INTO SELECT statements:

INSERT INTO dbo.User_DEV
SELECT Id,UserName FROM dbo.[User]

INSERT INTO dbo.StupidQuestions_DEV
SELECT Id,UserId,Question FROM dbo.StupidQuestions

And of course, as soon as I executed them, SQL Server threw an error stating that I can’t insert explicit values into an identity column without first enabling identity inserts:

An explicit value for the identity column in table 'dbo.User_DEV' can only be specified when a column list is used and IDENTITY_INSERT is ON.

Ok, simple enough to fix: we just need to do what the error message says and SET IDENTITY_INSERT ON for both tables:

SET IDENTITY_INSERT dbo.User_DEV ON;  
SET IDENTITY_INSERT dbo.StupidQuestions_DEV ON;  

And… it still didn’t work:

IDENTITY_INSERT is already ON for table 'IdentityTest.dbo.User_DEV'. Cannot perform SET operation for table 'dbo.StupidQuestions_DEV'.

One at a time

Although I’ve probably moved data around like this hundreds (thousands?) of times before, I’ve never encountered this particular error.

It turns out SQL Server allows only one table per session to have the IDENTITY_INSERT property set to ON at any given time. The solution is therefore straightforward: enable identity inserts, copy the data, and disable identity inserts again, one table at a time:

SET IDENTITY_INSERT dbo.User_DEV ON; 
INSERT INTO dbo.User_DEV (Id,UserName)
SELECT Id,UserName FROM dbo.[User];
SET IDENTITY_INSERT dbo.User_DEV OFF; 

SET IDENTITY_INSERT dbo.StupidQuestions_DEV ON;
INSERT INTO dbo.StupidQuestions_DEV (Id,UserId,Question)
SELECT Id,UserId,Question FROM dbo.StupidQuestions;
SET IDENTITY_INSERT dbo.StupidQuestions_DEV OFF;
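
One thing to keep in mind with this pattern: if the INSERT fails partway through, the setting stays ON for the rest of the session, and the next table's SET IDENTITY_INSERT will hit the same error as before. A minimal sketch of one way to guard against that, assuming SQL Server 2012 or later (for THROW) and the same tables as above; it will not catch compile-time errors such as a missing table:

```sql
-- Sketch: copy one table and always switch IDENTITY_INSERT back off,
-- even when the INSERT raises a runtime error.
BEGIN TRY
    SET IDENTITY_INSERT dbo.User_DEV ON;

    INSERT INTO dbo.User_DEV (Id, UserName)
    SELECT Id, UserName FROM dbo.[User];

    SET IDENTITY_INSERT dbo.User_DEV OFF;
END TRY
BEGIN CATCH
    -- The setting may still be ON for this session, so turn it off
    -- before re-raising the original error to the caller.
    SET IDENTITY_INSERT dbo.User_DEV OFF;
    THROW;
END CATCH;
```

Repeating the same block for dbo.StupidQuestions_DEV keeps only one table's setting ON at any moment.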

20/20

In hindsight, I think I’ve never encountered this error before because I normally use the Export Data Wizard in SSMS or a dedicated SSIS package to move data around. Either of those options is typically easier than writing T-SQL when I need to move data across servers or regularly refresh tables with test data.

However, when using either of those options I never paid attention to the implementation details, which led me to assume I knew how SQL Server handles identity inserts.

Thanks for reading. You might also enjoy following me on Twitter.

Want to learn even more SQL?

Sign up for my newsletter to receive weekly SQL tips!

Trailing Spaces in SQL Server

Published on: 2019-06-18

Watch this week’s episode on YouTube.

A long time ago I built an application that captured user input. One feature of the application was to compare the user’s input against a database of values.

The app performed this text comparison as part of a SQL Server stored procedure, allowing me to easily update the business logic in the future if necessary.

One day, I received an email from a user saying that the value they were typing in was matching with a database value that they knew shouldn’t match. That is the day I discovered SQL Server’s counterintuitive equality comparison when dealing with trailing space characters.

Padded white space

You are probably aware that the CHAR data type pads the value with spaces until the defined length is reached:

DECLARE @Value CHAR(10) = 'a';
SELECT
	@Value AS OriginalValue,
	LEN(@Value) AS StringLength,
	DATALENGTH(@Value) AS DataLength,
	-- BINARY(10) matches the declared length; a bare BINARY in a CAST defaults to 30 bytes
	CAST(@Value AS BINARY(10)) AS StringToHex;

Results: StringLength = 1, DataLength = 10, StringToHex = 0x61202020202020202020

The LEN() function shows the number of characters in our string, excluding trailing spaces, while the DATALENGTH() function shows us the number of bytes used by that string.

In this case, DATALENGTH is equal to 10. This result is due to the padded spaces occurring after the character “a” in order to fill the defined CHAR length of 10. We can confirm this by converting the value to hexadecimal. We see the value 61 (“a” in hex) followed by nine “20” values (spaces).

If we change our variable’s data type to VARCHAR, we’ll see the value is no longer padded with spaces:

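DECLARE @Value VARCHAR(10) = 'a';
SELECT
	@Value AS OriginalValue,
	LEN(@Value) AS StringLength,
	DATALENGTH(@Value) AS DataLength,
	-- BINARY(10) keeps the output the same width as the CHAR(10) example
	CAST(@Value AS BINARY(10)) AS StringToHex;

Results: StringLength = 1, DataLength = 1, StringToHex = 0x61000000000000000000 (the trailing zero bytes are padding added by the cast to BINARY(10), not part of the stored value)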

Given that one of these data types pads values with space characters while the other doesn’t, what happens if we compare the two?

DECLARE 
	@CharValue CHAR(10) = '',
	@VarcharValue VARCHAR(10) = ''
SELECT
	IIF(@CharValue=@VarcharValue,1,0) AS ValuesAreEqual,
	DATALENGTH(@CharValue) AS CharBytes,
	DATALENGTH(@VarcharValue) AS VarcharBytes

In this case SQL Server considers both values equal, even though we can confirm that the DATALENGTHs are different.

This behavior doesn’t only occur with mixed data type comparisons, however. If we compare two values of the same data type, with one value containing several space characters, we experience something… unexpected:

DECLARE 
	@NoSpaceValue VARCHAR(10) = '',
	@MultiSpaceValue VARCHAR(10) = '    '
SELECT
	IIF(@NoSpaceValue=@MultiSpaceValue,1,0) AS ValuesAreEqual,
	DATALENGTH(@NoSpaceValue) AS NoSpaceBytes,
	DATALENGTH(@MultiSpaceValue) AS MultiSpaceBytes

Even though our two variables have different values (an empty string compared to four space characters), SQL Server considers these values equal.

If we add a character with some trailing whitespace we’ll see the same behavior:

DECLARE 
	@NoSpaceValue VARCHAR(10) = 'a',
	@MultiSpaceValue VARCHAR(10) = 'a     '
SELECT
	IIF(@NoSpaceValue=@MultiSpaceValue,1,0) AS ValuesAreEqual,
	DATALENGTH(@NoSpaceValue) AS NoSpaceBytes,
	DATALENGTH(@MultiSpaceValue) AS MultiSpaceBytes

Both values are clearly different, but SQL Server considers them to be equal to each other. Switching our equal sign to a LIKE operator changes things slightly:

DECLARE 
   @NoSpaceValue VARCHAR(10) = 'a',
   @MultiSpaceValue VARCHAR(10) = 'a     '
SELECT
   IIF(@NoSpaceValue LIKE @MultiSpaceValue,1,0) AS ValuesAreEqual,
   DATALENGTH(@NoSpaceValue) AS NoSpaceBytes,
   DATALENGTH(@MultiSpaceValue) AS MultiSpaceBytes

This time the values are not considered equal. Even though I would think that a LIKE without any wildcard characters would behave just like an equal sign, SQL Server doesn’t perform these comparisons the same way.

If we switch back to our equal sign comparison and prefix our character value with spaces we’ll also notice a different result:

DECLARE 
	@NoSpaceValue VARCHAR(10) = 'a',
	@MultiSpaceValue VARCHAR(10) = '    a'
SELECT
	IIF(@NoSpaceValue=@MultiSpaceValue,1,0) AS ValuesAreEqual,
	DATALENGTH(@NoSpaceValue) AS NoSpaceBytes,
	DATALENGTH(@MultiSpaceValue) AS MultiSpaceBytes

SQL Server considers two values equal regardless of spaces occurring at the end of a string. Spaces preceding a string, however, are significant, so these values are no longer considered a match.

What is going on?

ANSI

While counterintuitive, SQL Server’s functionality is justified. SQL Server follows the ANSI specification for comparing strings, padding the shorter string with spaces so that both are the same length before comparing them. This explains the phenomenon we are seeing.

It does not do this with the LIKE operator, however, which explains the difference in behavior.
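
To see both rules side by side, the sketch below runs the same pair of values through = and through LIKE in both directions. The variable and column names are made up for illustration. The last column is the one to watch: according to the LIKE documentation, every character of the pattern is significant, while trailing blanks in the value being matched are not significant for non-Unicode data, so swapping the operands can change the answer.

```sql
DECLARE
    @ShortValue VARCHAR(10) = 'a',
    @PaddedValue VARCHAR(10) = 'a     ';
SELECT
    -- 1: the shorter value is padded with spaces before the comparison
    IIF(@ShortValue = @PaddedValue, 1, 0) AS EqualsOperator,
    -- 0: the pattern's trailing spaces are significant
    IIF(@ShortValue LIKE @PaddedValue, 1, 0) AS ShortLikePadded,
    -- Operands swapped: compare this result with the previous column
    IIF(@PaddedValue LIKE @ShortValue, 1, 0) AS PaddedLikeShort;
```

If a trailing-space-sensitive check relies on LIKE, it matters which side holds the pattern.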

Comparisons when extra spaces matter

Let’s say we want to do a comparison where the difference in trailing spaces matters.

One option is to use the LIKE operator as we saw a few examples back. This is not the typical use of the LIKE operator, however, so be sure to comment and explain what your query is attempting to do by using it. The last thing you want is some future maintainer of your code to switch it back to an equal sign because they don’t see any wildcard characters.

Another option that I’ve seen is to perform a DATALENGTH comparison in addition to the value comparison:

DECLARE 
	@NoSpaceValue VARCHAR(10) = 'a',
	@MultiSpaceValue VARCHAR(10) = 'a    '
SELECT
	IIF(@NoSpaceValue = @MultiSpaceValue AND DATALENGTH(@NoSpaceValue) = DATALENGTH(@MultiSpaceValue),1,0) AS ValuesAreEqual,
	DATALENGTH(@NoSpaceValue) AS NoSpaceBytes,
	DATALENGTH(@MultiSpaceValue) AS MultiSpaceBytes

This solution isn’t right for every scenario, however. For starters, you have no way of knowing if SQL Server will execute your value comparison or DATALENGTH predicate first. This could wreak havoc on index usage and cause poor performance.

A more serious problem can occur if you are comparing fields with different data types. For example, when comparing a VARCHAR to NVARCHAR data type, it’s pretty easy to create a scenario where your comparison query using DATALENGTH will trigger a false positive:

DECLARE 
	@VarcharValue VARCHAR(10) = 'a ',   -- one character plus a trailing space
	@NvarcharValue NVARCHAR(10) = N'a'  -- one character, no trailing space
SELECT
	IIF(@VarcharValue = @NvarcharValue AND DATALENGTH(@VarcharValue) = DATALENGTH(@NvarcharValue),1,0) AS ValuesAreEqual,
	DATALENGTH(@VarcharValue) AS VarcharBytes,
	DATALENGTH(@NvarcharValue) AS NvarcharBytes

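Here the NVARCHAR stores 2 bytes for every character, so the DATALENGTH of the single-character NVARCHAR value (2 bytes) equals the DATALENGTH of the VARCHAR value made up of a character plus a space (also 2 bytes). Both predicates pass even though the values differ.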
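
Another workaround that avoids the DATALENGTH false positive is to append the same non-space character to each side before comparing, so that neither string ends in a space and nothing gets padded away. This is a sketch rather than a recommendation: the pipe character is an arbitrary choice, the variable names are made up, and wrapping columns in an expression like this will generally prevent index seeks.

```sql
DECLARE
    @VarcharValue VARCHAR(10) = 'a ',    -- character plus a trailing space
    @NvarcharValue NVARCHAR(10) = N'a';  -- character only
SELECT
    -- 1: the trailing space is ignored by the plain comparison
    IIF(@VarcharValue = @NvarcharValue, 1, 0) AS PlainEquals,
    -- 0: with a sentinel appended, 'a |' and 'a|' differ at the second character
    IIF(@VarcharValue + '|' = @NvarcharValue + N'|', 1, 0) AS SentinelEquals;
```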

The best thing to do in these scenarios is understand your data and pick a solution that will work for your particular situation.

And maybe trim your data before insertion (if it makes sense to do so)!
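
As a rough illustration of trimming on the way in, the sketch below uses a table variable as a stand-in for a real table; @UserInput and @RawValue are hypothetical names. LTRIM and RTRIM are used because they work on older versions; TRIM is an option on SQL Server 2017 and later.

```sql
DECLARE @UserInput TABLE (InputValue VARCHAR(40));
DECLARE @RawValue VARCHAR(40) = '  a    ';

-- Strip leading and trailing spaces before the value is stored
INSERT INTO @UserInput (InputValue)
VALUES (LTRIM(RTRIM(@RawValue)));

SELECT
    InputValue,
    DATALENGTH(InputValue) AS StoredBytes  -- 1: only the 'a' is stored
FROM @UserInput;
```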

Joker’s Wild

Published on: 2019-06-11

This past weekend I had a blast presenting Joker’s Wild with Erin Stellato (blog|twitter), Andy Mallon (blog|twitter), and Drew Furgiuele (blog|twitter).

Watch it here!

Table of contents:

  • What is Joker’s Wild? Watch this to witness Andy’s amazing PowerPoint animation skills (0:00)
  • Bert demos SQL injection (2:25)
  • Erin recollects desserts (9:55)
  • Andy shares an automation tip (18:55)
  • Andy explains an ANSI standard (23:10)
  • Drew describes containers (27:02)

While a video doesn’t quite give you the same experience as being in the room with dozens of other data professionals laughing and shouting along, hopefully it gives you an idea.

Here’s a behind-the-scenes peek at how it all came together.

A Different Kind Of Presentation

I’ve wanted to do a “fun” SQL Server presentation for a while; something that would be lighthearted while still delivering (some) educational value.

I ran some ideas past Erin after SQL Saturday Cleveland earlier this year. We came up with several ideas we could incorporate into the presentation (thanks to Paul Popovich and Luis Gonzalez for also helping us generate a lot of these ideas) and at that point I think Erin came up with the name “Joker’s Wild.”

Blind Commitment

Fast forward a few months: occasionally I’d talk about the presentation idea with people but still wasn’t any closer to actually making it real.

Then a few days before the SQL Saturday Columbus submission deadline, Erin reached out to ask if we were going to submit. We recruited Andy and Drew to help present and submitted an abstract:

Come one, come all to the greatest (and only) SQL Server variety show at SQL Saturday Columbus.

This session features a smattering of lightning talks covering a range of DBA- and developer-focused SQL Server topics, interspersed with interactive games to keep the speakers and audience on their toes.

Plan for plenty of sarcasm, laughs, and eye rolls in this thoughtfully structured yet highly improvised session.

We can’t guarantee what you’ll learn, but we do promise a great time!

*Slot machine will not generate real money for “winners”

Structure

If that abstract reads a little vague, it’s because at that point we didn’t know exactly what we wanted to do yet. Once our session was selected though it was time to come up with a concrete plan (big thank you to David Maxwell and Peter Shore for giving us the opportunity to try something like this).

After some discussion, Erin, Andy, Drew, and I came up with the following structure:

  1. The audience will choose the lightning talk topic
  2. We will spin the “Wheel of Misfortune” to determine the presentation style, including:
    • Slides I didn’t write
    • Random slide timing
    • Who has the clicker?
  3. We will play some SQL Server themed Jeopardy and Pictionary with the audience

After our first meeting Andy created the world’s most versatile PowerPoint presentation that would run the show. Seriously, if you haven’t watched the video above yet, go watch it – that introduction is all PowerPoint goodness created by him.

The Session and Final Thoughts

I’m incredibly happy with how it all went. The session had a planned structure, but a lot of it was still left up to improvisation. I had a lot of fun preparing and presenting, and I think the session was well received by the audience. Jeopardy and Pictionary were a lot of fun too, even though I ran out of video recording space so I couldn’t include them in the video.

I hope we have another opportunity to present this session again in the future.

Thank you again David and Peter for letting us do this session as part of SQL Saturday Columbus.

Thank you to our audience for taking a risk on attending a session you didn’t know much about, and for your great participation.

And thank you Erin, Andy, and Drew for helping do something fun and different.
