I had a discussion yesterday with Kendal Van Dyke (@SQLDBA) about IDENT_CURRENT(). Basically, Kendal had this code, which he had tested and trusted on its own, and wanted to know if he could rely on IDENT_CURRENT() being accurate in a high-scale, concurrent environment:
BEGIN TRANSACTION;
INSERT dbo.TableName(ColumnName) VALUES('Value');
SELECT IDENT_CURRENT('dbo.TableName');
COMMIT TRANSACTION;
The reason he had to do this is because he needs to return the generated IDENTITY value to the client. The typical ways we do this are:
- SCOPE_IDENTITY()
- OUTPUT clause
- @@IDENTITY
- IDENT_CURRENT()
Some of these are better than others, but that's been done to death, and I'm not going to get into it here. In Kendal's case, IDENT_CURRENT was his last and only resort, because:
- TableName had an INSTEAD OF INSERT trigger, making both SCOPE_IDENTITY() and the OUTPUT clause useless from the caller, because:
- SCOPE_IDENTITY() returns NULL, since the insert actually happened in a different scope
- the OUTPUT clause generates error Msg 334 because of the trigger
- He eliminated @@IDENTITY; consider that the INSTEAD OF INSERT trigger could now (or might later be changed to) insert into other tables that have their own IDENTITY columns, which would mess up the returned value. This would also thwart SCOPE_IDENTITY(), if it were possible.
- And finally, he couldn't use the OUTPUT clause (or a resultset from a second query of the inserted pseudo-table after the eventual insert) within the trigger, because this capability requires a global setting, and has been deprecated since SQL Server 2005. Understandably, Kendal's code needs to be forward compatible and, when possible, not rely completely on certain database or server settings.
So, back to Kendal's reality. His code seems safe enough – it's in a transaction, after all; what could go wrong? Well, let's take a look at a few important sentences from the IDENT_CURRENT documentation (emphasis mine, because these warnings are there for good reason):
…
Be cautious about using IDENT_CURRENT to predict the next generated identity value. The actual generated value may be different from IDENT_CURRENT plus IDENT_INCR because of insertions performed by other sessions.
Transactions are barely mentioned in the body of the document (only in the context of failure, not concurrency), and no transactions are used in any of the samples. So, let's test out what Kendal was doing, and see if we can get it to fail when multiple sessions are running concurrently. I'm going to create a log table to keep track of the values generated by each session – both the identity value that was actually generated (using an after trigger), and the value claimed to be generated according to IDENT_CURRENT().
First, the tables and triggers:
-- the destination table:
CREATE TABLE dbo.TableName
(
ID INT IDENTITY(1,1),
seq INT
);
-- the log table:
CREATE TABLE dbo.IdentityLog
(
SPID INT,
seq INT,
src VARCHAR(20), -- trigger or ident_current
id INT
);
GO
-- the trigger, adding my logging:
CREATE TRIGGER dbo.InsteadOf_TableName
ON dbo.TableName
INSTEAD OF INSERT
AS
BEGIN
INSERT dbo.TableName(seq) SELECT seq FROM inserted;
-- this is just for our logging purposes here:
INSERT dbo.IdentityLog(SPID,seq,src,id)
SELECT @@SPID, seq, 'trigger', SCOPE_IDENTITY()
FROM inserted;
END
GO
Now, open a handful of query windows, and paste this code, executing them as close together as possible to ensure the most overlap:
SET NOCOUNT ON;
DECLARE @seq INT = 0;
WHILE @seq <= 100000
BEGIN
BEGIN TRANSACTION;
INSERT dbo.TableName(seq) SELECT @seq;
INSERT dbo.IdentityLog(SPID,seq,src,id)
SELECT @@SPID,@seq,'ident_current',IDENT_CURRENT('dbo.TableName');
COMMIT TRANSACTION;
SET @seq += 1;
END
Once all of the query windows have completed, run this query to see a few random rows where IDENT_CURRENT returned the wrong value, and a count of how many rows in total were affected by this misreported number:
SELECT TOP (10)
id_cur.SPID,
[ident_current] = id_cur.id,
[actual id] = tr.id,
total_bad_results = COUNT(*) OVER()
FROM dbo.IdentityLog AS id_cur
INNER JOIN dbo.IdentityLog AS tr
ON id_cur.SPID = tr.SPID
AND id_cur.seq = tr.seq
AND id_cur.id <> tr.id
WHERE id_cur.src = 'ident_current'
AND tr.src = 'trigger'
ORDER BY NEWID();
Here are my 10 rows for one test:
I found it surprising that almost a third of the rows were off. Your results will certainly vary, and may depend on the speed of your drives, recovery model, log file settings, or other factors. On two different machines, I had vastly different failure rates - by a factor of 10 (a slower machine only had in the neighborhood of 10,000 failures, or roughly 3%).
Immediately it's clear that a transaction isn't enough to prevent IDENT_CURRENT from pulling the IDENTITY values generated by other sessions. How about a SERIALIZABLE transaction? First, clear the two tables:
TRUNCATE TABLE dbo.TableName;
TRUNCATE TABLE dbo.IdentityLog;
Then, add this code to the beginning of the script in multiple query windows, and run them again as concurrently as possible:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
This time, when I run the query against the IdentityLog table, it shows that SERIALIZABLE may have helped a little bit, but it hasn't solved the problem:
And while wrong is wrong, it looks from my sample results that the IDENT_CURRENT value is usually only off by one or two. However, this query should yield that it can be *way* off. In my test runs, this result was as high as 236:
SELECT MAX(ABS(id_cur.id - tr.id))
FROM dbo.IdentityLog AS id_cur
INNER JOIN dbo.IdentityLog AS tr
ON id_cur.SPID = tr.SPID
AND id_cur.seq = tr.seq
AND id_cur.id <> tr.id
WHERE id_cur.src = 'ident_current'
AND tr.src = 'trigger';
Through this evidence we can conclude that IDENT_CURRENT is not transaction-safe. It seems reminiscent of a similar but almost opposite problem, where metadata functions like OBJECT_NAME() get blocked - even when the isolation level is READ UNCOMMITTED - because they don't obey surrounding isolation semantics. (See Connect Item #432497 for more details.)
On the surface, and without knowing a lot more about the architecture and application(s), I don't have a really good suggestion for Kendal; I just know that IDENT_CURRENT is *not* the answer. :-) Just don't use it. For anything. Ever. By the time you read the value, it could already be wrong.
I'm probably being a Captain Oblivious here but I don't understand why OUTPUT is not an option — the error 334 requires that there be an INTO clause, and surely using table variables or temp tables would work in both inside the trigger (for purpose of passing IDs to related tables) and outside the trigger to then return the resultset in a next SQL statement, right?
Ben, no, the error message can be raised without an INTO clause (and in fact is raised in this case because there is *not* an INTO clause). Take the table I created above, add the INSTEAD OF INSERT trigger, and then run this:
On my system, I get:
The target table 'dbo.TableName' of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause.
Now, on the flip side, create a table variable and use an INTO clause, and it "seems" to work:
@@IDENTITY here would return the right result, but only because my INSTEAD OF INSERT trigger is simplistic and doesn't perform any subsequent inserts… which it certainly could in the real world.
I don't know that doing anything inside the trigger is going to help, because he still needs a supported and forward-compatible way to get that output back to the caller (if all he had to do was insert the generated ID into another table, this post wouldn't even be here :-)).
So it seems like the only way to pull the generated ID in this specific scenario is to pull the MAX under SERIALIZABLE. Of course this relies on the IDENTITY not being re-seeded, SET IDENTITY_INSERT ON not being used, no multi-row inserts, etc. – and with the full understanding of how necessarily evil SERIALIZABLE can be, and what it can do to transaction throughput.
Thank you for the explanation, Aaron. I'm fairly sure that I used OUTPUT inside the trigger body for purpose of passing IDs to related tables and didn't have that problem – I wasn't expecting OUTPUT to return "0" in the code you showed.
I tested and indeed whenever OUTPUT is used inside the trigger body, it behaves predictably but not so in the calling code. This is bizarre because it seems to me that the OUTPUT inside the trigger can get the right value and is processed before the OUTPUT in the calling code is processed. I hadn't tried to use the 'return resultset from trigger' feature ever (seemed wrong to me) so I naively assumed that OUTPUT in calling code would behave same way as it does inside the trigger body.
My altered trigger code:
My altered calling code:
Thank you for the useful lesson, Aaron.
Maybe this is a case for sequences?
http://technet.microsoft.com/en-us/library/ff878091.aspx
Jeff,
Yes, it's possible, though I'm not sure he's on SQL Server 2012 (when SEQUENCE was introduced). I also don't know if this is the only piece of the code that has to be changed – other code may insert into the table and not care about the returned value at all, but they would also have to be changed to at least call some central procedure that grabbed the sequence and inserted it. Removing the IDENTITY property is not a schema change to be taken lightly, and having to put SET IDENTITY_INSERT ON statements everywhere doesn't seem like a good solution either.
I would have thought that the point at which Kendal chose to use an INSTEAD OF INSERT trigger was probably also the point that they moved away from being "supported and forward-compatible".
The whole point of an INSTEAD OF INSERT is that you are going to do the insert your own way right? Maybe even not do the insert at all or instead of the single insert into the selected table do multiple inserts into other tables. Seems to me that writing the Id from the INSTEAD OF trigger to a new logging table is probably the most forward(and backward)-compatible thing to do Especially since it might be multiple Ids or no Id at all.
You used the alternate table method in your example so you could get the correct value for your tests by linking back via @@SPID why not do the same for the production?
I think someone else mentioned Sequences but I think you are right that they are going to be difficult and risky to implement at this point.
Anyway I think my point here is that its not really Identity_Current that is the problem here its the INSTEAD OF trigger. Have I got that all wrong?
Hi Joss,
Admittedly, I was privy to two additional layers of conversation that are not obvious here: a private conversation with Kendal directly, and a closed conversation on a private e-mail distribution list. So, I have a little more knowledge of Kendal's situation than the average reader here.
You are right that the INSTEAD OF trigger is the primary source of Kendal's problem, and is what prevents him from using other, better alternatives to IDENT_CURRENT (which was his "last hope" for precisely this reason). I tried to paint the picture that the INSTEAD OF trigger was necessary, as is the case in many environments: there is an INSTEAD OF trigger that changes the input to some table because that was far easier to implement than changing 20 distributed applications that have ad hoc insert statements that don't include a value for a certain column, specify an explicit value that is no longer valid, etc. There can be many reasons why it is not possible to overhaul the design (and all those applications) to make up for some limitation like this, and a workaround must be found instead.
In my test I used @@SPID to simply show that the value for a session could be different from the one actually created in the table. I didn't use it to find the right value – I have no method to figure out from my tests which value was "stolen" by another session but that IDENT_CURRENT made me believe I generated, and in fact I believe there are IDENTITY values that were assigned that *no* session was told by IDENT_CURRENT that it was the one generated by *any* session. If you want to put together a repro where you use @@SPID to somehow return the right IDENTITY value to the client, I bet I can break it. :-)
The fact remains, of course, that IDENT_CURRENT is unreliable at scale, regardless of whether there is an INSTEAD OF trigger in the picture. Most people wouldn't even consider using it unless they were squarely stuck in a situation like Kendal's, where he couldn't use SCOPE_IDENTITY(), @@IDENTITY, the OUTPUT clause, etc. — all far more reliable methods of determining the last IDENTITY value generated.
Cheers,
Aaron
Hi Aaron,
Thanks for your reply, I really appreciate it. Just wanted to make sure I had the right end of the stick. I completely sympathise with the world of necessary evils ;)
Its always interesting to see the nuts and bolts of why decisions are bad. It makes it much easier when trying to avoid them in the future when you can have some concrete evidence for why something is not going to work, rather than just a vague feeling that it doesn't sound like a good idea. So I really thank you for taking us through the nitty gritty of why using IDENT_CURRENT is a bad idea in just about any circumstance even when its your last hope.
Much appreciated
J
Very nice article! Shared!