In this continuation of my "knee-jerk performance tuning" series, I'd like to discuss four common problems I see with using temporary tables. Any one of these problems can cripple a workload, so they're worth knowing about and looking for in your environment.
Problem 1: Using Temporary Tables Where They're Not Needed
Temporary tables have a variety of uses (probably the most common is to store an intermediate result set for later use), but you have to remember that when you introduce a temporary table into a query, you're interrupting the flow of data through the query processor.
Think of the population of a temporary table as a hard stop, as there's a query (let's call it the producer) to produce the intermediate result set, which is then stored in the temporary table in tempdb, and then the next query (let's call it the consumer) has to read the data from the temporary table again.
I've often found that some parts of a workload actually perform better when the temporary table is completely removed, so the data flows from the producer part of the query to the consumer part of the query without having to be persisted in tempdb, and the query optimizer can produce a more optimal overall plan.
You might now be thinking, "so why would someone use a temporary table if it makes things slower?" – and rightly so! In cases like that, I've found that the use of a temporary table has become institutionalized in the development team; someone found that using a temporary table increased performance many years ago, so temporary tables became the default design choice.
This can be a hard thing to change, especially if you have a senior developer or manager who's convinced that temporary tables should always be used. The simple thing to try is to pick an expensive query (for instance, a long-running one, or one that's executed many times per second) and remove one or more of the temporary tables to see whether performance increases without them. And if so, there's your proof to show the intransigents!
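As a trivial sketch of the kind of before/after test I mean (the tables, columns, and date filter here are invented purely for illustration):

```sql
-- Producer/consumer pattern: the intermediate result set is persisted in tempdb
SELECT o.CustomerID, o.OrderDate, o.TotalDue
INTO #RecentOrders
FROM dbo.Orders AS o
WHERE o.OrderDate >= '20240101';

SELECT c.CustomerName, SUM(ro.TotalDue) AS SalesThisYear
FROM #RecentOrders AS ro
JOIN dbo.Customers AS c ON c.CustomerID = ro.CustomerID
GROUP BY c.CustomerName;

DROP TABLE #RecentOrders;

-- The same logic with the temporary table removed: the data flows straight from
-- the producer to the consumer, and the optimizer gets to plan the whole thing at once
SELECT c.CustomerName, SUM(o.TotalDue) AS SalesThisYear
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
WHERE o.OrderDate >= '20240101'
GROUP BY c.CustomerName;
```

Compare the two versions' durations, reads, and plans under realistic concurrency before declaring a winner.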
Problem 2: Lack of Filtering When Populating Temporary Tables
Even if you can't remove a temporary table, you may be able to drastically improve performance by making sure that the code that populates the temporary table is correctly filtering the data pulled from source tables.
I've lost count of the number of times I've seen a temporary table being populated with code that starts as SELECT *, includes a few unrestrictive joins, and has no WHERE clause, and then the later query that uses the temporary table only uses a few columns and has a WHERE clause that hugely scopes down the number of rows.
I remember one case where a temporary table in a stored procedure was aggregating 15 years' worth of data from the main database, and then only the current year's data was being used. This was repeatedly causing tempdb to grow until it ran out of space on the disk volume, and the stored procedure would then fail.
Whenever you're populating a temporary table, only use the source table columns that are necessary, and only use the rows that are necessary – i.e. push the filter predicates up into the temporary table population code. Not only will this save space in tempdb, it will also save a lot of time from not having to copy unneeded data from the source table (and potentially remove the need to read source database pages from disk in the first place).
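Here's a contrived sketch of what I mean (again, the object names are made up):

```sql
-- Alternative 1: unfiltered population - every column and every year of data lands
-- in tempdb, even though the later query only uses a few columns for the current year
SELECT *
INTO #SalesData
FROM dbo.SalesHistory;

DROP TABLE #SalesData;

-- Alternative 2: push the column list and the filter predicate into the population query
SELECT SaleID, SaleDate, Quantity, LineTotal
INTO #SalesData
FROM dbo.SalesHistory
WHERE SaleDate >= '20240101';
```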
Problem 3: Incorrect Temporary Table Indexing
Just like with regular tables, you should only create the indexes that are actually going to be used by the later query code to help the query performance. I've seen plenty of cases where there's a nonclustered index per temporary table column, and single-column indexes that are chosen without analyzing the later code are often quite useless. Now combine useless nonclustered indexes with a lack of filtering when populating the temporary table, and you've got a recipe for enormous bloating of tempdb.
Also, in general, it's faster to create the indexes after the table has been populated. This gives the added bonus that the indexes will have accurate statistics, which can further help the query as the query optimizer will be able to do accurate cardinality estimation.
Having a bunch of nonclustered indexes that aren't used wastes not just disk space, but also the time needed to create them. If this is in code that's executed frequently, removing these unneeded indexes that are created every time the code runs can have a significant effect on overall performance.
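As a sketch of the populate-then-index pattern described above (the object names are invented):

```sql
-- Populate the temporary table first...
SELECT ProductID, CategoryID, ListPrice
INTO #Products
FROM dbo.Products
WHERE Discontinued = 0;

-- ...then create only the index the later code will actually use; built after the
-- load, it also carries accurate statistics for cardinality estimation
CREATE CLUSTERED INDEX CIX_Products_CategoryID ON #Products (CategoryID);

SELECT CategoryID, AVG(ListPrice) AS AvgListPrice
FROM #Products
GROUP BY CategoryID;
```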
Problem 4: tempdb Latch Contention
It's quite common for there to be a latching bottleneck in tempdb that can be traced back to temporary table usage. If there are lots of concurrent connections running code that creates and drops temporary tables, access to the database's allocation bitmaps in memory can become a significant bottleneck.
This is because only one thread at a time can be changing an allocation bitmap to mark pages (from the temp table) as allocated or deallocated, and so all the other threads have to wait, decreasing the workload throughput. Even though there has been a temporary table cache since SQL Server 2005, it's not very large, and there are restrictions on when the temporary table can be cached (e.g. only when it's less than 8MB in size).
Traditional ways to work around this problem have been to use trace flag 1118 and multiple tempdb data files (see this blog post for more info), but another thing to consider is to remove the temporary tables altogether!
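For reference, the traditional mitigations look something like this (the file name, path, and sizes are placeholders to adjust for your environment; on SQL Server 2016 and later, tempdb uses uniform extent allocations by default and the trace flag is no longer needed):

```sql
-- Allocate uniform extents instead of mixed extents, reducing SGAM contention
DBCC TRACEON (1118, -1);

-- Add another, equally sized tempdb data file to spread the allocation bitmap contention
ALTER DATABASE tempdb
ADD FILE (NAME = N'tempdev2', FILENAME = N'T:\TempDB\tempdev2.ndf',
          SIZE = 4GB, FILEGROWTH = 512MB);
```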
Summary
Temporary tables can be very useful, but they're very easily and commonly used incorrectly. Whenever you're writing (or reviewing) code that uses a temporary table, consider the following:
- Is this temporary table really needed?
- Is the code that populates the table using the correct filtering to limit the temporary table size?
- Are the indexes created after the table is populated (in general), and are they actually being used by the later code?
Paul White has a couple of great posts (here and here) about temporary object usage and caching that I recommend reading too.
And one last thing: if you decide not to use a temporary table, don't just replace it with a table variable, a common table expression, or a cursor (all of which are common ways that people try to "optimize away" the temporary table). Figure out the most efficient way to (re)write the code – there is no "one size fits all" answer.
Until next time, happy troubleshooting!
Good post.
I find that temp tables are best used to split queries when they end up being small and they improve cardinality estimation in the rest of the query. Sometimes you can split off small "preliminary" processing from the main, expensive part of the query. This can do wonders, or help nothing at all and cause damage.
One sad aspect is that the "Best Practices Advisor" tool shipped by the SQL team used to recommend "Consider replacing with a table variable" every time it saw a temp table in use. It always seemed pretty tragic as general advice.
The way to avoid using a temp table in that scenario is to use a TOP 10000000000 statement on the nested sub-query and you essentially split the plan into 2 bits.
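Roughly this shape (table names invented for illustration):

```sql
SELECT c.CustomerName, pre.OrderCount
FROM
(
    -- The huge TOP is the trick being described; it's meant to discourage the
    -- optimizer from merging this sub-query into the outer query
    SELECT TOP (10000000000) o.CustomerID, COUNT(*) AS OrderCount
    FROM dbo.Orders AS o
    WHERE o.OrderDate >= '20240101'
    GROUP BY o.CustomerID
) AS pre
JOIN dbo.Customers AS c ON c.CustomerID = pre.CustomerID;
```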
Might still result in 1 row estimated. Also does not fix bad histograms. Sometimes you need the stats of a temp table.
I believe you're far better off defining the temp table's clustered index before loading the table.
It's extraordinarily rare to even need a nonclustered index on a temp table, so you're definitely on to a lot of waste there. Sometimes people create the same nonclus index(es) on a temp that are on the original table the temp data came from, but of course just because an index was valuable on the original table doesn't mean it's needed on the temp table.
I've found the best times to use temporary tables is when you have 1 query with many joins, and SQL Server gets confused between what I'd call filtering joins and presentation joins. Do a query first with the filtering joins needed to get the data down to a more manageable level, then use that temp table with the presentation joins, to make the data look more meaningful to the end user.
How does your opinion change, Paul, if the data sizes you must deal with are ERP-sized, as in:
– User X is permitted to see 80% of inventory items, via a list (criteria are complex and dynamic, list of IDs is refreshed nightly), whereas others see only 5%, and there are quite a number of Users with no predictable "this is a normal user profile" because every business runs differently.
– Inventory items total 38 million
– Need to INNER JOIN against a 450 million row Inventory Status table
– INNER JOIN against a 25 million row Replenishment Orders table
– and more joins
The list of "visible IDs" is needed several times, typically, in any given calculation-style SP or multiple-resultset-returning SP.
Thus, building a temp table of all visible items as step 1 seems obligatory, rather than running a virtual-table query to pull the items every time they're needed.
If there's some technique that would make virtual tables better than a temp, I'm all ears.
Hey Scott – for the clustered index, it really depends on whether the data being inserted is sorted in the order of the cluster key, otherwise there will be lots of page splits from the random inserts, and whether you want index column statistics on the clustered index (which won't be there if you create the index before populating). Thanks
If you need that same result set multiple times, then caching it in a temp table is going to be more efficient than calculating it multiple times. Remember I didn't say that temp tables are bad, I said they're often used where they're not needed, and test to see if not using one would be better. In the case you describe, having that result set cache would definitely be better. Thanks
I read Paul White's very helpful article at https://sql.kiwi/2012/08/temporary-tables-in-stored-procedures.html.
In view of the lack of stats for a predefined key, would you say that UPDATE STATISTICS #table followed by OPTION (RECOMPILE) would be a viable technique for highly variable populations of the #table, if the other prerequisite, sorted data, was in place when inserting the temp table's rows?
Under most circumstances the UPDATE STATISTICS should cause the plan to be invalidated, but because there are circumstances where that's not the case, the OPTION (RECOMPILE) would certainly force the issue.
Thanks Paul for another great post! I do have a question…
Wouldn't creating the temp table first (where it is justified), then populating it, and then creating the index cause the entire stored proc to recompile due to the interleaved DDL and DML (we are on SQL 2012)?
Can we create the temp table with a primary key, then load the data, and then update statistics on the table before doing selects/lookups against it? Is this any better? Or will this also trigger an entire stored proc recompile as the stats are updated?
Can you please advise? Thanks, Balaji.
Yes, interleaving DML and DDL will cause some recompilation, but that's a small consideration compared to having the stats correct. If the data is sorted, then index, load, update stats will work. I'm not sure how much recompilation that will cause, but again, recompilation nowadays isn't anywhere near as bad as it used to be in older versions. If the data isn't sorted, you'll have massive page splits as the index is populated with essentially random inserts. You need to experiment with your situation to find out which method is the most efficient for you, as there's no black-or-white 'always do this' or 'always do that' answer.
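A rough sketch of the index, load, update stats shape (names purely illustrative):

```sql
CREATE TABLE #Work
(
    CustomerID int   NOT NULL,
    OrderDate  date  NOT NULL,
    TotalDue   money NOT NULL
);

-- Index first, then load in cluster-key order to avoid random-insert page splits
CREATE CLUSTERED INDEX CIX_Work_CustomerID ON #Work (CustomerID);

INSERT INTO #Work (CustomerID, OrderDate, TotalDue)
SELECT o.CustomerID, o.OrderDate, o.TotalDue
FROM dbo.Orders AS o
WHERE o.OrderDate >= '20240101'
ORDER BY o.CustomerID;

-- Refresh the statistics after the load, and force a fresh plan for the consumer query
UPDATE STATISTICS #Work;

SELECT CustomerID, SUM(TotalDue) AS TotalDue
FROM #Work
GROUP BY CustomerID
OPTION (RECOMPILE);
```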
That's where we've often ended up using temporary tables. We usually did not start with a temporary table, but added it when trying to optimize. Sometimes, it helps; sometimes it does not. Each case is different.
Question re: Problem #4, does the same latching contention apply to table variables too?
Yes it does.
Agree… That's a method I also use to reduce locking on a main table while performing other actions on the data.