Harmful, Pervasive SQL Server Performance Myths

Between my travels, presentations, and Q & A moderation, I talk to a lot of people about a wide variety of SQL Server performance issues. Recently, I've had a few interactions where people believe things that are either altogether incorrect, or are only correct in a very narrow set of use cases. Yet their adamance that these things are universally true is unsettling.

So, I thought I would start a new series to help quell some of these myths. Not to point at people and prove that they're wrong, but to stop the spread. When they make these blanket statements in their workplace, or on twitter, or in forums, if they go unchecked, they can "teach" impressionable or less experienced users.

Note that I don't intend to prove that these things are never true, because some can certainly be true in isolated or contrived scenarios. My aim is simply to demonstrate at least one case where it is not true; hopefully, this can start to change these stubborn mindsets.

Here are some of the "facts" I've been told recently, in no particular order:

"A clustered index is always better than a non-clustered index"
"Dynamic SQL made my query slow"
"PIVOT is faster than SUM(CASE)"
"NULLs always cause terrible performance issues"
"Execution plans are useless except for missing indexes"
"NOLOCK is okay because a lot of people use it"
"Oversizing varchar/nvarchar columns is okay"

As I write each post, I'll update this page by linking the corresponding item in the above list.

Do you have any performance myths that are passed around as absolute fact, but you suspect (or maybe even know) that they aren't always true? Let me know in the comments below, on twitter, or at abertrand@sentryone.com.

42 thoughts on “Harmful, Pervasive SQL Server Performance Myths”

Koen Verbeeck says:

March 9, 2017 at 2:16 PM

* TRUNCATE TABLE is not logged (and can't be rolled back)
* A seek is always better than a scan

David Alcock says:

March 9, 2017 at 3:04 PM

Great series, very much looking forward to reading more!!

One that I often come across is that "scans are bad" and that's the end of it. Yes, in most situations a scan operator might indicate that there's potential for some tuning, I'm happy with that, but at the same time there are times when scans are ideal; small tables, low selectivity etc (that's why the optimiser chooses them).

Erik Darling says:

March 9, 2017 at 3:11 PM

Having seen many of these discussions unfold on dba.se recently, I'm looking forward to the drubbing.

Sandra says:

March 9, 2017 at 3:18 PM

Add to myths: Joining tables hurts performance (argument for denormalization).

Matthew Darwin says:

March 9, 2017 at 3:30 PM

A few that I come across:-

A table scan always reads the whole table.
Key lookups are always bad.
Partitioning tables will increase performance.

And, bizarrely, read uncommitted isolation level is better than NOLOCK (no idea who dreamt that one up….)

Anurag Saxena says:

March 9, 2017 at 3:55 PM

Eagerly waiting your response on these points..

Stephen A Mangiameli says:

March 9, 2017 at 5:48 PM

Table variables are more efficient than temp tables.

Xenia says:

March 9, 2017 at 7:51 PM

– Cursors are always "bad" decision.
– The primary key must be clustered index.
– Any data colmn independent from the data type will not request space as far as we don't store values, i.e. null

Mark Allen says:

March 10, 2017 at 12:31 AM

Views should NEVER be used they kill ALL queries performance

sqL_handLe says:

March 10, 2017 at 5:54 AM

Query optimizer will arrive at same plan for query regardless of specified join order.

@dbayoungling says:

March 10, 2017 at 6:51 AM

Linked Servers are the fastes and best way for what ever

Deniz Enemeci says:

March 10, 2017 at 8:00 AM

Heap tables are always bad.

SeanR says:

March 10, 2017 at 2:30 PM

Oh god… This *100.

Kevin Fries says:

March 10, 2017 at 10:26 PM

My favorite howler, told to me by a developer with a query that ran for hours.

The order of your indexed columns should always follow the order of the columns in the WHERE clause. This was on an accounting detail table with 1 billion rows and growing.

Jake says:

March 11, 2017 at 1:21 PM

In SQL Server, the INFORMATION_SCHEMA views are better to use than the sys catalog views because they require fewer joins to get the metadata you want.

Martijn van der Munnik says:

March 12, 2017 at 12:55 PM

A page life expectancy of 300 sec is fine, no need to worry at all with that 500GB maxmem instance.

Sam Bishop says:

March 13, 2017 at 3:57 AM

Temp tables are always better than table variables.

Mark gribler says:

March 13, 2017 at 10:34 AM

Yes the defragmentation argument and the confusion between rebuilding and reorg snd thr staristics. Also the amazing use of frofiling and the dta ….someone please simplify this and make it less of a dark art…

Doug Lane says:

March 13, 2017 at 5:28 PM

At SQL Saturday in Chicago last weekend, I asked Adam Machanic which idea about query tuning people found most surprising. His answer: that IN and EXISTS are logically equivalent and that replacing one with the other does nothing to improve performance. "IT'S A MYTH!!!" he shouted.

I was surprised because I'd always believed EXISTS would execute quicker. Apparently not.

Brian Shay says:

March 13, 2017 at 5:33 PM

0_o Someone said that?!

SqlZim says:

March 13, 2017 at 9:38 PM

I don't know how pervasive this is, but:

"It is understood that rewriting the query with temporary tables can lead to some slowdown due to additional expenses: creation of temporary tables. If the database will not be mistaken with the choice of the query plan, it will perform the old query faster than a new one. However, this slowdown will always be negligible. Typically the creation of a temporary table takes a few milliseconds. That is, the delay can not have a significant impact on system performance, and usually can be ignored."

http://stackoverflow.com/a/16770129/2333499

mistasteve says:

March 14, 2017 at 9:33 PM

Upvote! I remember seeing this thrown about by people who should have known better going back to the days before they used this to advocate NoSQL solutions even.

MrTea says:

March 16, 2017 at 3:10 PM

Set MAXDOP to one in order to sort out your parallelism/CXPACKET problems. (alternatively just set it to one for no reason at all)
Use DISTINCT at the start of all your queries to sort out duplicate result problems.
Use NOLOCK on your queries to sort your performance[concurrency] problems.
Use OPTION(RECOMPILE) in all your stored procedures because parameter sniffing is bad.
Don't use transactions because they slow down your queries.
Don't use foreign keys because they slow down your queries as well.
I should create this specific index because missing indexes/dta suggested it.
My backups are not failing therefore my data is safe.

David Hay says:

April 13, 2017 at 5:34 AM

CTEs are great for performance

Reuben Anderson says:

April 13, 2017 at 6:52 AM

No way! Wow.

Tom Brannon says:

April 13, 2017 at 11:40 AM

I have not tested this in SQL Server, but when I was working with Oracle I was running a query with IN on a large table with somebody watching and I was getting impatient. I'd heard about EXISTS and had time to find and example on the internet, make a few mistakes coding it and it still finished before the original query. Several minutes for IN, a couple of seconds for EXISTS. I was not allowed to look at plans but I expect it was a scan vs a seek using the index. I have become very fond of EXISTS ever since.

Mark says:

April 13, 2017 at 11:42 AM

"col BETWEEN x and y" never uses indexes, you have to use "col >= x AND col <= y" instead.

Chris says:

April 13, 2017 at 1:53 PM

SQL is powerful – you should do everything in the database.

This was used to justify doing all sorts of CAST, CONVERT and string formatting operations in stored procedures that properly belonged in the application layer.

ScottPletcher says:

April 13, 2017 at 2:26 PM

An identity column should be used as the clustering key on (nearly) all tables. As a DBA, just let me say, ARGHHHHH.

Bob Feldsien says:

April 13, 2017 at 2:58 PM

Re read uncommitted is better than NOLOCK: In my opinion, yes and no. Technically there is no difference, true. What I tell my dev team is that when I see NOLOCK all over a stored procedure, I assume they are just copying what someone else did. It was probably a bad idea then and probably is now too. On the other hand, if I see SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED at the start of a stored procedure, I am inclined to think the developer actually understands the issue and made a conscious decision to use it. I regard it as a deviation from best practice and as such there should be an inline comment explaining why.

Paul Hunter says:

April 13, 2017 at 3:21 PM

A favorite of mine is…
Table variables are faster than temp tables because the variable is "in memory".

Aaron Bertrand says:

April 13, 2017 at 3:22 PM

Paul, funny, Derik Hammer recently published that one:

https://sqlperformance.com/2017/04/performance-myths/table-variables-in-memory

Jerry Scannell says:

April 13, 2017 at 3:32 PM

This is something I always felt was true but after seeing some of the comments here, I'm not sure any longer.

If I place entries in a where clause that entirely uses 'and' such that the most likely drop point is first, the query will be faster. Is this right?

Aaron Bertrand says:

April 13, 2017 at 3:33 PM

No, except in a few forced cases, SQL Server will not necessarily process your WHERE clause in the order you wrote it – though it still might make sense to write it that way for maintainability purposes.

SAinCA says:

April 13, 2017 at 4:59 PM

MAXDOP 1 = Enterprise-wide Standard (Not joking)

Paul K says:

April 13, 2017 at 5:45 PM

Recently did a lot of work to implement partitioning and eventually got a 10-20% performance boost. Then got a new SAN and performance went up by 50%. Leads me to think partitioning is not worth the effort.
We do use NOLOCK a lot but I think we know what we are doing :-)
Setting MAXDOP to 1 *did* solve a parallelism/CXPACKET problem (see comment from mrTea) but we know we have to go back into the code and fix it for real.
We used to be in the mode of adding indexes all the time and now we are getting some performance boost by pulling them back out. Hoping the pendulum stops swinging.

Adán Bucio says:

April 13, 2017 at 7:55 PM

I guess the myth part is not about the execution speed but about being equivalent. One being quicker than the other… it depends.

Adán Bucio says:

April 13, 2017 at 7:56 PM

The other one I hear frequently: CTE's are bad for performance.

Sandra says:

April 13, 2017 at 11:57 PM

Gail Shaw has good info comparing IN vs EXISTS and IN vs JOIN
"Exists is better for when comparisons are needed on two or more columns. For eg, this cannot be done easily with an IN"
http://sqlinthewild.co.za/index.php/2009/08/17/exists-vs-in/

Ron Clauw says:

April 14, 2017 at 2:32 PM

"Cursors (FAST_FORWARD / STATIC) are always faster than do while loops."

Ray H says:

April 15, 2017 at 4:49 PM

Ha, LOL. I tell developers that because I find that their answer to most questions involves nested views with scalar UDFs in the the column list :)

Bill says:

April 24, 2017 at 1:21 PM

You can determine what is most efficient by solely inspecting T-SQL, no matter what the costs may be.

Comments are closed.