In my last post, I illustrated one reason you should stop using deprecated system tables like
sysprocesses. This wasn't for performance reasons, directly, or to simply follow Microsoft's documented best practices, but revolved more around the decisions you might make when you only have access to some of the data.
This time around, I want to talk about a feature included with SQL Server client tools that we should not be using these days – not just because it is deprecated but, more importantly, because it has the potential to completely take a server down:
SQL Server Profiler
Since SQL Server 2012 (and, to a much lesser extent, since SQL Server 2008), the most complete and flexible solution for monitoring events on a SQL Server instance has been Extended Events. On the other hand, since it was first invented (roughly right around the time I started my career), SQL Server Profiler has been one of the most prolific ways to accidentally bring a server to its knees.
Not too long ago, something like this happened to us. A developer made a change to their application to reduce the number of round-trips they were making. They ran the application locally and in our development environment, using Profiler with a filtered trace, to confirm their changes were working as expected. Let me remind you at this point that the official documentation for SQL Server Profiler includes the following warning:
Anyway, when they deployed the new version of their application to production, and targeted the production server with the same filtered trace, it didn't go so well. Their wildcard filter on application name didn't take into account the other (similarly-named) applications also connecting to this server, and they immediately started capturing way more info than their open Profiler window could handle. This resulted in a catastrophic uptick in connection time for all users and applications connecting to that instance. It would be an understatement to say that complaints were lodged.
When the culprit was determined, and we got a response from the developer, you can see that the connection time trended back down to normal almost immediately after they stopped their Profiler trace (click to enlarge):
This is definitely a scenario where the old "worked on my machine" statement does not in any way mean it will work well on a busy production server. And this incident has led to an active conversation around modifying the logon trigger that already exists across all of the servers in our environment to simply block the application name Profiler passes in its connection string.
Maybe this isn't a Profiler problem. (But it kind of is.)
I am not here to suggest that other monitoring tools, including Extended Events, can't possibly be used inappropriately to bring a server down in a similar (or worse!) way. There are plenty of opportunities to inadvertently affect an instance of SQL Server, in really adverse ways, without touching SQL Server Profiler.
But Profiler is notorious for this type of symptom because of the way it consumes data. It is a user interface with a grid that presents new rows as it receives them; unfortunately, it makes SQL Server wait while it renders them before allowing SQL Server to transmit more rows. This behavior is similar to how SQL Server Management Studio can slow queries down and cause high
ASYNC_NETWORK_IO waits as it tries to render a large amount of output to its own grid. That is a simplification (and is not to be confused with the way the underlying SQL Trace can be made to behave, which Greg Gonzalez (@SQLsensei) explains in "Don't Fear the Trace"), but it is exactly what leads to the type of scenario shown above: that bottleneck has a cascading effect on any other processes trying to do anything in the same code path as what you're tracing (including trying to establish a connection).
Afraid of Extended Events?
Don't be. It is high time we all ditch Profiler and embrace the present. There is no shortage of tutorials and guides out there, including Microsoft's own QuickStart and several articles right here on this site.
If you have existing traces you rely on, Jonathan Kehayias (@SQLPoolBoy) has a really handy script to convert an existing trace to Extended Events. You can still feel free to configure the original trace with the Profiler UI, if that's where you're most comfortable; just please do it without connecting to a production server. You can read about that script here and see some of Jonathan's other Extended Events articles here.
If you're having a hard time with the user experience, you are not alone, but there are some answers:
- Erin Stellato (@erinstellato) has long been a spectacular advocate for Extended Events, and often wonders aloud why people are reluctant to let go of Profiler and SQL Trace in general. She has some insight (and inspired a lot of comments) in her 2016 post, "Why do YOU avoid Extended Events?"; perhaps there is some insight there into whether your reasons for holding out are still (as) valid in 2021.
- There is an XEvent Profiler built into modern versions of SSMS, with an equivalent extension for Azure Data Studio. Though, confusingly, they called this extension – of all the things one could possibly imagine – SQL Server Profiler. Erin also has a few thoughts about that choice.
- Erik Darling (@erikdarlingdata) has created
sp_HumanEventsto take some of the pain out of switching to Extended Events. One of my favorite "stick to the point" folks, Erik describes
sp_HumanEventsas follows: "If you want a pain-free way to profile query metrics, wait stats, blocking, compiles, or recompiles with Extended Events, this is your unicorn."