SQL Server shouldn't do your dirty work

I often see "problems" that involve requirements for SQL Server to perform "dirty work" like:

In my trigger I need to copy a file to/from the network
My stored procedure needs to FTP a file
After the backup finishes I need SQL Server to zip it, make a copy, and then archive it
When a customer is added, I want to create a new database, and do a bunch of stuff in Active Directory
My SQL Server Agent job needs to scan a directory for files, and perform bulk inserts when it finds new ones

This is not an exhaustive list; I could probably fill a page. The point is that performing these tasks from within SQL Server presents significant obstacles:

Security

Typically, for anything where you deem SQL Server needs file system or other OS-level access, you are going to either (a) give explicit carte blanche rights to the SQL Server service account (and/or the SQL Agent / proxy accounts), or (b) just set SQL Server service accounts to run as an existing domain account that already has all of those rights. This is the "easy" solution – now, instead of individually granting access to this folder and that share and this other resource, you just wipe your hands because they're already domain admins. Next you enable server-level settings that are disabled by default but are standing in your way of accomplishing one or more of the above tasks (e.g. xp_cmdshell).

I don't think I have to explain the level of exposure these actions can represent. Or what kind of problems can happen if this is an actual employee account, and that employee goes away – or determines that he/she is disgruntled before they go away. Yikes. I've seen several cases where a common account is used for all the SQL Servers. Guess what happens if/when you need to change the password for a domain account that is being actively used by dozens or hundreds of SQL Server instances? Never mind how often it will happen if you don't exclude that user from password reset policies?

Performance

In addition to security issues, going outside of the database server can introduce delays while SQL Server relies on some external process it has no control over. Does the database transaction really need to wait for a file to be compressed and copied, or for an FTP transfer to complete, or for your backup domain controller to respond? How does this compound when several users are performing similar tasks, all competing for the same bandwidth and/or disk heads? Also you must consider that once SQL Server has told some batch file to do something, then the containing transaction is rolled back, you can't roll back the external action.

The Answer

Well, usually – and there are always exceptions – the answer is to use external processes for these tasks that really are external to SQL Server. Use PowerShell, use C#, use batch files; heck, use VBScript. Think about which of these tasks really need to be handled *immediately* and while the transaction is still active – I suspect not many. Build a queue table for these, and write to the queue table inside the transaction (which will be rolled back if the transaction is not successful). Then, have a background task or script that consumes rows from the queue table, performs the associated task(s), and deletes or marks each row as completed. Added bonus: SQL Server Agent isn't required here, so you can use any enterprise scheduler, and the methodology still works with SQL Server Express.

4 thoughts on “Stop making SQL Server do your dirty work”

Pingback: Convert UK Postcode to Latitude/Longitude/Northing/Easting using SQL CLR - SQL Server - SQL Server - Toad World
Endrju says:

September 28, 2014 at 12:14 PM

"Build a queue table". No! Use readily available Service Broker, possibly with external activation. Everybody forgets this crazily robust feature in SQL Server.
Aaron Bertrand says:

September 28, 2014 at 1:30 PM

Endrju,

Here's why I will often suggest a queue table instead of service broker.
- Learning curve – I'd bet that more than half of the people out there who copied some service broker example from somewhere have no idea how it works and run to Stack Overflow or their consultant when they need to troubleshoot something.
- Flexibility – with a queue table I can have all kinds of columns that dictate what should happen to an item in the queue. I can control the frequency and size of bursts of items that get processed. I can process the queue manually when I need to. I can prevent duplicates. Service broker is stateless and a lot of these things – which can be very real requirements – are either difficult or impossible.
So yes, Service Broker is an option, but it's not the first option I'm going to recommend when I have no idea about the expertise of the reader or their requirements.
Tony says:

November 17, 2014 at 7:03 PM

Aaron,

You can do ALL of those things in service broker. I frequently use "Request" tables with all the flexibility I need, then send the Id of the row off to be processed by broker. You can write whatever batching/processing/throttling techniques you like with broker just as easily as a custom solution. You can receive and process manually if you want. Its well documented. Broker is as robust and flexible as what you build with it.

Comments are closed.