Being a database administrator can be very challenging at times when you have to troubleshoot performance issues. The database server is just one component of the application ecosystem and it routinely gets blamed as being the performance problem. SQL Server is one of those black boxes that many don’t understand, much like the SAN and network. Production DBAs that monitor their servers can quickly identify if SQL Server is performing outside of its normal baseline, however there are two major areas that we have little visibility into: the SAN and the network. DBAs regularly have to work with other engineers when it comes to troubleshooting performance issues that are not directly related to SQL Server. We can easily track disk performance by monitoring
sys.dm_io_virtual_file_stats, which I wrote about in Monitoring Read/Write Latency; however, tracking network performance problems within SQL Server isn’t as easy.
Poor network performance can be a silent killer for application performance and my personal experience has shown this to be the case on many occasions. Often an application would start having performance issues and the application engineer would say that the application server looks good and starts to point their finger at the database. I would get a call to look at the database server and all indications showed that the database server was in good health (and this is where monitoring for key performance indicators and having a baseline helps!). Since the application and database teams were saying everything was good, we would ask the network team to check things out. The network team would look at a few things and give the all clear on their side as well. Each team troubleshooting and reviewing their respective systems took time, meanwhile the application performance was still suffering. The issue would then get escalated until all the teams would be asked to join a conference bridge to troubleshoot together. Eventually someone would start a deeper network test and determine that we either had a port saturation, routing, or some other complex networking issue. A few clicks or changing something on their end would eventually resolve the application slowness.
The most significant networking issue that I have encountered with clients typically involves bandwidth when performing backups. Many larger organizations are migrating to 10Gb networking for core infrastructure, however when working with both physical and virtual networking, it is easy to misconfigure a setting or port and have it drop to 1Gb. For regular application network traffic you may not notice the degradation in performance, however as soon as you start trying to copy 100s of GB of data for backups, that 1Gb will become saturated and your backup and restore jobs will suffer.
For DBAs it can be challenging to get others to look that deep into problems that they don’t think are their issue because initial indicators don’t reveal the problem. Being able to arm yourself with data before reaching out to other teams will help to get them involved. By using iPerf to do an initial bandwidth test, you can quickly determine if you are getting the network speeds that you are supposed to. For example, if you are utilizing 10Gb network between the application server and database server and you run a test and are only getting 1Gb, then you know something is not quite right. If you can document this finding, then you can with confidence ask your network engineers to look into a bandwidth issue.
How do you get started using iPerf? First you need to download the tool from iPerf.fr. Since I am working on Windows Server 2012, I’ve downloaded the Windows binaries onto my machine. Once you download and unzip the package, you will need to run the program from a command line. I downloaded iPerf 3.0.11 which has been out for almost a year. Make sure to read the documentation of this utility. Since this is a command line tool, there are dozens of options that you can use. In the example below I will only be using a few of them, however depending on your situation, you may need to use other options such as specifying the port or increasing the packet size. Please note that the option commands are case sensitive.
To use iPerf, you have to use at least two servers to test the bandwidth. Once you have copied the binaries to the two servers, you have to first start the iPerf listener on one of the servers. To do this I will run the following command:
This command runs iPerf in server mode and will only allow one connection at a time.
On the second server you will need to launch iPerf using several client options. First we are going to specify -c to specify client mode. We will also use -t to specify the time to run each test and -P to specify the number of simultaneous connections to make. We want to specify multiple connections so that we can put a proper strain on the network. For this test I am going to run the following command:
The command above will start a 30 second transmission test with 10 simultaneous connections.
I ran this test on two virtual machines on my Dell M6800 so there wasn’t a physical network for these VM’s to go through.
From server 2 connecting to server 1 I got 7.57 GBytes transferred with a bandwidth of 2.17 Gbits/sec. Not bad for a couple of VMs on a laptop.
From server 1 connecting to server 2 I got 6.98 GBytes transferred with a bandwidth of 2.00 Gbits/sec. As you can see there is a slight difference in the numbers but still relatively close. Had these numbers been drastically different then I would need to investigate the cause.
It is important to run these tests before going into production and to make a habit of regularly repeating these tests on your production servers. You need to know what is normal, if you aren’t monitoring it, then you can’t measure it. If you know that firmware updates are being performed on your servers, the virtual host, or any core networking equipment, an iPerf test before and after the changes could quickly alert you to identify any negative side effects.
It is also important to perform this test against any servers that interface directly with the database server and any servers the database server directly interfaces with such as network backup devices. IPerf works on both Windows and Linux making it easy to test between the two Operating Systems.
For DBAs, the network no longer has to be a black box that you don’t know anything about. Using iPerf can alert you to any bandwidth issues with the network between your database server and any other server. There is no reason to only rely on PING and TRACERT for limited network troubleshooting. Download iPerf and start documenting your network bandwidth.