Jonathan Kehayias

Execution Plan Impact on ASYNC_NETWORK_IO Waits – Part 1

March 5, 2020, in Waits

A few weeks ago, an interesting question was asked on the #SQLHelp hashtag on Twitter about the impact of execution plans on the ASYNC_NETWORK_IO wait type, and it generated some differing opinions and a lot of good discussion.

https://twitter.com/shawndube/status/1225476846537650176

My immediate answer was that someone is misinterpreting cause and effect here: the ASYNC_NETWORK_IO wait type is encountered when the engine has results to send over TDS to the client, but no TDS buffers are available on the connection to send them on. Generally speaking, this means the client side is not consuming the results efficiently. Based on the ensuing discussion, however, I became intrigued enough to test whether an execution plan would actually have a significant impact on ASYNC_NETWORK_IO waits.
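For context, the server-wide accumulation of this wait type can be checked through the standard sys.dm_os_wait_stats DMV; a minimal query:

```sql
-- Server-wide accumulated ASYNC_NETWORK_IO waits since the last restart
-- (or since the wait stats were last cleared)
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
  FROM sys.dm_os_wait_stats
 WHERE wait_type = N'ASYNC_NETWORK_IO';
```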

To summarize: Focusing on ASYNC_NETWORK_IO waits alone as a tuning metric is a mistake. The faster a query executes, the higher this wait type will likely accumulate, even if the client is consuming results as fast as possible. (Also see Greg's recent post about focusing on waits alone in general.)

Test Configuration

To run the tests, I generated a very simple table based on an example provided to me by email from another member of the community. That example did demonstrate a change in the wait type, but it also used an entirely different query between the two tests, with an additional table in the second test, and it included a comment to turn the results off, which removes the significant portion of this wait type to begin with. So it wasn't just a plan change alone.

Note: I would like to point out that this is not a negative statement about anyone at all; the ensuing discussion and the further tests that came from the original reproduction were very educational, and they led to further research toward understanding this wait type overall. The original reproduction DID demonstrate a difference, but with additional changes that weren't part of the original question as posed.

DROP TABLE IF EXISTS [DemoTable];

CREATE TABLE [DemoTable] (
  ID INT PRIMARY KEY,
  FILLER VARCHAR(100)
);

INSERT INTO [DemoTable] WITH (TABLOCK)
SELECT TOP (250000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), REPLICATE('Z', 50)
  FROM master..spt_values t1
  CROSS JOIN master..spt_values t2
  CROSS JOIN master..spt_values t3
  OPTION (MAXDOP 1);
GO

Using this table as a base data set for testing different plan shapes using hints, the following queries were used:

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER HASH JOIN [DemoTable] t2 ON t1.ID = t2.ID;

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER MERGE JOIN [DemoTable] t2 ON t1.ID = t2.ID;

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER LOOP JOIN [DemoTable] t2 ON t1.ID = t2.ID;

Since I was running these queries on SQL Server 2019 CU1, the execution plans included the actual wait stats information related to the query execution.

Note: The optimizer would use a Merge Join without the hints being applied for this specific data set and query.
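You can confirm that for yourself by running the unhinted version of the query with the actual execution plan enabled and checking the join operator the optimizer selects:

```sql
-- Without a join hint, the optimizer picks the plan shape itself;
-- for this data set it chooses a Merge Join (verify via the actual plan)
SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER JOIN [DemoTable] t2 ON t1.ID = t2.ID;
```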

Initial Test Results

For the initial tests I simply used SSMS to run the queries and collected the actual execution plan to compare the wait information associated with each query, shown below. Note that for this size of data, the elapsed times are not significantly different, and neither are the wait times or wait counts for ASYNC_NETWORK_IO.

HASH JOIN

[Actual execution plan screenshot with wait statistics]

MERGE JOIN

[Actual execution plan screenshot with wait statistics]

LOOP JOIN

[Actual execution plan screenshot with wait statistics]
However, this wasn’t where I wanted to stop testing, because my own experience has shown repeatedly that Management Studio is a very inefficient consumer of results from SQL Server and can itself cause ASYNC_NETWORK_IO waits to occur. So, I decided to change how I was testing things and went to a SQLCMD execution of the queries.

Testing with SQLCMD

Since I use SQLCMD a lot for demos while presenting, I created a testscript.sql file with the following contents:

PRINT 'Minimize Screen';
GO

WAITFOR DELAY '00:00:05';
GO

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER HASH JOIN [DemoTable] t2 ON t1.ID = t2.ID;
GO

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER MERGE JOIN [DemoTable] t2 ON t1.ID = t2.ID;
GO

SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER LOOP JOIN [DemoTable] t2 ON t1.ID = t2.ID;
GO

This was executed from the command line as follows, and during the five-second delay the window was minimized so that the execution would not render and scroll results during processing:

sqlcmd -S.\SQL2019 -i testscript.sql -dAdventureWorks2017
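As a variation I didn't pursue here, console rendering could be taken out of the picture entirely by redirecting the output with the -o switch, for example to the NUL device:

```shell
# Redirect all result output to NUL so the console never renders it
sqlcmd -S.\SQL2019 -i testscript.sql -dAdventureWorks2017 -o NUL
```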

To capture the actual execution plans, I used an Extended Events session collecting the query_post_execution_showplan event. In hindsight, on SQL Server 2019 I thought I should have used query_post_execution_plan_profile instead, to take advantage of the lightweight query execution statistics profiling infrastructure (v3); however, that event does not return the WaitStats or QueryTimeStats information unless query_post_execution_showplan is also enabled at the same time. Additionally, since this is an isolated testing machine with no other workload, the impact of standard profiling is not really a concern here.

CREATE EVENT SESSION [Actual Plan] ON SERVER 
  ADD EVENT sqlserver.query_post_execution_showplan
  (ACTION(sqlserver.session_id));
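The session as defined above has no target, so the plans were viewed through the live data viewer in SSMS. If you wanted the captured plans persisted instead, a sketch of an equivalent session with an event_file target would look like the following (the file path is a placeholder; adjust it for your environment):

```sql
-- Alternative: persist the captured plans to an event_file target
-- (C:\Temp is a placeholder path; adjust for your environment)
CREATE EVENT SESSION [Actual Plan File] ON SERVER
  ADD EVENT sqlserver.query_post_execution_showplan
  (ACTION(sqlserver.session_id))
  ADD TARGET package0.event_file (SET filename = N'C:\Temp\ActualPlan.xel');

ALTER EVENT SESSION [Actual Plan File] ON SERVER STATE = START;
```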

HASH JOIN

[Actual execution plan screenshot with wait statistics]

MERGE JOIN

[Actual execution plan screenshot with wait statistics]

LOOP JOIN

[Actual execution plan screenshot with wait statistics]
This did not turn out to be a faster way of executing the queries; performance was actually reduced by using the command-line utility, even with the window minimized and not visibly scrolling the results. With the window open, the HASH execution time was 15,708ms and the ASYNC_NETWORK_IO wait time was 15,126ms. This demonstrates that, for the same exact results, the performance of the client consuming the results impacts both the wait time and the execution time of the query.

Parallelism Impact?

One of the things that I noticed was that only two of the plans had executed with parallelism, based on the existence of the CXPACKET and LATCH_EX waits in the execution plan XML. So I wondered what kind of impact forcing a serial execution plan would have on the execution of these same queries using OPTION (MAXDOP 1).
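The only change to the hinted queries is the added query hint; for example, the hash join variant becomes:

```sql
-- Same hash join query, forced to a serial plan
SELECT t1.ID, t2.FILLER, t2.FILLER
  FROM [DemoTable] t1
  INNER HASH JOIN [DemoTable] t2 ON t1.ID = t2.ID
  OPTION (MAXDOP 1);
```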

HASH JOIN

[Actual execution plan screenshot with wait statistics]

MERGE JOIN

[Actual execution plan screenshot with wait statistics]

LOOP JOIN

[Actual execution plan screenshot with wait statistics]
Notice here that the overall wait counts haven't decreased significantly. Only the serial loop join plan shows a major change in the number of waits or the total wait time associated with it, and in isolation this doesn't mean it's a positive benefit: the execution time for the query was not significantly improved, and there may be other factors that affected the outcome of that specific test.

The table below summarizes the ASYNC_NETWORK_IO wait time and count for each of the tests (wait and execution times in milliseconds).

PlanType | Rows    | WaitCount | WaitTime (ms) | ExecTime (ms) | AppName | MAXDOP 1 | Parallel
Hash     | 250,000 | 6,635     | 4,394         | 4,961         | SSMS    | N        | Y
Merge    | 250,000 | 6,592     | 3,169         | 3,933         | SSMS    | N        | N
Loop     | 250,000 | 6,631     | 3,576         | 4,084         | SSMS    | N        | Y
Hash     | 250,000 | 6,610     | 11,321        | 11,874        | SQLCMD  | N        | Y
Merge    | 250,000 | 6,602     | 10,837        | 11,597        | SQLCMD  | N        | N
Loop     | 250,000 | 6,612     | 11,177        | 11,696        | SQLCMD  | N        | Y
Hash     | 250,000 | 6,379     | 4,047         | 4,619         | SSMS    | Y        | N
Merge    | 250,000 | 6,608     | 3,699         | 4,479         | SSMS    | Y        | N
Loop     | 250,000 | 5,385     | 2,083         | 3,918         | SSMS    | Y        | N
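Wait numbers like these can also be collected per connection with the sys.dm_exec_session_wait_stats DMV (available since SQL Server 2016); a minimal sketch, where 55 is a placeholder for the actual test session's session_id:

```sql
-- Per-session accumulated waits for the test connection
-- (55 is a placeholder session_id; substitute the real test session)
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
  FROM sys.dm_exec_session_wait_stats
 WHERE session_id = 55
   AND wait_type = N'ASYNC_NETWORK_IO';
```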

Summary

While this post's investigation doesn't cover every single aspect of plan changes or the ASYNC_NETWORK_IO wait type, it does demonstrate that this wait is not significantly impacted by the execution plan used for a query's execution. When analyzing a server as a whole, I would classify this wait type much like the CXPACKET wait type: normal to see for most workloads. Unless it is incredibly skewed, and other performance problems point to slow consumption of results by clients (such as blocking where the lead blocker is waiting on ASYNC_NETWORK_IO), it's something to be ignored as just part of the normal wait signature for the workload.