THE SQL Server Blog Spot on the Web

Linchi Shea

Checking out SQL Server via empirical data points

Performance Impact: Mismatched Network Duplex Setting can Sink Your OLTP Throughput

This is probably known to a lot of people. But I think it’s worth repeating.

 

I was running some OLTP tests against a pretty powerful server. Everything appeared to be configured correctly on both the client driver side and the server side, but I was getting rather miserable throughput numbers. For this class of server and this type of OLTP workload, I’d expect the throughput to be well over 5000 transactions per second (tps). Instead, I was getting about 800 tps.

 

No matter how hard I drove the workload against the server, the throughput refused to budge. The server resource consumption didn’t change much either as I increased the number of users going against the server.

 

In addition, for a given load level where the average response time was expected to be in the 3 ms range, the actual average response time was about 4 to 5 ms. Not exactly what was expected, but not five or six times worse either. However, upon closer examination of individual transaction response times, I found that a transaction would sometimes take longer than 200 ms to complete, and occasionally well over 1000 ms. These long-running transactions (relatively speaking, anyway) were what killed the overall transaction throughput.
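
As a minimal sketch of that kind of closer examination, assuming the per-transaction response times have been logged in milliseconds, the snippet below compares the mean against the tail of the distribution. The sample values are synthetic, chosen only to resemble the numbers described above, and `percentile` and `summarize` are illustrative helper names rather than part of any tool used in these tests.

```python
import math
import statistics


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]


def summarize(samples_ms, slow_threshold_ms=100):
    """Report the mean next to tail statistics so that rare stalls are not hidden."""
    return {
        "mean": statistics.mean(samples_ms),
        "median": percentile(samples_ms, 50),
        "p99.9": percentile(samples_ms, 99.9),
        "max": max(samples_ms),
        "slow_count": sum(1 for s in samples_ms if s > slow_threshold_ms),
    }


# Synthetic data (an assumption for illustration, not measured values):
# most transactions take 3 ms, a few stall for 200 ms or 1000 ms.
samples = [3] * 9950 + [200] * 45 + [1000] * 5

if __name__ == "__main__":
    for name, value in summarize(samples).items():
        print(f"{name}: {value}")
```

With this synthetic mix the mean comes out at roughly 4.4 ms and the median stays at 3 ms, so only the high percentiles, the maximum, and the count of slow transactions expose the stalls.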

 

In the end, the root of the problem turned out to be a mismatched duplex setting. While the network card on the client driver machine was set to full duplex, the network card on the server was set to half duplex. After simply changing the duplex setting on the server to full duplex, the transaction throughput for exactly the same workload went up from 800 tps to more than 5000 tps.

 

Now in the real world, you may not see such a dramatic impact because you may not be consistently running small transactions at an extremely high frequency. For instance, if you are running large reporting queries, you may not feel any difference at all.

 

Published Tuesday, September 16, 2008 1:43 PM by Linchi Shea

Comments

 

Denis Gobo said:

Ran into the same issue in 2001: we deployed to production and it was dog slow. We changed the connection string to point to staging... fast as hell. So after 5 minutes or so someone decided to check the duplex setting and you can guess the rest  :-)

September 16, 2008 12:53 PM
 

jchang said:

This was more of a problem in the early days of Fast Ethernet. For whatever reason, the Intel FE NIC and the Cisco FE switch would not auto-detect and would revert to half-duplex. This is much more than a 2X reduction: in half-duplex, there are collisions. Consider a one-lane road that traffic in both directions must share, versus a two-lane road with one lane for each direction. GE does not have a half-duplex setting; setting it to half-duplex would revert to 100 Mb/s or 10 Mb/s. See my paper on SQL-Server-Performance on GE direct connect, which explains why there is no half-duplex.

September 16, 2008 1:58 PM
 

GrumpyOldDBA said:

I've constantly encountered this since SQL 6.0, finding mismatches in NICs and switches. For the DBA it is sometimes difficult to convince anyone that there may be network problems. I find the same with FC networks, where questions about queue depths, buffers and so on are usually met with total indifference. I know you've already blogged about HBA queue depth; it helped my cause considerably, so thank you for that. So sadly I have to report this issue is alive and well in production systems today.

September 17, 2008 4:54 AM
 

jchang said:

I think the biggest mistake people make in network assessment is looking at % network utilization to deduce that there is not a problem; it is a nearly useless counter as far as SQL is concerned. SQL really depends on serialized communications, where a network call is sent from client to server and the server responds with one or more packets, some of which require acknowledgements. The fact is % network utilization will never be very high except in backup or file transfers. Packet latency is the bigger deal, plus the full network round trip time.

September 17, 2008 10:19 AM
 

Scott R. said:

Linchi,

This is a great post.  I have long suspected the impacts of misconfigured network duplex status, but have not seen the effects so well documented until now.

I am familiar with the process for configuring a NIC, using Control Panel / Network Connections / right-click on desired NIC and choose Properties / click on Configure button / click on Advanced tab / choose the Speed and Duplex property (or similarly named property).  This process will catch the obvious (such as your case – explicitly configured server NIC for half-duplex when full-duplex is desired), but it leaves the actual NIC status in question when “Auto” is requested (what duplex status was actually used – you hope the best is used – full-duplex, but you don’t know for sure).

What I am hoping for is an automatable process (such as a script) for getting the actual (versus requested) network duplex status.

Do you know of a host-based process for determining the actual (versus requested / configured) network duplex status for a given NIC / connection?  I have tried to find a non-invasive, host-based, vendor-neutral, self-detecting approach that:

-  Does not require physical access to secured facilities (data center)

-  Does not require visible or manual recognition (duplex light is lit on the proper NIC in the back of the server cabinet or on the proper network switch port)

-  Does not require collaboration with other IT staff (network, etc.)

-  Allows for automated collection and inventorying (if desired)

I have not had success towards these goals so far.  WMI doesn’t seem to capture network duplex status.  I haven’t found a GUI or command line utility for finding it.  Registry info on NIC configuration is highly vendor and model dependent, and only gives you “requested” configuration – not what is actually used (which may differ).

The alternatives:

-  Visual observation of the duplex status lights on either server NICs or network switch ports

-  Requiring access to physically secured facilities (data center) or relying on those that have such access

-  Use of network monitoring software or hardware tools

-  Involvement of additional IT staff (network, etc.)

may offer a solution, with catches (delays, etc.).

I have to believe that the running part of the NIC (internal registers, etc.) and some part of the OS (drivers, etc.) know how the network connection is established (half-duplex or full-duplex), and thus should be “queryable”.

Any thoughts?

What processes / persons / tools do you use to determine the network duplex status of a NIC?

Scott R.

September 18, 2008 1:11 PM
 

jchang said:

September 18, 2008 9:41 PM
 

Joopv said:

jchang, your insight into Ethernet networking is very incorrect.  Your comparison of a half-duplex network with a one-lane road is completely wrong.  Please consider the speed of signals in the cable and the packet length in time: one BIT at 100Mbps will use 4 meters on the cable.  A minimum sized PACKET is 250 bits in size.

Performance difference between half duplex and full duplex ethernet is at most - under heavy load - 10% or so.  Assuming that otherwise the link is correctly configured.

The performance IS however being impacted when one side is configured in half duplex and the other side is full duplex.

October 20, 2010 6:11 AM
