THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Linchi Shea

Checking out SQL Server via empirical data points

Performance impact: hyperthreading for OLTP queries -- II

This is in part a response to a comment by Paul White (@SQL_Kiwi) to my previous post on the performance impact of enabling hyperthreading (HT) on OLTP queries, and in part due to my desire to capture a more complete set of test data for future investigation on this very topic. I’m posting below the results of re-running the same exact test as described in the previous post but with the SQL Server instance bumped up to build 10.50.2500 from 10.50.1600. The former is SQL Server 2008 R2 with Service Pack 1, whereas the latter is SQL Server 2008 R2 RTM.

In addition, I have included the core count as a formal test parameter, and tested the performance impact of enabling HT at the following core counts: 16, 20, 24, 32, and 40. The core count was controlled through the BIOS on machine reboot. The exact BIOS feature for controlling the number of cores is primarily under System Options –> Processor Options –> Enhanced Processor Core Disable (Intel Core Select). This allows one to enter the number of cores to be enabled per socket.

The results are as follows:


In the previous post, enabling HT is reported to increase the throughput by 5~7.5%. In the above chart, the increase in the throughput varies between 9% and 14%. So it may appear that SQL Server 2008 R2 SP1 responds slightly better to enabling HT than does SQL Server 2008 R2 RTM. However, because there is always a margin of error in any test, I’d assign more significance to the overall trends and patterns than the exact numbers.  And in both cases, the over trends and patterns are similar in that this particular OLTP workload responded positively to enabling HT on the E7-4870 processors.

Published Wednesday, January 11, 2012 2:51 PM by Linchi Shea



Greg Linwood said:

Linchi, what is the load distribution between cores? Are you seeing the classic situation where each second core isn't being utilised?

If so, have you tried the old technique of reconfiguring the server with processor affinity (set on for every core) & then assessed whether this affects results? This certainly affected results in SQL 2005 & earlier but I'm not certain whether 10.5 would have the same scheduling problems as earlier versions which were somehow magically fixed with processor affinitisation

January 11, 2012 5:01 PM

Linchi Shea said:


All the cores and all the logical processors when HT was enabled are fully loaded with this particular workload. In one of the previous posts about a sign of the times, I did mention that with the TPC-C read-only transacions unchanged, I was not able to drive the processors to their full capacity with SQL2008 R2 RTM. I still can't with SQL2008 R2 SP1. In that case, I did see every other logical processors relatively idle, although not completely idle. And when that happened, the %Privileged Time was very high. For instance, the total %Processor Time may be 20% and the %Privileged Time could reach 18%, which is extraordinarily high. I never got a chance to try out the affinity configuration with the unchanged workload, but probably should have and may give it a try when I get time. I did try lightweight poolings, and that did not help. Also note that the problem only happened with a relatively high core count.

January 11, 2012 5:28 PM

Greg Linwood said:

Interesting. TPC-C workloads are more I/O intensive than TPC-E, so if you are trying to benchmark processor configuration performance, is running the TPC-E benchmark an option for you?

Re the affinity setting - I'm not even sure whether the old rules apply. SQL 2005 task scheduling certainly behaved very differently with this setting vs without it but I'm not sure about SQL 10.5 as we haven't had to deal with any similar issues on that SKU yet

I'm wondering about the high %Privileged Time observation but don't have any immediate ideas..

January 11, 2012 5:38 PM

Linchi Shea said:


My tests do not involve any disk I/Os. I only use the read-only transactions and cache all the data in memory. I'm interested in evaluating the host server and including the storage component only help comfound the investigation. This is also documented in my previous posts. But perhaps I should do a post on why the storage component should be eliminated from these tests so that it is 100% clear.

January 11, 2012 5:57 PM

Greg Linwood said:

Ah, I missed that - thx for clarifying.

January 11, 2012 6:09 PM

Linchi Shea said:


I didn't answer your question in my previous reply wrt TPC-C and TPC-E. The way I configure the tests, I'd venture to say that it probably wouldn't matter too much, but that's just a guess and I'm not totaly sure until I do some actually comparison. But the reason for the guess is that when all the data is cached, it's just a bunch of page lookups/scans. For two reasons, I have sticked with TPC-C. First, I'm not doing real official TPC-C benchmarking. I use it for some very specific purposes and most times, I configure it to help evaluate server hardware. For that, it has worked really well for me. And because of that, I've accumulated a lot of data that I can use for comparison purposes, making it harder to switch to a different one and start all over again.

The second reason is just that I'm not as familiar with TPC-E, not in terms of what it is, but in terms of not having developed an intimate feel for it. That's actually pretty important in testing. Say, with my version of TPC-C, I can easily and quickly tell if the results do not look right, and so on, and therefore and can quickly correct whatever I might be doing wrong.

January 11, 2012 6:13 PM

Greg Linwood said:

I think using TPC-C makes good sense for hardware testing because the query workload is simpler than TPC-E & is therefore more easily compared against hardware platforms.

The problem with TPC-E's more complex queries is that it is more vulnerable to query compilation inconsistencies, especially across multiple hardware platforms which have different resources such as CPU & RAM (which SQL Server evaluates during query optimisation)

We've run TPC-E in our lab using the official kit (with permission to test but not publish from Microsoft, as we're not TPC members) and have observed significant query optimisation inconsistencies when re-running workloads.

I think the main reason is that TPC-E has more complex joins and the standard kit lacks ideal indexes, leaving the query optimiser with more complex decisions. Even when we add better indexes (particularly indexed views, to help optimise the complex joins) we haven't been able to totally stabilise query optimisation.

I also suspect this is part of the reason other vendors aren't jumping into TPC-E benchmarking - it's inherently difficult to get stable performance out of a SQL DBMS, especially with more complex query workloads..

January 11, 2012 8:19 PM

Linchi Shea said:

Yes, I think I'm doing myself a service by keeping the SQL test workload relatively simple (while relatively scalable within a limit) when evaluating hardware. It makes it easy to verify that whatever change I see in the test result is not a result of a change deeply buried somewhere in the workload itself (e.g. a minor plan change). It's a tool and must be trustworthy. If I can't keep the workload constant or need to put in a lot of effort to keep it constant, then it's not worth it for what I want to accomplish.

January 11, 2012 11:21 PM
New Comments to this post are disabled

About Linchi Shea

Checking out SQL Server via empirical data points

This Blog


Privacy Statement