
Joe Chang

Hyper-Threading from NetBurst to Nehalem

There seems to be significant confusion about Hyper-Threading. Part of the problem is that vendors like to tout every new feature as the greatest invention since the six-pack, and its follow-on the 12-pack. I used to think the 4-pack was a travesty. But now I am older and can no longer finish a 12-pack with each meal. Suddenly the 4-pack is not such a travesty.

But I digress. I do applaud innovation. And I do accept that the first generation is never perfect, or close to problem-free for that matter. That's why it is called the bleeding edge. If you want to play in this ballpark, you had better know how to do a proper investigation. Given that Intel is a company with immense resources, they could have put out a detailed document on when to use HT and when not to. Instead, like every other company, they touted the positives and hid the negatives.

The Intel Pentium 4 architecture NetBurst processors, up to Xeon 50x0 and 7100, from 2001 to 2006, were the first generation of x86 processors with HT. Technically, the 180nm Willamette/Foster and 130nm Northwood/Gallatin processors were the first generation of HT, and the 90nm Prescott/Potomac and 65nm Cedar Mill/Tulsa processors were generation 1.5: improvements, but there was not enough time to make second-generation changes.

The NetBurst architecture HT could benefit a very limited range of SQL operations (high network roundtrip, backup with compression). HT was neutral in many, and negative in some, particularly parallel execution plans. For most people, it was better to disable HT, or lock SQL Server to the lower half of the logical processors (one logical processor per physical core; the Windows OS processor enumeration has since changed).
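
For those who kept HT enabled on NetBurst but wanted SQL Server confined to one logical processor per physical core, the era-appropriate mechanism was the affinity mask. A minimal sketch, assuming an 8 logical processor box where CPUs 0-3 happen to enumerate as one logical processor per physical core (verify the actual enumeration on your hardware first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- 15 = binary 00001111: bind the SQL Server schedulers to CPUs 0 through 3
-- (assumption: on this box those four map to distinct physical cores)
EXEC sp_configure 'affinity mask', 15;
RECONFIGURE;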

The subsequent generation of Intel Core 2 architecture processors, Xeon 5100-5400 (2006 to 2009) and Xeon 7200-7400 (2007 to 2010), were designed by a completely different team and did not have HT, so this is a moot point.

Now Hyper-Threading has returned with the Intel Xeon 5500 (Nehalem architecture, introduced 2009) and 5600 (Westmere), and soon the 7500 (Nehalem-EX). BTW, Nehalem was designed by the former Willamette/Prescott team.
Every indication is that most of the HT issues of NetBurst have been fixed. So I would definitely not recommend disabling HT in BIOS. I generally recommend limiting MAXDOP to 4, but for certain tested queries, explicitly setting a higher MAXDOP where appropriate.
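
To make that concrete, here is a sketch of both the server-wide setting and a per-query override; the table and query are hypothetical, used only for illustration:

-- server-wide default: cap parallel plans at 4 schedulers
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;

-- a specific, tested query can still be granted a wider plan with a hint
-- (dbo.SalesHistory is a hypothetical table, purely for illustration)
SELECT COUNT(*)
FROM dbo.SalesHistory
WHERE OrderDate >= '20090101'
OPTION (MAXDOP 8);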

AMD Opteron to date has never had HT.

The Intel Itanium processors with dual cores have HT, and I believe this is also a good HT implementation, as there was sufficient time since the first generation of HT to correct any critical issues.

More about Hyper-Threading

The original intent of HT in Willamette (Pentium 4, NetBurst) was to make better use of the 3 superscalar execution units. The Pentium Pro expanded the 2 execution units of the Pentium to 3. The x86/IA-32 instruction set architecture knows nothing about the microprocessor having superscalar execution units. The microprocessor must figure out on the fly whether multiple instructions can be issued simultaneously. So while superscalar architecture did improve microprocessor performance, there was a rapid fall-off in benefit and a rapid increase in complexity when adding more superscalar execution units. I recall that the average number of instructions completed per CPU cycle in the 3-unit design was about 1, as there were many dead cycles.

(This is also why the Intel/HP joint effort went with Explicitly Parallel Instruction Computing (EPIC) for Itanium. If the compiler can figure out which instructions can be executed in parallel, then the processor does not have to expend silicon/transistors on this.)

So the idea was to have an extra program counter and registers to run 2 threads simultaneously, given that the average utilization of the execution units was low. I recall there being some discussion that without a priority thread instruction, there would be problems with transactional database engine style code.

In both Itanium and Nehalem, it is my understanding that Intel has backed away from simultaneous multi-threading to time-slice multi-threading. The theory here is that in modern microprocessors, the clock cycle is so short relative to memory access, and even L3 cache access, that there will be many dead cycles every time there is an L3 or memory access. So now HT will switch in the alternate thread any time a thread encounters dead cycles.

In the NetBurst HT, I did an extensive investigation and could find no SQL operation that benefited from HT, with some negatives. The only database application that did benefit was network round-trip intensive operations, like SAP, that fetch a single row at a time. The gain there was about 15%. When I did work on Quest LiteSpeed, I found that the compression engine could get an astounding 40-50% performance gain with HT. So the theory of HT was sound, and whatever it is in the SQL engine that has problems with HT does not occur in a simple multi-threaded compression engine.

In summary, HT with the current generation Xeon, and even Itanium, is too good to blindly disable in BIOS based on tribal knowledge that really applied to the older generation NetBurst processors. Too many "best practices" are written by people who do not actually investigate the underlying reason for a "rule"; they just want a list of rules to live by.
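
As a quick sanity check before touching the BIOS, SQL Server itself reports the processor counts it sees. A sketch against sys.dm_os_sys_info; note that hyperthread_ratio is the number of logical processors per physical processor package, so by itself it does not distinguish HT from additional cores and has to be read against the known core count of the installed processors:

-- cpu_count = logical processors visible to SQL Server
-- hyperthread_ratio = logical processors per physical processor package (socket)
SELECT cpu_count,
       hyperthread_ratio,
       cpu_count / hyperthread_ratio AS socket_count
FROM sys.dm_os_sys_info;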

Hyper-Threading or HyperThreading

A Google search seems to prefer HyperThreading over Hyper-Threading. The official Intel term, and they have many marketing people who do nothing but determine the official term, seems to be Hyper-Threading:

http://www.intel.com/technology/platform-technology/hyper-threading/

but many pages on the Intel web site have HyperThreading.

Here is the Intel article on the original Willamette Pentium 4 NetBurst HT:

ftp://download.intel.com/technology/itj/2002/volume06issue01/vol6iss1_hyper_threading_technology.pdf

Published Tuesday, March 23, 2010 11:53 AM by jchang


Comments

 

Glenn Berry said:

The current generation hyper-threading (in the Nehalem and Westmere) definitely works much better for the OLTP workloads that I have tested it with so far. I think it is also significant that every TPC benchmark result that I have seen for Xeon 55xx so far has hyper-threading enabled.

Good to see you blogging again.

March 24, 2010 7:27 AM
 

jchang said:

hello Glenn, we must meet up some time, I might actually make it to PASS this year. I believe HT was always used for TPC-C except maybe for the really big NUMA systems. But I think HT was turned on for the Xeon 5500 TPC-H as well. One of the concerns with HT in the P4 generation was parallel execution plans. Now even that is fixed.

March 24, 2010 7:57 AM
 

The Tao Of Sql Server said:

Following on from my comments about CPU configuration I have decided to restate my position to "it

August 4, 2010 3:15 PM
 

hung nguyen said:

Hi Joe Chang,

"One of the concerns with HT in the P4 generation was parallel execution plans. Now even that is fixed."

Is it really? Did you talk to the MS folks who handle this case?

http://ozamora.com/2010/09/sql-server-2008-r2-and-nehalem-processors/

July 1, 2011 1:56 PM


About jchang

Reverse engineering the SQL Server Cost Based Optimizer (Query Optimizer), NUMA System Architecture, performance tools developer - SQL ExecStats, mucking with the data distribution statistics histogram - decoding STATS_STREAM, Parallel Execution plans, microprocessors, SSD, HDD, SAN, storage performance, performance modeling and prediction, database architecture, SQL Server engine
