THE SQL Server Blog Spot on the Web


Joe Chang

Intel Xeon E7 (Westmere-EX) and Sandy Bridge comments

Last week Intel announced the 10-core Xeon E7-x8xx series (Westmere-EX), superseding the Xeon 6500 and 7500 series (Nehalem-EX). The E7 group consists of the E7-8800 series for 8-way systems, the E7-4800 series for 4-way systems and the E7-2800 series for 2-way systems. Intel also announced the E3-12xx series (Sandy Bridge) for 1-socket servers, superseding the Xeon 3000 series (Nehalem and Westmere). This week at Intel Developer Forum Beijing, Intel has a slide deck on Sandy Bridge-EP, an 8-core die that will presumably become the Xeon E5-xxxx series, scheduled for 2H 2011 and superseding the Xeon 5600 series (Westmere-EP).

Segment  Superseded series           New series
High     Xeon 6500/7500, 4-8 cores   E7-8/4/2800, 6-10 cores
Mid      Xeon 5600, 4-6 cores        E5-xx00, up to 8 cores
Entry    Xeon 3x00, 2-6 cores        E3-1200, 2-4 cores

The top-of-the-line Xeon E7-8870 is 10-core, 2.4GHz (max turbo 2.8GHz) with 30M last level cache (LLC), compared with the Xeon X7560 at 8-core, 2.26GHz (turbo 2.67GHz) and 24M LLC. HP ProLiant DL580 G7 TPC-E results for the 4-way Xeon E7-8870 and X7560 are 2,454.51 and 2,001.12 tpsE respectively. This is a gain of about 23% from 25% more cores and 6% higher frequency, in line with expectations.
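As a rough check on these figures, the sketch below computes the measured gain from the two published tpsE values and compares it with a naive ceiling of core count times base frequency. The variable names are illustrative, and the cores-times-clock product is only a crude upper bound assumed here for comparison, not a performance model.

```python
# Published 4-way HP ProLiant DL580 G7 TPC-E results quoted above (tpsE).
tpse_e7_8870 = 2454.51
tpse_x7560 = 2001.12

# Relative gain of the E7-8870 result over the X7560 result.
measured_gain = tpse_e7_8870 / tpse_x7560 - 1.0

# Naive ceiling (an assumption): scale by core count and base frequency only,
# ignoring cache size, memory bandwidth and turbo behaviour.
core_ratio = 10 / 8
freq_ratio = 2.40 / 2.26
naive_gain = core_ratio * freq_ratio - 1.0

print(f"measured gain:             {measured_gain:.1%}")
print(f"cores x frequency ceiling: {naive_gain:.1%}")
```

The measured gain lands a little below the core-count increase alone, which is what one would expect if scaling with cores is less than linear.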

IBM System x3850 X5 TPC-H results at scale factor 1TB for the 4-way Xeon X7560 and 8-way Xeon E7-8870 are below.

System               Power       Throughput   QphH@1000GB
4-way Xeon X7560     127,676.1   81,039.6     101,719.3
8-way Xeon E7-8870   200,899.9   150,635.8    173,961.8
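The three figures in each row are consistent with the TPC-H power, throughput and composite (QphH) metrics, where the composite is the square root of power times throughput. The short check below assumes that column order; the dictionary layout and the `composite` helper are illustrative names, not part of any benchmark tooling.

```python
import math

# (power, throughput, published composite) at SF 1TB, as quoted above.
# The column order is an assumption that this check is meant to confirm.
results = {
    "4-way Xeon X7560": (127676.1, 81039.6, 101719.3),
    "8-way Xeon E7-8870": (200899.9, 150635.8, 173961.8),
}

def composite(power, throughput):
    """Geometric mean of the power and throughput metrics."""
    return math.sqrt(power * throughput)

for system, (power, throughput, published) in results.items():
    computed = composite(power, throughput)
    print(f"{system}: computed {computed:,.1f}, published {published:,.1f}")
```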

It is unfortunate that a direct comparison (with the same number of processors and at the same SF) between the Xeon E7-8870 and X7560 is not available. The presumption is that the Xeon E7-8870 would show only moderate improvement over the X7560. This is because TPC-H is scored on a geometric mean of the 22 queries, of which only some benefit from a very high degree of parallelism.
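To illustrate why a geometric mean damps the benefit of extra cores, here is a minimal sketch with made-up numbers. The query times, the count of 6 queries that scale, and the 2x speedup are all hypothetical values chosen for illustration, not measured data.

```python
import math

def geometric_mean(values):
    """Geometric mean computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-query elapsed times in seconds for 22 queries (not real data).
baseline = [10.0] * 22

# Assumption: only 6 of the 22 queries benefit from more cores and run 2x faster;
# the other 16 are unchanged.
improved = [5.0] * 6 + [10.0] * 16

# Lower times are better, so the speedup is baseline over improved.
speedup = geometric_mean(baseline) / geometric_mean(improved)
print(f"overall geometric-mean speedup: {speedup:.2f}x")
```

Under these assumptions the overall score moves by 2^(6/22), roughly a fifth, even though the queries that do scale became twice as fast.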

The more modest performance gain from Nehalem-EX to Westmere-EX, compared to the previous 40% per year objective, is probably an indication of the future trend in the pace of performance progression. The pace of single core performance progression slowed several years ago. Now, the number of cores per processor socket also cannot be increased at a rapid pace.

Fortunately, the compute power available in reasonably priced systems is already so outstanding that the only excuse for poor performance is incompetence on the software side. My expectation is that transaction processing performance can still be boosted significantly with more threads per core. The IBM POWER7 implements 4 threads per core and the Oracle/Sun SPARC T3 implements 8. It is unclear if Intel intends to pursue this avenue. Data warehouse performance could be increased with columnar storage, already in Ingres VectorWise and coming in the next version of SQL Server. Scale-out is now available in SQL Server Parallel Data Warehouse (PDW); EXASOL has TPC-H results with 60 nodes.

I am also of the opinion that SIMD instruction set extensions for the row and column offset calculation could improve database engine performance. The object is not just to reduce the number of instructions but, more importantly, to make the memory access sequence more transparent, i.e., to allow for effective prefetching.

At the system level, the processor interconnect technology (AMD HyperTransport and Intel QPI) should also allow scale-up systems. HP has mentioned that 16-way Xeon is possible. HP already has the crossbar technology from their Itanium-based Superdome and the sx3000 chipset. It is probably just a matter of gauging the market volume of the 8-way ProLiant DL980 to assess whether there is also a viable market for 16-way Xeon systems.

Another observation is the price structure of 2-way and 4-way systems. It used to be that there was very little price difference between 1-way and 2-way systems with otherwise comparable features, so the default system choice frequently started with a 2-way system and went up from there. On the downside, the older 2-way systems did not have substantially better memory or IO capability. With the 2-way Xeon 5500 and 5600 systems, there is a more significant price gap between 1-way and 2-way systems. However, the 2-way Xeon 5500 systems also have serious memory and IO capability. So the low-end entry choice now needs to revert to a single-socket system.

The price gap with 4-way systems has also grown along with capabilities, particularly in memory capacity and reliability. The default system upgrade choice should be to replace older 4-way systems with new-generation 2-way systems. The new 4-way systems should target very high reliability requirements.

Published Wednesday, April 13, 2011 3:00 PM by jchang


Glenn Berry said:

I agree completely about replacing older four way systems with two way Xeon 5600 systems right now, and with Xeon E5 systems later this year. These new two way systems have so much CPU and memory capacity compared to a few years ago, they really are the sweet spot for many applications.

April 21, 2011 5:00 PM

joe schmoe said:

Itanium and Mainframes now "at the bottom"

I oldframe Z90 RPG proggies cant keep up with the TPC-H....

June 20, 2011 2:46 PM

About jchang

Reverse engineering the SQL Server Cost Based Optimizer (Query Optimizer), NUMA System Architecture, performance tools developer - SQL ExecStats, mucking with the data distribution statistics histogram - decoding STATS_STREAM, Parallel Execution plans, microprocessors, SSD, HDD, SAN, storage performance, performance modeling and prediction, database architecture, SQL Server engine
