Last week Intel announced the 10-core Xeon E7-x8xx series (Westmere-EX), superseding the Xeon 6500 and 7500 series (Nehalem-EX).
The E7 group consists of the E7-8800 series for 8-way systems, the E7-4800 series for 4-way systems,
and the E7-2800 series for 2-way systems. Intel also announced the E3-12xx series (Sandy Bridge) for 1-socket servers,
superseding the Xeon 3000 series (Nehalem and Westmere).
This week at the Intel Developer Forum in Beijing, Intel presented a slide deck on Sandy Bridge-EP,
an 8-core die that will presumably become the Xeon E5-xxxx series, scheduled for 2H 2011 and superseding the Xeon 5600 series (Westmere-EP).
||Xeon 6500/7500, 4-8 cores ||E7-8/4/2800, 6-10 cores
||Xeon 5600, 4-6 cores ||E5-xx00, up to 8 cores
||Xeon 3x00, 2-6 cores ||E3-1200, 2-4 cores
The top-of-the-line Xeon E7-8870 has 10 cores at 2.4GHz (max turbo 2.8GHz) and 30MB of last-level cache (LLC),
compared with the Xeon X7560 with 8 cores at 2.26GHz (turbo 2.67GHz) and 24MB of LLC.
HP ProLiant DL580 G7 TPC-E results for the 4-way Xeon E7-8870 and X7560 are 2,454.51 and 2,001.12 tpsE respectively.
This is a gain of nearly 23% from 25% more cores and 6% higher frequency, in line with expectations.
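The scaling arithmetic above can be checked with a short sketch. The scores, core counts and base frequencies are the ones quoted in this post; the helper name `pct_gain` and the "ideal" linear-scaling figure are my own illustrative additions, not part of any TPC-E report.

```python
def pct_gain(new, old):
    """Percentage increase of `new` over `old`."""
    return (new / old - 1.0) * 100.0

# TPC-E scores for the 4-way HP ProLiant DL580 G7, as quoted above
tpce_gain = pct_gain(2454.51, 2001.12)   # Xeon E7-8870 vs Xeon X7560

# Raw hardware deltas: core count and base frequency
core_gain = pct_gain(10, 8)
freq_gain = pct_gain(2.40, 2.26)

# Assumption: "ideal" scaling if throughput tracked cores x frequency exactly
ideal_gain = pct_gain(10 * 2.40, 8 * 2.26)

print(f"TPC-E gain:      {tpce_gain:.1f}%")
print(f"Core count gain: {core_gain:.1f}%")
print(f"Frequency gain:  {freq_gain:.1f}%")
print(f"Ideal (linear):  {ideal_gain:.1f}%")
```

The measured gain lands below the linear ideal, which is what one would expect when cache, memory and interconnect are shared by more cores.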
IBM System x3850 X5 TPC-H results at scale factor 1TB for the 4-way Xeon X7560 and 8-way Xeon E7-8870 are below.
|4-way Xeon 7560
It is unfortunate that a direct comparison (with same number of processors and at the same SF)
between the Xeon E7-8870 and X7560 is not available.
The presumption is that the Xeon E7-8870 would show only moderate improvement over the X7560.
This is because the TPC-H is scored on a geometric mean of the 22 queries,
of which only some benefit from a very high degree of parallelism.
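To illustrate why a geometric mean dampens the effect, here is a minimal sketch with made-up numbers. It assumes every query takes 100 seconds on the baseline system and that 6 of the 22 queries run twice as fast with more cores while the rest are unchanged. These timings are hypothetical and are not taken from any published TPC-H result; the sketch also looks only at the geometric mean of query times, not the full TPC-H composite metric.

```python
import math

def geometric_mean(values):
    """Geometric mean computed via logs to avoid overflow on long lists."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

NUM_QUERIES = 22

# Hypothetical baseline: every query takes 100 seconds
baseline = [100.0] * NUM_QUERIES

# Hypothetical new system: 6 queries scale well (2x faster), 16 do not improve
scalable = 6
improved = [50.0] * scalable + [100.0] * (NUM_QUERIES - scalable)

# Lower elapsed time is better, so the speedup is baseline / improved
speedup = geometric_mean(baseline) / geometric_mean(improved)

print(f"Geometric-mean speedup: {speedup:.3f}x")
```

Under these assumptions the overall speedup equals 2 raised to the power 6/22, far less than the 2x seen on the queries that do scale.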
The more modest performance gain from Nehalem-EX to Westmere-EX, compared to the previous 40% per year objective, is probably an indication
of the future trend in the pace of performance progression.
The pace of single core performance progression slowed several years ago.
Now, the number of cores per processor socket also cannot be increased at a rapid pace.
Fortunately, the compute power available in reasonably priced systems is already so outstanding
that the only excuse for poor performance is incompetence on the software side.
My expectation is that transaction processing performance can still be boosted significantly
with more threads per core. The IBM POWER7 implements 4 threads per core and the Oracle/Sun SPARC T3 implements 8. It is unclear if Intel intends to pursue this avenue.
Data Warehouse performance could be increased with columnar storage, already in Ingres VectorWise
and coming in the next version of SQL Server.
Scale-out is now available in SQL Server Parallel Data Warehouse (PDW); EXASOL has TPC-H results with 60 nodes.
I am also of the opinion that SIMD instruction set extensions for the row and column offset calculation
could improve database engine performance. The object is not just to reduce the number of instructions,
but more importantly to make the memory access sequence more transparent, i.e., to allow for effective prefetching.
At the system level, the processor interconnect technology (AMD HyperTransport and Intel QPI)
should also allow scale-up systems. HP has mentioned that 16-way Xeon is possible.
HP already has the crossbar technology from their Itanium based Superdome and the sx3000 chipset.
It is probably just a matter of gauging the market volume of the 8-way ProLiant DL980 to assess
whether there is also a viable market for 16-way Xeon systems.
Another observation is the price structure of 2-way and 4-way systems.
It used to be that there was very little price difference between 1-way and 2-way systems with otherwise comparable features.
So the default system choice frequently started with a 2-way system and higher.
On the downside, the older 2-way systems also did not have substantially better memory or IO capability.
With the 2-way Xeon 5500 and 5600 systems, there is a more significant price gap between 1-way and 2-way systems.
However, the 2-way Xeon 5500 systems also have serious memory and IO capability.
So the low-end entry system now needs to revert to a single-socket system.
The price gap with 4-way systems has also grown along with capabilities, particularly in memory capacity and reliability.
The default system upgrade choice should be to replace older 4-way systems with new generation 2-way systems.
The new 4-way systems should target very-high reliability requirements.