Intel Xeon E7 v2 processors (Ivy Bridge-EX) officially launched today. The E5 v2 processors (Ivy Bridge-EP) and E3 v3 (Haswell) came out last fall. The previous generation E7 was based on Westmere, so the Sandy Bridge generation was skipped for the EX line. This makes sense because big-system customers want 2-year product stability, versus the annual refresh cycle of 2-socket systems.
The new E7 v2 tops out at 15 cores at 2.8/3.4GHz (nominal/turbo), compared to the Westmere E7 with 10 cores at 2.4/2.8GHz. The Xeon E5 v2 top model has 12 cores at 2.7/3.5GHz, versus the Sandy Bridge E5 at 8 cores and 2.7/3.5GHz.
When separate 10 and 12-core dies were announced for Ivy Bridge EP, it seemed an unusual choice.
Later, when the 15-core EX model was announced, it became clear that the 12-core E5 v2 actually shares a 15-core die with the E7 v2. (The diagram below is from AnandTech, see links at bottom; this reference was inadvertently left out in the original edit.)
Below is my rendering of the 3 Ivy Bridge EP dies.
Below are the 10 and 15-core Ivy Bridge EP/EX dies.
I will try to scale these in relation to Sandy Bridge and others when time permits.
Note that the L2 cache is on the opposite side of the core from the L3, or rather last-level, cache (LLC).
The Dell website shows the new PowerEdge R920 (but is not yet taking orders), featuring 96 DIMM sockets, which could support 6TB of memory; 1.5TB is the most economical configuration for now, with 3TB for "moderately" extreme situations.
The HP ProLiant DL580 G8 lists support for 32GB DIMMs, so it will probably be some time before 64GB DIMM support can be verified.
It is not clear whether 8-socket system support will be available.
In the period up to SQL Server 2008 R2, with licensing determined only by socket count, the obvious strategy was to pick a system with the desired number of sockets and the most powerful processor for that type of socket. There was no point in analyzing memory requirements, because it was both simple and cheap to fill the DIMM slots with the second-largest available memory module (currently 16GB).
From SQL Server 2012 on, per-core licensing dictates that we now base our sizing strategy on the appropriate number of cores, and then choose between the E7 and E5 platforms as applicable.
Intel Xeon E7 v2 Processors
Intel Xeon E5 v2 Processors
The benefit of stepping down the total number of cores (in addition to reduced licensing cost, $6.7-10K per core) is the possibility of higher core frequency in the lower core count processors. Also consider that write (and certain other) operations are not parallelizable, so single-threaded operations may be running at the turbo-mode frequency.
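As a rough sketch of the licensing side of this tradeoff, here is the arithmetic for a 2-socket system at different core counts. The per-core price used here is an assumed figure within the $6.7-10K range mentioned above, not an official quote:

```python
# Rough per-core licensing arithmetic for core-count step-downs.
# license_per_core is an assumed figure; the actual SQL Server EE
# per-core price falls somewhere in the $6.7-10K range cited above.
def license_cost(cores, license_per_core=7000):
    return cores * license_per_core

# 2-socket options: 2 x 12, 2 x 10, or 2 x 8 cores
for cores_per_socket in (12, 10, 8):
    total = 2 * cores_per_socket
    print(f"{total} cores: ${license_cost(total):,}")
```

At these prices, each pair of cores trimmed is on the order of a mid-range server's worth of licensing saved, which is why sizing by core count now matters.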
When the desired number of cores can be achieved with a 2-socket system, consider that the E7 supports 24 DIMM slots per socket, compared with 12 per socket for the E5.
Even though DBA indoctrination has conditioned us to believe that more memory is better, this "rule" originated in the days when the maximum memory configuration might have been 64MB to 1GB. In those circumstances, every MB of memory helped reduce disk IO. By blowing the budget on memory and putting in months of hard work on performance tuning, it might, with luck, be possible to bring disk IO within the capability of a storage system that did not have so many components that disk drives failed on a weekly basis.
Given the maximum memory supported today, very few situations really call for a 1TB+ memory configuration. It is extremely likely that a 2-socket Xeon E5 system with its DIMM slots filled with 16GB DIMMs (24 x 16GB = 384GB, $200 each, $4.8K total) is already more than enough by a factor of four, if not ten.
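The arithmetic behind that configuration, with the DIMM price being an approximate street figure:

```python
# Memory configuration cost sketch: 2-socket Xeon E5, all 24 DIMM
# slots filled with 16GB DIMMs at roughly $200 each (assumed price).
dimm_slots = 24          # 12 slots per socket x 2 sockets
dimm_size_gb = 16
dimm_price = 200         # approximate street price per 16GB DIMM

total_gb = dimm_slots * dimm_size_gb
total_cost = dimm_slots * dimm_price
print(f"{total_gb}GB for ${total_cost:,}")   # 384GB for roughly $4.8K
```

Note that the same exercise with 32GB DIMMs roughly doubles capacity at well more than double the cost per GB, which is why 16GB is the economical sweet spot at the moment.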
More than likely, disk IO (excluding log writes) is only a sporadic occurrence. And if we did need disk IO, we could configure more IOPS capability from an SSD storage system (which is more economical than 1GB of memory was 20 years ago) than we could actually use. (Yet people still find ways to run into problems!) Unless your SAN admin dictated the storage configuration, in which case perhaps go for broke on memory.
Another possible option, for less-than-maximum core count situations, is whether to fill the sockets with low core count processors or to populate only half the sockets with high core count processors.
Example 1: 1 x 12 core or 2 x 6 core processors.
Example 2: 2 x 15 core or 4 x 8 core processors.
Filling the processor sockets enables maximum memory bandwidth (and memory capacity, but in this situation we most probably do not need it).
The decision criteria might be based on the parallelism strategy.
If our expectation is to run critical queries at a higher degree of parallelism (8, 10, 12 or 15), one would expect better (true) performance in Parallelism Repartition Streams operations when all cores are on one socket, as the latency between cores is lower, favoring fewer sockets of the high core count processors.
Do not bother looking at the plan cost for this; it is based strictly on a model that does not take the processor/system architecture into account.
On the other hand, if we expect to restrict max degree of parallelism to a lower value, say 4, 6 or perhaps 8, then more sockets populated with lower core count processors would benefit from greater aggregate memory bandwidth.
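For a rough sense of the bandwidth side of this tradeoff, here is the arithmetic for Example 2 above. The channel count and per-channel figure (4 channels per socket at roughly DDR3-1866 rates) are my assumptions, not vendor-verified numbers:

```python
# Aggregate memory bandwidth sketch for the socket-population options.
# Assumes 4 memory channels per socket at roughly DDR3-1866 speed,
# i.e. about 14.9 GB/s per channel -- both figures are assumptions.
CHANNELS_PER_SOCKET = 4
GB_S_PER_CHANNEL = 14.9

def aggregate_bandwidth(sockets):
    return sockets * CHANNELS_PER_SOCKET * GB_S_PER_CHANNEL

# Example 2 above: 2 x 15-core versus 4 x 8-core
print(f"2 sockets: {aggregate_bandwidth(2):.1f} GB/s")  # ~119 GB/s
print(f"4 sockets: {aggregate_bandwidth(4):.1f} GB/s")  # ~238 GB/s
```

The 4-socket option doubles the nominal aggregate bandwidth, which is the basis for the suggestion above; whether any given workload can actually consume it is a separate question.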
I have not tested these two scenarios side by side in otherwise equivalent configurations, so I ask readers to alert me if data to support this assessment becomes available.
It is possible that having the fewest sockets is the better solution because of less complicated cache coherency, despite the lower memory bandwidth and capacity.
It is uncertain whether there will be a Xeon E5-4600 v2 series, as this now seems unnecessary.
There is also the Xeon E5-2400 v2 series, with 3 memory channels instead of 4, for a slightly lower platform cost structure. We can also consider the single-socket E3 v3 (Haswell) at 4 cores, 3.6/4.0GHz, with 32GB memory and 2 x8 PCI-E gen 3 slots. It might seem beneath our dignity to run on a server similar to our desktop or laptop, but the fact is that this 4-core system with 32GB is far more powerful than the 4-socket systems of 10 years ago.
I bought a Dell PowerEdge T20 to test out the E3 v3 Haswell processor. Unfortunately, the system would only power on with 1 DIMM slot populated, not 2 or 4. Dell support has not responded. I may buy a Supermicro motherboard and chassis instead.
New TPC-E benchmarks were announced for the E7 v2 from IBM and NEC. Below are recent IBM TPC-E results spanning the Westmere-EX, Sandy Bridge, and Ivy Bridge processors.
| Sockets | Processor | Freq | Cores | Threads | Memory | SQL Server Version | tpsE |
|---|---|---|---|---|---|---|---|
| 8 | E7-8870 | 2.4GHz | 80 | 160 | 4TB | SQL Server 2012 | 5,457.20 |
| 4 | E7-4870 | 2.4GHz | 40 | 80 | 2TB | SQL Server 2012 | 3,218.46 |
| 2 | E5-2690 | 2.9GHz | 16 | 32 | 512GB | SQL Server 2012 | 1,863.23 |
| 2 | E5-2697 v2 | 2.7GHz | 24 | 48 | 512GB | SQL Server 2012 | 2,590.93 |
| 4 | E7-4890 v2 | 2.8GHz | 60 | 120 | 2TB | SQL Server 2014 | 5,576.26 |
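One way to read the table is tpsE per core, which the following sketch computes directly from the figures above:

```python
# tpsE per core across generations, computed from the results table.
results = {
    "8 x E7-8870":    (5457.20, 80),
    "4 x E7-4870":    (3218.46, 40),
    "2 x E5-2690":    (1863.23, 16),
    "2 x E5-2697 v2": (2590.93, 24),
    "4 x E7-4890 v2": (5576.26, 60),
}
for name, (tpse, cores) in results.items():
    print(f"{name}: {tpse / cores:.1f} tpsE per core")
```

The per-core figures put the E7 v2 result between the Westmere-EX and Sandy Bridge/Ivy Bridge EP numbers, consistent with the large gain being mostly hardware, scaled by core count and socket count.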
At first I thought that the 4-socket E7 v2 performance gain over the 4 and 8-socket Westmere E7 also involved the new Hekaton (In-Memory OLTP) feature in SQL Server 2014. But then I realized that the 2-socket E5 v2 result on SQL Server 2012 was in line with the E7 v2 result being achieved on the traditional table structure. The E7 v2 benchmark details had not yet been released, so there was no way to tell whether Hekaton was or was not enabled, or why.
The IBM System x3850 X6 configuration was 4 x E7-4890 v2 (60 cores, 120 threads), 2TB (64 x 32GB) memory, 5 RAID controllers connected to 208 SAS SSDs, and 1 RAID controller for logs. The server, processors, memory, and miscellaneous items totaled $151K; storage was $681K; and the 30 SQL Server Enterprise Edition 2-core licenses at $13,742 totaled $404K.
The complete list price was $1.25M, with a discount of $212K (17% of the complete price), but this might actually be a 25% discount on the hardware, or just the storage. The price of the 200GB SSD (SLC?) is $3,079, which should easily support a 30% discount.
I would like to know what discount levels people are actually getting on SQL Server EE. The price with Software Assurance is about $10K per core, so this might be the proper budgeting value. Oh yeah, IBM includes 1 Microsoft Problem Resolution Services incident as part of the 3-year cost of ownership.
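Summing the quoted breakdown as a sanity check (figures in $K, rounded as quoted in the disclosure):

```python
# Sanity-check the quoted cost breakdown, all figures in $K.
hardware = 151   # server, processors, memory, miscellaneous
storage  = 681   # RAID controllers plus 208+ SAS SSDs
software = 404   # 30 x 2-core SQL Server EE licenses, as quoted

total = hardware + storage + software
print(f"components sum to ${total}K")        # ~$1,236K vs quoted $1.25M

discount = 212
print(f"discount is {discount / 1250:.0%} of $1.25M")
```

The components sum close to, but not exactly at, the quoted $1.25M total, presumably due to rounding and smaller line items not broken out above.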
On the storage IO side, PCI-E gen 3 has been available in server systems (Sandy Bridge EP) for almost 2 years. PCI-E gen 3 RAID controllers came some time after that. There are now also RAID controllers with both PCI-E gen 3 on the upstream side and SAS 12Gb/s on the downstream side. Much of the storage infrastructure (especially HDDs and SSDs) is expected to remain at 6Gb/s for some time.
It would be helpful to have disk enclosures that support 12Gb/s SAS, or RAID controllers with 4 x4 SAS ports, or both, to better leverage the bandwidth of PCI-E gen 3 x8 on the upstream side while the downstream side is still 6Gb/s. There would still be a bandwidth mismatch, but such is life.
Internally, disk enclosures have 2 expander chips, one per controller, each having an x1 SAS port for every bay plus 2 x4 ports for upstream and downstream traffic. The two controllers support dual-path operation with dual-port SAS devices. We would like the x4 ports to operate at 12Gb/s per lane while connecting to either 6 or 12Gb/s devices, allowing continued use of 6Gb/s storage devices. There might be 24 bays communicating at 6Gb/s, more than enough to load the x4 port on each of the two controllers.
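The line-rate arithmetic behind this mismatch, counting only the encoding overhead (128b/130b for PCI-E gen 3, 8b/10b for SAS) and ignoring protocol overhead:

```python
# Upstream vs downstream bandwidth, raw line rate less encoding only.
def pcie3_gb_s(lanes):
    # PCI-E gen 3: 8 GT/s per lane, 128b/130b encoding
    return lanes * 8 * (128 / 130) / 8

def sas_gb_s(lanes, gbps):
    # SAS: 6 or 12 Gb/s per lane, 8b/10b encoding
    return lanes * gbps * (8 / 10) / 8

print(f"PCI-E gen 3 x8:   {pcie3_gb_s(8):.2f} GB/s")   # ~7.88
print(f"SAS 6Gb/s x4:     {sas_gb_s(4, 6):.2f} GB/s")  # 2.40
print(f"SAS 12Gb/s x4:    {sas_gb_s(4, 12):.2f} GB/s") # 4.80
print(f"SAS 6Gb/s 4 x x4: {sas_gb_s(16, 6):.2f} GB/s") # 9.60
```

By this arithmetic, a single x4 port at 6Gb/s feeds less than a third of a PCI-E gen 3 x8 slot, while 4 x4 ports at 6Gb/s (or 2 x4 at 12Gb/s) would roughly fill it, which is the case for the controller and enclosure options suggested above.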
I am curious about the lack of SSDs with a PCI-E gen 3 interface. Dell says their PCI-E SSDs are now on the new NVMe standard. I suppose the effort to work this in, along with the combo PCI-E/SATA interface, has run into a longer-than-expected debugging effort. If so, then we will wait patiently. In the server world, it is important for new storage technology to be thoroughly tested.
Links:
- AnandTech: Intel's Xeon E5-2600 V2: 12-core Ivy Bridge EP for Servers, by Johan De Gelas, September 17, 2013
- AnandTech: Intel Readying 15-core Xeon E7 v2: A technical look at Intel's new Ivy Bridge-EX
- Tom's Hardware: Intel Xeon E5-2600 v2: More Cores, Cache, And Better Efficiency
The TPC-E supporting files are now available for the two new results on the Xeon E7 v2 and SQL Server 2014.
In the IBM report, the SQL does not use either Hekaton tables or compiled SQL.
I will look over the NEC report later.