Microsoft just released version 3.0 of the Fast Track Data Warehouse Reference Guide. The main changes are an increased memory recommendation and a change in the disks per RAID group, from 2-disk RAID 1 to 4-disk RAID 10.
The earlier FTDW reference architecture cited 4GB of memory per core. There was no real rationale behind this figure, but it was felt that some rule was better than no rule. The new FTDW reference guide correctly cites the rationale that more memory helps keep hash join intermediate results and sort operations in memory.
4-Disk RAID 10
FTDW versions 1 & 2 specified 2-disk RAID groups. The reason was that the SAN storage systems could not deliver good sequential IO performance with fat RAID groups (composed of many drives). For whatever reason, a thin 2-disk RAID group could deliver 100MB/sec per disk. An EMC whitepaper added that a 3-disk RAID 5 group could also deliver good sequential performance, but RAID 5 does not have the performance characteristics necessary to support tempdb IO requirements. FTDW 3.0 now specifies 4-disk RAID 10 groups. The presumption is that 100MB/s per disk of sequential IO is still the objective, but no explanation was given for why a 4-disk RAID group is now specified. It is possible that the firmware in SAN systems can now sustain good sequential performance with larger RAID groups, but an explanation is still desired.
Apparently the reason for the 4-disk RAID 10 group is to match the PDW configuration. No data was provided as to whether this could still meet the old 100MB/s per disk objective.
The reasons cited for the new memory recommendation in FTDW 3.0 are correct. My suggestion, however, is that rather than citing specific memory configurations that just happen to be reasonable today, we apply the underlying principle. The cost of server-class ECC memory modules follows a consistent pattern. An 8GB DIMM today is $300 and a 16GB DIMM is $850 (from Crucial). 32GB memory modules are just becoming available and are probably very expensive, say for example $5000. The premium for a 16GB module is relatively small compared with two 8GB modules.
The strategy is driven by the low cost of memory relative to the overall environment and by the fact that memory has positive benefits but no specific hard requirement. Fill the available memory sockets with either 8GB or 16GB modules (depending on system constraints). When the premium for 16GB modules disappears, drop the 8GB module from consideration. When the premium for 32GB modules shrinks, both 16GB and 32GB modules can be considered.
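To make the principle concrete, here is a rough cost-per-GB comparison using the prices quoted above (illustrative figures only; actual prices vary by vendor and drift over time):

-- Rough cost-per-GB comparison, using the approximate DIMM prices cited above.
DECLARE @p8  money = 300,    -- 8GB DIMM
        @p16 money = 850,    -- 16GB DIMM
        @p32 money = 5000;   -- 32GB DIMM (early availability, price is a guess)

SELECT '8GB'  AS module, @p8  / 8  AS cost_per_GB   -- $37.50 per GB
UNION ALL
SELECT '16GB',           @p16 / 16                  -- $53.13 per GB
UNION ALL
SELECT '32GB',           @p32 / 32;                 -- $156.25 per GB
-- Premium of one 16GB DIMM over two 8GB DIMMs: 850 - 2 * 300 = $250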
My preference is still to employ direct-attach storage systems with RAID controllers in the server system. The latest PCI-E SAS RAID controllers can support 2.8GB/s, just under the net bandwidth of a PCI-E gen 2 x8 slot. Each 6Gbps SAS x4 channel can support 2GB/s, and modern 15K disk drives can deliver 150MB/s on the outer tracks. This infrastructure has an amortized cost structure of less than $700 per disk; the cost structure of the low-end SAN systems specified by FTDW is typically 2X higher per disk. I also prefer to employ a larger number of 2.5in 15K HDDs spread over more channels than FTDW specifies, rather than the 300GB+ 3.5in 15K HDDs.
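As a back-of-envelope check on these figures (using only the bandwidth numbers just cited), the disk counts needed to saturate each component work out roughly as follows:

-- Back-of-envelope check of the direct-attach bandwidth figures cited above.
DECLARE @controller_MBps int = 2800,   -- PCI-E gen 2 x8 SAS RAID controller
        @sas_x4_MBps     int = 2000,   -- one 6Gbps SAS x4 channel
        @disk_MBps       int = 150;    -- 15K HDD, outer tracks

SELECT @sas_x4_MBps / @disk_MBps     AS disks_per_x4_channel,   -- ~13 disks saturate one channel
       @controller_MBps / @disk_MBps AS disks_per_controller;   -- ~18 disks saturate one controller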
Index Light Strategy
FTDW endorses the index light strategy. The presumption is that the data warehouse is very large, in the multi-TB range, too large to fit in system memory. The storage strategy employs hard disk drives because an HDD storage system properly configured with IO distributed over multiple channels and many disk drives can deliver high sequential IO bandwidth economically.
The SQL Server query optimizer cost model assumes that table scans run at 10.8MB/s (1,350 pages/sec) and that random IO (from loop joins and key lookups) runs at 320 IOPS. The absolute value of each in relation to the actual server is not important; the important point is the ratio of 1,350 pages/sec sequential to 320 random IOPS.
The FTDW reference architecture is designed to achieve sequential IO of 100MB/sec per disk, or 12,800 pages/sec. A 3.5in 15K disk can support 175 IOPS (225 for a 2.5in 15K) for random IO distributed over the entire drive at queue depth 1; at higher queue depths, 15K disks can support 320 IOPS. So the IO performance characteristic of the FTDW reference architecture is a ratio of 12,800 pages/sec sequential to 320 random IOPS per disk.
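The two ratios are easy to put side by side (figures as cited above): the optimizer's internal model treats a sequential page as costing roughly one quarter of a random IO, while on FTDW hardware a sequential page costs roughly one fortieth of a random IO.

-- Optimizer cost model ratio versus the FTDW hardware ratio (per disk).
SELECT 1350.0  / 320 AS optimizer_pages_per_random_io,   -- ~4.2 sequential pages = 1 random IO
       12800.0 / 320 AS ftdw_pages_per_random_io;        -- 40 sequential pages = 1 random IO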
There is no provision to adjust the SQL Server query optimizer sequential-to-random ratio. For this reason, FTDW favors avoiding non-clustered indexes so that key lookups and loop joins are not an execution plan option. A good partitioning scheme to help contain the range of scan operations is an essential element of the index light strategy.
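A minimal sketch of what this looks like in practice, assuming a hypothetical fact table partitioned on a date key (all object names and boundary values below are illustrative, not from the FTDW guide):

-- Hypothetical date-range partitioning for an index-light fact table.
CREATE PARTITION FUNCTION pfSaleDate (date)
AS RANGE RIGHT FOR VALUES ('2010-01-01', '2010-04-01', '2010-07-01', '2010-10-01');

CREATE PARTITION SCHEME psSaleDate
AS PARTITION pfSaleDate ALL TO ([PRIMARY]);

-- Clustered index only, no nonclustered indexes, so key lookups into this
-- table are never an option for the optimizer; date-filtered queries scan
-- only the relevant partitions.
CREATE TABLE dbo.FactSales
(
    SaleDate   date  NOT NULL,
    ProductKey int   NOT NULL,
    Quantity   int   NOT NULL,
    Amount     money NOT NULL
) ON psSaleDate (SaleDate);

CREATE CLUSTERED INDEX cixFactSales ON dbo.FactSales (SaleDate)
ON psSaleDate (SaleDate);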
Tempdb on SSD
SQL Server issues IO to tempdb at high queue depth, typically 64K in size, to store the intermediate results of large hash joins and sort operations. The pattern is a write burst eventually followed by reads. A very large disk array can support this, but a small array of SSDs can support extraordinarily high IO rates at lower latency than HDDs.
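A minimal sketch of relocating tempdb to an SSD volume (the drive letter and file layout are placeholders; the files move to the new location on the next SQL Server service restart):

-- Move the default tempdb files to an (assumed) SSD volume S:.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'S:\tempdb\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'S:\tempdb\templog.ldf');

-- Optionally add data files to spread the write burst across files/volumes.
ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'S:\tempdb\tempdb2.ndf', SIZE = 8GB);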
The next consideration is that if only tempdb is on SSD, there is no hard requirement to employ enterprise-grade SSDs. Depending on the situation, some environments can consider consumer-grade SSDs. The difference in cost between consumer and enterprise SSD is more than 10X ($30K per TB for enterprise, less than $3K for consumer, both without RAID). For comparison, an HDD storage system designed for performance might cost $10K per TB in RAID 10.
Some HDD storage systems can accept SSDs with SATA/SAS interfaces, but not all can support the very high IO rates SSDs are capable of, so it might be a better idea to employ PCI-E SSDs. This configuration does not support clustering, but clustering is not necessarily a true requirement for data warehouses. Going forward, expect storage systems to finally abandon their underpowered, obsolete processors/controllers and switch to powerful Intel Xeon 5600 or later processors; then intermixing SSD and HDD should not be a problem.
Index Heavy Strategy
There appears to be a tendency to treat data warehouses as a contest of whose is bigger, but the data warehouse exists to serve the organization. With good design, the active portion of a data warehouse may fit in the system memory of modern server systems. Current 2-socket systems support 192-256GB, 4-socket systems support 1TB, and 8-socket systems support 2TB of memory. The cost of 1TB of memory, 64 x 16GB ECC DIMMs, should be less than $64K.
Another possible strategy is to employ SSD for the main database itself, or for the most critical portions of the data warehouse. In an SSD storage system, there is very little difference between large-block “sequential” IO and small (8K) block “random” IO. The difference in throughput between one page of a (sequential, large-block) table scan and one page of “random” loop join/lookup IO might be nothing, or perhaps 2:1. It also turns out that for data in memory, the CPU cost per page of a table scan is approximately the same as that of one key lookup or loop join.
In this case, the strategy should be to employ (sufficiently) covering indexes as necessary. It may even be advantageous to force an index in some cases, because the optimizer assumes a cost ratio of 4 pages scanned to 1 random IO. Note that the index heavy strategy will result in execution plans with loop joins and key lookups. These operations do not scale to high degrees of parallelism, whereas the hash join has excellent scaling at very high degrees of parallelism. (The suspected reason is that each row latches the root level of the index, resulting in contention at DOP 8 and higher. I do not understand why there cannot be just a single latch for the entire query, or even one per thread.)
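A sketch of the kind of covering index and forced-index hint described here, reusing the hypothetical fact table from the partitioning example above:

-- Covering index: key columns plus INCLUDE columns satisfy the query
-- without any key lookup.
CREATE NONCLUSTERED INDEX ncixFactSales_Product
ON dbo.FactSales (ProductKey, SaleDate)
INCLUDE (Quantity, Amount);

-- Force the index when the optimizer's assumed 4:1 scan-to-random-IO cost
-- ratio would otherwise pick a full scan.
SELECT SaleDate, SUM(Amount) AS Amount
FROM dbo.FactSales WITH (INDEX (ncixFactSales_Product))
WHERE ProductKey = 1234
GROUP BY SaleDate;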