[2008/08/25] This post has been modified significantly to correct some inaccurate statements because I mis-read Joe Chang's post.
Joe Chang posted some interesting results using the TPC-H queries with the scale factor set to 10. I happened to have done something similar, and naturally noticed a rather significant difference between his results and mine. [2008/08/25: Okay the difference is not as significant as I had thought.]
My results were obtained on SQL Server 2005 Enterprise x64 Edition running on Windows Server 2003 Enterprise x64 Edition. The test server, whose make and model will remain undisclosed, had four 2.93GHz quad-core Intel Tigerton sockets (Xeon X7350) with 64GB of RAM with 20GB allocated to the SQL Server buffer pool. The TPC-H scale factor was set to 10, and the data were generated with DBGEN from tpc.org.
My results are as follows:
| Query |
MAXDOP 1 |
MAXDOP 4 |
MAXDOP 8 |
MAXDOP 16 |
| tpcH Query1 |
64845 |
17080 |
9373 |
4865 |
| tpcH Query2 |
571 |
180 |
110 |
97 |
| tpcH Query3 |
17687 |
3272 |
1792 |
1334 |
| tpcH Query4 |
23079 |
3859 |
2293 |
1633 |
| tpcH Query5 |
34255 |
5040 |
2646 |
1935 |
| tpcH Query6 |
2250 |
595 |
332 |
204 |
| tpcH Query7 |
18299 |
3562 |
1754 |
1513 |
| tpcH Query8 |
9636 |
3245 |
2042 |
1809 |
| tpcH Query9 |
50470 |
14094 |
7559 |
4460 |
| tpcH Query10 |
19363 |
4432 |
2289 |
1630 |
| tpcH Query11 |
4558 |
674 |
424 |
390 |
| tpcH Query12 |
10522 |
3815 |
2718 |
982 |
| tpcH Query13 |
26823 |
7577 |
4425 |
2121 |
| tpcH Query14 |
2356 |
624 |
357 |
227 |
| tpcH Query15 |
1705 |
1421 |
280 |
170 |
| tpcH Query16 |
7960 |
3461 |
2155 |
1681 |
| tpcH Query17 |
1070 |
325 |
220 |
276 |
| tpcH Query18 |
68366 |
19719 |
10174 |
5829 |
| tpcH Query19 |
2330 |
711 |
441 |
843 |
| tpcH Query20 |
968 |
314 |
221 |
695 |
| tpcH Query21 |
80858 |
14565 |
8149 |
6072 |
| tpcH Query22 |
9418 |
1762 |
986 |
664 |
| |
|
|
|
|
| Total (second) |
457 |
110 |
61 |
39 |
All the numbers in the above table are query elapsed times in milliseconds except the very last line highlighted in yellow, which are in seconds. The last line contains the total elapsed time for Query1 through Query22 for each MAXDOP setting. All the results were obtained with all the database pages cached in the buffer pool. The same TPC-H query stream (i.e. Query 1 through Query 22) was repeated for 10 times with all the results from the first run thrown out, and the max elapsed time and the min elapsed time for each query for each of the subsequent runs were also thrown out before the elapsed times were averaged.
So what's the difference between Joe's results and mine?
Although there are some differences in the elapsed times in terms of the absolute numbers, the differences are minor enough to be no cause for alarm, especially when you take into consideration the platform differences and probably some differences in the test setup.
However, the deline in the elapsed times in my test results is more significant than it is in Joe's results when MAXDOP is changed from 1 through 16. The difference is not as dramatic as I had throught, and initially commented on in the original version of this post (because I was reading the wrong line for whatever reason).
Given that TPC-H has been around for a long time and all DBMS vendors have tried very hard to optimize their products to performance well on TPC-H, I do not expect SQL Server to be an exception, and am rather pleased to see that it scales very well with MAXDOP on these queries.
I'm a bit embarrassed that I didn't read Joe's results correctly first time around, but felt that it's time well spent to be able to cross check independent tests results from realistic envrionments.