
Linchi Shea

Checking out SQL Server via empirical data points

Where Does Time Go?

A significant part of my job is to evaluate how SQL Server (and sometimes other DBMSs) performs on various hardware platforms, in particular on new processors and their related chipsets as they are released. So naturally, I pay close attention to performance analyses of DBMSs.

 

One of the papers at the top of my reference list in this area was written by a group of researchers from the University of Wisconsin: "DBMSs On A Modern Processor: Where Does Time Go?" by Ailamaki, DeWitt, Hill, and Wood. The paper was published back in 1999, but its findings and methodology are still interesting.

 

The key conclusion of the paper is that when data is memory resident, the main bottleneck to DBMS performance is L2 data cache misses and, to a lesser extent, L1 instruction cache misses. The conclusion still appears to hold for today's commercial DBMSs, judging by how often it is quoted in recently published studies.
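
To make the bottleneck concrete, here is a quick C sketch of my own (it is not from the paper, and the sizes and PRNG are arbitrary illustrative choices): it walks the same memory-resident array twice, once sequentially and once by chasing a randomly shuffled chain of indexes. The chased walk defeats the hardware prefetcher, so nearly every load is a data cache miss, and the gap between the two times is roughly the kind of stall time the paper is talking about.

    /* Sketch: sequential walk vs. random pointer chase over one array.
       Not from the paper; sizes and the PRNG are arbitrary choices. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1 << 24)   /* 16M entries (~128 MB): far bigger than any L2 cache */

    static unsigned long long rng_state = 42;

    /* xorshift64: a tiny PRNG; statistical quality does not matter here */
    static unsigned long long rng(void) {
        rng_state ^= rng_state << 13;
        rng_state ^= rng_state >> 7;
        rng_state ^= rng_state << 17;
        return rng_state;
    }

    int main(void) {
        size_t i, p;
        volatile size_t sink = 0;
        clock_t t0;
        size_t *next = malloc((size_t)N * sizeof *next);
        if (next == NULL) return 1;

        /* Sattolo's algorithm: turn the identity permutation into one big
           random cycle, so chasing next[] visits every entry exactly once. */
        for (i = 0; i < N; i++) next[i] = i;
        for (i = N - 1; i > 0; i--) {
            size_t j = (size_t)(rng() % i);
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }

        t0 = clock();
        for (i = 0; i < N; i++)            /* sequential: prefetch-friendly */
            sink += next[i];
        printf("sequential walk: %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

        t0 = clock();
        for (p = 0, i = 0; i < N; i++)     /* dependent loads: each miss stalls */
            p = next[p];
        printf("chased walk:     %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

        sink += p;   /* keep the chase from being optimized away */
        free(next);
        return 0;
    }

Both loops touch exactly the same entries and do the same trivial work per entry; any difference in wall time comes almost entirely from how well the memory hierarchy can hide the load latency.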

 

If you are interested in DBMS performance with respect to processors and memory hierarchies, I highly recommend this paper.

Published Wednesday, January 02, 2008 1:02 AM by Linchi Shea


Comments

 

Chuck said:

Dude, you rock.

Happy 2008.  I need to resolve better.

:-)

Chuck

January 2, 2008 7:07 AM
 

James Luetkehoelter said:

That's my hometown!

That's a decent read, but given that it's from 1999, how much of the methodology and conclusions do you think is still valid for today's processors?

January 3, 2008 9:38 PM
 

Linchi Shea said:

Yeah, "Pentium II Xeon processor running at 400 MHz, with 512 MB of main memory

connected to the processor chip through a 100 MHz system bus" is hardly a modern system these days. But I find the discussions overall interesting. More specifically, its choice of the measurement tools, its discussions of the query execution model, and its discussions of the query execution time breakdown are quite relevant (to my job at least), though the specific numbers have changed or may not apply to the new processors. Speaking of the newer processors, I hope the author would repeat the experiment on a modern multi-core processor.
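
(For anyone who hasn't opened the paper: it models query execution time as a simple additive breakdown, which from memory looks roughly like

    T_Q = T_C + T_M + T_B + T_R

where T_C is useful computation, T_M is stalls on the memory hierarchy (the cache misses above), T_B is branch misprediction stalls, and T_R is resource stalls. Check the paper for the exact notation.)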

Also, the paper is still widely quoted in the recent literature.

January 3, 2008 10:17 PM
 

Scott R. said:

Linchi,

Thanks for sharing the link to this paper.  The findings are interesting and relevant despite the time that has passed since its publication.

My gut feel is that while current systems are technically "faster" (higher clock speeds, faster caches, faster buses, faster memory, multi-core, etc.) and theoretically "more powerful" than the systems tested in the paper, the problems the paper highlights may actually be more acute now than they were then. The blocking / stalling issues stem from the gap between the fastest and slowest paths for a given operation, and that gap may be proportionally greater now than it was then. As a result, the percentage of system resources available to do useful work may be lower now than in the earlier test results. The large overall growth in capacity and performance in current systems versus the tested systems still yields greater performance, but the gains may be diminishing because of this effect.
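
To put rough, illustrative numbers on that (these are ballpark latencies I am assuming, not measurements):

    400 MHz CPU (2.5 ns/cycle) with ~100 ns memory access:  ~40 cycles lost per miss
    3 GHz CPU (0.33 ns/cycle) with ~60 ns memory access:   ~180 cycles lost per miss

So even where the absolute memory latency improved, the relative cost of each stall could easily have grown by 4-5x.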

Again, just a gut feel on my part – no specific evidence to back it up.  Like you stated, I think it would be great if the tests could be repeated on current systems, but I know the effort involved in such a test would be pretty large.

I would be interested in your opinion of how current systems might compare to the test results and findings in the paper.

The paper refers to "database developers", which I take to mean product development staff at RDBMS vendors, not IT application developers or DBAs who use an RDBMS product. In my opinion, only the vendor product development staff will be able to change these behaviors (stalls, blocking, etc.) caused by code and data caching issues. I would be curious to know how much of this type of work (making current functions perform much better through careful analysis and testing) is actively done by RDBMS vendors to improve their respective products, versus just expanding the list of available features and capabilities to sell more product. I suspect some is done, but how much and how deep (to the level of analysis done in the paper) I don't know.

I am also curious as to which lettered system (A, B, C, D) corresponded to which vendor / RDBMS product / version.  I know that legal restrictions prevented them from revealing this information, but it would be interesting to know.  I’m guessing that the mix was the RDBMS market leaders at that time, and may have included Microsoft SQL Server (because Microsoft provided the OS for the test system(s) – probably 7.0 because 2000 had not yet been released in 1999), Oracle (probably 7 or 8), IBM DB2, and maybe Sybase.  I would have no clue as to which was which in the test results.  Any guesses on your part?

Thanks,

Scott R.

January 5, 2008 1:01 PM
 

Linchi Shea said:

In my previous post on this subject, I mentioned that I'd very much like to see the authors repeat the ...

January 15, 2008 12:12 AM
