
Kalen Delaney

Did You Know: Can Query Tuning Become Unnecessary?

I wrote a commentary for the SQL Server Magazine e-newsletter last week and was expecting to get a bit more feedback than I did, so I'm hoping my blog readers can help out. I was commenting on a message that one of my business partners had sent out, wondering about the changes that new higher-speed storage technology might bring.

Please read the whole commentary here. (It's not that long.)

Basically, the comment is that new storage technology is appearing that could deliver I/O operations per second (IOPS) at rates thousands of times higher than today's.

Much of the reason for building good indexes and tuning our queries is to reduce the number of reads that SQL Server has to perform. But if reads are so fast that even with table scans our queries are always 'fast enough', would you spend time on query and index tuning?
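
(If you want to see the reads in question for one of your own queries, SQL Server will report them per statement. This is just a generic sketch; the table and column names are placeholders.)

SET STATISTICS IO ON;

SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = 42;

-- The Messages tab now reports, per table, the logical reads (pages touched in
-- the buffer cache) and physical reads (pages fetched from disk) that index
-- and query tuning try to drive down.

SET STATISTICS IO OFF;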

At this point, my question is NOT whether you think there will be available and affordable storage systems that can perform as suggested. My question is "IF there were such systems, would you still spend time tuning your queries and evaluating the best indexes?"  Also, if query tuning does become unnecessary, what would you do instead to maintain your database system's performance? Or would you be out of a job? 

I'd really like to hear what you think!


Thanks

~Kalen

Published Sunday, September 07, 2008 7:49 PM by Kalen Delaney


Comments

 

Adam Machanic said:

Absolutely.  I've seen tiny queries against tables totally cached in memory that took hours to run without the appropriate indexes, and milliseconds with.  Faster I/O will certainly make things easier and certain performance problems less extreme, but it will in no way obviate the need to tune queries.
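
For example, something as simple as the following can go from touching every (cached) page of a table to a few-page B-tree seek. A sketch with hypothetical names:

-- No index on CustomerID: a full scan, even when the table is entirely in memory
SELECT OrderTotal
FROM dbo.Orders
WHERE CustomerID = 12345;

-- A supporting index turns the identical query into a seek on a handful of pages
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID) INCLUDE (OrderTotal);

SELECT OrderTotal
FROM dbo.Orders
WHERE CustomerID = 12345;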

September 7, 2008 10:18 PM
 

Aaron Fischer said:

IO will still be a bottleneck at higher scale. Perhaps some small applications will be able to ignore this, and bad design may flourish (this may already be the case in many accounts). But any application with considerable load will still demand a well-designed database with proper indexes.

September 7, 2008 10:43 PM
 

jchang said:

In the really old days, 1993-1996 (after the resin fire and before the massive DRAM price crash), some people got it into their heads that database tuning was just about reducing disk IO. A lot of this was reported on the Oracle side, but I wasn't following much SQL in those days. After the big DRAM price crash of '96, servers started to have enough memory to keep disk IO to a minimum, and performance still stank. So people finally started to admit that the CPU efficiency of operations was equally important.

Also, see my post on IO cost structure. Even if storage were infinitely fast, i.e., zero latency, going outside the SQL Server buffer cache still incurs a high CPU cost, up to 2-5X the cost of the base operation with all data in the buffer cache. Fast IO really helps, but still tune your SQL. I am OK with letting certain people proceed with silly beliefs; I do not believe in interfering with the laws of natural selection (and deselection) at work.
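
One rough way to see the CPU cost jchang describes (a sketch for a test server only; dbo.BigTable is a placeholder for any reasonably large table):

DBCC DROPCLEANBUFFERS;              -- test box only: empties the buffer cache
SET STATISTICS TIME ON;

SELECT COUNT(*) FROM dbo.BigTable;  -- cold run: every page is a physical read
SELECT COUNT(*) FROM dbo.BigTable;  -- warm run: every page comes from cache

SET STATISTICS TIME OFF;
-- Comparing the two 'CPU time' figures shows the extra cycles spent just
-- moving pages across the buffer-pool boundary, over and above any IO wait.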

September 7, 2008 10:56 PM
 

Uri Dimant said:

I agree with the others. If you see high logical reads even with faster I/O, then by reducing them through tuning you will see a performance gain for sure. Actually, I hear these kinds of questions from clients; the most recent was "If the database (pages/plans) is resident in memory, what will the DBA be doing?" :-))

September 8, 2008 12:28 AM
 

Greg Linwood said:

I think faster I/O combined with the increased caching capacity of 64-bit platforms will reduce tuning requirements to some degree, but there will also be compensating factors such as increased database sizes and more sophisticated querying (to support more complex apps).

I'm wondering whether SQL Server's architecture will change significantly as a result of the improvements in I/O technology. Perhaps SQL Server won't need a buffer manager at all in the future, bringing even further increases in efficiency.

September 8, 2008 3:31 AM
 

alphatross said:

Kalen poses a fascinating question. I'm going to put my sci-fi head on and go really non-technical, to imagine a future where storage, CPU, etc. are so advanced as to make query/index tuning unnecessary. (I also imagine that in this world everyone will wear tight silver suits and eat nutrition pills instead of food, but anyway...)

Will DBAs still have a job? I think yes, absolutely. Even though tuning can yield amazing leaps in application performance, I still find it *so* hard to find time in my day to devote to tuning (my last job before my current one was 600+ production SQL databases across 160+ SQL Servers ranging from 6.5 to 2005), most of my day being spent reviewing policy (security, contingency), new projects/releases, performing builds for test, backups/restores, and troubleshooting DTS (now SSIS). Anyone else think this, or am I just ending up in the wrong jobs?

Also, imagining this scenario reminds me of my early days of DBA innocence, when I treated SQL as a purely declarative language: I just defined *what* data I wanted, and the data would come back. In other words, can we consider DB tuning a workaround for today's hardware/optimiser limitations (e.g. the use of hints) that in a perfect world shouldn't be necessary?

September 8, 2008 6:50 AM
 

unclebiguns said:

Query tuning will still be necessary, because expectations will change. Now a few seconds of response time is acceptable, but as the hardware gets faster the acceptable wait time will shrink as well. Think about the internet: just 10 or so years ago everyone was happy with a 26.6Kbps modem connection; now if you have less than a 1Mbps connection you find it unacceptable.

September 8, 2008 9:16 AM
 

Adam Machanic said:

Greg: I was wondering the same about losing the buffer manager.  If some kind of non-volatile RAM technology becomes fast, affordable, and durable enough, why bother with a disk at all?  I imagine this could also reduce the need for a fairly big chunk of the transaction logging that SQL Server does today.  It will be interesting to see how that plays out.

September 8, 2008 9:37 AM
 

noeldr said:

This reminds me of "Nobody will ever need more than 640k RAM!" ...

Yeah, IO can get faster, but our apps can become hungrier for data processing too :-)

The best optimizer is between your two ears -- Michael Abrash.

September 8, 2008 10:14 AM
 

Denis Gobo said:

This will be a classic cat-and-mouse game: the faster your machine gets, the more data and real-time processing the business units want.

Maybe you don't need to tune those 100-million-row tables, but once you have tables with a quadrillion rows you will still have to deal with the same problem. Data is being collected at an unprecedented rate. Soon 64-bit will not be enough. (Okay, I'll stop now, I am starting to sound like Kurzweil.)

September 8, 2008 10:15 AM
 

James Luetkehoelter said:

I agree with the general thought here that query tuning will always be necessary. One of my personal mantras is "plan for the extreme". The more capable the hardware, the more data will be put on that hardware. Relying on hardware alone for performance has a tendency to validate sloppy code, bloated code, and "ignored" systems. I deal with plenty of that right now with hardware that was purchased two years ago as "uber-fast". Now the performance is "uber-slow", since no one paid attention to good coding practices and the need to maintain a system (or tune a query). I don't expect that cycle to ever go away until we build data storage systems that store and retrieve at a level approaching or exceeding the human brain.

September 8, 2008 11:59 AM
 

Carl Federl said:

Joe Chang has posted here and also has a very interesting CPU-utilization-for-IO benchmark. For a scenario with and without indexes, below are the detailed calculations of CPU elapsed time, which come to about 700 seconds of CPU time without indexes versus about 0.07 seconds with indexes. These times do not include disk IO.

It certainly looks to me like performance tuning will still be mandatory!

Have I made a math error?

Joe Chang's CPU-for-IO benchmark is at http://sqlblog.com/blogs/joe_chang/archive/2008/09/07/8742.aspx#8774 and shows that on a 3GHz CPU it takes about 4 microseconds to read a page into memory plus 3 microseconds to evict a clean page from memory, 7 microseconds per page in total.

An example and comparison:

CREATE TABLE Foo
( FooKey       integer  NOT NULL PRIMARY KEY
, FooAttribute char(20) NOT NULL
, SomeData     char(40) NOT NULL );

CREATE TABLE Fum
( FumKey   integer  NOT NULL PRIMARY KEY
, FooKey   integer  NOT NULL FOREIGN KEY REFERENCES Foo (FooKey)
, SomeData char(40) NOT NULL );

The SQL statement, assuming 100 Foo rows match the WHERE clause (so it returns 10,000 rows):

SELECT *
FROM Foo
JOIN Fum
  ON Fum.FooKey = Foo.FooKey
WHERE Foo.FooAttribute = 'somevalue';

Data statistics:

Foo = 1,000,000 rows at 100 rows per page = 10,000 pages = 78MB
Fum = 100,000,000 rows at 100 rows per page = 1,000,000 pages = 7,800MB
Average cardinality of Fum to Foo is 100 to 1

Analysis:

Without any indexes: a table scan of Foo is 10,000 IO, and a nested loop then requires 100 table scans of Fum for another 100,000,000 IO. Total IO is 100,010,000, giving an elapsed time of about 700,000,000 microseconds = 700 seconds.

With indexes, assuming a nested loop: an index seek on Foo plus bookmark lookups gives 10 IO for the index and 100 IO for the data. Then, for each of those 100 rows, an index seek on Fum takes 2 IO for the index plus 100 IO for the data pages. Total IO is about 10,310, giving an elapsed time of about 72,000 microseconds = 0.07 seconds.

September 8, 2008 3:58 PM
 

ton said:

As others have already stated, query tuning will always be necessary. The thing that concerns me is the limited number of times that the flash memory in solid state drives can be written to before it can no longer hold the charge to maintain its bit state. With the mission-critical data that's usually stored in a database, this would be a HUGE concern for me.

September 8, 2008 4:08 PM
 

Adam Machanic said:

Ton: Two comments regarding flash RAM and durability.  

A) As the technology matures, it is getting quite a bit better in terms of the number of times it can be rewritten.

B) There are newer flash technologies coming out where the hardware ensures that data writes are spread across all of the available areas on the device (wear leveling) in order to further reduce wear. On top of that, the memory areas are monitored, and as they start to decay they're marked as bad. So your disks get a bit smaller over time, but apparently not by a huge amount (and I'm betting some of that space is also reserved up front).

On top of both of these, I think flash SSD technology will mostly be used in SANs such as the new EMC devices.  Doing a RAID 10 across these high-end flash SSDs would be quite safe and durable.

September 8, 2008 4:54 PM
 

mosha said:

I am amazed at the general sentiment in the responses here. I believe that neither query tuning nor index evaluation will be necessary in the future, regardless of whether these super-fast IO systems appear. The fact that people have to tune queries and evaluate indexes in the first place is only because RDBMSs are just not mature enough. How much time exactly do we spend tuning the assembly code generated by a compiler? The same should be true of database queries, and it will be; there is no stopping progress, no matter how defensive the DBA population is.

BTW, there are already systems that don't even let the DBA control indexes - e.g. OLAP servers such as Analysis Services, which builds whatever it thinks is right - and just look how fast OLAP is.

It's only a matter of time.

September 9, 2008 1:04 AM
 

GrumpyOldDBA said:

Wishing to be diplomatic here: I know of a business db, a mere 15GB, that currently has to run on lots of hardware to get any performance; queries count their IO in the millions. The feeling there is that having applied 16 cores, lots of RAID 10, and more memory than the database size, SSD will give the performance gains required. So do I think we'll stop tuning? No. Rather than writing good, efficient code against good schemas, the underlying quality will degrade and be offset by hardware - this is the chance to throw even more hardware at poor applications and databases - but, as always, eventually tuning will be required. I have a query plan that generated so many IO it caused an overflow error - 200 billion IO, if I remember correctly. I am contemplating getting an SSD for my laptop, though - anything has to be better than a 5,400RPM disk.

September 9, 2008 7:33 AM
 

Greg Linwood said:

I agree, Grumpy. I worked on a site last year that had only a 2GB SQL db yet was the highest-throughput DotNetNuke site in the world at the time. The tuning requirements on that project had absolutely zero to do with I/O! I'm also looking at an SSD for the notebook, but it's more about battery life than performance for me.

September 9, 2008 8:32 AM
 

Garland Frye said:

I don't feel that advancements in hardware and/or the SQL Server engine itself will ever absolve us of tuning. I think the fallacy in arguments like these is that the technology of tomorrow is being compared to the business problems of today. It has been my experience that systems are architected around the limits of technology. My argument, distilled: System A was architected around the limit of its ability to do 1,000 transactions per second, and tuning was necessary to reach that limit. With new technology the capacity doubles to 2,000 transactions per second, and System B will be architected around that new limit. Like System A before it, tuning will be necessary to reach this limit.

I think it is somewhat analogous to my daily commute. When I purchased my house I only looked in areas within a radius of 20 miles from my workplace, because I knew I could commute 20 miles in primetime traffic in around 45 minutes. Twenty miles, along with the associated time and fuel costs, was my limit. Now, if tomorrow the price of oil dropped to $15 a barrel and the volume of traffic was suddenly offset by highways that allowed us to safely travel at double the speed, then the cost plus time of my commute would be significantly decreased. At that point I might decide to simply enjoy my new commute, but since the low opportunity cost of commuting has made houses further from the city more attractive, I would probably sell at a profit and move even further out. Living 40 miles from the city, I would still be at my limit of a 45-minute commute plus the cost of gas.

My point is that businesses will continue to operate at the limits of their profitability model, and therefore practices like tuning that are necessary to reach that limit will continue.

September 9, 2008 12:38 PM
 

James Luetkehoelter said:

Mosha:

That's a very interesting perspective, and in some ways I agree with you. The problem is that with many database systems the usage is so varied and polymorphic that one index plan enhances a certain type of usage (let's keep it simple and say operational read-write) at the expense of another (again simply, decision support or operational status querying). Will we ever get an RDBMS that is smart enough to tune itself for any scenario? Maybe. I guess my comment about hardware approaching the speed of the human brain is really a question of data storage and retrieval code that does the job.

I've worked in the Oracle world a fair amount, and one of the things that really frustrated me (although the trend is changing now) is the number of items that could be, and NEEDED to be, tuned by hand. I remember that one of the mantras for SQL Server 7.0 was "no knobs" - meaning that if there is a rule for how something should be tuned in a given scenario, it should be in the engine itself. Perhaps it is possible for a database engine to become "knobless", but my gut feeling is that the demands placed on data by business, and how things constantly scale and evolve, make the goal of "no knobs" an extremely difficult one.

I do agree with you that if there is something that makes tuning unnecessary, it will come from the data engine itself. Hardware, however, I see as a stop-gap until the next bottleneck is reached.

September 9, 2008 1:18 PM
 

Richard Fryar said:

All comments have missed one important point - query tuning is enjoyable and intellectually stimulating.

Maybe I'm a geek but I actually enjoy squeezing the last ounce of performance from a database.

Kalen asked "IF there were such systems, would you still spend time tuning your queries and evaluating the best indexes?"

My answer is "absolutely".

September 10, 2008 12:56 PM
 

RussellH said:

James:

Oracle with 10g and 11g seems to be following that same "no knobs" mantra. Most settings should be left at defaults, and tablespaces should be self-sizing (locally managed tablespaces with auto extent sizing and automatic segment space management). Oracle will also recommend building specific indexes based on access patterns, but an administrator still needs to create them.

Kalen:

"Most of the reasons for building good indexes and tuning our queries is to reduce the number of reads that SQL Server would have to perform."

But even if all those reads come from the buffer cache, reading an index should still be way faster than fully scanning a large table. As others have pointed out, reading data from the buffer cache into the CPU and processing it still takes a finite amount of time. B-tree index access can reduce that time by many orders of magnitude by using an efficient search algorithm.
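
Using Carl's Fum table from upthread as an example, the difference shows up even when every page is cached. (A sketch: IX_Fum_FooKey is a hypothetical nonclustered index on FooKey, and INDEX(0) forces a scan.)

SET STATISTICS IO ON;

SELECT * FROM Fum WITH (INDEX(0)) WHERE FooKey = 42;             -- scan: ~1,000,000 logical reads
SELECT * FROM Fum WITH (INDEX(IX_Fum_FooKey)) WHERE FooKey = 42; -- seek: a handful of logical reads

SET STATISTICS IO OFF;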

September 10, 2008 1:15 PM
 

Roy Harvey said:

It is easy for the sort of DBAs who read blogs about databases to say they will still Do the Right Thing by setting up good indexes, but given the number of installations today that don't index worth beans, I suspect that if such shops get that sort of high-performance hardware it will take even longer for them to notice the mess they are in.

September 10, 2008 9:52 PM
 

Brad Featherstone said:

A couple of comments:

1) Money - Enterprise-grade SSD has been available for some time now, at prices that will freeze your nadgers or get you a Toyota Corolla. Like any enterprise-grade storage, it is tested to a 24 x 7 x 365 duty cycle for those Mean Time Between Failure numbers you read. The commercial and personal grades of storage do not undergo such severe testing; in fact, a lot of the lower-grade hardware is actually enterprise-grade test failures. Remember, it is just like any other hardware - you get what you pay for.

B) Features - I have been in the IT industry for 20+ years, from micro to mini to big iron to client-server to n-tier to whatever comes next, bla, bla, bla...  I have used multiple DB products (all better than the others, of course).  There is one rule that seems to be true: No matter how much horsepower 'ya got under the hood, it will always be pushed to the limit. I have used SSD and it is good! And the folks I report to wanted more. So it goes.

III) 'Faster' - Wazzat mean to you today? Speed and efficiency are two different things. We all know of (and have sometimes built) very fast systems that bog down under load because they are not very efficient.

Four-o) The biggest design woe that enterprise-grade SSD has is heat dissipation. Burns can happen when you use a 2GB RAM laptop. Now consider a 40GB or 400GB SSD. Are we going to go back to fluid-cooling (money) our machines? S’mores anybody?

And Last:

Face it. We will all be doing tuning of code in databases until we bail out of being database folks. It is the Code Monkey Full Employment Act of 2008 and 2009 and 2010 and...

Which ain’t a bad thing, come to think of it.

;-)  Brad

September 30, 2008 6:58 PM
 

Brad Featherstone said:

I finally found a decent reference for the performance & economics of SSD with comparisons to disk drives.  

It is at http://www.snia.org/education/tutorials (Storage Networking Industry Association). The article is about two-thirds of the way down the page and is titled "Solid State Storage for the Enterprise", by Brian McKean and Woody Hutsell.

December 3, 2008 7:00 PM
 

Kalen Delaney said:

Hi Brad

Thanks for following up on this... looks like a lot of good reading as soon as I get my book done!

~Kalen

December 3, 2008 7:35 PM
