THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Joe Chang

Add 54 new indexes to a table that already has 22 indexes

This was the recommendation given by Micrsoft CSS (formerly PSS), along with 200 new indexes on other tables, in response to a case filed by a local company. The new index list was basically the output of the query from Bart Duncan’s blog "Are you using SQL's Missing Index DMVs?" for generating SQL with the missing index DMVs. The Bart Duncan blog has a several qualifiers, like 1) the missing index DMVs are not a substitute for DTA, 2) update overhead is not taken into account and 3) that the index key column order may not be ideal.

What was not mentioned is that one should never blindly apply the missing index items. In addition to sort order by estimated impact, it is imperative to look for opportunities to consolidate indexes. Indexes with the same key columns, but different include lists definitely should be consolidated. If there are common columns in the key list as well as differences, then so long as the common columns are sufficiently selective, the remaining key columns could be moved to the include list. Another important point: it is not always necessary to have a fully covered index, especially if the include list becomes overly fat. If the query involves only a few rows, but has many columns in the select list, this can be handled without a covered index. Finally, new indexes should be applied in batches of a few at a time. Perhaps I should elaborate on how to handle the missing indexes DMVs later.

Now I don’t want to pick on CSS, especially considering that a reasonable amount of time to be spent on looking over the data for a single incidence is about 1 hour. First level technical support people usually do not have more than a few years experience. (This is an inescapable fact. There will not be experienced people if there is not a opportunity to get experience to begin with.) The way it should work is the first level is backed by a second-level with mid-level experience, that in turn are backed by a third level of deep experts with serious problem solving skills. The first level people work under the supervision of the second-level people with the procedures and tools setup by the 3rd level people.

The reason I bring this up is the 250 new indexes was accompanied by a recommendation to engage Microsoft Professional Advisory Services. Previously, PSS was primarily interested in break-fix situations. Other services were to be provided onsite by MCS or Premier. But these are often purchased in big packages, leaving a wide gap between a single incident CSS case and the entry MCS/Premier package. With SQL Server 2005 and later, it is now possible to collect many of the key performance data that  are necessary to provide remote adivsory services. But if it is standard practice at CSS to blindly recommend the full set of the missing indexes DMV, then someone there is !@#$%^&*

ps, Index Selectivity
I suppose I could mention the index selectivity rule for an execution plan involvin a single nonclustered with key lookup. The index selectivity must be better 1 row per 3.5-4 leaf level data pages (index_id 1 for clustered index, 0 for heap). If there are 1000 leaf level pages, then the index must reduce row count to fewer than 250 (high parallel plan) to 285 (non-parallel plan) that will require a key lookup. Calculate this by total rows x density / pages, where density is from the DBCC SHOW_STATISTICS density vector, or by EQ_Rows/pages where EQ_ROWS is from the histogram. My ExecStats tool calculates this, docs on ExecStats doc.

Cluster Key
Missing Indexes does not make recommendations on which should be the cluster key. Use the Index Usage DMV, if one nonclustered index is used for the plurality of seeks and involves a key lookup, then that might be the better choice for the cluster key. Many times, the cluster key is just the identity column, but the majority of access are through the a foreign key from a parent table. So the primary key/cluster key should really be parent key followed by identity.

Update Overhead 
The index update maintenance overhead is somewhat more than the key lookup cost. So if there are more updates than index accesses, this should be weight against alternative execution plans. Of course, a large table scan is far more expensive than a few row updates.

ps I talked to Bob Ward about this at PASS, and he says that CSS should never recommend a blind application of the missing index DMV contents, so this was probably just a case of a young guy on having learned a few things, thinks he ready to give advice without direct supervision. More later 

Published Monday, November 08, 2010 10:53 AM by jchang
Filed under:

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

No Comments

Leave a Comment

(required) 
(required) 
Submit

About jchang

Reverse engineering the SQL Server Cost Based Optimizer (Query Optimizer), NUMA System Architecture, performance tools developer - SQL ExecStats, mucking with the data distribution statistics histogram - decoding STATS_STREAM, Parallel Execution plans, microprocessors, SSD, HDD, SAN, storage performance, performance modeling and prediction, database architecture, SQL Server engine

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement