There are two ways to test how your queries behave on huge amounts of data. The simple option is to actually use them on huge amounts of data – but where do you get that if you have no access to the production database, and how do you store it if you happen not to have a multi-terabyte storage array sitting in your basement? So here’s the second ...

The session "Skewed Data, Poor Cardinality Estimates, and Plans Gone Bad" by Kimberly Tripp (@KimberlyLTripp) has been published on the SQLPASS TV channel. Abstract: When data distribution is heavily skewed, cardinality estimation (how many rows the query optimizer expects each operator to process) can be wildly incorrect, resulting in ...

This is the fifth and final part of the fraud detection whitepaper. You can find the first part, the second part, the third part, and the fourth part in my previous blog posts about this topic. The Results: In the original fraud detection whitepaper I wrote for SolidQ, I was advised by my friends to include some concrete and simple numbers to ...

This is the fourth part of the fraud detection whitepaper. You can find the first part, the second part, and the third part in my previous blog posts about this topic. Data Mining Models: We create multiple mining models by using different algorithms, different input data sets, and different algorithm parameters. Then we evaluate the models in ...

This is the third part of the fraud detection whitepaper. You can find the first part and the second part in my previous blog posts about this topic. Data Preparation: The problem of credit card fraud detection is not trivial. With every transaction processed, only a limited amount of data is available, making it difficult if not impossible to ...

Happy Fall! It’s a beautiful October here in Minneapolis / Saint Paul. In preparation for my hometown SQL Saturday this weekend, as well as the PASS Summit, I offer an update to the Rules-Driven Maintenance code I originally published back in August 2012. It’s hard to believe this thing is now more than two years old – it’s been an incredible ...

I have been working on new features for the Rules-Driven Maintenance Solution, including per-index maintenance preferences and more selective processing for statistics. SQL Server stats is a topic I knew about at a high level, but lately I have had to take a deeper dive into the DMVs and system views and wrap my head around things like ...

This post is basically to answer a question asked in class this week: How can we get the last statistics update date for ALL user tables in a database?
After working on the query for a while, I realized that the new metadata function I posted about here can give you that info easily:
SELECT object_name(sp.object_id) as object_name,name as ...
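For readers who want something to run while the excerpt above is cut off, here is a minimal sketch of one way to answer the same question. It is not the author's query: it uses the documented `STATS_DATE()` function with the `sys.tables` and `sys.stats` catalog views instead of the newer metadata function, and the column aliases are illustrative choices.

```sql
-- Sketch only (not the truncated query from the post): last statistics
-- update date for every statistics object on user tables in the current
-- database, using STATS_DATE() rather than the newer metadata function.
SELECT
    SCHEMA_NAME(t.schema_id)            AS schema_name,  -- alias chosen for illustration
    t.name                              AS table_name,
    s.name                              AS stats_name,
    STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.tables AS t
JOIN sys.stats  AS s
    ON s.object_id = t.object_id
WHERE t.is_ms_shipped = 0               -- skip tables created by SQL Server itself
ORDER BY last_updated;
```

Run it in the context of the database you want to inspect; both catalog views are scoped to the current database.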

A week ago, I taught my SQL Server 2012 Internals class to a great group of very interactive students. Even though a dozen of them were taking the class remotely, there were still lots of really great questions and lots of discussion.
One of the students asked if I could summarize all the settings that I recommended changing from the ...

I just started using a new DMV (one that’s actually an ‘F’ not a ‘V’, as in Function) that gives us more info about distribution statistics. It returns info about the last statistics update date (which is also available with a function STATS_DATE()). It also provides the number of rows sampled when the statistics were last updated. This is ...
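The excerpt does not name the function. Assuming it is `sys.dm_db_stats_properties` (introduced in SQL Server 2008 R2 SP2 and SQL Server 2012 SP1), a minimal sketch of comparing the sampled row count with the table row count might look like the following. The choice of columns and the user-table filter are illustrative, not taken from the post.

```sql
-- Sketch under the assumption that the function discussed is
-- sys.dm_db_stats_properties(object_id, stats_id).
SELECT
    OBJECT_NAME(s.object_id) AS table_name,
    s.name                   AS stats_name,
    sp.last_updated,          -- same information STATS_DATE() returns
    sp.rows,                  -- rows in the table or index at the last update
    sp.rows_sampled,          -- rows actually read to build the statistics
    sp.modification_counter   -- changes to the leading column since the last update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.object_id, 'IsUserTable') = 1;
```

`CROSS APPLY` is used because the function takes one `object_id` and `stats_id` pair per call; applying it to each row of `sys.stats` covers every statistics object in the database.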
