Adam Machanic, Boston-based SQL Server developer, shares his experiences with programming, monitoring, and performance tuning SQL Server. And the occasional battle with the query optimizer.
An especially clever community member was kind enough to reverse-engineer the video stream for me, and came up with a direct link to the PASS TV video stream for my Query Tuning Mastery: The Art and Science of Manhandling Parallelism talk, delivered at the PASS Summit last Thursday. I'm not sure how long this link will work, but I'd like to share it for my readers who were unable to see it in person or live on the stream.
Skip past the keynote, to the 149 minute mark.
For the second year in a row, I was asked to deliver a 500-level "Query Tuning Mastery" talk in room 6E of the Washington State Convention Center, for the PASS Summit. (Here's some information about last year's talk, on workspace memory.) And for the second year in a row, I had to deliver said talk at 10:15 in the morning, in a room used as overflow for the keynote, following a keynote speaker that didn't stop speaking on time. Frustrating!
Last Thursday, after very, very quickly setting up and getting sound and video checks, the rest of the talk went surprisingly smoothly. My deck--a brand new version created specifically for PASS--helped me get across the message I wanted to communicate, my demos ran without any failure, and my jokes didn't drive too many people out of the room before the end of the talk. I even received a round of applause when I managed to take a 26 minute query plan and, using a few query rewrites, deliver the same exact data in 9 seconds. That, I have to say, was pretty cool.
Here's the abstract for the session:
Query Tuning Mastery: The Art and Science of Manhandling Parallelism
As a database developer, your job boils down to one word:
performance. In today's multi-core-driven world, query performance is
very much determined by how well you're taking advantage of the
processing power at your disposal. Are your big queries using every
clock tick, or are they lagging behind? And if your queries are already
parallel, can they be rewritten for even greater speed?
In this session, you'll learn to take full advantage of SQL Server
query parallelism. After a terminology review and technology refresher,
the session will go deep, covering T-SQL patterns that allow certain
queries to scale almost linearly across your multi-core CPUs. You'll see
when and why the optimizer makes a parallel plan choice and how to
impact the decision. Along the way, you’ll manipulate costs and row
goals, challenge generally accepted tuning practices, and take complete
control of your parallel queries.
Since the talk was being broadcast live on "PASS TV," I had Paul White join me at the front of the room to moderate questions delivered via Twitter. This worked out reasonably well and I hope to do something similar in the future. Huge thanks to Paul for helping out -- and for giving me a really ugly scowl when one of my jokes fell totally flat.
Demos for the talk are attached. Let me know if you have any questions.
Thanks again to everyone who watched, either in person or at home. I had a blast. Hope you enjoyed it even half as much as I did!
Sometimes, in the interest of testing various scenarios that your server might encounter, it's useful to be able to quickly simulate some condition or another. I/O, memory, CPU pressure, and so on.
This latter one is something I've been playing with a lot recently. CPU pressure in SQL Server creates all sorts of interesting side-effects, such as exacerbating waits and making various other conditions much easier to reproduce.
In order to make this simpler, I've created the attached CLR library. This is a simple pair of stored procedures:
startSeeding takes two arguments: The first parameter is the number of threads you'd like to create, and the second is a bit, indicating whether or not the threads should be affinitized. If they are, the threads will run preemptively, and the CPU pressure will be of the external sort. If they are not affinitized, the pressure will be of the internal sort, and you'll be able to see SQL Server scheduler contention. Of course you can run the procedure two or more times and create a mix of threads, if that's what you'd like to do.
stopSeeding takes no arguments. It attempts to cancel all of the threads that have been started.
"Installing" the script is easy. Create a database on a server that has CLR enabled. Make the database trustworthy to bypass CLR security restrictions. And then run the script to create the objects. The code was compiled for .NET 3.5, so this should work on either SQL Server 2008, 2008R2, and 2012.
Be careful about how many threads you create. It's amazing how quickly things can get out of hand. Start small. Trust me on this.
On that note, I make absolutely no guarantees that this won't crash your server, or worse. As a matter of fact, that's kind of the point!
Enjoy, and let me know if you have any feedback, feature requests, or whatever.
You might have heard of Red Gate's famous SQL in the City events: free, full-day educational events where you can learn from Red Gate's own evangelists in addition to various MVPs and other guests. With just a tiny bit of marketing thrown in for good measure (don't worry, it's not a daylong sales pitch).
Red Gate is doing a US tour this fall, and I'm happy to note that my fair city of Boston is one of the stops... and I am one of the speakers.
The event takes place on October 8. I'll be delivering a talk entitled "The Ten Commandments of SQL Server Monitoring."
I've not yet received all ten commandments, but my tablets are initialized and my stylus is at the ready. I'm sure I'll have some good information to share with you come October.
Hope to see you there!
Want to do interesting, in-depth SQL Server development work for a great company?
If the answer is yes, you should drop me a line immediately. Send me your resume at [my first name] @ [the name of this site].
This job is working on a data warehouse for a financial services company in Boston, MA. I cannot publicly mention the company name or say much about it, but if you e-mail me a resume I'll be happy to share more information. What I can say is that it's a solid company with a great reputation in the industry, and a fantastic place to work. The position is located in Boston, MA, and either requires that you live here or travel here (at your own expense) on a fairly regular basis. The company would prefer to hire fulltime employees, but there may be room for a contractor -- e-mail me and we'll see.
The project itself is to work with a small team of dedicated database developers, on a project very core to the company's mission. There are a lot of interesting challenges, there is a lot of work to be done, and the project is a very high visibility endeavor. The team is small and the project is somewhat loosely managed, so team members are expected to be the kind of people who can drive themselves forward, set their own goals, and make progress without a whole lot of hand holding. If you're the kind of person who is a self-starter and loves to learn, this is an environment in which you can really flourish.
Task-wise you'll be expected, depending on the day, to do some database design, some ETL, and usually a lot of T-SQL development. Performance is of paramount importance and the company is looking for people who have good to excellent understanding of SQL Server internals and can handle complex query tuning challenges. If you're the kind of person who gets excited when Paul White writes a new blog post, you'd be a great fit.
I can't say anything more here; if you're interested, drop me a line and I'll be happy to chat with you either via email or over the phone. I think this is a great opportunity and I hope some of my readers will want to give it a shot.
If you're reading this, please take one minute out of your day and vote for the following Connect item:
If you're really interested, take three minutes: run the steps to reproduce the issue, and then check the box that says that you were able to reproduce the issue.
Imagine that ten hours ago you started a big transaction. You're sitting there waiting for it to finish, and it's running, and running, and running. At some point, you notice that the drive with the transaction log has filled up, so you create a second log file. And the transaction is sitting there running, and running, and running. Is it still doing work? Or did it catch the low disk space issue and start rolling back? Wouldn't it be great if you could actually answer that question? Are you surprised to learn that you can't answer that question?
SQL Server currently has a few DMVs that are supposed to tell us the status of a given request, transaction, and so on. These DMVs are unreliable in the vast majority of situations. That means that we are unable to answer important questions like, "is my transaction still doing any work, or did it die three hours ago?"
The SQL Server team needs to fix this. End of story. Please vote and help me convince the powers that be to do the right thing.
Another year, another fantastic Atlanta SQL Saturday. Hats off to the team that created the event for delivering a top notch day for the attendees.
Thanks to everyone who attended my "Query Tuning Mastery: The Art and Science of Manhandling Parallelism" session. Demos are attached.
Enjoy, and I'll see you at the next one!
In case you haven't noticed, Data Education (the training company I started a couple of years ago) has expanded beyond the US northeast; we're currently offering courses with top trainers in both St. Louis and Chicago, as well as the Boston area.
The courses are starting to fill up fast—not surprising
when you consider we’re talking about experienced instructors like Kalen Delaney, Rob
Farley, and Allan Hirt—but we have still have some room. We’re very excited about bringing the highest
quality SQL training to the middle of the country.
If you’re interested
in taking one of the courses (or more! multiple registration discount!), you're in luck: just enter the
special discount codes for SQLblog.com readers:
Without further ado, the course descriptions (full outlines
at DataEducation.com) … hope to see many of you soon in a Data Ed classroom!
Server 2005, 2008, and 2012 Internals and Query Tuning (May 7-11; St. Louis,
MO): Kalen Delaney is back with her blockbuster internals course, now with information on SQL Server 2012. In this world-renowned five-day course, students will learn how to take a long, hard look at the SQL Server relational engine. After better understanding what’s happening internally, students will get the opportunity to investigate how internals can affect how you set up your databases for maximum performance and reliability. Query tuning within SQL Server 2005 and 2008, as well as parts of SQL Server 2012, will be discussed in depth.
(This course is geared toward both SQL Server DBAs and developers with some experience with application development and architecture.)
T-SQL Querying and Reporting: Building Effectiveness (May 14-16; Chicago, IL): You know this one will be good if MVP Rob Farley is flying all the way from his native Australia to our classroom in Chicago. This three-day course is a journey through the more advanced side of T-SQL, designed to help you create queries that are simply more effective. Rob Farley takes his students on a road trip of unlearning bad habits of querying and reporting, and into a deeper understanding of what makes a query effective. Common student feedback for this course includes “You’ve made me want to go back and rework every query I’ve ever written” and “I didn’t realize how much I didn’t understand about even the fundamental parts of T-SQL.”
(The T-SQL taught will be primarily for SQL Server 2012, but most of the principles taught will also apply equally to SQL Server 2005 and SQL Server 2008.)
This is Mission
Critical: High Availability for SQL Server 2008 and 2012 (June 18-20; Boston,
MA): This course is another perfect example of learning an in-depth topic straight from one of the world’s biggest experts in that area. In this three-day course, Windows, clustering, and SQL Server high availability/disaster recover expert Allan Hirt will dive into new features and enhancements within SQL Server 2012 (as well as 2008), including the enhanced multi-site failover clusters and AlwaysOn availability groups. Windows 8 Server and Server Core will also be discussed as they relate to highly available SQL Server deployments.
(This course is perfect for SQL Server DBAs eager to get an early understanding of SQL Server 2012 and what is means for creating and maintaining mission-critical database systems.
Let me know if you have any questions about these or other Data Education course offerings.
Thanks to everyone who attended my sessions last Friday and Saturday at SQLbits! It was great to meet many new people, not to mention spending some time exploring one of my favorite cities, London.
Attached are the demos for each of the two talks I delivered:
Query Tuning Mastery: The Art of and Science of Manhandling Parallelism
As a database developer, your job boils down to one word: performance. And in today's multi-core-driven world, query performance is very much determined by how well you're taking advantage of the processing power at your disposal. Are your big queries using every available clock tick, or are they lagging behind? And if your queries are already going parallel, can they be rewritten for even greater speed? In this session you will learn how to take full advantage of parallelism, from a developer's point of view. After a quick terminology review and technology refresher the session will go deep, covering T-SQL patterns that allow certain queries to scale almost linearly across your multi-core CPUs. Alas, not all T-SQL queries can go parallel, so you will also learn to watch for those things that can restrict the query optimizer's decisions. Along the way you’ll learn to manipulate costs and row goals, challenge generally accepted tuning practices, and take complete control of your parallel queries.
... and ...
Query Tuning Mastery: Workspace Memory Internals
As SQL Server professionals, we often think of memory in vague, instance-level terms: buffer pool, procedure cache, Virtual Address Space, and so on. But certain tasks require a more in-depth focus, and query tuning is one of them. Large, complex queries need memory in which to work--workspace memory--and understanding the how's, when's, and why's of this memory can help you create queries that run in seconds rather than minutes. This session will give you an in-depth understanding of how the optimizer makes its query memory decisions, with lots of tips and tricks along the way to help you guide the process for top performance. If you work with large queries and are serious about achieving scalability and consistently great performance, you owe it to yourself to attend this session.
It's been several months since the last Who is Active fix, so I thought I'd call this one out specifically via a blog post.
v11.11 contains a few minor fixes and enhancements, which you can read about on the download page.
This will (I believe) be the last release that is compatible with SQL Server 2005 and 2008. v11.xx has been quite a stable release in general, with very few bugs found in the 11 months since I've released it--I do not expect to need to release any more fixes.
In the meantime, I have started work on a new version that will take advantage of several SQL Server 2012 features, as well as some SQL Server 2008 features that I was unable to previously leverage due to my efforts to keep the procedure backward-compatible with SQL Server 2005. The new version will be released in a few months, once I've had time to both implement and thoroughly test the new functionality.
As always, I appreciate any comments or feedback.
Today is the last day of the annual SQL Server Connections show in Vegas, and I've just completed my third and final talk. (Now off to find a frosty beverage or two.)
This year I did three sessions:
SQL302: Parallelism and Performance: Are You Getting Full Return on Your CPU Investment?
Over the past five years, multi-core processors have made the jump from semi-obscure to commonplace in the data center. While servers with 16, 32, or even 64 cores were once an out-of-reach choice for all except the biggest databases, today we regularly expect such specifications in even our lower-end servers. So, are you getting everything you can out of the wealth of processing power at your disposal? By default, SQL Server automatically handles many of the parallel processing details, but DBAs still need to consider a number of things if they want to ensure that they’re taking full advantage. In this session, we will take a detailed look at the ins and outs of how and why SQL Server processes queries in parallel, with examples to help you identify which queries are being processed in parallel and what they’re doing. You will then learn the various methods of controlling parallel processing: SQL Server configuration options, the SQL Server 2008 Resource Governor, and query-level hints. The information you take from this session will enable you to immediately evaluate, understand, and improve the state of parallel processing on your servers.
SQL405: Query Tuning Mastery: The Art and Science of Manhandling Parallelism
As a database developer, your job boils down to one word: performance. And in today’s multi-core-driven world, query performance is very much determined by how well you’re taking advantage of the processing power at your disposal. Are your big queries using every available clock tick, or are they lagging behind? And if your queries are already going parallel, can they be rewritten for even greater speed? In this session you will learn how to take full advantage of parallelism from a developer’s point of view. After a quick terminology review and technology refresher the session will go deep, covering T-SQL patterns that allow certain queries to scale almost linearly across your multi-core CPUs. Alas, not all T-SQL queries can go parallel, so you will also learn to watch for those things that can restrict the query optimizer’s decisions. Along the way you’ll learn to manipulate costs and row goals, challenge generally accepted tuning practices, and take complete control of your parallel queries.
SQL323: What’s Really Happening on Your Server? 15 Powerful SQL Server Dynamic Management Objects
There are two kinds of DBAs in this world: those who scratch their heads, unsure of how to find answers, and those who demand real-time, comprehensive insight. This session is for the latter type, the Type A DBAs who are serious about managing their servers as efficiently as possible. The Dynamic Management Objects – a set of views and functions that first shipped with SQL Server 2005 – are a window into the inner workings of your SQL Server instance. Locked within the objects is the information you need to help you solve virtually any performance problem, quickly debug issues as they’re occurring, and gain insight into what’s actually happening on your server, right now. This session is a fast-paced tour of the ins, outs, whys, hows, and even pitfalls of 15 of the most important views and functions – information gleaned from heavy use of the objects in a number of environments over the past five years. You will learn how to understand transaction behavior, locking, wait statistics, sessions, requests, and much more. No longer will you need to scratch your head, wondering what is slowing down your queries: You will be the master of your SQL Server instance.
Huge thanks to everyone who decided to attend one of my talks! The decks are available on the conference DVD, and the demos are attached to this post.
Please let me know if you have any questions about the material. Thanks again, and enjoy!
If the title of this post doesn't have you scratching your head, you may have been paying very rapt attention last time you saw me speak.
I love the portability of AdventureWorks and the fact that anyone can download it. Since it was released I've used it almost exclusively for demos in talks I've written. However, In recent months I've been moving away from the core tables in the database. Fact is, they're just a bit too small to show performance artifacts of parallelism, spilling to tempdb, and the like -- the topics that I'm currently enamored with.
Instead I've started using a couple of tables modeled after Production.Product and Production.TransactionHistory. These tables are called dbo.bigProduct and dbo.bigTransactionHistory, and I refer to them collectively as bigAdventure.
The bigAdventure tables are several times larger than their AdventureWorks brethren, and allow me to easily create queries that overwhelm the 8 cores on my laptop. Which is exactly what I need to emulate the large data warehouse queries we see in the real world. So far I've been able to do most of what I need with only the two tables, but I hope to add more to the mix soon (for example, I've been
working on a bigger version of Sales.CurrencyRate to help illustrate
some SQLCLR techniques).
The current bigAdventure script is attached to this post. I meant to include it in the demo download for my PASS session, also posted today, but forgot to put it into the ZIP file. I thought that someone out there who didn't attend my session might want to use it, so here you are.
What a rush. Standing on the stage in an almost-full 1,000-person room, I (very) momentarily wondered what I'd been thinking when I submitted a 500-level talk for the biggest SQL Server conference in the world. But despite a rough start--my laptop crashed and I had to reboot it two minutes into the talk--I found my rhythm and the entire 90 minutes went by in a flash. I wish I'd been able to take 90 more!
The scene? PASS Summit 2011. Friday, October 14, 10:15 a.m. (Room 6E, to be exact.) The last day of one of the best PASS Summits I've had the pleasure of attending.
The topic? A fairly obscure area of SQL Server, called workspace memory. Here's the abstract for the talk:
Query Tuning Mastery: Zen and the Art of Workspace Memory
As SQL Server professionals, we often think of memory in vague, instance-level terms: buffer pool, procedure cache, Virtual Address Space, and so on. But certain tasks require a more in-depth focus, and query tuning is one of them. Large, complex queries need memory in which to work--workspace memory--and understanding the how's, when's, and why's of this memory can help you create queries that run in seconds rather than minutes. This session will teach you how to guide the query processor to grant enough memory for top performance, while also keeping things balanced for the sake of concurrency. You will learn advanced monitoring techniques, expert-level application of specialized query hints, and the memory internals needed to put it all together. If you work with large queries and are serious about achieving scalability and consistently great performance, you owe it to yourself to attend this session.
If you were in the audience, I thank you for choosing my session over the many others that were running concurrently. I had a great time, and I hope you did too.
The demos for the talk are attached to this post. Apologies, but I am not sharing the deck at this time as I'm going to be integrating it into a larger course that I hope to start delivering next year. (Through Data Education, naturally!)
Enjoy! And as always, let me know in the comments if you have any questions.
Last Saturday, September 17, I was lucky to be able to present two sessions at an excellent SQL Saturday in the Atlanta area. The day drew a large crowd and had a great speaker lineup. All in all, a huge success, and a very well-managed event. Congratulations to the organizers!
One of the highlights for me, aside from speaking, was helping out with logistics the night before and creating a cocktail for the event. Check out Audrey Hammonds's blog for details.
My two sessions were:
"SQL Server Parallelism and Performance"
Over the past five years,
multi-core processors have made the jump from semi-obscure to
commonplace in the data center. Today we regularly expect to see 16, 32,
or 64 cores in even our lower-end servers. Are you getting everything
you can out of the wealth of processing power at your disposal? Attend
this session to take a detailed look at how and why SQL Server processes
queries in parallel, as well as methods for controlling parallel
processing via configuration options, the Resource Governor, and
query-level hints. This session will enable you to immediately evaluate,
understand, and improve the state of parallel processing on your
"15 Powerful SQL Server Dynamic Management Objects"
The Dynamic Management
Objects--a set of views and functions that first shipped with SQL Server
2005--are a window into the inner workings of your SQL Server instance.
Locked within is the data you need to help solve virtually any
performance problem, quickly debug issues, and gain insight into what's
actually happening on your server, right now. This session is a
fast-paced tour of the ins, outs, whys, hows, and even pitfalls of 15 of
the most important views and functions--information gleaned from heavy
use of the objects in a number of environments. You will learn how to
understand transaction behavior, locking, wait statistics, sessions,
requests, and much more. Attend this session and you will be the master
of your SQL Server instance.
I promised to share the slides and demos from the second session, and those are attached to this post.
Many thanks to everyone who attended my sessions and for all of the positive feedback I received! (I would also thank anyone who gave me negative feedback, but in this case the worst I received was that I had a typo on one of my slides.) I had a great time, and hope to return to Atlanta again sometime soon.
Autumn is creeping inevitably closer here in the US, and that means that speaking season is about to kick into high gear. Here's my current schedule for the remainder of the year:
September 8, 17:00 GMT (online) - 24 Hours of PASS webcast: "Baseline Basics or: Who Broke the Database?"
In this session, excerpted from my PASS Summit precon, I'll explain the whys and hows of using baselines to assist with performance tuning. If you find yourself more often than not tuning reactively rather than proactively, this session is for you. This is a free webcast, so why not join in?
September 8, 18:30 EDT (Waltham, MA) - New England SQL Server Users Group: "Windowing Functions in SQL Server 2008, Denali, and Beyond"
Two in one day! This session will kick off the 2011/2012 New England SQL Server season, and will cover what are (in my ever-so-humble opinion) the most important T-SQL enhancements in the past three versions of SQL Server: windowing function enhancements. I'll discuss what's there in today's shipping versions, and the new and incredibly powerful functionality that Denali brings to the table. If you're in the Boston area, don't miss it!
September 17 (Atlanta, GA) - SQL Saturday #89: (Two Talks)
After the great time I had at this year's TechEd show in Atlanta I could hardly wait to get back. Luckily, the fantastic SQL Server community in the Atlanta area scheduled this event, which gave me the perfect excuse to pack my bags for a return trip. The speaker lineup for this event is rock-solid, and I'll be contributing with two talks: my introductory dive into SQL Server parallelism, and an overview of my 15 favorite activity monitoring dynamic management objects. This should be a great event by anyone's standards, and it's free, so if you live nearby you have literally no excuse not to attend.
October 11, 08:30 PDT (Seattle, WA) - PASS Community Summit Pre-Conference Seminar: No More Guessing! An Enlightened Approach to Performance Troubleshooting
No more guessing! It's not just a catchphrase; it's a way of life. When faced with performance problems we have a choice: we can either run around panicking, wasting everyone's time (including our own), or we can use the huge amount of information at our disposal to figure out what's actually wrong and fix it. As the calm and collected type, I prefer the second option, and so should you. If you'll be attending the PASS conference, join me on Tuesday to learn how you, too, can quickly and accurately pinpoint the root cause of your performance issues.
October 12-14 (Seattle, WA) - PASS Community Summit Spotlight Session: Query Tuning Mastery: Zen and the Art of Workspace Memory
Your query is running, and it needs to sort some data. Or to hash some data. Or to perform a parallel operation. These things take memory, and as any SQL Server professional knows, in the world of SQL Server memory is worth much, much more than its weight in gold (even given today's hugely-inflated prices). Attend this session to learn the ins and outs of workspace memory: what it is, why it's needed, where the memory comes from, and most importantly, how to control it to make certain queries faster and other queries not have to wait as long. Workspace memory tuning is a mostly untapped performance opportunity that many DBAs can heavily benefit from learning how to leverage.
November 1-3 (Las Vegas, NV) - SQL Server Connections: (Three Talks)
My final speaking engagement of the year will be at the always-fun SQL Server Connections show in Vegas. (It's in Vegas! How could it not be fun?!) I'll be doing three talks during the course of the show: An introductory talk on my favorite topic the past couple of years, parallelism in SQL Server; a much more advanced parallelism talk to build on that one; and a talk on the various dynamic management objects that can be used in the quest for ultimate SQL Server performance. Save a spot for me at the poker table!
I'm really looking forward to these events. If you're going to be there let me know in the comments, and/or feel free to find me at any of the shows and say hi. (Buying me a drink or two wouldn't hurt either.) See you there!