<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tags 'Performance' and 'SCOM'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Performance,SCOM&amp;orTags=0</link><description>Search results matching tags 'Performance' and 'SCOM'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>SCOM, 90 days in, III. Stuff to Add</title><link>http://sqlblog.com/blogs/merrill_aldrich/archive/2011/03/03/scom-90-days-in-iii-stuff-to-add.aspx</link><pubDate>Thu, 03 Mar 2011 06:57:44 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:33880</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;p&gt;This is the third installment of a series on our deployment of System Center at my workplace, with an emphasis on the SQL Server MP. &lt;/p&gt;  &lt;p&gt;At this point we’ve got Operations Manager installed and up and running, and we’ve been able to categorize all the monitored servers into production, preproduction, test and DR using groups that have dynamic membership rules. We’ve got the SQL management pack working with out-of-the-box settings, and used it to locate all the SQL Server stack services like the engine, Reporting Services, Integration Services, etc. The out-of-the-box settings are pretty good, but they don’t quite fit our environment. Here are some things we had to add, and that you might want to consider for your SCOM config.&lt;/p&gt;  &lt;h2&gt;Alerts on Job Failure / Long Jobs&lt;/h2&gt;  &lt;p&gt;The first thing we noticed when watching the SQL MP’s behavior is that SQL Agent job failures don’t alert very effectively. 
Discovery of specific jobs, and then monitoring of their last run status, may be disabled out-of-the-box. When turned on, they only show as “warning” and not “critical.” This is apparently by design, in order to cut down on the noise in environments with a lot of failing jobs. In place of the monitoring for individual jobs is a generic “some job failed” monitor at the level of the whole SQL Agent for the instance, but it doesn’t say which job failed.&lt;/p&gt;  &lt;p&gt;In our group, we really need to know if a specific production job fails, so one of the first steps was to create an override for this to make job failures fire a critical alert, so that they would trigger our email and text notifications. This is a pretty straightforward override that is documented on the &lt;a href="http://www.itwalkthru.com/2010/12/configuring-system-center-operations.html"&gt;web&lt;/a&gt; – basically you just have to override the severity of the job failure and configure it to trigger an alert. Second, we disabled the default monitor that alerts generically on any job failure, since those would essentially be duplicate alerts.&lt;/p&gt;  &lt;p&gt;So, if you are headed down this road:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Check to see the overall “health” of the SQL Agent jobs in your environment first, to determine if it’s practical to monitor them individually&lt;/li&gt;    &lt;li&gt;If so, you can replace the generic “some job failed” monitoring with monitoring of each individual job, by overriding these two monitors&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Next, there’s a monitor for long-running jobs. This is probably appropriate for many jobs, but there are some SQL Agent jobs that are designed basically to run forever, and will trigger false alarms. BizTalk, for example, has some maintenance jobs that just fire up and then execute in a loop indefinitely. 
We had to go through and cherry-pick each of these jobs to override/suppress the long-running job monitor.&lt;/p&gt;  &lt;h2&gt;Missing Perf Counter Collection&lt;/h2&gt;  &lt;p&gt;For whatever reason, probably because they can’t please everybody all the time, some perf counters that I rely on are not configured in the default management pack, and you might consider adding these as we did.&lt;/p&gt;  &lt;p&gt;The first is the &lt;strong&gt;Memory Grants Pending&lt;/strong&gt; counter in SQL Server. This counter basically indicates a specific type of memory pressure where queries are stacking up in a queue waiting for execution memory to become available; it typically happens in a data warehouse type of workload with huge, nasty queries that consume large amounts of execution memory. I happen to have a server that does this fairly often.&lt;/p&gt;  &lt;p&gt;We added a custom monitor into the Performance health rollup category for this counter, and set it up as off by default. We then created overrides to enable it on the data warehouse server(s) where this type of thing is an issue:&lt;/p&gt;  &lt;p&gt;&lt;a href="http://sqlblog.com/blogs/merrill_aldrich/MemGntsPendingCapture_273B7A7C.jpg"&gt;&lt;img style="background-image:none;border-bottom:0px;border-left:0px;padding-left:0px;padding-right:0px;display:block;float:none;margin-left:auto;border-top:0px;margin-right:auto;border-right:0px;padding-top:0px;" title="MemGntsPendingCapture" border="0" alt="MemGntsPendingCapture" src="http://sqlblog.com/blogs/merrill_aldrich/MemGntsPendingCapture_thumb_38AE6C05.jpg" width="644" height="495" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Setting up a monitor like this can be done simply with a custom monitor creation wizard in the SCOM console GUI – the tricky bit is getting the variables into the dialog box (in this case the Object: field) that allow SCOM to handle the multiple-instance nature of SQL Server correctly. 
That is, there could be counters for instance1 and instance2 and instance3 on the same server, and the monitoring has to account for that. It manages that by allowing you to put variables into the setup that will be substituted for things like instance name at run time. This article was very helpful on instance-aware monitors: &lt;a title="http://blogs.msdn.com/b/boris_yanushpolsky/archive/2007/11/21/opsmgr-sp1-creating-rules-and-monitors-for-multi-instance-components.aspx" href="http://blogs.msdn.com/b/boris_yanushpolsky/archive/2007/11/21/opsmgr-sp1-creating-rules-and-monitors-for-multi-instance-components.aspx"&gt;http://blogs.msdn.com/b/boris_yanushpolsky/archive/2007/11/21/opsmgr-sp1-creating-rules-and-monitors-for-multi-instance-components.aspx&lt;/a&gt;&lt;/p&gt;  &lt;p&gt;Essentially, if you are going to be authoring monitors or rules for anything instance-able (SQL Server services, databases, etc.) then you really have to “grok” how to use the SCOM variables in the correct fields in the monitor or rule setup. It looks complicated, but I found I got the logic once I used it a couple times.&lt;/p&gt;  &lt;p&gt;The second is the &lt;strong&gt;Page Life Expectancy&lt;/strong&gt; counter. This is less for alerting, and more for tracking overall instance health and looking for chronic memory pressure.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Database Size. 
Definitely.&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;&lt;a title="http://blogs.technet.com/b/kevinholman/archive/2010/11/19/collecting-sql-database-size-as-a-performance-counter.aspx" href="http://blogs.technet.com/b/kevinholman/archive/2010/11/19/collecting-sql-database-size-as-a-performance-counter.aspx"&gt;http://blogs.technet.com/b/kevinholman/archive/2010/11/19/collecting-sql-database-size-as-a-performance-counter.aspx&lt;/a&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;Enough said.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;Mirroring Queues&lt;/strong&gt;&lt;/p&gt;  &lt;p&gt;Paul Randal posted some great guidelines about how to monitor for mirroring performance and issues, like &lt;a title="http://www.sqlskills.com/BLOGS/PAUL/post/Importance-of-monitoring-a-database-mirroring-session.aspx" href="http://www.sqlskills.com/BLOGS/PAUL/post/Importance-of-monitoring-a-database-mirroring-session.aspx"&gt;http://www.sqlskills.com/BLOGS/PAUL/post/Importance-of-monitoring-a-database-mirroring-session.aspx&lt;/a&gt;. If you use database mirroring, it’s logical to want to both gather stats about how it’s performing and be alerted to issues. We’re at the very beginning of that setup, but have at least got rules configured to monitor for the send and redo queue length on our mirrored databases. There’s also a solution you can download and implement here: &lt;a title="http://rburri.wordpress.com/2010/09/10/sql-server-db-mirroring-mp-update/" href="http://rburri.wordpress.com/2010/09/10/sql-server-db-mirroring-mp-update/"&gt;http://rburri.wordpress.com/2010/09/10/sql-server-db-mirroring-mp-update/&lt;/a&gt;&lt;/p&gt;  &lt;h2&gt;Disable Database Space Monitors on Mount Points&lt;/h2&gt;  &lt;p&gt;The SQL Server internal functions for reporting OS disk space available don’t work very effectively with mount points, and that issue shows in SCOM’s SQL MP, where the reporting on available disk space per database is just wrong. 
I’ve already &lt;a href="http://sqlblog.com/blogs/merrill_aldrich/archive/2010/12/03/operations-manager-sql-monitoring-issue.aspx"&gt;covered this&lt;/a&gt;, so I won’t go over it again. Hopefully this improves in a future version. We have elected to disable the collection of the invalid data, and instead monitor disk space and the space allocated vs. used inside the database files. It’s a workaround at best.&lt;/p&gt;  &lt;p&gt;Here’s how I implemented the workaround in SCOM:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;I made a group called “SQL Servers that use Mount Points.” No joke. I put in a static list of all the machines I know are configured this way.&lt;/li&gt;    &lt;li&gt;I set overrides on the collection of database free space and log free space to disable them for that group.&lt;/li&gt;    &lt;li&gt;We ensured the DBAs would be alerted on low disk space conditions from the Windows Server components, and we just have to sort of live with that instead.&lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;That’s it for this installment. Next up, monitoring for blocking, which is a much more complicated animal. Happy monitoring!&lt;/p&gt;</description></item><item><title>SCOM, 90 Days In, I</title><link>http://sqlblog.com/blogs/merrill_aldrich/archive/2011/02/23/scom-90-days-in-i.aspx</link><pubDate>Thu, 24 Feb 2011 04:11:51 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:33731</guid><dc:creator>merrillaldrich</dc:creator><description>&lt;p&gt;At my office we’re about 90 days into our implementation of System Center Operations Manager for Windows Server and SQL Server monitoring. All in all it’s been a good experience, and I’m really excited to have access to this tool. 
I’ve logged a fair number of years as a DBA on products like Idera’s SQL Diagnostic Manager and Quest Spotlight on SQL Server Enterprise (and “roll-your-own” solutions) in smaller environments, and liked them, but they always, in my experience, struggled with really large or complex server environments. So far, SCOM shines in that scenario. I have also recently come out of a bad experience with a big enterprise tool (that shall remain nameless) which I am quite happy to see shrinking in my rear-view mirror.&lt;/p&gt;  &lt;p&gt;I’m planning to put a few posts here with semi-organized thoughts and comments about starting out with SCOM and building an implementation for anyone else starting down this road. It’s been really fun to dig into.&lt;/p&gt;  &lt;h3&gt;Required Question: Why buy a monitoring solution?&lt;/h3&gt;  &lt;p&gt;This is a controversial topic with some DBAs. I respect those who like to create their own solutions, and those who feel that this activity is a big part of the job of a DBA. I especially respect the need to know what you are looking at regardless of tools – monitoring software is not a &lt;em&gt;substitute&lt;/em&gt; for expertise, it helps us to &lt;em&gt;apply&lt;/em&gt; expertise. But ultimately my take on this question comes down to &lt;a href="http://sqlserverpedia.com/blog/professional-development/opportunity-cost-versus-real-cost/"&gt;opportunity cost&lt;/a&gt;. Basically, if you write your own monitoring solution, you are robbing time and energy from other, more specialized activities that would directly improve your business, because you’re re-writing software that can be purchased off the shelf. &lt;/p&gt;  &lt;p&gt;The monitoring software from third-party vendors might not be perfect. Yes, we absolutely need to know what we are doing with SQL Server. And yes, you need to have that uncomfortable conversation with your boss about purchasing something with hard dollars instead of spending time (soft dollars) on development. 
However, the odds that an individual DBA can write a whole monitoring solution from the ground up, with features comparable to a purchased product, for less true, net cost than a smart software purchase, seem slim to me. You’ll pay anyway, in hours or FTEs.&lt;/p&gt;  &lt;h3&gt;What’s SCOM?&lt;/h3&gt;  &lt;p&gt;System Center Operations Manager (SCOM or OpsMan) is Microsoft’s enterprise server monitoring system. It’s a modular, distributed platform that can be configured to monitor practically every Microsoft server product from the OS to SQL Server to IIS, Exchange, and so on, and it can be augmented to monitor other systems with purchased or home-grown plug-ins or modules (“management packs” or “MPs” in SCOM terms). &lt;a title="http://www.microsoft.com/systemcenter/en/us/operations-manager.aspx" href="http://www.microsoft.com/systemcenter/en/us/operations-manager.aspx"&gt;http://www.microsoft.com/systemcenter/en/us/operations-manager.aspx&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Our impetus to get into the SCOM solution at my workplace came down to two things:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;The desire to have a “single pane of glass” operations monitoring solution, where everyone responsible for servers could see the same events and use a common tool, while not compromising functionality for each specialty. That is, as a tool it should be good at &lt;em&gt;everything&lt;/em&gt; from OS to SQL Server to SharePoint or Exchange.&lt;/li&gt;    &lt;li&gt;The failure of the previous software we’d attempted to use for that function.&lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;Ninety days is about the minimum amount of time needed to see if a product like this can really walk the walk, and I have to say I am quite pleased with it.&lt;/p&gt;  &lt;h3&gt;Good stuff&lt;/h3&gt;  &lt;ul&gt;   &lt;li&gt;One “pane of glass” for almost everyone in IT really works, especially if you’re a Microsoft-centric shop. 
I don’t have enough real information to speak to non-MS system monitoring, but I can see that Redmond is committed to this tool for their server products.&lt;/li&gt;    &lt;li&gt;Distributed application modeling, which is fantastic.&lt;/li&gt;    &lt;li&gt;Discovery is excellent. It finds all components of SQL Server, not just the engine, with pretty complete information about edition, version, service pack and other server or instance properties.&lt;/li&gt;    &lt;li&gt;If set up correctly, the management of change (adding servers or instances, removing SQL Server components from servers) is great – the discovery piece can automatically find and start (or stop) monitoring new (retired) servers pretty seamlessly. In our department there is basically &lt;em&gt;zero&lt;/em&gt; time to set up monitoring for a new SQL Server.&lt;/li&gt;    &lt;li&gt;Scalability is good. We have a mid-sized environment with about 700 Windows servers in three domains, of which about 200 have some SQL component installed. We have roughly 175 SQL Server Engine instances hosting 2500 databases. SCOM handles this quantity without so much as a blink, and we have a lot better coverage than we ever had in the past. Pre-SCOM, we’d have to manually set up monitoring for each server, and we could only afford (in licenses and in terms of scalability of the monitoring software) to monitor production. Now we have visibility into every box in both of our datacenters.&lt;/li&gt; &lt;/ul&gt;  &lt;h3&gt;Less-good stuff&lt;/h3&gt;  &lt;p&gt;As you can tell, I’m pretty happy, but as with anything there are a handful of issues:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;It’s large and fairly complex, using multiple servers. Probably not a quick win for a small shop. Our implementation took two people and a consultant about half-time for a couple of months to do the OS and SQL pieces; other applications like IIS, Exchange, SharePoint will require more hours to set up. This should be expected, though, for a tool like this. 
Having the consultant turned out to be a very, very good idea. (I have his and his company’s contact information if you want it – email me at merrillaldrich (a) gmail (.) com.) The investment of time was worth it, and gave us a &lt;strong&gt;lot&lt;/strong&gt; more value than the equivalent time spent building something.&lt;/li&gt;    &lt;li&gt;The system itself requires a pretty beefy SQL Server instance to keep up with the load placed on it by a complicated environment.&lt;/li&gt;    &lt;li&gt;The out-of-the-box GUI for things like “dashboard” monitoring overall performance of a particular SQL server isn’t great; it’s fixable with some time investment in creating custom views, and it’s just a GUI/presentation problem, not an issue with the underlying data. Essentially, where another (probably smaller and less enterprisey) SQL monitoring tool will give you a really targeted and well designed performance dashboard with two clicks of setup.exe, this one needs some time to create such a view, and there is a learning curve involved.&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Next up – details about how we customized the SQL MP to get to good quality, low noise alerting and performance dashboards.&lt;/p&gt;  &lt;p&gt;&lt;strong&gt;P.S. If you are a SCOM user and a DBA, I’d love to talk with you at SQLSaturday #65 in Vancouver. Please drop me a line – merrillaldrich (a) gmail (.) com.&lt;/strong&gt;&lt;/p&gt;</description></item></channel></rss>