Backups are one of the most fundamental parts of being a good SQL Server DBA. Nothing matters as much as being able to recover from disaster. As a DBA I am constantly checking if my backups exist, completed correctly and are available for recovery in the event that something happens. Until recently a lot of this has been done through the use of custom monitoring scripts that generate reports to tell me whether or not backups completed successfully over the weekend and manually restoring random backups periodically to ensure that the backup is in fact restorable. SQL Server 2008 has done a lot to make this easier with Policy Based Management, but what happens when you have a SQL instance on your network that you don’t actually know about?
Over the last two weeks I have been working on installing and configuring System Center Operations Manager at my company in conjunction with members of our Server Administration Team. I know that I promised to write reviews of tools for managing SQL a while ago, and I have started on two of the reviews already, I just need to finish them, but one thing that has become apparent about monitoring SQL Server is the SQL specific tools are only going to help you with the servers you know about. This is where System Center Operations Manager is very different.
Now you might ask how could there be a server that you don't know about? Its really much easier than you might actually think with SQL Server Express installations becoming more common place, along with SQL Server Embedded Edition (also known as the Windows Internal Database System). Over the last six months, three times I've had someone walk into my office and ask me where the backups for their SQL Express database would be. Much to their dismay, disgust, whatever, I sadly had to inform them that:
- I don't manage their SQL Server Express Installs
- Unless they proactively worked with the Backup Admins to have their system backed up, no backups existed and they had lost everything.
- They can put their database on one of the dedicated SQL Servers to prevent this from occuring again in the future.
Early on I recognized the problem associated with SQL Express and Embedded Edition sprawl in our environment. Vendors these days seem to like to use one of the free versions of SQL Server for their products and whoever installs the product doesn't think about the ramifications of having SQL Server Express managing their data. In my company it is documented what instances of SQL Server the DBA staff supports, so in the case of these missing backups we were safe. However, I don't necessarily like telling someone or some department that they lost their information, even when I am not at fault. This is where SCOM comes into play.
One of the greatest things about SCOM is that it doesn't just manage SQL Server, or at least it isn't designed to just manage SQL Server. It manages your entire enterprise, and as a part of that, you will learn about ever server that has SQL Server installed on it since the SQL Server Management Pack Discovery process finds the instances through registry scans of the server. I have found no less than a half a dozen SQL Server Express or Embedded Editions in the past two weeks and I have only added 20 servers to the management group. It just happens that the servers I am adding are IT specific servers that support things like WSUS, and a few applications built on top of WSS, both which run using Embedded Edition. In the case of three of the instances, they weren't discoverable using sqlcmd -L or the other common ways of finding SQL Server Instances on your network, so even if I was more proactive in scanning for new instances I didn't create, they would have been missed.
I'm going to try and knock out a few reviews this week after playing with a couple of these tools for over 2-3 months now, and give my likes and dislikes about each of them. Expect to see more about SCOM as well since I am having to dedicate a significant portion of my time to rolling it out currently.