|
|
|
|
The random ramblings and rantings of frazzled SQL Server DBA
-
Setting up Database Mail in SQL Server 2005 and 2008 is a common task that is performed post server setup. However, if you just took over a fairly large environment and faced having to configure Database Mail on dozens, maybe hundreds of servers, SQL Server Management Studio doesn’t offer much in the way of making this a scriptable task.
Unlike most of the other wizards and designers in SQL Server Management Studio, the Database Mail configuration wizard doesn’t have a Script button to generate the script that will be executed against SQL Server to commit the configuration settings in the server. I find this really odd because Management Studio is built on top of SMO and SMO has the necessary Script() method for the SqlMail object that can generate the scripts. Things being what they are, a quick solution to the problem of having to configure multiple servers the same way for Database Mail is to configure one server, and then generate the configuration script using SMO and Powershell.
[void][reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo");
#Set the server to script from $ServerName = "ServerName";
#Get a server object which corresponds to the default instance $srv = New-Object -TypeName Microsoft.SqlServer.Management.SMO.Server $ServerName
#Script Database Mail configuration from the server $srv.Mail.Script();
The above Powershell script will generate an output similar to the following (additional formatting has been done to prevent text wrapping in this blog post):
EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'AccountRetryAttempts', @parameter_value=N'1', @description=N'Number of retry attempts for a mail server' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'AccountRetryDelay', @parameter_value=N'60', @description=N'Delay between each retry attempt to mail server' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'DatabaseMailExeMinimumLifeTime', @parameter_value=N'600', @description=N'Minimum process lifetime in seconds' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'DefaultAttachmentEncoding', @parameter_value=N'MIME', @description=N'Default attachment encoding' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'LoggingLevel', @parameter_value=N'2', @description=N'Database Mail logging level: normal - 1, extended - 2 (default), verbose - 3' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'MaxFileSize', @parameter_value=N'1000000', @description=N'Default maximum file size' EXEC msdb.dbo.sysmail_configure_sp @parameter_name=N'ProhibitedExtensions', @parameter_value=N'exe,dll,vbs,js', @description=N'Extensions not allowed in outgoing mails' EXEC msdb.dbo.sysmail_add_account_sp @account_name=N'DatabaseMailAccount', @email_address=N'SQLServers@domain.com', @display_name=N'Database Mail Account', @replyto_address=N'no-reply@domain.com' EXEC msdb.dbo.sysmail_add_profile_sp @profile_name=N'DatabaseMailProfile' EXEC msdb.dbo.sysmail_add_profileaccount_sp @profile_name=N'DatabaseMailProfile', @account_name=N'DatabaseMailAccount', @sequence_number=1 EXEC msdb.dbo.sysmail_add_principalprofile_sp @principal_name=N'guest', @profile_name=N'DatabaseMailProfile', @is_default=0
This script can then be used to create an identical configuration of DatabaseMail on all of the SQL Server 2005/2008 servers in the environment.
EDIT:
Aaron's comment below makes a good point about the scripting, and while the SMO Script() function doesn't generate statement terminators (see Aaron's blog post Ladies and Gentlemen Start your Semi-Colons for why this matters) it is possible to account for that in the Powershell script above. The only thing required is to change the last line to:
[system.string]::join(";" + [System.Environment]::NewLine, $srv.Mail.Script())
Thanks for pointing that out Aaron.
|
-
In the past month I’ve answered a couple of questions similar to: Is there a database role I can assign (other than dbo) that will allow a user to "see" stored procedures. As it is, I have granted the user db_datareader so, from SSMS, they can see tables but no stored procedures. Is there a way to "give" them readonly access to stored procedures? and What is the role that I need in SQL 2008 to be able to read the entire schema out of a database but not make changes to it? The answer to both of these questions is that there is no built in database role that allows this kind of access. The db_datareader role would provide access to see tables and views, but not to see the DDL code behind those objects, and it wouldn’t allow a user to see stored procedures or functions at all. The ddladmin database role would provide access to see the objects, but also to change them as well, which is not the intent here. To accomplish the desired result of this type of request requires that a new database role be created. To demonstrate this, lets create a test database with a couple of objects in it: create database test go use test go create table test (rowid int identity) go create view testview as select rowid from test go create procedure testproc as begin select * from test end go create function testfunction(@rowid int) returns int as begin declare @retval int select @retval = rowid from test where rowid = @rowid return(@retval) end go Then we’ll create a test database user and assign that user to the db_datareader role in the test database. USE [master] GO CREATE LOGIN [Test] WITH PASSWORD=N'test' GO USE [test] GO CREATE USER [Test] FOR LOGIN [Test] GO USE [test] GO EXEC sp_addrolemember N'db_datareader', N'Test' GO Then we’ll login to Object Explorer with the Test account and the only objects that can be seen are the tables and the view: If you try to view the definition of the table it will succeed, however, if you try to script the definition of the view, an error will result: To provide access to the Test login to view the Stored Procedure and Function, as well as to script the objects along with the view, we’ll need to create a user defined database role and GRANT that role the VIEW DEFINITION permission on the dbo Schema in the Test Database. use test go CREATE ROLE [ddl_viewer] GO GRANT VIEW DEFINITION ON SCHEMA ::dbo TO ddl_viewer GO Then we’ll add the Test user in the database to the ddl_viewer role in the database: USE [Test] GO EXEC sp_addrolemember N'ddl_viewer', N'Test' GO With this in place if we refresh the Object Explorer view for the database we’ll see that the Test user can now see all of the objects in the database: In addition to this, the user can script out any object to see the definition of the object, but they can not execute any of the stored procedures or call any of the functions in the database directly. Having a database role that allows users to view objects and DDL definitions in a database can be a valuable addition in SQL Server for a number of reasons. For development based environments it allows support developers to ensure that the appropriate code has been released into production environments to support post release implementations. For report writers, it allows them to validate that the version of a view that exists in Development matches that of production, and when using tools such as Data Dude or SQL Compare, it allows non-DBA’s to compare schema’s between production and test/development databases.
|
-
It seems like every week I get hit up with some kind of “Its a view only for this specific report,” or “It is a simple change that the vendor would have made if they had access” type of request related to SQL Server. If you’ve never dealt with me, let me start off this blog post by saying that I am a staunch advocate of Change Control Processes for production systems, I don’t care how simple you might think the change might actually be. Personally, I submit change requests for even the most simplistic tasks like creating a new index in the database or even changing the frequency of DBCC CHECKDB commands on production servers. Why do I do things like this? Basically put; because you never know! Early on in my career with SQL Server, I learned the impact that so called “Small Changes” to a SQL Server database could have in a production environment. As a Business Analyst I had elevated access to a production system, which I really had no business having, that I did a significant amount of work on. One of my primary job functions was to prevent bills from going out to customers that contained errors on them, so I created a table in the database to exempt accounts from the billing process by inserting the account primary key into the table, which was then checked during Billing Nomination to exclude those accounts until the associated problem was fixed. So that I had a full history of when an account was added and removed from the table and by who, a audit table was added with auditing triggers for INSERT/UPDATE/DELETE on the exception table. At some point, someone decided to alter this table and add an additional column to it to make it so they didn’t have to join to the accounts table to get the LDC (local distribution company) account number for troubleshooting problems with the accounts. When this was done, it broke reports that I had written for our CTO at the time because now the column name for the LDC account number existed in two of the tables, so being young, dumb, thinking that this column wasn’t needed, I mean really how hard is it to write a JOIN, I dropped it from the table. This is where the walls came crashing down. Being relatively new to SQL there were a lot of things I didn’t know, like for example, SELECT * is dangerous to use, especially in auditing triggers when the table definitions don’t match. When i dropped the column from the base table, I didn’t change the audit table to match, causing the columns to mismatch and the trigger to fail. Now if the only thing that actually used this table was me, stupid assumption on my part, this wouldn’t have been a big deal, but the new account creation process had been recently changed to add new accounts to this table to create a billing hold during the account setup period. This is actually what broke, and it broke very fast causing a cascading effect of problems across multiple systems. It took the DBA and application developers a bit of time to trace this down and when they did I got a phone call asking what I had changed. Such a small change, such a big mess. The impacts of this solidified for me that there is no such thing as a small change to a production database very early in my career. It took multiple people nearly two hours to cleanup the residual effects of this small change, primarily because a new account manager had to manually reenter the failed new accounts into the system and let them process again. I’d like to say that I never made a change outside of change control processes after this again, but I’d be lying if I did. However, I can say that I haven’t made untested adhoc changes like that ever again, and it has been over two years since I have made any undocumented change to a system. (Well at least intentionally, I have made mistakes and been connected to the wrong server and done seriously bad stuff like dropping a 80GB table which is why I use SSMSToolPack to color code my servers now.)
|
-
While trying out the new DACPAC feature in SQL Server 2008 R2 recently, I ran into a bug that makes no sense to me at all. In my environment we use an Active Directory Group for the SQL Database Administrators and that group is a member of the sysadmins fixed server role inside of SQL Server, which I’d suspect is pretty standard in most environments. The new Data-tier Application feature in SQL Server 2008 R2, has a bug that makes it impossible to Register a new Data-tier Application if the account being used to create it is not explicitly created as a login inside of SQL Server. Registering a Data-tier Application is a straight forward process. Right click on the database and select Tasks –> Register as Data-tier Application. Click Next on the first screen which looks like the below screenshot: Once discovery completes, you’ll get a screen like the following: Clicking Next again will cause SQL to register the Data-tier Application, or at least attempt it. However, if you use an AD Group to control rights and your account only has access to SQL Server through that group, the registration will fail with the following underlying error message: Cannot insert the value NULL into column 'created_by', table 'msdb.dbo.sysdac_instances_internal'; column does not allow nulls. INSERT fails. I decided to waste a bit of time troubleshooting this, just as if I had gotten this error message from an internally developed database by first looking at the DDL statement for the table. CREATE TABLE [dbo].[sysdac_instances_internal]( [instance_id] [uniqueidentifier] NOT NULL, [instance_name] [sysname] NOT NULL, [type_name] [sysname] NOT NULL, [type_version] [nvarchar](64) NOT NULL, [description] [nvarchar](4000) NULL, [type_stream] [varbinary](max) NOT NULL, [date_created] [datetime] NOT NULL, [created_by] [sysname] NOT NULL, CONSTRAINT [PK_sysdac_instances_internal] PRIMARY KEY CLUSTERED ( [instance_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY], CONSTRAINT [UQ_sysdac_instances_internal] UNIQUE NONCLUSTERED ( [instance_name] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO SET ANSI_PADDING OFF GO ALTER TABLE [dbo].[sysdac_instances_internal] ADD DEFAULT ('') FOR [description] GO ALTER TABLE [dbo].[sysdac_instances_internal] ADD DEFAULT (getdate()) FOR [date_created] GO ALTER TABLE [dbo].[sysdac_instances_internal] ADD DEFAULT ([dbo].[fn_sysdac_get_currentusername]()) FOR [created_by] GO The bold line above was the interesting part of the DDL Script since it was a default constraint on the column that was specified in the error. So I used sp_help_text to dump the definition of that function out to see what it was doing: CREATE FUNCTION [dbo].[fn_sysdac_get_currentusername]() RETURNS sysname BEGIN RETURN dbo.fn_sysdac_get_username(SUSER_SID()); END GO So we have a function that calls another function passing the SUSER_SID() in as the parameter. This wasn’t very helpful, so I used sp_help_text again to dump out the definition of this other function: CREATE FUNCTION [dbo].[fn_sysdac_get_username](@user_sid varbinary(85)) RETURNS sysname WITH EXECUTE AS OWNER BEGIN DECLARE @engineEdition int = CAST(SERVERPROPERTY('EngineEdition') AS int); DECLARE @current_user_name sysname; IF (@engineEdition = 5) BEGIN --SQL Azure does not have syslogins. All the logins reside in sql_logins SELECT @current_user_name = name FROM sys.sql_logins where sid = @user_sid END ELSE BEGIN --OnPremise engine has both sql and windows logins in syslogins SELECT @current_user_name = name FROM sys.syslogins where sid = @user_sid END RETURN @current_user_name; END This is where I really didn’t understand the implementation of this. I can understand the creation of a helper function that checks to see if this is SQL Azure to a point, since Azure has limitations, and not all of the coded functionality in standard SQL Exists in Azure, but why Microsoft chose to query sys.syslogins here is beyond me. This explains why the registration process fails with a NULL insert for accounts belonging to an AD Group since there isn’t an entry in sys.syslogins for the account, there is only an entry for the AD Group. What makes this really confusing is they chose to use SUSER_SID(), and then didn’t use the SUSER_NAME() function, which returns the appropriate login name for AD Group members. I filed this as a bug on Connect, so if you happen to try and use the DACPAC feature and run into this, vote for it: https://connect.microsoft.com/SQLServer/feedback/details/577174/dacpac-fn-sysdac-get-username-returns-null-value-for-not-null-created-by-column-on-sysdac-instances-internal-table?wa=wsignin1.0# If a Active Directory Group is used to control permissions in the database engine the fn_sysdac_get_username(SUSER_SID()) function call to enforce the DEFAULT CONSTRAINT on the msdb.dbo.sysdac_instances_internal returns NULL because a group member in AD doesn't have an entry in sys.syslogins which is what is being used to lookup the login name for the SID. This prevents registration of Data Tier applications unless the AD Account is explicitly added to the server, defeating the purpose of using AD Groups. The error returned is:
|
-
This month’s T-SQL Tuesday, started by Adam Machanic (Blog|Twitter), is being put on by SQLServerCentral author and MCM, Robert Davis (Blog|Twitter). This month the topic is learning and teaching so I thought I’d share some information about how the Army turns civilians into soldiers, based on my experiences as a Drill Sergeant. From Day 0 of Basic Training, known as pickup day, almost every aspect of Basic Training is well thought out, planned for, and fairly structured. What seems like chaos to the privates going through it, is actually a well orchestrated machine that was planned long before they ever laid eyes on one of their Drill Sergeants. I am not saying that it isn’t stressful, or challenging mentally and physically for them, but there is as we say, a method to the madness. Transforming civilians into soldiers is accomplished through a process known as soldierization. It is a lot like building a house, you have to create a solid foundation before you try and put up the walls. In the Army every block of instruction is building on the skills previously learned. Imagine what it would be like to hand a group of eighteen year olds M16’s, walk them out into the woods and have them attempt to perform a combat patrol without any kind of instruction. Most probably have never handled a firearm of any type before, some have never been in the woods before, and none of them will know the difference between a tactical column, and a wedge formation. However, by the end of Basic Training, they will have spent multiple days and nights in the woods, performing combat patrols and applying the various tactical movement techniques. To accomplish this in the Army we do everything in three phases, essentially the Crawl, Walk, then Run model for training. First we instruct soldiers on the basics and establish a basic foundation from which everything else builds off of. Then the privates have to perform the measure at slow pace, receiving corrections from the instructors to ensure that they perform the tasks to standard. Once they have demonstrated the ability to perform a task slowly and apply all of the terminal learning objectives correctly, the move to full speed execution of the task where the again receive corrections to ensure that at full speed they continue to perform to standard. There is a significant amount of structure behind this, but different soldiers respond and learn in different ways and as a Drill Sergeant you have to be able to adjust how you present information at times to break through to those hardcores that take on the nickname “can’t get right”. However, you can’t just wing it when it comes to teaching these skills, there are established instruction packages that contain all of the information and how you are supposed to present it, and if you don’t know and follow this information, you will fail. One of the items that we teach in Army Basic Training is Drill and Ceremony, and there is a highly structured method in place for teaching all of the positions and movements known as modules. When I had to pitch my first module as a Drill Sergeant Candidate, I thought it was ridiculous to have to memorize a block of instruction on how to teach the position of attention. I mean really; how hard can it be? There is actually a really good reason that there is such a structured method of teaching, its really easy to miss important items if you try and wing it, and since this is the foundation for every other aspect of Drill and Ceremony, if you screw it up, the rest of what you teach will be that much harder. In Drill Sergeant School you have to memorize and “pitch” at least three of these modules, each one randomly selected at 6PM and had to be pitched the next morning at 8AM. It made for some interesting nights, but having done it, and seen how bad things can be when you don’t know the module, I gained a new appreciation for the structure behind this method of teaching. To teach the position a lead instructor pitches the module, with a demonstrator, another Drill Sergeant that does everything wrong until corrected, showing how the position is properly performed. After the initial introduction, the lead instructor asks for questions, and then starts the entire process back over, using the demonstrator as an assistant instructor while working back through the model, step by step and having the privates demonstrate the position of attention. Once everyone has appropriately demonstrated the position of attention the first time, we transition to the Run phase and have the privates practice alternating from Rest to Attention repeated. From there they learn how to execute facing movements, render the hand salute, assume the rest positions at the halt, and properly return to the position of attention before ever taking their first step in a formation. One thing I have tried to apply in my civilian life teaching is this same type of methodology, but it is really much harder to do than it is in the Army, because someone else hasn’t developed the structured packages of instruction that I can just download and use. However, if you pay attention to how successful speakers like Bob Ward, Paul Randal, Brent Ozar, and Buck Woody structure their presentations, you can begin to see how to build your own packages of instruction that build the foundation early and start putting walls up immediately. I have never been the kind of person that could read about something and actually understand it. I don’t generally do well in lecture based classes, or in situations where there isn’t some type of practical application of the information being taught. Most of what I know about SQL Server, I have learned through practical application. I learned how to write TSQL by reading TSQL that was already written by Developers and figuring out how it worked. I learned how to write C# the exact same way. This applies to my life in general, and is not just related to IT or SQL Server. I learned how to work on cars by working on cars with my dad as a kid. I’d say that I learn the most when I can take something that is already working and tear it apart to put it back together. However, the further along in life I have gotten, I have actually found that I learn even more when I have to teach someone else. How you actually go about teaching someone else depends a lot on who they actually are, and the topic that you are trying to teach. To me,
|
-
Last year after returning from PASS Summit, I blogged about The benefits of attending PASS realized!, where I provided a specific example of how attending had helped me in troubleshooting a problem in my environment. I love PASS Summit, its the only event I know of, next to maybe the MVP Summit, that draws so many of the SQL Server Experts from around the world, and they all love to help people with their problems. However, today really highlighted that what you take away from PASS is more than just the limited amount of information you might pickup while actually attending the sessions. One of the options that is available when you register for PASS Summit is the option to purchase the conference DVD set. The last two years I opted to not purchase the DVD set, but I won’t be making that mistake again this year. In the last nine months, I have logged into the 2009 Summit site and accessed the online videos of the Summit sessions at least once every two or three weeks. Today, a discussion on Twitter about parallelism and CXPACKET waits popped up, and I referenced Bob Ward’s (Blog|Twitter) session about wait types where he covered this wait type with good detail, as well as how to look at when it might, or might not be a problem for your system. It just happens that the 2009 sessions are online for attendees, and the person I was talking to Tom LaRock(Blog|Twitter) had attended the summit. I also still have access to the 2008 presentation slide decks and materials, which happens to be the first year I attended the Summit. How much would you pay to have Adam Machanic, Paul Randal, Kimberly Tripp, Kalen Delaney, Andrew Kelly, Brian Knight, or any of the other experts in SQL Server available at your fingertips to go over that topic you hear them speak about? Well you can have that without going to Summit, it will cost you $595 for the DVD set as a non-attendee. However, if you attend Summit, the DVD set is only $195, so that is $400 that would go towards your conference costs if you had registered and purchased the DVD set. Even if you don’t attend the Summit, I’d try and fight for the DVD set for your employer. Even at $595, they are one heck of deal, when you consider the quality of the presentations that are included in the set.
|
-
As a DBA, I control a lot of data in my environment. Most of this data is stored in relational Online Transaction Processing (OLTP) databases that support the daily operations of the applications in our environment. While it is possible to do basic reporting off of this data in its current form, there are trade offs associated with trying to report on data stored in OLTP systems. Taking a business from basic reporting to the next level, and creating a true Business Intelligence solution around the data contained in relational systems, can be a daunting task for a OLTP DBA to do.
For just over a year now, I have been learning this other side of data management, and i have to admit, that what seems like it should be a simple task has proven to be much more difficult in practical application. My last employer began a path to BI not long after I started with them, and did a lot of research into the popular BI tools available on the market in 2008. They looked at almost every solution in the top right quadrant of the Gartner 2008 BI Magic Quadrant:

Ultimately they decided on Oracle BI EE as the platform of choice for BI, and I have to admit given that the majority of the business data was stored in Oracle, this selection made sense. On to of that, the OBIEE platform is really nice. I attended a good bit of training on OBIEE Administration to fill the roll as a backup to our Oracle DBA, and I was pretty much lost when it came to designing and implementing dimensional based data models to support BI reporting. It was really hard for me to think about data the way that the trainers were teaching it. Denormalization, star schema’s, fact tables, attributes, dimensions, and a myriad of other items were all foreign to me. I trudged through things and built a basic BI model in OBIEE on top of a SQL OLTP system during one of the training sessions, but I continued to have problems with how to properly model the data, and consistently had to dump my current design and essentially start over to correct flaws in the model that didn’t allow it to answer the questions being asked of the data.
One of the best tools that I was provided by a trainer was a copy of The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition) by Ralph Kimball. This book was really helpful to gaining an understanding of the what, why, and how of designing a Data Warehouse following proven patterns and practices working with common types of data. This book coupled with the training I had with OBIEE made it possible to build a BI model that worked for building drag and drop reporting from one of the SQL OLTP systems that I managed. However, I later tried to reproduce this same model using the Microsoft BI Stack and immediately ran into the same problems I had originally working with OBIEE, coupled with the fact that I had no idea how to work with the BI side of SQL Server.
I’d like to be able to say that I pressed on and figured out how to provide a BI solution with SQL Server, but I had other priorities in life at the time and gave up on playing with this. Less than a year later, I found myself faced with the opportunity to once again dive deeper into the Microsoft BI Stack and build a solution to a business problem associated with data maintained in SQL Server. However, instead of going the BI route, I leveraged my knowledge of TSQL and SSRS and created a basic reporting solution on top of the existing OLTP.
Since I created this initial solution, I have been doing a lot of research and reading on the Microsoft BI Stack using SQL Server 2008 R2 and Sharepoint 2010. It turns out that I had restricted myself a bit more than I should have. Last night when I got home from work I checked the mail and found the latest copy of MSDN Magazine in my box. It had a number of articles on Sharepoint 2010, and since we are rolling out a Sharepoint 2010 farm at work right now I decided to throw it in my laptop case to take to the office for the Sharepoint Developer to look at. Today it peaked his interest so I decided to dig through my stack of older MSDN and TechNet Magazines in my office and for whatever reason, the cover of the August 2009 issue of TechNet caught my eye. In it are the first two of three articles that provide an excellent primer for getting started with the Microsoft BI Stack.
Planning Your First Microsoft BI Solution Building a Data Foundation for a BI Solution
After reading these two articles I jumped online to find the third article that was mentioned in the August issue and found it immediately.
Building Your First Cube
I’d highly recommend all three of these to anyone that is a DBA and looking to learn more about the Microsoft BI Stack. The first article provides some really good background on what BI is, and the problems that it attempts to address. The second article covers the basics of how to build the underlying Data Warehouse structures. The third article covers how to build a SSAS cube on top of the Data Warehouse.
|
-
Yesterday I happened to catch Brad McGhee’s (Blog|Twitter) tweet about receiving a new MD3000 DAS storage array today. I shot him a tweet that I could send him information about using the command line interface for the MD3000 to do advanced configuration of the array, optimizing it for SQL Server. Back in April I started what i had hoped to be a series about my efforts working with a MD3000 storage array titled Working with Sub-Optimal Disk Configurations (Making the best of what you’ve got). Things being what they are, I never got back around to writing further on this, so consider this to be a continuation of that effort. Background on the MD3000 Series Disk Arrays The first thing to know about the MD3000 storage arrays is that the Modular Disk Storage Manager (MDSM), which is the graphical user interface for setting up and managing the disk array, is fairly limited in what it actually lets you do. When I first started looking at the two MD3000 disk arrays in use in my environment when I took over it, I used the MDSM to collect array configuration and profile information, and one of the things I noted was that the arrays were all set to use 128 KB segment sizes. I wanted to test the performance of various other segment sizes with SQLIO, but I couldn’t figure out how change the segment size in the MDSM so I hit the internet search engines and found that there are some significant limitations in the MDSM. The MDSM allows you configure the storage array using simple defaults only. The Dell website has a Dell™ PowerVault™ MD3000/MD3000i Array Tuning Best Practices document that I expected to to tell how to actually tune the storage array. However, it only provides high level summary information about what kind of options are actually available, and doesn’t provide specifics for tuning the array. One of the things provided in this tuning guide is the following table for configuration options of the array: This made it apparent to me that the available MDSM configuration templates weren’t going to allow me to make the changes I wanted to make to play around with SQLIO on the array. For example the array segment size is set by default at 128KB for all of the templates, while the MD3000 array actually supports sizes of 8, 16, 32, 64, 128, 256, and 512 KB. To use a array segment size other than 128 KB, you have to manually configure the array using the command line interface(CLI). To test the various Segment sizes, I would need to move from the MDSM GUI to the Command Line Interface (CLI) to actually be able to test the impact of various configurations with SQLIO. However, finding out how to use the CLI was a bit more complex than I actually expected. Dell has a two different document libraries for the MD3000 and MD3000i storage arrays. If you do a basic comparison, the two document libraries are similar in nature, and in fact, the arrays are configured using identical configuration options depending on the specific firmware version that the array uses. To work with the CLI, Dell offers a rudimentary CLI Guide, but its really short its actual coverage of the available CLI options that actually exist for the MD3000 series of disk arrays. However, if you really dig into things, you will find a reference for the CLI in the Dell™ PowerVault™ Modular Disk Storage Manager CLI Guide. What is really interesting about the MD3000 series of disk arrays is that they aren’t actually Dell specific hardware. They are instead a basic hardware platform that is used by multiple vendors including Dell and IBM (the DS4000 uses the exact same CLI as the MD3000). If you do a non-specific search for the CLI executable smcli command reference, the top two results are not actually Dell references (SANtricity Storage Manager Command Line Interface and Script Commands and SANtricity Storage Manager Command Line Interface and Script Commands). These two documents actually are excellent references to the CLI management of the MD3000 and MD3000i arrays, since they provide better details than the Dell documents, about the CLI commands available to configure the array. Configuring MD3000 Series Disk Arrays for SQL Server With all of the potential configuration options available for MD3000 series disk arrays, what exactly are the BEST configurations for SQL Server? The answer unfortunately is not as straight forward as it might seem. Like anything with SQL Server, the right answer depends on the specific workload that the storage array intends to target. In the next blog posts, I’ll provide details about the different performance characteristics noted under various configurations of the MD3000 array used in the initial blog post Working with Sub-Optimal Disk Configurations (Making the best of what you’ve got).
|
-
If you follow me on Twitter (@SQLSarg) you probably saw tweets about a bug I encountered in SQL Server 2008 recently. What was hoped to be a normal Monday morning started off with alerts from a new production server that DBCC CHECKDB had failed for our new Sharepoint 2010 primary content database. Thankfully, the Sharepoint 2010 portal was still in pre-production configuration and validation, so while I had a problem it wasn’t necessarily a major one. When I started looking at the issue, one item that struck me as odd was, the CHECKDB job didn’t output error information when it was running against this database, it just failed, and terminated the connection, so in order to determine what exactly was wrong I had to rerun DBCC CHECKDB WITH ALL_ERRORMSGS, NO_INFOMSGS on the database again. Since we moved the content database from our Dev/Test server where it was initially configured to the production server on Friday, I decided to run CHECKDB against the Dev/Test server while CHECKDB ran again on the production server so I could help determine if the problem pre-existed the migration, or was new to the production server. Of course it being a Monday, both of the CHECKDB runs failed exactly the same way, without providing information as to where the corruption in the database existed. So on the test server, I placed the database into SINGLE_USER mode and reran CHECKDB. The output was similar to the following: Msg 824, Level 24, State 2, Line 1 SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:691341; actual 0:0). It occurred during a read of page (1:691341) in database ID 25 at offset 0x0000015191a000 in file 'D:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\DATA\Test.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. The same week that we moved this content database from our development to or anticipated production environment, we had a disk failure in our RAID 4 array, so I immediately suspected that we might have a problem with one of our other disks, so I ran disk diagnostics on the development server but found nothing wrong. I was about to toss in the towel and admit failure as a DBA when I tried rebuilding indexes as a last hope to see if possible the corruption was index related and I ran into a Numbers table in the Sharepoint content Database. Knowing that this isn't supposed to be there, my memory kicked in and I realized I had created this in model as a part of my previous blog posts on Tempdb, Does tempdb Get Recreated From model at Startup and Does the tempdb Log file get Zero Initialized at Startup? As a followup to those posts I decided to get really stupid and create a massive Numbers table in model to see if I could significantly impact instance startup by doing something stupid, only I never got around to playing further since it took a while to create the Numbers table seeded with the maximum value for an interger column. So I dropped it from the content database and immediately it became corruption free. So I ran CHECKDB on model and it was fine, ran a SELECT * FROM Numbers and it returned from model. Try it in another Sharepoint database and I got the exact same error. I ran a reindex of the other Sharepoint database on the development server and it failed once again on the Numbers table, and once again, dropping this table from the database cleared the issue, so it is related to this oversized Numbers table that I created in model. At this point I decided to create a connect feedback for the issue and solicit other members of the community for verification that this is in fact a problem inside of SQL Server 2008. The details are available on the following link: https://connect.microsoft.com/SQLServer/feedback/details/569843/database-corruption-from-model-database-with-large-table I have been able to repro this on Windows Server 2008 R2 x64 and Windows Server 2008 x64with SQL Server 2008 x64 SP1 and SP1 + various CU's installed, as well as on Windows Server 2008 x86 with SQL Server 2008 SP1 x86 installed. However, I couldn't repro it SQL Server 2005 on any platform. Here is the entire repro script from the Connect Feedback (Be forewarned, this will break your SQL Instance until you drop the Numbers table from Model): USE [master] GO ALTER DATABASE model SET RECOVERY FULL GO USE [model] GO -- Suppress data loading messages SET NOCOUNT ON -- Create Sample Data using a Table Varable SELECT TOP 2147483647 IDENTITY(INT,0,1) AS N INTO Numbers FROM sysobjects a, sysobjects b, sysobjects c, sysobjects d, sysobjects e -- Create a Primary Key and Clustered Index to control data order ALTER TABLE dbo.Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (N) USE master; GO ALTER DATABASE [model] SET RECOVERY SIMPLE GO USE [model] GO DBCC SHRINKFILE (N'modellog' , 0, TRUNCATEONLY) GO USE [model] GO DBCC SHRINKFILE (N'modeldev' , 6144) GO USE [master]; GO CREATE DATABASE Test GO USE [Test] GO SELECT * FROM Numbers GO I also sent this to a good friend of mine at Microsoft, Bob Ward, who graciously took the time to run the repro script and validate that this is in fact a form of bug in SQL Server 2008. He is still looking into it, but I would suspect that since it requires a 4GB+ table to be created in the model database, and the corruption problem has a valid workaround of dropping the table from model and any user database created based on the bloated model database, that this doesn’t necessarily warrant a Critical fix priority. What is really interesting with this bug is that smaller tables, 1-2GB inside of model don’t trigger the same error. I seems to be size based, with really large objects in model actually being the root cause of the problem. I have been keeping in touch with Bob since he first replied to my email regarding this, and he is still looking into the reported bug. One certainly has to question what kind of person would actually create a 4GB table in the model database of SQL Server, knowing that this object would be copied to every database created on the server after it was created (wait a minute, I actually did this). As a note, the tempdb database on the server also reported corruption issues when it was restarted with this large object in the model database, and there was an impact to startup time by restarting the instance with this object in the model database (potentially a topic for another blog post as originally intended).
|
-
As I try and make changes in our environment and enforce best practices for development in my own work as well as that of other team members, one of the topics that had to be made was what path we would take for source control. Since we are in the process of rolling out Sharepoint 2010, the architecture team and I decided to go the route of Visual Studio Team Foundation Server 2010 as our primary source control system. However, after a week of working with it, I have to say that its not very friendly if you aren’t using Visual Studio 2010. To start off with, if you use Visual Studio 2008, you have to perform the installation following a specific set of steps for Visual Studio 2008 to be able to communicate with TFS 2010. After installing Visual Studio 2008 I had to install SP1 for Visual Studio: Microsoft Visual Studio 2008 Service Pack 1 (Installer) Then Install Team Explorer: Visual Studio Team System 2008 Team Explorer Then reinstall Service Pack 1 for Visual Studio. Then Install the forward compatibility update: Visual Studio Team System 2008 Service Pack 1 Forward Compatibility Update for Team Foundation Server 2010 (Installer) When I did that, I had to manually input the URL to the TFS Server and project to add it, like the below image. However, this was not the simple end to my woe’s in dealing with TFS 2010 from VS 2008. It turns out that even when setup following these steps, VS 2008 with TFS Explorer 2008 can’t create new Team Projects. You also can’t integrate SQL Server Management Studio with TFS 2010 unless you also install the TFS Explorer 2010 and the MSSCCI Provider. Even then this didn’t work out to allowing me to create usable projects inside of TFS 2010 for existing databases, so I reverted to installing the Power Tools for TFS 2010 and making use of the Windows Explorer Extensions for TFS 2010 that are similar to how Tortoise works with SVN for source code. This at least allowed me to script all the objects out to individual files and check them into source control for future controlled usage. So what’s my problem with this? First, the documentation for this just sucks. I found most of my answers on the forums, or through trial and error, and that’s not good. I have yet to find an actual BOL topic that explains the steps needed to setup VS 2008 to communicate with TFS 2010, or the associated limitations, like not being able to create new team projects from VS 2008 when working with TFS 2010. If you try to do this you will get an error similar to: TF30172: You are trying to create a team project either without required permissions or with an older version of team Explorer. Contact your project administrator to check your permissions or to determine how to upgrade Team Explorer. However, you can’t use Team Explorer 2010 with Visual Studio 2008, so there is a major disconnect between versions that exists even when using the forward compatibility update. In the next few blogs I’ll go over how exactly I am progressing with TFS 2010 and integrating it with SQL Server Management Studio and Visual Studio 2008 to figure this all out.
|
-
This blog post has nothing to do with databases, data storage, or even computers. If you are interested in learning more about NoSQL solutions take a look at Brent Ozars (Blog |Twitter) blog post NoSQL Basics for Database Administrators. The NoSQL in this post means it has nothing to do with SQL at all, but I’ve been working on a fun side project at home the last few nights and thought I’d share some pictures of it. Most people know that along with being a database geek, I am also a Drill Sergeant in the Army Reserves. Becoming a Drill Sergeant became goal of mine eleven years ago after I first joined the Army. The thing that stood out the most to me about Drill Sergeants I knew in the Army was their level of professionalism, knowledge, and methods of handling things, and I wanted to be amongst the best. One of the things that comes with being a Drill Sergeant is having the Campaign Hat that everyone associates with the Drill Sergeant image. The Campaign Hat always looks nice and stiff in pictures, movies, however, nothing could be further from the truth. The hat is actually made of a soft felt material that likes to curl up like a bowl in the Florida heat and humidity. To keep the hat in its appropriate shape it is stored inside of a hat press. The Army issues you the Campaign Hat, but they don’t issue out, nor do they even sell a press to keep it in shape (I am sure this makes sense to someone somewhere). Most of the civilian military supply stores outside of the Army Posts sell some form of a hat press, but I decided to build my own. In its simplest form a hat press is just two pieces of wood, one with a hole cut out to go over the top of the hat. I decided to use a pine end panel available at any hardware store like Lowes or Home Depot to build the hat press. I cut it out 16x16 for the top, and then the same size for the bottom, but with a three inch handle to carry it by. It all started out simple, and then I got carried away. I decided to transfer the image of the Drill Sergeant Badge onto the bottom of the hat press and then carve it out of the wood using my Dremel tool, and then carve my name onto the top. It took me a lot amount of time to do the carving, but it turned out pretty decent. So there is my latest NoSQL project. I am quite happy with the results, and it happens to be functional as well.
|
-
SQL Server users love the new features inside of the platform and each new release brings more exciting enhancements that motivate end users towards upgrading from previous editions. However, one of the biggest gripes I see from the SQL Server Community is that the hot new features are primarily Enterprise Edition only. Today I read a blog post by John Magnabosco over on Simple Talk titled “What If TDE Was Available In Standard Edition?” that once again brought up the subject of a newer Enterprise Edition feature that “should” be in Standard Edition according to John. Now to be perfectly fair to John, who I don’t know but I am sure he is a great person, the argument he makes in his blog post has been made before. The key statement in his blog post is:
You shouldn't have to be a large company with a large budget to create a secure environment.
To a degree I couldn’t agree with him more, but you don’t have to have TDE to create a secure environment. TDE is the icing on the cake in my opinion. However, in the not so recent past, this same type of argument has been made about other aspects of Enterprise Edition by the community. At PASS Summit 2008, which happened to be my first Summit and my first year as a MVP, during a insider session with Microsoft, some of the better known MVP’s drove the point home to Microsoft that stability shouldn’t be an Enterprise only feature. What specifically were they talking about? The ability to Lock Pages in Memory with 64 bit SQL Servers that can now scale memory beyond the previous limits of 32 bit architectures and with potentially devastating consequences if the SQL Server working set got trimmed or paged out by Windows Server 2003 under memory pressure. The result was a Cumulative Update to SQL Server allowing a this in Standard Edition.
Another big feature that debutted in SQL Server 2008 RTM was backup compression, which offered huge performance improvements and storage savings for backing up SQL Server databases. A number of third party tools have existed for a long time offering this feature including LiteSpeed by Quest Software and SQL Backup by Redgate, both of which are still industry standard tools for SQL Server backups despite the feature being available in the RTM of SQL Server 2008. Why are they still industry standard? Simple, backup compression was introduced as a Enterprise Only feature, and for $300-800 a third party tool makes much better financial sense versus the cost of an Enterprise license over a Standard license. Not to short change the third party tools, they offer a number of other awesome features beyond backup compression as well. As of SQL Server 2008 R2 RTM, backup compression is now available in Standard Edition of SQL Server as well.
So it begs the question, of this blog post. Should SQL Server advanced features be available a la carte? The competing RDBMS vendors have been selling features a la carte for a long time now. Are we to the point that SQL Server will also join the ranks with Oracle, Informix, and Sybase (this is an old link I know) on this? Would that even make the SQL Server community happy, or would we just begin complaining about the added costs on top of the base cost of the platform?
At some point Microsoft has to monetize their efforts on the product and the advanced features of it, there is after all a significant cost associated with developing, testing, releasing, and supporting it. SQL Server 2005 brought a number of features into the Standard SKU that were previously Enterprise only features, while also introducing a number of newer features in the Standard SKU, as well as a number of Enterprise only features. Coupled with this, the SQL Express platform is extremely robust for FREE and in SQL Server 2008 R2, now supports 10GB databases, which is larger than a number of the most basic databases in use by a lot of systems I encounter in consulting work.
What are your thoughts? What Enterprise only features do you think should be in Standard Edition? Should SQL Server features be available a la carte allowing you to determine what’s the most important to you, and if you are willing to pay for just those features?
|
-
While catching up on blog comments while I was out I found on my blog post Reinventing the Wheel – Automating Data Consistency Checks with Powershell, a comment was posted by Grant regarding how I had concatenated the $ScriptName variable into the email subject line in the $subject variable. Shouldn't the line $subject = "$ScriptName returned results" be $subject = $ScriptName + " returned results" In TQL, you would have to do a + based concatenation like Grant asked, but in Powershell, its not necessary. Take for example the following: $string1 = "String1"; $string2 = "$String1 concatenated with String2"; Write-Host $string2; The output will be: String1 concatenated with String2 Does it really matter? No, its pretty semantic. You could also use the Format() method of the String Object in the .NET model to accomplish the same thing if you were so inclined: $string1 = "String1"; $string2 = [String]::Format("{0} concatenated with String2", $string1); Write-Host $string2; There’s more than one way to skin a cat, at least so I am told, I’ve never skinned a cat in real life. I don’t know when I figured out that you can do direct placement of variables inside of strings and save key strokes, but a lot of my Powershell code is written with the variables inline to the text that they are being concatenated into like the subject line in the blog post mentioned above.
|
-
Brent Ozar (Blog|Twitter) and Tim Ford (Blog|Twitter) are putting on a new type of event in the month of August after SQL Saturday #40 in South Florida July, 31st, properly named SQLCruise. The concept is great, at least in my opinion, you pay for a cruise, get to have a break, and at the same time attend a mini-conference on SQL Server with training provided by two great speakers. The cost is relatively affordable, so what could possibly make it better? How about a sponsor offering up a free room for two, and admission into the mini-conference for free! That’s right Peter Shire (Twitter) and the folks over at SQL Sentry created a contest titled “Why I need to go on the SQLCruise” that is open to anyone to enter. All you have to do is tell SQL Sentry why you need to go the cruise through a creative use of social media, and you could win.
When I first heard about the SQLCruise, I thought, this could be fun, and my wife could come to boot. However, finances being what they are, it just didn’t work out to be feasible for both of us to go, and there is no way I can justify to myself, let alone my wife, me going on a cruise by myself. So when the contest came out, we both got creative and came up with a couple of funny parody type scenarios for why we need to go on the cruise. However, tonight Matt Velic (Twitter) posted his entry into the contest, and it really made me rethink submitting my own entry. Matt’s entry is below:
After watching Matt’s entry it really made it clear that I don’t need to go on SQLCruise, at least maybe not as much as some other people in the community. Trusting that Matt’s entry is entirely true, he does need to go on SQLCruise, so rather than posting my own entries into the contest, I figured I’d use my entry to maybe give Matt a bump. Having read the rules to the contest, this doesn’t even fit into the grey area for possibly being considered as an entry. However, the point behind Social Media is for messages to spread, at least I think that is what its all about, so I am going to spread Matt’s message for why he should go on SQLCruise. Even if I won making a legitimate entry, there’s no way I could in good spirit attend for primarily fun, when someone else really needs to go.
|
-
Today Joe Webb (Blog|Twitter) blogged about The Power of Regex in Powershell, and in his post he shows how to parse the SQL Server Error Log for events of interest. At the end of his blog post Joe asked about other places where Regular Expressions have been useful in PowerShell so I thought I’d blog my script for parsing SQLIO output using Regex in PowerShell, to populate an Excel worksheet and build charts based on the results automatically.
If you’ve never used SQLIO, Brent Ozar (Blog|Twitter) has a article on SQLServerPedia titled SAN Performance Tuning with SQLIO that includes a lot of information as well as a video tutorial on its use. I have been using SQLIO quite a bit this year, mainly to gather performance information for systems before and after reconfiguring their storage to show percent impact information to the business for making changes in storage like migrating from RAID 5 to 10, and aligning disk partitions used for database and log file storage. I have also used it in benchmarking new SAN configurations and the SAN configuration of our Virtual Machine environment to show team members why I have recommended that a number of our physical servers be migrated onto Virtual Machines.
I generally use a param.txt file similar to:
G:\MSSQL\testfile1.dat 4 0x0 32768
which creates a 32GB test file and then have a DOS batch file that runs the actual SQLIO tests using a script like:
sqlio -kW -t16 -s10 -o8 -fsequential -b8 -BH -LS -Fparam.txt
sqlio -kW -t8 -s360 -o8 -frandom -b8 -BH -LS G:\MSSQL\testfile1.dat > post.txt sqlio -kW -t8 -s360 -o8 -frandom -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -frandom -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -frandom -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -frandom -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -frandom -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -frandom -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -frandom -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -frandom -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -frandom -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -fsequential -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -fsequential -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -fsequential -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -fsequential -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kW -t8 -s360 -o8 -fsequential -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -fsequential -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -fsequential -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -fsequential -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -fsequential -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt sqlio -kR -t8 -s360 -o8 -fsequential -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
Which runs the gamit of tests using 8, 32, 64, 128, and 256K block sizes, with 16 threads, and 8 pending IO. You can certainly expand the test set much further than this and do a very exhaustive testing regimen, I am by no means saying that my tests above are the best fit, but they have sufficiently met my needs for the tuning I have been doing. Once the tests have been run the post.txt file will contain an output entry similar the following for every test that was run.
sqlio v1.5.SG using system counter for latency timings, 14318180 counts per second 16 threads writing for 60 secs to file d:\MSSQL\testfile1.dat using 8KB random IOs enabling multiple I/Os per thread with 8 outstanding buffering set to use hardware disk cache (but not file cache) using current size: 2048 MB for file: d:\MSSQL\testfile1.dat initialization done CUMULATIVE DATA: throughput metrics: IOs/sec: 18527.91 MBs/sec: 144.74 latency metrics: Min_Latency(ms): 0 Avg_Latency(ms): 6 Max_Latency(ms): 4900 histogram: ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+ %: 78 6 8 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
Linchi Shea (Blog) wrote a blog post three years ago that showed how to Parse the sqlio.exe Output using Perl. I found this in my search for a smarter way of parsing these long output files I was generating other than manually, but I don’t have Perl installed anywhere and I really felt that I should be able to do this natively using the tools readily available to me on my Windows 7 laptop, so I looked to PowerShell for a solution.
Reading the file data in is accomplished using Get-Content which returns a collection of strings. At first I tried casting this to a [string] object directly, but soon found out that when I did this the resulting string was missing the carriage returns and line feeds that existed in the original file. This posed a couple of problems for me in parsing the outputs initially, but doing a [string]::Join operation using [Environment]::NewLine as the concatenater allowed an appropriately CRLF delimited string to be returned.
$filedata = [string]::Join([Environment]::NewLine,(Get-Content $FileName))
Now that I had my file data loaded into a the $filedate object, the next step was to split this string based on the “sqlio v1.5.SG” header that is output at the beginning of each test run. Since PowerShell is .NET based, this is easily accomplished by using the Split() method of the System.String object which is System.Type for the $filedata object.
$Results = $filedata.Split( [String[]]"sqlio v1.5.SG", [StringSplitOptions]::RemoveEmptyEntries )
At first I started to write a foreach loop with this object, lets face it old habits die hard, but instead I made use of piping which allows a collection like the one returned by the Split() method to be worked on in a single operation. Piping in PowerShell is similar to a manufacturing process where an object gets transformed and then passed on until the ultimate result is achieved. Once an object is piped, it becomes accessible through the use of the $_ variable name, allowing the object to be transformed along the pipeline. By piping the result of the Split() I was able to write a SELECT statement that utilized Regex to get the information of interest out of each test run.
select @{Name="Threads"; Expression={[int]([regex]::Match($_, "(\d+)?\sthreads\s(reading|writing)").Groups[1].Value)}},` @{Name="Operation"; Expression={switch ([regex]::Match($_, "(\d+)?\sthreads\s(reading|writing)").Groups[2].Value) { "reading" {"Read"} "writing" {"Write"} } }},` @{Name="Duration"; Expression={[int]([regex]::Match($_, "for\s(\d+)?\ssecs").Groups[1].Value)}},` @{Name="IOSize"; Expression={[int]([regex]::Match($_, "\tusing\s(\d+)?KB\s(sequential|random)").Groups[1].Value)}},` @{Name="IOType"; Expression={switch ([regex]::Match($_, "\tusing\s(\d+)?KB\s(sequential|random)").Groups[2].Value) { "random" {"Random"} "sequential" {"Sequential"} } }},` @{Name="PendingIO"; Expression={[int]([regex]::Match($_, "with\s(\d+)?\soutstanding").Groups[1].Value)}},` @{Name="FileSize"; Expression={[int]([regex]::Match($_, "\s(\d+)?\sMB\sfor\sfile").Groups[1].Value)}},` @{Name="IOPS"; Expression={[decimal]([regex]::Match($_, "IOs\/sec\:\s+(\d+\.\d+)?").Groups[1].Value)}},` @{Name="MBs_Sec"; Expression={[decimal]([regex]::Match($_, "MBs\/sec\:\s+(\d+\.\d+)?").Groups[1].Value)}},` @{Name="MinLat_ms"; Expression={[int]([regex]::Match($_, "Min.{0,}?\:\s(\d+)?").Groups[1].Value)}},` @{Name="AvgLat_ms"; Expression={[int]([regex]::Match($_, "Avg.{0,}?\:\s(\d+)?").Groups[1].Value)}},` @{Name="MaxLat_ms"; Expression={[int]([regex]::Match($_, "Max.{0,}?\:\s(\d+)?").Groups[1].Value)}}`
The select returns a new type of object that has Properties named Threads, Operation, Duration, IOSize, IOType, PendingIO, FileSize, IOPS, MBs_sec, MinLat_ms, AvgLat_ms, and MaxLat_ms. This alone is sufficient to proceed to creating an Excel file for the information, but I wanted the information sorted in the same manner every time I ran this, primarily because I change how I run my tests sometimes, and I like consistency in the output. Since I am so anal retentive thorough, I fixed this problem by piping the output from the select to the Sort-Object commandlet and sorted by the IOSize, IOType, Operation, and Threads.
Getting the data into Excel was actually fairly simple to do, thanks to Buck Woody(Blog|Twitter) and SQL Rockstar Tom LaRock(Blog|Twitter). Buck wrote a blog post titled Create an Excel Graph of your Big Tables – With Powershell, and Tom wrote a Simple Talk Article, Monitoring SQL Server Virtual Log File Fragmentation, that use the Interop Excel objects to create an Excel workbook from PowerShell, populate the workbook with data, and build charts using the data. My code for the Excel integration is based entirely on their examples.
$Excel = New-Object -ComObject Excel.Application $Excel.Visible = $true $WorkBook = $Excel.WorkBooks.Add() $WorkBook.WorkSheets.Item(1).Name = "RawData" $WorkBook.WorkSheets.Item(3).Delete() $WorkBook.WorkSheets.Item(2).Delete() $WorkSheet = $WorkBook.WorkSheets.Item("RawData") $x = 2 $WorkSheet.Cells.Item(1,1) = "Threads" $WorkSheet.Cells.Item(1,2) = "Operation" $WorkSheet.Cells.Item(1,3) = "Duration" $WorkSheet.Cells.Item(1,4) = "IOSize" $WorkSheet.Cells.Item(1,5) = "IOType" $WorkSheet.Cells.Item(1,6) = "PendingIO" $WorkSheet.Cells.Item(1,7) = "FileSize" $WorkSheet.Cells.Item(1,8) = "IOPS" $WorkSheet.Cells.Item(1,9) = "MBs/Sec" $WorkSheet.Cells.Item(1,10) = "Min_Lat(ms)" $WorkSheet.Cells.Item(1,11) = "Avg_Lat(ms)" $WorkSheet.Cells.Item(1,12) = "Max_Lat(ms)" $WorkSheet.Cells.Item(1,13) = "Caption"
$Results | % { $WorkSheet.Cells.Item($x,1) = $_.Threads $WorkSheet.Cells.Item($x,2) = $_.Operation $WorkSheet.Cells.Item($x,3) = $_.Duration $WorkSheet.Cells.Item($x,4) = $_.IOSize $WorkSheet.Cells.Item($x,5) = $_.IOType $WorkSheet.Cells.Item($x,6) = $_.PendingIO $WorkSheet.Cells.Item($x,7) = $_.FileSize $WorkSheet.Cells.Item($x,8) = $_.IOPS $WorkSheet.Cells.Item($x,9) = $_.MBs_Sec $WorkSheet.Cells.Item($x,10) = $_.MinLat_ms $WorkSheet.Cells.Item($x,11) = $_.AvgLat_ms $WorkSheet.Cells.Item($x,12) = $_.MaxLat_ms $WorkSheet.Cells.Item($x,13) = [string]$_.IOSize + "KB " + [string]$_.IOType + " " + ` [string]$_.Operation + " " + [string]$_.Threads + ` " Threads " + [string]$_.PendingIO + " pending" $x++}
$WorkBook.Charts.Add() | Out-Null $Chart = $WorkBook.ActiveChart $Chart.SetSourceData($WorkSheet.Range("H1:H$x")) $Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x") $Chart.SetSourceData($WorkSheet.Range("H1:H$x")) $Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x") $Chart.Name = "IOPS"
$WorkBook.Charts.Add() | Out-Null $WorkBook.ActiveChart.SetSourceData($WorkSheet.Range("I1:I$x")) $Chart = $WorkBook.ActiveChart $Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x") $Chart.Name = "MBs Sec"
I don’t know why, but the first Chart doesn’t refresh properly unless I call SetSourceData() and set the SeriesCollection xValues properties twice. Since the code block that loads the data into the worksheet tracks the rows with the $x variable, the Range for the chart is always row 1 to $x. The H column happens to be the column that holds the IOPS values and the I column holds the MBs Sec values in the workbook. The only thing missing at this point a parameter declaration at the head of the script file to require an input parameter $FileName which is passed to Get-Content to read the information.
param( [Parameter(Mandatory=$TRUE)] [ValidateNotNullOrEmpty()] [string] $FileName)
The end result is a set of charts showing your IOPS and MBs Sec results from the test run.
The complete script is attached to this blog post. If you use it let me know your thoughts.
|
|
|
|
|
|