SQLblog.com - The SQL Server Blog Spot on the Web

Tamarick Hill

  • July, the 31 Days of SQL Server DMOs – Day 1 (sys.dm_exec_requests)

    The first DMO that I would like to introduce you to is one of the most common and basic DMVs out there. I use the term DMV because this DMO is actually a view rather than a function. This DMV is server-scoped, and it returns information about all requests that are currently executing on your SQL Server instance. To illustrate what it returns, let's take a look at the results.

    [Screenshots: sample output from sys.dm_exec_requests]

    As you can see, this DMV returns a wealth of information about the requests occurring on your server. You can see the SPID, the start time of a request, its current status, and the command the SPID is executing. In addition, you see columns for sql_handle and plan_handle. These columns (when combined with other DMOs we will discuss later) can return the actual SQL text being executed on your server as well as the cached execution plan being used. This DMV also returns information about any wait types the SPID is experiencing. The percent_complete column displays a percentage of completion for certain database actions such as DBCC CHECKDB, database restores, rollbacks, etc. Finally, you can also see the number of reads and writes and the amount of CPU the SPID has consumed.
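
    If you would like to follow along, a query along the lines of the one below is a minimal sketch that pulls back the columns discussed above; the exact column list and the session_id filter are illustrative choices, not requirements.

    -- Currently executing requests and the columns discussed above.
    -- Filtering to session_id > 50 is a common convention for hiding most
    -- system sessions; adjust or remove it as needed.
    SELECT session_id,        -- the SPID
           start_time,
           status,
           command,
           sql_handle,
           plan_handle,
           wait_type,
           wait_time,
           percent_complete,  -- populated for actions such as DBCC CHECKDB, restores, and rollbacks
           reads,
           writes,
           cpu_time
    FROM sys.dm_exec_requests
    WHERE session_id > 50;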

    You will find this DMV to be one of the primary DMVs you use when looking for information about what is occurring on your server.

  • July, the 31 Days of SQL Server DMOs - Intro

    DMOs burst onto the SQL Server scene in 2005, and when they did they unlocked a wealth of information. I've become a major fan of DMOs because they simplify my troubleshooting and provide valuable information about what is going on within the SQL Server engine. If you are not familiar with DMOs, I recommend taking the time to really learn them.

    For those of you who may not be familiar with DMOs, I will be writing about one DMO per day for the month of July. Don't get me wrong, I'm no DMO expert, but I've worked with them enough to give you some good information to help you get started. During these posts I will not be providing complicated queries to solve every SQL Server problem you may or may not have; I will simply be introducing various DMOs and illustrating the type of information each one provides. Once you know them individually, you will be able to join whichever DMOs you need to pull back the information you are looking for.

    I hope that you all benefit in some form or fashion from my next 31 DMO postings!!! Enjoy!

  • Logical Query Processing

     

    When I first began working with SQL Server several years ago, one thing that really hung me up was not understanding why I could reference a column alias in certain parts of my query but not in others. To be more specific, I could reference an alias in my ORDER BY clause, but not in my GROUP BY, HAVING, or WHERE clauses. I never fully understood why until I read about logical query processing in “Microsoft SQL Server 2008 T-SQL Fundamentals” by Itzik Ben-Gan.

    To illustrate the issue, when we run the following query against the AdventureWorks2012 database, referencing an alias in the ORDER BY clause poses no problem for SQL Server and the data is returned.

    SELECT OrganizationLevel, YEAR(BirthDate) AS YearOfBirth, MONTH(HireDate) AS MonthOfHire, SUM(VacationHours) AS TotalVacationHours
    FROM AdventureWorks2012.HumanResources.Employee
    WHERE OrganizationLevel = 4 --and YearOfBirth = 1985
    GROUP BY OrganizationLevel, YEAR(BirthDate), MONTH(HireDate), VacationHours
    HAVING VacationHours > 50
    ORDER BY YearOfBirth DESC, MonthOfHire DESC

    [Screenshot: query results]

    However, if we uncomment the alias reference in the WHERE clause, SQL Server returns an error.

    SELECT OrganizationLevel, YEAR(BirthDate) AS YearOfBirth, MONTH(HireDate) AS MonthOfHire, SUM(VacationHours) AS TotalVacationHours
    FROM AdventureWorks2012.HumanResources.Employee
    WHERE OrganizationLevel = 4 AND YearOfBirth = 1985
    GROUP BY OrganizationLevel, YEAR(BirthDate), MONTH(HireDate), VacationHours
    HAVING VacationHours > 50
    ORDER BY YearOfBirth DESC, MonthOfHire DESC

    [Screenshot: error message]

    So what exactly is the reason for this error? The answer comes down to the way the query is broken down and processed by SQL Server. The logical order of query processing in SQL Server is:

    1. FROM

    2. WHERE

    3. GROUP BY

    4. HAVING

    5. SELECT

    6. ORDER BY

    Using the above order of processing, our sample query first pulls all of the data FROM the AdventureWorks2012.HumanResources.Employee table and filters it to rows WHERE OrganizationLevel = 4. Next the query GROUPs the result set BY OrganizationLevel, YEAR(BirthDate), MONTH(HireDate), and VacationHours. Then it applies another filter to those groups, keeping only the ones HAVING VacationHours > 50. Next it SELECTs the specified columns from the SELECT list, and it is at this point that the alias names are assigned. Lastly, the query ORDERs the result set BY the specified criteria and returns the data to the requesting client.

    As we can see, the alias names are not created until step 5 (SELECT). This explains why we are able to reference them in step 6 (ORDER BY), but not in step 2 (WHERE), step 3 (GROUP BY), or step 4 (HAVING).
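
    If you do need to filter or group on an expression by its alias, one common workaround (just a sketch, not the only option) is to define the aliases inside a derived table; the derived table is logically processed before the outer query's WHERE clause, so the aliases are available to it:

    SELECT OrganizationLevel, YearOfBirth, MonthOfHire, SUM(VacationHours) AS TotalVacationHours
    FROM (
        SELECT OrganizationLevel,
               YEAR(BirthDate) AS YearOfBirth,
               MONTH(HireDate) AS MonthOfHire,
               VacationHours
        FROM AdventureWorks2012.HumanResources.Employee
    ) AS e
    WHERE OrganizationLevel = 4
      AND YearOfBirth = 1985   -- the alias exists here because the derived table is processed first
    GROUP BY OrganizationLevel, YearOfBirth, MonthOfHire, VacationHours
    HAVING VacationHours > 50
    ORDER BY YearOfBirth DESC, MonthOfHire DESC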

    Understanding how Logical Query Processing works will help you write better and more efficient queries. For more information on the topic, I highly recommend reading “Microsoft SQL Server 2008 T-SQL Fundamentals” written by Itzik Ben-Gan.

  • SQL Dependency DMVs

     

    SQL Server Dynamic Management Views (DMVs), or more appropriately termed Dynamic Management Objects (DMOs), have been around for quite a while now. They first appeared in SQL Server 2005, and they unlocked a wealth of information that was previously either impossible to get or required extremely complicated queries. Since I began learning about DMVs I have been a big advocate for using them, and after discovering two DMVs that were added in SQL Server 2008, my love for them has grown even stronger.

    The two DMVs in question are used to track down objects that reference a specific SQL Server object (sys.dm_sql_referencing_entities) and objects that are referenced by a specific SQL Server object (sys.dm_sql_referenced_entities).

    DBAs are often tasked with identifying all objects that depend on a specific object, or all objects that a specific object depends on. For example, let's use the AdventureWorks2012 database and assume that after deploying a new employee management solution, it is determined that we no longer need the 'HumanResources.Department' table, which was used by our old legacy application. Before we decide to drop it, we need to identify all objects that may be referencing this table. To do so, we can use one of the SQL dependency DMVs.

    SELECT referencing_schema_name, referencing_entity_name
    FROM sys.dm_sql_referencing_entities ('HumanResources.Department', 'OBJECT')

     

     

    This simple DMV query identified all objects, in this case only two views, that currently reference our 'HumanResources.Department' table. After reviewing this output, we realize that we can probably decommission the 'vEmployeeDepartment' view as well. Before we do so, we need to analyze all objects that this view references in its definition. Again, a simple DMV query returns the information we need.

    SELECT referenced_server_name, referenced_database_name, referenced_schema_name,
           referenced_entity_name, referenced_minor_name
    FROM sys.dm_sql_referenced_entities ('HumanResources.vEmployeeDepartment', 'OBJECT')

     

     

    This DMV gave us a vast amount of information about the view: every object referenced in its definition, all the way down to the column level. If the view had any cross-database or cross-server references, that information would also be displayed.

     

    For more information about these two DMVs, please visit the links below:

    http://msdn.microsoft.com/en-us/library/bb630351(v=sql.110).aspx

    http://msdn.microsoft.com/en-us/library/bb677185(v=sql.110).aspx

     

     

  • Pros and Cons of Using Read-Uncommitted and NoLock

    Definitions

    Read-Uncommitted Isolation Level

    The read-uncommitted isolation level is the least restrictive isolation level in SQL Server, which is what makes it popular with developers looking to reduce blocking. Blocking is “typically” not an issue under this isolation level because the query does not request shared (S) locks on the tables it is reading. In addition, other processes are still allowed to read and modify data in any table your query is accessing. I say “typically” because some locks are still taken and can cause certain types of blocking: a schema-stability (Sch-S) lock is placed on each table being accessed, and a shared (S) lock is placed on the database. The Sch-S lock only prevents DDL actions against the accessed tables, such as dropping a column or dropping the table itself. The S lock on the database prevents the database from being dropped while it is being accessed. Keep in mind that when an isolation level is set, it is in effect for everything running on that connection.

    The syntax for using this isolation level is below:

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
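
    If you want to see for yourself that the schema-stability lock mentioned above can still cause blocking, a rough two-session sketch like the one below demonstrates it. It assumes the AdventureWorks2012 Employee table and a throwaway column name; any table with in-flight, uncommitted DDL against it would behave the same way.

    -- Session 1: start some DDL and hold the schema-modification (Sch-M) lock open.
    BEGIN TRANSACTION;
        ALTER TABLE AdventureWorks2012.HumanResources.Employee
            ADD DemoColumn INT NULL;   -- hypothetical column, used only for this demo
    -- (do not commit yet)

    -- Session 2: even under read-uncommitted, the SELECT still needs a
    -- schema-stability (Sch-S) lock, which conflicts with the Sch-M lock above,
    -- so this query is blocked until Session 1 commits or rolls back.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    SELECT COUNT(*) FROM AdventureWorks2012.HumanResources.Employee;

    -- Session 1: roll the DDL back to release the Sch-M lock and unblock Session 2.
    ROLLBACK TRANSACTION;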

    Nolock Table Hint

    Behind the scenes, the nolock table hint performs exactly the same action as running under the read-uncommitted isolation level. The only difference is that the read-uncommitted isolation level determines the locking behavior for the entire connection, while the nolock hint determines the locking behavior only for the table you apply it to. For example, if you have a stored procedure that reads data from 20 tables, the read-uncommitted isolation level causes every one of those tables to be accessed without requesting shared (S) locks. If you only want a subset of those 20 tables to skip shared (S) locks, and need to read only committed data from the rest, then you would use the nolock hint instead.

    The syntax for using a nolock hint is below:

    SELECT * FROM DatabaseName.dbo.TableName WITH (NOLOCK)
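
    To make the difference concrete, here is a minimal sketch (the table and column names are made up purely for illustration) in which only one of the two tables in a join is read without shared (S) locks:

    -- OrderHeader is read without requesting shared (S) locks;
    -- Customer is read under the connection's current isolation level
    -- (read committed by default).
    SELECT c.CustomerName, oh.OrderDate, oh.TotalDue
    FROM dbo.Customer AS c
    INNER JOIN dbo.OrderHeader AS oh WITH (NOLOCK)
        ON oh.CustomerID = c.CustomerID;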

    Benefits

    Reduced Blocking

    As stated in the sections above, the primary benefit of the read-uncommitted isolation level and the nolock table hint is reduced blocking. Because no shared (S) locks are requested, your reads are not blocked by writers in the database, and your reads in turn do not block other readers and writers.

    Risks

    Dirty Reads

    There are many risks you run when using the read-uncommitted isolation level or nolock table hints. The first is the “dirty read”. A dirty read occurs when your query reads data that another transaction has changed but not yet committed; any data that has been changed but not yet committed is considered “dirty”. For some applications this may be acceptable, but for many it can cause major problems for the users that rely on the system.

    A typical example used throughout the SQL Server community to illustrate this: suppose you need to get some repairs done on your wife's car and you pull up to an ATM to withdraw the money. Your account currently has $300 in it. You put in your debit card, enter your PIN, and tell the machine you want to withdraw $200. Your wife then calls and tells you the car is working fine and you don't need to worry about the repairs, so you immediately cancel the transaction and drive off. Little did you know, the weekly balance report you requested from your bank began processing after you entered the withdrawal amount but before you cancelled the transaction. The report is emailed to you saying you have a balance of $100, and now you are left wondering what happened to your other $200. The issue is that the bank report read ‘dirty’ data that had not yet been committed and passed that ‘dirty’ information along to you. This is a prime example of when reading ‘dirty’ data is not a good idea.
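
    The same situation is easy to reproduce with two query windows. This is a rough sketch using a made-up Account table:

    -- Setup (illustration only).
    CREATE TABLE dbo.Account (AccountID INT PRIMARY KEY, Balance MONEY);
    INSERT INTO dbo.Account (AccountID, Balance) VALUES (1, 300);

    -- Session 1: the ATM withdrawal that will eventually be cancelled.
    BEGIN TRANSACTION;
        UPDATE dbo.Account SET Balance = Balance - 200 WHERE AccountID = 1;
    -- (not committed yet)

    -- Session 2: the balance report reads the uncommitted ("dirty") value of $100.
    SELECT Balance FROM dbo.Account WITH (NOLOCK) WHERE AccountID = 1;

    -- Session 1: the customer cancels; the $200 was never really withdrawn,
    -- but the report in Session 2 has already seen the dirty value.
    ROLLBACK TRANSACTION;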

    Non-Repeatable Reads

    Non-repeatable reads become an issue when your application has code that needs to read the same data multiple times and expects consistent results on each read. When you use the read-uncommitted isolation level or nolock table hints, shared (S) locks are not requested on the tables you are reading. As a result, writers are able to modify the data in those tables in the middle of your read operations, which can drastically skew your expected results.

    To help illustrate this issue: suppose I go to the bank to deposit some money. I currently have a checking account with a $100 balance and a savings account with a $100 balance, and I plan to deposit $300 into checking. Meanwhile, my wife is at home on her mobile app checking our balances. The app runs a stored procedure that reads the AccountBalance table once for each account you own, and then reads the table one final time to produce the sum of all your accounts. The app shows my wife $100 in the checking account, $100 in the savings account, and a combined total of $500, leaving her completely confused by the inconsistent numbers.

    What happened is that the stored procedure read the table multiple times, expecting the data to be the same on each read, but an outside transaction (the deposit) modified the data between those reads. The developer of the stored procedure had used a nolock hint on the table so that the application would not run into blocking and results would return quicker, never anticipating this situation. The developer did indeed prevent blocking, but at the expense of the application potentially returning inaccurate data.
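
    A stripped-down version of that stored procedure's logic might look like the sketch below. The AccountBalance table and the WAITFOR delay are made up for illustration; the delay simply widens the window in which the deposit from the other session can land between the reads.

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    -- Read each account individually.
    SELECT Balance AS CheckingBalance
    FROM dbo.AccountBalance
    WHERE AccountType = 'Checking';

    SELECT Balance AS SavingsBalance
    FROM dbo.AccountBalance
    WHERE AccountType = 'Savings';

    -- Exaggerated pause: the $300 deposit from the other session lands here.
    WAITFOR DELAY '00:00:05';

    -- The final read no longer agrees with the two reads above.
    SELECT SUM(Balance) AS TotalBalance
    FROM dbo.AccountBalance;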

    Same Data is Read Twice

    There are rare occasions when the same data can be read twice under the read-uncommitted isolation level or a nolock hint. To illustrate this issue, a little background first: a clustered index physically orders the data in a table based on the clustering key, the leaf level of the index consists of the data pages that hold the actual table data, and each data page can hold roughly 8 KB of data.

    Scenario: you have an ETL process that extracts all records from a table, performs some type of transformation, and then loads the data into another table. There are two types of scans SQL Server uses to read data: allocation scans and range scans. A range scan occurs when you have a specific filter (a WHERE clause) and an index can be used to seek out those specific records. When you do not have a filter, an allocation scan reads all of the data pages allocated to the table. Provided you are not doing any sort operations, the pages are read in the order they are found on disk. For simplicity, assume there is no fragmentation, so your data pages are in order 1 through 10. So far your process has read pages 1 through 6. Remember, your process is not requesting shared (S) locks, so it is not blocking other users. Meanwhile, another process begins inserting records into your table. It attempts to insert a record onto page 3, but the page is full and the record will not fit, so the page has to be split: half of the records remain on page 3 and the other half are moved to a new page, page 11. Your process has already read the data that was on page 3, but half of that data has now moved to page 11, so when your scan reaches page 11 it reads data it has already read. If there are no checks on the destination table, you end up with duplicate data.

    This is an example of how changing the locking mechanism can cause you to read duplicate data into your process and reduce the integrity of the data.

    Phantom Reads

    A phantom read occurs when you read data that is there one minute and gone the next. It becomes problematic when you have a process that performs some operation based on the data it has read. To illustrate, suppose you own stock in CompanyA and CompanyA decides to pay all of its stockholders a dividend. A stored procedure first reads a table of all current stockholders and builds a temp table from that list, then initiates a dividend payout for every account in the temp table. While the procedure is running, a stockholder sells all of his shares in the company. The sale completes after the temp table is built but before the dividend process finishes, which causes a problem for the dividend process: it cannot pay a dividend to someone who no longer owns the stock.

    Using the read-uncommitted isolation level or the nolock hint allowed the user to sell his stock while the dividend process was running. If the isolation level or hint had not been used, the sale would have been blocked until after the dividend payout completed.

    Misconceptions

    Using nolock table hints for update or delete statements

    Sometimes I see developers write UPDATE or DELETE statements that use the nolock table hint. It is important to note that this hint only applies to read operations. Any time you are modifying data, as in an UPDATE or DELETE statement, the hint is ignored for the table being modified. For these types of operations, locks have to be taken to ensure that transactions are atomic, consistent, isolated, and durable.
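
    A quick sketch of the anti-pattern (the table is hypothetical, shown only for illustration); the hint on the target table buys you nothing, because the delete still takes the locks it needs:

    -- The NOLOCK hint on the target table of a DELETE is ignored;
    -- exclusive (X) locks are still taken on the rows being deleted.
    DELETE t
    FROM dbo.StagingData AS t WITH (NOLOCK)
    WHERE t.LoadDate < '20120101';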

    Conclusion

    SQL Server is a very complex enterprise database platform with many options and flags that can be changed to alter its behavior. Although many of these options have a justified use, it is important to understand the risks associated with changing them, and the read-uncommitted isolation level and nolock table hint are no exception. Generally it is best practice to stick with the default isolation level and refrain from using table and query hints unless it is absolutely necessary and the solution has been thoroughly tested. Using read-uncommitted and nolock should be the EXCEPTION and not the RULE.
