THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Linchi Shea

Checking out SQL Server via empirical data points

  • Killing a SQL Server thread? Don’t!

    Sometimes, when you kill a session (i.e. a spid) in a SQL Server instance, the spid just refuses to go away not because it’s doing a rollback. Perhaps, it’s stuck on a certain dependency on something external to SQL Server or it’s just simply stuck for some decipherable reasons. And the spid may hang around for as long as the instance is online and will only go away when the instance is restarted.

     

    In more than a few occasions, I have heard people wondering if they can just kill the thread of the SQL Server process from the OS, that is, the OS thread corresponding to the SQL Server spid.

     

    That is a very bad idea! Killing such an OS thread will likely crash the SQL Server instance.

     

    You can easily demonstrate the harm of doing this with Process Explorer. You’ll need to run Process Explorer locally on the server of your test SQL Server instance, and needless to say that this should not be done on a production instance.

     

    First, connect to the test SQL Server instance, and open a New Query window in SQL Server Management Studio. Note the spid for the connection, and run the following query (replacing <spid> with the spid for the connection) to find the kpid value in the resultset for the spid:

     

    select * from master..sysprocesses where spid = <spid>

     

    This kpid is the id the OS assigns to the thread that is currently bound to the worker executing a task or idling on the spid.

     

    Then, fire up Process Explorer and locate the OS process for the SQL Server instance. If you have multiple SQL Server instances running on the server, you can mouse over each sqlservr.exe to identify the right instance.

     

    Double click on the process, and select the Thread tab. This will give you a list of all the OS threads for the SQL Server instances. Highlight the Thread ID corresponding to the kpid found above, and click on the Kill button to terminate the thread.

     

    After you have killed the thread from Process Explorer, you may still be able to use the SQL Server instance for a while. You’ll for sure see stack dump error messages in the errorlog, and you may see error messages such as the following:

     

    2010-02-04 14:45:31.03 Server      Potential image corruption/hotpatch detected. This may be a sign of a hardware problem or caused by presence of CLR/jitted images on the stack. Check SQLDUMPER_ERRORLOG.log for details.

     

    The session related to the kpid value would be dead. But the SQL Server instance itself may or may not continue to function. If you repeat the above steps to kill the threads for the other user sessions, eventually this will crash the SQL Server instance, i.e. the SQL Server instance will become unresponsive. After that, the only way to get the instance back online is to restart it. Most likely, you'll have to first kill the corresponding process from the OS because the instance probably won't respond to any of the normal restart methods.

     

     

     


  • What server is it trying to connect to?

    It’s common to be called to find out why an app cannot connect to one of your database servers. So you start by checking the server, and it’s working just fine. Then, you check the client machine and its database connectivity to the server, and that is also no problem. But still, the app continues to fail on trying to connect to the server.

     

    Naturally, you start to question whether the app is trying to connect to the right server, i.e. the server you have just checked. But the app folks swear up and down that it is the right server name, and they even show you the name in the their configuration file. Sure enough, that’s the name specified in the config file.

     

    What do you do now?

     

    You need to absolutely verify what server the app is really trying to connect to, not what people may believe it is trying to connect to. Here, Process Explorer can come in very handy.

     

    So, simply fire up Process Explorer (you can launch it from command line as procexp), and double click on the app’s process. Go to TCP/IP tab, and watch which server it tries to connect. When it tries to connect to server XYZ, you should see a line showing XYZ in the Remote Address column along with SYN_SENT in the State column.

     

    Process Explorer is a fantastic tool that is widely known and used in many troubleshooting situations already. The above is one of the situations where it can save you tons of time chasing ghosts.

     


  • Would you optimize SQL for less performance?

    What do you mean? Okay, that does sound like an oxymoron, doesn't it?

    Let's say you are trying to optimize a stored procedure, and your proposed change results in the stored procedure running not faster, but perhaps a bit slower than it currently does. Can you then declare your optimization effort a success?

    Just some food (well a question) for thought.


  • Add-Content and Out-File are not for performance

    When you write Powershell scripts and need to write a text file, you have a number of ways to accomplish that. The often suggested approach is to use cmdlets Add-Content or Out-File. Well, this is not news.

     

    But some may not notice that these cmdlets are not there for performance. Sure, they are convenient. But if you need to write to a file many times, you may want to consider something else. To demonstrate the point, try the following script, which tests the performance of writing to a text file 10,000 times:

     

    # Measure Add-Content

    $t = get-date;

    for ($a=1; $a -le 10000; $a++)

    {

       add-content -path "c:\junk\test1.txt" -value "Hello";

    }

    $s = $(get-date).subtract($t).TotalMilliseconds;

    write-output ("Add-Content:  " + $s);

     

    # Measure Out-File

    $t = get-date;

    for ($a=1; $a -le 10000; $a++)

    {

       "Hello" | out-file -filepath "c:\junk\test2.txt" -append

    }

    $s = $(get-date).subtract($t).TotalMilliseconds;

    write-output ("Out-File:     " + $s);

     

     

    # Measure StreamWriter

    $t = get-date;

    $sw = New-Object System.IO.StreamWriter "c:\junk\test3.txt";

    for ($a=1; $a -le 10000; $a++)

    {

        $sw.WriteLine("Hello")

    }

    $sw.Close();

    $s = $(get-date).subtract($t).TotalMilliseconds;

    write-output ("StreamWriter: " + $s);

     

    The output from this script is similar to what I obtained from my laptop and shown below:

     

    Add-Content:  34015.8427

    Out-File:     37781.4918

    StreamWriter: 781.255

     

    From running this scripts multiple times on my laptop, using StreamWriter was consistently about 4 ~ 5 times faster than either Add-Content or Out-File. This should not really be surprising because each time you call Add-Content or Out-File, you open and close the file, and that is expensive.

     

    Performance wise, using Add-Content or Out-File is similar to using StreamWriter in the following way (i.e. including opening and closing the file inside the loop):

     

    # Measure StreamWriter, open/close the file for every write

    $t = get-date;

    for ($a=1; $a -le 10000; $a++)

    {

        $sw = New-Object -typename System.IO.StreamWriter("c:\junk\test4.txt", "true");

        $sw.WriteLine("Hello")

        $sw.Close();

    }

    $s = $(get-date).subtract($t).TotalMilliseconds;

    write-output ("StreamWriter2:" + $s);

     

    And if you run the above script, you should find its performance very similar to that of Add-Content or that of Out-File.

     

    So what is the takeaway? Well, do not use Add-Content or Out-File if performance matters such as when you write a massive number of lines to a file.

     


  • Bad database practices: allowing apps to connect to the server hostname

    It’s common to see a client application referencing the hostname of a SQL Server instance in its connection string. For instance, assume that you have a server whose hostname is NYCSQL01, and you install a default SQL Server instance on it. Naturally, you can connect to the SQL Server instance as NYCSQL01, and that’s the name you give to your clients for their apps to connect to.

     

    Why is this a bad practice?

     

    It’s a bad practice when you have to move the database that the client application is using to a different host. You may think that happens rarely enough to be an issue. But in the real world, moving a database to a different server can happen for a variety of reasons, and is not a rare event. If you reference the server hostname in a connection string of a client app, such a move would force you to modify the client app’s connection string. And that can be a nightmare!

     

    Ideally, no connection strings should be hardcoded in an application, and they should be easily editable. In practice, this may not be the case. But even if you have no problem editing the connection strings, finding out all the connection strings that reference the hostname and making sure you are not missing anything may not be an easy task. In fact, it’s often a huge challenge. And then, when a database is used by many apps, coordinating all the client-side changes is yet another big headache.

     

    The right thing to do is to shield your client apps from any change in the server hostname so that you can move a database to a different server without having to make any change on the client app side. To accomplish this, you can’t allow your client apps to connect with the server hostname. In other words, you need to remove your client apps from any dependency on the server hostname.

     

    A common solution is to create a network alias that will redirect the client connection to the same IP address as that of the hostname, and enforce the rule that a client app can only connect to SQL Server via a network alias. With this configuration, when you need to move a database to a different host, you just have to update the network alias to point to the new host, and the client app can be left untouched.

     

    A number of points are worth noting.

     

    First, this is not just an issue with the hostname of a standalone server. The virtual network name of a clustered SQL Server instance should be treated the same. That is, no client app should connect using the virtual network name of the clustered SQL Server instance. A network alias to that virtual network name should used instead. The reason is the same because you may need to move a database from that clustered instance to a different instance, and you may not be able to use the same virtual network name.

     

    The second noteworthy point is that it may be handy, though not always necessary, to create a separate network alias for each user database (or a group of tightly coupled databases) so that each user database can be moved independently.

     

    The final point to note is that it is convenient to manage the network aliases in a clustered instance by exposing them as network name resources. This makes it easy for the DBAs to know exactly what network aliases are defined and to check if they are functioning properly. For this reason alone, it may be worthwhile to run a SQL Server instance on a single-node cluster instead of a standalone server.

     

    Different environments adopt the network alias approaches differently. Some environments were forced to require network aliases because some databases (not necessarily SQL Server databases) need to failover between data centers on top of storage replication, and the client apps need to deal with that failover. There is no choice but to use a network alias. Other environments are more proactive in enforcing network aliases, and are reaping the benefits of that best practice. Still some environments are just limping along with unnecessary client dependencies on server hostnames without realizing there is a much better alternative. I hope you are not in this last category.

     


  • Get an overview report on your log shipping setups

    [Updated Dec 23 to include a T-SQL script at the bottom of the post].

     

    You may have an environment where log shipping is set up on various databases between various servers. Would it be handy to have an overview report highlighting which database is being log shipped where? Even if you have some kind of record to keep track of the databases that are being log shipped, wouldn’t it be nice to have an automated means to help validate that record?

     

    By an overview report, I mean a three-column report that lists, for each log-shipped database, its source server, its name, and all the secondary servers.

     

    Obviously, how to accomplish this depends on how your log shipping is set up. Let’s assume that you are using the native log shipping feature of SQL Server 2005/2008. In that case, one of the easiest approaches to collecting the log shipping information would be to run system stored procedures such as sp_help_log_shipping_monitor.

     

    But which servers should you run sp_help_log_shipping_monitor? Well, for the report to be complete, you should just run it against all your SQL 2005/2008 instances. Or else, how do you know you are not missing some log shipped databases.

     

    So before you can scan your environment for log shipped databases, you would, at the bare minimum, have to first provide a list of the SQL Server instances in your environment. Then, it’s just a matter of running sp_help_log_shipping_monitor against each SQL instance on the list, sort out, and combine the results. If a SQL instance contains a database that is being log shipped (regardless where this is the primary or a secondary), sp_help_log_shipping_monitor will return a list of the log-shipped databases along with their primary and secondary SQL instances. If a SQL instance does not have any log-shipped database, sp_help_log_shipping_monitor does not return anything.

     

    By gleaning and collating information from the results of running sp_help_log_shipping_monitor against all the SQL Server 2005/2008 instances in your environment, you can then construct a comprehensive picture of the log shipping configurations in your environment. The attached Perl script does precisely that.

     

    Note that you can’t just download the Perl script and run it against your environment as is. I’ve left a key subroutine—getProdSQL()—unfinished, and you need to fill it with the logic particular to your environment. This usually is just a few lines of code that retrieve from a database the list of the names for the SQL Server instance from that you want to scan for log shipping. I have no way of writing the logic for such routine for you.

     

    Before you can run the script, you also need to change a few SMTP mail related parameters so that they are correct in your environment. The SMTP email routine is included so that the script can email the extract log shipping overview report to you. This is not essential, but convenient.

     

    [Added the following section on Dec 23] 

    If you have a table that stores the SQL2005/2008 instances in your environment, and you have a utility server that has a linked server configured for each of these SQL2005/2008 instances, you can also use the following simple T-SQL to scan for and retrieve the log shipping configurations in your environment:

     

    if object_id('tempdb..#log_shipping_entries') is not null

       drop table #log_shipping_entries

     

    create table #log_shipping_entries (

       primary_server             sysname,

       secondary_server           sysname,

       database_name              sysname

    )

     

    if object_id('tempdb..#log_shipping_monitor') is not null

       drop table #log_shipping_monitor

      

    create table #log_shipping_monitor (

       status                     bit NULL,

       is_primary                 bit NULL,

       server                     sysname NULL,

       database_name              sysname NULL,

       time_since_last_backup     int NULL,

       last_backup_file           nvarchar(500) NULL,

       backup_threshold           int NULL,

       is_backup_alert_enabled    bit NULL,

       time_since_last_copy       int NULL,

       last_copied_file           nvarchar(500) NULL,

       time_since_last_restore    int NULL,

       last_restored_file         nvarchar(500) NULL,

       last_restored_latency      int NULL,

       restore_threshold          int NULL,

       is_restore_alert_enabled   bit NULL

    )

     

    declare @prod_server sysname,

            @sql nvarchar(2000)

     

    declare prod_cr cursor

    for select distinct ServerName

         <… You need to fill in the rest of the query to retrieve the

            Server names in your environment>

       

    OPEN prod_cr

     

    FETCH NEXT FROM prod_cr into @prod_server

     

    WHILE @@FETCH_STATUS = 0

    BEGIN

    select @sql = N'EXEC [' +

                  @prod_server +

                  N'].master..sp_help_log_shipping_monitor'

     

        truncate table #log_shipping_monitor

        insert #log_shipping_monitor

        EXECUTE (@sql)

       

        insert #log_shipping_entries

    select distinct t1.server as primary_server,

                    t2.server as secondary_server, t2.database_name

          from (select distinct server, database_name

                  from #log_shipping_monitor

                 where is_primary = 1) as t1,

               #log_shipping_monitor t2

         where t1.database_name = t2.database_name

           and t2.is_primary = 0

      

         FETCH NEXT FROM prod_cr into @prod_server

    end

    close prod_cr

    deallocate prod_cr

     

    select * from #log_shipping_entries

     


  • The Transact-SQL Prime Directive – a bad example

    A while back, I ranted that the design and implementation of Transact-SQL should be guided by a prime directive that guarantees no interference with the flow of set-based data in Transact-SQL.

     

    That was primarily motivated by the fact that no such guarantee exists today in T-SQL. That is, when you move set-based data around in T-SQL, you may find it stopped dead in tracks for no apparent or meaningful reason.

     

    Recently, I’ve run into another example of not being able to move set-based data around in T-SQL, and here is how to reproduce it.

     

    You need to have three SQL Server 2008 instances, and let’s assume they are SQLA, SQLB, and SQLC.

     

    On SQLB, create a linked server to SQLC, and create the following stored procedure:

     

    use master
    go
    create proc p_testB
    as
    create table #tmp(i int)
    insert #tmp
    EXEC SQLC.master.dbo.p_testC
    select * from #tmp
    go


    On SQLC, create a linked server to SQLB, and create another stored procedure:

     

    use master
    go
    create proc p_testC
    as
    create table #tmp(i int);
    insert #tmp
    select *
      from openquery(SQLB, 'select top 1 id from sysobjects')
    select * from #tmp
    go

     

    Now, if you want to access the resultset of p_testB from SQLA, you naturally expect to be able to do a remote proc call on SQLA as follows (assuming that linked server to SQLB is already created):

     

    EXEC SQLB.master.dbo.p_testB;

     

    The only problem is that the above call does not work, and you would get the following error message:

     

    Msg 3910, Level 16, State 2, Line 1
    Transaction context in use by another session.
    Msg 1206, Level 18, State 199, Procedure p_testB, Line 4
    The Microsoft Distributed Transaction Coordinator (MS DTC) has cancelled the
    distributed transaction.

     

    Now, strictly speaking and to some extent, this is not a T-SQL issue per se. Rather, it has something to do with how SQL Server 2008 handles loopback calls and the implementation of MSDTC.

     

    However, the point is that it does not matter what the underlying root cause for this problem is. What matters is that you have no guarantee that you can move set-based data around in T-SQL, and that is a fundamental problem. If the design and implementation of T-SQL had been guided by a prime directive such as stated earlier, guaranteeing the flow of set-based data would have taken precedence and whatever the underlying limitations with MSDTC (or changes with how loopback calls are handled or problems with any underlying supporting technologies) may be, they would have been fixed/modified/replaced in order to guarantee the set-based data flow.


  • Performance impact: Speeding up SMO script generation – the test code

    The attached is the C# code that I used for generating the test results charted in my previous post.

     

    Note that for testing no generated script is actually persisted into a file by this program. The Script() method is applied, but the resulting script strings are thrown away. For example, here is how a stored procedure is scripted:

     

    StringCollection sc = db.StoredProcedures.ItemById(objectId).Script();

    foreach (string s in sc)

      { };

     

    You can run the compiled program as follows:

     

    cmd>smpScript.exe <server name> <database name> <number of threads>


  • Performance impact: Speeding up SMO script generation

    Generating scripts through SMO can be as simple as walking down the database object tree and applying the Script() method to each scriptable object. Well, that is until you start to try it on a database that has a large number of objects (say a few thousands), and the long wait the becomes rather frustrating. I’m not talking about such silly performance problems as the horrific roundtrips caused by the IsSystemObject check (which has a simple workaround). Rather, it’s just sloooow to script out thousands (or even a few hundreds) of database objects.

     

    But there is hope, and one could even argue that it’s a matter of how you use SMO.

     

    One way to significantly speed up scripting the objects in a database is to do away with single-threaded programming and resort to multiple threads.

     

    I have run some simple tests with a various number of threads in a C# program against a database that has 632 top-level objects, including tables, stored procedures, user-defined functions, and views, but not counting triggers, indexes, and constraints that hang off each table.

     

    The chart below shows that on my test workstation, using 6~8 threads in the C# program produced the optimal performance in terms of the scripting elapsed time. When done in a single thread, the performance was the worst. With 6~8 threads, the elapsed time was almost four times shorter than with a single thread, a ~400% improvement. Not bad with a simple change!

     

     

     

    Unfortunately, in 99% of cases I see people scripting the objects in a database via SMO in a single thread. If you drive it with Powershell, I'm not sure if there is a simple way to do multithreading. If you know how to do multithreading in Powershell, please share. 

     

    The reason that I could dramatically speed up the script generation via SMO using multiple threads is that the resources on the workstation and on the server were greatly underutilized with a single thread. Note that when I pushed it hard with 50 threads, overhead on the workstation became a significant drag. But even in that case, it performed much better than a single-thread program.

     

    It is worth reiterating that the specific numbers charted above were obtained on my workstation. In a different setting, the number of threads for the most optimal performance may vary. But I do not expect the overall message of multiple threads leading to shorter script generation time with SMO to change.

      

     


  • Performance impact: UNION ALL views, ANSI_PADDING, and bad query plans

    Whether or not you specify it explicitly, ANSI_PADDING setting is there when you create a table, and can have an impact on the performance of some queries. If you are not careful, it can even hurt performance big time!

     

    Let’s demonstrate that with an extremely simple example.

     

    Assume that you have this simple view with a UNION ALL between two tables:

     

    CREATE VIEW myView

    as

    select * from tb1

    UNION ALL

    select * from tb2

    go

     

    The two tables are created as follows:

     

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING OFF

    GO

     

    -- Create the main table   

    CREATE TABLE tb1 (

         [TicketID]    [char](13),

         [Edition]     [smallint],

         [FlowType]    [char](3),

         [Amount]      [float] NOT NULL,

         [Code]        [char](3) NOT NULL,

         [LogDate]     [datetime] NOT NULL,

         [id]          [int] NOT NULL

    )

    GO

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

     

    CREATE TABLE [tb2](

         [TicketID]    [char](13),

         [Edition]     [smallint] NOT NULL,

         [FlowType]    [char](3),

         [Amount]      [float] NOT NULL,

         [Code]        [char](3) ,

         [LogDate]     [datetime] NOT NULL,

         [id]          [int] NOT NULL

    )

    GO

     

     

    And you run the following query on the view to return a single row from tb1:

     

    SELECT TicketID, Amount

      FROM myView

     WHERE FlowType ='2'

       and TicketID = '200002000'

       and Edition = 0

     

    Since both tb1 and tb2 have a covering index on this query (see the attached repro script for the indexes), you’d expect an efficient index seek. But you may discover that SQL Server chooses to process the query with an index scan, which could easily sink the performance if the table is rather large.

     

    What’s going on?

     

    Well, notice that the only difference between tb1 and tb2 is that tb1 is created with ANSI_PADDING set to OFF whereas tb2 is created with ANSI_PADDING set to ON. If you create these two tables with the same ANSI_PADDING setting, SQL Server does generate a correct plan for the query.

     

    So for some reason, the SQL Server 2005/2008 optimizer can’t deal with this type of mismatched ANSI_PADDING settings in a view intelligently. Although one may question whether the two tables should have different ANSI_PADDING settings, it’s clear that the generated plan is not optimal. And I’d argue that this is a bug in the SQL Server optimizer because there does exist an optimal plan that uses an index seek, and the optimizer fails to find it.

     

    To see the details of the behavior for yourself, you can run the attached repro script in any test database. The repro demonstrates the bad plans on both SQL Server 2005 and SQL Server 2008. On SQL Server 2000, you should see a good plan with an index seek. But this does not mean that SQL Server 2000 is immune to the problem. In fact, with a different but slightly more complex repro script, I see the same bad plans on SQL Server 2000.

     

    Note that to save time, the repro script populates the table with 100,000 rows, and can finish in several seconds on any decent machine. With 100,000 rows, you won’t feel any pain even with a bad plan that does an index scan. If the table had millions of rows, the pain would be quite acute.

     

    Also, I have filed a bug report at Microsoft Connect. Please vote there to get some traction from Microsoft on this issue.

     


  • Bad database practices: abusing linked servers

    In SQL Server, it is rather handy to retrieve data from a different SQL Server instance and use the result locally in another SQL statement for further processing. In theory and in the set purists’ fantasy land, it shouldn’t matter where you get your data or even how you get the data as long as you can use them to further compose a set-based solution.

     

    That is all fine. After all, in a distributed environment the data you want may not be on the same instance where the processing takes place. Well, that is fine until it comes to performance and troubleshooting. If you use linked servers indiscriminately, sooner or later you’ll run into these issues, and they are not pretty.

     

    In the real world, you may have seen different kinds of linked server abuses than what I have seen. For me, two types of abuse stand out rather prominently. (For brevity, I’ll limit the discussions primarily to remote procedure calls, although they apply as well to any other use of linked servers such as openquery and heterogeneous joins.)

     

    The worst kind of linked server abuse is in making an excessive number of linked server hops. Let me clarify what I mean by that. If ProcA makes a call to ProcB via a linked server ServerB and ProcB does not make any further cross server linked server calls, it would be one hop. If ProcB itself makes another one-hop call to ProcC on ServerC, there would be two hops from ProcA to ProcC, and so on and so forth.

     

    Generally speaking, I would consider a two- or three-hop call stretched, but in reality I have seen calls that span four hops. I can only classify that as abusive, even though a lot of times the developer who added the very last hop did not know that the proc he was calling had three hops already. He was just making use of the resultset, a practice perfectly in alignment with the set-based approach.

     

    So what is the problem? The headache lies in performance and troubleshooting.

     

    Although getting data across several server hops does not necessarily mean poor performance, there is no doubt that increasing the number of hops significantly increases the susceptibility to poor performance. Unless you are getting a tiny amount of data, retrieving data across server is relatively expensive, and if that is subject to some implicit loop in an execution plan, your chance of poor performance has just been dramatically magnified. As pointed out in my previous post, you may end up processing data on the more expensive side of the server boundary, and by adding another hop, you only manage to further obscure that danger rather than expose or highlight it. This is especially dangerous when you have layers of views on top of cross-server queries and that view is hidden several hops down stream. You may think you are just doing a local join when in fact you are doing a distributed join.

     

    In addition to real or potential performance problems, excessive number of server hops makes it very difficult to troubleshoot an issue, whether it’s performance or function related. The fact that SQL Server error messages rarely identify which server they come from only help exacerbate the grief. Often, this forces you to have to execute the calls at different levels in order to see where the problem may come from, a task that may be hard to perform in many circumstances. Just getting the right parameters for a call at each level can become time consuming.

     

    The second type of linked server abuse that I consider among the worst is to combine multiple layers (or hops) of linked server calls with dynamic SQL. For instance, to gain flexibility, one may store linked server names in a table, and construct calls dynamically at the run time. This solves the problem of keeping the code stable when one changes the linked server names. But this approach makes it nearly impossible to troubleshoot when there is a problem somewhere in one of the hops because looking at a message you would have no idea where it comes from, and you can’t just scan through the code to spot problematic location. It’s not a pretty situation, and I wouldn’t wish that on anyone.

     

    It is often beyond your control as to why there are so many linked server hops. And once they are there, they can quickly become such a tangled web that any attempt to shrink the number of hops is too painful to be worth any benefits such a successful attempt may bring about. Hopefully, by highlighting the pain one may suffer from the abuse, we can help deter its future practice.

     


  • I didn't know you could do this!

    create proc p_test

    as

      -- multiple resultsets

      select @@servername as ServerName;

      select 'abc';

     

      select srvname from sysservers;

    go

     

    create table #tmp(ServerName sysname)

    go

     

    INSERT #tmp

    EXECUTE p_test;

     

    I did not know that the last statement above (i.e. INSERT EXECUTE) would automatically concatenate the multiple resultsets returned from the procedure p_test and insert them into the target table, and you didn’t have to do UNION ALL explicitly. But it’s a better practice to always include UNION ALL, I think, to communicate the intention explicitly.

     


  • Bad database practices: moving data to procedures vs. moving procedures to data

    Is it better to move data to procedures or move procedures to data?

     

    The answer is, of course, “it depends.” Let’s consider a scenario where you have two SQL Server instances: ServerA and ServerB, and you have a procedure on ServerB (call it procB), but need to access data on ServerA.

     

    Three database solutions are common:

     

    Solution 1:    You can first copy the data from ServerA to ServerB (using any number of technologies such as replication or a simple query via a linked server), and then have procB access the data locally on ServerB, however many times.

    Solution 2:   You can have procB access the data on ServerA from ServerB through a linked server using a mechanism such as a distributed join or an openquery.

    Solution 3:    You can create a procedure procA on ServerA, and have procB make a remote procedure call to procA to retrieve its result, and then further process the returned data in procB on ServerB, if necessary.

     

    In a given application, any of these three solutions may be optimal, depending on the characteristics of that application. In particular, it depends on (1) the amount of data the procedure needs to access, (2) how current the data must be, and (3) how often the procedure needs to access the data.

     

    For instance, if the amount of data on ServerA is large, it is okay for procB to read from a copy of the data that may be slightly or even significantly lagging behind its source, and (3) it needs to be accessed frequently in procB, then some version of Solution 1 can be a good choice.

     

    The key consideration is that moving data, especially a large amount of data, across servers can be expensive, and moving a piece of code is cheap.

     

    It follows that if a process involves moving a large amount of data across server boundary for processing on a different server, it may be cheaper to move the code, or a relevant part of it, to where the data resides, and only send back the result, if the result is significantly smaller than the amount of data needed for processing.

     

    Note that what is considered ‘a large amount’ should not be measured in terms of how many gigabytes, or even how many megabytes. Rather, it should be measured by the relative cost of moving the data versus that of the total process. So for instance, your procedure may be moving only 128KB of data from server A to server B, which generally speaking should be inexpensive, and the whole procedure executes quickly. But if this procedure is called very frequently, moving that 128KB constantly effectively ends up moving ‘a large amount’ of data, and can become rather expensive.

     

    This sounds trivial. But unfortunately, I have seen too many cases in the real world where little thought is given to these considerations when creating procedures that access data across multiple servers.

     

    In some instances, it is a case of abstraction abuse. By that, I mean people sometimes are too easy getting carried away with using other stored procedures and views (i.e. abstraction units), not knowing that they are incurring expensive cross-server data operations, or not exploring alternative options to cut down moving data across servers.

     

    In other cases, programming convenience ends up trumping application performance. For instance, I have seen a case that goes something like the following:

     

    The procedure has a three table join that includes two tables from a linked server and a very small local table. The result is inserted into a temp table. The temp table is further processed in a series of update statements. Some of the updates use yet another very small local table. The final result set from these updates is small, and is used in the rest of the procedure.

     

    It turns out that the two remote tables are relative large, and this procedure is called frequently. In addition, since the distribution join touches a significant proportion of each table of these two remote tables, basically all the rows are copied over locally to process the join.

     

    One way to optimize this procedure is to move the code segment mentioned above to the remote server and wrap it in a procedure there. This basically moves the entire process of preparing the final result set into the procedure on the remote server, and the local procedure now just needs to make a call to this new procedure, gets the small result set and move on with the rest of its processing. This is a perfect example where you want to move your procedures to your data instead of moving your data to your procedures.

     

    So the bad practice is to not think through the trade-offs between moving data to procedures and moving procedures to data when architecting cross server solutions, and end up moving an unnecessary amount of data. 


  • Find the complete call tree for a stored procedure

    Would it be nice to print out the complete call tree of a stored procedure? By complete call tree, I mean the following:

     

    ·              At the very top level, the call tree should identify all the procedures that are called directly.

    ·              For any given level, the next level of the call tree should identify all the procedures called by the procedures at the current level.

    ·              If a procedure calls another procedure on a different server via a linked server, that call should be identified on the call tree.

    ·              The leaf level identifies the procedures that do not call any other procedure.

     

    In other words, the procedure call tree should recursively identify all the dependent procedures that are called either directly or indirectly regardless whether they are on the same instance or a different instance.

     

    I don’t know whether there is any tool out there that would print out such a procedure call tree. But this is not a trivial task if you want to be 100% complete. For instance, if another procedure is called through a piece of dynamic SQL, all bets are off. In that case, what is called will not be known until the run time when the dynamic SQL is executed. That is beyond the reach of any static code analysis.

     

    Another difficulty is that there is no rigorous T-SQL parser that is publicly available you can use to shred the code of a stored procedure to find all the stored procedure calls. Using a rigorous and efficient T-SQL parser would have been the ideal way to accomplish this task.

     

    Fortunately, as long as we are not talking about looking for all the dependencies including called procedures, views, tables, and so on, we can manage to get the procedure call tree with a little bit of regular expressions on the fact a stored procedure is always called with an EXECUTE statement. Although there are syntactic variations, it is rather straightforward to single out all the EXECUTE statements in a piece of T-SQL code, and then parse for the stored procedure names being called.

     

    The attached Perl script does exactly that. If you want to find out what are being called by a procedure, say, p_MyGreatProc in database myDB on server NYSQL, you can run the Perl script as follows:

     

    cmd>perl spCallTree.pl NYSQL.myDB.myOwner.p_Foo > callTree.log

     

    Yes, the script is written to demand a four-part procedure name on the command line on purpose so as to remove any ambiguity, and all the stored procedures in the output are presented in the four-part format for the same reason.

     

    An example of the logged output callTree.log may look like the following:

     

    NYSQL.myDB.myOwner.p_Foo =>

    {

       'NYSQL.db1.dbo.p_Foo1' =>

        {

               'NYSQL.db2.dbo.p_Foo2' =>

               {

                     'PASQL.db1.dbo.p_Foo3' =>

                     {

                           'CASQL.db2.dbo.sp_executesql' => undef

                     },

                     'NJSQL.db3.dbo.sp_executesql' => undef

               },

               'NYSQL.db.dbo.sp_executesql' => undef

        },

        'NYSQL.db1.dbo.p_Foo4' =>

        {

               'NYSQL.db2.dbo.p_Foo5' => undef,

               'NJSQL.db2.dbo.p_Foo6' => undef

        }

    };

     

    In this example, p_Foo calls several procedures directly or indirectly across a bunch of servers, including NYSQL, NJSQL, PASQL, and CASQL, and undef signifies that the procedure does not call any other procedure. So for instance, in the above output, NYSQL.myDB.myOwner.p_Foo calls NYSQL.db1.dbo.p_Foo1, which in turn calls NYSQL.db2.dbo.p_Foo2, which in turn calls PASQL.db1.dbo.p_Foo3, which in turns calls CASQL.db2.dbo.sp_executesql.

     

    In the case of sp_executesql, more procedures may be called inside the dynamic SQL string executed by sp_executesql. But since we can’t evaluate the string, the script does not look further, and simply points it to undef. In case of EXECUTE on a dynamic SQL string variable, the output will simply identify the variable just in case you want to examine it further.

     

    The Perl script callTree.pl finds out the call tree by scanning the code of the top most procedure for any EXECUTE statement after it has removed all the comments and quoted strings. Well, you would not want to include any reference to EXECUTE mentioned in a comment block, right? For each EXECUTE statement, it parses for the name of the procedure being called, including all the four parts (if necessary). It calls sp_helptext to get the code for the called procedure, and then scans the code for any EXECUTE statement. And this continues recursively until no more procedures are called.

     

    I wrote this script fairly quickly as a throw-away script, but have used it many times. I can’t claim that it’s bug free, but so far has worked fine. You do need to have Perl on your machine to run the script. I’ve been using it with Perl v5.10.0, binary release 1003. You can download the latest release of ActivePerl from ActiveState.

     

    If you need to debug a production issue that involves a procedure makes nested procedure calls with some of the calls going across liked servers, you would appreciate the need to see the complete call tree, and therefore, the utility of this little Perl script.

     

    I have seen a fair amount of cross-server nested procedure calls, and they are simply nasty. I definitely have better things to do than manually find out which calls which and where. I’m sure you too. So hopefully you’ll find this little script useful.

     

    And if you have a better or a different way of printing out the complete call tree for a procedure, let me know.

     

     


  • Bad database practices: managing databases without a DBA central inventory

    Aaron Bertrand has been writing an excellent series about Bad habits to kick, highlighting some of the bad practices, primarily, in the areas of T-SQL coding. I’m going to steal his idea and comment on the bad practices I have seen in managing databases.

     

    If we are going to play a phrase association game, the first thing that comes to mind after “bad database practice” is “managing databases without a central DBA inventory or repository”.

     

    It seems to me that there is no need to argue for the necessity of a central inventory of your database assets in order to manage them effectively, just like you need to keep an inventory whether you are Wal-Mart or a corner store. Sadly though, the need for a central DBA database inventory is not universally recognized, and not all shops rigorously keep track of their database assets or take full advantage of the inventory, if they have one. .

     

    I will submit to those who are not maintaining a central DBA inventory diligently and those who are maintaining somewhat an inventory only haphazardly that you can improve your database management significantly if you invest just a bit of time to create and maintain a central DBA inventory and integrate it into your database management practices.

     

    Briefly, a DBA central inventory should have information on all your servers, your databases, and their configurations. This may include information that can be automatically collected such as the service accounts, the startup parameters, the collations, the sp_configure settings, the drive configurations, the filegroups and files, the most recent backups, and so on. It may also include information that you need to enter manually such as the preferred clustered nodes, who owns which database, what app uses which database, the key contact info for each database, the maintenance windows, the status of each database in its lifecycle, and so on.

     

    For the central inventory to be useful, it is absolutely necessary that the information is comprehensive, covering everything about your SQL Server database environment across the enterprise. Essentially, it should be everything in your instances and databases minus actual user data, i.e. all the meta data. So in addition to server instances and databases, you should also record the configured linked servers including their configurations, jobs, database mail configurations.

     

    Since most of the information can be collected automatically, the marginal cost of being comprehensive is very small, but the potential added benefits are enormous. Just consider the case when your server is trashed and you need to rebuild it, but to your horror not all the information for rebuilding the server to its exact pre-crash state is available. You do not want to be in that situation.

     

    You can argue that a lot of meta information about the databases can be obtained by restoring the databases from their backups, and therefore the value of keeping the information in a separate central inventory is not great. Well yes, you can always try to retrieve the information from restoring the backups. It’s just that sometimes you may not have the backup (okay, that really should not happen, but it does happen). On a more practical level, it often takes time to do the restore when it would be much quicker to just query the inventory.

     

    When you do have a central inventory in place and it is being diligently kept up to date, you can effectively drive a lot of your database administration with that central inventory. It will for sure make it that much easier to meet your management reporting needs. You never know what kind combination of aggregations your management may ask of you. Some more typical ones include summary reports on the storage space allocation and usage on various criteria (e.g. by owner, by app, weekly/monthly trends), summary reports on certain severe SQL errors that have become interesting of late across the enterprise, and certain SLA summaries covering all the servers.

     

    A DBA central inventory can also provide much relief to your day-to-day operations. For instance, in addition to whatever monitoring/alerting setup you may have in place, often you may need to do a sanity check of your environment on demand. You can develop the checks for one server, and then just drive the same checks across all the servers with a list of the servers from the central inventory. How about getting a list of servers that are not patched to a certain level? Well, that’s just a simple SQL query away.

     

    Moreover, a well-kept DBA central inventory can serve as an effective communication tool among the DBAs, making the work as a team more consistent. After all, you do not want critical information to be kept in some DBA’s head. While that DBA can do an excellent job, the other DBAs may struggle to get the information right, resulting in wasted time or inconsistent deliveries in meeting the customer request.

     

    And if you are a best practice freak, you’d love to have a DBA central inventory because a lot of best practices can therefore be checked through running some queries. For instance, checking the consistency of certain database configurations across the enterprise is then just a matter of running a query.

     

    Much more can be said about the benefits of maintaining a comprehensive and up-to-date DBA central inventory, and using it to drive many of your database administration tasks.

     

    If you do not currently have such a central inventory in place or are not taking full advantage of it, I hope this post will get you to think about it and perhaps decide to invest some more time in it to boost your database administration to another level.

     


More Posts Next page »

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement