THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Tibor Karaszi

  • Adding a PK online?

    I just read in a forum about a user who want to replikate a table, but the table doesn't have a PK. The table is pretty large, and having the table not available while adding the PK is undesireable. The table has a clustered index already, and there are other columns which are known to be unique (presence of unique indexes).

    What I wanted to test is whether we can just add the PK constraint using the ONLINE option. Show answer is "yes". We can't turn a unique index into a PK using some meta-data only operation, unfortunately. That would be the easiest step. But we can add a unique constraint using the ONLINE option - there's even an example syntax for this in BOL. We can then remove the pre-existing unique index using ONLINE. Since we are using ONLINE, we need to be on Enterprise or Developer Edition.

    I wanted to test this, and below is my test script:

    USE tempdb
    SET NOCOUNT ON
    GO

    IF OBJECT_ID('t'IS NOT NULL DROP TABLE t
    GO
    CREATE TABLE t(c1 INT NOT NULL, c2 CHAR(100))
    CREATE UNIQUE CLUSTERED INDEX ON t(c1)

    INSERT INTO t
     
    SELECT TOP(5000000ROW_NUMBER() OVER(ORDER BY a.id), 'x'
     
    FROM syscolumns AS 
     
    CROSS JOIN syscolumns AS b
     
    CROSS JOIN syscolumns AS c
    GO

    -----------------------------------------------------
    --Now try to add a PK "online"...:
    -----------------------------------------------------

    --Add a nullable identity?
    ALTER TABLE ADD c3 INT IDENTITY NULL
    --Msg 8147, Level 16, State 1, Line 1
    --Could not create IDENTITY attribute on nullable column 'c3', table 't'.
    GO


    --Add a PK using ONLINE?
    --Prepare a new connection with following INSERTs
    --to verify it can run simultaneously:
    --INSERT INTO t(c1, c2) VALUES(5000001, 't')
    --INSERT INTO t(c1, c2) VALUES(5000001, 't')
    --INSERT INTO t(c1, c2) VALUES(5000002, 't')
    --GO
    --INSERT INTO t(c1, c2) VALUES(5000003, 't')

    --Above prepared? OK, execute below and jump to
    --other window to verify it is online
    ALTER TABLE ADD CONSTRAINT PK_t PRIMARY KEY NONCLUSTERED (c1WITH(ONLINE = ON)
    GO

    --Verify the indexes using my own sp_indexinfo
    EXEC sp_indexinfo 't'

  • Is there and overhead to RPC events?

    I recently read a discussion whether RPC events add overhead compared to just submitting text. I got curious and did some testing, which I want to share.

    Background, SQL events
    Using most APIs, if you just submit a query to SQL Server, you will get what we call an SQL event (for instance in Profiler an SQL:BatchCompleted). The client app submits some text, SQL Server parses the text, and either creates an execution pland or re-uses a cached plan (if such exists, based on checksum of the query text submitted). Now, this is very straight forward. But there's also problems, mainly risk for SQL Injection and bad query plan re-use. Your query probably has a WHERE clause, and if you just concatenate the values you will have in the WHERE clause, then you will submit different text each time. Different text means different execution plans - bad plan re-use. Also, if you have some malicious user or are under hacker-attack, then somebody can inject some query or part of query inside what you think would be only the search value (this is a classic).

    Stored procedures as RPC events
    Now, consider if we are using stored procedures. We might want to pass the proc name, and the values for each parameter, where we pass the parameter values as binary data. Do this right (for instance in ADO, you configure your command object's CommandType property as being CommandType.StoredProcedure (instead of CommandType.Text which is default). This causes the batch be shown in Profiler as an RPC event, for instance RPC:Completed. So, consider executing the same procedure either by constructing a string such as "EXEC myProc @p1 = '20080223', @p2 = 'Stockholm'" as text (SQL event), or passing it as a stored procedure using parameter objects (RPC event). We expect RPC event to be more efficient, right? We'll get to that in a moment.

    Queries as RPC events
    Now, we can also submit queries, without using stored procedures, so they are passed as RPC events. We do this by adding at least one parameter to the command object. This causes ADO.NET to use an RPC event and execute sp_executesql, parameterizing accoring to the parameter object for the command object. This causes your code to be safe from SQL injection and will allow for better plan re-use. I.e., in most cases a very good thing - second best to use stored procedures!

    Non-parameterized queries as RPC events
    Now, consider if we don't want to parameterize our queries (we don't have a WHERE clause, we just don't want to, we use some programming layer which doesn't allow for it, or we are just plan curious - like now). Can we still get RPC events? Yes, by adding a dummy parameter to the command object. We don't have to use this command object in the query - just the existence causes an RPC event instead of SQL event. This allow us to compare SQL events to RPC event and look at only the overhead for the sp_executesql part.

    Overhead for using sp_executesql
    Below is some VB.NET code which does just that. For the first two, we use exactly the same query multiple executions and compare timing for doing them as SQL events or RPC events. For the second two, we change what produceID we search for (but not parameterizing the search condition) which causes different execution plans for each execution. I verified by counting execution plans that the first two uses the same execution pland and the second two generates a new plan for each iteration in the loop. Here's the VB.NET code:

    Imports System.Data
    Imports System.Data.SqlClient
    Module Module1

        
    Sub Main()

            
    Dim As Int32iterations As Int32 20000
            
    Dim tickcount As Int64
            
    Dim ProductID As Int32 43664
            
    Dim ProdStatus As Int16
            
    Dim sql As String

            
    sql "SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = "
            
    Dim cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
            
    cn.Open()

            
    Dim utilCmd As New SqlCommand("DBCC FREEPROCCACHE"cn)
            
    utilCmd.ExecuteNonQuery()

            
    'Definitions
            '**********************************************************************************************
            'Same query every execution, SQL:BatchCompleted
            
    Dim cmdAllText As New SqlCommand(sql "43664"cn)
            
    cmdAllText.CommandType CommandType.Text

            
    'Same query every execution, RPC:Completed
            
    Dim cmdParmText As New SqlCommand(sql "43664"cn)
            
    cmdParmText.CommandType CommandType.Text
            cmdParmText.Parameters.Add
    ("@dummy"SqlDbType.Int)
            
    cmdParmText.Parameters("@dummy").Value -9999

            
    'Different query every execution, SQL:BatchCompleted
            
    Dim cmdAllText2 As New SqlCommand()
            
    cmdAllText2.Connection cn
            cmdAllText2.CommandType 
    CommandType.Text

            
    'Different query every execution, RPC:Completed
            
    Dim cmdParmText2 As New SqlCommand()
            
    cmdParmText2.Connection cn
            cmdParmText2.CommandType 
    CommandType.Text
            cmdParmText2.Parameters.Add
    ("@dummy"SqlDbType.Int)
            
    cmdParmText2.Parameters("@dummy").Value -9999



            
    'Execution
            '**********************************************************************************************
            
    Console.WriteLine("Same query every execution, SQL:BatchCompleted")
            
    tickcount Environment.TickCount
            
    For To iterations
                ProdStatus 
    cmdAllText.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)

            
    Console.WriteLine("Same query every execution, RPC:Completed")
            
    tickcount Environment.TickCount
            
    For To iterations
                ProdStatus 
    cmdParmText.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)

            
    Console.WriteLine("Different query every execution, SQL:BatchCompleted")
            
    tickcount Environment.TickCount
            
    For To iterations
                cmdAllText2.CommandText 
    sql i.ToString()
                
    ProdStatus cmdAllText2.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)

            
    Console.WriteLine("Different query every execution, RPC:Completed")
            
    tickcount Environment.TickCount
            
    For To iterations
                cmdParmText2.CommandText 
    sql i.ToString()
                
    ProdStatus cmdParmText2.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)

            
    Console.ReadLine()

        
    End Sub

        Sub 
    PrintOuput(ByVal iterations As Int32ByVal tickcount As Int64)
            
    Console.WriteLine("Number of executions: " iterations.ToString)
            
    Console.WriteLine("Total exec time (ms): " tickcount)
            
    Console.WriteLine("Exec time per query (ms): " tickcount iterations)
            
    Console.WriteLine("")
        
    End Sub


    End 
    Module

    And here's a typical execution result from above code:

    Same query every execution, SQL:BatchCompleted
    Number of executions: 20000
    Total exec time (ms): 1357
    Exec time per query (ms): 0,06785

    Same query every execution, RPC:Completed
    Number of executions: 20000
    Total exec time (ms): 1513
    Exec time per query (ms): 0,07565

    Different query every execution, SQL:BatchCompleted
    Number of executions: 20000
    Total exec time (ms): 2402
    Exec time per query (ms): 0,1201

    Different query every execution, RPC:Completed
    Number of executions: 20000
    Total exec time (ms): 14446
    Exec time per query (ms): 0,7223

    As you can see, there is an overhead to use RPC events and sp_executesql. I.e., if we don't parameterize our command object, then we only pay with no gain. The overhead is definitely noticeable were we generate a new plan for each execution. An interesting aspect is that if I remove the call to DBCC FREEPROCCACHE, (so we get new plan for each iteration in the loop, but plans can be re-used from prior execution of the VB code), then this is much cheaper and closer to the others (about 5.7 seconds instead of 14 seconds). So, we do pay extra for sp_executesql especially when we generate a new plan.

    Is above a weird example?
    Yes, it is, and I want to emphasize that. Typically, you will use sp_executesql when you actually parameterize your queries. That will cause better plan re-use (and protect from SQL injection). Above is only to show whether sp_executesql has a cost - without considering the benefits.

    How about using stored procedures?
    Say you are using stored procedures, and wonder how different ways to execute a procedure compares? Let's have a look. We can do it three ways (probably more, but below are the ones I'm interested in):

    1. Build the proc command and pass it as a string. No parameter objects, no sp_executesql. You are still open to SQL injection. Don't do this, we only want to show it for comparsion reasons. Profiler will show an SQL:BatchCompleted event with TextData something like "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664".
    2. Use parameters for your command objects, but you forget to define the command object of type CommandType.StoredProcedure. Profiler will show an RPC:Completed event with TextData something like "exec sp_executesql N'EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID',N'@theID int',@theID=43664".
    3. The right way. Define the command object as of type CommandType.StoredProcedure, and of course add parameter object to the command object. Profiler will show an RPC:Completed event with TextData something like:
      "declare @p2 tinyint
      set @p2=5
      exec dbo.GetOrderStatusOut @SalesOrderID=43664,@Status=@p2 output
      select @p2"
      Note that the Profiler tool adds the declare, set and select parts, it is not part of what is submitted from the client app. The client app actually submit the paramaters as binary values. Profiler add this to be nice to us, so we can copy that text and execute it, read it, etc.

    Anyhow, here's the VB code:

    Imports System.Data
    Imports System.Data.SqlClient
    Module Module1

        
    Sub Main()

            
    Dim As Int32iterations As Int32 20000
            
    Dim tickcount As Int64
            
    Dim ProductID As Int32 43664
            
    Dim ProdStatus As Int16
            
    Dim sql As String

            Dim 
    cn As New SqlConnection("SERVER=localhost\a;DATABASE=AdventureWorks2008;INTEGRATED SECURITY=TRUE")
            
    cn.Open()

            
    'Create the procedures we will execute
            
    Dim utilCmd As New SqlCommand(""cn)
            
    utilCmd.CommandText "IF OBJECT_ID('dbo.GetOrderStatusRes') IS NOT NULL DROP PROC dbo.GetOrderStatusRes" vbCrLf
            utilCmd.CommandText 
    +"IF OBJECT_ID('dbo.GetOrderStatusOut') IS NOT NULL DROP PROC dbo.GetOrderStatusOut"
            
    utilCmd.ExecuteNonQuery()
            
    utilCmd.CommandText "CREATE PROC dbo.GetOrderStatusRes @SalesOrderID int AS SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID"
            
    utilCmd.ExecuteNonQuery()
            
    utilCmd.CommandText "CREATE PROC dbo.GetOrderStatusOut @SalesOrderID int, @status tinyint OUT AS SET @status = (SELECT status FROM Sales.SalesOrderHeader WHERE SalesOrderID = @SalesOrderID)"
            
    utilCmd.ExecuteNonQuery()

            
    'Run each once so it is in cache
            
    utilCmd.CommandText "EXEC dbo.GetOrderStatusRes @SalesOrderID = 43664"
            
    utilCmd.ExecuteScalar()
            
    utilCmd.CommandText "DECLARE @s tinyint EXEC dbo.GetOrderStatusOut @SalesOrderID = 43664, @status = @s OUT"
            
    utilCmd.ExecuteNonQuery()

            
    utilCmd.CommandText "DBCC FREEPROCCACHE"

            
    'Definitions
            '**********************************************************************************************
            'Exec proc as string, SQL:BatchCompleted
            
    Dim cmdAllText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = " ProductID.ToString(), cn)
            
    cmdAllText.CommandType CommandType.Text

            
    'Parameterize, but still as text, RPC:Completed with sp_executesql
            
    Dim cmdParmText As New SqlCommand("EXEC dbo.GetOrderStatusRes @SalesOrderID = @theID"cn)
            
    cmdParmText.CommandType CommandType.Text
            cmdParmText.Parameters.Add
    ("@theID"SqlDbType.Int)
            
    cmdParmText.Parameters("@theID").Value ProductID

            
    'Parameterize as stored procedure, RPC:Completed
            
    Dim cmdRpc As New SqlCommand("dbo.GetOrderStatusOut"cn)
            
    cmdRpc.CommandType CommandType.StoredProcedure
            
    Dim prmSalesOrderID As SqlParameter = New SqlParameter("@SalesOrderID"SqlDbType.Int)
            
    prmSalesOrderID.Value ProductID
            cmdRpc.Parameters.Add
    (prmSalesOrderID)
            
    Dim prmStatus As SqlParameter = New SqlParameter("@Status"SqlDbType.TinyInt)
            
    prmStatus.Direction ParameterDirection.Output
            
    cmdRpc.Parameters.Add(prmStatus)


            
    'Execution
            '**********************************************************************************************
            
    utilCmd.ExecuteNonQuery()
            
    Console.ReadLine()

            
    Console.WriteLine("Exec proc as string, SQL:BatchCompleted")
            
    tickcount Environment.TickCount
            
    For To iterations
                ProdStatus 
    cmdAllText.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)
            
    Console.ReadLine()

            
    utilCmd.ExecuteNonQuery()
            
    Console.WriteLine("Parameterize, but still as text, RPC:Completed with sp_executesql")
            
    tickcount Environment.TickCount
            
    For To iterations
                ProdStatus 
    cmdParmText.ExecuteScalar()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)
            
    Console.ReadLine()

            
    utilCmd.ExecuteNonQuery()
            
    Console.WriteLine("Parameterize as stored procedure, RPC:Completed")
            
    tickcount Environment.TickCount
            
    For To iterations
                cmdRpc.ExecutteNonQuery
    ()
            
    Next
            
    tickcount Environment.TickCount tickcount
            
    Call PrintOuput(iterationstickcount)

            
    Console.ReadLine()

        
    End Sub
        Sub 
    PrintOuput(ByVal iterations As Int32ByVal tickcount As Int64)
            
    Console.WriteLine("Number of executions: " iterations.ToString)
            
    Console.WriteLine("Total exec time (ms): " tickcount)
            
    Console.WriteLine("Exec time per query (ms): " tickcount iterations)
            
    Console.WriteLine("")
        
    End Sub

    End 
    Module

    And here's the result from a typical execution:

    Exec proc as string, SQL:BatchCompleted
    Number of executions: 20000
    Total exec time (ms): 1810
    Exec time per query (ms): 0,0905

    Parameterize, but still as text, RPC:Completed with sp_executesql
    Number of executions: 20000
    Total exec time (ms): 1700
    Exec time per query (ms): 0,085

    Parameterize as stored procedure, RPC:Completed
    Number of executions: 20000
    Total exec time (ms): 1388
    Exec time per query (ms): 0,0694

    We can see that doing it the "proper" way is cheapest, but there's not a big difference between the three. The first alternative is not good, though, since we aren't protected from SQL injection. And since you then will be using parameter object anyhow, just go ahead and define the CommandType as stored procedure while you're at it.


  • Using sa as owner for jobs and databases

    This blog is not about avoiding logging in using the sa login. Hopefully we all know about this, and work towards avoidning this practice.

    Instead I want to talk about using sa, but not to login (authenticate), but as owner for jobs and databases. I want keep these thing de-individualized - so we avoid things like person A leaving the company and we don't dare to remove that login/Windows account. We can of course create some SQL login or Windows login especially for this purpose and use that. But sa is already there. Another advantage is that sa always has the same sid number (makes moving databases across instances a bit easier).

    The way Agent work is that if the owner is member of sysadmin server role, then it won't attempt any imersonation for the job steps. I.e., Agent won't use SETUSER (2000 and earlier) or EXECUTE AS USER = (2005 or later). This means that Agent will never actually authenticate using sa (Agent will always authenticate using a Windoes authentication - and then verify that it is sysadmin). I.e., we can change password for sa, disable sa, or even run in Windows Only mode.

    And, just to be obvious: If the job should be owned by some individual, in order for operating in a proper security context, then we should use that individual as owner and not sa!

    How do you handle job and database ownership? Do you have situations where the owner does matter, details?


  • Util procedure to show database size

    One of the first things I want to do when I look at a new SQL Server is to get an idea of space usage details for each database, including total space usage. For this I have been using my own sp_db_space_usage for a while now, so I decided to add it to my website (www.karaszi.com). Check it out here.
  • Be careful with constraints calling UDFs

    You might just not get what you think. I would be surprised if this hasn't been blogged already, but if so, it would be worth repeating. Here's the deal (example from a forum,, slightly re-worked):

    I want the values in one column to be unique, assuming the value in another column is 1. Can I use an UDF for that?

    On the surface, yes. You can write an UDF to wich you pass the value which should be conditionally unique and in that UDF check how many rows has this value AND othercolumn = 1. If more than 1 row, then function returns 0, else 1 (or something else to signal "OK" or "Not OK"). Now, you can call this function in a CHECK constraint. Something like CHECK(myFunction(uniqueCol) = 1). this will on the surface do its job, as long as you INSERT into the table. But if you update a row and only set the otherColumn for some row from 0 to 1, then the check constraint will not be checked. The optimizer is smart enough to understand that the update doesn't change anything that we refer to in our CHECK constraint, so why bother checking the constraint? End result here is that the constraint doesn't do what we want it to do. Use a trigger instead (or some other method). Here's a repro:

    USE tempdb
    GO
    IF OBJECT_ID('t'IS NOT NULL DROP TABLE t
    IF OBJECT_ID('t_uq'IS NOT NULL DROP FUNCTION t_uq
    GO

    CREATE TABLE t(c0 INTc1 NVARCHAR(50), c2 bit)
    GO

    CREATE FUNCTION t_uq(@c1 NVARCHAR(50))
    RETURNS bit
    AS
    BEGIN
     DECLARE 
    @ret bit
     
    IF (SELECT COUNT(*) FROM WHERE c1 @c1 AND c2 1) > 1
       
    SET @ret 0
     
    ELSE 
       SET 
    @ret 1
     
    RETURN @ret
    END
    GO

    ALTER TABLE ADD CONSTRAINT t_c CHECK(dbo.t_uq(c11)

    INSERT INTO t(c0c1c2VALUES(1'a'0--OK
    INSERT INTO t(c0c1c2VALUES(2'a'0--OK
    INSERT INTO t(c0c1c2VALUES(3'b'1--OK
    INSERT INTO t(c0c1c2VALUES(4'b'1--Fails

    --So far so good, but watch now:

    UPDATE SET c2 WHERE c0 2
    --No error, the constraint doesn't do its job!

    --We have invalid data:
    SELECT FROM t


  • Ola Hallengren's maint procedures supports striping and Red Gate

    Ola has released a new version of his db maint scripts, which now includes support for striping as well as Red-Gate. Ola updates these scripts now and then, so it is worth cheking out his versions page:

    http://ola.hallengren.com/Versions.html


  • Found it!

    "What exiting item or idea did you find?", you probably think... The answer is: SQL Server nostalgia.

    Some of you might know that in the very early years, there were three companies involved in releasing SQL Server on the "PC" platform. In addition to Sybase and Microsoft, we also had Ashton-Tate, which at that time were market leader for "PC databases" with dBASE. I was involved with Microsoft here in Sweden and I vaguelly recall a meeting with the other involved partners and I was given a tie-clip from Ashton-Tate. The other day, I wondered whether I saved that little gem, and Indeed I did:

    Ashton-Tate tie clip

    How about that for useless piece of information? :-)


  • Do you perform log backup for the model database?

    Are you stupid, you might think... But stop and think for a while. Model is no different from other databases. And by default it is in full recovery model. So as soon as you do your first database backup (you do backup your system databases, right?) the log for model will start filling up and autogrow. "But, hey, I don't do any modifications in model!", you probably say now. Fair, but other things happens in each database from time to time. Bottom line is that ldf file for model will start growing after a while . Perhaps not huge, but I find it "un-neat" to have a model with 3 MB mdf file and 20 MB ldf file.

    Personally I prefer to have model in simple recovery since I feel that is a better default recovey model. An alternative is to regurarly set model in simple recovery and back to full recovery (schduled job).


  • Will EMPTYFILE on primary ldf "doom" it somehow?

    I just read a newgroup question whether doing SHRINKFILE with the EMPTYFILE option for the primary log file somehow cause ill effects.

    Shrinkfile for the ldf will not move any data (log records) or so. For an ldf file it is basically a preparation to tell the engine that you are about to remove this file (ALTER DATABASE ... REMOVE FILE).

    Now, the first (primary) log file is special and cannot be removed. So, what if we do an EMPTYFILE on the primary log file. Will we end up in some limbo-state? I did a test and performed EMPTYFILE on the primary file. Nothing bad happened. I then did EMPTYFILE on the other log file and removed that file successfully. So it seems that this should not cause any havoc. Just pretend you never did that EMPTYFILE operation against the primary log file.

    It isn't doable to create a repro script which show shrinking and removing nf log files.It will require some engagement for you. The reason is that we never know from what file and where the nect virtual log file comes from. So, if you are about to run below, be prepared to read up on DBCC LOGINFO and other command, understand what VLF is, perhaps some operation need to be done everal times before what we expect will happen... And as always, use at own risk.

    --Drop and create database named x
    SET NOCOUNT ON
    USE 
    master
    IF DB_ID('x'IS NOT NULL DROP DATABASE x
    GO
    CREATE DATABASE [x] ON  PRIMARY 
    NAME N'x'FILENAME N'C:\DemoDatabases\DbFiles\a\x.mdf' SIZE 10MBFILEGROWTH 3MB )
     
    LOG ON 
     
    NAME N'x_log'FILENAME N'C:\DemoDatabases\DbFiles\a\x_log.ldf' SIZE 2MB FILEGROWTH 1MB)
    ,( 
    NAME N'x_log2'FILENAME N'C:\DemoDatabases\DbFiles\a\x_log.ldf2' SIZE 2MB FILEGROWTH 1MB)
    GO

    --Get the database out of "auto-truncate" mode.
    ALTER DATABASE SET RECOVERY FULL
    BACKUP DATABASE 
    TO DISK = 'nul'

    --Fill up the log some
    USE x
    CREATE TABLE t(c1 INT IDENTITYc2 CHAR(300DEFAULT 'a')
    GO
    INSERT INTO DEFAULT VALUES
    DELETE FROM 
    t
    GO 2000

    --Investigate log
    DBCC SQLPERF(logspace)
    DBCC LOGINFO

    --Empty log
    BACKUP LOG TO DISK = 'nul'

    --Investigate log
    DBCC SQLPERF(logspace)
    DBCC LOGINFO

    --"Empty" primary log file
    DBCC SHRINKFILE(2EMPTYFILE)

    --Investigate log
    DBCC SQLPERF(logspace)
    DBCC LOGINFO

    --Fill up the log some
    INSERT INTO DEFAULT VALUES
    DELETE FROM 
    t
    GO 2000

    --Investigate log
    DBCC SQLPERF(logspace)
    DBCC LOGINFO

    -- Do above several times and see
    -- that 2 is still allocated from...

    --Can we get rid of file 3?
    BACKUP LOG TO DISK = 'nul'
    DBCC SHRINKFILE(3EMPTYFILE)
    --We might need to do above a few times 
    --until 3 is "clean" - no used VLFs

    --Investigate log
    DBCC SQLPERF(logspace)
    DBCC LOGINFO

    ALTER DATABASE x REMOVE FILE x_log2

    --Might need to do some stuff to get rid of file physically
    CHECKPOINT
    BACKUP 
    LOG TO DISK = 'nul'
    GO
    SELECT FROM sys.database_files


  • SQL Server Agent jobs and user contexts

    In what user context does a job run? I recently found myself in a forum discussion and gave my stock reply, later realizing that I haven't actually tested this for a long time (I used to demo this in class during 6.5 courses - when we actually had time for slight diversions). Lets start with my assumptions:

    1. Job owned by sysadmin, TSQL jobsteps:
      Agent log in to SQL Server using its own credential (windows authentication from service account) and execute the TSQL commands - no impersonation performed.
    2. Job owned by sysadmin, other jobsteps:
      Agent starts a process using its service account - no impersonation performed.
    3. Job owned by non-sysadmin, TSQL jobstep:
      Agent log in to SQL Server using its own credential (windows authentication from service account), then uses EXECUTE AS LOGIN = '<job_owner_login>' to "impersonate" the login who owns the job. I.e., the TSQL commands will be executed in the security context of the job owner's login.
    4. Job owned by non-sysadmin, other jobsteps:
      Agent starts a process using the service account as specified by the Agent proxy selected for this jobstep. (See Books Online "sp_add_proxy" and "CREATE CREDENTIAL" and this for more information about Agent Proxies and credentials in SQL Server.)

    So, how to prove above? Lets use a combination of jobstep output file and a Profiler trace. The ouput files will be used to catch the login/user names, and the Profiler trace to verify what TSQL commands will be submitted by Agent (for TSQL jobstep). For the TSQL jobsteps, we execute:

    SET NOCOUNT ON
    SELECT 
     
    CAST(ORIGINAL_LOGIN() AS VARCHAR(20)) AS Original_login
    ,
    CAST(SUSER_SNAME() AS VARCHAR(20)) AS Effective_login
    ,
    CAST(USER_NAME() AS VARCHAR(20)) AS Db_user
     

    And to test the other jobsteps, where we will be using a CmdExec jobstep as example (we are interested in what Windows account the process will be started as), and use whoami.exe which returns the windows user name. Each jobstep is configured to output the result to a file.

    • I change my real machine name to "Machine" in this text.
    • I'm not in a domain.
    • I'm logged in interactively (to Windows) as Windows account "Tibor".
    • I'm logged in to SQL Server using Windows authentication.
    • I'm sysadmin.
    • My service account for both Agent and SQL Server services is "SQLService".

    Here's the result from having the job owned by me (being sysadmin):

    Original_login       Effective_login       Db_user            
    -------------------- -------------------- --------------------
    MACHINE\SqlService   MACHINE\SqlService   dbo

    machine\sqlservice

    As you see, no attempted impersonation performed. Steps 1 - 2 in above list confirmed.

    We now going to test this for somebody who isn't sysadmin:

    • Create SQL Server login named "Kalle"
    • Create user in msdb for above
    • Assign Kalle permission to create jobs (SQLAgentUser role in msdb)
    • Create account "MySqlProxy" in Windows.
    • Creade credential "c_MySqlProxy" in SQL Server for above.
    • Create proxy "p_MySqlProxy" for above, and allow login "Kalle" to use this for CmdExec jobsteps.

    Now we modify the job and set the owner to Kalle, and also specify for the CmdExec jobstep to use the proxy p_MySqlProxy. Also, since non-sysadmins can't use output files (I didn't know that), we instead configure each step's output to go to table. Execute job. No, finally we can check each job step output for result:

    Original_login       Effective_login       Db_user            
    -------------------- -------------------- --------------------
    MACHINE\SqlService   Kalle                guest

    machine\mysqlproxy

    The first section above confirms that Agent uses it's own login, but then changes login context to Kalle (verified by Profiler trace capturing the "EXECUTE AS LOGIN = N'Kalle' WITH NO REVERT" command). The job step was configured to run in the master database; since I didn't add Kalle as a user to master, you see the user name guest.

    The second section verifies that The CmdExec step started a process using the mysqlproxy windows account, and specified using our SQL Server Agent Proxy and SQL Server credential object.

    (I will not reply to support questions here. For support, I recommend you visit a forum, for instance this. Other discussions, comments and corrections are of course very welcome!)

    Finally, I scripted out the job (as owned by me - being sysadmin), if you want to re-create it and play with it. Use at own risk, do apropriate modifications etc.

    USE [msdb]
    GO

    /****** Object:  Job [CredTest]    Script Date: 09/19/2009 12:12:47 ******/
    BEGIN TRANSACTION
    DECLARE 
    @ReturnCode INT
    SELECT 
    @ReturnCode 0
    /****** Object:  JobCategory [[Uncategorized (Local)]]]    Script Date: 09/19/2009 12:12:47 ******/
    IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
    BEGIN
    EXEC 
    @ReturnCode msdb.dbo.sp_add_category @class=N'JOB'@type=N'LOCAL'@name=N'[Uncategorized (Local)]'
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback

    END

    DECLARE 
    @jobId BINARY(16)
    EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'CredTest'
           
    @enabled=1
           
    @notify_level_eventlog=0
           
    @notify_level_email=0
           
    @notify_level_netsend=0
           
    @notify_level_page=0
           
    @delete_level=0
           
    @description=N'No description available.'
           
    @category_name=N'[Uncategorized (Local)]'
           
    @owner_login_name=N'MACHINE\Tibor'@job_id @jobId OUTPUT
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback
    /****** Object:  Step [A_TSQL]    Script Date: 09/19/2009 12:12:48 ******/
    EXEC @ReturnCode msdb.dbo.sp_add_jobstep @job_id=@jobId@step_name=N'A_TSQL'
           
    @step_id=1
           
    @cmdexec_success_code=0
           
    @on_success_action=3
           
    @on_success_step_id=0
           
    @on_fail_action=2
           
    @on_fail_step_id=0
           
    @retry_attempts=0
           
    @retry_interval=0
           
    @os_run_priority=0@subsystem=N'TSQL'
           
    @command=N'SET NOCOUNT ON 
    SELECT  
     CAST(ORIGINAL_LOGIN() AS VARCHAR(20)) AS Original_login
    ,CAST(SUSER_SNAME() AS VARCHAR(20)) AS Effective_user
    ,CAST(USER_NAME() AS VARCHAR(20)) AS Db_user'

           
    @database_name=N'master'
           
    @flags=8
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback
    /****** Object:  Step [B_OS]    Script Date: 09/19/2009 12:12:48 ******/
    EXEC @ReturnCode msdb.dbo.sp_add_jobstep @job_id=@jobId@step_name=N'B_OS'
           
    @step_id=2
           
    @cmdexec_success_code=0
           
    @on_success_action=1
           
    @on_success_step_id=0
           
    @on_fail_action=2
           
    @on_fail_step_id=0
           
    @retry_attempts=0
           
    @retry_interval=0
           
    @os_run_priority=0@subsystem=N'CmdExec'
           
    @command=N'whoami.exe'
           
    @flags=16
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback
    EXEC @ReturnCode msdb.dbo.sp_update_job @job_id @jobId@start_step_id 1
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback
    EXEC @ReturnCode msdb.dbo.sp_add_jobserver @job_id @jobId@server_name N'(local)'
    IF (@@ERROR <> OR @ReturnCode <> 0GOTO QuitWithRollback
    COMMIT TRANSACTION
    GOTO 
    EndSave
    QuitWithRollback:
        
    IF (@@TRANCOUNT 0ROLLBACK TRANSACTION
    EndSave:

    GO

  • SQL Client config, 32 and 64 bit

    Say you want to change the preferred netlib connection order. Or add a server alias.

    You can do this using the "SQL Server Configuration Manager" program. But installing this on each client machine where you want to do the modification might not feel that attractive.

    Another option is to use a tool like regmon while doing the config on a reference machine, sniff the registry modifications and then shoot these out to the client machines. This might be overkill, though.

    Yet another option is to use the cliconfg.exe tool, which ship with Windows. This tool is already available on your machine. However, on a 64 bit machine, you need to consider whether the client app is a 32 or 64 bit app. The processor architectore for that app will determine where in the registry the app (client libraries) will look. Below is my results from working on a x64 XP installation.
    64 bit (native): HKLM\SOFTWARE\Microsoft\MSSQLServer\Client\SuperSocketNetLib
    32 bit (WOW): HKLM\SOFTWARE\WOW6432Node\Microsoft\MSSQLServer\Client\SuperSocketNetLib

    Razvan Socol was kind enough to enlighten me, a while ago, of the fact that runnig the 64 bit version of cliconfg.exe will modify in the "64 bit" registry entry (as per above) and vice versa for the 32 bit version of cliconfg.exe. Razvan mentioned that starting cliconfg.exe from Run will start the 64 bit version, and from a 32 bit app (Explorer for instance - which I couldn't find how to do, but I'm sure somebody will enlighten me) will start the 32 bit version.

    Above made me wonder in what folder each file is. Here are my findings (on the test machine I was using - a pretty clean XP x64 machine):

    64 bit version of cliconfg.exe: C:\Windows\System32
    32 bit version of cliconfg.exe: C:\Windows\SysWOW64

    (And, before you ask, no, above is not a typo. There is some logic behind this. :-) )


  • Heaps, forwarding pointers, ALTER TABLE REBUILD and non-clustered indexes

    Let's start with some background on forwarding pointers:

    Forwarding pointers in heaps can be a mess to get rid of. A forwarding pointer happens when you modify a row so that the row doesn't fit on the same page anymore. SQL Server moves the row to a new page and leaves a forwarding pointer where the old row used to be. This means that non-clustered indexes are not affected by the moving of the row - it can still point to the old location. So the modification is less costly than if all non-clustered indexes needed to reflect the row move. But having forwarding pointers can be bad for performance when you read data. A perhaps less known fact is that a scan over a table needs to follow forwarding pointers - essentially "jumping back and forth" a lot if you have many forwarding pointers. That can be really bad for performance (if you have table scans, of course). So, how do we get rid of forwarding pointers? Well, we can shrink the database file, but that is a little like curing a headache by jumping into a big meat grinder. We can export all data and re-import it - not very practical.

    Greg Linwood reminded me that in SQL Server 2008, we can do ALTER TABLE ... REBUILD. Now, I knew about this option, and every time I tell about it in class I've been thinking silently for myself "I need to test whether this is a way to get rid of fwd pointers". (You generally talk about ALTER TABLE ... REBUILD when you talk about enabling compression on a heap.) So, doing a REBUILD of a table using ALTER TABLE sounds promising. Will it get rid of forwarding pointers? Will it also rebuild all non-clustered indexes?

    Quick answer for those who don't care reading the TSQL script:
    ALTER TABLE ... REBUILD will remove forwarding pointers, but for some strange reason it will also rebuild all non-clustered indexes on that table.

    See the TSQL code below. It was adapted from a challenge by Linchi Shea to produce a data loading script resulting in worst performance (where I immediately thought of forwarding pointers). See for instance http://sqlblog.com/blogs/linchi_shea/archive/2009/06/07/t-sql-exercise-simplest-data-loading-script-to-produce-worst-query-performance-june-7th-update.aspx.

    USE tempdb
    GO

    IF EXISTS (SELECT FROM sysobjects WHERE name 'test')
       
    DROP TABLE test
    GO
    CREATE TABLE test (
       
    INT NOT NULL
      ,
    x2 INT NOT NULL
      ,
    CHAR(10) NOT NULL DEFAULT ('')
      ,
    CHAR(10) NOT NULL DEFAULT('')
    )
     
    DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT
     
    INSERT 
    test (xx2)
    SELECT TOP(@rows)
           
    ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS r
          
    ,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) % 10 AS s
      
    FROM master..spt_values a CROSS JOIN master..spt_values b
    GO
    ALTER TABLE test ALTER COLUMN CHAR(892)
    ALTER TABLE test ALTER COLUMN CHAR(100)
    GO
     
    DECLARE @rows INT = 666666@toKeep INT = 400000@diff INT
    DELETE TOP
    (@rows @toKeep
      
    FROM test WHERE x2 IN(2468)
    GO

    CREATE INDEX x1 ON test(x)
    CREATE INDEX x2 ON test(x2)

    SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
    --First  run (no nc indexes ): 387157 fwd records (out of 400000 rows), 96104 pages
    --Second run (two nc indexes): 387157 fwd records (out of 400000 rows), 96105 pages

    CHECKPOINT
    DBCC 
    DROPCLEANBUFFERS
    DECLARE @t time SYSDATETIME()
    ALTER TABLE test REBUILD
    SELECT DATEDIFF(ms@tCAST(SYSDATETIME() AS time))
    --First run, no non-clustered indexes, three subsequent executions (ms): 19351, 20683, 20275
    --Second run, with two non-clustered indexes, three subsequent executions (ms): 31803, 35065, 37511

    SELECT FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('test'), NULL, NULL, 'DETAILED')
    --First  run, heap = "index" 0 (no nc indexes ): 0 fwd records, 50002 pages
    --Second run, heap = "index" 0 (two nc indexes): 0 fwd records, 50003 pages
    --But: avg_page_space_used_in_percent changes for the nc indexes after the rebuild.
    --That I take as nc indexes are also rebuilt.


  • Do maintenance plans require SSIS?

    To many, this is a non-issue, since SSIS is installed anyhow. But not everyody installs SSIS. I for instance prefer to keep my production systems as clean as possible and only have what I really need (a principle which is harder and harder to live after as years go by...). Also, not all Editions of SQL Server comes with SSIS.

    I did a test some months ago for SQL Server 2005 with a recent sp and also on SQL Server 2008. SQL Server 2008 did require SSIS (I tested both without and with SSIS installed), where 2005 sp2 didn't. I recently learned from Phil Brammer in MVP group that there has been progress. Here's the run-down, you don't need SSIS installed to execute maint plans:

    SQL Server 2005 requires sp2.

    SQL Server 2008 requires CU3 or sp1 (see http://support.microsoft.com/kb/961126/). I dodn't test this combo (2008 sp1 without SSIS), but I have no reason to doubt that KB article.


  • Watch out for that autogrow bug

    Under some circumstances, autogrow for database files can be set to some 12000 percent. I think this is limited to SQL Server 2005 and for databases upgraded from SQL Server 2000 (I didn't bother to search - feel free to comment if you know). So, if you have a reasonably sized database and autogrow kicks in, you can do the maths and realize that pretty soon you are out of disk space.

    I wrote a proc that I schedule that check for out-of-bounds values in sys.database files. The proc generates a bunch of messages it prints (handy if you have as Agent job with output file) and also constructs an error message and does RAISERROR (handy if you implemented alerting, for instance according to http://www.karaszi.com/SQLServer/util_agent_alerts.asp).

    I prefer to schedule below as Agent job and use Agent alerts to notify me if we do have db with autogrow out-of-whack. As always, don't use code if you don't understand it.

    USE maint
    GO

    IF OBJECT_ID('check_autogrow_not_percent'IS NOT NULL DROP PROC check_autogrow_not_percent
    GO

    CREATE PROC check_autogrow_not_percent
    AS
    DECLARE 
     
    @db sysname
    ,@sql NVARCHAR(2000)
    ,
    @file_logical_name sysname
    ,@file_phyname NVARCHAR(260
    ,
    @growth VARCHAR(20)
    ,
    @did_exist bit
    ,@msg NVARCHAR(1800)
    ,
    @database_list NVARCHAR(1000)

    SET @did_exist CAST(AS bit)
    SET @database_list ''

    --For each database
    DECLARE dbs CURSOR FOR
     SELECT 
    name FROM sys.databases
    OPEN dbs
    WHILE 1
    BEGIN
      FETCH 
    NEXT FROM dbs INTO @db
      
    IF @@FETCH_STATUS <> BREAK

      SET 
    @sql 'DECLARE files CURSOR FOR
     SELECT CAST(growth AS varchar(20)), physical_name, name
     FROM ' 
    QUOTENAME(@db) + '.sys.database_files
     WHERE is_percent_growth = 1
     AND growth > 20'
      
    EXEC(@sql)
      
    OPEN files
      
    WHILE 1
      
    BEGIN
        FETCH 
    NEXT FROM files INTO @growth@file_phyname@file_logical_name
        
    IF @@FETCH_STATUS <> BREAK
        SET 
    @did_exist CAST(AS bit)
        
    SET @database_list @database_list '["' @db '": "' @file_logical_name '"]' CHAR(13) + CHAR(10)
       
    SET @msg 'Out-of-band autogrow in database "' @db '"' 
       
    ' with growth of ' @growth +
       
    ', logical file name "' @file_logical_name '"' 
       
    ', physical file name "' @file_phyname '"' 
       
    '.'
       
    RAISERROR(@msg101WITH NOWAIT
      
    END
      CLOSE 
    files
      
    DEALLOCATE files
    END
    CLOSE 
    dbs
    DEALLOCATE dbs
    IF @did_exist CAST(AS bit)
      
    BEGIN
       SET 
    @msg 'Databases with out-of-control autogrow in databases: ' CHAR(13) + CHAR(10) + @database_list
       
    RAISERROR(@msg161WITH LOG
      
    END
    GO

  • Spooky: What do you connect to?

    I only recently discovered that SSMS will connect to different things. For instance, press the "New query" button. What were you connected to? The answer is the same server as your "current" server. But what is the current server? It is the server where you happened to have focus when the pressed the "New query" button. So, can you say whether you had focus in a query window, Object Exporer or Registered Servers?

    This also applies to when you double-click a .sql file. And it doesn't stop there. Open the "Registered Servers" window. Now, click on a server group. Go to explorer and double-click a .sql file. What were you connected to? Yes, all the servers in that group. Now, don't get me wrong here; the ability to open the same query window against several servers can be a very useful thing. What I had no idea until just about now is how easily thsi can happen by mistake. Just by cklicking the New Query window, or even double-clicking an .sql file. So - be aware...

    (FYI: SSMS 2005 doesn't seem to do this for clicking a file in explorer, and the functionality to have a query window against several server didn't exist in SSMS 2005...)


More Posts Next page »
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement