THE SQL Server Blog Spot on the Web


Tibor Karaszi

  • Send email after Agent job execution

    After executing a job, I want to have an email with the following information:

    • Success or failure in the mail subject (not just in the body)
    • Success or failure and execution time for each job step in mail body
    • Have output file(s) for the job steps attached to email

    Agent has none of the above built in, so I created my own procedure for this. It is meant to be added as a final job step for your jobs. You find all details and the procedure here: Feedback is of course very welcome.
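    The procedure itself is linked above. As a rough sketch of the idea (the job name, mail profile and recipient below are placeholders, and for brevity this doesn't filter the history down to just the most recent execution), the core is to read the job's step outcomes from msdb and send them with sp_send_dbmail:

    ```sql
    -- Sketch only: 'MyJob', 'MyMailProfile' and the recipient are placeholders.
    DECLARE @subject NVARCHAR(255), @body NVARCHAR(MAX);

    -- Success or failure in the subject, based on the steps' outcomes
    SELECT @subject = 'Job MyJob: '
         + CASE WHEN MIN(h.run_status) = 1 THEN 'Succeeded' ELSE 'FAILED' END
    FROM msdb.dbo.sysjobhistory AS h
    JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
    WHERE j.name = 'MyJob' AND h.step_id > 0;

    -- One line per job step: outcome and execution time
    SELECT @body = ISNULL(@body, '')
         + 'Step ' + CAST(h.step_id AS VARCHAR(10)) + ' (' + h.step_name + '): '
         + CASE h.run_status WHEN 1 THEN 'Succeeded' ELSE 'Failed' END
         + ', duration (hhmmss): ' + CAST(h.run_duration AS VARCHAR(10))
         + CHAR(13) + CHAR(10)
    FROM msdb.dbo.sysjobhistory AS h
    JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
    WHERE j.name = 'MyJob' AND h.step_id > 0;

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = 'MyMailProfile',
        @recipients   = 'dba@example.com',
        @subject      = @subject,
        @body         = @body;
    ```

    (sp_send_dbmail's @file_attachments parameter is what you would use for attaching the job step output files.)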

  • SQL Live Monitor

    I just found this one out there and wanted to share it. It connects to an instance and shows you a bunch of figures. Nothing you can't extract yourself with SQL queries, but sometimes it is just nice to have one tool which is very easy to use. Here's what it looks like when connecting to an instance with no load on it:

    SQL Live Monitor

    As you can see, there are some hyperlinked pages as well, and there are also some interesting options (like logging to CSV or for PAL analysis) under the "Option" button. One more thing I liked about it is that there is no installation, just an exe file to download and run. Here is where you find it:

  • Who owns your jobs?

    Yep, another Agent blog. The topic of job owners has been discussed here before, but this is a specific situation which potentially is a ticking bomb. First a quick recap of how Agent does things:

    When a job is owned by somebody who is sysadmin, then no impersonation is performed.

    • T-SQL job steps will be executed as Agent's own Windows authenticated login.
    • Other job steps (which operate at the OS level) are executed in Agent's service account's context.

    The above assumes you didn't ask for impersonation for the job step (Run As).

    Last week, at a SQL Server admin class, Andreas Jarbéus from Swedish Match asked about jobs owned by a Windows account and what happens when that person leaves the company. I was about to say "no problem, if the job owner is sysadmin", but I felt there was a bit more to it. The scenario is:

    • The job owner is member of a Windows/AD group. Say you have a "SQL Server DBA" group in your AD.
    • This group is added as a login to your SQL Servers. The persons in that group do not have individual logins in SQL Server - only through this group.
    • This group is sysadmin. (I'm not saying that having jobs in general owned by sysadmins is a best practice. You can for instance have vendors installing some software and you don't want to make them sysadmin. In such a case, clearly we don't want the job to keep running if that person would disappear. So, I'm targeting jobs owned by sysadmin here, and I would bet you have bunch of those.)
    • Somebody in this group creates an Agent job.
    • The person in the group leaves the company and is removed from this Windows/AD group.

    Above is a pretty common scenario. The thing now is that the job no longer works. And here's the reason:

    If the owner was somebody in Windows/AD whose SID exists in SQL Server (the individual was added as a login to SQL Server), then the job will still work. This also applies to SQL logins which might be disabled (like "sa" for instance). The reason for this is that the SID for the job owner exists in sys.server_principals and SQL Server will know that this SID is sysadmin. Since Agent won't do impersonation for sysadmins, there's no need to "go out to Windows" and see if this SID exists in Windows/AD. Remove the login, and the job will stop working, though - but at least you have then done something explicit inside your SQL Server to cause this (removed the login).

    But when the owner's SID doesn't exist in sys.server_principals, we have a problem. The problem is that the recorded owner of the job is the SID for the Windows user, even though that SID doesn't exist in sys.server_principals. As long as the SID still exists in Windows/AD, the job will work just fine. Agent will see that the job is owned by this SID, ask Windows who the SID is, and see that it is a member of a group which exists as a login in SQL Server and that this login is sysadmin. Now, imagine what happens if the SID doesn't exist in Windows anymore. The job is owned by this unresolvable SID, and that is all that SQL Server knows. The SID no longer exists in Windows/AD, so there's nobody to tell SQL Server "Yeah, this used to be a sysadmin in SQL Server - through a group membership in Windows/AD".

    Even worse, the job seems to keep working until you restart the SQL Server service - there's some caching going on. That makes it even more difficult to determine why the job suddenly stopped working: "We deleted this Windows account 4.5 months ago." This is easy to repro, and you don't even need a domain (local Windows accounts and groups work just fine):

    • Create a Windows group for this.
    • Create a Windows account for this.
    • Add the Windows account to above group.
    • Add above group as a login to SQL Server.
    • Make this group sysadmin.
    • Create a job, with two job steps:
      • T-SQL which executes: PRINT ORIGINAL_LOGIN() + ' | ' + SUSER_SNAME() + ' | ' + USER_NAME()
      • CmdExec which executes: WHOAMI.EXE
      • (The above isn't really necessary, but useful for testing other impersonation aspects of Agent)
    • Specify the Windows account as owner of the job.
    • Run the job, make sure it works fine.
    • Remove the Windows account from the Windows group.
    • Run the job and see it fail.

    I'm sure this has been blogged about before, but I was a bit surprised by it. I have now added a check for this in my "Check best practices" stored procedure, which I run every day... On the flip side, this doesn't seem to happen for multi-server (MSX) jobs; ownership seems to have special handling for these - you will see it if you check the owner_sid in sysjobs.
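    A minimal check for this situation (a sketch, built on the catalog views mentioned above) is to list jobs whose owner SID has no matching row in sys.server_principals:

    ```sql
    -- Jobs whose owner's SID has no login in SQL Server.
    -- These depend on Windows/AD resolving the SID - remove it there and the job breaks.
    SELECT j.name AS job_name
          ,SUSER_SNAME(j.owner_sid) AS owner_name -- NULL if Windows can't resolve the SID either
    FROM msdb.dbo.sysjobs AS j
    LEFT JOIN sys.server_principals AS p ON p.sid = j.owner_sid
    WHERE p.sid IS NULL;
    ```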

  • Agent Alerts Management Pack updated

    I have updated the Agent Alerts Management Pack and cleaned up the T-SQL code which creates the alerts. You can now just copy, paste and execute that code. Earlier, for some of the alert definitions, the code just generated the calls to the query result window, and you had to copy these to a query window and execute them. Now you just copy the code and execute it. You find it here:
  • Ola Hallengren's maint procedures now supports logging to table

    This can for instance be very useful if you want to keep track of how long things take, over time.

    Check them out here. Version history here.

  • Article released about moving databases

    Just a short notice that I've released an article about moving databases between SQL Server instances: You find it at
  • Restricting memory using Resource Governor

    You might know that Resource Governor (RG) can be used to affect resource allocation inside SQL Server. RG was introduced with SQL Server 2008 and requires Enterprise Edition or similar/higher. There are two things you can govern using RG: CPU and memory.

    For introductory information on RG, see for instance these BOL topics.

    A misconception I see from time to time is that the memory setting is there to restrict the Buffer Pool, BP (mainly used for caching pages from disk). This would be difficult, since a page in the BP has a cost, but isn't owned by whoever brought it into cache or by whoever referenced it last. So, with the current implementation of cache handling and ageing, it would be difficult for RG to govern BP memory. What RG does govern is "working memory" for a query. Think memory used for sorting, locks, hashing and such.

    We just had such a discussion in the MSDN forums, and I decided to do a test to see if we could show that RG does not govern the BP. Brief outline of the script:

    • Restrict the BP to 300 MB
    • Create two databases (a and b)
    • Create a table with size of 255MB in each database
    • Create two logins with name a and b, which will be used to execute queries
    • The logins end up with two different resource pools (aPool and bPool), where aPool has max mem 80% and bPool has max mem 20%
    • A loop which generates queries that are executed using xp_cmdshell and SQLCMD
    • The loop first scans over the first approx 9% of the data, first as login a in database a, then as login b in database b
    • The next iteration is then done over the following 9% of the data
    • After 11 iterations, we've covered most of the data
    • Finally check how much memory in the BP each database is using

    If RG were to govern the BP, we would expect to see database a using lots more pages than database b. That is not what we are seeing. You will probably see that database b is using slightly more memory than a. That is because we have done exactly the same queries over exactly the same data for each database, and the most recently queried database was b (pushing out some of the pages with the same cost that database a had in cache). You might want to execute the loop a few times to possibly see things even out a bit.

    Seeing is believing, they say, and to me this shows that Resource Governor does not govern the Buffer Pool.

    If you believe there are faults in how the script does things, feel free to discuss. If you have some alternative script you want us to try, I suggest you open a thread in the MSDN forum (for example) and post a script there which is pretty much immediately executable for us (possibly with modifications of file path and server name) - unless the changes you propose are so minor that they can be explained directly in your text.

    Use script at own risk.


    --Configure Buffer Pool for max 300 MB memory ('show advanced options' needs to be 1)
    EXEC sp_configure 'max server memory', 300
    RECONFIGURE

    --Drop and create databases a and b
    --(The DROP/CREATE DATABASE statements were reconstructed; adjust file paths to your system)
    IF DB_ID('a') IS NOT NULL DROP DATABASE a
    CREATE DATABASE a ON
    (NAME = N'a', FILENAME = N'R:\a.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )
    LOG ON
    ( NAME = N'a_log', FILENAME = N'R:\a_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)

    IF DB_ID('b') IS NOT NULL DROP DATABASE b
    CREATE DATABASE b ON
    (NAME = N'b', FILENAME = N'R:\b.mdf' , SIZE = 250MB , FILEGROWTH = 5MB )
    LOG ON
    ( NAME = N'b_log', FILENAME = N'R:\b_log.ldf' , SIZE = 30MB , FILEGROWTH = 5MB)

    --Create tables and populate with data, approx 250 MB each
    --(Table definition reconstructed to match the queries below; 25 x 90,000 = 2,250,000 rows)
    USE a
    CREATE TABLE t(c1 INT IDENTITY(1,1) PRIMARY KEY, c2 INT, c3 CHAR(100))
    USE b
    CREATE TABLE t(c1 INT IDENTITY(1,1) PRIMARY KEY, c2 INT, c3 CHAR(100))
    GO
    INSERT INTO a..t (c2, c3)
    SELECT TOP(90000) 1, 'hello'
    FROM sys.columns AS a CROSS JOIN sys.columns AS b
    GO 25
    INSERT INTO b..t (c2, c3)
    SELECT TOP(90000) 1, 'hello'
    FROM sys.columns AS a CROSS JOIN sys.columns AS b
    GO 25


    --Create logins a and b and give them access to their respective database
    --(reconstructed; the passwords match the SQLCMD calls below)
    CREATE LOGIN a WITH PASSWORD = 'pwd', CHECK_POLICY = OFF
    CREATE LOGIN b WITH PASSWORD = 'pwd', CHECK_POLICY = OFF

    USE a
    CREATE USER a
    GRANT SELECT ON t TO a

    USE b
    CREATE USER b
    GRANT SELECT ON t TO b

    --Configure RG:
    IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'aGroup')
      DROP WORKLOAD GROUP aGroup
    IF EXISTS(SELECT * FROM sys.resource_governor_workload_groups WHERE name = 'bGroup')
      DROP WORKLOAD GROUP bGroup
    IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'aPool')
      DROP RESOURCE POOL aPool
    IF EXISTS(SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'bPool')
      DROP RESOURCE POOL bPool

    --aPool gets max 80% memory, bPool max 20% (reconstructed from the outline above)
    CREATE RESOURCE POOL aPool WITH (MAX_MEMORY_PERCENT = 80)
    CREATE RESOURCE POOL bPool WITH (MAX_MEMORY_PERCENT = 20)
    CREATE WORKLOAD GROUP aGroup USING aPool
    CREATE WORKLOAD GROUP bGroup USING bPool

    USE master
    IF OBJECT_ID ('dbo.MyClassifier') IS NOT NULL DROP FUNCTION dbo.MyClassifier;
    GO

    CREATE FUNCTION dbo.MyClassifier ()
    RETURNS SYSNAME WITH SCHEMABINDING
    AS
    BEGIN
      DECLARE @GroupName SYSNAME;
      IF SUSER_SNAME() = 'a'
        SET @GroupName = 'aGroup';
      ELSE IF SUSER_SNAME() = 'b'
        SET @GroupName = 'bGroup';
      ELSE SET @GroupName = 'Default';
      RETURN @GroupName;
    END;
    GO

    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.MyClassifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;


    --Allow xp_cmdshell
    EXEC sp_configure 'xp_cmdshell', 1
    RECONFIGURE

    --Execute queries as a and b to read a bunch of data
    --We have 2,250,000 rows in the tables
    --We will construct commands like: SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN 1 AND 200000
    --The start and end values will differ, with ranges in 200,000 increments. 11 loops will add up to 2,200,000.
    --(Adjust the server name in the SQLCMD calls to your instance)
    DECLARE @v INT = 1, @sql VARCHAR(2000)
    WHILE @v <= 2200000
    BEGIN
      SET @sql = 'SQLCMD /Slocalhost\a /Ua /Ppwd /Q"SELECT COUNT(c2) FROM a..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"'
      EXEC xp_cmdshell @sql, no_output
      SET @sql = 'SQLCMD /Slocalhost\a /Ub /Ppwd /Q"SELECT COUNT(c2) FROM b..t WHERE c1 BETWEEN ' + CAST(@v AS VARCHAR(20)) + ' AND ' + CAST(@v + 199999 AS VARCHAR(20))+ '"'
      EXEC xp_cmdshell @sql, no_output
      SET @v = @v + 200000
    END

    --Check how many pages from each db in BP:
    SELECT
     DB_NAME(b.database_id) AS database_name
    ,(COUNT(*) * 8192) / (1024 * 1024) AS buffer_count_MB
    FROM sys.dm_os_buffer_descriptors AS b
    GROUP BY b.database_id
    ORDER BY database_name

    EXEC sp_configure 'xp_cmdshell', 0
    RECONFIGURE

  • What does this Express Edition look like, anyhow?

    Chances are that you rarely get to see the Express Edition of SQL Server if you mainly work with servers. This blog post was inspired by a forum discussion where some functionality was missing in SQL Server Management Studio (SSMS), and I suspected that the tools were installed as a Basic installation instead of a Complete installation. So how do I check that? Difficult, if I don’t have such an install to look at. I’m training this week and have a few spare machines to play with, so I went ahead and installed a couple of SQL Server 2008 R2 Express Editions.

    When you install SQL Server, you are asked what to install. Depending on your installation media, the install is pre-selected for the edition of your SQL Server (the product key is already entered for you), or one of the free editions is selected. You can select any of the free editions even if you have a pre-pidded installation. The free editions are:

    • Enterprise Evaluation Edition
    • Express Edition
    • Express Edition with Advanced Services

    I decided to install the two Express Editions, on two different machines, to see what they look like.

    Express Edition
    This is pretty much only the database engine. Here is what options the Feature Selection gave for installation:

    Express bare install

    I selected all three. Setup then suggested the instance name SQLEXPRESS. One option I hadn’t seen before was the ability to install support for User Instances. The installation took 6 minutes. Setup created a program group:

    • Microsoft SQL Server 2008 R2
      • Configuration Tools
        • SQL Server Configuration Manager
        • SQL Server Error and Usage Reporting
        • SQL Server Installation Center (64 bit)
      • Import and Export Data (64 bit)

    The following services were installed:

    • SQL Active Directory Helper Service, Disabled
    • SQL Server (SQLEXPRESS), Automatic
    • SQL Server Agent (SQLEXPRESS), Disabled
    • SQL Server Browser, Disabled
    • SQL Server VSS Writer, Automatic

    So, this is a pretty much “bare” SQL Server with close to no tools. Or more to the point, no SSMS.

    Express Edition with Advanced Services
    This edition includes some bells and whistles, especially SSMS. The Feature Selection dialog looks like this:

    Express bare install

    Again, I selected all options. Setup then suggested the instance name SQLEXPRESS. The installation took 20 minutes. Setup created a program group:

    • Microsoft SQL Server 2008 R2
      • Configuration Tools
        • Reporting Services Configuration Manager
        • SQL Server Configuration Manager
        • SQL Server Error and Usage Reporting
        • SQL Server Installation Center (64 bit)
      • Integration Services
        • Data Profile Viewer
        • Execute Package Utility
      • Import and Export Data (32 bit)
      • Import and Export Data (64 bit)
      • SQL Server Business Intelligence Development Studio
      • SQL Server Management Studio

    The following services were installed:

    • SQL Active Directory Helper Service, Disabled
    • SQL Full-text Filter Daemon Launcher (SQLEXPRESS), Automatic
    • SQL Server (SQLEXPRESS), Automatic
    • SQL Server Agent (SQL Express), Disabled
    • SQL Server Browser, Disabled
    • SQL Server Reporting Services (SQLEXPRESS), Automatic
    • SQL Server VSS Writer, Automatic

    So you get some goodies with Advanced Services; what probably attracts most people is SSMS.

    What about SQL Server Agent?
    We usually say that Express doesn’t come with Agent, but the Agent service is installed. It is disabled, however, and you won't be able to start it even if you enable the service. There’s logic behind this. You might want to upgrade your Express to a higher edition. Since all the bits are already on disk, the upgrade essentially just adjusts to your product key; no installation of files is necessary. See Upgrading from Express Edition to Standard Edition
    for info on how to upgrade Express to a higher edition.

    Express Edition of SSMS?
    There used to be such a beast. But nowadays you instead have the options below:

    • Management Tools Basic
    • Management Tools Complete

    If you install Express Edition with Advanced Services, your only option is Management Tools Basic. This corresponds to the old SSMS Express. Your SSMS is limited and lacks a lot of functionality. The interesting aspect is that you can also select Management Tools Basic from a “real” SQL Server installation media (Standard Edition, for instance). I doubt anyone wants to do this, but it can happen by mistake. One way to see if your SSMS is Basic is to look for SQL Server Profiler in the Tools menu (thanks Mladen Prajdić). If it isn’t there, you have the Basic installation of SSMS. Read on for another way to check this.

    Update 2011-08-06: I just learned that 1) there are some minor differences between Basic from Express install and Basic from non-express install and 2) Basic does have a node (GUI) for Agent, if connected to a non-express instance. I learned it in this thread and you can find out more in this blog post.

    How do I find out what is installed?
    There’s a great discovery tool built in the setup program, and this is already on your machine:

    • Start menu
    • Microsoft SQL Server 2008 R2
    • Configuration Tools
    • SQL Server Installation Center.

    Here you select:

    • Tools
    • Installed SQL Server features discovery report

    This will lead you to an HTML page which clearly states what SQL Server stuff is installed on your machine. It will tell you the edition of the server components, and it will tell you whether your tools are installed as Basic or also Complete (if you installed the “full” SSMS, you see both Basic and Complete).

  • Are we Borg?

    Is it time to stop remembering things? For real, this time?

    Today I needed to find two pieces of SQL Server related information. One was straightforward: I posted it on #sqlhelp on Twitter and a minute or so later I had the answer. The other was a bit more involved, so I asked around on our MVP mail list - a couple of hours later I had a bunch of suggestions and tips. These are only examples; it could just as well have been some web forum, newsgroup, or other forum. We've all had cases like this, but it made me think and reflect.

    Travel back in time, say 20 years or so. Computers were hard to use. Well, not computers per se, but rather the software made them hard to use. (Without software computers are easy to use - they don't do anything!) We were lucky if the software came with a printed manual. Whatever the quality of that manual.

    Then came electronic help with the software. That was a big step forward. And it even had index and search!

    With the Internet, things really started to pick up. Sure, there were BBS before that, but not broadly used. With Internet, we were given e-mail. Mail lists. Newsgroups. Altavista, Google and Bing (sounds like a supergroup). Web-forums. Facebook. Twitter. Etc. And not only that, we carry the Internet in our pockets, wherever we are.

    So, this is what hit me today. I used to keep tips and tricks in a register. Nuggets I picked up, here and there. The past few years, I do less and less of this. There are so many great minds out there, who also are willing to lend a helping hand. So, almost when I realize I need some bit of information, that information is available. Sounds pretty much like the collective mind of the Borg to me. Perhaps not quite there yet, but give it a year or five.

    But what worries me is: where does that leave us, as human beings? Where is the difference between learning and remembering? Deducing versus asking and being given the information? I'm sure you can relate to this, at some level. For instance when we were forbidden to use pocket calculators at school, because they were too powerful. Or was it the other way around?

    To put it bluntly: if I don't have to remember anything about SQL Server (the info is out there, right?), I won't be a very proficient SQL Server person. I'm sure we can formulate lots of ways to differentiate between learning/creating/deducing and remembering. And don't get me wrong: I think the development is really interesting - and the community out there is way cool.

    But, I just find the development a bit interesting... and worrying ... and thrilling.

  • Upgrading from Express Edition to Standard Edition

    Say you encounter an SQL Server which is Express Edition, and it really should have been some higher edition. Sounds familiar? It is common for me as a consultant to find plenty of SQL Servers at a customer's site. Many of the databases in these will be moved (typically using backup and restore) to a "real" SQL Server. But in some cases, this might not be desirable. You want to convert the whole instance, from Express to a "real" SQL Server edition.

    I'm attending a great SharePoint course held by Daniel Bugday this week, and the topic of how SharePoint is installed was discussed. If you select "Standalone", SharePoint's setup will not only install all SharePoint parts on a single server, it will also install SQL Server Express Edition as the database engine. In short, SharePoint is one example where you might have an existing Express Edition which you want to make a "real" (higher) edition. There are other such scenarios, of course.

    As of SQL Server 2008, you can let SQL Server's setup program change from one edition to some other edition by just selecting "Maintenance", "Edition Upgrade" in setup and specify the product key that came with the install media for the desired edition (which you have acquired first, of course). The cool thing here is that setup basically just registers the different product key and WHAM - you have the higher edition! This is what I wanted to test, and specifically going from Express to a higher edition. Below is my experience:

    I have an Express edition which came with the SharePoint installation. It is SQL Server 2008 (not R2), with no tools (SSMS etc) installed. I want to make this a "real" Standard Edition. I have a Standard Edition 2008 installation media, with a product key.

    The database engine 
    I first ran setup from the installed SQL Server (Start, All Programs, Microsoft SQL Server 2008, Configuration Tools, SQL Server Installation Center (64-bit)). I selected "Maintenance", "Edition Upgrade" and specified the product key that came with my Standard Edition. This did not allow me to continue. I got an error (which is lost now) and could not continue.
    Next try was to run setup from the installation media (my Standard Edition install media) instead. This did allow me to continue. The change was very quick (a couple of minutes). I then verified using SQLCMD that I now have Standard Edition. Great! The database engine is done.
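    The verification itself is a one-liner, for instance from SQLCMD:

    ```sql
    -- Should now report Standard Edition instead of Express Edition
    SELECT SERVERPROPERTY('Edition') AS edition
          ,SERVERPROPERTY('ProductVersion') AS product_version;
    ```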

    You might want to have the tools (a.k.a. Shared Components) as well on the machine. I refer to things such as SQL Server Management Studio, Integration Services etc. So, I just ran setup from the installation media and installed the desired shared components. Nothing strange here. I now have Management Studio, Books Online and whatever more I might want to have.

    SQL Server Agent
    SQL Server Express doesn't come with SQL Server Agent, right? Not really. It comes with Agent, but the service is disabled and you cannot start it even if you try. But what about after an edition upgrade? I could still not start Agent. However, checking SQLAGENT.OUT revealed the problem. The first thing Agent does at startup is to log in to SQL Server and verify that it is sysadmin. The SharePoint setup configured the Agent service account as Network Service, which was added as a login in SQL Server, but the login isn't sysadmin. Now, this is in general a good thing (don't run your services with higher privileges than necessary), but in this case it causes Agent to not start. Some of the SharePoint services are configured to use Network Service, and apparently the server roles securityadmin and dbcreator are enough for SharePoint. So, this needs to be taken care of. I could use some other service account for Agent, and make sure that service account is sysadmin in SQL Server. For my test, though, it is OK to make the login in SQL Server for Network Service sysadmin. After that was done, Agent started just as expected. And just for fun, I created a Maintenance Plan for all databases, which also worked just fine.
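    For reference, granting sysadmin to the Network Service login (again: fine for a test machine, think twice in production) is a one-liner:

    ```sql
    -- Test machine only: let Agent (running as Network Service) pass its sysadmin check
    EXEC sp_addsrvrolemember
         @loginame = N'NT AUTHORITY\NETWORK SERVICE'
        ,@rolename = N'sysadmin';
    ```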

    What about the rest?
    Obviously, you might want to check other things, such as database options, file size, auto grow settings, maintenance for the databases etc. But that isn't strictly related to the edition change, it is just general best practices. For my case, for instance, I might want to upgrade the SQL Server from 2008 to 2008 R2, while I'm at it. Specifically for SharePoint, it seems its setup program created the database files in the Hive (a SharePoint term for a folder in Program Files, Common Files), and you probably want to move the SharePoint database files to a more suitable location. But that is outside the scope for this post, and hopefully no sweat for your dba.

    In short, my test shows that it is easy to change from one edition to a different edition, even if you start from Express Edition.

  • Point-in-time restore of database backup?

    SQL Server 2005 added the STOPAT option for the RESTORE DATABASE command. This sounds great - we can stop at some point in time while the database backup process was running! Or can we? No, we can't. Here follows some tech stuff on why not, and then what the option is really meant for:

    A database backup includes all used extents, and also all log records that were produced while the backup process was running (possibly older ones as well, to handle open transactions). When you restore such a backup, SQL Server copies the data pages from the backup media to the mdf and ndf file(s) and the log records to the ldf file(s), and then does REDO (rolls forward the log records) and possibly also UNDO (rolls back open transactions - this is skipped if you specify the NORECOVERY option). Why does it need the log records? Because you can do modifications in the database while the database backup is running. After the backup process has started (and before it finishes), you can modify both pages not yet copied to the backup media and pages already copied. So, the data pages in themselves do not represent a consistent state of the database. At restore time, SQL Server uses the log records included in the database backup to "sync" the modifications that were performed while the backup process was running. So, what you end up with is what the database looked like at the end time of the backup process. I'm sure this is well documented somewhere in Books Online, so I won't dive further into the tech stuff here.

    So, what is the purpose of the STOPAT option for RESTORE DATABASE?

    It is to give you an early warning, in case you try to restore a database backup which is too late for a subsequent STOPAT operation for a log restore. Here's an example:

    1. 13:00 db backup
    2. 13:10 log backup
    3. 13:20 log backup
    4. 13:30 db backup
    5. 13:40 log backup
    6. 13:50 log backup

    Say you now want to restore to 13:15. Intuitively, you say that you will restore 1, 2 and for 3 you do STOPAT 13:15. Yes, that is right.

    But say you are under stress, and perhaps even used some tool which confuses things for you, so you end up restoring 4 and 5, and for 6 you try STOPAT 13:15. Clearly, you can't stop at 13:15 for backup number 6 (it only covers 13:40 to 13:50). I.e., the restore of 6 gives you an error message. So, you have to re-start the restore process (1, 2 and 3), which could take a long time (if the backups are large). If you had specified STOPAT 13:15 when you restored backup 4, SQL Server would have given you an error immediately, so you wouldn't have wasted valuable time restoring unnecessary backups!

    That is the purpose of STOPAT for RESTORE DATABASE.
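    Using the backup list above (database name, file names and the date are placeholders), the restore to 13:15 would look something like this - note STOPAT on the RESTORE DATABASE as well, which is what gives the early warning:

    ```sql
    -- Restore 1 (db backup), 2 (log) and 3 (log), stopping at 13:15
    RESTORE DATABASE MyDb FROM DISK = 'R:\db_1300.bak'
        WITH NORECOVERY, STOPAT = '2011-08-06T13:15:00';
    RESTORE LOG MyDb FROM DISK = 'R:\log_1310.trn'
        WITH NORECOVERY, STOPAT = '2011-08-06T13:15:00';
    RESTORE LOG MyDb FROM DISK = 'R:\log_1320.trn'
        WITH RECOVERY, STOPAT = '2011-08-06T13:15:00';

    -- Had we instead started from backup 4 (the 13:30 db backup), the same STOPAT
    -- on the RESTORE DATABASE would make SQL Server error out immediately.
    ```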

  • Another bugfix for sp_indexinfo

    JackMcC correctly reported that sp_indexinfo will count rows double (or even triple) if there are LOB or row overflow allocations. Now fixed. Thanks Jack.
  • Why we never want to trick SQL Server into rebuilding a log file

    "Never" is a strong word, so let me say that we really really want to avoid it, if at all humanly possible. In short, we will have a (potentially) broken database, both at the physical level and at the logical level. Read on.

    Just to be clear: when I refer to a "log file" here, I'm talking about a transaction log file, an .ldf file. Every so often we see in forums how log files are "lost", "disappear" or so. Now, SQL Server relies on the ldf file at startup to bring the database to a consistent state. This is known as "recovery", or "startup recovery". This is pretty well known; I for instance teach this at every intro-level admin course. So, what if

    • The ldf file isn't there?
    • Or isn't accessible to the db engine?
    • Or is broken somehow?
    • Or is from a different point in time than the data file? (You'd be surprised at all the things one sees over the years.)

    Well, SQL Server will do the only reasonable thing: refuse us access to the database and produce an error message (in the eventlog etc.).

    What we see from time to time are attempts to "trick" SQL Server into re-creating an ldf file. So, why is this so bad? I will try to explain. Let me first say that SQL Server doesn't do these things to be mean to us, or to prove a point. If SQL Server knows that the ldf file is not necessary for recovery (the database was "cleanly shut down"), then it can and will re-create a log file for us at startup. The problem is that these aren't the cases we see in the forums. The cases we see in the forums are the ones where this didn't happen - where SQL Server relied on the ldf file in order to bring the database to a consistent state.

    Enough beating around the bush, here is an example of why we don't want to trick SQL Server to forcefully re-create a log file:

    Say you have a transaction in which you add a row to the order header table (oh) and five rows to the order details table (od). Physically, each command is reflected in the tlog, the page is modified (but still cached) and at commit, the log records are written to the ldf file. There are lots written about these algorithms, but the concepts are pretty simple. For more details, read this.

    Say that we forcefully stop SQL Server, delete the ldf file, start SQL Server, see that the database isn't accessible and somehow "trick" SQL Server into creating an ldf file for us. What is the problem? Why is this so bad? The problem is that you can have no confidence in the state of your data, both at the physical level and at the logical level. Let me start explaining what I mean by the logical level and problems at this level:

    The logical level
    By the logical level, I mean the user data: the rows in the tables, quite simply. We inserted one row in the oh table and five rows in the od table. These can be inserted into anything from two to six pages (in reality more, since each table is likely to have indexes etc.; I'll touch on this when we get to the physical level). Say that three of the order details rows have been written to disk, but not the order header row, and not the other two order details rows. This is just an example; you can pick and choose any combo you want. It can get just as bad as you can imagine! You might think that the pages are in the end written in the same sequence as we modified them. No, that is not how the cookie crumbles (read the article I posted a link to). You can probably imagine other scenarios, closer to your data. For instance, we will not do some certain modification to a table unless some other prior modification was also performed. In essence, rebuilding a log file leaves us with no confidence in the data. Are you prepared to go through your database and manually verify all the data in there? The majority of you will say "no", and it might not even be doable (data volumes, lack of information to verify against etc.). So, logical inconsistencies are bad. Really bad. We don't want this. OK?

    The physical level
    The log is not only used for user data. It is also used for system data. Tables have indexes, where each row in a table is reflected by a row in each index. We have allocation pages stating what storage is used. We have IAM pages and PFS pages. We have linked lists. And whatnot. You probably realize that these structures also require that some modification is performed in full or not at all (an extent allocation will be reflected both in the GAM or SGAM page and in the extent data itself, etc.). What do you think is used to protect this? Yes, you got it, the ldf file. Trick SQL Server into re-creating an ldf file and you have all sorts of physical inconsistencies. Actually, physical inconsistencies are a bit better than logical ones since we do have a way to check for these. I'm of course talking about the mighty DBCC CHECKDB command, a command with lots of smarts (right Paul?) to check that the db is consistent at the physical level. And what if it isn't? CHECKDB spits out errors. Sure, it has repair options, but those generally mean (unless you are the type of person who wins lotteries) that the repair will rip out whatever makes the db inconsistent, resulting in data loss (and the repair has no knowledge of what you might consider logically consistent data).
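    For reference, a minimal consistency check looks something like below (the database name is an example; the options just suppress informational messages and show every error rather than the default summary):

```sql
-- Check physical and logical allocation/structural consistency.
DBCC CHECKDB ('MyDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

    No output means no errors found. Note that a clean CHECKDB says nothing about the logical, business-level consistency discussed in the previous section.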

    So, what to do?
    I hope the answer is obvious. Don't get into this situation in the first place. Don't go deleting ldf files, for instance. Do take backups frequently enough so you don't end up in a situation like "I need to rebuild the log or I'm toast." If something strange happens, don't start mucking about with the system unless you know what you are doing. Hire a specialist, call MS support or so. If I had a brand new Ferrari, I wouldn't disassemble the engine if I heard a weird noise from it. Heck, I barely pop the hood of my Beamer!

    And no, I won't get into what commands can be used to re-build the ldf file. I expect all readers of this to not get into a situation where it is needed. :-)

    (I understand one can encounter a machine with no backups where something happens to the ldf file, and such scenarios. Of course I do. I feel really, really bad every time I read about such a situation, because there is no graceful way to handle it. That is why I have such a harsh tone above. I don't want this to happen to anybody. One would hope that this is obvious, but nothing ever is on the Net. So I've learned. So, please leave such unconstructive comments out of the discussions!)

  • New util proc: sp_tableinfo

    Some of you might use my sp_indexinfo to easily get lots of details about the indexes in a database. I added an sp_tableinfo with info rolled up to table level. This allows you to quickly see the largest tables in a database and things like that. Check it out at
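    A call could look something like below. This assumes the procedure has been installed so it is reachable from your database and that it runs without mandatory parameters; check the linked page for the actual signature and options:

```sql
USE AdventureWorks;
EXEC sp_tableinfo;  -- rolled-up size/row info per table (assumed default behavior)
```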
  • Why can't we have column alias in ORDER BY?

    You probably think something like "What! Of course we can!". And so did I when I read the title for a post about this in a forum. Let us first establish that you can have a column alias in ORDER BY:

    SELECT
    p.ProductID AS pid
    ,p.ListPrice AS pr
    FROM Production.Product AS p
    ORDER BY pr

    The problem was that a table name was also in the ORDER BY clause. And this is when it becomes weird. The problem isn't the column alias, it is the table alias. Basically the desire is to write something like:

    SELECT
    p.ProductID AS pid
    ,p.ListPrice AS pr
    FROM Production.Product AS p
    ORDER BY p.pr
    Above was apparently allowed in SQL Server 2000 (I never did it myself, however). (Update: Erland Sommarskog caught my error; this was allowed in 2000, not 2005.) In order to understand why the above is backwards, we need to consider the logical query flow. The first part of a query which is executed is the FROM clause (logically; the optimizer can do as it pleases as long as we get the same result). In the above example, I renamed the table (for the rest of the query) from "Production.Product" to "p". That means that we cannot refer to "Product" or "Production.Product" anymore in the query. However, SQL has imperfections, and the topic of this blog is an attempt from MS to rectify some of those earlier imperfections, i.e., improving the product.

    Continuing the logical query flow (WHERE, GROUP BY, HAVING), towards the end we find the SELECT clause. Here is where you can rename the columns (a.k.a. column alias) for the rest of the query. After the SELECT list, it is meaningless to talk about the different tables (where the data originally comes from); all we have is the result of the query (so far). This is why we can (generally) refer to a column alias in ORDER BY; ORDER BY comes logically after the SELECT list.
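    You can see this logical ordering in action: a column alias is not visible in WHERE (evaluated before the SELECT list), but it is visible in ORDER BY (evaluated after). Against the same AdventureWorks table:

```sql
-- Fails: WHERE runs logically before the SELECT list, so "pr" doesn't exist yet.
SELECT p.ListPrice AS pr
FROM Production.Product AS p
WHERE pr > 100;

-- Works: ORDER BY runs logically after the SELECT list, so "pr" is known.
SELECT p.ListPrice AS pr
FROM Production.Product AS p
ORDER BY pr;
```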

    However, the original problem was that one also wanted to include a table alias, which shouldn't be allowed. So why is it allowed in the first place, considering it makes no sense from a logical standpoint? For that we need to look at something weird which is allowed (both in T-SQL and ANSI SQL) and which can be somewhat useful:

    SELECT
    ProductID AS pid
    ,ListPrice AS pr
    FROM Production.Product
    ORDER BY ModifiedDate

    We don't need to return something to the client app in order to sort over it. Sure, that can be useful, but from a logical perspective, this is spooky. We sort over something which doesn't exist! So, the imperfections of the SQL language are the reason things start to break down. Basically, what we do is sort over something as if we had included it in the SELECT list. In the ORDER BY clause, we are saying something like:

    "Hey, DBMS! Can you please pretend we included Product.ModifiedDate in the SELECT list so we can sort over it. But don't return it to the client app, I don't want to see that column."

    Where it really breaks down is when we combine table and column alias. Let me repeat the problematic query:

    SELECT
    p.ProductID AS pid
    ,p.ListPrice AS pr
    FROM Production.Product AS p
    ORDER BY p.pr
    The problem is that it is meaningless to specify a table alias here; the column does exist in the SELECT list. The only reason we can refer to anything table-prefixed at all is to let the DBMS pretend we included something additional from the base tables, so we could sort over it. Such as (allowed):

    SELECT
    ProductID AS pid
    ,ListPrice AS pr
    FROM Production.Product AS p
    ORDER BY p.ModifiedDate

    One could of course argue that below would be more logical (not allowed):

    SELECT
    ProductID AS pid
    ,ListPrice AS pr
    FROM Production.Product AS p
    ORDER BY Product.ModifiedDate

    Arguing that one of the two forms above is more logical than the other is pretty irrelevant, however. Both are weird constructs, and the language is what it is. Anyone feel like checking the ANSI SQL standard to see which of the above two (first, second, or both) is allowed/preferred? Not me, I'm off to grab some breakfast!

