SQLblog.com - The SQL Server Blog Spot on the Web

Kalen Delaney

  • Geek City: Point in Time Recovery for Operations on Memory-Optimized Tables

    In my class in Budapest last week, someone asked if point-in-time recovery was supported for transactions on memory-optimized tables. I couldn't think of a reason why not, but I had never tried it. And even when there is no technical inhibitor, it is often the case with a new technology that certain features aren't included simply because they haven't been tested sufficiently.

    But as I thought about it a bit more, I realized there was no way that point-in-time recovery could work for disk-based tables unless it also worked for memory-optimized tables. After all, the transactions are all stored in the same log.

    But, I decided to verify anyway.

    So I created an in-memory database (IMDB) and a memory-optimized table. (If you want to run the code, you’ll need a folder on your C: drive called HKData and one called HKBackups.  Or you can edit the code to change the paths.) I then did an initial database backup of the database with an empty table.

    USE master
    GO
    SET NOCOUNT ON
    GO

    IF db_id('IMDB') IS NOT NULL DROP DATABASE IMDB;
    GO

    CREATE DATABASE IMDB; 
    GO
    ----- Enable database for memory optimized tables
    -- add memory_optimized_data filegroup
    ALTER DATABASE IMDB
        ADD FILEGROUP IMDB_mod_FG CONTAINS MEMORY_OPTIMIZED_DATA
    GO

    -- add container to the filegroup
    ALTER DATABASE IMDB
        ADD FILE (NAME='IMDB_mod', FILENAME='c:\HKData\IMDB_mod')
        TO FILEGROUP IMDB_mod_FG
    GO

    USE IMDB;
    GO
    -- create a memory-optimized table
    CREATE TABLE dbo.t_memopt (
           c1 int NOT NULL,
           c2 char(40) NOT NULL,
           c3 varchar(100) NOT NULL,
     
           CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1)
            WITH (BUCKET_COUNT = 100000)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
    GO

    BACKUP DATABASE IMDB TO  DISK = N'C:\HKBackups\IMDB-FULL.bak'
        WITH NOFORMAT, INIT,  NAME = N'IMDB-empty-table', SKIP, NOREWIND,
           NOUNLOAD,  STATS = 10;
    GO

    I then inserted a row into the table, waited a minute, grabbed the time, waited another minute, and inserted another row. I found that without the WAITFORs, it was a bit confusing to pin down exactly where I wanted to stop restoring.

    INSERT t_memopt VALUES (1, 'a', REPLICATE ('b', 100));
    GO
    WAITFOR DELAY '0:1:0';
    SELECT getdate();
    WAITFOR DELAY '0:1:0';
    INSERT t_memopt VALUES (2, 'c', REPLICATE ('d', 100));
    GO
    SELECT * FROM t_memopt;
    GO

    I then noted the time captured by getdate(), saw the two rows returned, and made a log backup.

    BACKUP LOG IMDB TO  DISK = N'C:\HKBackups\IMDB-log.bak'
        WITH NOFORMAT, INIT,  NAME = N'IMDB-2-inserts-in-log'
    GO

    Now I could test the restore.

    USE master;
    GO
    RESTORE DATABASE IMDB
        FROM DISK = 'C:\HKBackups\IMDB-FULL.bak'
        WITH REPLACE, NORECOVERY;
    GO
    RESTORE LOG IMDB
        FROM DISK = 'C:\HKBackups\IMDB-log.bak'
        WITH RECOVERY, STOPAT = '<insert getdate() value noted above>';
    GO

    USE IMDB;
    GO
    SELECT * FROM t_memopt;
    GO

    When I saw just the one row returned, I knew the point-in-time recovery had worked!

    So now I can start working on a real juicy bits and bytes type blog post. Hopefully I’ll have it up next week.

     

    ~Kalen

  • Geek City: Creating New Tools for Statistics

     

    I was just starting to work on a post on column statistics, using one of my favorite metadata functions: sys.dm_db_stats_properties(), when I realized something was missing.

    The function requires a stats_id value, which is available from sys.stats. However, sys.stats does not show the column names that each statistics object is attached to. So how can I know which stats_id is the one I want?

    So I looked at sp_helpstats, and it shows the column names, but NOT the stats_id!  I spent some time trying to figure out how to translate the stats_name into the column name, so that I could just use sys.stats. For example, if you see a stats_name of _WA_Sys_0000000C_38B96646, the 38B96646 is the HEX representation of the object_id and the 0000000C is the HEX representation of the column_id. I can find the column name with this information, but it’s not quick and easy, especially not to generalize in TSQL code.
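    As an aside (this query is my addition, not part of the original post), if an ad-hoc query is enough, sys.stats_columns can pair each stats_id directly with its column names; substitute the table you are investigating:

    SELECT s.stats_id,
           s.name AS stats_name,
           STUFF((SELECT ', ' + c.name
                  FROM sys.stats_columns AS sc
                  JOIN sys.columns AS c
                    ON c.object_id = sc.object_id AND c.column_id = sc.column_id
                  WHERE sc.object_id = s.object_id AND sc.stats_id = s.stats_id
                  ORDER BY sc.stats_column_id
                  FOR XML PATH('')), 1, 2, '') AS stats_columns
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID('dbo.YourTable')   -- hypothetical table name; replace it
    ORDER BY s.stats_id;
    GO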

    What DID turn out to be quick and easy was to rewrite sp_helpstats so that it returns the stats_id. It turned out to be an incredibly simple change. I just added an additional column to the temp table created in the procedure, and then added that column to the INSERT into the temp table and the SELECT from the temp table to return the final results. The cursor already fetched the stats_id column to use in its work; it just didn’t return it.

    Note that my script, which I am attaching, has not been thoroughly tested. It has been tested enough for my use when poking around the statistics, and using the sys.dm_db_stats_properties() function.

    Perhaps some of you will find this useful. And if not, at least I’ll know where to find the script next time I need it!

    ~Kalen

  • Did You Know: Sometimes I have to troubleshoot issues not dealing with SQL Server

     

    In the last couple of weeks, I’ve had to troubleshoot several technical problems that, while not critical, were extremely frustrating, annoying and inconvenient.

    Two of them had to do with setting up my new computer. I am now using a Microsoft Surface Pro 3 as my main work computer. And mostly, I absolutely love it!  However, there have been a couple of problems.

    1. Getting the tablet into the docking station involved a whole lot of jiggling. This is the first time I've ever used a docking station, and I love the idea. But there were no instructions about HOW to actually insert the tablet so that the little connector between the tablet and the docking station was really seated. The first time, it took me a while to realize I wasn't connected at all, and that I should be watching for a little light to come on on the side of the docking station, unless I wanted to run down my battery. I would put the tablet in, jiggle it around, and hope I got it; sometimes it worked and sometimes it didn't. I looked for a HOW TO INSERT INTO DOCKING STATION guide but could find nothing, as if it were just too obvious for anyone to need help. Finally, I found this page:

    http://www.microsoft.com/surface/en-us/support/hardware-and-drivers/docking-station 

    and I almost overlooked the solution. I read that I had to just slide the Surface into the docking station but that just wasn’t working for me. It would not ‘just slide in’. Then I read the next part which said: then slide the sides of the docking station into the sides of your Surface.

    Well, it had never mentioned that I had to slide anything open before putting the tablet in. But once I knew the sides were supposed to separate, it only took a few seconds to figure out how to do it. Slide the sides open, insert the tablet, slide the sides closed. And now I'm set.

    2. The docking station has a mini display port that can be used to connect to an external monitor, which I have. In fact, I have two, for two different computers. So I looked on Amazon, and found an adapter for connecting a mini display port to a VGA cable. It hooked up fine. But the color just looked wrong. I tried the other monitor, and the color still looked wrong. I couldn't put my finger on exactly what 'wrong' meant, but it didn't look the same as on the tablet screen. And certain colors I'd used for highlighting on some documents turned completely black on the external monitor. So I used my search-fu to dig through pages and pages of results for 'color is funny on external monitor.' Finally I found a page that talked about adjustments on the external monitor, which I didn't think were necessary since the monitors worked fine with other computers. But on that page, I got enough information to determine that what was wrong was that 'red' wasn't being displayed. The background on the tablet screen was purple, and the background on the external monitor was blue. And there was no red anywhere. So I was able to narrow down the search, looking for details about the missing red, and found a page about bad pins. The adapter had only cost $8, and Amazon will replace it if it is bad, so I just ordered a new one. And when I returned from my trip to Philly on Friday night, it was waiting for me. First thing Saturday morning I plugged it in, and now my external monitor has a purple background, too. So the problem was just defective hardware.

    3. My second computer had previously been used as my powerhouse machine, and that's the one I would run my SQL Servers on. I would run TSQL scripts, but I didn't create content with either Word or PowerPoint on that machine. But occasionally I needed to actually have a Word doc on that machine, or I needed to use it for presenting, so I would need PowerPoint files on it. It seemed that some Word docs and some PowerPoint files would get corrupted when I copied them to that machine. I got a really nasty corruption error and then nothing would open. But not all files were like this. It happened most often when I downloaded from my Cloud storage (I have been using Box.com since long before OneDrive). When I copied over a local network share from one machine to the other, it was fine. Or if I copied onto a USB stick and then onto the other machine, it was fine. But downloading from Box.com would corrupt files. I was almost ready to leave Box.com, but then testing with OneDrive gave me the same behavior. Downloading from either Cloud storage location would corrupt my files. But only on one machine. I could download to my new tablet and everything was fine. Not cool. But I was going out of town and couldn't spend more time debugging.

    However, I had to keep debugging, because on the first day of class I didn't have the slides with me, and I had to download them. And they wouldn't open. So I spent the first day presenting off a PDF file. In my hotel that night, I copied from my tablet onto a stick and then onto the machine I was using for teaching, so I thought I was set. Then Friday morning, I got an email from the manager, with a PowerPoint slide attached. She asked me to display the slide during the break, as it had the instructions for the student evaluations. The file appeared corrupted and would not open. There was no other source for the file; I had to open it. So I did one more online search, in the last 5 minutes before class started, and found this:

    http://answers.microsoft.com/en-us/office/forum/office_2010-office_install/cant-open-excel-and-powerpoint-files-downloaded/7db4f743-9d31-4c54-bb14-003e47978872

    It turned out it wasn't a corruption problem at all; it was a Trust Center issue.

    If you are using Office 2010, you need to check your setting in the Trust Center.  Go to File>Options>Trust Center>Trust Center Settings>Protected View.  Make sure the first box is NOT checked.

    [Image: the Protected View settings in the Office Trust Center]

    I unchecked the box and the manager’s PowerPoint slide opened right up. And so have all other downloaded files.

     

    4. And finally, there is a technical problem with my phone, a Nokia Lumia 920 Windows Phone. The navigation software with step-by-step directions has just never worked well. I have gotten lost multiple times. I try some fix I've heard about, and on the next trip, it happens again. The main symptom is that it just freezes, and I'm driving along, thinking I have miles to go before my turn, when actually I've passed it already. And when that happens, there is no way to force a refresh. Even stopping the navigation and choosing a new destination comes back right to the screen that says I still have all these miles to go before I need to turn. Last week when I was driving to my hotel outside Philly, it froze up right before a really critical intersection, and I didn't know which direction to turn. I knew I had to take a certain highway, but I thought I was going west, and the signs were for north and south. So I made the wrong choice, and ended up totally lost. The trip took me almost 2 hours when it should have taken me 45 minutes.

    So in my hotel I started a dedicated search. Was there better navigational software for this phone? Was it a hardware problem and I needed to replace the phone? Or a bug in the navigation app? Finally I found a support forum discussion that might be relevant. It mentioned making sure that Wi-Fi was off, because that could sometimes interfere with GPS. And I do almost always forget to turn the Wi-Fi off when I leave home or hotel and hit the road where there isn’t Wi-Fi everywhere.

    It seems a bit better now with the Wi-Fi turned off, but it's not perfect. On my drive home from the Seattle airport Friday night, I put it to the test. Although it didn't freeze, it did lose the signal multiple times. I didn't get lost, since I knew my way home, but it would have been nice to know that I would be able to navigate the next time I was in unknown territory. And I still have no confidence in that.

    So three out of four isn’t bad, but I still need to figure out what to do about my phone. Maybe it’s time for an upgrade?

    Online searching is certainly a powerful capability we have now, and figuring out the best search terms takes skill and patience. I like to think that all my searching for SQL Server solutions has helped me find other technical solutions more efficiently.  And I decided to blog this so I would have a record of my solutions in case I ever needed them again!

    I hope you are all having a wonderful start to your new year.

    Oh, and GO HAWKS!

    ~Kalen

  • Geek City: More Hekaton Details

    I apologize to all you folks who attended my Hekaton precon at the SQLPASS Summit last month, who have been eagerly awaiting my answers to some of the unanswered questions. I had got most of the answers sorted out, but I was waiting for one last reply, and put the whole list on the back burner. I just realized today it was now December, and I still hadn’t published this.

    So here it is, six questions that I didn’t completely answer during my talk, although for some of them I did make a guess.

    1. Do SCHEMA_ONLY tables affect the upper limit of memory available for memory-optimized tables?

    Remember, there actually is no hard upper limit for the amount of memory you can use for memory-optimized tables. The recommended maximum of 256 GB has to do with the number of CFPs (Checkpoint File Pairs) that can be supported. So in that sense you might be able to have more data in SCHEMA_ONLY tables, because you don't have to worry about the size of the CFPs; the data in those tables is never written to the files. But you still need to be aware of the overall system memory limit and make sure your machine has enough memory for all the memory-optimized tables (both SCHEMA_AND_DATA and SCHEMA_ONLY), as well as memory for the buffer pool for your disk-based tables, plus memory for other SQL Server and system needs.

    If you have bound a database to a resource pool with a fixed upper limit of memory, all your tables have to fit within that memory limit.
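    As an aside (this sketch is mine, not from the original Q&A; it assumes the IMDB database from the earlier post and a pool name I made up), binding a database to a resource pool with a fixed share of memory looks roughly like this:

    CREATE RESOURCE POOL HkPool WITH (MIN_MEMORY_PERCENT = 50, MAX_MEMORY_PERCENT = 50);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO
    -- Bind the database that holds the memory-optimized tables to the pool
    EXEC sp_xtp_bind_db_resource_pool 'IMDB', 'HkPool';
    GO
    -- The binding takes effect the next time the database is brought online
    ALTER DATABASE IMDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE IMDB SET ONLINE;
    GO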

     

    2.  Is Point-in-time RESTORE possible for databases that have performed transactions on memory-optimized tables?  (i.e. can we do a RESTORE LOG WITH STOPAT command?)

    Simple answer here: YES

     

    3. Can a database with memory-optimized tables be restored or attached onto a SQL Server 2014 instance using STANDARD Edition?

    No. Both restore and attach will require that all filegroups are available, including the filegroups for your memory-optimized tables. And once you have those special filegroups, the database cannot be restored or attached on STANDARD Edition and you get an error. So you really don’t even have to have any memory-optimized tables. As soon as you create a memory-optimized filegroup, the database will no longer be accepted.

    I wish I could tell you what the exact error would be, but I don't have a SQL Server 2014 STANDARD Edition to test on. I have frequently requested that Microsoft provide a feature that allows us to 'switch off' Enterprise features if we are using EVALUATION or DEVELOPER Edition, so we can actually test using a base other than ENTERPRISE. But so far, Microsoft has not complied. There are two Connect items that seem to address this need. The first one at https://connect.microsoft.com/SQLServer/feedback/details/331297/developer-edition-in-standard-workgroup-mode is marked "Closed as WON'T FIX" but the other is still active at https://connect.microsoft.com/SQLServer/feedback/details/496380/enable-sql-developer-edition-to-target-specific-sql-version, so you might want to go give it a vote.

     

    4. Can views on memory-optimized tables be accessed from within a natively compiled procedure? (I think the question was sort of hinting at a workaround where you could create a view on a disk-based table and access that through the native procedure.)

    The answer is NO. You can create a view on a memory-optimized table, but you cannot access it from a native procedure. No views at all can be referenced in a native procedure.

     

    5. With REPEATABLE READ isolation, under which SQL Server needs to guarantee read stability, what happens if a column that wasn't read is changed? (i.e. your query read col1 and col2 from Table1; what happens if another query changes col3?)

    It's the row that is important. If Tx1 reads col1 and col2 from Table1 under REPEATABLE READ and has not yet committed, and Tx2 then updates col3 in Table1, Tx1 will fail when it commits, due to a repeatable read violation.
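    If you want to see this for yourself, here is a sketch (mine, not from the original Q&A) that assumes the dbo.t_memopt table from the point-in-time post above and two separate sessions:

    -- Session 1: read the row under REPEATABLE READ (a table hint is needed for
    -- interpreted T-SQL access to a memory-optimized table in an explicit transaction)
    BEGIN TRAN;
    SELECT c1, c2 FROM dbo.t_memopt WITH (REPEATABLEREAD) WHERE c1 = 1;

    -- Session 2 (a second connection, autocommit): update a column Session 1 never read
    -- UPDATE dbo.t_memopt SET c3 = 'changed' WHERE c1 = 1;

    -- Back in Session 1: the commit fails validation because the row version it read
    -- is no longer the latest (a repeatable read validation failure, typically error 41305)
    COMMIT TRAN;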

    6. Can transactions on memory-optimized tables run in RCSI (READ COMMITTED SNAPSHOT Isolation)?

    RCSI is just a variant of READ COMMITTED Isolation. Operations on memory-optimized tables have to run in something higher than READ COMMITTED, so RCSI is moot.

     

    I hope this is useful!

     

    ~Kalen

  • Did You Know? My PASS Demo Scripts are Up (and other news)!

    And so another PASS Summit passes into history. It was an awesome week, filled with old friends and new, and lots of superlative technical content! My Hekaton book was released just in time, and it was great seeing the excitement. Red Gate gave away all the copies they had, the bookstore sold all they had, and I gave away 20 in my sessions.

    There's also a review of my book up already: http://www.i-programmer.info/bookreviews/21-database/7950-sql-server-internals-in-memory-oltp.html  

    My PASS Demo scripts are now available on my website. Just go to www.SQLServerInternals.com and click on Resources in the left-hand menu, and then choose Conference Content.

    The winners of my book giveaway were announced on Twitter, and email was sent out. I thought I was going to have to say that one of the winners had not yet responded, but just as I started writing, the sixth winner’s  email arrived.  I’m still scrambling to get caught up on everything that got put off while I was at the MVP Summit and PASS Summit last week, but I hope to have the books shipped by the end of this week.

    Here’s the list:

    [Image: the list of giveaway winners]

    And now that PASS is over, my next project is updating my SQL Server Internals course to SQL Server 2014. The first public delivery will be in Stockholm in February.  Check my schedule for more details: http://sqlserverinternals.com/kalens-schedule/

    We are having a glorious Autumn here in the Beautiful Pacific Northwest. I hope you are enjoying your Autumn (or Spring in the Southern Hemisphere) as much as I am enjoying it here!

     

    ~Kalen

  • Did You Know? It’s Anniversary Time Again

    And it coincides with a new book of mine being published. So I decided I would give away some presents!

    Today marks 27 years that I have been working with SQL Server. On this day in 1987 I started working for the Sybase Corporation in Berkeley, California, and SQL Server has consumed my working life ever since….

    I still have quite a few copies of my SQL Server 2012 Internals book, plus all the copies of the new In-Memory OLTP book, so I’ll be giving away a couple of each. (Many more of the In-Memory OLTP book will be given away at the PASS Summit next week.)

     

    So what do you have to do?

    1. Send an INSERT statement to this email address

    2. Include "Happy Anniversary" in the subject line. The mail will be filed as junk if you don't.

    3. The body of the email should consist of a single T-SQL INSERT statement (shown below).  For the last column, choose which book you’d prefer. 
        A 1 will indicate the SQL Server 2012 Internals book, and a 2 will indicate the In-Memory OLTP book.

    4. Double-check the syntax! Make sure you have open and close quotes for all strings, and that the quotes are STRAIGHT quotes, as in the INSERT statement below. If you use Word to add a quote of your own or replace one that's there, it's usually a "smart" quote, which in this case is NOT smart. SQL Server will reject the INSERT if the quotes are not STRAIGHT. I suggest you use SSMS to write the insert.

    5. Send the email before 23:59:59, October 30, US Pacific Time.

    Here is the INSERT statement. Replace the strings with your specific information, and include it in your email:

    INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
                   VALUES('Your email address', 'First Name', 'Last Name', 'Country', 'City, State or Region', [1|2] );

    Due to postage costs, I’ll only choose one winner from outside the US. I’ll choose 3 from within the US.

    Examples:

    If my friend John wanted to enter, and wanted the In-Memory OLTP book, his INSERT might look like this:

    INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
                   VALUES('john@emailserver.com', 'John', 'Cook', 'US', 'Houston, TX', 2 );

     

    If my son-in-law in Germany wanted to enter, and he wanted the 2012 Internals book, his INSERT might look like this:

    INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
                   VALUES('manu@emailserver.com', 'Manuel', 'Meyer', 'Germany', 'Weil der Stadt', 1 );

     

    Thank you!

    ~Kalen

  • Did You Know? My Hekaton Book is at the Publishers!

    I am very excited to be able to report that my latest book is at the printers now, and should be ready in time for the PASS Summit!

    [Image: the front and back covers of the book]

    And since the last files were sent off to the publisher (Red Gate) last Friday, I could finally finish up the slides for my Pre-Conference Seminar, all about In-Memory OLTP.

    Check it out here:

    http://www.sqlpass.org/summit/2014/Sessions/Details.aspx?sid=6152

    Although there will be some copies of the book at  the Red Gate booth, along with many of their other wonderful titles, Red Gate will also be giving me a box full of books to give away during the session. There won’t be enough for everyone (unfortunately) but your chances of snagging one will be good!

    See you in November!

    ~Kalen

  • Did You Know? I’ll be talking for a full day at the PASS Summit!

     

    I’ll be speaking on my new favorite topic… Hekaton! (aka In-memory OLTP)

    http://www.sqlpass.org/summit/2014/Sessions/Details.aspx?sid=6152

    I was delighted to have the chance to give a taste of the presentation for the 24 Hours of PASS preview on September 9th. I was in Germany at my daughter's house that day, so although my US audience heard me in the morning, for me it was almost supper time. I am always impressed by the quality of the questions I get when giving online talks, and this was no exception. I was only able to address about a third of the questions that came in, but the organizers sent me a list of most of the submitted questions to address later.

    Note that I won’t be giving really detailed answers to most of the questions. After all, this session was just a preview of my precon at the PASS Summit. In addition, I am just finishing up a book on In-memory OLTP for Red Gate, that hopefully will be released at the Summit. And that will have lots more details!

    Also keep in mind that this is a brand new technology, so the best practices and use cases are still being explored and those people out there actually implementing In-memory OLTP solutions are discovering new things all the time!

     

    Q:  Is the size of the table (or tables) that you want to use Hekaton on dependent on the amount of RAM you have available?

    Absolutely. You can't have more data in memory-optimized tables than will fit in the memory available to SQL Server. In addition, if not all your tables are in-memory tables, you'll need to make sure there is enough memory for the buffer pool to hold the needed pages from the disk-based tables. You also need to make sure you have enough memory for multiple versions of rows in your in-memory tables, based on how much updating you're going to be doing.

     

    Q: How can I tell if a disk-based table has overflow rows, before I attempt to migrate it to Hekaton? Also, how would I work around this?

    You can look for any objects that have rows in sys.allocation_units with a type description of anything other than IN_ROW_DATA. This will give tables with either ROW_OVERFLOW or LOB data. If you just want to know the tables with ROW_OVERFLOW data,  look for where the type_desc is ROW_OVERFLOW_DATA. This query should get you started:

    SELECT OBJECT_NAME(object_id) FROM sys.partitions
    WHERE index_id < 2
      AND partition_id IN (
       SELECT container_id FROM sys.allocation_units
       WHERE type_desc != 'IN_ROW_DATA'
       AND total_pages > 0)

    Working around this might take a table redesign (which is a little beyond the scope of this short Q&A).

     

    Q:  Is it a requirement that the memory should be double the amount of data to efficiently use the Hekaton engine?

    As mentioned in the first answer, you need to have room for all the table data for tables you want in-memory, and you need to have enough room for the indexes on those tables, and for the versions of rows as the data changes. In some cases you might end up needing more than twice the data size, but it completely depends on the volume of your data changes, and the indexes you’ve created. If you have insufficient memory, it won’t just be a matter of not efficiently using the Hekaton engine; you will not be able to do any more data modifications once you are out of memory. But, there is no hard and fast rule. You need to  know your data and your operations, and you need to test thoroughly.

     

    Q: What happens as unique row counts exceed allocated bucket sizes?
    Q: Do buckets grow as data is inserted ?   Can you manually alter the number of buckets?

    I will assume you are asking what happens when the number of unique values exceeds the predefined bucket count. Here, this does become a matter of efficiency. If there are far fewer buckets than unique values, one bucket will be used for multiple values and your search operations will be slower. The number of buckets is fixed at the time the table is created and cannot be changed, either automatically or manually. You'll have to drop and recreate the table.
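    If you want to see how well your bucket counts match your data (this query is my addition, not part of the original answer; run it in the database that holds the memory-optimized tables), the hash index DMV reports bucket usage and chain lengths:

    SELECT OBJECT_NAME(hs.object_id) AS table_name,
           i.name AS index_name,
           hs.total_bucket_count,
           hs.empty_bucket_count,
           hs.avg_chain_length,
           hs.max_chain_length
    FROM sys.dm_db_xtp_hash_index_stats AS hs
    JOIN sys.indexes AS i
        ON i.object_id = hs.object_id AND i.index_id = hs.index_id;

    Long average chain lengths combined with few empty buckets usually mean the bucket count is too low for the number of unique values.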

     

    Q: When does the data write to disk, and do you have to compress the row history to free up memory space?
    Q: The data file name does not need an extension (like 'mdf')?

    New rows of data are written to disk using a mechanism called 'continuous checkpoint', which uses SQL Server's filestream technology to write to files on disk. There is no concept of a data file, so no .mdf is used for the in-memory data. The database containing the in-memory tables still needs a regular data file and a log file, but the in-memory tables are not written there. The filestream files where the in-memory data is written, to be used only if needed for recovery, do not use a file extension. You might want to read about the filestream technology, which was introduced back in SQL Server 2008.

     

    Q: How do you calculate how much memory the table is going to need?

    The required space depends on the data in the rows, the number of rows, and the volume of updates. There is no simple formula for calculation because no one knows what your volume of changes will be. You can make a guess if you know your workload, and run some tests. There is good metadata available to tell you how much space you are using for your in-memory tables once you have created them. You can look at the memory used, run some of your workload, and then see how much more memory is used, to see how the memory requirements are growing.  You might start with the DMV: sys.dm_db_xtp_table_memory_stats.
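    For example (my sketch, not from the original answer), a quick look at per-table memory use in the current database:

    SELECT OBJECT_NAME(object_id) AS table_name,
           memory_allocated_for_table_kb,
           memory_used_by_table_kb,
           memory_allocated_for_indexes_kb,
           memory_used_by_indexes_kb
    FROM sys.dm_db_xtp_table_memory_stats
    WHERE object_id > 0;   -- system tables report negative object_ids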

    Q: I would really like to use In-Memory OLTP, but we frequently alter tables. What is your recommendation on using In-Memory OLTP for these tables?

    As you've heard, once an in-memory table is created, it cannot be altered, not even to add more indexes or to change the bucket count for any of the existing hash indexes. You'll have to test whether dropping and recreating the table, or creating a new table and copying the data, can be done in a reasonable amount of time for your requirements, as in the sketch below. (Keep in mind that even with disk-based tables, some of the ALTER TABLE operations are not fast, efficient changes. Some ALTER TABLE options require that SQL Server completely recreate every row.)
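    Here is a rough sketch of the rebuild approach (mine, not from the original answer, reusing the t_memopt definition from the point-in-time post with a made-up larger bucket count; how you swap the old and new tables afterwards depends on your environment):

    -- Create a replacement table with the new definition (e.g. a larger bucket count)
    CREATE TABLE dbo.t_memopt_new (
           c1 int NOT NULL,
           c2 char(40) NOT NULL,
           c3 varchar(100) NOT NULL,
           CONSTRAINT pk_t_memopt_new_c1 PRIMARY KEY NONCLUSTERED HASH (c1)
            WITH (BUCKET_COUNT = 2000000)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
    GO
    -- Copy the data, then drop the old table and repoint your code at the new one
    INSERT INTO dbo.t_memopt_new (c1, c2, c3)
        SELECT c1, c2, c3 FROM dbo.t_memopt;
    GO
    DROP TABLE dbo.t_memopt;
    GO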

     

    If you want even more details on Hekaton in SQL Server 2014, there are still seats in my precon (last I heard)!
    http://www.sqlpass.org/summit/2014/Sessions/Details.aspx?sid=6152

     

    And check out the Red Gate booth in the vendor area to find my new In-Memory OLTP Internals book!

     

    Have fun!

    ~Kalen

  • Geek City: Lost Identity

    Way long ago, when Sybase first added the IDENTITY property, it took people quite a while to get the hang of using it. Along with being able to automatically generate sequential numbers for each row, we had to tune a configuration option with the geekiest of names: “Identity Burning Set Factor”.  And figuring out how to use this value seemed to be one of the dark arts, so almost everyone left it at the default. (If you really want to know how to use the burning set, you can read the Sybase docs here.) The question of WHY we needed to even think about a burning set was never even an issue for most people.

    So when Microsoft included IDENTITY columns with no requirement of defining a burning set, it was a big relief. This isn’t to say that there were no issues with using IDENTITY columns at all. I still had to explain to students over and over that even with IDENTITY there was no guarantee of no gaps and no guarantee that you would never get duplicates. Basically, gaps could happen if you rolled back INSERT operations (or explicitly deleted rows) and duplicates could occur if you set the table property IDENTITY_INSERT to ON.  But once people understood this, everything else was pretty straightforward. Until now, with SQL Server 2012.

    I had actually started hearing strange reports of unexpected gaps in IDENTITY values quite a while ago, but the cases seemed to be not reproducible and certainly not predictable. But just recently, I came across the explanation and realized the reason it seemed so unpredictable was because there is a combination of conditions that both have to occur to see this particular behavior.

    It is true that there is a change of IDENTITY behavior in SQL Server 2012, to allow for greater throughput in heavy insert environments. In earlier versions, as each identity value was generated, it had to be logged separately. This meant that even for minimally logged operations, like SELECT INTO, every row inserted had to be logged separately. I blogged about this behavior here. But in 2012, identity values are generated in batches and only the maximum value of the IDENTITY column in the batch is logged. So minimally logged operations on tables with IDENTITY columns could be more efficient, but I still haven’t got around to actually testing that yet. But a related change is that if there is a server failure, SQL Server has to be sure it won’t reuse an IDENTITY value. If SQL Server was stopped without a CHECKPOINT, on restart the engine will add to the maximum value of the IDENTITY that it is aware of to set a new starting point for each table, thus potentially leaving a gap in the sequence. The size of the gap is dependent on the data type of the column. The value will be increased by 10 for tinyint, 100 for smallint, 1000 for int, and 10000 for bigint. Numeric and decimal increase their values based on the precision, but it seems the maximum is 10,000.

    If SQL Server is stopped 'gracefully', with a CHECKPOINT performed, there should be no gap. Ideally, a checkpoint is always issued for a service shutdown (unless you use the T-SQL SHUTDOWN WITH NOWAIT command, or there is a crash), so only crashes should be able to cause these gaps. But apparently there is a bug: it seems that stopping your SQL Server using Configuration Manager, or even the service control in SQL Server Management Studio, will not perform a checkpoint. To make sure you get the checkpoint, you can use SQL Server's SHUTDOWN command (without the WITH NOWAIT option!) or, of course, manually issue a CHECKPOINT before shutting down.

    ---------------------------------------

    The gaps on restart only happen because SQL Server assigns IDENTITY values in blocks. You can choose to revert to the old behavior that allocated IDENTITY values one at a time, and logged each row even with minimally logged operations. To do this, you need to add trace flag 272  to your SQL Server startup parameters, set through the Configuration Manager.  Unofficial reports state that you have to use a lower case t to specify the flag, i.e. –t272 and NOT –T272.

    [Screenshot: adding the -t272 startup parameter in SQL Server Configuration Manager]
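    After the restart, a quick way to confirm the flag is actually active (my addition, not part of the original post):

    DBCC TRACESTATUS (272, -1);   -- the -1 asks for the global (instance-wide) status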

     

     

    If you want to observe this gappy behavior yourself, you can run the following code. I tested it on SQL Server 2012 SP1, running on Windows Server 2008R2 SP1. I noticed that I didn’t get any gaps with only a single row in the table, after I restarted. I needed at least two rows. I used a database called testdb, but you can use any testing database you have.

     

    USE testdb;
    SET NOCOUNT ON;
    GO

    IF OBJECT_ID('ident1') IS NOT NULL DROP TABLE ident1;
    GO
    IF OBJECT_ID('ident2') IS NOT NULL DROP TABLE ident2;
    GO
    IF OBJECT_ID('ident3') IS NOT NULL DROP TABLE ident3;
    GO
    IF OBJECT_ID('ident4') IS NOT NULL DROP TABLE ident4;
    GO
    IF OBJECT_ID('ident5') IS NOT NULL DROP TABLE ident5;
    GO

    -- Create 5 tables with different data types for the identity column
    CREATE TABLE ident1
    (col_tinyint tinyint IDENTITY);
    GO
    CREATE TABLE ident2
    (col_smallint smallint IDENTITY);
    GO
    CREATE TABLE ident3
    (col_int int IDENTITY);
    GO
    CREATE TABLE ident4
    (col_bigint bigint IDENTITY);
    GO
    CREATE TABLE ident5
    (col_numeric30 numeric (30,0) IDENTITY);
    GO

    -- Insert 2 rows into each table
    INSERT INTO ident1 DEFAULT VALUES;
    INSERT INTO ident2 DEFAULT VALUES;
    INSERT INTO ident3 DEFAULT VALUES;
    INSERT INTO ident4 DEFAULT VALUES;
    INSERT INTO ident5 DEFAULT VALUES;
    GO
    INSERT INTO ident1 DEFAULT VALUES;
    INSERT INTO ident2 DEFAULT VALUES;
    INSERT INTO ident3 DEFAULT VALUES;
    INSERT INTO ident4 DEFAULT VALUES;
    INSERT INTO ident5 DEFAULT VALUES;
    GO

    -- uncomment the checkpoint for the second test;
    -- CHECKPOINT;

    --- Shutdown your SQL Server using the SHUTDOWN WITH NOWAIT command, the configuration manager, or SQL Server Management Studio.

    -- Restart your SQL Server

    INSERT INTO ident1 DEFAULT VALUES;
    INSERT INTO ident2 DEFAULT VALUES;
    INSERT INTO ident3 DEFAULT VALUES;
    INSERT INTO ident4 DEFAULT VALUES;
    INSERT INTO ident5 DEFAULT VALUES;
    GO

    SELECT * FROM ident1;
    SELECT * FROM ident2;
    SELECT * FROM ident3;
    SELECT * FROM ident4;
    SELECT * FROM ident5;
    GO

    And here are the results I got when I restarted after shutting down using SHUTDOWN WITH NOWAIT:

    [Screenshot: the query results, showing gaps in the IDENTITY values after restart]

     

    You can run the test a second time, running a checkpoint right before shutdown to see that no gaps are left.

    Of course, the question of WHY a checkpoint is not automatically performed before a controlled service stop is another issue, and hopefully there will be a fix for that soon.

    But even once it’s fixed, uncontrolled shutdowns will still leave gaps and you need to be aware of this and not expect your IDENTITY values to always be perfectly contiguous.

     

    I hope this is useful!

    ~Kalen

  • Geek City: A Hint of Degrees

    This is just a quick post to describe a test I just ran to satisfy my own curiosity.

    I remember when Microsoft first introduced the query hint OPTION (MAXDOP N). We already had the configuration option ‘max degree of parallelism’, so there were lots of questions about how the hint interacted with the configuration option. Some people thought the configuration option set an absolute maximum, and the hint could only specify something less than that value to be meaningful. Other people thought differently, and I actually can’t remember what I thought at the time. All I  remember is that there was confusion. So I decided to test it recently.

    I have a machine with 8 logical processors, so I can have a degree of parallelism up to 8. And my first test showed that all 8 were used when I left the configuration option set to the default and didn't use a hint. I ran this test on both SQL Server 2012 SP1 and SQL Server 2014 RTM, and the results were the same. I needed to look at the actual execution plan, which runs the query, because the decision of what degree of parallelism to use is not made until runtime. If I look at an estimated execution plan, I will see a parallelism operator, but no degree of parallelism will be reported, because it is not yet known. I can get the actual plan either by using the button on the Management Studio toolbar called 'Include Actual Execution Plan', or by using SET STATISTICS XML ON.

    So, as mentioned, the first test showed a degree of parallelism of 8:

    USE AdventureWorks2012;
    GO
    SET STATISTICS XML ON;
    GO
    SELECT * FROM Sales.SalesOrderDetail
    ORDER BY UnitPrice DESC;
    GO

    [Screenshot: the actual execution plan, showing Degree of Parallelism = 8]

    I then changed the configuration option, and ran the same test.

    EXEC sp_configure 'max degree of parallelism', 4; RECONFIGURE;
    GO
    SELECT * FROM Sales.SalesOrderDetail
    ORDER BY UnitPrice DESC;
    GO

    This time, the degree of parallelism was 4, as expected.

    I then added the MAXDOP hint, with a value smaller than 4 and then a value larger than 4.

    SELECT * FROM Sales.SalesOrderDetail
    ORDER BY UnitPrice DESC
    OPTION (MAXDOP 2);
    GO
    SELECT * FROM Sales.SalesOrderDetail
    ORDER BY UnitPrice DESC
    OPTION (MAXDOP 6);
    GO

    In both cases, the degree of parallelism was the same as the value specified in the hint. So it seems, with this basic test, that a MAXDOP hint will completely override anything set in the configuration option. Whether this was true in the version when the hint was first introduced, I cannot say. But it’s true for me now, and that’s what’s important.

    So I’ll clean up and then go back to playing with Hekaton.

    EXEC sp_configure 'max degree of parallelism', 0; RECONFIGURE;
    GO
    SET STATISTICS XML OFF;
    GO

    Have fun!

    ~Kalen

  • Geek City: The Hekaton Saga Continues, and some answers to questions

    My Hekaton (aka In-memory OLTP) book is with the publisher. And I have no doubt they will do a much better job than my previous publisher did in bringing my SQL Server 2012 Internals book into the light of day.

    My only regret is that my awesome Tech Editor, Benjamin Nevarez, who has been with me since my SQL Server 2008 Internals book, was not available for this book since he is now writing books of his own! But he recommended the person who tech edited his book and I’m pretty sure I can trust Ben’s judgment on this. I’m intentionally not mentioning his name in this post, but I’m sure it will come up in a future post.

    I gave my first full-day Hekaton presentation as a precon in Atlanta on May 2. I was worried that I wouldn't have enough content to fill an entire day, but it pretty much worked out perfectly. As expected, there were some questions I couldn't answer on the spot, either because I was too jet lagged, or because I just hadn't thought about it enough yet, but I have some of those answers now, thanks to Sunil Agarwal and Jos de Bruijn at Microsoft.

     

    1. Are default constraints allowed on memory-optimized tables?

    YES! However, there is a bug in the memory-optimization advisor that shows them as unsupported. In addition, not every expression that would be supported for defaults on disk-based tables is supported for defaults on memory-optimized tables. Only expressions that are allowed in natively-compiled procedures are allowed in default constraints on memory-optimized tables, but this does include a lot of the most frequently used ones: sysdatetime(), newid() and constants.
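    For example (my sketch, not from the original answer, assuming a database already enabled for memory-optimized tables), defaults built on those expressions look like this:

    CREATE TABLE dbo.OrderEvents (
           event_id int NOT NULL IDENTITY(1,1)
                PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
           event_guid uniqueidentifier NOT NULL DEFAULT (newid()),
           event_time datetime2 NOT NULL DEFAULT (sysdatetime()),
           event_status tinyint NOT NULL DEFAULT (0)
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);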

    2. Are the data values in the rows compressed in any way?

    NO! No compression is used to store the data values in the rows.

    3. If a transaction has obtained its timestamp but then fails during validation, will its timestamp be reused?

    NO! Timestamps are never reused once they have been assigned.

    4. How is the timestamp used for reading rows determined?

    Every transaction that reads rows from memory-optimized tables reads as of a certain 'point in time', i.e. a timestamp. It can only read row versions that were inserted before the read timestamp and not deleted until after the read timestamp (or not deleted yet at all!). SQL Server maintains a current timestamp used for all read operations, which is incremented when a read/write transaction enters validation and is assigned its own timestamp. It can also be incremented by internal system transactions that are never visible to users. So basically, the read timestamp is the timestamp of the most recent committed transaction in the system. Multiple read operations can share the same timestamp for their reads.

    5. The documentation says that cross-database transactions are not allowed with memory-optimized tables. What about transactions involving temp tables (so they are in tempdb)?

    Since there can be no memory optimized tables in tempdb, this restriction does not apply. Cross-database transactions are allowed with tempdb and model, and read-only cross-database transactions are allowed with master.

    6. Can an identity value in a memory-optimized table be reseeded?

    (Note: earlier versions of Hekaton did not support identity at all, and I still get questions from people thinking that is still true. Hekaton tables can have identity columns as long as the seed is 1 and the increment is 1.) No reseed is possible but you can use IDENTITY_INSERT to manually increase the seed.
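    A sketch of that workaround (mine, not from the original answer, using a hypothetical memory-optimized table dbo.t_ident with an identity column id and a value column val):

    SET IDENTITY_INSERT dbo.t_ident ON;
    -- Insert an explicit value above the current maximum to bump the seed
    INSERT dbo.t_ident (id, val) VALUES (5000, 'seed bump');
    SET IDENTITY_INSERT dbo.t_ident OFF;
    -- Subsequent inserts that rely on the identity property continue from 5001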

    7.  Will DBCC CHECKDB check memory-optimized tables?

    NO! DBCC CHECKDB does not include any checks on any part of Hekaton. However, checksums are maintained for the checkpoint files, and explicit checksums are supported for database backups, even those containing memory-optimized tables.

    8. If a checkpoint file grows larger than 128 MB due to large transactions, are there special considerations on merging it with neighboring files? E.g. is SQL Server more aggressive in merging large files?

    A checkpoint file pair (CFP) can be self-merged if the data file is larger than 256 MB and over half of the rows are marked deleted. Otherwise no…

    9.  Can transaction logs containing operations on memory-optimized tables be used for recovery to a point in time or to a log mark?

    YES!

    10. Can the checkpoint files be excluded from virus scanners?

    Right now, there is no easy way to do this since the checkpoint files have no suffix. However, the team at Microsoft will be looking into this.

     

    Thanks for all the great questions!

     

    And if you want more Hekaton details, check out the class I recorded for SSWUG a couple of weeks ago:

    http://www.vconferenceonline.com/event/home.aspx?id=1065  

     

    ~Kalen

  • Geek City: A row with no fixed length columns

    Today I will return to my roots and post about the bits and bytes of internal storage. I received a request a while back to help with deciphering the DBCC PAGE output for a row with no fixed length columns. The person had read the section in my book (SQL Server 2008 Internals) on row storage, but was still having trouble. He presented this table with one row:

     

    USE testdb;

    CREATE TABLE varchk(name varchar(10))
    INSERT INTO varchk VALUES ('kalen')

    He used DBCC IND to find the page number:

    DBCC IND (testdb, varchk, 1)

    and then took the file and page number returned to pass to DBCC PAGE:

    DBCC TRACEON(3604)
    DBCC PAGE (testdb, 1, 188, 1)

    He saw these bytes for the row:

    30000400 01000001 0010006b 616c656e †0..........kalen

    These were his questions:

    How to find the variable column offset?
    How to find the end of the fixed-length columns and the start of the NULL bitmap?
    How to find the offset of each variable-length column?

     

    And here was my reply:

    Here is your row:

    30000400 01000001 0010006b 616c656e, it is exactly 16 bytes long.

    3000 = Status Bits

    0400 = 2 byte integer for length of fixed length portion, reverse bytes because it is a single integer = 0004 = 4 bytes for everything up to last fixed length column

    0100 = 2 byte integer for number of columns, reverse bytes because it is a single integer = 0001 = 1 column

    00 = Null bitmap

    0100 = 2 byte integer for number of variable length columns, reverse bytes because it is a single integer = 0001 = 1 column

    1000 = 2 byte integer for the position where the 1st variable length column ends, reverse bytes because it is a single integer = 0010 = 16, which is the byte position in the row where the variable length column ends

    6b616c656e = variable length data

    6b = k

    61 = a

    6c = l

    65 = e

    6e = n

    Your question said you were looking for the offset of the variable length column. Please reread the section (in the book) that describes the storage of rows with variable length columns.  We are storing the ending POSITION, not the offset, and I say this:

    A data row that has any variable-length columns has a column offset array in the data row with a 2-byte entry for each non-NULL variable-length column, indicating the position within the row where the column ends. (The terms offset and position aren’t exactly interchangeable. Offset is 0-based, and position is 1-based. A byte at an offset of 7 is in the eighth byte position in the row.)

    I know that’s pretty geeky, but for anyone else struggling with understanding row storage, maybe all you needed was just one more example!
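    And if you are on SQL Server 2012 or later and want to find the page without DBCC IND, an alternative (my sketch; sys.dm_db_database_page_allocations is undocumented, so treat it accordingly) is:

    SELECT allocated_page_file_id, allocated_page_page_id
    FROM sys.dm_db_database_page_allocations(DB_ID('testdb'), OBJECT_ID('varchk'),
                                             NULL, NULL, 'DETAILED')
    WHERE is_allocated = 1
      AND page_type_desc = 'DATA_PAGE';
    GO
    -- Pass the file and page ids returned here to DBCC PAGE, as above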

     

    Have fun!

    ~Kalen

  • Geek City: Did You Know … that not everything you know is actually true!

    Software changes: new versions not only add new features, they also change the internal behavior of old features, and not all the changes are documented! And how often, after a software upgrade, do you go through and test everything you knew was true, to make sure it is STILL true after the upgrade? I write books about the SQL Server software, and although I do verify all the code in my books against the new version, I admit that I don't always check every single fact mentioned in the entire book to make sure it is still true.

    There are a number of such changes that I've discovered over the last year or so, that I'll be telling you about in a couple of blog posts. Plus, there are things that I read on other people's blogs, where the author states something is true that I know is no longer true. Here's an example:

    It used to be true, a very long time ago, that you could not roll back a TRUNCATE TABLE operation. That changed so long ago, I can't even remember what version it was in. But in all current versions, you can roll back a TRUNCATE TABLE, and it is very easy to prove that. All you have to do is question whether the 'fact' that you cannot roll back a TRUNCATE TABLE is actually true. But if you think it's true, and never question it, you'll never know.

    Here’s the example I use in my classes to show that you CAN rollback a TRUNCATE TABLE:

    USE testdb -- or any test database of your choice
    GO

    IF object_id('smallrows') IS NOT NULL
        DROP TABLE smallrows;
    GO

    CREATE TABLE smallrows
    (
        a int IDENTITY PRIMARY KEY,
        b char(10)
    )
    GO

    INSERT INTO smallrows VALUES
            ('row 1'),
            ('row 2'),
            ('row 3'),
            ('row 4'),
            ('row 5');
    GO
    SELECT * FROM smallrows;
    GO


    -- TRUNCATE TABLE can be rolled back
    BEGIN TRAN
    GO
    TRUNCATE TABLE smallrows;
    GO
    SELECT * FROM smallrows;
    GO
    ROLLBACK TRAN;
    GO
    SELECT * FROM smallrows;
    GO

    In my classes, right after I talk about DELETE and TRUNCATE operations, I talk about UPDATE. You might be aware that UPDATE can be performed in a couple of different ways. SQL Server can do something called an 'update-in-place', where it just changes the old bytes to the new bytes and logs one simple update operation. Or the UPDATE can be performed as two separate operations: DELETE the entire old row (and update all the indexes and log the DELETE plus all the index changes) and then INSERT an entire new row (and update all the indexes and log the INSERT plus all the index changes). Obviously, update-in-place is preferred, but there are some conditions that must be met in order for SQL Server to perform an update-in-place. And one of the conditions I had on my list for years turned out not to be true any more when I tested it several months ago.

    The main reason that SQL Server might not do an update-in-place is because you are updating the clustered key value. And since the clustered key value determines where the row goes in the table, changing that value will change the row location. It cannot be done in-place (even when the row doesn’t actually have to move, as you’ll see in the example.) Another reason for not doing an update-in-place used to be because the table had an UPDATE TRIGGER. Prior to SQL Server 2005, the contents of the special ‘inserted’ and ‘deleted’ tables that were available inside a trigger were generated from the transaction log. So in order for these rows to be available, the entire new row and old row had to be logged. But in SQL Server 2005, with the introduction of the version store, mainly used to support row version for snapshot isolation, SQL Server started using the version store technology to get the old and new versions of updated rows if there was a trigger that needed them.

    So did having an update trigger still preclude update-in-place?

    I decided to test it.

    This first block creates a table with no clustered index and no trigger.


    -- no clustered index, no trigger
    IF object_id('test1') IS NOT NULL DROP TABLE test1;
    GO
    CREATE TABLE test1
    (a int,
      b char(200) );
    GO
    INSERT INTO test1 values (1, replicate('a',200));
    GO
    SELECT * FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test1%';
    GO

    You should see 5 rows in the log, with the last one being an operation LOP_INSERT_ROWS, which indicates the single row inserted.

    Now update the row:


    UPDATE test1 SET a=2;
    GO
    SELECT [Current LSN], Operation, Context, [Log Record Length] as Length, AllocUnitName
    FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test1%';
    GO

    You’ll see something like this:

    [Screenshot: the log records for test1, showing one additional LOP_MODIFY_ROW of 100 bytes]

    There is one additional row, with operation LOP_MODIFY_ROW, for the update-in-place. Notice the length of that log record is 100 bytes.

    In the INSERT log record, the length is 304 bytes.

    Now run the same test on a table with a clustered index on column a.


    --  clustered index, no trigger
    IF object_id('test2') IS NOT NULL DROP TABLE test2;
    GO
    CREATE TABLE test2
    (a int primary key clustered,
      b char(200) );
    GO
    INSERT INTO test2 values (1, replicate('a',200));
    GO
    SELECT * FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test2%';
    GO
    UPDATE test2 SET a=2;
    GO
    SELECT [Current LSN], Operation, Context, [Log Record Length] as Length, AllocUnitName
    FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test2%';
    GO

    This time, after the UPDATE,  you’ll see these log records:

    [Screenshot: the log records for test2, showing LOP_DELETE_ROWS and LOP_INSERT_ROWS instead of a single LOP_MODIFY_ROW]

    Notice there is no LOP_MODIFY_ROW for the clustered index. There are an LOP_DELETE_ROWS and an LOP_INSERT_ROWS, and their lengths are much longer than the length of the LOP_MODIFY_ROW. The entire old row and the entire new row are being logged. This may not seem like a big deal, but imagine instead of one row, that you are updating thousands or even millions of rows. Also note that there is just the single row in the table, so the row doesn't have to move when updated. There's only one place for it to be! But SQL Server only knows that the UPDATE is changing a clustered key value, so it is performed as the two-step operation.

    Finally, I'll do the test one more time, on a table with no clustered index, but with an UPDATE trigger.


    -- no clustered index, update trigger
    IF object_id('test3') IS NOT NULL DROP TABLE test3;
    GO
    CREATE TABLE test3
    (a int,
      b char(200) );
    GO
    CREATE TRIGGER upd_test3 on test3 for UPDATE
    AS
      SELECT * FROM inserted;
      SELECT * FROM deleted;
    RETURN
    GO
    INSERT INTO test3 values (1, replicate('a',200));
    GO
    SELECT * FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test3%';
    GO
    UPDATE test3 SET a=2;
    GO
    SELECT [Current LSN], Operation, Context, [Log Record Length] as Length, AllocUnitName
    FROM fn_dblog(null, null)
    WHERE AllocUnitName like '%test3%';
    GO

    Again, you’ll see only the one LOP_MODIFY_ROW, even though both the inserted and deleted rows were accessed in the trigger.

    [Screenshot: the log records for test3, showing a single LOP_MODIFY_ROW]

    So it was obvious that I had to update my course material.

    In my class two weeks ago, I found out another old ‘fact’ that is no longer true, but that will have to wait until next time. And of course, a whole new version is coming out in just a few months. I wonder what old facts will no longer be facts?

    Have fun!

    ~Kalen

    p.s. Right after I published this, I noticed I had already written up the update-in-place with triggers, just about a year ago. So if you missed it then, you know about it now!

  • Geek City: SQL Server 2014 In-Memory OLTP (“Hekaton”) Whitepaper for CTP2

    Last week at the PASS Summit in Charlotte, NC, the update of my whitepaper for CTP2 was released.

    The manager supervising the paper at Microsoft told me that David DeWitt himself said some very nice things about the technical quality of the paper, which was one of the most ego enhancing compliments I have ever gotten! Unfortunately, Dr. DeWitt said those things at his “After-the-keynote” session, not in the keynote that was recorded, so I only have my manager’s word for it. But I’ll take what I can get.

    If you’d like to read the update, which has information about the new “range” indexes, natively compiled procedures, checkpoint files, recovery and garbage collection in addition to all the content from the CTP1 paper, you can get it here:

    SQL_Server_2014_In-Memory_OLTP_TDM_White_Paper.pdf

    There will be more enhancements made for RTM, including some graphics for the index maintenance operations on range indexes, plus best practice suggestions.  These enhancements may be in the form of a third paper, but they might be in book form. I’ll let you know!

    Thanks

    ~Kalen

  • Did You Know? There is a PASS Conference in Stockholm!

    And I’ll be there!

    For the first time EVER, I will not be speaking at or attending the US PASS Summit. However, I will be speaking at the PASS SQL Rally in Stockholm November 4-6. I am so excited!

    I’ll be giving a pre-con you can read about here:
    http://www.sqlpass.org/sqlrally/2013/nordic/Agenda/PreconferenceSeminars.aspx#

    And then I'll be speaking about SQL Server Hekaton (In-Memory Database) in a session on Tuesday.

    I hope to see many of you there.

    ~Kalen
