As a teacher, and also a student of languages, one issue that’s always fascinated me is how people choose to shorten a compound, or multi-part, name. This gets especially tricky when the word comes from another language. For example, we use the word latte in English as a short form of caffè latte, but we shorten it to just the part that means ‘milk’, not coffee! My daughter was an exchange student in Spain and told me another example. Her host mother kept asking her if she had a ‘body’. The question was in Spanish, except for the word ‘body’, and my daughter just couldn’t figure out what she was talking about. The woman was confused, thinking for sure my daughter should understand, since it was an English word! Finally, they figured out the woman was referring to a bodysuit, but shortening it to something that really had no meaning on its own. In English, we have the words grape and grapefruit. I had a friend (a native Chinese speaker) who could just never remember which was which and frequently mentioned grapes when she meant grapefruit! And sometimes she did it the other way around, asking her native-English-speaking husband to get her a bag of grapefruits when she just wanted grapes. These are two very different things! I have more examples, but I’m getting a bit off track here.
In SQL Server, we also have concepts that have compound names that are frequently shortened. For ‘isolation level’, you could just refer to ‘isolation’ and people would probably know what you mean. But if you just said ‘level’, your meaning might not be so obvious.
And what about pairs of terms with different qualifiers, like clustered indexes and nonclustered indexes? If someone just says index, you usually need to know which kind.
And there are other terms which have the same base, and only one of them has a qualifier. And that’s really why I decided to write this blog post. In SQL Server 2005, Microsoft introduced snapshot isolation for read operations. There were two different levels of snapshots. The terms we use are SNAPSHOT and READ COMMITTED SNAPSHOT. As a teacher, I always found this very confusing, as did many of my students. People would say SNAPSHOT, but it was not at all clear if they really meant READ COMMITTED SNAPSHOT. So I started using an additional qualifier, so that both terms would be qualified. I started saying TRANSACTIONAL SNAPSHOT, but that was too cumbersome, so then in class I just started calling it FULL SNAPSHOT. I just wanted both terms to have some qualifier, so if someone just said SNAPSHOT, I could ask which one they meant.
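To make the distinction concrete, here is a sketch of how each level is enabled (MyDB is just a placeholder database name):

```sql
-- Enables the full SNAPSHOT isolation level; sessions must then opt in explicitly
ALTER DATABASE MyDB SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Changes the behavior of the default READ COMMITTED level for the whole database
ALTER DATABASE MyDB SET READ_COMMITTED_SNAPSHOT ON;

-- A session requests full SNAPSHOT isolation like any other isolation level:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
```

Note that READ COMMITTED SNAPSHOT needs no session-level change at all, which is part of why the two get confused so easily.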
Then in SQL Server 2014, Microsoft introduced In-Memory OLTP, with two new kinds of indexes. One is called NONCLUSTERED HASH, and the other is just plain old NONCLUSTERED. And again, I found this unbalanced. If someone just says NONCLUSTERED, am I sure they’re just not shortening NONCLUSTERED HASH inappropriately? So I started using the word RANGE as the qualifier for the non-HASH index. Then we could talk about NONCLUSTERED RANGE and NONCLUSTERED HASH, or just shorten the names to RANGE and HASH. And when I’m teaching about these indexes, I am always careful to point out that the word RANGE is not part of the syntax, it’s just a descriptor I use.
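A sketch of the syntax (table, column, and index names are just illustrative) shows the imbalance: only the HASH index gets a qualifying keyword.

```sql
CREATE TABLE dbo.OrderItems
(
    OrderID  int NOT NULL,
    ItemID   int NOT NULL,
    Quantity int NOT NULL,
    -- a HASH index: the qualifier is part of the syntax
    CONSTRAINT pk_OrderItems PRIMARY KEY NONCLUSTERED HASH (OrderID, ItemID)
        WITH (BUCKET_COUNT = 100000),
    -- a "RANGE" index: plain NONCLUSTERED, with no qualifier in the syntax
    INDEX ix_Quantity NONCLUSTERED (Quantity)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```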
However, one of the engineers at Microsoft was not happy with the fact that my writings used the term RANGE index. He said it wasn’t part of the spec, so I shouldn’t use it. It would just confuse people. He claimed Microsoft never used that term and I shouldn’t either. I argued. I explained my pedagogical reasons. He finally just shrugged and decided that I probably wasn’t adding any additional obfuscation and grudgingly admitted I might actually be making things clearer. But he wasn’t happy about it. He just didn’t want me using the term RANGE.
Now SQL Server 2016 is coming out. I’m doing research on some of the new storage structures and new metadata. And I’m looking at a DMV called sys.dm_db_xtp_memory_consumers with a column called memory_consumer_desc. And one of the possible values for this column is ‘Range index heap’. I just checked, and the view is also available in SQL Server 2014, but I just never got around to exploring it until now. And if I create a memory-optimized table called OrderDetails2 on a SQL Server 2014 instance, with three RANGE indexes, and then run this query:
SELECT object_name(object_id) AS object_name,
       memory_consumer_desc
FROM sys.dm_db_xtp_memory_consumers
WHERE object_id = object_id('OrderDetails2');
I will get these rows of output (among others):
So RANGE INDEX is not just a term I chose to use to make things easier for my students and my readers. Someone else, designing DMVs, also thought it was a useful term to use.
I am just about done upgrading my In-memory OLTP Internals whitepaper for SQL Server 2016 RTM. Once that’s published, you’ll be able to read all the details of the various memory consumers used for memory-optimized tables. Hopefully, you won’t have to wait too long!
As I mentioned in my previous post, I am heading in new directions, and the first goal has now been reached. Rather than working with training companies to offer my SQL Server Internals training, I am now partnering with Denny Cherry and Associates to offer a special opportunity to those wanting to experience a multi-day training event with me.
Starting next April, we will be offering a three-day event called SQL Server Internals for Tuning, where you can learn about those areas of SQL Server Internals that will help you diagnose and tune problematic queries.
The first event is scheduled for April 4-6, 2016 in Orange County, CA.
You can get details, including the outline for the three day course, and a link for registration, here: http://www.dcac.co/training/internals-for-query-tuning-class-by-kalen-delaney
And after that? As we’re working hard to prepare for the April event, we’re also starting to think about where the next event will take place. Boston? Denver? Orlando? Or some place farther afield like Oslo or Darmstadt?
Let us know what you think! And I hope this new venture gives me a chance to meet some of my readers that haven’t yet had a chance to attend any training events with me.
You know, when Microsoft announces ‘deprecation’ of a feature, there is no guarantee that the feature will be gone any time ‘soon’. All that an official notice of deprecation means is that “This feature will be removed in a future version of Microsoft SQL Server.” But it could be a far future version. SQL Trace, for example, was announced as deprecated in SQL Server 2008, when Extended Events first made their appearance, but SQL Trace, and its trusty sidekick SQL Profiler, are still around, even in the upcoming SQL Server 2016.
So you might not have believed me, in the various forums, or announcements from my training partners, when I said “This may be the last time I will be teaching my 5-day class in <insert-city-or-country-name-here>.” But it’s true: my 5-day SQL Server Internals class has been deprecated. I will be moving in new directions.
Those directions are not yet completely set in stone, as I am still working out various possibilities. I will not be stopping all teaching, at least not for another whole bunch of years, but I will be reducing my travelling drastically. And the 5-day classes have a very limited future. Next year will probably be the last year for that class; and I may only teach it two or three times next year. Here are some of the possibilities I am considering to replace that class:
- I do have a 3-day reduced version of my class, focusing on data/index storage and query tuning internals, that I have offered as an onsite class for companies that just couldn’t have their database folks take 5 whole days for training. However, in the past that was never offered as a public class. That will change in the coming year. I’ll announce here on this blog when I actually have the 3-day class scheduled.
- I will also be starting to offer my shorter one- and two-day classes as on-site classes. If you have a group of DBAs or developers interested in any of my one-day seminars (most of which have been presented as precons for one conference or another), now is your chance. There is a link on my website that allows you to request information about on-site classes.
- I am considering offering my 5-day internals class as an online class, spread out over many more than 5 days. As some of you know, I used to be a University instructor, at UC Berkeley, and there are things about that model of teaching that I really miss. I would love to just present a single topic at a time, talk about it IN-DEPTH for a couple of hours, and then give some homework exercises relating to that topic. A couple of days later, there would be another class, where we’d go over the homework, and answer any questions that came up (after actually having time to think about what was discussed) and then cover another topic. I think I could cover the same content as in my 5-day class over a period of perhaps 15-20 sessions. There are a lot of details still to be worked out before this can happen, including technical logistics, but I’ve wanted to do something like this for a long time.
- Other possibilities include working in partnership with other companies, but these discussions are still in the very early stages.
Of course, I could also win the lottery, and that would open up even more possibilities for what I might do in the future. But I’ve got enough to think about for now, even without that.
Hopefully, I’ll see lots of you folks next week at the SQL Pass Summit in Seattle!
I’ve done it again… I’ve neglected my blog so long that my name has been removed from the list of SQLBlog contributors on the right-hand side of the main page. So I need to fix that.
I have been accumulating a list of questions that have come up during my 5-day SQL Server Internals classes that I cannot immediately answer. Sometimes I can figure out the answer myself with just a bit of research or a simple test, and sometimes I can ask one of my Microsoft contacts for help. (Other times, I actually know the answer, but I just space out on it when asked, after being on my feet talking for 5 straight days.)
I kept hoping that I would eventually get answers to ALL the questions on the list, but I realized I shouldn’t wait to post what I have. And that will get me back up on this blog site.
1. Can memory-optimized tables use BPE (Buffer Pool Extensions)?
No, BPE is only for data read from disk and stored in memory buffers. Memory-optimized tables do not use pages or buffers.
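For reference, BPE is configured at the instance level and applies only to the buffer pool used by disk-based tables; a sketch (the file path and size are placeholders):

```sql
-- Extend the buffer pool onto an SSD-backed file; disk-based pages only
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'F:\SSDCACHE\MyCache.BPE', SIZE = 32 GB);
```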
2. What if a plan guide references a table that is specified in a query without a schema name, and there are two different tables, in different schemas, with the same name?
For a plan guide of type ‘SQL’, all that matters is the T-SQL statement that is included in the plan guide. The same plan guide could be invoked when accessing different tables with the same name.
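A sketch of creating a ‘SQL’ plan guide makes the point: only the statement text is matched, and nothing in the definition resolves which schema’s table is meant (the names and hint here are illustrative):

```sql
EXEC sp_create_plan_guide
    @name = N'Guide1',
    @stmt = N'SELECT * FROM test_table WHERE Col2 = 123;',  -- no schema qualification
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = NULL,
    @hints = N'OPTION (MAXDOP 1)';
```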
3. Why does the Native Compilation Advisor complain about the EXEC command?
In SQL Server 2014, EXEC is never allowed in Natively Compiled Procedures. In SQL Server 2016, we can EXEC a Natively Compiled Procedure or function from within a Natively Compiled Procedure, but we can’t EXEC a non-native module, and we can never EXEC a string.
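For context, here is a minimal skeleton of a natively compiled procedure (the names are illustrative); the required atomic block is where the EXEC restriction applies:

```sql
CREATE PROCEDURE dbo.usp_GetRow (@id int)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC
    WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- dbo.t_memopt stands in for any memory-optimized table
    SELECT c1, c2, c3 FROM dbo.t_memopt WHERE c1 = @id;
    -- EXEC of a string, e.g. EXEC('SELECT 1'), is never allowed here
END;
```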
4. Can we replicate a memory-optimized table?
Memory-optimized tables can be subscribers for transactional replication, but that is the only supported configuration. See this MSDN page for more details:
5. Does auditing work on memory-optimized tables?
6. Are the values in sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats reset when an index is rebuilt?
YES, just as if you had restarted your SQL Server Service.
7. When do I need to use the option ‘scan for startup procs’?
This configuration option is related to the procedure sp_procoption, which allows you to mark a stored procedure as a ‘startup’ procedure, to be executed every time your SQL Server service starts. However, the configuration option to tell SQL Server to scan for these procedures doesn’t seem too useful. When the first procedure is marked for startup, this option is turned on automatically, and when the last one is unmarked, it is turned off. So there is really no need for the configuration option, unless you suspect a problem and want to inhibit the startup procedures.
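A sketch of the two pieces (the procedure name is illustrative; the procedure must live in master):

```sql
-- Mark a procedure to run at every service startup;
-- this flips 'scan for startup procs' to 1 automatically
EXEC sp_procoption @ProcName = N'dbo.usp_Startup',
                   @OptionName = 'startup',
                   @OptionValue = 'on';

-- Manually inhibit all startup procedures, e.g. while troubleshooting;
-- it's an advanced option, so advanced options must be visible first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'scan for startup procs', 0;
RECONFIGURE;
```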
8. When are SQL Trace and the SQL Profiler going away?
They’re still available in SQL Server 2016 CTP 2.4, so I assume they will still be in SQL Server 2016 RTM. After that, I think no one knows yet.
That’s all I’ve got for now.
So here it is, the bits and bytes post I’ve been promising….
This post is not about Hekaton (in-memory OLTP).
This is about the snapshot-based isolation levels that were introduced in SQL Server 2005, which were called optimistic concurrency at the time. With the advent of Hekaton’s truly optimistic concurrency in SQL Server 2014, we’re not calling them optimistic concurrency anymore. Or maybe we’re calling them optimistic concurrency for reads. True optimistic concurrency has no locks at all, while with disk-based tables (i.e. not memory-optimized tables), the SNAPSHOT and READ COMMITTED SNAPSHOT isolation levels still use locks for data modification operations.
So DBSI (disk-based tables with one of the snapshot-based isolation levels) allows older committed data to be available by writing it to a part of tempdb called the version store. There is a DMV that shows you the contents of the version store, and we’ll take a look at it shortly.
The code below will create a new database called test (dropping it first if it already exists). Of course, you could just use an existing database of your own, which is why I also test for the existence of the table. The table has all fixed-length columns, mostly character, so it will be easy to recognize the row contents. After creating the table and inserting one row, I get the page number for the row using the undocumented fn_PhysLocFormatter function, and take a look at the row.
IF db_id('test') IS NOT NULL
    DROP DATABASE test;
GO
CREATE DATABASE test;
GO
ALTER DATABASE test SET read_committed_snapshot OFF;
GO
USE test;
GO
IF object_id('test_table', 'U') IS NOT NULL DROP TABLE test_table;
GO
CREATE TABLE test_table
(
Col1 char(5) NOT NULL,
Col2 int NOT NULL,
Col3 char(3) NOT NULL,
Col4 char(6) NOT NULL
);
GO
INSERT test_table VALUES ('ABCDE', 123, 'Old', 'CCCC');
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%), * FROM test_table;
Here is my result:
I then look at the page indicated. Your page number will likely be different, so replace the 292 with whatever you got. (Trace flag 3604 sends the DBCC PAGE output to the client so you can actually see it.)
DBCC TRACEON(3604);
DBCC PAGE(test, 1, 292, 1);
Here’s my row contents:
It’s a bit out of scope for today to describe what the bits and bytes in the row actually mean. I’ve done it before, and every single bit is described in my SQL Server Internals book.
I enable RCSI (READ COMMITTED SNAPSHOT ISOLATION) for the database, then update the row and look at it again. I also take a look at the version store, before it can be cleaned out.
ALTER DATABASE test SET read_committed_snapshot ON;
UPDATE test_table SET Col3 = 'New';
DBCC PAGE(test, 1, 292, 1);
SELECT * FROM sys.dm_tran_version_store;
And here’s the new row:
Notice that the row is now longer. The first 25 bytes are the same, except that the 3 bytes that used to contain ‘Old’ (4f6c64) are now holding ‘New’ (4e6577). The new row is actually 14 bytes longer, which is the number of bytes that gets added to every row that gets involved with DBSI. There are 8 bytes for the transaction_sequence_number at which the row was updated, and 6 bytes for a pointer to a location in the version store where the previous version of the row can be found.
And in the version store I see this:
I’ve chopped down some of the column names, but if you run your own SELECT from the DMV, you can see the full names. The database ID is 10, so you can use the db_id() function to verify what database the row is from. The rowset_id is another name for the hobt_id, which is another name for partition_id. So once you get in the right database, you can verify the object with this query:
SELECT object_name(object_id) AS object_name
FROM sys.partitions
WHERE partition_id = 72057594040549376;
The metadata also shows the minimum row length and the actual row length (25 bytes). And that used to be as far as I went. In class, I would just say that the last column shown was an encoding of the row data. But I had never looked at it closer. Until recently. And then I realized, as you may have done already because of the queries I’ve just run, that the record_image_first_part contains exactly the same byte string as the original version of the row. And that original version of the row is what was saved in the version store when the row was updated. So it’s really not some mysterious ‘encoding’ of the row data… it IS the row data. So we can not only see the database and the object that the changes are from, but we can also see the rows that have been updated.
Be warned that the version store can become quite large, and it’s not always useful to be able to analyze every single row that has been updated. But sometimes, you just gotta know.
So now you do.
So still no juicy bits and bytes post, only this short one to tide me over.
Another question from my Budapest class wasn’t really a question at all. It was pointing out that something was wrong in my slides! And this makes me think back about a blog post from a while back about features that change that can be easily overlooked.
In the section in my class on using LOB datatypes (text, ntext, image) vs the MAX types (varchar(MAX), etc.), I mention that you can’t pass LOB types as a parameter. It was pointed out to me that this ‘fact’ is not true. I was told they use text parameters all the time in their applications. However, when I sat down to test this out, I realized I needed to put the text data into something before passing it as a parameter, and I definitely can’t declare a variable of type text. So I need to use the varchar(max) type to hold the data to pass in, even though the procedure is created to accept a text type parameter. Here it is:
USE test; -- use the name of any testing database you have
IF object_id('bigdata', 'P') IS NOT NULL DROP PROC bigdata;
GO
CREATE PROC bigdata (@a text)
AS
SELECT 'The input parameter is ', datalength(@a), ' bytes long';
GO
DECLARE @message varchar(max);
SELECT @message = replicate (convert(varchar(max),'Halleluja '), 100000);
EXEC bigdata @message;
So now you (and I) both know!
In my class in Budapest last week, someone asked if point-in-time recovery was supported for transactions on memory-optimized tables. I couldn’t think of a reason why not, but I had never tried it. And just because there is not a technical inhibitor to allow certain functionality, it is often the case with a new technology that certain features aren’t included just because it just hasn’t been tested sufficiently.
But as I thought about it a bit more, I realized there was no way that point-in-time recovery could work for disk-based tables unless it also worked for memory-optimized tables. After all, the transactions are all stored in the same log.
But, I decided to verify anyway.
So I created an in-memory database (IMDB) and a memory-optimized table. (If you want to run the code, you’ll need a folder on your C: drive called HKData and one called HKBackups. Or you can edit the code to change the paths.) I then did an initial database backup of the database with an empty table.
SET NOCOUNT ON
IF db_id('IMDB') IS NOT NULL DROP DATABASE IMDB;
CREATE DATABASE IMDB;
----- Enable database for memory optimized tables
-- add memory_optimized_data filegroup
ALTER DATABASE IMDB
ADD FILEGROUP IMDB_mod_FG CONTAINS MEMORY_OPTIMIZED_DATA
-- add container to the filegroup
ALTER DATABASE IMDB
ADD FILE (NAME='IMDB_mod', FILENAME='c:\HKData\IMDB_mod')
TO FILEGROUP IMDB_mod_FG
-- create a memory-optimized table
USE IMDB;
GO
CREATE TABLE dbo.t_memopt (
c1 int NOT NULL,
c2 char(40) NOT NULL,
c3 varchar(100) NOT NULL,
CONSTRAINT [pk_t_memopt_c1] PRIMARY KEY NONCLUSTERED HASH (c1)
WITH (BUCKET_COUNT = 100000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
BACKUP DATABASE IMDB TO DISK = N'C:\HKBackups\IMDB-FULL.bak'
WITH NOFORMAT, INIT, NAME = N'IMDB-empty-table', SKIP, NOREWIND,
NOUNLOAD, STATS = 10;
I then inserted a row into the table, waited a minute, grabbed the time, waited another minute, and inserted another row. I found that if I didn’t put the WAITFORs in, it got a bit confusing to determine exactly where I wanted to stop restoring.
INSERT t_memopt VALUES (1, 'a', REPLICATE ('b', 100));
WAITFOR DELAY '0:1:0';
SELECT getdate();
WAITFOR DELAY '0:1:0';
INSERT t_memopt VALUES (2, 'c', REPLICATE ('d', 100));
SELECT * FROM t_memopt;
I then noted the time captured with the getdate(), saw the two rows returned, and then made a log backup.
BACKUP LOG IMDB TO DISK = N'C:\HKBackups\IMDB-log.bak'
WITH NOFORMAT, INIT, NAME = N'IMDB-2-inserts-in-log'
Now I could test the restore.
RESTORE DATABASE IMDB
FROM DISK = 'C:\HKBackups\IMDB-FULL.bak'
WITH REPLACE, NORECOVERY;
RESTORE LOG IMDB
FROM DISK = 'C:\HKBackups\IMDB-log.bak'
WITH RECOVERY, STOPAT = '<insert getdate() value noted above>';
SELECT * FROM t_memopt;
When I saw just the one row returned, I knew the point-in-time recovery had worked!
So now I can start working on a real juicy bits and bytes type blog post. Hopefully I’ll have it up next week.
I was just starting to work on a post on column statistics, using one of my favorite metadata functions: sys.dm_db_stats_properties(), when I realized something was missing.
The function requires a stats_id value, which is available from sys.stats. However, sys.stats does not show the column names that each statistics object is attached to. So how can I know which stats_id is the one I want?
So I looked at sp_helpstats, and it shows the column names, but NOT the stats_id! I spent some time trying to figure out how to translate the stats_name into the column name, so that I could just use sys.stats. For example, if you see a stats_name of _WA_Sys_0000000C_38B96646, the 38B96646 is the HEX representation of the object_id and the 0000000C is the HEX representation of the column_id. I can find the column name with this information, but it’s not quick and easy, especially not to generalize in TSQL code.
What DID turn out to be quick and easy was to rewrite sp_helpstats so that it returns the stats_id. It turned out to be an incredibly simple change. I just added an additional column to the temp table created in the procedure, and then added that column to the INSERT into the temp table and the SELECT from the temp table to return the final results. The cursor already fetched the stats_id column to use in its work; it just didn’t return it.
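If you don’t want to modify sp_helpstats at all, a set-based sketch gets the same mapping by joining sys.stats to sys.stats_columns and sys.columns (run it in the database you’re investigating):

```sql
-- For every statistics object, show its stats_id alongside the column names it covers
SELECT object_name(s.object_id) AS table_name,
       s.stats_id,
       s.name AS stats_name,
       c.name AS column_name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
JOIN sys.columns AS c
    ON sc.object_id = c.object_id AND sc.column_id = c.column_id
ORDER BY table_name, s.stats_id;
```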
Note that my script, which I am attaching, has not been thoroughly tested. It has been tested enough for my use when poking around the statistics, and using the sys.dm_db_stats_properties() function.
Perhaps some of you will find this useful. And if not, at least I’ll know where to find the script next time I need it!
In the last couple of weeks, I’ve had to troubleshoot several technical problems that, while not critical, were extremely frustrating, annoying and inconvenient.
Two of them had to do with setting up my new computer. I am now using a Microsoft Surface Pro 3 as my main work computer. And mostly, I absolutely love it! However, there have been a couple of problems.
1. Getting the tablet into the docking station involved a whole lot of jiggling. This is the first time I’ve ever used a docking station and I love the idea. But there were no instructions about HOW TO actually insert the tablet, so that the little piece that actually made the connection between the tablet and docking station was really connected. The first time, it took me a while to realize I wasn’t connected and that I should be looking for a little light to come on, on the side of the docking station, unless I wanted to run down my battery. I would put the tablet in and jiggle it around, and hope that I got it, and sometimes it worked and sometimes it didn’t. I tried looking for a HOW TO INSERT INTO DOCKING STATION guide, but could find nothing, as if it was just too obvious no one should need help. Finally, I found this page:
and I almost overlooked the solution. I read that I had to just slide the Surface into the docking station, but that just wasn’t working for me. It would not ‘just slide in’. Then I read the next part which said: then slide the sides of the docking station into the sides of your Surface.
Well, it had never mentioned that I had to slide anything open before putting the tablet in. But once I knew the sides were supposed to separate, it only took a few seconds to figure out how to do it. Slide the sides open, insert the tablet, slide the sides closed. And now I’m set.
2. The docking station has a mini display port that can be used to connect to an external monitor, which I have. In fact, I have two, for two different computers. So I looked at Amazon, and found an adapter for connecting a mini display port to a VGA cable. It hooked up fine. But the color just looked wrong. I tried the other monitor, and the color still looked wrong. I couldn’t put my finger on exactly what ‘wrong’ meant, but it didn’t look the same as on the tablet screen. And certain colors I’d used for highlighting on some documents turned completely black on the external monitor. So I used my search-fu to dig through pages and pages that were returned when I just searched for ‘color is funny on external monitor.’ Finally I found a page that talked about adjustments on the external monitor, which I didn’t think were necessary since the monitors worked fine with other computers. But on that page, I got enough information to determine that what was wrong was that ‘red’ wasn’t being displayed. The background on the tablet screen was purple, and the background on the external monitor was blue. And there was no red anywhere. So I was able to narrow down the search and look for details about no red available, and found a page about bad pins. The adapter had only cost $8, and Amazon will replace it if it is bad, so I just ordered a new one. And when I returned from my trip to Philly on Friday night, it was waiting for me. First thing Saturday morning I plugged it in, and now my extended monitor has a purple background, too. So the problem was just defective hardware.
3. My second computer had previously been used as my powerhouse machine, and that’s the one I would run my SQL Servers on. I would run TSQL scripts but didn’t create content, with either Word or PowerPoint on that machine. But occasionally I needed to actually have a Word doc on that machine, or I needed to use it for presenting, so I would need PowerPoint files on it. It seemed that some Word docs and some PowerPoint files would get corrupted when I copied them to that machine. I got a really nasty corruption error and then nothing would open. But not all files were like this. It happened most often when I downloaded from my Cloud storage (I have been using Box.com since long before OneDrive). When I copied over a local network share from one machine to the other, it was fine. Or if I copied onto a USB stick and then onto the other machine it was fine. But downloading from Box.com would corrupt files. I was almost ready to leave Box.com but then testing using OneDrive gave me the same behavior. Downloading from either Cloud storage location would corrupt my files. But only on one machine. I could download to my new tablet and everything was fine. Not cool. But I was going out of town and couldn’t spend more time debugging.
However, I had to keep debugging, because on the first day of class, I didn’t have the slides with me, and I had to download them. And they wouldn’t open. So I spent the first day presenting off a PDF file. In my hotel that night, I copied from my tablet onto a stick and then onto the machine I was using for teaching, so I thought I was set. Then Friday morning, I got an email from the manager, with a PowerPoint slide attached. She asked me to display the slide during break, as it had the instructions for the student evaluations. The file was corrupted and would not open. There was no other source for the file; I had to open it. So I did one more online search, in the last 5 minutes before class started, and found this:
It turned out it wasn’t a corruption problem at all, it was a trust center issue.
If you are using Office 2010, you need to check your setting in the Trust Center. Go to File>Options>Trust Center>Trust Center Settings>Protected View. Make sure the first box is NOT checked.
I unchecked the box and the manager’s PowerPoint slide opened right up. And so have all other downloaded files.
4. And finally, there is a technical problem with my phone, a Nokia Lumia 920 Windows Phone. The navigation software with step-by-step directions has just never worked well. I have gotten lost multiple times. I try some fix I’ve heard about, and on the next trip, it happens again. The main symptom is that it just freezes, and I’m driving along, thinking I have miles to go before my turn, when actually I’ve passed it already. And when that happens, there is no way to force a refresh. Even stopping the navigation and choosing a new destination comes back right to the screen that says I still have all these miles to go before I need to turn. Last week when I was driving to my hotel outside Philly, it froze up right before a really critical intersection, and I didn’t know which direction to turn. I knew I had to take a certain highway, but I thought I needed to go west, and the signs were for north and south. So I made the wrong choice, and ended up totally lost. The trip took me almost 2 hours when it should have taken me 45 minutes.
So in my hotel I started a dedicated search. Was there better navigational software for this phone? Was it a hardware problem and I needed to replace the phone? Or a bug in the navigation app? Finally I found a support forum discussion that might be relevant. It mentioned making sure that Wi-Fi was off, because that could sometimes interfere with GPS. And I do almost always forget to turn the Wi-Fi off when I leave home or hotel and hit the road where there isn’t Wi-Fi everywhere.
It seems a bit better now with the Wi-Fi turned off, but it’s not perfect. On my drive home from the Seattle airport Friday night, I put it to the test. Although it didn’t freeze, it did lose the signal multiple times. I didn’t get lost, since I knew my way home, but it would have been nice to know that I would be able to navigate next time I was in unknown territory. And I still have no confidence in that.
So three out of four isn’t bad, but I still need to figure out what to do about my phone. Maybe it’s time for an upgrade?
Online searching is certainly a powerful capability we have now, and figuring out the best search terms takes skill and patience. I like to think that all my searching for SQL Server solutions has helped me find other technical solutions more efficiently. And I decided to blog this so I would have a record of my solutions in case I ever needed them again!
I hope you are all having a wonderful start to your new year.
Oh, and GO HAWKS!
I apologize to all you folks who attended my Hekaton precon at the SQLPASS Summit last month, who have been eagerly awaiting my answers to some of the unanswered questions. I had got most of the answers sorted out, but I was waiting for one last reply, and put the whole list on the back burner. I just realized today it was now December, and I still hadn’t published this.
So here it is, six questions that I didn’t completely answer during my talk, although for some of them I did make a guess.
1. Do SCHEMA_ONLY tables affect the upper limit of memory available for memory-optimized tables?
Remember, there actually is no hard upper limit for the amount of memory you can use for memory-optimized tables. The recommended maximum of 256 GB has to do with the number of CFPs (Checkpoint File Pairs) that can be supported. So in that sense you might be able to have more data in SCHEMA_ONLY tables, because you don’t have to worry about the size of the CFPs, as the data in those tables is never written to the files. But you still need to be aware of the overall system memory limit and make sure your machine has enough memory for all the memory-optimized tables (both SCHEMA_AND_DATA and SCHEMA_ONLY) as well as memory for the buffer pool for your disk-based tables. Plus memory for other SQL Server and system needs.
If you have bound a database to a resource pool with a fixed upper limit of memory, all your memory-optimized tables must fit within that memory limit.
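As a rough sketch of what that binding looks like (the pool and database names here are hypothetical, and the percentages are just illustrative):

```sql
-- Create a pool that caps memory, and bind a database to it so that
-- its memory-optimized tables cannot grow past the pool's limit.
CREATE RESOURCE POOL HkPool WITH (MAX_MEMORY_PERCENT = 50);
ALTER RESOURCE GOVERNOR RECONFIGURE;

EXEC sp_xtp_bind_db_resource_pool 'HkDB', 'HkPool';

-- The binding only takes effect the next time the database comes online:
ALTER DATABASE HkDB SET OFFLINE;
ALTER DATABASE HkDB SET ONLINE;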
2. Is Point-in-time RESTORE possible for databases that have performed transactions on memory-optimized tables? (i.e. can we do a RESTORE LOG WITH STOPAT command?)
Simple answer here: YES
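So a restore sequence like the following works just as it does for a database with only disk-based tables (database name, backup paths, and the STOPAT time here are all hypothetical):

```sql
RESTORE DATABASE HkDB FROM DISK = 'C:\backups\HkDB_full.bak'
    WITH NORECOVERY;
-- Stop the log restore at a specific point in time:
RESTORE LOG HkDB FROM DISK = 'C:\backups\HkDB_log.trn'
    WITH STOPAT = '2014-12-01 10:30:00', RECOVERY;
```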
3. Can a database with memory-optimized tables be restored or attached onto a SQL Server 2014 instance using STANDARD Edition?
No. Both restore and attach require that all filegroups are available, including the filegroup for your memory-optimized tables. And once you have that special filegroup, the database cannot be restored or attached on STANDARD Edition, and you get an error. So you don’t even have to have any memory-optimized tables. As soon as you create a memory-optimized filegroup, the database will no longer be accepted.
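The filegroup in question is the one created with CONTAINS MEMORY_OPTIMIZED_DATA; just adding it is enough to make the database Enterprise-only. A sketch, with hypothetical names and paths:

```sql
-- Adding this filegroup (even with no memory-optimized tables yet)
-- makes the database unrestorable on STANDARD Edition.
ALTER DATABASE HkDB
    ADD FILEGROUP HkFG CONTAINS MEMORY_OPTIMIZED_DATA;

-- The container is a folder, not a file, so FILENAME has no extension:
ALTER DATABASE HkDB
    ADD FILE (NAME = 'HkData', FILENAME = 'C:\HkData\HkContainer')
    TO FILEGROUP HkFG;
```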
I wish I could tell you what the exact error would be, but I don’t have a SQL Server 2014 STANDARD Edition to test on. I have frequently requested that Microsoft provide a feature that allows us to ‘switch off’ Enterprise features if we are using EVALUATION or DEVELOPER Edition, so we can actually test using a base other than ENTERPRISE. But so far, Microsoft has not complied. There are two Connect items that seem to address this need. The first one at https://connect.microsoft.com/SQLServer/feedback/details/331297/developer-edition-in-standard-workgroup-mode is marked “Closed as WON’T FIX” but the other is still active at https://connect.microsoft.com/SQLServer/feedback/details/496380/enable-sql-developer-edition-to-target-specific-sql-version, so you might want to go give it a vote.
4. Can views on memory-optimized tables be accessed from within a natively compiled procedure? (I think the question was sort of hinting at a workaround where you could create a view on a disk-based table and access that through the native procedure.)
The answer is NO. You can create a view on a memory-optimized table, but you cannot access it from a native procedure. No views at all can be referenced in a native procedure.
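To make the restriction concrete, here is a minimal sketch of a natively compiled procedure (the table and column names are hypothetical). The FROM clause must name the memory-optimized table itself; substituting a view produces an error:

```sql
CREATE PROCEDURE dbo.GetCustomer (@CustomerID int)
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
    (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    -- Must reference the memory-optimized table directly;
    -- a view here fails, even a view on a memory-optimized table.
    SELECT CustomerName
    FROM dbo.Customers
    WHERE CustomerID = @CustomerID;
END;
```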
5. With REPEATABLE READ Isolation, for which SQL Server needs to guarantee READ STABILITY, what happens if a column that wasn’t read is changed? (i.e. your query read col1 and col2 from Table1; what happens if another query changes col3?)
It’s the row that matters. If Tx1 reads col1 and col2 from Table1 in REPEATABLE READ and hasn’t yet committed, and Tx2 then updates col3 in Table1, Tx1 will fail at commit time with a REPEATABLE READ violation.
6. Can transactions on memory-optimized tables run in RCSI (READ COMMITTED SNAPSHOT Isolation)?
RCSI is just a variant of READ COMMITTED isolation. Operations on memory-optimized tables have to run at something higher than READ COMMITTED, so RCSI is moot.
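SQL Server 2014 does provide a database option to help here: with it enabled, transactions that touch memory-optimized tables under READ COMMITTED (including RCSI) are automatically elevated to SNAPSHOT isolation for those tables. The database name below is hypothetical:

```sql
-- Elevate READ COMMITTED (and RCSI) access to memory-optimized tables
-- to SNAPSHOT isolation automatically, instead of raising an error.
ALTER DATABASE HkDB
    SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;
```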
I hope this is useful!
And so another PASS Summit passes into history. It was an awesome week, filled with old friends and new, and lots of superlative technical content! My Hekaton book was released just in time, and it was great seeing the excitement. Red Gate gave away all the copies they had, the bookstore sold all they had, and I gave away 20 in my sessions.
There's also a review of my book up already: http://www.i-programmer.info/bookreviews/21-database/7950-sql-server-internals-in-memory-oltp.html
My PASS Demo scripts are now available on my website. Just go to www.SQLServerInternals.com and click on Resources in the left-hand menu, and then choose Conference Content.
The winners of my book giveaway were announced on Twitter, and email was sent out. I thought I was going to have to say that one of the winners had not yet responded, but just as I started writing, the sixth winner’s email arrived. I’m still scrambling to get caught up on everything that got put off while I was at the MVP Summit and PASS Summit last week, but I hope to have the books shipped by the end of this week.
Here’s the list:
And now that PASS is over, my next project is updating my SQL Server Internals course to SQL Server 2014. The first public delivery will be in Stockholm in February. Check my schedule for more details: http://sqlserverinternals.com/kalens-schedule/
We are having a glorious Autumn here in the Beautiful Pacific Northwest. I hope you are enjoying your Autumn (or Spring in the Southern Hemisphere) as much as I am enjoying it here!
And it coincides with a new book of mine being published. So I decided I would give away some presents!
Today marks 27 years that I have been working with SQL Server. On this day in 1987 I started working for the Sybase Corporation in Berkeley, California, and SQL Server has consumed my working life ever since….
I still have quite a few copies of my SQL Server 2012 Internals book, plus all the copies of the new In-Memory OLTP book, so I’ll be giving away a couple of each. (Many more of the In-Memory OLTP book will be given away at the PASS Summit next week.)
So what do you have to do?
1. Send an INSERT statement to this email address
2. Include "Happy Anniversary" in the subject line. The mail will be filed as junk if you don't.
3. The body of the email should consist of a single T-SQL INSERT statement (shown below). For the last column, choose which book you’d prefer.
A 1 will indicate the SQL Server 2012 Internals book, and a 2 will indicate the In-Memory OLTP book.
4. Double-check the syntax! Make sure you have open and close quotes for all strings, and that the quotes are STRAIGHT quotes, as in the INSERT statement below. If you use Word to add a quote of your own or replace one that's there, it's usually a "smart" quote, which in this case is NOT smart. SQL Server will reject the INSERT if the quotes are not STRAIGHT. I suggest you use SSMS to write the insert.
5. Send the email before 23:59:59, October 30, US Pacific Time.
Here is the INSERT statement. Replace the strings with your specific information, and include it in your email:
INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
VALUES('Your email address', 'First Name', 'Last Name', 'Country', 'City, State or Region', [1|2] );
Due to postage costs, I’ll only choose one winner from outside the US. I’ll choose 3 from within the US.
If my friend John wanted to enter, and wanted the In-Memory OLTP book, his INSERT might look like this:
INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
VALUES('firstname.lastname@example.org', 'John', 'Cook', 'US', 'Houston, TX', 2 );
If my son-in-law in Germany wanted to enter, and he wanted the 2012 Internals book, his INSERT might look like this:
INSERT INTO Raffle(emailAddress, firstName, lastName, country, city_region, choice)
VALUES('email@example.com', 'Manuel', 'Meyer', 'Germany', 'Weil der Stadt', 1 );
I am very excited to be able to report that my latest book is at the printers now, and should be ready in time for the PASS Summit!
And since the last files were sent off to the publisher (Red Gate) last Friday, I could finally finish up the slides for my Pre-Conference Seminar, all about In-Memory OLTP.
Check it out here:
Although there will be some copies of the book at the Red Gate booth, along with many of their other wonderful titles, Red Gate will also be giving me a box full of books to give away during the session. There won’t be enough for everyone (unfortunately) but your chances of snagging one will be good!
See you in November!
I’ll be speaking on my new favorite topic… Hekaton! (aka In-memory OLTP)
I was delighted to have the chance to give a taste of the presentation for the 24 Hours of PASS preview on September 9th. I was in Germany at my daughter's house that day, so although my US audience heard me in the morning, for me it was almost supper time. I am always impressed by the quality of the questions I get when giving online talks, and this was no exception. I was only able to address about a third of the questions that came in, but the organizers sent me a list of most of the submitted questions to address later.
Note that I won’t be giving really detailed answers to most of the questions. After all, this session was just a preview of my precon at the PASS Summit. In addition, I am just finishing up a book on In-memory OLTP for Red Gate, that hopefully will be released at the Summit. And that will have lots more details!
Also keep in mind that this is a brand new technology, so the best practices and use cases are still being explored and those people out there actually implementing In-memory OLTP solutions are discovering new things all the time!
Q: Is the size of the table (or tables) that you want to use Hekaton on dependent on the amount of RAM you have available?
Absolutely. You can’t have more data in memory than will fit in your memory. In addition, if not all your tables are in-memory tables, you’ll need to make sure there is enough memory for the buffer pool to hold needed pages from the disk-based tables. You also need to make sure you have enough memory for multiple versions of rows in your in-memory tables, based on how much updating you’re going to be doing.
Q: How can I tell if a disk-based table has overflow rows, before I attempt to migrate it to Hekaton? Also, how would I work around this?
You can look for any objects that have rows in sys.allocation_units with a type description of anything other than IN_ROW_DATA. This will give you tables with either ROW_OVERFLOW or LOB data. If you just want to know the tables with ROW_OVERFLOW data, look for where the type_desc is ROW_OVERFLOW_DATA. This query should get you started:
SELECT OBJECT_NAME(object_id)
FROM sys.partitions
WHERE index_id < 2
AND partition_id IN
    (SELECT container_id
     FROM sys.allocation_units
     WHERE type_desc != 'IN_ROW_DATA'
     AND total_pages > 0);
Working around this might take a table redesign (which is a little beyond the scope of this short Q&A).
Q: Is it a requirement that the memory should be double the amount of data to efficiently use the Hekaton engine?
As mentioned in the first answer, you need to have room for all the table data for tables you want in-memory, and you need to have enough room for the indexes on those tables, and for the versions of rows as the data changes. In some cases you might end up needing more than twice the data size, but it completely depends on the volume of your data changes, and the indexes you’ve created. If you have insufficient memory, it won’t just be a matter of not efficiently using the Hekaton engine; you will not be able to do any more data modifications once you are out of memory. But, there is no hard and fast rule. You need to know your data and your operations, and you need to test thoroughly.
Q: What happens as unique row counts exceed allocated bucket sizes?
Q: Do buckets grow as data is inserted ? Can you manually alter the number of buckets?
I will assume you are asking what happens when the number of unique values exceeds the predefined bucket count. Here, this does become a matter of efficiency. If there are many fewer buckets than unique values, one bucket will be used for multiple values and your search operations will be slower. The number of buckets is fixed at the time the table is created and cannot be changed, either automatically or manually. You’ll have to drop and recreate the table.
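For reference, the bucket count is declared as part of the hash index definition at CREATE TABLE time. A sketch with a hypothetical table; the BUCKET_COUNT should be sized for the expected number of unique key values (SQL Server rounds it up internally to a power of two):

```sql
CREATE TABLE dbo.Customers
(
    -- BUCKET_COUNT is fixed here forever; changing it later means
    -- dropping and recreating the table.
    CustomerID   int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    CustomerName nvarchar(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```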
Q: When does the data write to disk, and do you have to compress the row history to free up memory space?
Q: The data file name does not need an extension (like 'mdf')?
New rows of data are written to disk using a mechanism called ‘continuous checkpoint’, which uses SQL Server’s filestream technology to write to files on disk. There is no concept of a data file, so no .mdf is used for the in-memory data. The database containing the in-memory tables still needs a regular data file and a log file, but the in-memory tables are not written there. The filestream files where the in-memory data is written, used only if needed for recovery, do not use a file extension. You might want to read about the filestream technology, which was introduced back in SQL Server 2008.
Q: How do you calculate how much memory the table is going to need?
The required space depends on the data in the rows, the number of rows, and the volume of updates. There is no simple formula for calculation because no one knows what your volume of changes will be. You can make a guess if you know your workload, and run some tests. There is good metadata available to tell you how much space you are using for your in-memory tables once you have created them. You can look at the memory used, run some of your workload, and then see how much more memory is used, to see how the memory requirements are growing. You might start with the DMV: sys.dm_db_xtp_table_memory_stats.
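A starting-point query against that DMV might look like this (returning whatever memory-optimized tables exist in the current database):

```sql
-- Memory currently used by each memory-optimized table and its indexes.
SELECT OBJECT_NAME(object_id) AS table_name,
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats;
```

Run it before and after a representative chunk of your workload to see how the memory requirements grow.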
Q: I would really like to use In-Memory OLTP, but we frequently alter tables. What is your recommendation on using In-Memory OLTP for these tables?
As you’ve heard, once an in-memory table is created, it cannot be altered, not even to add more indexes or to change the bucket count for any of the existing hash indexes. You’ll have to test whether dropping and recreating the table, or creating a new table and copying the data, can be done in a reasonable amount of time for your requirements. (Keep in mind that even with disk-based tables, some ALTER TABLE operations are not fast, efficient changes. Some ALTER TABLE options require that SQL Server completely recreate every row.)
If you want even more details on Hekaton in SQL Server 2014, there are still seats in my precon (last I heard)!
And check out the Red Gate booth in the vendor area to find my new In-Memory OLTP Internals book!