Ramblings of Greg Low (SQL Server MVP, MCM and Microsoft RD) - SQL Down Under
In a previous post, I talked about how changing data types from the older ntext, text, and image data types to the current nvarchar(max), varchar(max), and varbinary(max) data types doesn't achieve the same outcome as defining the tables that way in the first place, unless you subsequently rebuild the tables.
I also had a question about how you can find out which columns still have pointers to out-of-row data. Unfortunately, finding that out doesn't seem so easy, and it can vary row by row within the table.
To see this, let's start by creating a table, populating it, then changing the data type the same way that I did in the last post:
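The original script isn't shown here, but a sketch of that setup would look like this (the table, column names, and inserted values are my reconstruction, with RequestDetails as the third column to match the page dump later in the post):

```sql
CREATE TABLE dbo.Requests
(
    RequestID int IDENTITY(1,1) CONSTRAINT PK_dbo_Requests PRIMARY KEY,
    RequestedAt datetime2 NOT NULL
        CONSTRAINT DF_dbo_Requests_RequestedAt DEFAULT (SYSDATETIME()),
    RequestDetails ntext NOT NULL   -- deprecated type: data is stored out of row
);
GO

INSERT dbo.Requests (RequestDetails)
VALUES (REPLICATE('Some request details. ', 50));
GO 1000    -- SSMS batch repeat: insert 1000 rows

-- Change the data type the same way as in the previous post
ALTER TABLE dbo.Requests ALTER COLUMN RequestDetails nvarchar(max) NOT NULL;
```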
First I ran a query to find the pages that are allocated to the table:
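The query isn't reproduced here, but something along these lines returns the allocated pages (sys.dm_db_database_page_allocations is undocumented, available from SQL Server 2012 onwards):

```sql
SELECT allocated_page_file_id AS FileID,
       allocated_page_page_id AS PageID,
       page_type_desc AS PageType
FROM sys.dm_db_database_page_allocations(
         DB_ID(), OBJECT_ID(N'dbo.Requests'), NULL, NULL, 'DETAILED')
WHERE is_allocated = 1
ORDER BY FileID, PageID;
```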
Note that there are in-row data pages and LOB data pages. Let's now investigate the contents of page 2579316 from file 1:
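That inspection uses the undocumented DBCC PAGE command (the database name here is a placeholder):

```sql
DBCC TRACEON(3604);                          -- route DBCC PAGE output to the client
DBCC PAGE (N'SomeDatabase', 1, 2579316, 3);  -- database, file ID, page ID, print option
```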
If we scroll down through the contents of the page, we can find the row data and pointers:
Note that this shows that column 3 (RequestDetails) is a Textpointer. You can see from the RowId value where the data is located (File 1, Page 2044944, Slot 0).
I'd like to automate these steps but that's a project for another day.
I've run into a situation at a number of sites where the following occurs:
- An excessive number of logical page reads during query execution
- The ntext, text, or image data types have been replaced by nvarchar(max), varchar(max), or varbinary(max) data types, as part of a clean-up of deprecated data types.
- Rebuilding the table greatly reduces the number of page reads and the customer is puzzled about why.
One of the causes for this situation is related to how the data in these columns is stored. The ntext, text, and image data types defaulted to having their data stored out of row. The row contained a pointer to where the data was located. By comparison, the nvarchar(max), varchar(max), and varbinary(max) data types default to storing data in-row where possible.
The issue is that when the data type is changed, the data isn't moved in-row and won't move until the column is updated. Here's an example:
Let's start by creating the table:
Note that the RequestDetails column is of the ntext data type. It will default to being stored out of row. Let's start by inserting some data into the table:
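A sketch of the table and the data load (the column names and inserted values are placeholders):

```sql
CREATE TABLE dbo.Requests
(
    RequestID int IDENTITY(1,1) CONSTRAINT PK_dbo_Requests PRIMARY KEY,
    RequestedAt datetime2 NOT NULL
        CONSTRAINT DF_dbo_Requests_RequestedAt DEFAULT (SYSDATETIME()),
    RequestDetails ntext NOT NULL   -- deprecated type: data is stored out of row
);
GO

INSERT dbo.Requests (RequestDetails)
VALUES (REPLICATE('Some request details. ', 50));
GO 1000    -- SSMS batch repeat: insert 1000 rows
```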
If we query the allocation units that have been created for this table, we see the following:
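That query can be written something like this:

```sql
SELECT au.type_desc, au.total_pages, au.used_pages, au.data_pages
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
    ON au.container_id = p.partition_id  -- for a simple table, hobt_id = partition_id
WHERE p.object_id = OBJECT_ID(N'dbo.Requests');
```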
Note that there are both IN_ROW_DATA and LOB_DATA allocation units and both have been used.
For a comparison, let's create another table dbo.RequestsX that uses nvarchar(max) instead and see the difference:
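The comparison table is identical apart from the data type of the RequestDetails column (again, a sketch):

```sql
CREATE TABLE dbo.RequestsX
(
    RequestID int IDENTITY(1,1) CONSTRAINT PK_dbo_RequestsX PRIMARY KEY,
    RequestedAt datetime2 NOT NULL
        CONSTRAINT DF_dbo_RequestsX_RequestedAt DEFAULT (SYSDATETIME()),
    RequestDetails nvarchar(max) NOT NULL   -- stored in-row where possible
);
GO

INSERT dbo.RequestsX (RequestDetails)
VALUES (REPLICATE('Some request details. ', 50));
GO 1000
```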
Note that although there are LOB_DATA and ROW_OVERFLOW_DATA allocation units, no pages have been allocated to either of those.
Now, let's go back to the original table and change the data type of the RequestDetails column to nvarchar(max) and see the difference:
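The change itself is a single ALTER:

```sql
ALTER TABLE dbo.Requests
ALTER COLUMN RequestDetails nvarchar(max) NOT NULL;
```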
Note that the ROW_OVERFLOW_DATA allocation unit has appeared but isn't used. Also the original two allocation units are unchanged in how much they are used. The bottom line is that the data hasn't moved yet.
Let's now update the column by setting it to its own value and compare the outcome:
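Updating the column to its own value is enough to force the data to move:

```sql
UPDATE dbo.Requests
SET RequestDetails = RequestDetails;
```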
Notice that the data has now moved in-row, but we have some fragmentation as a result.
So just for completeness, let's rebuild the table completely and compare the outcome:
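The rebuild is:

```sql
ALTER TABLE dbo.Requests REBUILD;
```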
So the important message is that if you go through your code and dutifully replace all the ntext, text, and image data types with nvarchar(max), varchar(max), and varbinary(max) data types, you'll need to rebuild the table to get the best results.
OK, I've been doing a bunch of MVA courses as part of the local Microsoft AU dev div heroes campaign. I need to find 5 Australian citizens who have done at least one of the courses below, and get their email addresses.
You don't have to do the whole of any badge. For example, you could do SQL Server 1, SQL Server 2, or SQL Server 3.
Anyone up for it? Or do all of the courses for a badge and receive a figurine. They are cute. There are some t-shirts on offer too. Details are here:
Looking forward to delivering another session for the Perth SQL Server user group. It will be on the night of 27th November. Here are the session details:
"Understanding SQL Server High Availability Options
While more and more systems need ever increasing levels of availability, many customers are confused about when to use each of the high availability options provided by SQL Server. In this session, Greg will provide a detailed overview of log shipping, database mirroring, failover clustering and availability groups, with recommendations on where and when to use each, and the pros and cons of each option. He will discuss all currently-supported versions of SQL Server from 2008 to 2014. If you are confused about SQL Server HA options, this session is for you."
Contact Jenny Richardson for more details (or ping me or Mai).
Hope to catch up with many of our Perth friends while there.
Lots of big changes for Visual Studio and .NET were announced today.
The biggest items are:
- .NET becoming open source
- Microsoft work to help move .NET onto Linux and Mac
- Visual Studio 2013 Community Edition
- Visual Studio 2015 Preview available
- Lots of integration for Xamarin developers including Xamarin install from within Visual Studio
The one that I like most here is the Visual Studio 2013 Community Edition. We’ve had Visual Studio Express for some time but it was very limited. In particular, it blocked any attempt to extend it with plug-ins. Plug-ins are where the real creativity with the product can appear. The new community edition is full-featured and free for all except enterprise application development.
Full details from Soma are here: http://blogs.msdn.com/b/somasegar/archive/2014/11/12/opening-up-visual-studio-and-net-to-every-developer-any-application-net-server-core-open-source-and-cross-platform-visual-studio-community-2013-and-preview-of-visual-studio-2015-and-net-2015.aspx
I do hope the SQL Server team are watching this. I like Jamie’s suggestion here about doing the same with SQL Server Developer Edition. As Jamie points out, it barely adds to revenue. Making it free would seem a good idea.
Cost is one thing but extensibility is another. Whenever there are MVP meetings on campus, I always feel like I’m the one in the room endlessly asking about extensibility when each new feature is shown. And the answer from the SQL Server team is invariably “we haven’t allowed for extensibility in this version but might in the future”. But that almost never happens.
So many new features fall short of the mark when they are first released but if there were extensibility points, others could contribute to make them more useful. Without those extensibility points, new incomplete features can just flounder. There have been many examples of this over the years. (As an example, ask where the UI for Service Broker is. Klaus had some wonderful work done on building one that he showed us back in 2006 but there’s no supported way to make add-ins for SQL Server Management Studio either. You can hack it but then you need to worry about it being broken by every new update or release that comes out).
I think this is the difference between shipping a product, and building an ecosystem around a product. I’d love to see SQL Server morph into something that has an ecosystem.
I’m not a fan of letting the system automatically name constraints, so that always leads me to thinking about what names I really want to use. System-generated names aren’t very helpful.
Primary keys are easy. There is a pretty much unwritten rule that SQL Server people mostly name them after the table name. For example, if we say:
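For example, with an unnamed primary key constraint (a sketch; the column names are illustrative):

```sql
CREATE TABLE dbo.Clients
(
    ClientID int NOT NULL PRIMARY KEY,   -- system-generated name e.g. PK__Clients__E67E1A04...
    ClientName nvarchar(100) NOT NULL
);
```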
A violation of the constraint will return a message like:
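With a hypothetical duplicate insert, the message is along these lines:

```sql
INSERT dbo.Clients (ClientID, ClientName) VALUES (1, N'Test');
INSERT dbo.Clients (ClientID, ClientName) VALUES (1, N'Test again');
-- Msg 2627: Violation of PRIMARY KEY constraint 'PK__Clients__E67E1A04...'.
-- Cannot insert duplicate key in object 'dbo.Clients'.
-- The duplicate key value is (1).
```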
The name isn’t helpful and it shows us the key value but not which column was involved.
So, we might say:
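Naming the constraint after the table gives:

```sql
CREATE TABLE dbo.Clients
(
    ClientID int NOT NULL CONSTRAINT PK_Clients PRIMARY KEY,
    ClientName nvarchar(100) NOT NULL
);
```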
Even in this case, there are a few questions:
- Should the name include the schema? (ie: PK_dbo_Clients) If not, this scheme has an issue if there are two or more tables in different schemas with the same table name.
- Should the name include the columns that make up the key? (ie: PK_dbo_Clients_ClientID) This might be useful when an error message is returned. A message that says that you violated the primary key, doesn’t tell you which column (or columns) were involved.
So perhaps we’re better off with:
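That is, including both the schema and column names:

```sql
CREATE TABLE dbo.Clients
(
    ClientID int NOT NULL CONSTRAINT PK_dbo_Clients_ClientID PRIMARY KEY,
    ClientName nvarchar(100) NOT NULL
);
```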
I do like to name DEFAULT constraints in a similar consistent way. In theory it doesn’t matter what you call the constraint however, if I want to drop a column, I first have to drop the constraint. That’s much easier if I have consistently named them. I don’t then have to write a query to find the constraint name before I drop it. I include the schema, table, and column names in the DEFAULT constraint as it must be unique within the database anyway:
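A sketch with a hypothetical OpeningBalance column:

```sql
ALTER TABLE dbo.Clients
ADD OpeningBalance decimal(18,2) NOT NULL
    CONSTRAINT DF_dbo_Clients_OpeningBalance DEFAULT (0);

-- Dropping the column later is then predictable, with no need to query for the name:
ALTER TABLE dbo.Clients DROP CONSTRAINT DF_dbo_Clients_OpeningBalance;
ALTER TABLE dbo.Clients DROP COLUMN OpeningBalance;
```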
CHECK constraints (and UNIQUE constraints) are more interesting. Consider the following constraint:
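For example, an unnamed CHECK constraint on a hypothetical OpeningBalance column:

```sql
ALTER TABLE dbo.Clients
ADD CHECK (OpeningBalance >= 0);   -- system-named, e.g. CK__Clients__Opening__...
```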
The error returned is:
Note how (relatively) useless this is for the user. We could have named the constraint like so:
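The name here is illustrative:

```sql
ALTER TABLE dbo.Clients
ADD CONSTRAINT CK_dbo_Clients_OpeningBalance_Must_Not_Be_Negative
    CHECK (OpeningBalance >= 0);
-- A violation message then includes that name:
-- The INSERT statement conflicted with the CHECK constraint
-- "CK_dbo_Clients_OpeningBalance_Must_Not_Be_Negative". ...
```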
Note how much more useful the error becomes:
And if we are very keen, we might remove the underscores and delimit the name to make it more readable:
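A delimited version might be:

```sql
ALTER TABLE dbo.Clients
ADD CONSTRAINT [Opening balance must not be negative]
    CHECK (OpeningBalance >= 0);
-- The INSERT statement conflicted with the CHECK constraint
-- "Opening balance must not be negative". ...
```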
This would return:
I’d like to hear your thoughts on this. How do you name your constraints?
I was teaching a SQL 2014 class yesterday and the students were using the current SQL Server 2014 Enterprise (on Windows Server 2012 R2) template.
We were using the Table Memory Optimization Advisor (right-click a table in Object Explorer within SQL Server Management Studio). Several people in the class reported that when they got to the primary key migration screen, they couldn’t interact with it because the checkboxes were not present in the displayed list of columns.
This is what the screen should have looked like:
This is what it did look like:
Note that there are no checkboxes in the left-hand column. I had never seen that happen before.
We tried clicking, etc. in the area (wondering if there was some odd font problem or something) to no avail. There seemed to be plenty of room for a checkbox so it seemed like there must be some logical reason why it didn’t want any of these columns as the primary key. But it only happened on some machines.
Eventually, one of the students resized the rows that were displayed. The checkboxes then appeared.
This is a basic UI issue. I’ve recorded it here in case anyone else runs into it.
Nice to see some updated connectors for Oracle and Teradata for SQL Server Integration Services developers/users.
Version 3.0 of the Attunity connectors has been released. Some of these have substantial improvements. For example, the list of enhanced features for the Teradata connector includes:
- Expose additional TPT Stream attributes (TD_PACK and TD_PACKMAXIMUM) to provide maximum tuning flexibility.
- Support for loading tables with columns using reserved words.
- Fixed mapping of TIME(0) to the DT_STR SSIS data type.
- Correct display of table names longer than 30 characters.
- Support for block mode, which is now set as the default mode.
- Expose TD_SPOOLMODE for TPT Export for faster extracts.
- Support for Extended Object Names (EON), which allow UNICODE object names in the Data Dictionary tables.
- New data types (TD_PERIOD_DATE, TD_PERIOD_TIME, and TD_NUMBER).
You’ll find details of the enhancements and downloads at: http://www.microsoft.com/en-us/download/details.aspx?id=44582
I got an email the other day from Sean and Jen at Midnight DBA (www.midnightdba.com) about their new tool Minion for managing index rebuilds and fragmentation:
You can find details of Minion here: http://www.MidnightSQL.com/Minion
In some ways, they have been a little more ambitious with these tools than the tools provided by Ola Hallengren (https://ola.hallengren.com/), which have been our favourite tools for this work. I quite liked many of the concepts they have put into the tool. It still feels a bit version-1.0-ish to me but shows lots of promise. I liked the way that it’s all set up with a single script. I would, however, like to see more error handling, etc. in that script. For example, you should be able to run it twice without errors. With the script I looked at, that’s not possible.
I liked the way they are providing some capture of details from sys.dm_db_index_usage_stats.
For both this tool, and for Ola’s tool, I wish there was more focus on the index usage stats. Rather than basing decisions about rebuilding or reorganizing indexes based only on fragmentation level, I’d like to see details of how the indexes are used (ie: user seeks vs user scans) playing a much larger role in deciding the operations to be performed. Overuse of reindexing is a primary cause of bloated logs, log shipping failures, mirrors that fall behind, etc.
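For example, a starting point for that sort of decision could combine fragmentation details with the usage counters (a sketch, current database only):

```sql
SELECT OBJECT_SCHEMA_NAME(ius.object_id) AS SchemaName,
       OBJECT_NAME(ius.object_id) AS TableName,
       i.name AS IndexName,
       ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
INNER JOIN sys.indexes AS i
    ON i.object_id = ius.object_id
   AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
ORDER BY ius.user_scans DESC;   -- scan-heavy indexes benefit most from low fragmentation
```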
Regardless, it’s great to see a new entrant in this area. I encourage you to check it out, see what you think, and more importantly, provide feedback to them. Sean has recorded a video demo of the product which is also available at the site.
There was a question this morning on the SQL Down Under mailing list about how to determine the Windows groups for a given login.
That’s actually easy for a sysadmin login, as they have IMPERSONATE permission for other logins/users.
Here is an example procedure:
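The procedure isn't reproduced here, but a sketch of how it could work follows (the procedure name and column aliases are mine; it requires permission to impersonate the login, which sysadmin members have):

```sql
CREATE PROCEDURE dbo.ListWindowsGroupsForLogin
    @LoginName sysname
AS
BEGIN
    EXECUTE AS LOGIN = @LoginName;   -- switch security context to the target login
    SELECT name AS GroupName,
           usage AS Usage
    FROM sys.login_token
    WHERE type = N'WINDOWS GROUP';
    REVERT;                          -- switch back to the caller's context
END;
```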
When I execute it on my system this way:
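(Procedure and login names here are placeholders:)

```sql
EXEC dbo.ListWindowsGroupsForLogin @LoginName = N'SOMEDOMAIN\SomeUser';
```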
It returns the following:
Note that the Usage column could also return “DENY ONLY” or “AUTHENTICATOR”.
As most websites do, we collect analytics on the people visiting our site at http://www.sqldownunder.com.
I thought it might be interesting to share the breakdown of visitors to our site. Keep in mind that we have a primarily Microsoft-oriented audience. Enjoy!
No surprise on the native languages:
Country breakdown reflects the amount of local traffic we have for instructor-led courses. Most others are podcast listeners:
We first noticed Chrome slightly outstripping IE a while back but recently, it’s changed a lot. I suspect that IE11 will have been an issue here:
No surprises on the operating systems but Linux continues to disappear from our clients. It used to be higher:
The big change has been in mobile operating systems. It’s the first time that iOS has only managed 50%. It used to be 82% for us:
We’re also seeing a shift in screen resolutions:
And this is the mix of where our site visitors come from:
I had the pleasure of recording another SQL Down Under show today.
Show 64 features a member of the Microsoft Azure DocumentDB product group discussing Azure DocumentDB and what SQL Server DBAs and developers need to know about it.
JSON-based storage has been one of the highest rated requests for enhancements to SQL Server. While we haven’t got those enhancements yet, DocumentDB nicely fills a gap between NoSQL databases (I use the term loosely) and relational databases.
You’ll find the show here: http://www.sqldownunder.com/Podcasts
When I installed CU4 for SQL Server 2014, I started to receive an error in SSMS (SQL Server Management Studio) every time I connected to a database server:
It was interesting that it was a type that wasn’t being found in Microsoft.SqlServer.SqlEnum. I presumed it was a problem with a type being missing in that DLL and that I must have had an older one.
Turns out that the problem was with the Microsoft.SqlServer.Smo.dll.
The product team told me that “bad” DLL versions were pushed out by the first released version of SSDT-BI for VS2013. All was working fine though until I applied CU4; then the errors started.
While the correct file version was 12.0.2430.0, and that was the one I had in the folder, the issue actually related to Microsoft.SqlServer.Smo.dll, not to the SqlEnum DLL. For some reason, the installer hadn’t correctly replaced the previous entry in the GAC; it was still a 12.0.2299.1 version.
What I ended up having to do was to use ProcessExplorer to locate the dependencies of the ssms.exe process, then find the version of Microsoft.SqlServer.Smo.dll that was being loaded. I renamed it to stop it being loaded and rebooted. Then I found I still had the error and there was another location that it loaded it from. Again I renamed it and rebooted. Finally the error said that it couldn’t find the file at all.
At this point, I did a Repair of the Shared Features on “SQL Server (x64)” from “Programs and Features”, then deinstalled CU4, rebooted, then reinstalled CU4.
And now it seems to work ok. (Although it did nuke the previous entries from the connection dialog in SSMS)
Hope that helps someone.
It’s great to see the Connect site leading to fixes in the product.
I was really pleased when SQL Server Data Tools for BI appeared for Visual Studio 2013. What I wasn’t pleased about were a number of UI issues that came with that version.
In particular, there was a problem with previewing Reporting Services reports. If I create a new report project, add a blank report, and drag on a text box:
Note that when I clicked the Preview button, the following appeared:
It appears that the preview is provided by a program called PreviewProcessingService.exe that I presume was meant to be launched on demand in the background. If you closed its window, an error appeared in your preview. If you minimized it, you could happily ignore it from that point on.
I reported it in the Connect site, and am happy to see today that a new KB article appeared with a fix for it.
What the KB 2986460 article provides is a link to a new downloadable version of SSDT-BI for VS2013:
When the article first appeared, I downloaded the version immediately. It did not fix the problem. Unfortunately, the KB article appeared one week before the download was updated. If you downloaded it before and it did not fix the problem, unfortunately you will need to download it again. Here are the file properties of the version with the fix:
Be forewarned that the fix is a complete replacement that is 1GB in size:
It would be great if they can get to the point of patching these programs without the need for complete downloads but I’m very pleased to see it appear regardless.