|
|
|
|
Ramblings of Greg Low (SQL Server MVP, MCM and Microsoft RD) - SQL Down Under
-
One aspect of SQL Server related development that I see done quite badly when visiting client sites is source code control. So many sites have almost nothing in place for this important aspect of development. By comparison, it's rare now that I visit an application development organisation that doesn't have at least reasonable source code control in place. Many DBAs seem to have grown up in environments where source code control systems weren't in use. I've been pleased to see the Red Gate folk continuing to work in this direction and to expand their toolsets. A while back, they released Source Control for SQL Server: http://www.red-gate.com/products/sql-development/sql-source-control/ but I was also interested to note their upcoming release of Source Control for Oracle: www.red-gate.com/source-control-for-oracle While I focus predominantly on SQL Server, many sites I work with use more than one type of database engine, and the other engine is often Oracle. So it's useful to be able to use a single system for both. These tools support TFS (which I tend to use mostly now) and SVN which I often see at client sites, usually in conjunction with a Tortoise front end. If you aren't using source code control in your own site, you should start looking at these types of tools.
|
-
-
Over the last few months, I've been a tech reviewer on John Sirmon and Heidi Steen's new whitepaper. It's published today on MSDN: Hardware Sizing a Tabular Solution (SQL Server Analysis Services) Its
goal is to help you determine the amount of memory and CPU resources needed to
accommodate query and processing workloads in a production environment. Recommended!
|
-
Happy new year to all my Chinese readers! Particular thanks to those that have been helping me to learn Mandarin and patiently answering my elementary questions.
恭贺新禧. 新年快乐.
感谢你的帮助,去年
|
-
There are some terms in IT that make their way straight into the hype stratosphere. Unfortunately "big data" is one of these. I see very few systems that I'd contend are actually "big data". However, I endlessly see the term applied to data stores that are trivial by today's standards. This might help the marketing teams but it's sad none-the-less. There are some technological challenges that really do start to bite as the data volume really does start to become large, and as the proportion of unstructured and semi-structured data increases. There are also some very interesting new tools that allow us to process larger volumes of data faster, particularly in relation to analytics, and a large market building around Hadoop and its derivatives. I also see entire teams that claim to focus on big data, yet whenever I discuss the projects with them, none of them are working with databases that are even vaguely in the ballpark of what anyone would have considered big data ten years ago, let alone today. None of the people involved have ever dealt with even really large tables by today's standards. It's interesting that so many data-related jobs have suddenly become "big data" jobs. I'd love to know what these teams think they mean, when they say that they "focus" in these areas. It simply isn't possible for so many of them to do so. I had a chuckle when I read this blog post from Warwick Leitch: Call it big data if you like... but you will look foolish. In that example, Warwick was referring to survey data that was held in a spreadsheet... For a more serious take on this subject though, there is some interesting material in Stephen Few's recent blog post: Big Data, Big Deal. Stephen argues that big data is simply a marketing campaign. As always, the comments associated with the blog post make for reading that's as interesting as the post itself. I don't totally agree with Stephen on this, as there really has been quite a shift in the available tooling in recent years, but much of his discussion is right on target. Ironically yesterday I was working with a team that has a project that I would qualify as "big data", yet they had never thought to call it that. I suspect we as an industry need to start to quantify what the term "big data" really means, at a given point in time. It's clearly a relative term that changes over time. Otherwise, we should lose the term entirely or further define it, as there is currently a great deal of confusion around it. The whole discussion reminded me of this wonderful xkcd cartoon that compared production levels in the energy industry: http://xkcd.com/1162/ One of the more amusing calls I had last year was with a US based fast food chain. They told me that they were ok using SQL Server for their analytic work but they'd decided they needed to use Oracle for the main database engine, based on the volume of data that they needed to handle efficiently. The Oracle sales guy had done a good job. I was intrigued about what volumes of data they thought would justify this. Later, it became apparent that it was about 30GB... Without triggering a "that's not a knife, that's a knife" moment, I'd love to hear what others currently consider "big data". I don't consider "using Hadoop (or HDInsight) as a synonym for "working with big data".
|
-
SQL Server 2008 R2 SP2 and SQL Server 2012 SP1 introduced sys.dm_db_stats_properties. This DMV is a new way to check the number of rows that have been modified since the last time a set of statistics was updated. It is described here: (http://msdn.microsoft.com/en-us/library/jj553546.aspx) In the latest version of Ola's tools, he's made use of this new DMV for customers that are on one of these SQL Server versions. For customers on earlier versions, he continues to use sys.sysindexes in the same way he did previously. Another notable change is that the new version works better when combined with log shipping on the same databases (avoids options that would break log shipping chains). Ola's tools have been gaining a well-deserved reputation, particularly given the price (ie free). You can read more about the most recent version of the solution at http://ola.hallengren.com/versions.html or download it at http://ola.hallengren.com/scripts/MaintenanceSolution.sql
|
-
Sysprep has been a useful tool to allow the preconfiguration of SQL Server environments, along with the rest of the operating system. This can be very useful for templated Azure VMs as well as for on-premises systems and private clouds. One of the limitations has been that not all SQL Server components were "sysprep-able". Another important change in CU2 for SQL Server 2012 SP1 is that more SQL Server components are now able to be "sysprepped". In particular, SQL Server Analysis Services (SSAS) and SQL Server Integration Services (SSIS) can now be part of this. This is good news and you'll find more info in the VM Engineering team's blog post by Robert Hutchison here: http://blogs.technet.com/b/scvmm/archive/2013/01/25/expanded-sysprep-support-in-sql-server-2012-sp1-cu2.aspx
|
-
One of the things I always tell people that I love about consulting/mentoring work is that you see things that you just can't make up. They provide interesting material for training classes. This week I came across something that I wasn't expecting. I was migrating data from DB2 and in my scripts, I had made the presumption that a table would have at least one column. Turns out that in DB2 you can have a table with no columns. I can only imagine that comes from creating a table then dropping all the columns. I wondered if SQL Server would do the same, so it was time to find out.
USE tempdb; GO
CREATE TABLE dbo.TableWithNoColumns ( FirstColumn int, SecondColumn int ); GO
ALTER TABLE dbo.TableWithNoColumns DROP COLUMN SecondColumn; GO
ALTER TABLE dbo.TableWithNoColumns DROP COLUMN FirstColumn; GO
It doesn't allow this. It won't let you delete the last column. The following message is returned. Msg 4923, Level 16, State 1, Line 2
ALTER TABLE DROP COLUMN failed because 'FirstColumn' is the only data column in table 'TableWithNoColumns'. A table must have at least one data column. Whenever I get asked why I like SQL Server so much, these are the sorts of things that don't immediately spring to mind but they are the sorts of reasons why I love working with SQL Server and get frustrated when working in other environments.
|
-
SQL Down Under show 54 is now available for download. In this show, I spoke with Grant Fritchey on his experiences with query tuning and reading SQL Server execution plans. We also discuss SQL Kilts. You'll find the show at our podcasts page: http://www.sqldownunder.com/Resources/Podcast.aspx Enjoy!
|
-
There has been a lot of discussion over the years about whether or not it's a good idea to introduce new functionality in service packs (for SQL Server, the OS, and others). The concern is that it can invalidate testing, invalidate documentation, books, courseware, etc. However, it can also introduce much-needed features without having to wait for another version of the product. It's been even-less likely that new functionality would be introduced in a cumulative update (or CU), which are basically a set of rolled-up hotfixes. However, interesting new functionality has appeared in CU2 for SQL Server 2012 SP1. The BACKUP command now supports TO URL when specifying a backup device. That allows you to specify the location of an Azure storage account as the backup destination. This avoids the need to create a backup locally, and to then have a process to copy it to Azure storage. Similarly, RESTORE now also supports a FROM URL option. This is very interesting news for a number of reasons. For customers using SQL Server within Azure Virtual Machines (VMs), it provides a way to specify the use of a storage account as the backup destination. While you could do that before by mapping a drive letter to a storage account (and that's what you do when setting up a VM in the first place), the number of drive letters available is limited and this avoids using them up. For customers using SQL Server on-premises, this provides a direct cloud-based backup option with low-cost, large storage available. The biggest limitation for these customers will no doubt be network bandwidth but for some customers, that isn't an issue. It's also an interesting new DR option for customers that have relatively smaller databases that don't take too long to send over the network but which are none-the-less critical to the organisations. Another group of customers that would benefit from this would be any customer that needs to distribute a single copy of a database to multiple locations. A backup can be done once, and each secondary location can restore directly from the cloud. I can imagine this being an interesting way to push out new versions of reference data, etc. A notable group of customers that would benefit from this would be those using Windows Azure SQL Databases. However, the syntax is not yet supported in WASD. I have no advanced knowledge on this but I'm guessing that adding this looks pretty likely, as it would be so beneficial. When you connect to a storage account, you need to provide both the URL and a storage key. There are two types of keys. One type of key allows for unlimited access. The other type of key can be time-bombed, handed out for short-term purposes and can easily be revoked. Both these types of keys should work with this new option. The BACKUP and RESTORE commands both support a WITH CREDENTIAL option for providing the link to this key. Credentials were introduced in SQL Server 2005. They are basically a way of giving a name to a set of authentication details (name and secret/password). Previously we have used them with SQL Server Agent Proxy accounts, to allow us to fine tune the permissions that are granted to steps in SQL Server Agent jobs. Now they have an interesting new use. You'll find details of the changes, with samples of the syntax in this article: http://blogs.msdn.com/b/windowsazure/archive/2013/01/24/sql-server-backup-and-restore-to-cloud-simplified.aspx
|
-
I promised there would be a bunch of shows this year. The second one is now published. In this show, SQL Server MVP Grant Paisley describes PowerPivot, Power View, Analysis Services Tabular vs Multidimensional and shares the lessons he's learned when working with these tools. The show is here: http://www.sqldownunder.com/Resources/Podcast.aspx Enjoy!
|
-
One of the issues that is often raised with Windows Azure SQL Database is that you don't get to pick the name of your server, so you end up with a bizarre name such as: yy2l95dk1k.database.windows.net. I can understand why the team did this. Apparently when they first set it up, they allowed you to pick your own server name, so everyone started registering Coke, Pepsi, etc. Not wanting to have yet another place for people to argue about name ownership, they quickly removed that ability. I'm glad they did. However, when working with a databases, I've been finding that I'm constantly looking at lists of Azure servers and having no idea which one is which. When I open SQL Server Management Studio (SSMS), and ask to connect, I'm greeted with a list of servers that looks like: - yy2l95dk1k.database.windows.net
- ky4296dk1k.database.windows.net
- xy9914dk2j.database.windows.net
and so on. Now I'm sure there are people that can remember which one is which, but as the number of servers increases (particularly when I'm dealing with client servers as well as my own), I'm not one of those people that can. Normally when I'm working with a bunch of servers and I only have IP addresses, I configure SQL Server Client Aliases for each address. However, when I first tried to configure an alias like AzureDemo for a server called yy2l95dk1k.database.windows.net, I found I couldn't connect to it. I received an error that said "Server name cannot be determined. It must appear as the first segment of the server's dns name (servername.database.windows.net). That led me to believe that using an alias wouldn't work. However, in later versions of this error message, more information is provided. "Some libraries do not end the server name, in which case the server name must be included as part of the user name (username@servername). In addition, if both formats are used, the server names must match. (Microsoft SQL Server, Error: 40531). Once again, when I first saw this, I presumed that it would still stop me from using an alias but that is not the case. To use a meaningful alias, what you need to do is: - Using SQL Server Configuration Manager, in the SQL Native Client 11.0 Configuration\Aliases node, create an alias. Pick a meaningful name (ie; HRServer) for the Alias Name, leave the port at the default (ie: 1433), leave the protocol at the default (ie: TCP/IP) and provide your real server name (ie yy2195dk1k.database.windows.net) in the Server field.
- Create the same alias in the SQL Native Client 11.0 Configuration (32bit)\Aliases node.
- When connecting to the server in SSMS, enter your new alias (ie: HRServer) for the Server name, SQL Server Authentication for the Authentication method, and for the login, specify your login name followed by an @ symbol, followed by the first segment of the real server name (ie: mylogin@yy2195dk1k), then enter your password.
- If the user isn't an admin user that has the ability to connect to the master database, you'll also need to manually set the database name using the Options button. (Note that you won't be able to browse for database names).
Once you have done that with all your servers, finding and connecting to the right server should be easy. In a separate post, I'll describe how to push out aliases to other users in your domain.
|
-
When I've been putting data into Windows Azure SQL Database (WASD) in the past, I'd normally been providing the dates from my own system. This week, I had the first time where I wanted to put a column default that provided a date in WASD. It suddenly dawned on me that I wasn't sure what timezone the date would be from. As I was using the Southeast Asia data centre (in Singapore), I was presuming the value would be based on Singapore's timezone. So it was time to find out. Prior to SQL Server 2008, GETDATE() was the normal way that we'd retrieve the current time from the server. GETDATE() returns a datetime data type. SQL Server 2008 introduced the datetime2 data type (a higher-precision data type with a poorly-chosen name). SQL Server then provided SYSDATETIME() as a replacement for GETDATE(). SYSDATETIME() returns the datetime2 data type. To make it easier to work with UTC-based values, SQL Server also provides SYSUTCDATETIME(). If I connect just now to my database, and execute the query:
SELECT GETDATE() AS GetDateValue, SYSDATETIME() AS SysDateTimeValue, SYSUTCDATETIME() AS SysUTCDateTimeValue;
The values returned were:
GetDateValue SysDateTimeValue SysUTCDateTimeValue ----------------------- ---------------------------
--------------------------- 2013-01-19 22:23:21.830 2013-01-19 22:23:21.8400294
2013-01-19 22:23:21.8400294
(1 row(s) affected)
I was pleasantly surprised to see that the values were UTC based instead. That's really useful as it means that no matter which Azure data centre you connect to, they all have the same concept of "current time". What also surprised me is that while the SYSDATETIME() and SYSUTCDATETIME() values were identical, the GETDATE() value wasn't just a rounded version of the same time. It was an earlier time so it must be resolved separately in the query. That's not an Azure-specific issue though. If I execute the same query against my laptop system, the following output is produced:
GetDateValue SysDateTimeValue SysUTCDateTimeValue ----------------------- ---------------------------
--------------------------- 2013-01-20 09:27:41.503 2013-01-20 09:27:41.5051908
2013-01-19 22:27:41.5051908
(1 row(s) affected)
Note that my system is operating in +11 timezone so my SYSDATETIME() value is 11 hours ahead of my SYSUTCDATETIME() value. But again notice that it's exactly the same value when allowing for those 11 hours. However, the GETDATE() value is different again. So there are two messages from this: - Azure SQL Databases are always UTC timezone based (which is good news)
- Don't depend upon GETDATE() and SYSDATETIME() returning exactly the same time in a single query, after allowing for rounding.
|
-
Hi folks, I'm back into the swing with a bunch of new podcasts coming, featuring SQL Server 2012. First up is Clustering MVP Allan Hirt discussing SQL Server HA with AlwaysOn. Allan also discusses the deployment of SQL Server on Windows Server Core. Thanks for all the emails that said they loved the shows and told me to get them happening again soon. I think you'll like the ones that are lined up. You'll find the shows here: http://www.sqldownunder.com/Resources/Podcast.aspx
|
-
|
|
|
|
|