THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

The Rambling DBA: Jonathan Kehayias

The random ramblings and rantings of frazzled SQL Server DBA

Unexpected Side Effects? Problems from Implicit Conversions

A coworker of mine brought me a problem that he found and was trying to understand.  It took a few minutes, but the root of the problem being looked at was an implicit conversion.  I am not exactly sure of what the use case was, or what the purpose of the code provided to me was, and I can't ask while writing this since I am not at the office, but I am going to take a guess that they were trying to get the time part off of a datetime value by using string functions.  An example of the code provided is:

DECLARE @dt1 datetime
SET @dt1 = '2009-07-10 01:30:00.000'
SELECT @dt1
SELECT RIGHT(@dt1, LEN(@dt1)-11)

The results of this query is:

image

This is what created the confusion.  If you look strictly at the output of these two queries, the second result doesn't make sense.  I assume that the thought behind the second query was, if 2009-07-10 01:30:00.000 is the return for a datetime datatype and all you want is the time portion then the RIGHT() function can be used with the LEN() of the datetime minus the first 11 characters.  What wasn't accounted for here is that the RIGHT() and LEN() functions are string functions, not datetime functions.  So when the datetime is passed to either of these functions it is implicitly converted to a varchar by the database engine, essentially the same thing as if you did a CAST().

DECLARE @dt1 datetime
SET @dt1 = '2009-07-10 01:30:00.000'
SELECT @dt1
SELECT CAST(@dt1 AS VARCHAR
(23))

image

Now if you have been using SQL Server for a while, this probably is common sense and I'm amazed that you are still reading this post. However, for people new to SQL, this kind of problem isn't very obvious.  Once you make the same conversion the output problem makes perfect sense.  So how do you get around it?  When converting datetime to a string, rather than CAST() use CONVERT() and specify the style to convert the datetime to as a part of the conversion.  A full list of the types can be found in:

http://msdn.microsoft.com/en-us/library/ms187928.aspx

For this example, the 121 type matches the original output that was expected in the query.

DECLARE @dt1 datetime
SET @dt1 = '2009-07-10 01:30:00.000'
SELECT @dt1
SELECT RIGHT(CONVERT(VARCHAR
(23), @dt1, 121), LEN(CONVERT(VARCHAR(23), @dt1, 121))-11)

image

The list of implicit conversions possible can be found on the previously mentioned link.  However, you shouldn't rely on implicit conversions as a matter of practice in your code.  When dealing with data of varying types, it is best to perform explicit conversions to prevent problems such as the one demonstrated above from occurring. 

Implicit conversions can also happen in other situations as well.  Datatypes in SQL Server have a conversion hierarchy or precedence that is documented in the Books Online:

http://msdn.microsoft.com/en-us/library/ms190309.aspx

When implicit conversions occur while evaluating two values against each other, the value with the lowest data type in precedence is converted to the higher precedence.  The is generally most problematic when a datatype of higher precedence is used in the WHERE predicate of a query.  The end result is a table scan of the table since the optimization causes a CONVERT() to occur on the column side of the predicate.  For Example:

USE tempdb
GO
CREATE TABLE dbo.ContactCopy
(
  
ContactID INT NOT NULL IDENTITY (1, 1) PRIMARY KEY,
  
NameStyle bit NOT NULL,
  
Title VARCHAR(8) NULL,
  
FirstName VARCHAR(50) NOT NULL,
  
MiddleName VARCHAR(50) NULL,
  
LastName VARCHAR(50) NOT NULL,
  
Suffix VARCHAR(50) NULL,
  
EmailAddress VARCHAR(50) NULL,
  
EmailPromotion INT NOT NULL,
  
Phone VARCHAR(50) NULL
ON [PRIMARY]
GO
SET IDENTITY_INSERT dbo.ContactCopy ON
GO
IF EXISTS(SELECT * FROM AdventureWorks.Person.Contact)
INSERT INTO dbo.ContactCopy 
  
(ContactID, NameStyle, Title, FirstName, MiddleName, LastName
      
Suffix, EmailAddress, EmailPromotion, Phone)
SELECT ContactID, NameStyle
      
CONVERT(VARCHAR(8), Title), 
      
CONVERT(VARCHAR(50), FirstName), 
      
CONVERT(VARCHAR(50), MiddleName), 
      
CONVERT(VARCHAR(50), LastName), 
      
CONVERT(VARCHAR(50), Suffix), 
      
CONVERT(VARCHAR(50), EmailAddress), 
      
EmailPromotion
      
CONVERT(VARCHAR(50), Phone)
FROM AdventureWorks.Person.Contact
GO
SET IDENTITY_INSERT dbo.ContactCopy OFF
GO
CREATE INDEX IX_ContactCopy_LastName ON dbo.ContactCopy (LastName)
GO
SELECT ContactID
FROM dbo.ContactCopy
WHERE LastName = 'Adams'
GO
SELECT ContactID
FROM dbo.ContactCopy
WHERE LastName = N'Adams'

The unicode N'Adams' forces an implicit conversion of the LastName column which results in a more expensive Index Scan instead of a Index Seek:

image

The implicit conversion can be seen by hovering the mouse over the Index Scan operator:

image 

This is one of the biggest drawbacks that I have seen from a basic implementation of LINQ to SQL (I understand there are ways to work around this as commented on my blog post on LINQ to SQL.) since the LINQ query TSQL code is produced using unicode types.  Most programmers that would be using LINQ to SQL/Entity Framework would miss this kind of thing until it became a severe performance problem. 

Published Thursday, July 16, 2009 10:58 PM by Jonathan Kehayias
Filed under:

Comments

 

grandtree said:

I found that these two statements have similar performance.

select LastName from Users

where LastName='WSW'

go

select LastName from Users

where LastName=N'WSW'

but these two have very different performance.

select * from Users

where LastName='WSW'

go

select * from Users

where LastName=N'WSW'

July 17, 2009 5:18 AM
 

Joe Webb said:

Great post! I've been meaning to post a blog about implicit type conversions for quite some time now. I've seen this affect people who are careless with their type definitions - going from an index seek to a scan.

July 17, 2009 5:46 AM
 

Jonathan Kehayias said:

grandtree,

How exactly are you measuring performance?  Just because they run in about the same time, doesn't mean they perform the same.  That is the reason that a lot of code starts out just fine and then as table data grows becomes slower and slower.

Your first queries are only pulling one column, so an single index operation is going to occur, if there is an index on the LastName column.  Your second set of code would result in either a clustered index scan, or use the nonclustered index with a lookup back to the clustered index to get the columns not covered by the nonclustered index. The answer to the difference is in the execution plan.

July 17, 2009 11:28 PM
 

AaronBertrand said:

VARCHAR without a length specification is dangerous as well; I strongly recommend you get in the habit of specifying the length.  You might understand the exceptions to the rule when writing your own code, but for code that users might learn from, I think it is much better to enforce best practices.

For more information on this see Erland's Connect item:

http://connect.microsoft.com/SQL/feedback/ViewFeedback.aspx?FeedbackID=244395

July 18, 2009 11:05 AM
Anonymous comments are disabled

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement