THE SQL Server Blog Spot on the Web


Linchi Shea

Checking out SQL Server via empirical data points

Performance impact: implicit type conversions

Often, an implicit type conversion or a type mismatch causes a performance issue by tripping up the SQL Server query optimizer. A while back, though, I ran into a case where a harmless-looking type mismatch caused significant performance degradation even though no bad plan was involved.

 

So we had a rather straightforward transactional replication setup for a table that included about 30 varchar columns. When there was a burst of inserts into the table on the publisher, the distribution agent could not keep up with the traffic and significant latency ensued. Upon investigation, we found that the inserts were distributed to the subscriber via a stored procedure, which was not unusual in itself because that is the default setting when you configure transactional replication using the SQL Server Management Studio (SSMS) GUI. The stored procedure could not be simpler:
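For reference, this delivery method is controlled by the article's command properties. A sketch of how an article might be configured to deliver inserts via a CALL to the generated proc (the publication and article names here are illustrative, not from our actual setup):

```sql
-- Hypothetical article setup: @ins_cmd = 'CALL ...' tells the
-- distribution agent to deliver each insert by calling the
-- SSMS-generated wrapper proc rather than issuing an INSERT statement.
EXEC sp_addarticle
    @publication   = N'MyPublication',
    @article       = N'MyTable',
    @source_object = N'MyTable',
    @ins_cmd       = N'CALL sp_MSins_MyTable',
    @upd_cmd       = N'SCALL sp_MSupd_MyTable',
    @del_cmd       = N'CALL sp_MSdel_MyTable';
```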

 

CREATE PROC sp_MSins_MyTable
  @c1 nvarchar(40),
  @c2 nvarchar(30),
  ...
  @c30 nvarchar(30)
AS
INSERT MyTable(c1, c2, ..., c30)
SELECT @c1, @c2, ..., @c30

 

This procedure was generated by SSMS when the published article was configured to call a procedure instead of using the INSERT statement for distributing inserts.

 

What’s odd is that even though the table columns were varchar types, the generated stored procedure above defined the input parameters as nvarchar, so implicit type conversions must be performed during each execution of the proc. I'm not sure why, but that is how SSMS generates these replication wrapper procs.
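In other words, with varchar columns and nvarchar parameters, each execution is effectively doing a per-column conversion, roughly equivalent to writing the conversions out explicitly (a sketch; the CONVERT target lengths are assumed to match the column definitions):

```sql
-- Roughly what the proc must do implicitly on every execution:
-- each Unicode parameter is converted back to the column's varchar type.
INSERT MyTable(c1, c2, ..., c30)
SELECT CONVERT(varchar(40), @c1),
       CONVERT(varchar(30), @c2),
       ...,
       CONVERT(varchar(30), @c30);
```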

 

Since the type conversion was not involved in any WHERE clause or join condition, there was no terribly inefficient query plan to blame.

 

Still, the insert latency at the subscriber was huge, and there was no other resource issue that might have helped explain the poor insert throughput on the subscriber.

 

Thinking that all these type conversion efforts in each insert might have contributed to the slowdown, we replaced all the nvarchar’s with varchar’s in the stored procedure. Voila! The insert performance increased dramatically and the distribution latency disappeared quickly.

 

All was well and we moved on.

 

Yesterday, however, we ran into the same problem: upon recreating the replication over the weekend, we had inadvertently replaced our custom-modified stored procedure with the SSMS-generated default proc shown above, which was using nvarchar’s instead of varchar’s. Upon seeing the distribution latency, we promptly put back the proc with the varchar types and the problem went away almost immediately.

 

That got me wondering about the exact performance impact of these nvarchar/varchar type conversions. So I ran some tests outside the replication setup with two stored procedures, one with nvarchar for all the input parameters and the other with varchar for all the input parameters (note that the table columns are all varchar’s).

 

CREATE PROC p_insert_nvarchar
  @c1 nvarchar(40),
  @c2 nvarchar(30),
  ...
  @c30 nvarchar(30)
AS
INSERT MyTable(c1, c2, ..., c30)
SELECT @c1, @c2, ..., @c30

 

CREATE PROC p_insert_varchar
  @c1 varchar(40),
  @c2 varchar(30),
  ...
  @c30 varchar(30)
AS
INSERT MyTable(c1, c2, .., c30)
SELECT @c1, @c2, ..., @c30

 

I put them through a loop to insert 100,000 rows into a truncated MyTable. To my surprise, I did not see much of a difference between the two procs.
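The test driver was essentially a loop along these lines (a sketch; the parameter values are placeholders, and the ... stands for the remaining parameters, as in the proc definitions above):

```sql
-- Minimal test-harness sketch: truncate MyTable, then call one of the
-- two procs 100,000 times and compare the elapsed time per proc.
TRUNCATE TABLE MyTable;

DECLARE @i int = 1;
WHILE @i <= 100000
BEGIN
    EXEC p_insert_varchar      -- or p_insert_nvarchar for the other run
         @c1  = 'some 40-char value',
         @c2  = 'some 30-char value',
         ...,
         @c30 = 'some 30-char value';
    SET @i += 1;
END;
```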

 

At this point, I’m at a loss as to why changing from nvarchar to varchar had such a huge performance impact for the distribution agent, but had almost no impact in my controlled tests.

 

Thus the search for the explanation continues, and I’ll certainly report back when it’s found. By the way, if you know the reason, please post a comment here.

 

However, it is clear from this experience that a simple nvarchar/varchar type mismatch can have a huge performance impact even when it does not involve any foul query plan. And if you are experiencing latency on the subscriber side (not on the distributor), this is one more thing you may want to check.
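If you want to quantify the subscriber-side latency while you investigate, one option (assuming SQL Server 2005 or later; the publication name here is illustrative) is a tracer token:

```sql
-- Run at the publisher, in the publication database.
-- A tracer token is written into the log stream and its progress to the
-- distributor and subscriber is recorded, giving per-hop latency numbers.
EXEC sys.sp_posttracertoken @publication = N'MyPublication';

-- List the tokens posted for this publication (returns tracer_id values):
EXEC sys.sp_helptracertokens @publication = N'MyPublication';

-- Then look up the latency history for a given token:
-- EXEC sys.sp_helptracertokenhistory @publication = N'MyPublication',
--                                    @tracer_id   = <tracer_id>;
```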

Published Wednesday, July 20, 2011 1:26 AM by Linchi Shea


Comments

 

Uri Dimant said:

>>>as shown above, and it was using nvachar’s instead of nvarchar’s.

Linchi, I think you just need to replace one nvarchar word with varchar..

July 20, 2011 12:56 AM
 

 

Linchi Shea said:

Thanks Uri! I have corrected the typo.

July 20, 2011 1:26 PM
 

Scott Duncan said:

I had a similar problem last year, where a table was using varchar columns but a query against one of those columns was being sent to the engine via sp_executesql using parameters of nvarchar type (4 of them).

The table had 330,000 rows. To test, the query parameters were changed to match the queried column's data type; the result was dramatic:

BEFORE:

Table 'Request'. Scan count 0, logical reads 941708, physical reads 221, read-ahead reads 68687

SQL Server Execution Times:

  CPU time = 5343 ms,  elapsed time = 136990 ms

AFTER:

Table 'Request'. Scan count 4, logical reads 981, physical reads 0, read-ahead reads 0

SQL Server Execution Times:

  CPU time = 62 ms,  elapsed time = 434 ms

The query couldn't be changed in production (it gets generated on the fly by the application), so the column data type was changed instead.

July 24, 2011 11:40 PM
 

Erik Squires said:

This is sadly a problem that MS SQL Server has had for many years, and Oracle never had.  The issue is that the conversion doesn't happen where you think it happens.  Let's take a very simple query, like this:

create proc FindByFullName
  @name varchar(50)
AS
SELECT * from Employees where FullName = @name

Let's further assume that Employees.FullName is nvarchar(50).  

You would think that SQL Server would convert @name to nvarchar, and then go looking for the employees.  Instead, what SQL does is it converts all of the column values in Employees.FullName to varchar and then compares those values to the parameter passed.  This behavior forces full table or full index scans to occur per execution.  If you don't believe me, check the explain plans.  

It's such absurd behavior most people can't believe it really behaves this way.  Usually the only evidence we can find of this happening is to examine the explain plan.  If you see a full index/table scan occuring when it should be doing index seeks, check your parameter types.

October 5, 2011 2:00 PM
