Often you see an implicit type conversion or a type mismatch causing a performance issue when it trips up the SQL Server query optimizer. A while back though, I ran into a case where a harmless looking type mismatch caused significant performance degradation when there was no bad plan involved.
So we had a rather straightforward transactional replication setup for a table that included about 30 varchar columns. And when there was a burst of inserts into the table on the publisher, the distribution agent could not keep up with the traffic and a significant latency ensued. Upon investigation, we found that the inserts were distributed to the subscriber via a stored procedure, which was not unusual in itself because that is the default setting when you configure transactional replication using the SQL Server Management Studio (SSMS) GUI interface. The stored procedure could not be simpler:
CREATE PROC sp_MSins_MyTable
@c1 nvarchar(40),
@c2 nvarchar(30),
...
@c30 nvarchar(30)
AS
INSERT MyTable(c1, c2, ..., c30)
SELECT @c1, @c, ..., @c30
This procedure was generated by SSMS when the published article was configured to call a procedure instead of using the INSERT statement for distributing inserts.
What’s odd was that even though the table columns were varchar types, the generated stored procedure above defined the input parameters as nvarchar, thus implicit type conversions must be performed during the proc execution. Not sure about why, but that is how SSMS generates these replication wrapper procs.
Since the type conversion was not involved in any where clause or join condition, there was no issue with a terribly inefficient query plan.
Still, the insert latency at the subscriber was huge and there was no other resource issue that might have helped explaining the poor insert throughput on the subscriber.
Thinking that all these type conversion efforts in each insert might have contributed to the slowdown, we replaced all the nvarchar’s with varchar’s in stored procedure. Voila! The insert performance increased dramatically and the distribution latency disappeared quickly.
All was happy and we moved on.
Yesterday, however, we ran into the same problem because upon recreating the replication over the weekend, we had inadvertently replaced our custom modified stored procedure with an SSMS-generated default proc as shown above, and it was using nvachar’s instead of varchar’s. Upon hearing the distribution latency, we promptly put in the proc with the varchar types and the problem went away almost immediately.
That got me wondering about the exact performance impact of these nvarchar/varchar type conversions. So I ran some tests outside the replication setup with two stored procedures, one with nvarchar for all the input parameters and the other with varchar for all the input parameters (note that the table columns are all varchar’s).
CREATE PROC p_insert_nvarchar
@c1 nvarchar(40),
@c2 nvarchar(30),
...
@c30 nvarchar(30)
AS
INSERT MyTable(c1, c2, ..., c30)
SELECT @c1, @c, ..., @c30
CREATE PROC p_insert_varchar
@c1 varchar(40),
@c2 varchar(30),
...
@c30 varchar(30)
AS
INSERT MyTable(c1, c2, ..., c30)
SELECT @c1, @c, ..., @c30
I put them through a loop to insert 100,000 rows into a truncated MyTable. To my surprise, I did not see much of a difference between the two procs.
At this point, I’m at a loss as to why changing from nvarchar to varchar had such a huge performance impact for the distribution agent, but had almost no impact in my controlled tests.
Thus the search for the explanation continues, and I’ll certainly report back when it’s found. By the way, if you know the reason, please post a comment here.
However, it is clear from this experience that a simple nvarchar/varchar type mismatch can have a huge performance impact even when it does not involve any foul query plan. And if you are experiencing latency on the subscriber side (not on the distributor), this is one more thing you may want to check.