Certain things just don’t go well together! No, not talking about liberals and Republicans. Rather, I’m talking about user defined scalar functions and SQL traces.
If you run SQL Trace or SQL Profiler often, you’d find user defined scalar functions to be rather annoying in that 90% of your trace file can be easily taken up by functional calls if you trace for SP:Starting and/or SP:Completed. And it can be much worse if you trace for SP:StmtStarting and/or SP:StmtCompleted.
Burying useful trace data in an avalanche of functional calls is but a minor issue compared to the adverse performance impact you may experience when combining a SQL trace and user defined scalar functions, perhaps inadvertently.
First, an anecdote. Some time ago, I was at a site where a stored procedure consistently performed much worse on the production server than it did on the QA server with the same database and with the production server supposedly being much better equipped than the QA server. The proc executed with the same query plan, and a trace confirmed that it was going through the same steps in both prod and QA. The proc ran for less than one second in QA, but took five seconds to complete in prod.
Just when we started to suspect whether the production server was really better than the QA server, it was found that there was a trace running on the production server. Turning off the trace on the production led to immediate improvement of the stored procedure on the production server. In fact, it now performed faster in prod than it did in QA. Furthermore, if we took out the scalar function calls in the stored procedure, it would run faster in prod than in QA regardless of whether that trace was running or not.
Now, let’s look at some controlled test results on the performance impact of a SQL trace on a stored procedure that calls a user defined scalar function.
The test setup
The scalar function is a dummy function in that it doesn’t really do anything. It just returns the input parameter:
create function dbo.fn_Dummy (@c varchar(20))
declare @a varchar(20);
select @a = @c;
The test table is named customer and its definition is as follows:
CREATE TABLE customer
The customer table is populated with 50,000 rows as follows:
declare @i int
set @i = 1
while @i <= 50000
insert customer(c_id, c_first, c_middle, c_last, c_data)
select @i, CAST(@i as CHAR(20)),
CAST(@i as CHAR(20)),
CAST(@i as CHAR(20)),
CAST(@i as CHAR(500))
set @i = @i + 1
create clustered index ci_customer on customer(c_id)
Two stored procedures are used in the tests: p_test and p_testUDF. They are identical except that p_test does NOT call the scalar function.
create proc p_test
set nocount on -- does not call fn_Dummy()
from customer c
where c_id <= 50000;
create proc p_testUDF
set nocount on -- does call fn_Dummy()
from customer c
where c_id <= 50000;
The key test parameters include:
· With scalar function calls vs. without scalar function calls. This is controlled by calling either p_test or calling p_testUDF. The former makes no scalar function calls.
· SQL trace configurations (i.e. level of tracing details). Four different SQL traces are configured with different levels of details being traced:
o No Trace. This is the test baseline when no trace is configured.
o Default Trace. This comes off the default trace template defined by SQL Server 2005. It basically traces for RPC:Completed and SQL:BatchCompleted. This is the least detailed trace of the tested trace configurations.
o SP Trace. This trace configuration includes the events covered by the Default Trace plus SP:Completed.
o Stmt Trace. This trace configuration include the events covered by the SP trace plus SP:StmtCompleted. This is the most detailed trace of the tested trace configurations.
So these four trace configurations are progressively more detailed.
The test results
All the test results are recorded only after multiple runs of the same tests. And since the customer table is about 2GB in size and the SQL Server buffer pool has 6GB allocated, all the test results are taken when the table is fully cached in the buffer pool.
The key metric is the response time of the test proc: either p_test or p_testUDF as defined above.
As always, it’s better to present the test results in a chart format.
A number of things stand out rather strikingly from these two charts.
First, when a proc calls a scalar UDF, the level of the details you trace can have a huge negative impact on the proc performance. In particular, if you do statement level tracing (e.g. tracing for SP:StmtCompleted or SP:StmtStarting), you could see your proc performing several times worse. In my tests, the proc takes 10 times longer to complete when the statement level trace is enabled.
Secondly, the exact degree of the impact depends, not surprisingly, on the number of calls to the scalar function, which is in turn dependent on the number of rows and the number of columns the scalar function is applied to. In my tests, the number of rows is set to 50,000. This may sound like a lot of rows to apply a scalar function to. And it is actually a lot of rows in light of the impact we are seeing. However and unfortunately, this is not uncommon in the real world applications.
Thirdly, and from the second chart, it is clear that the proc that does not call the scalar function (i.e. p_test) is insensitive to the trace configuration. This does not mean that any proc that does not call a scalar function is completely unaffected by how you configure a trace. Obviously, for a complex proc, a statement level trace would have to step into its inner steps and may have an impact. What the second chart does show is that it is a solid baseline in that the impact seen in the first chart is caused only by the change in the trace configuration.
Another point to highlight is that when you trace at the SQL batch or RPC level (e.g. RPC:Completed), the impact is rather small even with scalar UDF calls. In my tests, when the default trace is enabled, the elapsed time of p_testUDF increases from 576ms to 750ms consistently. That is a ~30% increase. But it is arguably negligible when you compare that with an increase from 576ms to 6436ms when the statement level trace is applied.
The key takeaway here is that you must be extra careful when you trace at the stored procedure level (e.g. tracing for SP:Starting), and you must be extraordinarily careful when you trace at the statement level.