To many, this is an old and tired topic, and any more mention of ad hoc queries versus parameterized queries may just send someone off the deep end.
But recently I was doing some Oracle benchmarks, and the benchmark tool reported ~1,200 transactions per second. A transaction in this case was a simple SELECT with a join on the indexed columns between two relatively small tables (a few million rows each).
Was that a good transaction throughput? Well, I could spend time looking into the system to identify the bottleneck and to see whether a higher transaction throughput could be achieved with further tuning. Without a reference, it’s hard to tell whether this was good throughput or not.
So to have a point of reference, I pointed the same benchmark configuration to a SQL Server box I happened to have. This was a significantly less powerful machine than the Oracle test box. I was very surprised that the SQL Server box produced ~3,000 transactions per second.
Now, I know SQL Server is good, but I didn’t know it’s that good. I mean, SQL Server on an old four-processor box beating Oracle on a new 16-core machine? Something wasn’t right, and I knew better not to gloat prematurely.
It turned out that the benchmark tool sent the same SELECT statement to the Oracle instance, but every time with different literal values in the where clause. Since by default Oracle only reuses a plan if the SQL text matches exactly, the Oracle instance under test ended up spending an inordinate amount of time doing hard parses and coming up with execution plans. The same benchmark tool, however, used a stored procedure wrapper when talking to SQL Server.
On the Oracle database, by simply setting its cursor_sharing parameter to forced, the transaction throughput went up from ~1,200 per second to ~16,000 per second, an improvement of more than 10 fold. This got me curious since SQL Server 2005 has a similar database option called forced parameterization. How would the same query against the same data behave on a SQL Server 2005 instance? Would the parameterization forced option give me 10 times the ad hoc throughput?
Since I could not change this 3rd-party benchmark tool to execute ad hoc queries against SQL Server, I wrote a little C# program to run the queries myself. With 50 threads going against the SQL Server 2005 instance concurrently, the throughput with the forced parameterization set to on was about six times that when the forced parameterization was off – quite a significant improvement.
I’m not suggesting you would see similar improvement. No doubt the extent of the impact that query parameterization may have in a particular case depends on many factors (e.g. the amount of parse/compile time relative to actually processing the data, CPU consumption level, and so on).
What kind of significant improvement have you seen with query parameterization?