THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Adam Machanic

Adam Machanic, Boston-based independent database consultant, writer, and speaker, shares his experiences with programming, performance tuning, and optimizing SQL Server 2000, 2005, and 2008, in conjunction with related technologies such as .NET.

Is PATINDEX faster than LIKE?

Originally posted here.

 


I keep seeing the same suggestion on various "tips and tricks" websites: For situations in which you might want to use LIKE in the WHERE clause, but for which indexes cannot be used, PATINDEX will perform faster.

 

So, according to these sources, this:

 

SELECT *
FROM tbl
WHERE PATINDEX('%abc%', col) > 0

is faster than this:

 

SELECT *
FROM tbl
WHERE col LIKE '%abc%'

The thing is, I'm not one to just take this kind of stuff at face value, so I've tested this assertion several times. Every time I see this tip, I think, "I must be missing something," and I test again. And every single time I test again, with different data, different patterns, etc, I arrive at the same conclusion: They perform exactly the same.

Which brings us to today. In the SQL Server Central forums, William O'Malley told me that in his tests, on his data that he can't post due to his industry (?), PATINDEX does outperform LIKE.

So I decided to test yet again. And I'm still coming up with the exact same numbers. I'm hoping that some reader will be able to tell me this mysterious circumstance in which PATINDEX really does outperform LIKE. Or maybe explain why my test is totally incorrect.

Anyway, here's what I did today... First, I created a big table of test data (83 million rows) with the following (which you may notice that I lifted from a previous post):

 

SELECT DISTINCT A.Name + B.Name + C.Name AS SomewhatLargeString
INTO #BigTableOfStrings
FROM master..spt_values A,
master..spt_values B,
master..spt_values C
WHERE a.TYPE NOT IN ('P', 'R', 'F', 'F_U')
AND b.TYPE NOT IN ('P', 'R', 'F', 'F_U')


CREATE CLUSTERED INDEX CI_LargeString ON #BigTableOfStrings(SomewhatLargeString)

I decided to test against the pattern '%ossDbOwnChainRefere%', for which there are 1752 rows in the test table.

First, I ran the LIKE query:

 

SELECT COUNT(*)
FROM #BigTableOfStrings
WHERE SomewhatLargeString LIKE '%ossDbOwnChainRefere%'

Runtime: 9:55.

Then I tried PATINDEX:

SELECT COUNT(*)
FROM #BigTableOfStrings
WHERE PATINDEX('%ossDbOwnChainRefere%', SomewhatLargeString) > 0

Runtime: 9:56 (yes, worse)

Then I ran LIKE again with a runtime of 9:47, then PATINDEX again with a runtime of 9:50, and now I'm not patient enough to run either of them again.

So am I correct? Is this claim bogus? Or have I gone completely off-base?



Published Wednesday, July 12, 2006 10:13 PM by Adam Machanic
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Mark Jackson said:

The important bit you missed is the phrase ' but for which indexes cannot be used'.  Your test has a clustered index on the searched field and that will be used every time.  If you run the same test with a clustered primary key index (on an id field) and your search field set up as a text datatype there is a hugh difference. I have had up to 50% speed gains on a 1.3 million record table of life customer data using patindex, although I would rather shoot myself in the head then do either of these as using a fulltext index on the same data returns the same results in less than a second, compared to 20+ seconds with patindex, or 35+ seconds with like.

October 24, 2007 9:24 AM

Leave a Comment

(required) 
(optional)
(required) 
Submit

About Adam Machanic

Adam Machanic is a Boston-based independent database consultant, writer, and speaker. He has been involved in dozens of SQL Server implementations for both high-availability OLTP and large-scale data warehouse applications, and has optimized data access layer performance for several data-intensive applications. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "Expert SQL Server 2005 Development" (Apress, 2007) and "Inside SQL Server 2005: Query Tuning and Optimization" (Microsoft Press, 2007). Adam regularly speaks at user groups, community events, and conferences on a variety of SQL Server and .NET-related topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server and a Microsoft Certified IT Professional (MCITP).
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement