THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

SELECT Hints, Tips, Tricks FROM Hugo Kornelis WHERE RDBMS = 'SQL Server'

Just stuff it!

I’m sure you’ve all heard it, and probably even said it, many times: “SQL Server sucks at string manipulation”. And with good reason – it is true. But not quite as true as many seem to believe.

 

I notice that many people who complain about SQL Server lacking string manipulation are themselves unaware of the string functions that SQL Server does have. Most know LIKE, LEFT, RIGHT, and SUBSTRING. Some also know CHARINDEX, maybe even REPLACE. But how many of you know and use PATINDEX, REVERSE, QUOTENAME, and LTRIM, to name just a few?

 

The string function that appears to be the most overlooked has to be STUFF. And yet, this function can prove to be an invaluable tool. I’ll give a short description first, then show some examples of how STUFF can be used for string manipulation that would otherwise be much harder to achieve.

 

What STUFF does, basically, is cut a specified portion from a string and replace it with a new string. The replacement string can be shorter, longer, or exactly as long as the part that has been cut out. The STUFF function takes four arguments, none of which is optional. The first is of character data type and specifies the input string; the second and third are integers specifying the starting position and length of the substring to remove, and the fourth is the replacement string. Here are some examples to illustrate the use of STUFF:

 

DECLARE @Test varchar(15)

SET @Test = 'Hugo Kornelis'

SELECT STUFF (@Test, 1, 4, 'I am'),   -- Basic usage example

       STUFF (@Test, 6, 0, 'SQL '),   -- Replaced string can be empty

       STUFF (@Test, 2, 10, '')       -- Replacement string can be empty

 

                                   

--------------- ------------------- ------

I am Kornelis   Hugo SQL Kornelis   His

 

 

The first real world example of using STUFF is based on a newsgroup question. The question was how to replace a substring of a string. Here’s the simplified and non-working example given by the poster (based on the pubs sample database):

 

UPDATE authors

SET    SUBSTRING(phone,5,3) = '888'

WHERE  SUBSTRING(phone,5,3) = '826';

 

One of the regulars in the group replied with this suggestion:

 

UPDATE authors

SET    phone = REPLACE(phone, '826', '888')

WHERE  SUBSTRING(phone,5,3) = '826';

 

That looks clean and tidy, and runs fine on the data in pubs – but what if one of the authors in pubs has phone number ‘801 826-0826’? Indeed, it would be changed to ‘801 888-0888’ instead of ‘801 888-0826’, as requested.

 

Another way to do this that I often see recommended (though not in this particular thread) is to use LEFT and SUBSTRING to cut the string in pieces, then mend them together after making the change. Like this:

 

UPDATE authors

SET    phone = LEFT(phone, 4) + '888' + SUBSTRING(phone, 8, LEN(phone) - 7)

WHERE  SUBSTRING(phone,5,3) = '826';

 

That works – but ugh!, how ugly. This would be a lot easier with STUFF:

 

UPDATE authors

SET    phone = STUFF(phone, 5, 3, '888')

WHERE  SUBSTRING(phone,5,3) = '826';

 

 

Here’s another example. A common question in the newsgroups is how to find individual parts in a string. For instance, a bulk import results in a table of names in the form “Lastname, Firstname”. How to get individual Lastname and Firstname values from this?

For the first part of the string (Lastname in this case), this is easy. Use POSINDEX to find the starting position of the separator, use that to calculate the argument for the LEFT function.

The second part is harder. Answers typically given in the groups to these questions use techniques such as finding the starting position, calculating remaining length and feeding those as parameters to the SUBSTRING (version 1 below); using REVERSE to change the task to another task of finding the first part in a string (version 2 below) or using RIGHT instead of LEFT, with again the use of REVERSE to calculate length (version 3). However, with STUFF (version 4), this becomes much easier! See version 4, below:

 

CREATE TABLE BadData (FullName varchar(20) NOT NULL);

INSERT INTO BadData (FullName)

SELECT 'Clinton, Bill' UNION ALL

SELECT 'Johnson, Lyndon' UNION ALL

SELECT 'Bush, George';

-- Version 1, using SUBSTRING

SELECT LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       SUBSTRING(FullName,

                 CHARINDEX(', ', FullName) + 2,

                 LEN(FullName) - CHARINDEX(', ', FullName) - 1) AS FirstName

FROM   BadData

-- Version 2, using REVERSE, LEFT, and REVERSE again

SELECT LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       REVERSE(LEFT(REVERSE(FullName),

              CHARINDEX(' ,', REVERSE(FullName)) - 1)) AS FirstName

FROM   BadData

-- Version 3, using RIGHT and REVERSE

SELECT LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       RIGHT(FullName, CHARINDEX(' ,', REVERSE(FullName)) - 1) AS FirstName

FROM   BadData

-- Version 4, using STUFF

SELECT LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       STUFF(FullName, 1, CHARINDEX(', ', FullName) + 1, '') AS FirstName

FROM   BadData

 

You might say that there’s not actually that much difference between versions 3 and 4 above, so why am I so enthusiastic about STUFF? To see that, let’s take this to the next level. What if the names in the imported data are of the form “Lastname, Firstname MiddleInitial”, with MiddleInitial being optional? Below, you’ll find the best I was able to do without STUFF, and than a version with STUFF – still not pretty, but not quite as awful as the first version, I’d say. If anyone is able to write a shorter version, I’d love to hear it!

 

CREATE TABLE BadData (FullName varchar(20) NOT NULL);

INSERT INTO BadData (FullName)

SELECT 'Clinton, Bill' UNION ALL

SELECT 'Johnson, Lyndon, B.' UNION ALL

SELECT 'Bush, George, H.W.';

-- Version 1, without STUFF

SELECT FullName,

       LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       SUBSTRING(FullName,

                 CHARINDEX(', ', FullName) + 2,

                 CHARINDEX(', ', FullName + ', ',

                           CHARINDEX(', ', FullName) + 2)

               - CHARINDEX(', ', FullName) - 2) AS FirstName,

       CASE WHEN FullName LIKE '%, %, %'

            THEN RIGHT(FullName, CHARINDEX(' ,', REVERSE(FullName)) - 1)

            ELSE '' END AS MiddleInitial

FROM   BadData

-- Version 2, with STUFF

SELECT FullName,

       LEFT(FullName, CHARINDEX(', ', FullName) - 1) AS LastName,

       STUFF(LEFT(FullName, CHARINDEX(', ', FullName + ', ',

                                      CHARINDEX(', ', FullName) + 2) - 1),

             1, CHARINDEX(', ', FullName) + 1, '') AS FirstName,

       STUFF(FullName, 1,

             CHARINDEX(', ', FullName + ', ',

                       CHARINDEX(', ', FullName) + 2), '') AS MiddleInitial

FROM   BadData

 

As you have seen, learning to use STUFF when appropriate can make hard tasks easy, complex queries simple, and extremely complicated queries somewhat less complicated. This string function is a tool that every SQL Server developer should know. So the next time someone complains how SQL Server is severely lacking adequate string manipulation tools, you know what to do – just tell’m to stuff it!

Published Thursday, October 12, 2006 8:50 PM by Hugo Kornelis
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

Roji Thomas said:

October 13, 2006 6:05 AM
 

Denis the SQL Menace said:

-- Version 5, using PARSENAME
SELECT LTRIM(PARSENAME(REPLACE(FullName,',','.'),2))AS LastName,
LTRIM(PARSENAME(REPLACE(FullName,',','.'),1))AS FirstName
FROM   BadData


Denis
October 13, 2006 10:08 AM
 

Denis the SQL Menace said:

hugi,

I didn't see the second part (3 names) before, here is a solution for that too by using PARSENAME

select PARSENAME(FullName,NameLen+1) LastName,
PARSENAME(FullName,NameLen) FirstName,
COALESCE(PARSENAME(FullName,NameLen-1),'') as MiddleInitial
from(
select len(FullName) -len(replace(FullName,',','')) as NameLen,REPLACE(REPLACE(FullName,'.','~'),', ','.') as FullName
from BadData) x
October 13, 2006 10:23 AM
 

Denis the SQL Menace said:

Argh, who put the i next to the o

Hugo I meant to say Hugo not hugi, please don't call me Denise now  ;-)

October 13, 2006 10:27 AM
 

Denis the SQL Menace said:

Small problem, I forgot to replace ~ with . in the result

Here it is, final version (I promise)

SELECT PARSENAME(FullName,NameLen+1) AS LastName,
PARSENAME(FullName,NameLen) AS FirstName,
COALESCE(REPLACE(PARSENAME(FullName,NameLen-1),'~','.'),'') AS MiddleInitial
FROM(
SELECT LEN(FullName) -LEN(REPLACE(FullName,',','')) AS NameLen,
REPLACE(REPLACE(FullName,'.','~'),', ','.') AS FullName
FROM BadData) x


316 characters (according to edit plus)
October 13, 2006 10:40 AM
 

Peter W. DeBetta said:

Denis,
Nice touch using ParseName. I'm jealous I didn't think of it first!

Worth metioning, however, is that you're are limited to a 4 items being parsed. ParseName can take a string with the following formats - 'a', 'a.b', 'a.b.c', 'a.b.c.d' - but cannot parse 'a.b.c.d.e'

So for a name with only three parts (last, first, middle) it works great, but for a 5 part name (prefix/title, first, middle, last, suffix) such as 'Dr., John, Q., Public, III', the technique could no longer be used, as Parsename returns NULL if the string being parsed has more than four items.

October 13, 2006 11:34 AM
 

Denis the SQL Menace said:

Peter, you can't change requirements on me, this is not work  ;-)


But you are right,  PARSENAME is used to split (SERVERNAME,DBNAME, ObjectOwner, Object)

Quite handy with IP adresses also

for 5 part we can always do a join with a number table, something like

DECLARE @chvGroupNumbers VARCHAR(1000)
SELECT @chvGroupNumbers ='1,4,77,88,4546,234,2,3,54,87,9,6,4,36,6,9,9,6,4,4,68,9,0,5,3,2,'

SELECT SUBSTRING(',' + @chvGroupNumbers + ',', NumberID + 1,
CHARINDEX(',', ',' + @chvGroupNumbers + ',', NumberID + 1) - NumberID -1)AS Value
FROM NumberPivot
WHERE NumberID <= LEN(',' + @chvGroupNumbers + ',') - 1
AND SUBSTRING(',' + @chvGroupNumbers + ',', NumberID, 1) = ','
GO


Or make a UDF

Denis
October 13, 2006 11:44 AM
 

Hugo Kornelis said:

> please don't call me Denise now

I won't, Denos. :-P

Nice solution using PARSENAME - kudos!

I also liked the extended version using a numbers table, but this one can be simplified to
SELECT SUBSTRING(@chvGroupNumbers + ',',
                Number,
                CHARINDEX(',', @chvGroupNumbers + ',', Number + 1) - Number) AS Value
FROM   dbo.Numbers
WHERE  Number <= LEN(@chvGroupNumbers + ',')
AND    SUBSTRING(',' + @chvGroupNumbers + ',', Number, 1) = ','

But of course, you'll have to add a ROW_NUMBER() to preserve the order of the elements, otherwise we'd end up switching first and last names or day and month numbers and looking all American <g>

SELECT SUBSTRING(@chvGroupNumbers + ',',
                Number,
                CHARINDEX(',', @chvGroupNumbers + ',', Number + 1) - Number) AS Value,
      ROW_NUMBER() OVER (ORDER BY Number) AS Position
FROM   dbo.Numbers
WHERE  Number <= LEN(@chvGroupNumbers + ',')
AND    SUBSTRING(',' + @chvGroupNumbers + ',', Number, 1) = ','
October 14, 2006 1:01 PM
 

Daniel Lins said:

ok hugo, here s one more use of stuff (and PATINDEX) .

if there is a need to remove a character of defined length at the end of a string ,you could put to use the following:

lets say @str = someone and something and

select stuff(@str,PATINDEX('%and',@str),3,'')

here the result would be

--------------------------------

'someone and something'

cool uh?!

December 13, 2007 8:07 AM
 

Daniel Lins said:

Now guys .......anyone willing to help...............its kinda simple........but my head is kinda numb...........so here s the question.................

I need to make a report in the following manner below

year | RG1 | RG2            | value

07     Plan    Executed        3

07     Plan    Abandoned     1

07     Plan    Complete       1

this is to be done using sql 2000 (wihtout making use of reporting service)

So who's got the answer????

December 13, 2007 8:19 AM
 

Hugo Kornelis said:

Hi Daniels,

Sorry for the late reply. This is quite an old post, so I missed the new comment.

To answer your question, I need to know what the data you start with looks like. So I suggest that you post CREATE TABLE statements for the tbales involved (with all constraints, properties, and indexes included), INSERT statements with some well-chosen rows of sample data, and expected output.

However, don't post it here. Post it to a newsgroup, for instance microsoft.public.sqlserver.programming. There are many people in that group, always willing to help you solve such tasks, so you'll get a reply much sooner than when you post here! :)

Happy holidays!

December 24, 2007 9:57 AM
 

Ronald Beuker said:

Great article Hugo, thx! There's just a small error: POSINDEX should be CHARINDEX :-)

May 25, 2008 12:33 PM

Leave a Comment

(required) 
(required) 
Submit

About Hugo Kornelis

Hugo is co-founder and R&D lead of perFact BV, a Dutch company that strives to improve analysis methods and to develop computer-aided tools that will generate completely functional applications from the analysis deliverable. The chosen platform for this development is SQL Server. In his spare time, Hugo likes to visit the SQL Server newsgroups, in order to share and enhance his knowledge of SQL Server.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement