THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

John Paul Cook

SQL Server and case sensitivity

Just a few minutes ago, I thought a coworker insulted “my” database. He said it was insensitive. I asked him exactly what problem he had with the database. He said it couldn’t distinguish between lowercase and uppercase and he wanted that fixed.

 

Case sensitive comparisons do come up from time to time. By default, SQL Server creates case insensitive databases. Rebuilding a database to change to a case sensitive collation is one way of dealing with case sensitive comparisons, but there is a much easier way. You can specify the collation in the join or comparison condition. CS is case sensitive, CI is case insensitive. AI is accent insensitive.

Run the sample code below to see how it works:

 

create table #uppercase (
    
col nchar(1)
);
go

 

create table #lowercase (
   
col nchar(1)
);
go

 

insert into #uppercase (col) values ('A');
insert into #uppercase (col) values ('B');
insert into #uppercase (col) values ('C');
insert into #uppercase (col) values ('D');
insert into #uppercase (col) values ('E');

 

insert into #lowercase (col) values ('a');
insert into #lowercase (col) values ('b');
insert into #lowercase (col) values ('c');
insert into #lowercase (col) values ('d');
insert into #lowercase (col) values ('e');

 

select u.col, l.col
from #uppercase u inner join #lowercase l
on u.col = l.col--returns 5 rows

 

select u.col, l.col
from #uppercase u inner join #lowercase l
on u.col collate Latin1_General_CS_AI = l.col collate Latin1_General_CS_AI--returns 0 rows

 

Published Tuesday, July 13, 2010 3:29 PM by John Paul Cook

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

AaronBertrand said:

I've often thought that it would be a good idea, in general, to develop in case sensitive databases, even if your target now does not require it.  This is especially true in SaaS or 3rd party software where you never know what environment you may need to deploy to next.

July 13, 2010 4:00 PM
 

Boyan Penev said:

To support the point of using case sensitive databases - think of SSIS. In many cases SSIS is case sensitive - therefore if our databases are also case sensitive we easily avoid "strange" results when comparing strings. In particular the Lookup dataflow task in SSIS _always_ causes issues with this when in full caching mode. Just yesterday I thought about it after we caught some problems with an ETL package and I am inclined to agree with Aaron that because of compatibility issues it could be beneficial to use case sensitive collations by default. I'll give it a go next time I get a chance.

July 13, 2010 7:11 PM
 

Paul Nielsen said:

Case sensitive collations impact more than just the data, a case sensitive database means that all the user created objects (tables, columns, procs, variables, etc.) are also case sensitive. Developing on a case insemsitive database and then porting the scritps to a case sensitive environment can be a humbling experience.

July 15, 2010 4:33 PM
 

Srini Potti said:

What about the performance aspect then ? If we specify the collation in join then will it use indexes if any ? I have never tried it though as I have all case insensitive databases.

July 15, 2010 4:50 PM
 

Ralph D. Wilson II said:

Having recently been involved with a mixture of CI and CS databases, I have encountered many of the quirks involved with both.  For instance, CS databases are subject to problems due to users/applications not paying attention to case when performing data entry tasks . . . e.g. "john smith", "John Smith", and "JOHN SMITH" are not the same in a CS database.  While this may not seem like a major problem, it can result in some rather unfortunate results if there is no front end control on the data being entered.

One issue that I really wish Microsoft would address is that there are NO CS collations that have an equivalent CI collation such that you can have a CI table into which the data from a CS table can be readily inserted (or vice versa).  In every instance, you have to perform collation changes in the INSERT statement . . . and that can get rather tricky if you are inserting data using a complex INSERT statement that involves joins between various CS tables and subqueries on them.

December 2, 2011 12:18 PM
 

Juan Pablo said:

Great post,

I see you make in line condition for the case

Greetings

October 13, 2012 8:50 AM

Leave a Comment

(required) 
(required) 
Submit

About John Paul Cook

John Paul Cook is a Technology Solutions Professional for Microsoft's data platform and works out of Microsoft's Houston office. Prior to joining Microsoft, he was a Microsoft SQL Server MVP. He is experienced in Microsoft SQL Server and Oracle database application design, development, and implementation. He has spoken at many conferences including Microsoft TechEd and the SQL PASS Summit. He has worked in oil and gas, financial, manufacturing, and healthcare industries. John is also a Registered Nurse who graduated from Vanderbilt University with a Master of Science in Nursing Informatics and is an active member of the Sigma Theta Tau nursing honor society. He volunteers as a nurse at safety net clinics. Contributing author to SQL Server MVP Deep Dives and SQL Server MVP Deep Dives Volume 2. Opinions expressed in John's blog are strictly his own and do not represent Microsoft in any way.

This Blog

Syndication

Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement