THE SQL Server Blog Spot on the Web
Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

Alberto Ferrari

How are GUIDs sorted by SQL Server?

Today I ran into a nice problem with SQL Server fancy way of sorting GUIDS.

I know, there should be no reason at all to sort GUID columns but, if what you need is run two tables side by side to check whether they are aligned or not (this is the task of TableDifference) and a GUID is the best key you have, then you will end up sorting GUIDS. The question now is: how do you sort GUIDS? The fact is that SQL Server and .NET give different answers. Let's have a look at it in deeper detail.

I was using .NET guids to read data from SQL Server and then use .Compare to check for the sorting, supposing that a uniqueidentifier in SQL is sorted the same way a guid is sorted in .NET.

This is not true, I discovered here that you should use SqlGuid instead. Let's not spend too many words on why having two sort algorithm may be clever or not, what is frightening about that article is the fact that it says that SqlGuid uses only the last six bytes to sort data. If it's working this way then we will end up with a sorting algorithm that does not really sorts. Seemed to be too stupid, so I double checked it and ran into this that says something more interesting about how GUIDS are sorted. Anyway, the algorithm is not still clear. Will I end up with a completely sorted list if I issue an ORDER BY on a uniqueidentifier column? and, how will this data be sorted?

After all, the best way is always that of trying, so I end up with this very simple and funny query (that you can use to show friends that bits are still here, in 2007!):

With UIDs As (--                         0 1 2 3  4 5  6 7  8 9  A B C D E F
            Select ID =  1, UID = cast ('00000000-0000-0000-0000-010000000000' as uniqueidentifier)
    Union   Select ID =  2, UID = cast ('00000000-0000-0000-0000-000100000000' as uniqueidentifier)
    Union   Select ID =  3, UID = cast ('00000000-0000-0000-0000-000001000000' as uniqueidentifier)
    Union   Select ID =  4, UID = cast ('00000000-0000-0000-0000-000000010000' as uniqueidentifier)
    Union   Select ID =  5, UID = cast ('00000000-0000-0000-0000-000000000100' as uniqueidentifier)
    Union   Select ID =  6, UID = cast ('00000000-0000-0000-0000-000000000001' as uniqueidentifier)
    Union   Select ID =  7, UID = cast ('00000000-0000-0000-0100-000000000000' as uniqueidentifier)
    Union   Select ID =  8, UID = cast ('00000000-0000-0000-0010-000000000000' as uniqueidentifier)
    Union   Select ID =  9, UID = cast ('00000000-0000-0001-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 10, UID = cast ('00000000-0000-0100-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 11, UID = cast ('00000000-0001-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 12, UID = cast ('00000000-0100-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 13, UID = cast ('00000001-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 14, UID = cast ('00000100-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 15, UID = cast ('00010000-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 16, UID = cast ('01000000-0000-0000-0000-000000000000' as uniqueidentifier)
)
Select * From UIDs Order By UID, ID

 

The result is simple, the ID's are completely reversed even if they appear as second argument in the SORT request. This means that the evaluation is carried on in byte groups this way:

  • 0..3 are evaluated in left to right order and are the less important, then
  • 4..5 are evaluated in left to right order, then
  • 6..7 are evaluated in left to right order, then
  • 8..9 are evaluated in right to left order, then
  • A..F are evaluated in right to left order and are the most important

Well, I know it is not very important, but it is funny and maybe useful in some future. :) The really important thing is that ORDER BY GUID works, all bytes are evaluated and sorted, even if in a strange way.


Published Friday, August 31, 2007 5:19 PM by AlbertoFerrari
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

BAZRAMIT said:

I am wondering how indexing can be done on a GUID column? What if a primary key is GUID, would that create a performance issue?

Thanks

September 1, 2007 3:02 PM
 

Mladen said:

if speed isn't important i just convert then to varchars and sort them on that.

no worries no fuss.

September 1, 2007 3:53 PM
 

AlbertoFerrari said:

Bazramit,

using a GUID as a primary column is not a good idea. You normally have the clustered index based on the primary column, guids will give you a random generation of codes so SQL server will end up sorting a lot of data at each insertion.

INT IDENTITY are still the best choice for primary columns, they are created monotonically and require no sort (apart from some small problems you can find on this blog, but none of them is unsolvable).

Alberto

September 6, 2007 10:52 AM
 

AlbertoFerrari said:

Mladen,

Yep, surely varchar are much better for sorting purposes but... gathering the material for the post lead to some hours spent having fun. :)

September 6, 2007 10:55 AM
 

Dennis Gorelik said:

Alberto,

1) Thanks for good tip about how to find out SQL Server uniqueidentifier sort order.

2) There are may be many reasons to sort uniqueidentifier (guid) column.

For example, if you want to know what record was created first -- having guid column sorted may become quite useful.

3) SQL Server 2005 supports "default newsequentialid()" constraint, that sequentially increments uniqueidentifier column.

But what if you want to generate new sequential guid in C#?

In my case I need it, because I write same GUID into two tables and use SqlBulkCopy for performance considerations.

For mode details see my post <a href="http://developmenttips.blogspot.com/2008/03/generate-sequential-guids-for-sql.html">Generate Sequential GUIDs for SQL Server 2005 in C#</a>

4) If uniqueidentifier column is sequential, then it can be indexed by clustered key.

If it's not sequential -- you can use nonclustered index. In any case -- it's ok to use GUIDs for your primary key (if you know what you are doing).

5) Using GUIDs [for your primary keys] helps with replication scenario.

---

P.S.: Could you please turn on email notification for comments in your blog?

March 11, 2008 1:11 AM
 

Nhibernate Oracle GUID as Primary Key mapping using custom IUserType | Michael Hanney said:

April 13, 2008 1:05 AM
 

Technical Thoughts of Jorriss said:

A few months ago we started development on a new system. From the ground up we redesigned everything

April 24, 2008 11:55 AM
 

dosql » Order by GUID in SQL Server said:

September 3, 2008 12:00 PM
 

Sorting Guid in .NET the way Microsoft SQL Server does it | Rinat Abdullin said:

October 8, 2008 2:45 PM
 

Kevin said:

Ok, I'm glad for the information you researched but I suspect it may not be correct.

If I run this query:

SELECT     type_id, class_id, name, description

FROM         UnitTypes

Order by class_id

You are correct. It is not in a logical ABC order.

However, if you are correct, why does this work?

SELECT     type_id, class_id, name, description

FROM         UnitTypes

ORDER BY rtrim(ltrim(class_id))

October 23, 2008 9:51 AM
 

AlbertoFerrari said:

Kevin,

I haven't tested your query, but RTRIM (LTRIM (<whatever you want>)) if ever computed is a string, not a GUID. So sorting rules are those for strings.

October 23, 2008 10:05 AM
 

Technical Thoughts of Jorriss said:

Unraveling the mysteries of NewSequentialID

October 26, 2008 11:06 AM

Leave a Comment

(required) 
(optional)
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement