THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alberto Ferrari

How are GUIDs sorted by SQL Server?

Today I ran into a nice problem with SQL Server fancy way of sorting GUIDS.

I know, there should be no reason at all to sort GUID columns but, if what you need is run two tables side by side to check whether they are aligned or not (this is the task of TableDifference) and a GUID is the best key you have, then you will end up sorting GUIDS. The question now is: how do you sort GUIDS? The fact is that SQL Server and .NET give different answers. Let's have a look at it in deeper detail.

I was using .NET guids to read data from SQL Server and then use .Compare to check for the sorting, supposing that a uniqueidentifier in SQL is sorted the same way a guid is sorted in .NET.

This is not true, I discovered here that you should use SqlGuid instead. Let's not spend too many words on why having two sort algorithm may be clever or not, what is frightening about that article is the fact that it says that SqlGuid uses only the last six bytes to sort data. If it's working this way then we will end up with a sorting algorithm that does not really sorts. Seemed to be too stupid, so I double checked it and ran into this that says something more interesting about how GUIDS are sorted. Anyway, the algorithm is not still clear. Will I end up with a completely sorted list if I issue an ORDER BY on a uniqueidentifier column? and, how will this data be sorted?

After all, the best way is always that of trying, so I end up with this very simple and funny query (that you can use to show friends that bits are still here, in 2007!):

With UIDs As (--                         0 1 2 3  4 5  6 7  8 9  A B C D E F
            Select ID =  1, UID = cast ('00000000-0000-0000-0000-010000000000' as uniqueidentifier)
    Union   Select ID =  2, UID = cast ('00000000-0000-0000-0000-000100000000' as uniqueidentifier)
    Union   Select ID =  3, UID = cast ('00000000-0000-0000-0000-000001000000' as uniqueidentifier)
    Union   Select ID =  4, UID = cast ('00000000-0000-0000-0000-000000010000' as uniqueidentifier)
    Union   Select ID =  5, UID = cast ('00000000-0000-0000-0000-000000000100' as uniqueidentifier)
    Union   Select ID =  6, UID = cast ('00000000-0000-0000-0000-000000000001' as uniqueidentifier)
    Union   Select ID =  7, UID = cast ('00000000-0000-0000-0100-000000000000' as uniqueidentifier)
    Union   Select ID =  8, UID = cast ('00000000-0000-0000-0010-000000000000' as uniqueidentifier)
    Union   Select ID =  9, UID = cast ('00000000-0000-0001-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 10, UID = cast ('00000000-0000-0100-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 11, UID = cast ('00000000-0001-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 12, UID = cast ('00000000-0100-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 13, UID = cast ('00000001-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 14, UID = cast ('00000100-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 15, UID = cast ('00010000-0000-0000-0000-000000000000' as uniqueidentifier)
    Union   Select ID = 16, UID = cast ('01000000-0000-0000-0000-000000000000' as uniqueidentifier)
)
Select * From UIDs Order By UID, ID

 

The result is simple, the ID's are completely reversed even if they appear as second argument in the SORT request. This means that the evaluation is carried on in byte groups this way:

  • 0..3 are evaluated in left to right order and are the less important, then
  • 4..5 are evaluated in left to right order, then
  • 6..7 are evaluated in left to right order, then
  • 8..9 are evaluated in right to left order, then
  • A..F are evaluated in right to left order and are the most important

Well, I know it is not very important, but it is funny and maybe useful in some future. :) The really important thing is that ORDER BY GUID works, all bytes are evaluated and sorted, even if in a strange way.

Published Friday, August 31, 2007 5:19 PM by AlbertoFerrari
Filed under: ,

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

BAZRAMIT said:

I am wondering how indexing can be done on a GUID column? What if a primary key is GUID, would that create a performance issue?

Thanks

September 1, 2007 3:02 PM
 

Mladen said:

if speed isn't important i just convert then to varchars and sort them on that.

no worries no fuss.

September 1, 2007 3:53 PM
 

AlbertoFerrari said:

Bazramit,

using a GUID as a primary column is not a good idea. You normally have the clustered index based on the primary column, guids will give you a random generation of codes so SQL server will end up sorting a lot of data at each insertion.

INT IDENTITY are still the best choice for primary columns, they are created monotonically and require no sort (apart from some small problems you can find on this blog, but none of them is unsolvable).

Alberto

September 6, 2007 10:52 AM
 

AlbertoFerrari said:

Mladen,

Yep, surely varchar are much better for sorting purposes but... gathering the material for the post lead to some hours spent having fun. :)

September 6, 2007 10:55 AM
 

Dennis Gorelik said:

Alberto,

1) Thanks for good tip about how to find out SQL Server uniqueidentifier sort order.

2) There are may be many reasons to sort uniqueidentifier (guid) column.

For example, if you want to know what record was created first -- having guid column sorted may become quite useful.

3) SQL Server 2005 supports "default newsequentialid()" constraint, that sequentially increments uniqueidentifier column.

But what if you want to generate new sequential guid in C#?

In my case I need it, because I write same GUID into two tables and use SqlBulkCopy for performance considerations.

For mode details see my post <a href="http://developmenttips.blogspot.com/2008/03/generate-sequential-guids-for-sql.html">Generate Sequential GUIDs for SQL Server 2005 in C#</a>

4) If uniqueidentifier column is sequential, then it can be indexed by clustered key.

If it's not sequential -- you can use nonclustered index. In any case -- it's ok to use GUIDs for your primary key (if you know what you are doing).

5) Using GUIDs [for your primary keys] helps with replication scenario.

---

P.S.: Could you please turn on email notification for comments in your blog?

March 11, 2008 1:11 AM
 

Kevin said:

Ok, I'm glad for the information you researched but I suspect it may not be correct.

If I run this query:

SELECT     type_id, class_id, name, description

FROM         UnitTypes

Order by class_id

You are correct. It is not in a logical ABC order.

However, if you are correct, why does this work?

SELECT     type_id, class_id, name, description

FROM         UnitTypes

ORDER BY rtrim(ltrim(class_id))

October 23, 2008 9:51 AM
 

AlbertoFerrari said:

Kevin,

I haven't tested your query, but RTRIM (LTRIM (<whatever you want>)) if ever computed is a string, not a GUID. So sorting rules are those for strings.

October 23, 2008 10:05 AM
 

Kevin said:

So, its converting it automatically to a string when I add those to the order by, hence the correct abc order?

October 23, 2008 10:13 AM
 

Martin Bell UK SQL Server MVP said:

When creating client-side guids to be stored in SQL Server things are not as simple as you expect...

May 25, 2009 9:38 AM
 

chris saam said:

Using reflector on SqlGuid you see that the comparison uses a map to look up each byte by index.  Below is the map

 x_rgiGuidOrder = new int[] { 10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3 };

The last 6 bytes aren't the only ones compared but they are the first.

However it does look like SQL server does ignore the rest in it's sorting since the queries in my test

SELECT TenantId FROM Subscription where TenantId between '00000000-0000-0000-0000-000000000000' and '00000000-0000-0000-0000-010000000000'

SELECT TenantId FROM Subscription where TenantId between '00000000-0000-0000-0000-000000000000' and 'FFFFFFFF-FFFF-FFFF-FFFF-010000000000'

return the same result set.

December 2, 2009 10:31 PM
 

Chris Saam said:

After experimenting a bit in SQL I found that

SELECT TenantId FROM Subscription where TenantId between 'FFFFFFFF-FFFF-FFFF-00FF-FFF68AEFE603' and '00000000-0000-0000-0000-FFF68AEFE604'

Returns the TenantId:

50EE211B-BEAE-434A-98B4-FFF68AEFE603

but

SELECT TenantId FROM Subscription where TenantId between 'FFFFFFFF-FFFF-FFFF-FFFF-FFF68AEFE603' and '00000000-0000-0000-0000-FFF68AEFE604'

Does not.  So it seems that Sql sorts on 7 bytes not 6 with the 7th byte being at byte 8 which corresponds to the SqlGuid map.

 x_rgiGuidOrder = new int[] { 10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3 };

December 2, 2009 10:50 PM
 

Ferrari hire said:

Thanks for this

April 16, 2010 7:57 AM
 

Lynn said:

How does this sort order affedc SSAS when GIDS are used as surrogate Key's

July 12, 2010 11:08 AM
 

microfox said:

> 8..9 are evaluated in right to left order, then

> A..F are evaluated in right to left order and are the most important

This is not exactly true.

Bytes 8..9 and A..F are evaluated in left to right order as the other bytes.

November 19, 2012 5:30 PM
 

Marc Brooks said:

For C# "sequential" (e.g. before the UUIDs can have privacy implications fiasco of 1999 forced a new "random" format) you can do this:

   internal static class SafeNativeMethods

   {

       [DllImport("rpcrt4.dll", SetLastError = true)]

       internal static extern int UuidCreateSequential(out Guid guid);

   }

   public static partial class Utility

   {

       const int RPC_S_OK = 0;

       const int RPC_S_UUID_LOCAL_ONLY = 1824;

       const int RPC_S_UUID_NO_ADDRESS = 1739;

       public static Guid NewSequentialGuid()

       {

           Guid value;

           int rc = SafeNativeMethods.UuidCreateSequential(out value);

           switch (rc)

           {

               case RPC_S_OK:

                   return value;

               case RPC_S_UUID_LOCAL_ONLY:

               case RPC_S_UUID_NO_ADDRESS:

                   return Guid.NewGuid();

               default:

                   throw new NotImplementedException("OS returned unexpected error code generating Guid: " + rc);

           }

       }

   }

August 19, 2013 4:28 PM
 

Marc Brooks said:

As for why GUIDs (which are supposed to be UUIDs) look like they do, and how they are represented, see this:

http://stackoverflow.com/questions/10687505/uuid-format-8-4-4-4-12-why

August 19, 2013 4:30 PM
 

Nils said:

I tried the following statement:

SELECT [ID]

     ,[GUID]

 FROM [MyTable] order by cast(guid as varchar(50))

which works really fine.

October 28, 2013 8:56 AM

Leave a Comment

(required) 
(required) 
Submit

About AlbertoFerrari

Alberto Ferrari is a Business Intelligence consultant. He his interests lie in two main areas: BI development lifecycle methodologies and performance tuning of ETL and SQL code. His main activities are with SSIS and SSAS for the banking, manufacturing and statistical sectors. He is also a speaker in international conferences like European PASS Conference and PASS Summit.
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement