THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Alexander Kuznetsov

Are values generated by NewId() distributed evenly?

I ran a quick check, and it looks like a reasonably even distribution to me. I genenerated 2M NewId values, as follows:

 

SELECT NewID() AS RandomColumn INTO #t  FROM Data.TableWith2Mrows


 I looked at the distribution into 256 buckets, considering the first byte of the value only. All 256 buckets had values:

 

SELECT COUNT(*) AS Cnt
FROM (
SELECT SUBSTRING(RandomColumn,1,1) AS FirstByte, COUNT(*) AS Frequency
FROM (SELECT CAST(RandomColumn AS BINARY(16)) AS RandomColumn FROM #t) AS t
GROUP BY SUBSTRING(RandomColumn,1,1)
) AS Distribution

-----------

 

256

 

and the number of values in all the buckets was between 7200 and 7700:

 

 SELECT COUNT(*) AS Cnt, Frequency
FROM (
SELECT SUBSTRING(RandomColumn,1,1) AS FirstByte, COUNT(*) AS Frequency
FROM (SELECT CAST(RandomColumn AS BINARY(16)) AS RandomColumn FROM #t) AS t
GROUP BY SUBSTRING(RandomColumn,1,1)
) AS Distribution
GROUP BY Frequency
ORDER BY Frequency

 After I had posted this, Simon Sabin told me that he also posted a similar script. He did it four years ago. Somehow I failed to google it up:

 

http://sqlblogcasts.com/blogs/simons/archive/2005/06/13/Random-number-on-a-per-row-basis.aspx

 

Published Thursday, May 14, 2009 1:56 PM by Alexander Kuznetsov

Comments

 

Adam Machanic said:

And here's another recent post on the same topic :-)

http://blogs.msdn.com/jenss/archive/2009/04/08/when-is-random-random-enough.aspx

May 20, 2009 3:39 PM
 

Alexander Kuznetsov said:

Adam,

How did you Google it up? Because whatever I did, Google did not bring it to me.

I think that's because the author should uppercase Force in the following line:

   (1,'Luke', 'Skywalker', 'The force is strong with him') ,

;)

May 20, 2009 4:46 PM
 

Adam Machanic said:

I didn't Google it; I happened to read it a few weeks ago when I was browsing the blog.

May 29, 2009 10:18 AM
New Comments to this post are disabled

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization. Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.

This Blog

Syndication

Privacy Statement