|
|
|
|
-
Playing with SSIS I normally have to read flat files that come in zip format. The standard procedure here is to launch winzip, decompress the file, read it with a flat file adapter and then delete the uncompressed file. This behaviour generates a lot of useless I/O on temporary disks to decompress a file that will be soon deleted. Another way of solving the problem is to have a look at DotNetZip library. A very smart programmer wrote a library that makes opening zip files very easy using .NET. He also added in version 1.6 a great function: OpenReader. You can open a zip file, look into its entries and open a stream reader on it. Using this tecnique you can open a stream reader directly into the zip file and the process the stream using a VB source component in your data flow package. And... Yes, you cannot use the flat file adapter. You will need to parse the stream and read strings from it but you will end up with very few I/O when compared to the unzip / read / delete solution. As I/Os are very expensive, this tecnique may speed up your package significantly.
|
-
Table difference is an SSIS custom component designed to simplify the management of slowly changing dimensions and – in general – to check the differences between two tables or data flow with SSIS. Its main advantage over Microsoft standard SCD component is its speed and ease of use, the component receives input from two sorted sources and generates different outputs for unchanged, new, deleted or updated rows. The project is freeware, full source code is available at www.sqlbi.com. Note to version 1.x users The internal structure of the metadata of TableDifference is very different from the previous one. The component is able to read metadata from the previous version and write them with the new format as soon as it is called from inside BIDS so you will be able to reload all the previous metadata without loss of functionality. However, in writing the new metadata, the component changes the lineage IDs of all the output columns so, when you open a package that contains a previous version of TableDifference, you will need to open its designer to check that everything worked fine and the correct the subsequent flows by double clicking on the components that use TableDifference outputs. I have converted a lot of packages without any problems but, if you encounter something strange, do not hesitate to contact me. Introduction One question that arises very often in Data Warehouse programming is “what are the differences between these two tables/flows of data?” It is often the case in SCD management; you receive several millions of customers and should decide what changes need to be done after your last successful load. Using SSIS we have the SCD component, it works but it does it slow that in the production environment it is quite always convenient to create an “ad hoc” solution to handle the SCD. We decided to write an SSIS component that has two inputs, one for the “old” and one for the “new” data, compares all the rows that come in from the old and the new flow and sends them to different outputs, namely: - Unchanged rows (are the same in both input)
- Deleted rows (appear in old but not in new)
- New rows (appear in new but not in old)
- Updated rows (appear in both flows but something is changed)
The following picture illustrates the component functionality better than thousand words.
As not every update is to be handled the same way, it is possible to tag each column with an integer; the number of updated outputs is determined by the number of tags, one for each, in this way you can tag with “10” the historical attributes, with “20” other columns and decide what kind of operation to carry on with the different updates. The inputs MUST be sorted and have a collection of fields (keys) that let the component decide when two rows from the inputs represent the same row, but this is easily accomplished by SQL Server with a simple “order by” and a convenient index; moreover the SCD do normally maintain an index by the business key, so the sorting requirement is easily accomplished and do not represent a problem. The main structure The structure of the component is pretty simple: - Old Flow Input: it should be attached to a convenient query that returns all the current rows from the SCD
- New Flow Input: it can be connected to the flow where the new structure of the SCD has been computed.
Each input has a buffer where all the incoming rows are directed and maintained. A separate thread starts as soon as data from both buffers is available and checks for the differences between them, sending the rows to the appropriate output and then removing the buffered rows from the buffers. The buffer does not need to maintain all the data from the SCD, data is removed as soon as it can be compared with a corresponding row: the buffer, even for a several million record table, should be pretty small (but read the document to the end… some tricks may be useful to avoid memory consumption). The outputs The component has three standard outputs for new, deleted and unchanged rows. The number of updated rows output is determined by the user via the component editor, one output is provided for each different updateID that is inserted by the user. The outputs are named “UpdateID” followed by the updateID defined by the user. Of course, you can easily change the name of the output to something more interesting like “Updated historical attribute” and we encourage you to do so! The collection of output fields is computed by the component via the intersection of the two inputs: if one column appears in only one input it is not managed, if it appears in both then it will be compared and outputted. Installing TableDifference Installation is very simple: · copy TableDifference.DLL into “<Program Files>\Microsoft SQL Server\90\DTS\PipelineComponents” · add the DLL to the GAC using “GACUTIL –I TableDifference.DLL” No installer for the moment… sorry. Note for Vista users: you need to run these command on a command line opened with administrator privileges to perform the tasks. Using TableDifference After the component is installed you should add it to the toolbar as for every SSIS component and then you are ready to use it. After both inputs has been attached, by double clicking on the component, you reach the component editor that shows a single window like this:
A few words about what is happening just before the component editor pops up: - TableDifference analyzes all the columns in both inputs and compares their names. If the name of two columns and their corresponding types are identical, TableDifference adds them to the available columns to manage.
- If the flows are sorted, their sort columns will be marked as key fields, using the same order in which they appear in the sort.
- All other columns are assigned a standard Update ID of 10 and are managed as comparable columns.
Using the component editor you need to provide these information for the columns: - Check Option: you can choose the column type between:
- Key field: these column will be used to detect when two rows from the inputs represent the same row. Beware that the inputs must be sorted by those columns
- Compare: these columns will be compared one by one to detect differences
- Prefer NEW: these columns will be copied from the NEW input directly into the output, no check
- Prefer OLD: these columns will be copied from the OLD input directly into the output, no check
- KeyOrder: If a column is of type “Key Field” it is the order under which the field appear under the “order by” clause of your query. Beware that the component do not check for the correct sorting sequence, it is up to you to provide this information.
- Update ID: each different UpdateID creates a different output flow. If you need to detect when a change appears in some column you can use different update ID. Beware that the lowest update ID wins, i.e. if AccountNumber has update id of 10 and AddressLine1 has update id of 20, then Accountnumber will be checked first and if a change is detected, the row will go to update output 10, no matter if AddressLine has a difference.
Clicking “OK” is enough for the component to generate the outputs and to define the metadata for all the outputs. In version 2.0 there are a three new panels: Outputs panel In this panel you can choose which output to enable. If you are not interested, for example, in the unchanged output, then you can deselect it from this panel to avoid warnings for unused columns.
You can also rename outputs and provide a concise description of them. Renaming is very useful as it gives the data flow clearness. Output Details This panel let you select the columns for each output. You cannot add any column but you can disable columns for outputs that do not use them.
The upper combo box let you choose an output, in the grid you can select or unselect any column. This feature is useful as it avoids warnings for unused columns. In the picture, the deleted output will receive only the customer key as it will make no use of any other information. Always remember that removing useless output columns increases the performances of the component. Misc Options Under this pane there are miscellaneous options.
Here you can define, for string comparisons: - The culture ID to use to perform the comparison. If not specified TableDifference will use the culture ID of the running task. The default is “empty”.
- If you want it to ignore casing during string comparisons. The default is unchecked so TableDifference will perform comparison considering case.
Warnings This panel will list any unused column from the input. As you might recall, if two columns are not identical regarding name and type, TableDifference will ignore them. This might be an error but the decision is up to you. By checking the warnings panel you can see if TableDifference is working with all the columns you need it to compare. FlowSync The component works fine and do not consume memory if and only if the input from both flows come in at a synchronized speed: if one input is much faster than the other then the component will start buffering data and consume memory that will be freed only when the buffer starts to shrink. Using the component to compare two tables with 5 millions of records we had several problems with memory, because data came in from one buffer much faster than the other and, after the difference (and hence the buffer on one input) reached 1.5 millions of records, the whole memory of the DtExec process (in a 32 bit virtual space) was filled in. As both inputs were from a simple “Select * From” and the speed from both lines was the same, we discovered that – for some obscure reason – SSIS prefers one input to the other and do not leave enough time to both tasks, resulting in memory consumption by the component. FlowSync is a component that will make two or more flows of data in an SSIS data flow package run at the same speed, by stopping one flow if the others run too slow. It has been created as a convenient companion to TableDifference to resolve the problems with memory occupation, it can be used by its own in the case where you want flows to run at the same speed. It makes use of semaphores to handle synchronization, so no CPU is ever wasted, when the faster flows is stopped all the CPU is free for others (more useful) processes. You can find source code, executable and description of FlowSync at www.sqlbi.com. We normally use flowsync when we need to compare more than half a million rows while we use TableDifference without flow sync for all the smaller tables in a project.
|
-
The new version 2.0 of SSIS SCD handling component TableDifference is in beta, I am searching for beta testers and some help in writing an installer for it. You will find all the informations at www.sqlbi.eu. The final release of the component will be as usually freely available with sources as soon as it has been tested enough.
|
-
Today I was optimizing a package that handles a very common topic: process some rows and send all them to a flow but, for some of them, make additional processing. A simple Multicast and a Conditional Split solve the problem, the picture explains it better than my words: Now, I was wondering why we should duplicate one million rows to two millions and then trash half of them away. I thought a better solution was to separate the rows before and duplicate only half a million, bringing them together again with the other half million from the other flow. So I tried this solution, thas does exactly the same but does not trash anything. We are not in the age of consumerism and we do not like to trash anything, don't we? Well, even if I thought it would be a better solution, it is not. The first one (with one million rows of 1K each) takes approximately 13 seconds to run, the second one with the same data 16 seconds, that is 23% slower than before. So, the conclusion is that SSIS still lives in the age of consumerism, the process of UNION ALL takes a very long time to process and makes trashing a better solution. There are good explanations for this (UNION ALL does a lot more work than Multicast does) but I did not think that a 23% degradation would occur. The lesson is: it is best to trash then try to recycle, at least with data with SSIS. :)
|
-
Visio database models have column descriptions, SQL Server has extended properties where to store them but, even if they share the same producer, it seems that Visio is unable to store column descriptions in extended properties so that other tools can gather the same information easily and produce a decent documentation of a database. After some browsing on the web, I was unable to find a tool that let Visio save the table and column descriptions in SQL Server, so I decided it was time to code it by myself. First you need to generate the DDL script of the database having DDL Script commens ON for all the columns (which can be configured in the driver options) as in the following picture: In the DDL file, VISIO adds header of comments to each table like this: /* Create new table "Sales.Dim_Customers". */
/* "Sales.Dim_Customers" : Contains all the customer with various attributes used */
/* to analyze them. Does not contain (bla bla bla) */
/* "ID_Customer" : Primary key of the table */
/* "CustomerCode" : Customer code as seen in the OLTP database */
/* "ID_Geography" : Key in Dim_Geography. It is not shown to the end user */
/* "MaritalStatus" : Description of the marital status (Single / Married) */
/* "Gender" : Description of the gender (Male, Female) */
... ... ...
create table "Sales.Dim_Customers" (
"ID_Customer" int identity not null,
"CustomerCode" nvarchar(15) not null,
"ID_Geography" int not null,
"MaritalStatus" nvarchar(20) not null,
"Gender" nvarchar(20) not null)
I have written a simple parser (GenerateExtendedAttrib) that looks for comments in this form in a DDL file (starting with "Create new table", then checks for column names and so on) and will finally generate the sp_addextendedproperty calls to define all the column and table descriptions.
The code fragment above generates this:
EXEC sys.sp_addextendedproperty
@name=N'MS_Description',
@value=N'Contains all the customer with various attributes used to analyze...' ,
@level0type=N'SCHEMA',
@level0name=N'Sales',
@level1type=N'TABLE',
@level1name=N'Dim_Customers';
EXEC sys.sp_addextendedproperty
@name=N'MS_Description',
@value=N'Primary key of the table' ,
@level0type=N'SCHEMA',
@level0name=N'Sales',
@level1type=N'TABLE',
@level1name=N'Dim_Customers',
@level2type=N'COLUMN',
@level2name=N'ID_Customer';
You can call the utility with "/Append" in order to have the property definitions appended to the original file (which, in turn, can be run to have the database created) or you can call it without parameters, in this case the code is written to the console and you can redirect it wherever you want.
The tool can be downloaded at www.sqlbi.eu and is provided with sources (it is indeed a very simple yet effective gadget) so you can adapt it to whatever your needs are.
Have fun and, if you make any change to the code that might be useful, send me a note so I can update the public release.
|
-
If you have two columns in a table that "remember" their old values in two different historical tables, as it is the case in AdventureWorks with EmployeePayHistory and EmployeeDepartmentHistory, you may need (or just wonder how) to merge these two tables into only one historical table that will keep track of both columns. This has been my "problem of the day" and, after some thoughts, I came up with a pattern of solution that I'd like to share with you both to see if you have a better solution and to have the code at hand when I'll need later. :) Before using the real table, I made up a simple test case with a Product table with only two fields a couple of variation tables (VariationA and VariationB): --
-- Sample table Products, only two columns A and B. The current value of the row is (D, 4)
--
CREATE TABLE Products (
A CHAR (1),
B INT
)
INSERT INTO products (A, B) VALUES ('D', 4)
--
-- Variations for column A
--
-- ------------------
-- Year Old Value
-- ------------------
-- 2001 A
-- 2002 B
-- 2003 C
-- 2005 D
--
CREATE TABLE VariationA (
Position INT,
OldValue VARCHAR (5),
ChangeDate DATETIME )
INSERT VariationA (Position, OldValue, ChangeDate) VALUES (1, 'A', '01/01/2001')
INSERT VariationA (Position, OldValue, ChangeDate) VALUES (2, 'B', '01/01/2002')
INSERT VariationA (Position, OldValue, ChangeDate) VALUES (3, 'C', '01/01/2003')
INSERT VariationA (Position, OldValue, ChangeDate) VALUES (4, 'D', '01/01/2005')
--
-- Variations for column B
--
-- ------------------
-- Year Old Value
-- ------------------
-- 2000 1
-- 2003 2
-- 2004 3
--
CREATE TABLE VariationB (
Position INT,
OldValue INT,
ChangeDate DATETIME )
INSERT VariationB (Position, OldValue, ChangeDate) VALUES (1, 1, '01/01/2000')
INSERT VariationB (Position, OldValue, ChangeDate) VALUES (2, 2, '01/01/2003')
INSERT VariationB (Position, OldValue, ChangeDate) VALUES (3, 3, '01/01/2004')
The query that mixes all these variations into a single table is pretty easy but it took me some time to discover. My solution is to detect what was the value of "B" when a variation in "A" happened and vice versa. As the values stored are "old" values, we know that the value of A at a certain date is the value stored in the first variation for A AFTER that date. The only special case is the current time: if no variation record is found then we know that the value to use is the current value of the record in the product table. The same is obviously true for B.
I ended up with this pattern query:
--
-- AllEvents contains all the events from both VariationA and VariationB,
-- and will detect, for each variation that happened, what was
-- the value of the other column at that time, building in this way the
-- merged variation list.
-- The final SELECT will return the sorted and DISTINCTed result.
-- Then final COALESCE is needed because if a value is NULL it
-- means that it should contain the "current" value of the column
--
WITH AllEvents AS (
SELECT
OldValueOfA = Events.OldValueOfA, -- Old value of A
OldValueOfB = Events.OldValueOfB, -- Old value of B
ChangeDate = Events.ChangeDate -- Date of change
FROM
(SELECT
OldValueOfA = OldValue,
OldValueOfB = (SELECT TOP 1 OldValue
FROM VariationB V
WHERE V.ChangeDate >= VariationA.ChangeDate
ORDER BY ChangeDate),
ChangeDate
FROM VariationA
UNION ALL
SELECT
OldValueOfA = (SELECT TOP 1 OldValue
FROM VariationA V
WHERE V.ChangeDate >= VariationB.ChangeDate
ORDER BY ChangeDate),
OldValueOfB = OldValue,
ChangeDate
FROM VariationB
) Events
)
SELECT DISTINCT
ChangeDate = ChangeDate,
OldValueOfA = COALESCE (OldValueOfA, (SELECT A FROM Products)),
OldValueOfB = COALESCE (OldValueOfB, (SELECT B FROM Products))
FROM
AllEvents
ORDER By ChangeDate
Clearly, i could have written some VB code in SSIS to solve the same problem. Nevertheless, having a pattern query at hand is useful because the pattern can be easily adapted to any real world situation with only SQL Management Studio at hand. The following code is the implementation of the pattern for AdventureWorks to merge Department and Payment history for Employees:
WITH AllEvents AS (
SELECT
EmployeeID = Events.EmployeeID,
OldValueOfRate = Events.OldValueOfRate,
OldValueOfDep = Events.OldValueOfDep,
ChangeDate = Events.ChangeDate
FROM
(SELECT
EmployeeID = EmployeeID,
OldValueOfRate = Rate,
OldValueOfDep = (SELECT TOP 1 DepartmentID
FROM HumanResources.EmployeeDepartmentHistory V
WHERE V.StartDate >= PayHistory.RateChangeDate
AND V.EmployeeID = PayHistory.EmployeeID
ORDER BY V.StartDate),
ChangeDate = RateChangeDate
FROM HumanResources.EmployeePayHistory PayHistory
UNION ALL
SELECT
EmployeeID = EmployeeID,
OldValueOfRate = (SELECT TOP 1 Rate
FROM HumanResources.EmployeePayHistory V
WHERE V.RateChangeDate >= DepHistory.StartDate
AND V.EmployeeID = DepHistory.EmployeeID
ORDER BY V.RateChangeDate),
OldValueOfDep = DepHistory.DepartmentID,
ChangeDate = StartDate
FROM HumanResources.EmployeeDepartmentHistory DepHistory
) Events
)
SELECT DISTINCT
EmployeeID = EmployeeID,
ChangeDate = ChangeDate,
OldValueOfDep = COALESCE (
OldValueOfDep,
(SELECT DepartmentID
FROM HumanResources.EmployeeDepartmentHistory DepHistory
WHERE EndDate IS NULL
AND DepHistory.EmployeeID = AllEvents.EmployeeID)),
OldValueOfRate = COALESCE (
OldValueOfRate,
(SELECT TOP 1 Rate
FROM HumanResources.EmployeePayHistory PayHistory
ORDER BY RateChangeDate))
FROM
AllEvents
ORDER By EmployeeID, ChangeDate
The code can be easily checked filtering for a single EmployeeID (4 is a good candidate for this check). Please note that - in this case - the "current value" is kept in the history table and so the final COALESCE is a bit more intricated but still pretty easy both to write and to understand.
If you have a better pattern for this kind of situation or any comments on it... I'll be glad to read your comments.
|
-
A ranged dimension is a dimension that is used to have a discrete view of a continuous measure. A good example of this is the analysis of amount sold per order. In AdventureWorks we have, for each line of an order, the amount and quantity sold. We would like to divide orders in three groups (HIGH, MEDIUM, LOW) based on the total of the order. This is a very frequent kind of analysis that is used to determine how much money customers are willing to spend in a single order and may be very useful to check whether there is any kind of relationship between the characteristics of a customer and his/her medium spending capability. In SQL it is very easy to get this analysis: WITH
OrderTotals AS (
SELECT
SalesOrderId = SalesOrderId,
OrderValue = SUM (OrderQty * UnitPrice)
FROM
Sales.SalesOrderDetail
GROUP BY
SalesOrderId),
OrdersRanged AS (
SELECT
SalesOrderId,
OrderValue,
OrderRange = CASE
WHEN OrderValue <= 1000 THEN 'LOW'
WHEN OrderValue <= 10000 THEN 'MEDIUM'
ELSE 'HIGH'
END
FROM
OrderTotals)
SELECT
OrderRange = OrderRange,
OrderNumber = COUNT (*),
OrderValue = SUM (OrderValue)
FROM
OrdersRanged
GROUP BY
OrderRange
The query, executed on AdventureWorks, leads to this result:
| OrderRange |
OrderNumber |
OrderValue |
| HIGH |
1,827 |
75,745,964.3559 |
| MEDIUM |
12,574 |
31,075,706.5634 |
| LOW |
17,064 |
3,552,218.3941 |
This SQL query is correct but what we really want to do is to create a new OrderRange dimension in our BI solution to let the user further analyze the characteristics of the orders. The granularity of this attribute is at the Order level and so we have to add an ID_ValueRange column in the fact table of orders.
If we define the Dim OrderRange dimension we will get something like it:
- ID_OrderRange Primary Key, usually INT IDENTITY
- MinimunValue Left value for the range
- MaximunValue Right value for the range
- OrderRange Textual description of the range
At the end of the processing, our relational database will look like this:
This solution is good when we have a very clear definition of what the ranges are during design time. This often does not happen in the real world. More likely, we will let the user define ranges through a table in the configuration database and we will define several attributes and hierarchies in the dimension to analyze how orders are ranged. All these attributes and hierarchies should be derived from the configuration database and processed during the ETL phase.
Ranges are dimensions that have a high variation rate over time and – normally – they can be defined only after the data is available because, before then, users do not have a clear understanding of what kind of ranges can be useful or not.
This leads to a very annoying problem i.e. the ranges can be determined only at the cube process time and not at ETL time. In other words we are not able to compute the ID_OrderRange key used to join the dimension to the fact table because the ranges change over time due to the user change of mind. Please note that in this case it is the dimension that changes its meaning, not the facts.
Computing the ranges during cube process time will lead us to define queries for the fact table that are complicated and uselessly consume processor time. So the challenge is to avoid this situation by means of changing the dimensional model.
The solution that we recommend is to define ranged dimensions with a different surrogate key. Even if we are not able to define the exact ranges, we can normally define the maximum granularity that will be used to define ranges. In the example the granularity can be 1,000, but it can be more cleverly set to 100 in order to gain some flexibility at the lowest level of the range. Supposing that 100,000 is the maximum value of an order (at least from the DSS analyst point of view), we will have a maximum range of values that starts at 100 and ends at 100,000 with a step of 100, leading to a range of only 1,000 different values. This will be the Dim OrderRange dimension.
Once we have defined the granularity we can redefine the Dim_OrderRange assigning to ID_OrderRange a simple INTEGER field that represents the value of FLOOR (OrderRange / 100). The same value, computed for each column of the order fact table, will be the ID used to join the fact table to the dimension.
The dimension will contain all the values from 0 to the maximum order value divided by the granularity and so it will be larger but still contain only a few thousand rows. In our example the table switched from 3 values to 1,000 but, from SSAS point of view, it is still a very small dimension. Moreover, as the distinct values of the attributes are still three, the aggregation will work very well leading to optimum performances.
The situation, with ranged dimensions, will be this:
Using the ranged dimension pattern you will be able to change all the attributes of the ranged dimension without having to worry about changes. The fact table will remain valid unless you decide to change the granularity.
Granularity is not subject to changes because it can be normally set to a value clearly defined by the user and there is very rarely the need of updating it.
Clearly ranged dimensions are a variation of standard dimension because their surrogate key has a clear meaning while the surrogate keys, in Kimball’s methodology, should not have any meaning at all. Nevertheless, we believe that in specific situation (and ranged dimension are among those) the “no meaning at all” constraint can be relaxed in order to get maximum flexibility in the final solution the customer will use.
Another very common situation where ranged dimension are useful is in the ranging of time duration. When handling time you can easily set the granularity to days, months, years or seconds, depending on the minimum and maximum values that the duration will have. Once done it you can compute the duration using the correct granularity and define any ranged dimension on it.
|
-
Today I ran into a nice problem with SQL Server fancy way of sorting GUIDS. I know, there should be no reason at all to sort GUID columns but, if what you need is run two tables side by side to check whether they are aligned or not (this is the task of TableDifference) and a GUID is the best key you have, then you will end up sorting GUIDS. The question now is: how do you sort GUIDS? The fact is that SQL Server and .NET give different answers. Let's have a look at it in deeper detail. I was using .NET guids to read data from SQL Server and then use .Compare to check for the sorting, supposing that a uniqueidentifier in SQL is sorted the same way a guid is sorted in .NET. This is not true, I discovered here that you should use SqlGuid instead. Let's not spend too many words on why having two sort algorithm may be clever or not, what is frightening about that article is the fact that it says that SqlGuid uses only the last six bytes to sort data. If it's working this way then we will end up with a sorting algorithm that does not really sorts. Seemed to be too stupid, so I double checked it and ran into this that says something more interesting about how GUIDS are sorted. Anyway, the algorithm is not still clear. Will I end up with a completely sorted list if I issue an ORDER BY on a uniqueidentifier column? and, how will this data be sorted? After all, the best way is always that of trying, so I end up with this very simple and funny query (that you can use to show friends that bits are still here, in 2007!): With UIDs As (-- 0 1 2 3 4 5 6 7 8 9 A B C D E F
Select ID = 1, UID = cast ('00000000-0000-0000-0000-010000000000' as uniqueidentifier)
Union Select ID = 2, UID = cast ('00000000-0000-0000-0000-000100000000' as uniqueidentifier)
Union Select ID = 3, UID = cast ('00000000-0000-0000-0000-000001000000' as uniqueidentifier)
Union Select ID = 4, UID = cast ('00000000-0000-0000-0000-000000010000' as uniqueidentifier)
Union Select ID = 5, UID = cast ('00000000-0000-0000-0000-000000000100' as uniqueidentifier)
Union Select ID = 6, UID = cast ('00000000-0000-0000-0000-000000000001' as uniqueidentifier)
Union Select ID = 7, UID = cast ('00000000-0000-0000-0100-000000000000' as uniqueidentifier)
Union Select ID = 8, UID = cast ('00000000-0000-0000-0010-000000000000' as uniqueidentifier)
Union Select ID = 9, UID = cast ('00000000-0000-0001-0000-000000000000' as uniqueidentifier)
Union Select ID = 10, UID = cast ('00000000-0000-0100-0000-000000000000' as uniqueidentifier)
Union Select ID = 11, UID = cast ('00000000-0001-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 12, UID = cast ('00000000-0100-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 13, UID = cast ('00000001-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 14, UID = cast ('00000100-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 15, UID = cast ('00010000-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 16, UID = cast ('01000000-0000-0000-0000-000000000000' as uniqueidentifier)
)
Select * From UIDs Order By UID, ID
The result is simple, the ID's are completely reversed even if they appear as second argument in the SORT request. This means that the evaluation is carried on in byte groups this way:
- 0..3 are evaluated in left to right order and are the less important, then
- 4..
| | |