|
|
|
|
-
Stop what you're doing and go to http://crisiscommons.org/ and see how you can use your geekiness to help the people in Haiti. So far there are 8 cities in the United States holding CrisisCamps for Haiti this Saturday, January 23. There's a lot going on and more to be done. I'm sure there's a place for you. If you can't travel to one of the CrisisCamps, cash is always helpful. Low on cash? Have a garage sale. Have any gold coins you received as a gift or bought as investment? Gold prices are relatively high - it's a great time to sell. Maybe you can start a CrisisCamp in your city.
Got another idea? Blog it, tweet it, email your friends, reply here, let's do something!
|
-
When you only have a few SQL Server Agent jobs, it’s easy to keep track of and make sense of them. But not so easy to understand what’s going on when there are many jobs starting around the clock and running varying lengths of time. Even if the jobs are neatly organized, over a period of time entropy sets in and you have an incomprehensible mess. Reading text to find starting times and run times isn’t fun – it’s just plain tedious. I’ve gotten some relief by using a free tool called SQLjobvis, available at http://www.sqlsoft.co.uk/sqljobvis.php. Go take a look. There’s a screen capture that shows you what it does for you. Thanks to my fellow MVPs, I found out about this tool today. It made my day go much more smoothly and made it easy for my coworkers to get the big picture on what’s really going on with SQL Agent jobs. Wouldn’t it be great if Microsoft offered a graphical UI for SQL Server Agent? I’d like Microsoft to provide graphical editing of SQL Agent job schedules. Please vote on this Connect item: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=524854
|
-
There are several things I do to get each New Year’s Day off to a good start. All of my machines have been to Windows Update today and are updated and defragmented. My DBA toolkit on a stick is completely new for 2010. When I went to http://portableapps.com and downloaded the latest full suite, it was a pleasant surprise to find out that the portable version of Open Office finally works on 64-bit systems. These are my recommendations for portable applications you might want to add: - 7-Zip portable it also opens iso files, which can be very handy
- FileZilla portable you never know when you might need to FTP something
- FireFox portable since it’s portable, you don’t install it
- Java portable Open Office needs this if the machine you’re using doesn’t have a JRE
- Notepad++ portable excellent freeware editor with lots of features
- winMd5Sum portable great for when you need to confirm that your download wasn’t corrupted
- WinMerge portable excellent comparison tool for comparing and merging files
- Foxit Reader portable small, fast pdf reader better than the Sumatra reader that comes with the suite
I also downloaded the latest version of PDF-XChange Viewer portable and put it on my machines and my USB sticks. I do not use Adobe Acrobat Reader at all. It’s too slow, too bloated, and it doesn’t have enough features. Plus I don’t like installing more than the absolute bare minimum number of applications. I created an Open with file association for pdf files and PDF-XChange Viewer portable. This lightweight product gives me the ability to annotate pdf files. I use it to fill out rebate forms, among other things. Sometimes it doesn’t render correctly and I switch to Foxit. You might want to associate pdf files with Foxit instead of PDF-XChange and use PDF-XChange Viewer only for annotating pdfs. I went to https://www.annualcreditreport.com/cra/index.jsp to get a free credit report. This is the Federal Trade Commission’s site – all of the other sites are commercial sites trying to sell you something. You must make your credit report request from that site only. Be careful about what you click because the credit reporting company is trying to sell you something beyond the free credit report. Since I’m married with joint accounts, I get a total of 6 free credit reports per year. I do it like this: - January 1 – I request a free credit report from Equifax
- March 1 – my wife requests a free credit report from Equifax
- May 1 – I request a free credit report from Experian
- July 1 – my wife requests a free credit report from Experian
- September 1 – I request a free credit report from TransUnion
- November 1 – my wife requests a free credit report from TransUnion
Getting an early start on taxes is a good idea, so I downloaded free tax preparation software from TaxACT. It even has free E-file. Really free is nice. I installed it into a virtual machine in keeping with my habit of installing the bare minimum on my machines. I unsubscribed from vendor email that was received today. That’s something I’ve been doing almost every day for the past several weeks. My inbox is no longer cluttered. Now that’s what I call a great way to start off the new year!
|
-
Many of you use Virtual PC and Virtual Server, but not many people are aware of some new hotfixes and security updates released in 2009. Before getting to those, the latest versions of those two products are Virtual PC 2007 Service Pack 1 and Virtual Server 2005 R2 Service Pack 1. You first need to be at that level. Here is a hotfix rollup for Virtual PC 2007 SP1:
http://support.microsoft.com/kb/958162 released February 20, 2009
This is a security update for Virtual PC 2004, Virtual PC 2007, Virtual PC 2007 SP1, and Virtual Server 2005 R2 SP1:
http://support.microsoft.com/kb/969856
If you are running Windows 7 on a machine that doesn't support hardware assisted virtualization, you'll need to use Virtual PC 2007 SP1. It works just fine. You don't have to run Windows 7 Virtual PC. Of course, if you can run the Windows 7 version, you should. Microsoft has a utility to let you know if your machine supports hardware assisted virtualization: http://www.microsoft.com/downloads/details.aspx?FamilyID=0EE2A17F-8538-4619-8D1C-05D27E11ADB2&displaylang=en
|
-
In the holiday season, white elephant gift exchanges are somewhat popular in the workplace. This is a really great definition from Wikipedia: "the term white elephant refers to a gift whose maintenance cost exceeds its usefulness." Now that definitely describes some databases I've seen!
We can do a white elephant gift exchange of worst schemas. Here's my white elephant constructed of snippets from real production databases:
create table [table] (
[column] varchar(1) --think Hungarian notation without abbreviations
, [count] decimal(10,0) --integer value that might someday be 3 billion
, replicationId varchar(40) --this was for storing a uniqueidentifier
, acomodation varchar(50) --can we add spellcheck to SSMS, please?
, seperationStatus varchar(50) --odd how many columns match the default length
, worseThanEAV xml
);
Do you hate brackets? I do! I really hate brackets when people use them to allow keywords and reserved words to be used for other purposes. I can see someone specifying decimal(10,0) instead of int, but other evidence made me suspect that the person didn't understand data types at all. Object names that are misspellings of ordinary words tend to cause the literate members of the staff to make errors. Don't you wonder about the design of a database in which most varchar columns are the same as the default width?
I really liked the varchar(1) that I saw in a table definition. I can't wait to hear the justification for that.
There was a recent blog post on EAV and quite a bit of discussion on that topic. The xml data type is a good companion to discussions about EAV. If you just can't quite finish your data model, be it third normal form or EAV (notice that the word normal is never used with the abbreviation EAV - ponder than one for a bit), you can always organize your data as XML and cram it into a table.
|
-
The Windows 7 Problem Steps Recorder is yet another reason to upgrade to Windows 7. It records what the user actually did, not what the user claims to have done. I used it to record the attached session of running an invalid T-SQL query in SSMS. When the recording was made, there was a step I added that’s not completely obvious from the recorded output. Notice that the query I executed was highlighted. I did a Control-C to copy the query into the past buffer. The Control-C event was recorded as User keyboard input in "Microsoft SQL Server Management Studio" [... Ctrl ... Ctrl-C]. The query was pasted into the Comment dialog box (not shown in the output – you’ll see it when you run the tool and select Add Comment), which left a textual version of the query in the record. I recommend using this approach when capturing a session. This provides the person analyzing the report a textual copy of the query instead of just the screen capture version of it. Figure 1. One of the screen captures from the Problem Steps Recorder. To start the Problem Steps Recorder, type psr in the Search programs and files box to find the psr.exe program. Run the psr.exe program. Figure 2. Finding and running the PSR. Complete output is in the attached file. Try it out – it’s very simple to use. Once you use it, I’m sure you’ll recommend it to others.
|
-
The SQL Server Riadk and Health Assessment Program, otherwise known as SQLRAP, isn't well known. It hasn't been blogged about much. Today I'm analyzing SQLRAP output and developing a plan to implement changes it suggests. You can read more about the program here: http://www.microsoft.com/downloads/details.aspx?familyid=8D54DB4D-3232-4CF8-8BA5-E80557835421&displaylang=en The SQLRAP service is when you pay for a premier support engineer to come to your site and run the SQLRAP tools to collect metrics about your SQL Servers. A verbal description of what's done doesn't do justice to SQLRAP. The real value of SQLRAP is in the presentation. DMVs, logs, and Perfmon already provide us with lots of data - often too much. SQLRAP turns the raw data into information we can easily use. SQLRAP is a service I'm comfortable recommending. Here’s a screenshot of a small portion of the output: 
|
-
When reviewing a client's database, I suspected there might be indexes on low cardinality (i.e., low "uniqueness") columns. Typing COUNT(DISTINCT colName)) became tedious quickly. Time for a script. Oops, wait, in deference to proponents of test driven development, time for a test case and then time for a script. Here's the test case:
CREATE TABLE dbo.CardinalTest(
a bigint NULL,
b binary(50) NULL,
c bit NULL,
d char(10) NULL,
e date NULL,
f datetime NULL,
g datetime2(7) NULL,
h datetimeoffset(7) NULL,
i decimal(18, 0) NULL,
j float NULL,
k geography NULL,
l hierarchyid NULL,
m image NULL,
n int NULL,
o money NULL,
p nchar(10) NULL,
q ntext NULL,
r numeric(18, 0) NULL,
s nvarchar(50) NULL,
t nvarchar(max) NULL,
u real NULL,
v smalldatetime NULL,
w smallint NULL,
x smallmoney NULL,
y sql_variant NULL,
z text NULL,
aa time(7) NULL,
bb timestamp NULL,
cc tinyint NULL,
dd uniqueidentifier NULL,
ee varbinary(50) NULL,
ff varbinary(max) NULL,
gg varchar(50) NULL,
hh varchar(max) NULL,
ii xml NULL
);
The test case has a column for every data type in SQL Server 2008.
A cursor loop is need to iterate through all of the tables. Notice that instead of iterating through INFORMATION_SCHEMA.TABLES, a different approach was used in case you want to limit the analysis to the largest n tables. If you want to do this, add a TOP n clause. If you have a lot of columns, you can exceed the 4000 character limit of the @TSQL variable. If you don't use Unicode characters for table and column names, you can change @TSQL to VARCHAR(8000).
Now for the script:
DECLARE @TABLE_SCHEMA NVARCHAR(128);
DECLARE @TABLE_NAME NVARCHAR(128);
DECLARE @COLUMN_NAME NVARCHAR(128);
DECLARE @TSQL NVARCHAR(4000);
DECLARE tableCursor CURSOR FAST_FORWARD FOR
/* The following query with a TOP n is useful by itself to show you the
largest tables in your database. */
/* This query works for SQL Server 2005 and above. */
SELECT OBJECT_SCHEMA_NAME(object_id) AS schemaName -- maybe add a TOP n
, OBJECT_NAME(object_id) AS tableName
FROM sys.dm_db_partition_stats
WHERE index_id < 2
AND OBJECTPROPERTY(object_id,'IsUserTable') = 1
ORDER BY sys.dm_db_partition_stats.row_count DESC;
/* This query works for SQL Server 2000. */
--SELECT TABLE_SCHEMA, TABLE_NAME
--FROM INFORMATION_SCHEMA.TABLES t
--WHERE T.TABLE_TYPE = 'BASE TABLE';
OPEN tableCursor;
SET ARITHABORT OFF -- prevent divide by zero errors if table has no rows
SET ANSI_WARNINGS OFF -- or if all values are NULLs
-- Perform the first fetch.
FETCH NEXT FROM tableCursor INTO @TABLE_SCHEMA, @TABLE_NAME;
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE columnCursor CURSOR FAST_FORWARD FOR
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = @TABLE_SCHEMA
AND TABLE_NAME = @TABLE_NAME
AND DATA_TYPE NOT IN ('geography','geometry','image','ntext','text','xml');
OPEN columnCursor;
SET @tsql = 'select ''' + @TABLE_SCHEMA + ''' as ''schema'', ''' + @TABLE_NAME + ''' as ''table'', count(*) as nrows';
-- Perform the first fetch.
FETCH NEXT FROM columnCursor INTO @COLUMN_NAME;
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
SET @tsql = @tsql + ', cast(cast(count(distinct(' + QUOTENAME(@COLUMN_NAME) + ')) as decimal(15,3))/cast(count(*) as decimal(15,3)) as decimal(4,3)) as ' + QUOTENAME(@COLUMN_NAME);
-- This is executed as long as the previous fetch succeeds.
FETCH NEXT FROM columnCursor INTO @COLUMN_NAME;
END;
SET @tsql = @tsql + ' FROM ' + QUOTENAME(@TABLE_SCHEMA) + '.' + QUOTENAME(@TABLE_NAME);
CLOSE columnCursor;
DEALLOCATE columnCursor;
EXEC sp_executesql @tsql;
FETCH NEXT FROM tableCursor INTO @TABLE_SCHEMA, @TABLE_NAME;
END;
CLOSE tableCursor;
DEALLOCATE tableCursor;
Here's some sample output:
schema table nrows BusinessEntityID CreditCardID ModifiedDate
------ ---------------- ----- ---------------- ------------ ------------
Sales PersonCreditCard 19118 1.000 1.000 0.059
As you can see, both BusinessEntityID and CreditCardID have the highest possible cardinality (i.e., every value is unique) and are suitable for indexing. ModifiedDate values have low cardinality (low selectively), so it wouldn't make sense to create an index on the ModifiedDate column.
Cardinality is not a term that's common in SQL Server literature, but it is common in Oracle literature. I suspect that Oracle people don't actually know what cardinality is. They do know that whenever somebody says "the indexes were created based on cardinality" that they must knowingly nod their heads in agreement. But internally they feel like they're back in college hoping that the professor doesn't call on them. So, the next time some Oracle people talk smack about SQL Server, nonchalantly throw out cardinality in your conversation just to make them squirm.
The script was written for use at a SQL Server 2005 customer. If SQL Server 2008 is available, take a look at this SSIS feature described here: http://sqlblog.com/blogs/eric_johnson/archive/2009/09/11/ssis-2008-data-profiling-task.aspx
|
-
A quick review of a customer database revealed many inappropriate maximum lengths on character columns. When dealing with fixed width columns, grossly oversized columns negatively impact performance because fewer rows fit into a page, which means more pages need to be read. Here's a query you can run to compare actual data lengths to maximum defined column widths.
DECLARE @TABLE_SCHEMA NVARCHAR(128);
DECLARE @TABLE_NAME NVARCHAR(128);
DECLARE @COLUMN_NAME NVARCHAR(128);
DECLARE @PARMS NVARCHAR(100);
DECLARE @DATA_TYPE NVARCHAR(128);
DECLARE @CHARACTER_MAXIMUM_LENGTH INT;
DECLARE @MAX_LEN NVARCHAR(10);
DECLARE @TSQL NVARCHAR(4000);
DECLARE DDLCursor CURSOR FAST_FORWARD FOR
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME IN (SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'BASE TABLE')
AND DATA_TYPE IN ('char','nchar','varchar','nvarchar')
AND CHARACTER_MAXIMUM_LENGTH > 1
OPEN DDLCursor;
SET @PARMS = N'@MAX_LENout nvarchar(10) OUTPUT';
CREATE TABLE #space (
TABLE_SCHEMA NVARCHAR(128) NOT NULL
, TABLE_NAME NVARCHAR(128) NOT NULL
, COLUMN_NAME NVARCHAR(128) NOT NULL
, DATA_TYPE NVARCHAR(128) NOT NULL
, CHARACTER_MAXIMUM_LENGTH INT NOT NULL
, ACTUAL_MAXIMUM_LENGTH INT NOT NULL
);
-- Perform the first fetch.
FETCH NEXT FROM DDLCursor
INTO @TABLE_SCHEMA, @TABLE_NAME, @COLUMN_NAME, @DATA_TYPE, @CHARACTER_MAXIMUM_LENGTH;
-- Check @@FETCH_STATUS to see if there are any more rows to fetch.
WHILE @@FETCH_STATUS = 0
BEGIN
SET @tsql = 'select @MAX_LENout = cast(max(len(isnull('
+ QUOTENAME(@COLUMN_NAME) + ',''''))) as nvarchar(10)) from '
+ QUOTENAME(@TABLE_SCHEMA) + '.' + QUOTENAME(@TABLE_NAME);
EXEC sp_executesql @tsql, @PARMS, @MAX_LENout = @MAX_LEN OUTPUT;
IF CAST(@MAX_LEN as int) < @CHARACTER_MAXIMUM_LENGTH -- not interested if lengths match
BEGIN
SET @tsql = 'insert into #space values ('''
+ @TABLE_SCHEMA + ''','''
+ @TABLE_NAME + ''','''
+ @COLUMN_NAME + ''','''
+ @DATA_TYPE + ''','
+ CAST(@CHARACTER_MAXIMUM_LENGTH as nvarchar(10)) + ','
+ @MAX_LEN + ')';
EXEC sp_executesql @tsql;
END;
-- This is executed as long as the previous fetch succeeds.
FETCH NEXT FROM DDLCursor INTO @TABLE_SCHEMA, @TABLE_NAME, @COLUMN_NAME, @DATA_TYPE, @CHARACTER_MAXIMUM_LENGTH;
END;
CLOSE DDLCursor;
DEALLOCATE DDLCursor;
select * from #space
|
-
I posted a suggestion for a minimally logged delete feature on Connect: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=509341. There is a legitimate need for minimal logging of deletes. Just because deletes need to be able to be rolled back in most cases doesn't invalidate the need for minimal logging at other times. Bulk deletes are necessary at times. When large amounts of data are deleted, the transaction log grows. Sometimes deletes bloat the log so much that the server stops because it doesn't have any space left for the log to grow. If a minimally logged delete is made available, I won't care if it fails and it can't roll back. I'll just issue the command again. I don't want recoverability and I certainly don't want log growth when I'm doing a bulk delete. Imagine a TRUNCATE with a where clause - that's the general concept.
|
-
Previous posts have shown how to do a parts explosion and a where used query. Another common query needed for Bill of Materials is a list of quantities required for all components in an assembly. The indented parts explosion is easily modified to produce grand totals for all parts in an assembly.
;WITH BOMcte(ComponentID, Name, PerAssemblyQty, ProductAssemblyID)
AS
(
SELECT b.ComponentID,
p.Name,
b.PerAssemblyQty,
b.ProductAssemblyID
FROM Production.BillOfMaterials AS b
INNER JOIN Production.Product p
on b.ComponentID = p.ProductID
WHERE b.EndDate IS NULL -- only retrieve components still being used
and b.ComponentID = 775 -- specify a component to explode
UNION ALL
SELECT b.ComponentID,
p.Name,
b.PerAssemblyQty,
b.ProductAssemblyID
FROM Production.BillOfMaterials as b
INNER JOIN Production.Product p
on b.ComponentID = p.ProductID
INNER JOIN BOMcte AS cte
ON b.ProductAssemblyID = cte.ComponentID
WHERE b.EndDate IS NULL -- only retrieve components still being used
)
SELECT Name, sum(PerAssemblyQty) as Total
FROM BOMcte
group by Name;
Name Total ----------------------------- ----- Adjustable Race 1.00 BB Ball Bearing 10.00 Bearing Ball 10.00 Blade 2.00 Chain 1.00 Chain Stays 2.00 Chainring 3.00 Chainring Bolts 3.00 Chainring Nut 3.00 Cone-Shaped Race 2.00 Crown Race 1.00 Cup-Shaped Race 2.00 Decal 1 2.00 Decal 2 1.00 Down Tube 1.00 Fork Crown 1.00 Fork End 2.00 Freewheel 1.00 Front Brakes 1.00 Front Derailleur 1.00 Front Derailleur Cage 1.00 Front Derailleur Linkage 1.00 Guide Pulley 1.00 Handlebar Tube 1.00 Head Tube 1.00 Headset Ball Bearings 8.00 HL Bottom Bracket 1.00 HL Crankarm 2.00 HL Crankset 1.00 HL Fork 1.00 HL Grip Tape 40.00 HL Headset 1.00 HL Hub 2.00 HL Mountain Frame - Black, 38 1.00 HL Mountain Front Wheel 1.00 HL Mountain Handlebars 1.00 HL Mountain Pedal 1.00 HL Mountain Rear Wheel 1.00 HL Mountain Rim 2.00 HL Mountain Seat Assembly 1.00 HL Mountain Seat/Saddle 1.00 HL Mountain Tire 2.00 HL Nipple 72.00 HL Shell 3.00 HL Spindle/Axle 2.00 Keyed Washer 1.00 Lock Nut 19 1.00 Lock Ring 1.00 Lower Head Race 1.00 Metal Bar 1 1.00 Metal Bar 2 1.00 Metal Sheet 2 3.00 Metal Sheet 3 1.00 Metal Sheet 4 1.00 Metal Sheet 5 3.00 Metal Sheet 6 2.00 Metal Sheet 7 1.00 Mountain End Caps 2.00 Mountain Tire Tube 2.00 Mountain-100 Black, 38 1.00 Paint - Black 8.00 Pinch Bolt 4.00 Rear Brakes 1.00 Rear Derailleur 1.00 Rear Derailleur Cage 1.00 Reflector 4.00 Seat Lug 1.00 Seat Post 1.00 Seat Stays 4.00 Seat Tube 1.00 Spokes 72.00 Steerer 1.00 Stem 1.00 Tension Pulley 1.00 Top Tube 1.00
|
-
After having several people at SQL PASS ask me to explain what a Bill of Materials is, it seemed like a good idea for a blog post. A Bill of Materials, otherwise known as a BOM (don’t read this aloud in an airport lounge), is a listing of all of the components (i.e., parts) needed to assemble (i.e., make) something. Let’s start with a picture of our inventory of components.
Figure 1. Inventory of parts. (note: the colors were chosen because they are visible to people with any type of color vision deficit)
The inventory consists of:
32 1x1 yellow bricks
4 1x3 black bricks
5 1x4 blue bricks
These components are assembled into what are called assemblies. There are enough components to makes five assemblies. These assemblies form the letters E, F, H, I, and L. Here’s a picture of our five assemblies and the one leftover component.
Figure 2. Parts from Figure 1 assembled into five assemblies.
Several things should now be obvious. Although the inventory is the same in Figures 1 and 2, the entropy of Figure 2 is much lower. The components are subordinate to the assembly containing them – there’s a hierarchical relationship between an assembly and its components. If you focus on the letters H and I, you realize that another level can be added to the hierarchy. The two assemblies H and I can be used to assemble the word HI. When an assembly becomes a component in another assembly, it is called a subassembly. It doesn’t take much imagination to realize that letters can be assembled into words, words into sentences, sentences into paragraphs, paragraphs into chapters, and so on. The hierarchy can be made many, many levels deep.
There are multiple techniques for modeling a hierarchy. Organizational charts are hierarchical in nature and are implemented in the SQL Server sample databases Northwind, AdventureWorks, and AdventureWorks2008 and the Oracle sample schemas SCOTT and HR. I have read that a Bill of Materials and an organizational chart are similar. They are in the sense that they are both hierarchical. But they are also different. An employee table (e.g., Northwind Employee, AdventureWorks HumanResources.Employee, SCOTT.EMP, HR.EMPLOYEE) has one row per employee. Although a BOM will work with one row per component, it’s not just annoying, it’s a bad idea. Look at the two hierarchical displays shown below.
Figure 3. One row per part model on the left, part with quantity model on the right.
The list on the left shows what happens when one row is stored for each component. While it is necessary to know that the assembly E contains four 1x1 yellow bricks, it’s not helpful at all to see the 1x1 brick component description appear four times. Adding a quantity column to the data model results in a practical and useful BOM. Adding a quantity column to an employee table wouldn’t make sense, except when the employees are Imperial storm trooper clones.
The adjacency list pattern is the most common approach to modeling a hierarchy. It is used in the Production.BillOfMaterials table in AdventureWorks and AdventureWorks2008. A self-join is used to model the hierarchy.
Figure 4. The ParentPartId contains the PartId of the immediate parent (which is one level up in the hierarchy).
A top level assembly has a ParentPartId of NULL when an adjacency list pattern is used as the following output shows.
Figure 5. Each component (PartId values 17, 29) has a reference to the PartId 30, which is the PartId of the assembly named H.
When an assembly is used as a component of another assembly, it must have a ParentPartId instead of a NULL. The following output shows what happens when assembly H becomes a subassembly of assembly HI.
Figure 6. PartId 30 now has a ParentPartId of 47 instead of NULL.
When working with hierarchies, each item belongs to a specific level of the hierarchy. The parts lists shown in this post are examples of what is known as a Bill of Materials parts explosion. To format a parts explosion to show the hierarchical relationships, the level of each component must be known. If it isn’t stored in the table, it must be calculated from adjacency list references to the parents.
Figure 7. Parts explosion showing levels of components.
The levels of a hierarchy implemented using an adjacency list pattern are easily calculated with a recursive CTE.
;WITH BOMcte(PartId, Name, Color, Quantity, Lvl) AS ( SELECT WQ.PartId, CAST(WQ.Name as nvarchar(50)), WQ.Color, WQ.Quantity, 1 -- or use 0 if you prefer a zero-based hierarchy FROM AdjacencyList.WordQty AS WQ WHERE WQ.ParentPartId IS NULL AND WQ.PartId = 47 UNION ALL SELECT WQ.PartId, CAST(REPLICATE ('| ', Lvl) + WQ.Name as nvarchar(50)), WQ.Color, WQ.Quantity, Lvl + 1 FROM AdjacencyList.WordQty as WQ JOIN BOMcte AS cte ON WQ.ParentPartId = cte.PartId ) SELECT Lvl, Name, Color, Quantity FROM BOMcte;
Notice that the level is needed to indent the subordinate items according to their levels. The first level of the hierarchy is defined in the anchor query (hardcoded to 1 in this example) and the remaining levels are computed in the recursive query. The CAST function is required to prevent a run time error caused by the UNION ALL statement. The data types must be compatible between the two statements in a UNION. CAST is used to make the second column in the anchor query the same length as the second column in the recursive query. If they don’t match, the query fails with a Msg 240.
When the objective is to create nested results to represent the hierarchical relationships, a sort key needs to be computed.
;WITH BOMcte(PartId, Name, Color, Quantity, Lvl, ParentPartId, Sort)
AS
(
SELECT WQ.PartId,
CAST(WQ.Name as as nvarchar(50)),
WQ.Color,
WQ.Quantity,
1,
WQ.ParentPartId,
CAST('\' + WQ.Name as nvarchar(50))
FROM AdjacencyList.WordQty AS WQ
WHERE WQ.ParentPartId IS NULL
AND WQ.PartId = 47
UNION ALL
SELECT WQ.PartId,
CAST(REPLICATE('| ', Lvl) + WQ.Name as nvarchar(50)),
WQ.Color,
WQ.Quantity,
Lvl + 1,
WQ.ParentPartId,
CAST(cte.Sort + '\' + WQ.Name as nvarchar(50))
FROM AdjacencyList.WordQty as WQ
JOIN BOMcte AS cte
ON WQ.ParentPartId = cte.PartId
)
SELECT Lvl, Name, Color, Quantity, Sort
FROM BOMcte
ORDER BY Sort;
Figure 8. Parts explosion query with the computed sort key values shown.
As you can see, the sort key trick computes the full path from the root node to the node where the component is located. Ordering the output on the path (i.e., sort key) allows the indented results to appear in the desired order.
If you want to practice with AdventureWorks BOM queries yourself, I provided code samples for a parts explosion, a where used query, and an accumulated quantity query.
If you'd like to play with virtual Legos, download Lego Digital Designer from here. You can order what you built and Lego will ship you the parts.
|
-
-
AdventureWorks2008 uses the hierarchyid data type in a updated version of the HumanResources.Employee table, but not in the Production.BillOfMaterials table. Michael Coles provides the code for a clever CTE to create Production.HierBillOfMaterials from Production.BillOfMaterials in his book Pro T-SQL 2008 Programmer’s Guide (which I recommend). Another way to create the table is by using the brute force approach of running the 8,827 insert statements I provided in the attached zip file. The reason I call attention to size of the table is that the original Production.BillOfMaterials table is 2,679 rows. The value in looking at the same data implemented two different ways is to compare and contrast the differences. Once you can see and understand the differences, then you can choose which approach best suits your needs.
I’m using the Production.HierBillOfMaterials table in my SQL PASS 2009 presentation next week. It stores the bill of materials data using hierarchyid instead of an adjacency list pattern. Another good book is Paul Nielsen’s SQL Server 2008 Bible, which covers hierarchies in a chapter you can download for free.
The original Production.BillOfMaterials table has StartDate and EndDate columns to store historical data about a bill of materials. The Production.HierBillOfMaterials table uses the hierarchyid data type but does not have StartDate and EndDate columns because it doesn’t maintain history. Supporting history in a bill of materials isn’t a technical problem to be solved. It has significant business and even potential legal implications. If a customer is using widget X and orders another widget X a few years later after some of its subassemblies have changed, the customer might not be happy to see that widget X has changed. I met with people from one manufacturing company who told me they think it is better to define a bill of materials as immutable. If the bill of materials must be changed, they just create a new bill instead of creating historical versions of the same bill. Different customers will reach different conclusions on how to handle history.
After working with the same bill of materials implemented with hierarchyid and an adjacency list pattern, I prefer the adjacency list pattern for a bill of materials.
|
-
It's a pretty common thing to right-click on a stored procedure, function, or view to look at the source code. Understand that I'm specifically limiting the scope to looking at the T-SQL code, not changing it. You have the option to script the procedure, function, or view as either a CREATE script or an ALTER script. If you use ALTER, you're at risk of accidentally changing the code. If you use CREATE and you do hit F5 or execute, you'll get an error message stating that the procedure, function, or view already exists. Your code won't be changed. 
|
|
|
|
|
|