
Peter DeBetta's SQL Programming Blog

Peter DeBetta's blog about programming in SQL Server using technologies such as T-SQL, .NET, CLR, C#, VB, Visual Studio, and SQL Server Management Studio.

  • SQL vs. Oracle Live Debate (AKA Smackdown!)

    A few years ago I was speaking at a conference in Raleigh, NC, where Ted Neward and I found a fun way to promote a Java vs. .NET debate that was planned for one evening. We stood in the middle of a crowd during one of the breaks and started “arguing” about Java vs. .NET with one another. Our voices quickly rose, and we ended it by slapping each other across the face with a glove to issue a challenge. It was a great way to segue into announcing the actual debate planned for later that evening. As you can imagine, it is a lively topic, with many folks taking extreme positions.

    That brings me to another highly debatable topic coming up soon. Red Gate is sponsoring a live debate entitled “Oracle Heap Tables or SQL Server Clustered Indexes?”. Jonathan Lewis and Grant Fritchey will be “duking it out” on Thursday, June 7, 2012 at 11:00 AM EDT (17:00 CEST, 16:00 BST, 10:00 CDT, 08:00 PDT).

    To read more about the event, please visit: http://www.red-gate.com/products/sql-development/sql-source-control/webinar-oracle-heaped-tables-sql-clustered-indexes.

    To register for the event, please visit: https://www2.gotomeeting.com/register/270039946

    The live webinar is free of charge, but spaces are limited, so register soon!

  • Extracting GPS Data from JPG files

    I have been very remiss in posting lately. Unfortunately, much of what I do now involves client work that I cannot post. Fortunately, someone asked me how he could produce a formatted (e.g., tab-delimited) list of image files along with the GPS data from those files. He also added the constraints that it could not require any new software (company security) and had to be scriptable.

    I did some searching around, and found some techniques for extracting GPS data, but was unable to find a complete solution. So, I did a little PowerShell programming this weekend and came up with this script (below). It is not the most elegant of solutions, but it worked as expected.

    Something of note: I was first led down the path of searching for GPS data in the EXIF by looking for the property with an ID of 34853, but none of my files that appeared to have GPS data (according to file properties in Windows 7) contained an EXIF property with this ID. So I did a little spelunking and found that the following IDs are also used for GPS data:

    Image Property ID   Size (bytes)   Description
    1                   2              Latitude North or South value; contains the literal “N” or “S”
    2                   24             Latitude degrees, minutes, and seconds; each value expressed as a rational of two 4-byte integers
    3                   2              Longitude East or West value; contains the literal “E” or “W”
    4                   24             Longitude degrees, minutes, and seconds; each value expressed as a rational of two 4-byte integers
    5                   2              Altitude reference; 0 = above sea level, 1 = below sea level
    6                   8              Altitude in meters; expressed as a rational of two 4-byte integers
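
    Once the individual values are extracted, they can be combined into a single signed coordinate with the usual conversion: decimal degrees = degrees + minutes/60 + seconds/3600, negated when the reference is “S” or “W”. For example, 47° 36' 30" N works out to 47 + 36/60 + 30/3600, or roughly 47.6083. The script below leaves the degree, minute, and second values as separate columns, so this conversion is left as a post-processing step.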

    Update:

    • I found out that the Property IDs being used are not the EXIF ones, but the ones from System.Drawing.Imaging.PropertyItem, which encapsulates the metadata from an image file (a System.Drawing.Bitmap in this case). You can find a reference for these IDs at PropertyItem.Id Property (System.Drawing.Imaging).
    • Added in the ability to read the altitude data
    • Added some additional exception handling for those files that don’t play nice, and an error log file to record those image files
    • Added the ability to change the status update counter (after 100 rows by default)

    Once I had sorted this out, it was time to start writing some PowerShell. After I got the core functionality working, I added some additional features (recurse folders, include/exclude files with no GPS data, and so on) and cleaned up the code and comments. Enjoy…

    PARAM([String]$FolderPath = "", 
        [String]$OutputFileName = "FileList.txt", 
        [String]$IncludeFilesWithNoGPSInOutput = "N",
        [String]$RecurseFolders = "N",
        [String]$Delimiter = "`t", 
        [Int32]$UpdateCount = 100)
    # These are the parameters for the command line
    # Syntax: 
    #    GetImageGPS2.ps1 
    #        -FolderPath "<Enter your folder path here>" 
    #        -OutputFileName "<output filename here>" (Default "FileList.txt") 
    #        -IncludeFilesWithNoGPSInOutput "<Y or N>" (Default "N")
    #        -RecurseFolders "<Y or N>" (Default "N")
    #        -Delimiter "<Single Character>" (Default [tab])
    #        -UpdateCount <Show status after number of rows> (Default 100)
    # Example: 
    #     GetImageGPS2.ps1 
    #        -FolderPath "c:\Pictures" 
    #        -OutputFileName "MyGPSFiles.txt" (Default "FileList.txt")
    #        -IncludeFilesWithNoGPSInOutput "N" (Default "N")
    #        -RecurseFolders "Y" (Default "N")
    #        -Delimiter "," (Default [tab])
    #        -UpdateCount 50
    
    # The extension of the files we will be searching
    $jpg = "*.jpg";
    # So we can see how long this is taking
    $startTime = [DateTime]::Now
    # Capitalize $IncludeFilesWithNoGPSInOutput parameter
    $IncludeFilesWithNoGPSInOutput = $IncludeFilesWithNoGPSInOutput.ToUpper();
    
    # Check for 'N' or 'Y' in $IncludeFilesWithNoGPSInOutput - if not, assign 'N'
    if (($IncludeFilesWithNoGPSInOutput -ne 'Y') `
        -and ($IncludeFilesWithNoGPSInOutput -ne 'N'))
    {
        $IncludeFilesWithNoGPSInOutput = 'N';
    }
    
    # Capitalize $RecurseFolders parameter
    $RecurseFolders = $RecurseFolders.ToUpper();
    # Check for 'N' or 'Y' in $RecurseFolders - if not, assign 'N'
    if (($RecurseFolders -ne 'Y') `
        -and ($RecurseFolders -ne 'N'))
    {
        $RecurseFolders = 'N';
    }
    
    # Check to see if a filename has been supplied for the output file
    # If not, use a default name 'FileList.txt'
    if ($OutputFileName -eq '')
    {
        $OutputFileName = 'FileList.txt';
    }
    
    # Check to see if a folder path has been supplied
    if($FolderPath -ne '') 
    {
        # Check to see if a valid folder path has been supplied
        if((Test-Path($FolderPath)) -eq $true)
        {
            # Trim the folder path
            $FolderPath = $FolderPath.Trim();
            # Check to see if the trailing backslash (\) is present, if not, add it
            if($FolderPath.Substring($FolderPath.Length-1, 1) -ne '\')
            {
                $FolderPath += '\';
            }
            # Tab delimiter
            if (($Delimiter -eq '') -or ($Delimiter.Length -ne 1))
            {
                $d = "`t";
            }
            else 
            {
                $d = $Delimiter;
            }
            
            $ErrorCount = 0;
            
            # Create output file variable with full path of folder being searched
            $OutputFilePath =($FolderPath + $OutputFileName);
            $OutputFilePathErrors =($FolderPath + $OutputFileName + '.ERRORS.txt');
    
            # Create output file in folder being searched
            New-Item $OutputFilePath  -ItemType 'File' -Force;
            New-Item $OutputFilePathErrors  -ItemType 'File' -Force;
    
            # Create file header row and write to output file
            $FileHeader = '"FileName"' + $d + '"LonEW"' + $d + '"LatNS"' + $d `
                + '"LonDegrees"' + $d + '"LonMinutes"' + $d + '"LonSeconds"' `
                + $d + '"LatDegrees"' + $d + '"LatMinutes"' + $d + '"LatSeconds"' `
                + $d + '"AboveSeaLevel"' + $d + '"Altitude(meters)"';
            $FileHeader | Out-File -FilePath $OutputFilePath;
            
            'FILES NOT PROCESSED DUE TO ERROR' | Out-File -FilePath $OutputFilePathErrors;
    
            # Get the files in the directory
            if ($RecurseFolders -eq 'Y')
            {
                $files = Get-ChildItem ($FolderPath + '*') -Include $jpg -Recurse;
            }
            else
            {
                $files = Get-ChildItem ($FolderPath + '*') -Include $jpg;
            }
            
            # Initialize file counter variables
            [Int32] $fileCount = 1;
            [Int32] $fileCountWithGPS = 0;
            [Int32] $fileCountWithPartialGPS = 0;
    
            Write-Host 'Processing has begun...';
            # Iterate through the files
            foreach ($file in $files)
            {
                # Resolve the path to the image file
                $filename = `
                [String](Resolve-Path ($file.DirectoryName + '\' `
                    + $file.Name.Replace("[", "``[").Replace("]", "``]")));
                
                # Create new image object from image file
                # Write filename to error log if image create fails
                try
                {
                    $img = New-Object `
                        -TypeName system.drawing.bitmap `
                        -ArgumentList $filename;
                }
                catch
                {
                    $filename | Out-File -Append -FilePath $OutputFilePathErrors;
                    $ErrorCount += 1;
                    continue;
                }
    
                # Set variables for processing
                $out = '';
                [Boolean]$GPSData = $true;
                
                # Set ASCII encoding for extracting string data from property
                $Encode = new-object System.Text.ASCIIEncoding;
                
                # Check to see if image file has GPS data by checking 
                # PropertyID 1: Latitude N or S value
                try
                {
                    $LatNS = $Encode.GetString($img.GetPropertyItem(1).Value)
                } 
                catch
                {
                    $GPSData = $false
                };
                # If image file has GPS data, process remaining data
                if ($GPSData -eq $true)
                {
                    #Increment file with GPS data counter
                    $fileCountWithPartialGPS += 1;
                    
                    # Get GPS data from image file. 
                    # PropertyID 1 = Latitude N or S
                    # PropertyID 2 = Latitude degrees, minutes, and seconds
                    # PropertyID 3 = Longitude E or W
                    # PropertyID 4 = Longitude degrees, minutes, and seconds
                    
                    $LatDegrees = (([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 0)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 4)));
                    $LatMinutes = ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 8)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 12));
                    $LatSeconds = ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 16)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(2).Value, 20));
                    
                    # Set ASCII encoding for extracting string data from property
                    $Encode = new-object System.Text.ASCIIEncoding;
                    $LonEW = $Encode.GetString($img.GetPropertyItem(3).Value); 
                    
                    $LonDegrees = (([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 0)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 4)));
                    $LonMinutes = ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 8)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 12));
                    $LonSeconds = ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 16)) `
                                / ([Decimal][System.BitConverter]::ToInt32( `
                                    $img.GetPropertyItem(4).Value, 20));
    
                    try
                    {
                        # PropertyID 5 = Altitude reference (first byte: 0 = above sea level, 1 = below)
                        # PropertyID 6 = Altitude in meters, expressed as a rational
                        $AboveSeaLevel = 1 - $img.GetPropertyItem(5).Value[0];

                        $Altitude = (([Decimal][System.BitConverter]::ToInt32( `
                                        $img.GetPropertyItem(6).Value, 0)) `
                                    / ([Decimal][System.BitConverter]::ToInt32( `
                                        $img.GetPropertyItem(6).Value, 4)));
                    }
                    catch
                    {
                        $AboveSeaLevel = 0;
                        $Altitude = 0;
                    }
                    
                    
                    # Create output file row from GPS data
                    if ((($LatDegrees + $LatMinutes + $LatSeconds + `
                        $LonDegrees + $LonMinutes + $LonSeconds) -ne 0) `
                        -or ($IncludeFilesWithNoGPSInOutput -eq 'Y'))
                    {
                        $out = '"' + $file + '"' + $d + '"' + $LonEW + '"' + $d `
                            + '"' + $LatNS + '"' + $d + $LonDegrees + $d `
                            +  $LonMinutes  + $d +  $LonSeconds  + $d `
                            +  $LatDegrees  + $d +  $LatMinutes  + $d `
                            +  $LatSeconds  + $d +  $AboveSeaLevel  + $d `
                            + $Altitude;
                    }
                    if (($LatDegrees + $LatMinutes + $LatSeconds + `
                        $LonDegrees + $LonMinutes + $LonSeconds) -ne 0)
                    {
                        $fileCountWithGPS += 1;
                    }
                }
                elseif ($IncludeFilesWithNoGPSInOutput -eq 'Y')
                {    
                    # Create blank output row if it should be included in results
                    $out = '"' + $file + '"' +  $d + '""' +  $d + '""' + $d `
                        + '0' +  $d + '0' +  $d + '0' +  $d + '0' +  $d `
                        + '0' +  $d + '0' + $d + '0' +  $d + '0';
                }
                # If image file has GPS data or image file should be recorded 
                # with blank GPS data, add data row to output file
                if ($out -ne '')
                {
                    $out | Out-File -Append -FilePath $OutputFilePath;
                }
                # Write update to console every $UpdateCount files processed
                if (($fileCount % $UpdateCount) -eq 0)
                {
                    [Int32]$startTimeDiff = `
                        ([TimeSpan]([DateTime]::Now - $startTime)).TotalSeconds;
                    $hostout = "Processed $fileCount files so far. ";
                    $hostout += `
                        "$fileCountWithGPS file(s) contain complete GPS data. ";
                    $hostout += `
                    "$fileCountWithPartialGPS file(s) contain partial GPS data...";
                    Write-Host $hostout;
                    Write-Host "$startTimeDiff seconds processing so far.";
                    Write-Host "";
                }
                # Dispose of the image object to release the file handle
                $img.Dispose();
                # Increment file counter
                $fileCount += 1;
            }
            [Int32]$startTimeDiff = `
                ([TimeSpan]([DateTime]::Now - $startTime)).TotalSeconds;
            $hostout = "Processing complete. Processed $fileCount total files. ";
            $hostout += "$fileCountWithGPS file(s) contained complete GPS data. ";
            $hostout += `
                "$fileCountWithPartialGPS file(s) contained partial GPS data...";
            Write-Host $hostout;
            Write-Host "$startTimeDiff seconds total processing.";
            $img.Dispose();
        }
        # An invalid folder path has been supplied, so return without processing
        else
        {
            Write-Host 'You must supply a valid folder path';
        }
    }
    # No folder path has been supplied, so return without processing
    else
    {
        Write-Host 'You must supply a folder path';
    }
    
  • SQLblog gets a hardware upgrade

    SQLblog found a new home this past weekend and was moved onto a much needed, much better server infrastructure. SQLblog continues using MaximumASP (now CBeyond Cloud Services, but still found at www.maximumasp.com). We have been very happy with our hosting and support, and MaximumASP worked with us to ensure that our hardware and disaster recovery options were optimal for our budget. It's obvious why we are now in our fourth year of being hosted at MaximumASP.

  • Composing XML from HIERARCHYID in T-SQL

    Now that I've discussed converting XML into a set of HIERARCHYID values I thought I'd try to reverse the process...

    I'm not sure if you've attempted to convert a table with HIERARCHYID values to an XML representation, but if you have, I'm sure you've experienced the same woes as I did. Sure, I could have taken the route of using C# to create the XML, and it very well may be a better way to make such a conversion; instead, I decided that I had to be able to do this in T-SQL, and so began the journey to find such a solution...

    Since the XML modify method can only insert into a single node of an XML document, I had to either attempt to generate a string representation of the XML from the data (no simple task) or cursor through the data one row at a time (yes, a cursor) and insert each node. For this implementation, I chose the cursor method simply because it would be easier than recreating XML document-building abilities akin to what is done in the .NET Framework.

    When using XQuery to insert nodes, you must use a static value (in other words, you cannot use a composed string variable for the XQuery in the XML modify method). At first, this made it difficult to figure out how to insert a node into another node since there was no point of reference.
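
    A quick illustration of the restriction (my own example, using a throwaway @doc variable, not code from the solution below): passing a composed string to modify() fails, but because sql:variable() can be referenced inside the literal XQuery, you can still parameterize what gets inserted.

    DECLARE @doc XML, @newNode XML;
    SET @doc = '<root/>';
    SET @newNode = '<a/>';

    -- This would fail: modify() requires a string literal, not a variable
    -- DECLARE @xq NVARCHAR(200); SET @xq = 'insert <a/> into (/root)[1]';
    -- SET @doc.modify(@xq);

    -- This works: the XQuery is a literal, and sql:variable() supplies the node
    SET @doc.modify('insert sql:variable("@newNode") into (/root)[1]');
    SELECT @doc;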

    And so I thought that I could use the HIERARCHYID's ToString() method to figure out node positions in the XML, but that idea was quickly discarded after I realized that the path representation is not guaranteed to be consecutive, or even made up of integers, and it would require using sp_executesql, which I also wanted to avoid.

    Then I thought that the data would probably be uniquely identifiable, and so I could use that “id” to add an attribute to every node that I constructed, and then cursor through and insert into the node that matched the parent id of the node I was inserting; this removed the need to use sp_executesql. In other words, I would create a cursor that contained the parent node ID and concatenated values from the row of data to create the node with an “id” attribute.


    CAST('<' + NodeName + ' id="' + CAST(NodeID AS VARCHAR(20)) + '">' + ISNULL(NodeText, '') + '</' + NodeName + '>' AS XML) AS XmlToInsert

     

    I would then iterate through the cursor and insert the node as follows:


    SET @XR.modify('insert sql:variable("@xcol") into (//*[@id=sql:variable("@hparentid")])[1]')

    This uses the sql:variable extension to find a node via a relative reference. It looks for the node, regardless of its location in the XML, whose “id” attribute equals the “id” of the parent, which is contained in the @hparentid variable.

    Alas, this could become more problematic if the unique key contained multiple fields. One also might not want to include an extra “id” attribute in the results. Because of these and other things that I realized could go wrong with this implementation, I decided to scrap it and moved on. And although the version I am about to present has its own potential for issues, I felt it was more flexible and cleaner in its approach. Essentially, what I decided to do was use a temp table that contained the generated XML node, the original HIERARCHYID value, a row number generated with ROW_NUMBER() ordered by the hierarchy, and a parent row number, which would initially be set to 0 and then updated using a self join on the temp table.

    Then, since each XML node's position will match the row number generated from the HIERARCHYID order, we can simply insert each new node into its parent node based on that position.


    -- Sample Data to test with
    CREATE TABLE #HTable (NodeName sysname, Attributes xml, NodeText VARCHAR(MAX), HierarchyNode HIERARCHYID)

    INSERT INTO #HTable (NodeName, Attributes, NodeText, HierarchyNode)
    VALUES
        ('a', '<a attr="1" />', NULL, 0x),
        ('b', NULL, NULL, 0x58),
        ('c', '<a xyz="3" />', 'abc', 0x5AC0),
        ('c', NULL, 'def', 0x5B40),
        ('b', '<a id="111" pid="1234" />', NULL, 0x68),
        ('c', NULL, 'abc', 0x6AC0),
        ('c', NULL, 'def', 0x6B40)


    CREATE TABLE #T (XmlToInsert XML, HierarchyNode HIERARCHYID, RowNum INT, ParentRowNum INT)

    -- INSERT the generated XML node, the original HIERARCHYID, a unique row number, and a parent row number (set to 0)
    INSERT INTO #T (XmlToInsert, HierarchyNode, RowNum, ParentRowNum)
    SELECT
        CAST(
            '<' + NodeName + ' '
            + CASE WHEN Attributes IS NOT NULL
                THEN SUBSTRING(CAST(Attributes AS VARCHAR(MAX)), 3, LEN(CAST(Attributes AS VARCHAR(MAX))) - 4)
                ELSE '' END
            + '>' + ISNULL(NodeText, '') + '</' + NodeName + '>'
        AS XML) AS XmlToInsert
        , HierarchyNode
        , ROW_NUMBER() OVER (ORDER BY HierarchyNode) AS RowNum
        , 0 AS ParentRowNum
    FROM #HTable
    ORDER BY HierarchyNode

    -- UPDATE the parent row number using the HIERARCHYID method GetAncestor in the self join
    -- If the amount of data is great, an index could be created on the temp table prior to the update

    UPDATE T1
    SET T1.ParentRowNum = T2.RowNum
    FROM #T AS T1
        INNER JOIN #T AS T2 ON T2.HierarchyNode = T1.HierarchyNode.GetAncestor(1)

    DECLARE @xcol XML, @parentrownum INT, @flag BIT = 0, @XR XML = ''

    -- We actually only need the generated XML and the parent row number to do the rest of this work
    DECLARE crH CURSOR READ_ONLY FOR SELECT XmlToInsert, ParentRowNum FROM #T ORDER BY RowNum
       
    OPEN crH

    FETCH NEXT FROM crH INTO @xcol, @parentrownum
    WHILE(@@FETCH_STATUS = 0)
    BEGIN
        -- First time through, we add a root node
        IF @flag = 0
            SET @XR.modify('insert sql:variable("@xcol") into (/)[1]')
        ELSE -- Subsequent passes we find the parent node by position
            SET @XR.modify('insert sql:variable("@xcol") into (//*)[sql:variable("@parentrownum")][1]')

        SET @flag = 1

        FETCH NEXT FROM crH INTO @xcol, @parentrownum
    END
    CLOSE crH
    DEALLOCATE crH

    DROP TABLE #T
    DROP TABLE #HTable

    SELECT @XR

    I did try one other version that used a recursive CTE to generate the list of nodes and their respective “rownum” and “parentRowNum”:


    ;WITH H1 AS
    (
        SELECT
            NodeName
            , Attributes
            , NodeText
            , HierarchyNode
            , ROW_NUMBER() OVER (ORDER BY HierarchyNode) AS RowNum
        FROM #HTable AS HT
    )
    , H AS
    (
        SELECT
            CAST(
                '<' + NodeName + ' '
                + CASE WHEN Attributes IS NOT NULL
                    THEN SUBSTRING(CAST(Attributes AS VARCHAR(MAX)), 3, LEN(CAST(Attributes AS VARCHAR(MAX))) - 4)
                    ELSE '' END
                + '>' + ISNULL(NodeText, '') + '</' + NodeName + '>'
            AS XML) AS XmlToInsert
            , HierarchyNode
            , RowNum
            , CAST(0 AS BIGINT) AS ParentRowNum
        FROM H1 AS HT
        WHERE HierarchyNode = HIERARCHYID::GetRoot()

        UNION ALL

        SELECT
            CAST(
                '<' + HT.NodeName + ' '
                + CASE WHEN HT.Attributes IS NOT NULL
                    THEN SUBSTRING(CAST(HT.Attributes AS VARCHAR(MAX)), 3, LEN(CAST(HT.Attributes AS VARCHAR(MAX))) - 4)
                    ELSE '' END
                + '>' + ISNULL(HT.NodeText, '') + '</' + HT.NodeName + '>'
            AS XML)
            , HT.HierarchyNode
            , HT.RowNum
            , H.RowNum
        FROM H1 AS HT
            INNER JOIN H ON H.HierarchyNode = HT.HierarchyNode.GetAncestor(1)
    )
    INSERT INTO #T (XmlToInsert, HierarchyNode, RowNum, ParentRowNum)
    SELECT XmlToInsert, HierarchyNode, RowNum, ParentRowNum
    FROM H
    ORDER BY HierarchyNode

    Instead of inserting the initial values into the temp table #T, this generates the completed set of data and then inserts it into #T. However, regardless of what variation I tried (e.g., adding indexes), the initial method shown above, which inserts and then updates #T, was always faster.

    Please let me know if you have any ideas that might optimize this, and if you have an implementation (T-SQL or .NET), please share.

  • Shredding XML into HIERARCHYID Take 2

    In a previous blog post, I discussed a method of shredding XML into a table with HIERARCHYID, and realized that it had a dependency that I was not too keen about: the XML data required an “id” attribute in order to create the hierarchy. I had sorted out a way to inject a unique attribute ID into all the nodes (I’ll discuss this in a follow-up post), but having to modify the original XML didn’t have much appeal. But, upon reading another post by my fellow blogger, Adam Machanic, I realized it could be done without this requirement. Using the technique that Adam presented, I can generate unique paths to be parsed into a HIERARCHYID column.


    DECLARE @x XML
    SET @x = '<a someAttribute="1"><b><c>abc</c><c anotherAttribute="2">def</c></b><b><c>abc</c><c>def</c></b></a>'

    DECLARE @T TABLE (NodeName VARCHAR(255), Attributes XML, NodeText VARCHAR(MAX), HierarchyNode HIERARCHYID)
    ;WITH N (Node, NodeName, Attributes, NodeText, HierarchyPath)
    AS
    (
        SELECT
            CAST(Expr.query('.') AS XML) -- Node
            , CAST(Expr.value('local-name(.)', 'varchar(255)') AS VARCHAR(255)) -- NodeName
            , CASE WHEN Expr.value('count(./@*)', 'INT') > 0
                THEN Expr.query('<a>{for $a in ./@* return $a}</a>')
                ELSE NULL END -- Attributes
            , CAST(Expr.value('./text()[1]', 'varchar(max)') AS VARCHAR(MAX)) -- NodeText
            , CAST('/' AS VARCHAR(1000)) -- HierarchyPath
        FROM @x.nodes('/*[1]') AS Res(Expr)

        UNION ALL

        SELECT
            Expr.query('.') -- Node
            , CAST(Expr.value('local-name(.)', 'varchar(255)') AS VARCHAR(255)) -- NodeName
            , CASE WHEN Expr.value('count(./@*)', 'INT') > 0
                THEN Expr.query('<a>{for $a in ./@* return $a}</a>')
                ELSE NULL END -- Attributes
            , CAST(Expr.value('./text()[1]', 'varchar(max)') AS VARCHAR(MAX)) -- NodeText
            , CAST(N.HierarchyPath
                + CAST(DENSE_RANK() OVER (ORDER BY Expr) AS VARCHAR(1000))
                + '/' AS VARCHAR(1000)) -- HierarchyPath
        FROM N CROSS APPLY Node.nodes('*/*') AS Res(Expr)
    )
    INSERT INTO @T (NodeName, Attributes, NodeText, HierarchyNode)
    SELECT NodeName, Attributes, NodeText, CAST(HierarchyPath AS HIERARCHYID)
    FROM N
    ORDER BY CAST(HierarchyPath AS HIERARCHYID)

    SELECT * FROM @T 

    For this example, I simply grab the node name, the node text, and the attributes (when they exist) as a simple XML value of the format:

    <a [attribute1="attribute value" [attribute2="attribute value"]...] />

    Of course, these values could also be shredded into the hierarchy. One way of doing this would be to add an additional column to the results that represents the type of entry in the hierarchy (node versus attribute). My challenge to you is to create that solution.

    Have fun!

  • OT: Getting Fit in February

    A friend and colleague of mine, Caleb Jenkins, has started a fitness challenge (http://calebjenkins.wordpress.com/2009/01/29/official-rules/) and I decided that it was time for me to accelerate my fitness program that I had started in January. As of 9:00 AM on February 1, I weighed in at 201 pounds. My overall and ambitious goal is to lose at least 20 pounds. I hope to lose at least 10 pounds during the month of February. Even if I don’t win the fitness challenge, I will be much better off and will hopefully get to my ultimate goal of staying under 180 pounds.

    I worked out on Saturday and Sunday. On Sunday I also helped my 5-year-old son attempt to ride his bicycle without training wheels for the first time. My back hurts from leaning over and catching Chris as he fell. I am sore. I am achy. But I am determined. Wish me luck on my road to a healthier lifestyle.

  • Tweet! SQLblog is now on Twitter!

    Are you a twitter tweeter who loves SQL Server? You are?! Great, because SQLblog is now posting (AKA tweeting) on Twitter at http://twitter.com/sqlblog. Enjoy!

  • Introducing SQL Server 2008 - The Book

    It took a while, but a new e-book authored by yours truly, Greg Low, and Mark Whitehorn is finally available. And it's free!

    To get your free copy, browse to http://www.microsoft.com/learning/sql/2008/default.mspx. From there, look in the Special Offers section for the "Free e-book offer". Although the site states you can get excerpts, the whole book is available for reading.

    Many, many thanks to Greg and Mark for their quality contributions to the book.

  • PSA: Juvenile Diabetes Research Foundation - Walk to Cure Diabetes

    Last year, my 4-year-old son and I walked (well, I carried him for 2 miles) in support of the Juvenile Diabetes Research Foundation (JDRF). We raised a modest amount in support of this wonderful effort. This year, we want to help more than ever, and I thought I'd ask the SQLblog community for some additional support.

    We are walking in support of our friend Marissa (Marissa's Mermaids team). Marissa was diagnosed in Sept. 2005 with Type 1 Diabetes; since then she has had 4620 finger pricks to draw blood and 1095 injections. Now she is attached to an insulin pump 24 hours a day. Type 1 (or Juvenile) Diabetes is a chronic, debilitating disease affecting every organ system - it is an autoimmune disease where the body attacks its own pancreas and permanently destroys the cells that make insulin. Once Type 1 Diabetes arrives, it becomes a life-long condition.

    Like nearly 3 million Americans with Type 1 Diabetes, Marissa has a projected lifespan that is 14 years shorter and is more likely to suffer blindness, amputation, heart disease, kidney disease, nerve damage, stroke, and heart attack. Every year, 13,000 children are diagnosed. INSULIN IS NOT A CURE - IT IS MERELY LIFE SUPPORT. At times when Marissa is most upset, with tears in her eyes, she says, "I wish I could take my diabetes and throw it in the trash can."

    To manage Marissa's diabetes, her parents have the impossible struggle of balancing carbohydrates, insulin, and physical activity with uncontrollable variables such as illness, growth, exercise-level, excitement, and stress. Every day offers Marissa the risk of a drastic low blood sugar causing seizure or possibly death. It could only take one mistake on one morning or one afternoon or one night. Or it could simply involve a random event from her body.

    Marissa is brave. Her parents poke her finger to draw blood 8-12 times each day, including routine checks at midnight and 3am. Every carbohydrate at every meal or snack must be counted. The carbs must be perfectly balanced immediately through injections or an insulin pump to attempt to imitate the way your pancreas and mine work. There are no exceptions and there is never a day off!

    JDRF is amazingly efficient - 85% of donations go directly to research.

    Major research advances have occurred in the past few years! Diabetes has been cured in mice in at least 4 ways and various human trials are beginning. Scientists legitimately believe a cure is possible before Marissa loses this life-long battle. We ask you to share Marissa's story with your organization so that your entire company would consider sponsoring Marissa. Your contribution might make the difference in the one research laboratory where the cure will be achieved!

    So if you are able to donate, please visit http://walk.jdrf.org/index.cfm?fuseaction=extranet.personalpage&confirmid=87065878. Together we can help find a cure!

  • Policy Based Management Evaluation Modes

    Dan Jones wrote a great post about Facets from the new Policy-Based Management feature of SQL Server 2008. At one point in the post, he listed all of the available facets and their supported evaluation modes. Since SQL Server 2008 is not RTM, and since facets can be added in the future, I thought I'd write a query that would list the facets and supported evaluation modes.

    Note that the On Demand mode is always supported and has therefore been left out of the query.

    ;WITH EM (EvalModeID, EvalModeName)
    AS
    (
        SELECT *
        FROM
            (VALUES
                (1, 'Check on Change: Prevent'),
                (2, 'Check on Change: Log'),
                (4, 'Check on Schedule')) AS EvalModes (EvalModeID, EvalModeName)
    )
    , FEM (FacetID, FacetName, EvaluationMode, IsSupported)
    AS
    (
        SELECT
            pmf.management_facet_id
            , pmf.name
            , EM.EvalModeName
            , 1 AS IsSupported
        FROM msdb.dbo.syspolicy_management_facets AS pmf
            INNER JOIN EM ON pmf.execution_mode & EM.EvalModeID = EM.EvalModeID
    )
    SELECT FacetID
        , FacetName
        , [Check on Change: Prevent]
        , [Check on Change: Log]
        , [Check on Schedule]
    FROM FEM
    PIVOT (
        COUNT(IsSupported)
        FOR EvaluationMode IN ([Check on Change: Prevent], [Check on Change: Log], [Check on Schedule])
    ) AS FEMP
    ORDER BY FacetName

    The results for the existing 72 facets are as follows:

    FacetID  FacetName                   Check on Change: Prevent  Check on Change: Log  Check on Schedule
    1        ApplicationRole             1                         1                     1
    2        AsymmetricKey               1                         1                     1
    3        Audit                       0                         0                     1
    4        BackupDevice                0                         0                     1
    5        BrokerPriority              0                         0                     1
    6        BrokerService               0                         0                     1
    7        Certificate                 0                         0                     1
    8        Credential                  0                         0                     1
    9        CryptographicProvider       0                         0                     1
    10       Database                    0                         0                     1
    11       DatabaseAuditSpecification  0                         0                     1
    12       DatabaseDdlTrigger          0                         0                     1
    13       DatabaseRole                1                         1                     1
    14       DataFile                    0                         0                     1
    15       Default                     0                         0                     1
    16       Endpoint                    1                         1                     1
    17       FileGroup                   0                         0                     1
    18       FullTextCatalog             0                         0                     1
    19       FullTextIndex               0                         0                     1
    20       FullTextStopList            0                         0                     1
    21       IDatabaseMaintenanceFacet   0                         0                     1
    22       IDatabaseOptions            0                         1                     1
    23       IDatabasePerformanceFacet   0                         0                     1
    24       IDatabaseSecurityFacet      0                         0                     1
    25       ILoginOptions               1                         1                     1
    26       IMultipartNameFacet         1                         1                     1
    27       INameFacet                  0                         0                     1
    31       Index                       0                         0                     1
    32       IServerAuditFacet           0                         0                     1
    33       IServerConfigurationFacet   0                         1                     1
    34       IServerInformation          0                         0                     1
    35       IServerPerformanceFacet     0                         0                     1
    36       IServerSecurityFacet        0                         0                     1
    38       IServerSettings             0                         0                     1
    37       IServerSetupFacet           0                         0                     1
    41       ISurfaceAreaFacet           0                         1                     1
    28       ITableOptions               1                         1                     1
    29       IUserOptions                1                         1                     1
    30       IViewOptions                1                         1                     1
    42       LinkedServer                0                         0                     1
    43       LogFile                     0                         0                     1
    44       Login                       0                         0                     1
    45       MessageType                 0                         0                     1
    46       PartitionFunction           0                         0                     1
    47       PartitionScheme             0                         0                     1
    48       PlanGuide                   0                         0                     1
    49       RemoteServiceBinding        0                         0                     1
    50       ResourceGovernor            0                         0                     1
    51       ResourcePool                1                         1                     1
    52       Rule                        0                         0                     1
    53       Schema                      1                         1                     1
    54       Server                      0                         0                     1
    55       ServerAuditSpecification    0                         0                     1
    56       ServerDdlTrigger            0                         0                     1
    57       ServiceContract             0                         0                     1
    58       ServiceQueue                0                         0                     1
    59       ServiceRoute                0                         0                     1
    60       Statistic                   0                         0                     1
    61       StoredProcedure             1                         1                     1
    62       SymmetricKey                0                         0                     1
    63       Synonym                     0                         0                     1
    64       Table                       0                         0                     1
    65       Trigger                     0                         0                     1
    66       User                        0                         0                     1
    67       UserDefinedAggregate        0                         0                     1
    68       UserDefinedDataType         0                         0                     1
    69       UserDefinedFunction         1                         1                     1
    70       UserDefinedTableType        0                         0                     1
    71       UserDefinedType             0                         0                     1
    72       View                        0                         0                     1
    73       WorkloadGroup               1                         1                     1
    74       XmlSchemaCollection         0                         0                     1

  • TechEd 2008 Samples

    As promised, I am posting my samples from the North America TechEd 2008 Developer Week.

    Note: All the samples are designed to run in SQL Server 2008. The safe dynamic SQL samples can be modified, however, to work in SQL Server 2005.
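
    For anyone who wasn't there, the general idea behind the "safe dynamic SQL" samples is parameterized dynamic SQL via sp_executesql rather than string concatenation. A minimal sketch of that pattern (the table and column names here are placeholders, not taken from the actual sample files):

    DECLARE @sql NVARCHAR(MAX), @city NVARCHAR(60);
    SET @city = N'Redmond';

    -- User input travels as a parameter, never as part of the SQL text itself
    SET @sql = N'SELECT CustomerID, CompanyName FROM dbo.Customers WHERE City = @city;';

    EXEC sp_executesql @sql, N'@city NVARCHAR(60)', @city = @city;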

     

  • SQLblog: New SQL Server Forum is now available

    After numerous questions and requests, we decided to create a forum for all SQL Server related questions. You can view the new forum at http://sqlblog.com/forums/57/ShowForum.aspx or use the Forums menu item on the site to browse there. We hope this new addition to the SQLblog community provides the members a valuable benefit and look forward to seeing your forum posts.

  • OT: Introducing SQL Server 2008 Book is Content Complete

    With some help from two esteemed colleagues, Greg Low and Mark Whitehorn, the Introducing SQL Server 2008 book is now content complete. Sure, there is still some tech and copy editing that needs to be done, but the core writing is complete. The book, based on the February CTP6 release, will be available free from Microsoft and covers a slew of new features including:

    • Policy-Based Management and Auditing
    • Transparent Data Encryption and Data and Backup Compression
    • Resource Governor
    • Performance Data Collection
    • New data types including HierarchyID, Filestream, Spatial, Date and Time, and enhancements to the XML data type
    • Table Types (and Table valued parameters)
    • T-SQL Enhancements including Merge, Single statement declaration/assignment, Increment Operators, and GROUPING SETS
    • Sparse Columns and Filtered Indexes
    • High Availability Enhancements for Database Mirroring and Failover Clustering
    • Business Intelligence Enhancements for Integration Services, Reporting Services, and Analysis Services

    I'll post again when the book is available for download.

  • Yet Another Stored Procedure vs. Ad-hoc Query Discussion?

    Earlier today, Will Sullivan posted a blog entry, My Statement on Stored Procedures, in which he emphatically states his official opinion of stored procedures as:

    "I prefer not to use them."

    He then goes about dismissing most of the misinformation about why stored procedures are better than ad-hoc (parameterized) queries.

    The first bit of misinformation he dispels is the now defunct argument that "Stored Procedures are faster than ad-hoc queries". He states that "Unless your ad-hoc queries are always significantly different from each other, their execution plans are cached right along side those of the SP's." I completely agree. We'll call that one a tie, so the score so far: SP 0, Ad-hoc 0.
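
    To see this for yourself (this snippet is mine, not from Will's post, and dbo.Orders is a placeholder table), run the same parameterized batch with two different values and then check the plan cache; you should find a single prepared plan with a use count of 2, just as you would for a stored procedure:

    EXEC sp_executesql
        N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @cid;',
        N'@cid INT', @cid = 42;

    EXEC sp_executesql
        N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @cid;',
        N'@cid INT', @cid = 77;

    -- One cached plan, reused for both executions
    SELECT cp.objtype, cp.usecounts, st.text
    FROM sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE st.text LIKE N'%dbo.Orders WHERE CustomerID = @cid%'
        AND st.text NOT LIKE N'%dm_exec_cached_plans%';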

    Another myth he tries to debunk is that "Editing SP's is a breeze with Query Analyzer". Query Analyzer - that's so SQL Server 2000. Seriously, though, there are a number of fine code editors that allow you to edit SPs with ease. Query Analyzer is not at the top of that list, however. I will say that when you write T-SQL you should use a code editor that is meant for T-SQL, for the same reasons that when you write C#, you want to use a code editor meant for C#. Again, no winner here, so the score remains: SP 0, Ad-hoc 0.

    He addresses another statement that is supposedly made in defense of SPs: "Ad-hoc queries are a nightmare to maintain, as they are spread all over your code". Again, either one is easy to maintain, with the right tools. We are still scoreless: SP 0, Ad-hoc 0.

    It just so happens that I agree with many of his points. And there are other objective and subjective points on topics such as organization, maintenance, design, and so on, which one could argue equally well for either SPs or ad-hoc queries. Don't get me wrong, however, as I believe that using ad-hoc queries when you could have used stored procedures is simply wrong.

    And so I will address Will's last point (actually, it was his second point) that is repeatedly misrepresented: "Stored Procedures are safe against SQL injection attacks; ad-hoc queries are not".

    Ad-hoc queries prevent SQL Injection attacks as well as SPs do. Any claim otherwise would be wrong. But that's not the issue. The problem is that ad-hoc queries require that you expose the underlying objects of the database. In order to use ad-hoc queries, you must allow direct access for select, insert, update, and delete operations to the tables in the database. Although I know most experienced developers would only write ad-hoc/parameterized queries against the underlying data, at a later date, some disgruntled or inexperienced developer may write dynamic SQL instead (I have seen it happen), and expose the database to SQL injection attacks (which I have also seen in production systems), including exposure to such awful actions as...

    -- Can you say Identity Theft?
    SELECT FirstName, LastName, ZIP, CreditCardNumber,
        CreditCardType, CreditCardExpiration, CVV
    FROM Customers

    ...or worse...

    -- Do we really need all those customers?
    DELETE Customers

    ...or even worse...

    -- NEVER EVER DO THIS, PLEASE
    -- This will execute a DELETE against all tables
    EXEC sp_MSforeachtable 'DELETE ?'

    ...or even, even worse (assuming the SQL login has elevated permissions - which many apps do)...

    -- NEVER EVER DO THIS, PLEASE
    -- This will drop all tables from the database
    EXEC sp_MSforeachtable 'DROP TABLE ?'

    ...and so although your ad-hoc query code won't allow SQL injection, some other programmer's dynamic SQL will. Assuming you've correctly secured your database, this doesn't happen with stored procedures since you do not have to expose any of the underlying tables (because of a little something known as chain of ownership).
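
    To make the ownership chaining point concrete (AppUser here is a hypothetical application login's database user, and the tables echo the Customers example above): grant the application EXECUTE on the procedure and nothing on the table. The procedure still works because dbo owns both objects, while any injected statement that references the table directly fails with a permission error.

    -- The stored procedure is the only object the application can touch
    CREATE PROC dbo.prGetCustomer (@CustomerID INT)
    AS
        SELECT CustomerID, FirstName, LastName
        FROM dbo.Customers
        WHERE CustomerID = @CustomerID;
    GO

    -- EXECUTE only; no SELECT/INSERT/UPDATE/DELETE on dbo.Customers
    GRANT EXECUTE ON dbo.prGetCustomer TO AppUser;

    -- Running as AppUser:
    --   EXEC dbo.prGetCustomer @CustomerID = 1;   -- works (ownership chain)
    --   SELECT * FROM dbo.Customers;              -- fails: permission denied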

    Of course, you could completely self-destruct any security benefits by creating a SP such as this one:

    -- NEVER EVER DO THIS, PLEASE, I beg of you...
    CREATE PROC prExecuteSql (@sql VARCHAR(MAX))
    WITH EXECUTE AS 'dbo'
    AS
        EXEC (@sql)
    GO

    As you can see, SPs aren't foolproof, but you can mitigate your risk by having an employee or a consultant who knows what they are doing in the database.

    Yes, there are some applications that will not require the extra security, or other factors may simply prevent you from using stored procedures, and so using ad-hoc SQL is a viable option in those cases. But I believe that security should be at the top of your important-things-for-your-application list, and alas, ad-hoc queries require you to unnecessarily expose your database objects, which will more than likely lead to problems down the road. You can argue any other point and there are no clear winners, but when it comes to security, ad-hoc loses. If you want a more secure database, you need to be using SPs.

    And so, the final score (well, for now anyways): SP 1, Ad-hoc 0.

  • SQL Server 2008 Leap Year Issue

    Microsoft reported:

    "We have recently discovered an issue with SQL Server 2008 CTPs that result in SQL Server 2008 not starting or installing on Feb 29 GMT only. We recommend that you do not run, install or upgrade this CTP on Feb 29 GMT to minimize any impact in your environment. You can install starting on March 1 GMT...."
