SQLblog.com - The SQL Server Blog Spot on the Web

Louis Davidson

  • SQL Saturday Birmingham #328 Database Design Precon In One Week

    On *August 22, I will be doing my "How to Design a Relational Database" pre-conference session in Birmingham, Alabama. You can see the abstract here if you are interested, and you can sign up there too, naturally. At just $100, which includes a free ebook copy of my database design book, it is a great bargain and I totally promise it will be a little over 7 hours of talking about and designing databases, which will certainly be better than what you do on a normal work day, even a Friday.

    You can read the abstract, but what should you really expect? First off, expect to not sit in one spot while I drone on and on for 7 hours of lecture (not that I couldn't go 7 hours straight just lecturing while only inhaling every 10 minutes, which would actually be a little bit easier to prepare, I assure you).

    Rather, there is quite a bit of interaction as a class and in small groups, giving you not only some of the foundational information, but a bit of practical experience too. (Plus, the reason I insisted on buying ebooks for everyone was to let each attendee have the “full” picture right on their laptop or tablet device after we are finished with class.)

    The day is broken up into 3 or 4 modules, with several participation exercises along the way. The sections are:

    Section 1: Introduction with a bit of history

    We start with just enough introduction to the materials, the stuff you need to do *before* you design, and an introduction to the history of the craft to make sure we are all on the same page. A big part of this section is just getting straight why we design the way we do.

    Section 2: Modeling and structures

    This section will cover the fundamental building blocks of relational databases, like tables, columns, keys, etc., and how to create a data model of these constructs. This is by far the largest part of the lecture, and by the end we should all be on the same page as to what goes into the database, if not exactly "how" the final product should look.

    We will stop at this point, and I will get out my modeling camera (which sounds a LOT more glamorous than it will turn out to be) and we will do some modeling on paper, asking attendees to provide the parts of a simple database, and we will all decide what should go into the model.

    The document that I base this model on is VERY simple, but the complexities of translating the document to a base database design are always quite interesting to experience as a class, and I get an idea of who the outgoing class members are going to be at this point too.

    Section 3: Model Standardization (Normalization)

    In this section, we will look at the kinds of things you need to do to the model to prepare it to be implementable, truly analyzing the structures to see if they make "sense" within the confines of the relational model. It is always interesting to me that most models are normalized to some degree for simplicity, but people think that normalizing makes things slower. And the misconceptions about the higher normal forms make even less sense…

    Once we are done with the slide material in this section, we will start a couple of exercises. The first exercise is planned as a full class exercise, where I will man the data model (first on paper, then in a modeling tool), and elicit input from the class, in a manner that makes sure everyone gets a say.

    Then we will break up into small teams and build a final model on paper, which I will bring up to the projector and we will discuss the different solutions.

    Section 4: Physical Modeling Overview

    Assuming we still have time/energy, we will take the last part of the class and cover turning the model into a "real" database. Data types, domain implementations, constraints, testing, etc. will be covered.

    Due to the limitations of the ~7 hour format, and a *strong* preference of previous classes towards actually doing some design, there are topics we won't cover. But honestly, if you can get the basic design correct and make the model look like what the final model needs to, the rest is kind of gravy (and well documented in a lot more websites and books than mine!). I spend a lot more time figuring out the shape of the database objects than I do on the implementation aspects, because the problems with database designs are almost always an inability to match the user’s requirements more than a missing index here and there (and that is why God created DBAs & Data Programmers).

    What I really love about doing all of the designs is that we really get the flavor of a real design meeting. A few differing opinions, a few ideas I hadn't planned for, and a few argumentative types who really want their own way. But none of the arguments have gotten out of hand so far, and they have all been very much like the typical data modeling meeting.

    I hope to see you in class!

    * Note, this originally said September 22. If you want SQL training with me on that date, you will need to join me in Orlando's version of Asia, on Expedition Everest.

  • How In-Memory Database Objects Affect Database Design: Hybrid Code

    In my first attempts at building my code, I strictly went with either native or on-disk code. I specifically wrote the on-disk code to only use features that worked in-memory. This led to one majorly silly bit of code, used to create system assigned key values. How would I create a customer number that was unique? We can’t use the Max(value) + 1 approach because it behaves hideously under MVCC isolation levels: 100 connections might see the same value, and since you can’t see other connections' changes, you would duplicate data quickly. I was also limited to not using sequence objects because they too are not allowed in native code.

    So, I used a random number generator, like this:

    DECLARE @CustomerNumber CHAR(10)
    WHILE 1=1
    BEGIN
        --build a candidate: 'CU' plus a zero-padded random 8 digit value
        SET @CustomerNumber = 'CU' + RIGHT('00000000' + CAST(CAST(100000000 * RAND() AS INT) AS VARCHAR(8)),8)

        --keep the value only if no other row has already claimed it
        IF NOT EXISTS (SELECT * FROM Customers.Customer WHERE CustomerNumber = @CustomerNumber)
            BREAK

    END

    This actually worked quite well for the tens of thousands of rows I generated. The likelihood of two connections getting the same random value and getting past the NOT EXISTS block was very low. Now, obviously the probability of clashes will rise greatly as you approach the limits of the 8 digit number, so you would have to monitor usage and change the CU prefix to something else as the space fills up. This is demo code, and as this blog will show, there are easier ways. But it was fun to write and test.

    Of course the worst part of this code isn’t the random number generator, or even the looping (oh, the dreaded looping); no, this code is not very optimal because of the NOT EXISTS subquery. Best case, we have to do one probe into the table to see that the value doesn’t exist. While it was suboptimal in interpreted code, in native code it got even sillier looking, because you can’t break out of a WHILE loop, and you can’t use subqueries (nor the RIGHT function). So the code changed to the following awkward (yet operational) bit of code:

    DECLARE @customerNumber CHAR(10), @rowcount INT, @keepGoing BIT = 1, @baseNumber VARCHAR(30)
    WHILE @keepGoing = 1
    BEGIN
        --no RIGHT function in native code, so pad the value and take the last 8 characters with SUBSTRING
        SET @baseNumber = '00000000' + CAST(CAST(100000000 * RAND() AS INT) AS VARCHAR(8))
        SET @customerNumber = 'CU' + SUBSTRING(@baseNumber, LEN(@baseNumber) - 7, 8)

        --no subqueries (or BREAK) in native code either, so count matches and flip the loop control variable
        SELECT @rowcount = COUNT(*) FROM Customers.Customer WHERE CustomerNumber = @customerNumber
        IF @rowcount = 0
            SET @keepGoing = 0

    END

    The inefficiency of this code is the same as in the interpreted code: the query on CustomerNumber. Of course, the more typical solution to the problem of a system generated key would be to use a SEQUENCE object (possibly as a default), and format the number somehow. But you can’t use sequences in native code, so instead of going fully native, I am using a hybrid approach.
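    For on-disk tables, that more typical approach might look something like the following sketch (shown only for contrast; the sequence and constraint names are illustrative, and it cannot be used on the in-memory version of the table):

    CREATE SEQUENCE Customers.Customer$CustomerNumber$OnDiskSequence
    AS INT
    START WITH 1
    go

    --format the sequence value inside a default constraint on the on-disk table
    ALTER TABLE Customers.Customer
        ADD CONSTRAINT DFLTCustomer_CustomerNumber
            DEFAULT ('CU' + RIGHT(REPLICATE('0',8) +
                     CAST(NEXT VALUE FOR Customers.Customer$CustomerNumber$OnDiskSequence AS VARCHAR(8)), 8))
            FOR CustomerNumber
    go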

    First, I built a procedure that generates CustomerNumbers. It appends a simple check digit to a number generated from the sequence (preceded by 'CU'), and it sloughs off any customer numbers with 666 in the value, just to show what can be done.

    CREATE SEQUENCE Customers.Customer$CustomerNumber$Sequence
    AS INT
    START WITH 1
    go

    --use a sequence and format the output a bit to avoid lots of duplication
    CREATE  PROCEDURE Customers.Customer$getNextCustomerNumber
        @customerNumber char(10) OUTPUT
    AS

        --doing it in a loop is the simplest method when there are complex requirements
        WHILE (1=1)
         BEGIN
            --Get the base account number, which is just the next value from the sequence
            SET @customerNumber = 'CU' + right(replicate ('0',8) +
                        CAST(NEXT VALUE FOR Customers.Customer$CustomerNumber$Sequence as varchar(7)), 6)

            --add a check digit to the account number (take some digits add together, take the first number)
            SELECT @customerNumber = CAST(@customerNumber AS varchar(8)) +
                RIGHT(CAST(
                   CAST(SUBSTRING(@customerNumber, 3,1) AS TINYINT) +
                   POWER(CAST(SUBSTRING(@customerNumber, 5,1) AS TINYINT),2) +
                   CAST(SUBSTRING(@customerNumber, 8,1) AS TINYINT) * 3 +
                   CAST(SUBSTRING(@customerNumber, 9,1) AS TINYINT) * 2 +
                   CAST(SUBSTRING(@customerNumber, 10,1) AS TINYINT) +
                   CAST(SUBSTRING(@customerNumber, 11,1) AS TINYINT) * 3  AS VARCHAR(10)),1)

            --if the number doesn't have these character strings in it (including the check digit)
            if            @customerNumber NOT LIKE '%00000%'
                    AND @customerNumber NOT LIKE '%666%'
                BREAK -- we are done
         END
    GO

    Pretty simple, it just loops until a good number is found. Not looping through rows, but looping through sequence values, which is very fast and can be done by many simultaneous callers with almost no contention. If contention is an issue, you can choose the cache size for the sequence object, which avoids having to persist the sequence's state every time the previously cached values have been burned through.
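    For example, something like this sketch would hand out values in chunks of 100 before another write of the sequence's state is needed (the cache size of 100 is just an illustrative number; the right value is a testing question):

    --request a larger cache so the sequence state is persisted only once per 100 values
    ALTER SEQUENCE Customers.Customer$CustomerNumber$Sequence
        CACHE 100
    go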

    So now, the code simply says:

    DECLARE @CustomerNumber CHAR(10)
    EXEC Customers.Customer$getNextCustomerNumber @CustomerNumber OUTPUT

    Rather than that loop. But I can’t use that in a natively compiled procedure, so we create an interpreted procedure that calls this procedure, then calls the native procedure:

    CREATE PROCEDURE Customers.Customer$CreateAndReturn
    @FirstName NVARCHAR(30),
    @LastName NVARCHAR(30),
    @MiddleName NVARCHAR(30),
    @EmailAddress NVARCHAR(200)
    AS

    BEGIN
        SET NOCOUNT ON

       --see if the customer exists… We don’t do updates here
        DECLARE @customerId INT = (SELECT CustomerId
                                    FROM  Customers.Customer
                                    WHERE EmailAddress = @EmailAddress) --we are assuming validation is done elsewhere

        IF @customerId IS NULL
            BEGIN
                DECLARE @customerNumber CHAR(10)
                EXEC  Customers.Customer$getNextCustomerNumber @customerNumber OUTPUT

                       
                EXEC @CustomerId = Customers.Customer$SimpleInMemCreate
                @CustomerNumber = @CustomerNumber,
                @FirstName = @FirstName,
                @LastName = @LastName,
                @MiddleName = @MiddleName,
                @EmailAddress = @EmailAddress

            END

          RETURN COALESCE(@customerId, -100)
    END
    GO

    I haven’t added error handling just yet, but this is nearly the final version. The procedure to do the actual insert is just a simple insert using native compilation:

    CREATE PROCEDURE Customers.Customer$SimpleInMemCreate
    @CustomerNumber CHAR(10),
    @FirstName NVARCHAR(30),
    @LastName NVARCHAR(30),
    @MiddleName NVARCHAR(30),
    @EmailAddress NVARCHAR(200)
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS BEGIN ATOMIC WITH
    (
    TRANSACTION ISOLATION LEVEL =  SNAPSHOT, LANGUAGE =  N'us_english'
    )
            DECLARE @customerId int
                INSERT INTO Customers.Customer
                        ( FirstName , MiddleName ,LastName ,
                            CustomerNumber ,  EmailAddress,
                            RowCreateTime, RowLastModifiedTime
                        )
                VALUES  ( @FirstName , @MiddleName ,@LastName ,
                            @CustomerNumber ,  @EmailAddress,
                           SYSDATETIME(), SYSDATETIME()
                            )
                SELECT @customerId = SCOPE_IDENTITY()


          RETURN isnull(@customerId, -100)
        END
    GO

    So we get the benefits of the compiled procedure (if there are any in this particular case; my demo code is fairly simplistic) coupled with anything in the interpreted code that could not be done in native mode.

  • Speaking on 7/25 for the Nashville SQL Server User Group: How In-Memory Database Objects Affect Database Design

    So I have blogged about it, and I have prepared for it, and next Friday at lunch time I will be unveiling my new presentation. The location/other details can be found here: http://nashville.sqlpass.org/Home.aspx, but the abstract is:

    How In-Memory Database Objects Affect Database Design
       

    With SQL Server 2014, Microsoft has added a major new feature to help optimize OLTP database implementations by persisting your data primarily in RAM. Of course it isn't that simple; internally, everything that uses this new feature is completely new. While the internals of this feature may be foreign to you, accessing the data that uses the structures very much resembles T-SQL as you already know it. As such, the first important question for the average developer will be how to adapt an existing application to make use of the technology to achieve enhanced performance. In this session, I will introduce the concepts of In-Memory Database objects, discussing how the design is affected by the new technology. As a basis for the presentation, I will start with a normalized database, and adapt the logical and physical database model/implementation in several manners, performance testing the tables and code changes along the way.

    It is not exactly what I have envisioned for the presentation in the slightly distant future, but I am pretty pleased with where it is right now. I decided that since this is such a new feature, it is very likely that people would not be well enough acquainted with the subject for me to ignore the introductory aspects. So while I originally planned to dive right in, I have added a lot of introductory material to explain the features first, enough to make sure that the design aspects I will cover make sense no matter your experience level with the in-memory features. I plan to use the same format, with some flexibility, if I do this for a SQL Saturday later this year, and certainly so when I do the presentation at Devlink. Luckily at Devlink I have another 15 minutes to work with, so 15 more minutes of code comparison will hopefully fit the needs of the more programming-oriented attendees.

    Of course, I am not done blogging about a few additional aspects I have come up with, but with a week and a half to go before I present it, more writing on the presentation has been the goal.

     

  • How In-Memory Database Objects Affect Database Design: Uniqueness

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this blog's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    The first issue that arose in my tests was with the new concurrency model. I started out with the stored procedure I will show later in the entry. It sees if a customer address exists, and if not, creates it. The table is the CustomerAddress table, with the following structure:

    CustomerAddress table diagram

    With on-disk structures, I have a uniqueness constraint on the Address columns, along with the CustomerId. The CustomerId is the first column in the constraint declaration, so the performance was great. I don’t use a transaction or error handling (yet) because I want to simulate what a normal application might do. I put the code in stored procedure form because, well, it will just be easier for me to manage. (Plus we can change that code to be proper a lot more easily, and when I build the natively compiled procedures, things will be easier to simulate.)

    Basically, the gist is: look up the address using all of the address columns that are in the uniqueness constraint. If you get a CustomerAddressId, return it; if not, create the row and then return its key, using the alternate key values:

    CREATE PROCEDURE Customers.CustomerAddress$CreateAndReturn
    (
        @customerId INT,
        @Line1  NVARCHAR(30),
        @Line2  NVARCHAR(30),
        @City   NVARCHAR(30),
        @ZipCode VARCHAR(20),
        @USStateCode CHAR(2),
        @AddressTypeCode NVARCHAR(20)
    )
    AS
    BEGIN
        SET NOCOUNT ON
        DECLARE @customerAddressId INT = (
                                         SELECT CustomerAddressId
                                          FROM  Customers.CustomerAddress
                                          WHERE  CustomerId = @CustomerId
                                            AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                            AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                            AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                            AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                            AND  USStateCode = @USStateCode
                                            AND  AddressTypeCode = @AddressTypeCode ) --we are assuming validation is done elsewhere

        IF @customerAddressId IS NULL
            BEGIN
                INSERT  INTO [Customers].[CustomerAddress] ( [CustomerId], [Line1], [Line2], [City], [ZipCode], [USStateCode], [AddressTypeCode] )
                VALUES  ( @CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode )

            SET @customerAddressId = (
                                              SELECT CustomerAddressId
                                              FROM  Customers.CustomerAddress
                                              WHERE  CustomerId = @CustomerId
                                                AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                                AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                                AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                                AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                                AND  USStateCode = @USStateCode
                                                AND  AddressTypeCode = @AddressTypeCode )

            END

          RETURN COALESCE(@customerAddressId, -100)
    END
    GO

    My test data resides in a side database, with a lot of duplicated customers and addresses. The goal is to do a good number of reads, but not make the code so complex that we start to deal with slight variances in addresses. These are the kinds of things one might do in an ETL process, or certainly as a background process in an OLTP system. For this example, I want to avoid duplication, but only when the row is a 100% duplicate.

    The code is called in batches of four interconnected sets of procedures. Each returns a surrogate key via a RETURN statement (it isn’t 100% finished, as I haven’t really handled stuff like -100 being returned, even though it really couldn’t occur with my current data):

    DECLARE @CustomerId INT
    EXEC @CustomerId = Customers.Customer$CreateAndReturn @firstName = 'Tyler', @MiddleName = 'R', @LastName = 'David', @EmailAddress = 'Tyler@email.com'

    DECLARE @CustomerAddressId INT
    EXEC @CustomerAddressId = Customers.CustomerAddress$CreateAndReturn @customerId = @CustomerId,@Line1='9 Gordon Highway',@line2='Apt 296',@city='Buffalo',@ZipCode='81254',@USStateCode='MO',@AddressTypeCode='Office'

    DECLARE @SalesOrderId int
    EXEC @SalesOrderId = Sales.SalesOrder$Create @CustomerId=@CustomerId,@CustomerAddressId=@CustomerAddressId,@SalesOrderStatusCode='New'

    EXEC Sales.SalesOrderItem$Create @SalesOrderId=@SalesOrderId,@SalesOrderItemNumber=48904,@Quantity=3.6386,@UnitPrice=14.57,@ProductPriceId=3779
    EXEC Sales.SalesOrderItem$Create @SalesOrderId=@SalesOrderId,@SalesOrderItemNumber=98015,@Quantity=3.0596,@UnitPrice=5.31,@ProductPriceId=1043

    I have 4 files of these stored procedure batches, each file having 10,000 batches in it. I start them all at the same time using the WaitFor Sync Framework I blogged about a few years back. When I was using the on-disk tables, not once did this code fail to work, and not once did I get an error, even though I certainly could have if a duplicated row was inserted:

    DECLARE @customerAddressId INT = (
                                         SELECT CustomerAddressId
                                          FROM  Customers.CustomerAddress
                                          WHERE  CustomerId = @CustomerId
                                            AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                            AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                            AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                            AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                            AND  USStateCode = @USStateCode
                                            AND  AddressTypeCode = @AddressTypeCode ) --we are assuming validation is done elsewhere

        IF @customerAddressId IS NULL
            BEGIN
                INSERT  INTO [Customers].[CustomerAddress] ( [CustomerId], [Line1], [Line2], [City], [ZipCode], [USStateCode], [AddressTypeCode] )
                VALUES  ( @CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode )

    Time is the enemy. Time passes between the fetch of the CustomerAddressId and the insert. If another connection inserts the row in those moments between these statements, you could get a duplicate key error. Which I accepted as a possibility. If I finished this code for on-disk, I would handle that error by refetching the key. No problem.
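    If I did finish the on-disk version, a minimal sketch of that handling might look something like this (assuming the UNIQUE constraint on the address columns is in place; 2627 is the duplicate key violation error number):

    BEGIN TRY
        INSERT  INTO [Customers].[CustomerAddress] ( [CustomerId], [Line1], [Line2], [City], [ZipCode], [USStateCode], [AddressTypeCode] )
        VALUES  ( @CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode )
    END TRY
    BEGIN CATCH
        --2627 = duplicate key violation; another connection won the race, so fall
        --through and refetch the existing CustomerAddressId below
        IF ERROR_NUMBER() <> 2627
            THROW;
    END CATCH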

    When I changed my table to an in-mem table (I will talk more about the actual structure in a blog or two), I immediately got duplicated data. Why? Three reasons.

    1. The in-memory code ran 300% faster with little optimization.

    2. No UNIQUE key constraint. I put a hash index on the CustomerId so the WHERE clause would use a simple seek operation, but with only a PRIMARY KEY constraint, there was nothing to stop the duplicates.

    3. The lock-free, optimistic concurrency control. (I won’t go into too much detail, but read this if you haven’t yet: http://sqlblog.com/blogs/kalen_delaney/archive/2013/10/20/sql-server-2014-in-memory-oltp-hekaton-whitepaper-for-ctp2.aspx)

    Each of these factors figured into the issue. Since the code is faster, it is more likely that we will have collisions. This was certainly a factor, as in my first tests, I forgot to index the customerId, and the code ran twice as slow as the on-disk version, and there were no collisions.

    Without a uniqueness constraint, the rows will get created with no issue, even if you accidentally get duplicates to create. The lack of constraints is one of my least favorite parts of the whole in-memory structures.
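    To make the limitation concrete, here is a rough sketch of the sort of declaration I am talking about (columns abbreviated, bucket counts purely illustrative): in 2014 the PRIMARY KEY is the only uniqueness enforcement you get, and neither the secondary index on CustomerId nor anything on the address columns can be declared UNIQUE.

    CREATE TABLE Customers.CustomerAddress
    (
        CustomerAddressId INT IDENTITY(1,1) NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
        CustomerId INT NOT NULL,
        Line1 NVARCHAR(30) NULL,
        City NVARCHAR(30) NULL,
        --...remaining address columns left out of the sketch...
        INDEX IX_CustomerAddress_CustomerId
            NONCLUSTERED HASH (CustomerId) WITH (BUCKET_COUNT = 100000)
        --no way in 2014 to add a UNIQUE constraint or index on the address columns
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
    go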

    Using Multi-Version Concurrency Control (MVCC), even while the row is being inserted (or really, as long as the transaction has not been committed), other connections can execute the search for the address and get back no results. In the default isolation level, access is in the basic SNAPSHOT isolation level. The second connection sees the table as it was when its transaction started. Even if I escalated to REPEATABLE READ or SERIALIZABLE, it wouldn’t eliminate duplication, since the new row’s uniqueness is defined by a surrogate key and there would technically be no collision. Is using a surrogate key the best solution? It isn’t feeling like it for these tables, but you know what, this is the norm for most people. I could easily just put the primary key on the natural key here, and let the identity be the possible (and far less likely) duplication point. But that seems like cheating, and what if a table has 2 natural key choices, or 3, or more? You would be stuck with this issue again (and it will not be easy to solve).

    Fortunately, this code here is going to fail IF multiple rows are committed by the time it executes:

    SET @customerAddressId = (SELECT CustomerAddressId
                                      FROM  Customers.CustomerAddress
                                      WHERE  CustomerId = @CustomerId
                                        AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                        AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                        AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                        AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                        AND  USStateCode = @USStateCode
                                        AND  AddressTypeCode = @AddressTypeCode )

    This is because a scalar subquery can only return 1 value without raising an error. But with MVCC, this too could be an issue if we want to put it in a transaction and roll back on an error. Sadly, however, it will not be so easy to rely on, because it will only fail if the transaction with the insert has already been committed.

    As I start to think about how to solve this problem, the solution is going to have to be very different than with disk based tables. There, if I threw a transaction around the search and the insert (and possibly a few locking hints to block multiple readers), I could stop the duplication easily. However, in all of the MVCC isolation levels, readers are never blocked. I could use an application lock to single thread the inserts, but that would definitely not be the most efficient operation, and it would not work at all in natively compiled code.
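    For reference, a sketch of the application lock approach I am dismissing (interpreted code only; the resource name is just illustrative):

    BEGIN TRANSACTION

    --single-thread the search-then-insert by taking an exclusive, transaction-owned lock
    EXEC sp_getapplock @Resource = 'Customers.CustomerAddress$Insert',
                       @LockMode = 'Exclusive',
                       @LockOwner = 'Transaction'

    --...search for the address, and insert it only if it is missing...

    COMMIT TRANSACTION --the transaction-owned application lock is released here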

    In reality, there is another more common solution (as I expect a lot of people wouldn’t even have the alternate key on the address to start with). Just give in and accept the duplicates as part of high performance business. We can minimize the damage by changing the two subqueries to:

    SET @customerAddressId = (SELECT MIN(CustomerAddressId)

    So we always try to use the first one. With a low cardinality table like a customer’s address, you will only get back a few rows at most, so the aggregate isn’t going to be terribly costly. And you build cleanup routines to reallocate duplicated addresses. This is something that you will need to do with the data in any case, since it is altogether possible that the user types '9 Gordon Hwy' instead of '9 Gordon Highway', and you get duplicates you need to clean up anyhow. My current time to enter the 40,000 rows across 4 connections using the command line SQLCMD interface (the files crashed SSMS!) is around 12 seconds.
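    A sketch of the kind of cleanup query I have in mind, just to identify the duplicate sets (reallocating any child rows to the surviving address is the real work):

    --find exact-duplicate addresses, treating the MIN CustomerAddressId as the survivor
    SELECT CustomerId, Line1, Line2, City, ZipCode, USStateCode, AddressTypeCode,
           MIN(CustomerAddressId) AS SurvivingCustomerAddressId,
           COUNT(*) AS DuplicateCount
    FROM   Customers.CustomerAddress
    GROUP  BY CustomerId, Line1, Line2, City, ZipCode, USStateCode, AddressTypeCode
    HAVING COUNT(*) > 1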

    When I get back to refining the code tomorrow, this is going to be my new tack for this (and perhaps the customer create) procedure. Minimize duplication, but go ahead and allow it. That problem is for the ETL Architect to deal with. (Now if much of my daily job wasn’t writing ETL, that would actually be comforting.)

  • How In-Memory Database Objects Affect Database Design: Or does it?

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this blog's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    Before I start describing what I have discovered so far along the way, let's make a few educated guesses. I have my data model from the previous blog entry (here), and I have a plan of attack for entering data. Basically, I plan to take the "normal" way that people deal with data like this, and enter data into each table separately, outside of an overarching transaction.

    So I will create (or retrieve) the Customer first, then the CustomerAddress, then I will create the SalesOrder and however many SalesOrderItems the customer (or my data generation using RedGate's Data Generator!) wants. I may dress it up over time, but for my initial experiments, this is the plan.

    For a database with UNIQUE constraints on alternate keys, and FOREIGN KEY constraints on relationships, this is all pretty easy and safe. I know I can't duplicate a customer, or violate foreign key constraints. How this will play out in the code is still just a bunch of guesses, with my only goal basically being to not have the ETL architect trying to decide if running me over with a car is punishment enough for introducing more bad data to deal with.

    But the title of the session ends in "...Database Design". The code isn't database design. Rather, as an architect, it is important to realize that the database design affects the code. So will we need different data structures, or will the normal normalized model suffice? My initial feeling is that the logical model doesn't change. The actual implementation details (such as the lack of foreign key, unique, and check constraints) will be a difference, and the hardware/implementation layer changes immensely, and this affects your code in ways you had best understand.

    I "think" that what I determine will be that the basic data architects output remains rather constant. The rules of normalization (briefly described here), don't change at all. You still design the tables and columns based on the same factors of cardinality you would before. The physical model will be the same, and if your data modeling tool supports the in-memory structures, it is just a matter of choosing which tables should be in-mem and which don't really merit it, followed by a lot of testing (a lot).

    For my first tests, I only made the four "hot" tables (Customer, CustomerAddress, SalesOrder, and SalesOrderItem) in-mem.

     

    This design will work fine when dealing with interop code, even if I want to implement data integrity checks against the domain tables. If I want to use native code, then all of the tables used will need to be in-memory. Are data integrity checks that important? Well, they are to me, but not so much to a lot of folks who trust their other layers to get it right. My scenario, where all of the domain tables are "static", enables this to work just fine. As long as ProductType never changes, and the other code layers have only the right values, you can easily say "this works" (as long as it does... and your code has been tested for what happens if anything crashes on any given statement to the database... which is not an easy task).
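    To illustrate what a manual data integrity check might look like, here is a minimal sketch of a fragment that could sit at the top of a natively compiled insert procedure (Customers.AddressType is a hypothetical in-memory domain table, standing in for the foreign key constraint the table cannot have):

    DECLARE @domainRowCount INT;
    SELECT @domainRowCount = COUNT(*)
    FROM   Customers.AddressType
    WHERE  AddressTypeCode = @AddressTypeCode;

    --no FOREIGN KEY constraints on in-memory tables in 2014, so fail the call manually
    IF @domainRowCount = 0
        THROW 50000, 'Invalid AddressTypeCode value', 1;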

    In my next post, I will share at least one of the effects the change to in-memory code has had on the code I am using, and what I am considering doing about it.

  • How In-Memory Database Objects Affect Database Design: The Logical and Physical Models

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this blog's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    Now that I have the conceptual model I want to work with (and a bunch of "life" tasks out of the way), I now head into fleshing out the model. I go ahead and add attributes, along with domains that I will use to build my actual database. The model continues on with a minimalist approach, just the basics that are needed, nothing more, nothing less: some types, names, etc.

     Logical Model

    As this is a very simple model, I won't drag it out any more than I need to, and I am going to go ahead and make this a physical model:

     Physical Model

    Basically, I set datatypes, added domain tables for some of the simple domains (to make doing data checking easier when (if?) I build manual foreign keys in the code, since in-memory tables don't support the concept), and added row create and modify times to every editable table (the four tables Customer, CustomerAddress, SalesOrder, and SalesOrderItem that I will be loading as part of my stress testing). Nothing too fancy, but at the same time, not so simplistic that I felt like it would lose any and all meaning. The other tables are not really "transactional", so I will be loading them all at once as a configuration step.

    My not adding physical columns like row modify times to those tables isn't a suggestion that they aren't needed at all in a real system (who added that $.10 fare from Boston to LA?), but rather an indication that I wasn't planning on dealing with that here. The four major tables will be loaded at full speed ahead, and as fast as we can take orders from new and existing customers, we will. All other data will simply be configuration data, for me.

    On the other hand, if I were going to do a full day test, a process that might be interesting to include would be a price hike. Who chooses the price? The database code, or the front end? If the price changes while the client has the old price in hand, should the transaction fail and say "Sorry, our prices went up, try again! Thank you, come again." or just accept the previous price? And then, how much time might acceptably pass before the fail notice comes up? Could that be hacked by a crafty 10 year old to get pricing from a year ago? Discussions like this are why actual design sessions take simply forever to complete, and are very boring to people who strictly code... It is also why we are avoiding the topic here, for the most part.

    In terms of our implementation, some of the more interesting columns to deal with are going to be the CustomerNumber and the SalesOrderNumber. Generating unique values for customer consumption is a fairly simple concept, but it will change when we start working with in-memory code, certainly for the 2014 version.

    The next steps are already underway (though I am still looking for interesting things to blog about along the way). In the next entry in this session-writing process, I will discuss my methods of testing the structures, the issues I have seen along the way, and anything else that comes up.

     

  • Speaking this weekend at SQL Saturday 286 - Louisville

    Wow, where does the time go? Hope I see every reader there in Louisville this weekend for a lovely time. Say you live in Alaska and it would be too far to go in one day? Hey, we all have our problems (and don't tell anyone, but I am going to try to finish up video versions of these sessions by the end of the year.. shh.)

    I will be doing my sessions on Database Design Fundamentals and How to Write a DML Trigger, both sessions I have done quite often of late, and I really like how they work now. I have a bit of a reputation as a slide/code tinkerer and this time is no different, but the biggest thing I have settled on is how I deal with presentations like these, where honestly the subject is more than a 1 hour session can handle. Homework.

    I will cover the fundamentals of whatever topic in slides, and then demo as much code as I have time for, but the rest is homework. I got the idea when I was prepping to speak to the Richmond User Group, as I was flip flopping between showing some slides and showing code. I realized that the slides would explain the fundamentals better than slogging through the internals of the code directly, and that when I attend sessions with a lot of code, all I am sitting there thinking is "man, I want that code". So I try to comment the code to make it self explanatory, run the code before each presentation, enhance it as I have time to, show you enough of the code to get you started, and then give it to you to play with (and if you want to pick it apart, email me at drsql@hotmail.com, I LOVE constructive criticism).

    I post all of my slides and code on my website (http://drsql.org) not just because it is part of a presentation, or to make myself feel cool, but mostly so I can reference it as I need it. I use my code snippets all of the time when coding, and I try to keep them up to the version I am using (or writing about as it were.) So hopefully, I see you and get to explain the fundamentals, then the (rather wordy at times) slides are there for reference, and the code is there to get you started practicing on your own. Homework.

    The abstracts for this weekend's presentations: 

    Database Design Fundamentals

    In this session I will give an overview of how to design a database, including the common normal forms and why they should matter to you if you are creating or modifying SQL Server databases. Data should be easy to work with in SQL Server if the database has been organized as closely as possible to the standards of normalization that have been proven for many years. Many common T-SQL programming "difficulties" are the result of struggling against the way data should be structured, can be avoided by applying the basic normalization techniques, and are things that you find yourself struggling with time and again (e.g. using the SUBSTRING function in a WHERE clause, meaning you can't use an index efficiently).
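    As a quick illustration of that last point, consider the difference between these two filters (a sketch using the Customer table from my other posts; the point is the shape of the predicate, not the specific table):

    --non-sargable: wrapping the column in SUBSTRING forces a scan of every row
    SELECT CustomerId FROM Customers.Customer WHERE SUBSTRING(CustomerNumber, 1, 2) = 'CU'

    --sargable rewrite: a LIKE with a constant prefix can seek on an index on CustomerNumber
    SELECT CustomerId FROM Customers.Customer WHERE CustomerNumber LIKE 'CU%'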

    How to Write a DML Trigger

    Triggers are extremely powerful and useful (if somewhat rarely needed) objects that are coded very similarly to a common stored procedure. Yet for all their similarity, there are some very important differences that need to be understood. In this session, I will walk through what goes into writing a robust DML trigger, starting with a simple version of a trigger, and working through some very useful applications of DML triggers.
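    (As a tiny taste of the starting point, a minimal sketch of the kind of "simple version" I mean is below; the table and the validation rule are purely illustrative.)

    CREATE TRIGGER Sales.SalesOrderItem$InsertTrigger
    ON Sales.SalesOrderItem
    AFTER INSERT
    AS
    BEGIN
        SET NOCOUNT ON;

        --simple multi-row validation: no non-positive quantities allowed
        IF EXISTS (SELECT * FROM inserted WHERE Quantity <= 0)
        BEGIN
            ROLLBACK TRANSACTION;
            THROW 50000, 'Quantity must be greater than zero', 1;
        END
    END
    GO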

     

  • Speaking this week at Richmond SQL Server User Group

    Thursday night, at 6:00 (or so) I will be speaking in Richmond (http://richmondsql.org/cs2007/ ), talking about How to Implement a Hierarchy using SQL Server. The abstract is:

    One of the most common structures you will come across in the real world is a hierarchy (either a single parent "tree" or a multi-parent "graph"). Many systems will implement the obvious examples, such as a corporate managerial structure or a bill of materials. It turns out that almost any many-to-many relationship can be used as a hierarchy to find the relationship of parent rows to child rows (for example, the relationship of actors to movies they’ve been in). In this session, we’ll discuss several different implementations of hierarchies that you can use to optimize your hierarchy implementations and put into practice immediately, with ready-made T-SQL examples.
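    To give a flavor of the starting point, here is a minimal sketch of the most common implementation, the adjacency list, along with a recursive CTE that walks the tree (the table and names are purely illustrative):

    CREATE TABLE dbo.Employee
    (
        EmployeeId        INT          NOT NULL PRIMARY KEY,
        Name              NVARCHAR(60) NOT NULL,
        ManagerEmployeeId INT          NULL REFERENCES dbo.Employee (EmployeeId) --NULL = root of the tree
    )
    GO

    --expand the tree from the root down, tracking the depth of each node
    WITH EmployeeHierarchy AS
    (
        SELECT EmployeeId, Name, ManagerEmployeeId, 1 AS Level
        FROM   dbo.Employee
        WHERE  ManagerEmployeeId IS NULL

        UNION ALL

        SELECT e.EmployeeId, e.Name, e.ManagerEmployeeId, eh.Level + 1
        FROM   dbo.Employee AS e
               JOIN EmployeeHierarchy AS eh
                   ON e.ManagerEmployeeId = eh.EmployeeId
    )
    SELECT EmployeeId, Name, Level
    FROM   EmployeeHierarchy
    GO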

    Hierarchies are one of my favorite "fun" topics, as they are interesting for a few reasons. First, they have very common usages that most people come across, and second, they make for fairly interesting example code and performance testing. In my slides (and downloads), I will have examples where I generate several trees, including ones of 3,400, 41,000, and 52,000 nodes, and even a 1.2 million node tree (which, using my slightly slow generator, took something like 16 hours on my i7 laptop), along with 5 rows of sales data for every root node of the tree. It is kind of interesting to me to see how well the different tree implementations behave using each sized tree. I may even get a chance this week to toss the tree structures into in-memory tables and check their performance (but if not, it certainly will be included in what I am going to announce in the next paragraph).

    The only downside is that (not unlike most of my presentations) I have way too much material for 1 hour (or even 2). So I will be working in the near future (hopefully by Devlink) to put out my director's cut videos of this and several other presentations I have that are just too unwieldy for a non-precon sized session. I will officially announce this effort soon (along with a realistic schedule!), but it has been beset by many life issues. I had a few weeks set aside for this task, but the weekend I sat down to record videos, I got sick and have had to push things back.

    However, all of the code will be available for download, and my email address is no secret (drsql@Hotmail.com). I am always happy (if sometimes slow) to answer questions, take criticism, or accept PayPal payments at that address, so feel free to do any of those with the code when it is posted at http://www.drsql.org/Pages/Presentations.aspx


  • How In-Memory Database Objects Affect Database Design: The Conceptual Model

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this blog's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    After a rather long break in the action to get through some heavy tech editing work (paid work before blogging, I always say!) it is time to start working on this presentation about In-Memory Databases. I have been trying to decide on the scope of the demo code in the back of my head, and I have added more and taken away bits and pieces over time trying to find the balance of "enough" complexity to show data integrity issues and joins, but not so much that we get lost in the process of trying to actually get rows generated for display.

    To that end, I came up with the following basic model:  

     Conceptual Model

    We will define customers, and define a simple US address with a domain set only for US States (my client here can only sell to the US in any case). We will define Products, and over time, the price can change. I had at one time considered including discounts and offer codes, but it all started to seem to get way far away from what was necessary to complete the task at hand. Start with a set of tables using on-disk structures pre-loaded with a few thousand rows of data, then several more thousand transactions, and look at the performance. Then morph the design through a few cycles (both with and without RI, with and without stored procedures). In the end, the hot spot of my design will be two-fold:

    1. Looking up and creating new customers (will I duplicate orders from the same customer? Probably; I figure I may use AdventureWorks data for the loading, though I am not 100% sure.)

    2. Creating new orders

    The rest of the process would just be tedious and harder to visualize for the attendees (and I will have 1:15 at Devlink, and 1 hour if either of the two SQL Saturdays picks up the session, and that isn't much time).

    If you think this is (or isn't) adequate, I am interested to hear from you.

    The next blog will be the Logical Model, where I will start to fill out the design. I will use these diagrams in the session to demonstrate the design, and it is the process I do anytime I am given a database to design (other than the fact that I get to adjust the requirements to meet the amount of work I want to do!)


     

  • Speaking this weekend at SQL Saturday 277 - Richmond

     

    One of my favorite locations to speak or attend is when Richmond has a SQL Saturday (though if you are an organizer of another SQL Saturday I have submitted to, note that I said "one of my favorites" :)). This will be the third time I have gone to Richmond. I like it for several reasons:

     

    • The people - I have coworkers coming up from Virginia Beach to attend, and from Maryland, and our leader lives in Richmond; I have a cowriter who is also speaking (Jessica Moss), and other good friends who are amongst the speakers and organizers
    • The location - Richmond is a cool city to visit (even if it is just for a day,) and we are having a team gathering after the event
    • The trip - I am driving in from Nashville, starting out tomorrow night, taking the slow road with time to stop and smell the roses. On the way back I am taking a vacation day and stopping by Dollywood, so that is awesome.

     

    Of course, none of this fits into the "why should *you* care that I will be there" category. I will be speaking on Triggers, a subject that I equally love and loathe. Love because they are awesomely powerful tools to help manage data integrity. I loathe them because they are so misused by many people. That was why I initially put together this abstract.

     

    How to Write a DML Trigger

    Triggers are extremely powerful and useful (if somewhat rarely needed) objects that are coded very similarly to a common stored procedure. Yet for all their similarity, there are some very important differences that need to be understood. In this session, I will walk through what goes into writing a robust DML trigger, starting with a simple version of a trigger, and working through some very useful applications of DML triggers.

     

    It is set at the intermediate level so I have done some editing of the introductory material, and will focus more on several sets of examples with plenty of code to download when you head back home and need to write your own trigger.

     

    Will I see you there?

  • How In-Memory Database Objects Affect Database Design: Getting Started

    Weird, somehow I lost text in the middle of this post after it was posted... I created it a few days back and set it to post later... Corrected.

    So I put in the following abstract for Devlink (and may for SQL Saturday Orlando and/or PASS Summit.) I don't know if I will get accepted, but I am pretty excited to work on this topic one way or another...

    "With SQL Server 2014, Microsoft has added a major new feature to help optimize OLTP database implementations by persisting your data primarily in RAM. Of course it isn't that simple, internally everything that uses this new feature is completely new.  While the internals of this feature may be foreign to you, accessing the data that uses the structures very much resembles T-SQL as you already know it. As such, the first important question for the average developer will be how to adapt an existing application to make use of the technology to achieve enhanced performance. In this session, I will start with a normalized database, and adapt the logical and physical database model/implementation in several manners, performance testing the tables and code changes along the way. "

    My plan is to start with a moderately straightforward table design, and implement a set of scripts that create a bunch of data. Starting out with normal on-disk tables using both ad-hoc calls and stored procedures, I will morph the design from just changing some of the tables to in-memory, then change the procedures to use in-memory wherever possible and test any changes we need to make to the physical structures. Another aspect to the process will be errant data. Part of the experiment will be to have a smattering of bad data to simulate real live programmers and users doing real live stuff.

    I think I know the main content already, but I expect to pick up a lot of knowledge between now and then (which I will attribute, naturally!). I have at least 5 months before I would need to present it, so who knows if I will change my mind. Along the way, I will blog semi-frequently about the progress, including what I have learned, code samples, philosophies, etc.

     

     

  • Abstract Submission Day for Speaking 2014…

    Today is my least favorite part of the whole speaking thing. It is the day that I put myself out there and wait to see if I get rejected or accepted. Rejection stinks, but at the same time, being accepted means more work. Neither are perfect, and as I stated in my last post, I also have a goal of not getting myself stuck spending 7 hours a day doing work outside of my day job.

     

    I plan to put in abstracts to most of the conferences I expect to attend this year. And when I say attend, I mean I plan to attend pretty much regardless of whether my submissions are accepted.  I have just as much desire to learn as I have to teach, and accumulating knowledge for my ongoing writing efforts can be just as valuable as speaking. (The only downside being that when you don't speak, you don't get to go the speaker dinner, which can be a lot of fun, particularly at a SQL Saturday event).

     

    My goal for this year is to not get myself stuck writing a lot of new sessions. I do have a few sessions that are "done" that I would just have to update a bit that I may submit:

     

    Fundamentals of Relational Database Design - Just what it sounds like, really. Normalization, documentation, etc.

     

    DML Triggers - The basics of writing triggers with examples. I am doing this session in Richmond in 3 weeks at SQL Saturday.

     

    Hierarchies - A deep session that covers the different ways a hierarchy can be implemented in SQL, without favoring one method over another in any manner other than empirically. I plan to base a book on the subject later this year.

     

    Sequences - All about the sequence object and its performance characteristics. Not a big hit in the past, since 2012 adoption hadn't quite gotten there when I first did it; perhaps its time has come.

     

    Finally, I do have a new session that I am going to put in and start writing in my spare time. 

     

    How In-Memory Database Objects Affect Database Design - In this session, I will be demonstrating a normalized database design using disk based tables, and then  adjusting the design to make best use of in memory objects. To demonstrate, I will use test data that will simulate multiple connections of orders being taken.

     

    In the biggest change of my writing /speaking career, I plan to start writing this presentation way before it is accepted. I will be starting on this presentation immediately in my free time, blogging about it as I go.  I also have a lot to learn about in memory database structures, so this should reinforce the learning I am and will be doing (more about all of this when I write my introductory blog about it).  Even if it never gets presented, I will certainly be using the material when I put together my next database design book, whenever that turns out to be. 

     

    Unless someone just really wants me to (and I don't know the meaning of the word "no"), I doubt I will put in a pre-con session this year. It is a lot of work and makes PASS a crazy time of year. Realistically, I may just attend a pre-con if one strikes my fancy. I am in the middle of building a T-SQL-from-the-ground-up pre-con (or larger) class that I am doing for my company (with rights to share secured in the process).

     

    Other than Atlanta, which has a deadline of Tuesday (March 3) for submission, I have a bit of time to change my mind and add/subtract an abstract, but I think that I am going to stick with:

     

    SQL Saturday Atlanta - May 3 - Hierarchies, Triggers, Database Design

    SQL Saturday Louisville - Jun 21 - Hierarchies, Triggers, Database Design

    Devlink - Aug 27 - In Memory DB, Hierarchies

    SQL PASS Summit - Nov 4 - In Memory DB, Hierarchies, Triggers, Database Design , Sequences

     

    Now nothing to do but sit back and wait... And prepare the In-Mem session. And hey, if none are chosen, it will give me something else to blog about!

  • Holy Cow, It has been a while

     

    Well, it has been over a year since my last insanely optimistic blog about the next year, but this year it is going to be (actually) different (not like the last time I said it would be different).

     

    Usually, like pretty much everyone on earth, I overestimate just how much I can get done in the following twelve months by some impressive amount. Writing, speaking, blogging, exercising, etc etc. Not this year. I am specifically going the other direction, saying I am doing less. Specifically more writing and less of everything else, particularly things with oppressive deadlines that preclude personal activities like exercising, eating right, going to Disney World, seeing the kid and her kids, and perhaps most importantly, sleeping!

     

    (Ironically it is 12:35 am as I write this. Yeah, I will start tomorrow…ish)

     

    I am not giving up speaking entirely, as I do love the challenge and teaching people.  As such, I do expect to put in sessions for Devlink, Atlanta, Richmond, Louisville, for sure, plus probably PASS.  But my goal is to not do what I did last year, which was to put in a bunch of new sessions that get chosen which I then have to write (the fun part), and then practice to death until I am ready to present in front of 5-100 people (the not fun part) and then travel to the conference and speak (the less fun than writing but more fun than sitting at home wishing I was at the conference part.) Of course, I may just pop down to a conference or two and just attend and learn.  That might be fun too.

     

    The biggest thing is that I plan to write more steadily again. This blog should see more life, and I will try to maintain a healthy stream of articles on my What Counts for a DBA Simple-Talk blog. The What Counts blog is fun because I have to really stretch myself to fill in a topic in a way that is both interesting and informative. Sometimes it works great, and sometimes the editor smacks me in the mouth and makes me start again. In contrast, this blog is great because I do what I want with no editor but myself, though I am annoying to write for at times. With an editor, I can trust that someone who isn't making a bunch of in-jokes to himself has looked over the blog.

     

    Along the lines of writing, another challenge I love is technical editing. A great way to sharpen your knowledge of a subject is to tech edit a book/article/presentation. It is very much like writing a book, without the need for so much creativity. Just like when you write and have to fill in the blanks you don't know/haven't done before, you have to do the same in tech editing. Everything you don't know, you need to look up and validate with other authorities, leading you to learn more and more. And even when you do know everything in the material, for those hours you spend, you take on a role that is incredibly important to the author: make sure that the material is correct, and help fill in the blanks. I know I have learned a lot from my tech editors over the years, because some preconceived notions you have are possibly wrong.

     

    Writing is what I really like to do, because it can be done at my pace. I have 20+ ideas in my OneNote list just to the right of where I am writing, and I just have to fill in the blanks (notably the area between the title and the end of the blog, but that is just work). I have another 50 ideas for the What Counts blog, and possibly a book idea that I will be working on this year.

     

    I am definitely getting ready to wind back up the Why We Write series, something I paused around PASS time last year when family and health stuff going on precluded me from keeping on. I got 9 entries in last year (plus one where I interviewed myself), and I would like to get more this year.

     

    Lastly, I do intend to get back to my roots in the SQL Server forums, something I keep promising I will do and keep getting dragged away from. I really miss the interactive nature of the process of solving immediate problems. Yet, the biggest issue there is that my interest lies in problem prevention by proper design, and that is very hard to do. Even in my day job, too often the situation dictates that things are done subpar due to time constraints, or code constraints, or tool constraints (ironically, this often leads to a lack of constraints and a surplus of data issues :))

  • SQL in the City (Charlotte) Wrap Up

    Ok, it has been quite a while since the event, two weeks and a day to be exact, but I needed a rest before hitting Windows Live Writer again. Speaking is exhausting, traveling is exhausting, and well, I replaced my laptop and had to get all of my software back together. (Between Windows 8.1 sync features, Dropbox and Skydrive, it has never been easier…but I digress.)

    There are plenty of great vendors out there, but one of my favorites has always been Red-Gate. I have written half of a book with them, have a blog series on What Counts for a DBA, and have convinced at least one company (the one I work for!) to purchase their tool belt product for all of its developers and DBAs. I have also spoken on their behalf a few times, this time being the most fun.

    If you haven't gotten a chance to attend one of their events yet (full list here), it is a highly recommended experience. It is very much like a SQL Saturday, except that Red-Gate has a better budget than most, so the venue and food are at least a wee bit better. At a minimum you generally get two of my favorite speakers, Steve Jones and Grant Fritchey, plus lots of the Red Gate team looking to get and give feedback about their tools. There were two concurrent session tracks, and about half of the sessions were at least somewhat Red Gate tools specific. I was personally interested in the Red Gate sessions, so I missed the sessions by Nigel Sammy, Kevin Boles, Mickey Steuwe, and Kevin Hazzard.

    Of course, the one non-Red Gate centric session I attended was the one I gave. I was a headline speaker for the conference, which I must admit was very gratifying. I try to seem very humble, but really I have a raging ego that must be served. But seriously, it was very much an honor to have my name just under Steve and Grant, who are extremely good, very natural speakers. Lucky for me, they picked my favorite session from the list of abstracts I submitted: Fundamentals of Relational Database Design (slides available here). Between this being a free event, (what I thought was) no recording, a wonderful, energetic audience who were into giving feedback (like the fact that the user is what matters when designing a database!), and a delightful lunch, I had a very nice session indeed. (It was quite well rated even, so for once I felt good about it and everyone seemed to like it at the same time.) I learned later that it actually was being recorded, so if you want to see it again, or for the first time, I will tweet the address once I get it.

    All in all, I just have to say that it was a great time, and next year I hope to at least attend their events if I don't get a chance to speak again.

    Now I just need to get to blogging a wrap up of the PASS Conference!

  • How to Design a Relational Database; PASS Precon Swag and its “Symbolism”

    Update! 10 more books added to the cadre from my friends at Red-Gate.

    With less than a week to go, I am starting to pack up for Charlotte and PASS 2013. I love that it is in Charlotte this year so I can drive and bring along some goodies to give away. Books and toys mostly, most of which were chosen rather specifically for some manner of symbolism tied to database design. (Okay, symbolism is perhaps a bit of a stretch, but I have tied everything, even the goofy stuff, to the meaning of the class!)

    1. 5 printed and 3 ebook copies of my Pro SQL Server 2012 Relational Database Design and Implementation. Well, obviously the symbolism here is that I hope the rest of the class each buys 5 copies and gives them away to their coworkers so my royalties will buy me more than a model airplane.

    2. Lego Architecture Leaning Tower of Pisa – Last year I gave away the Farnsworth (glass) House, because we database folks aren't perfect either and shouldn't throw stones. (Luckily, none of us actually live in glass houses!) The Leaning Tower of Pisa reminds me that not every masterpiece is perfect. And well, you really need to start with a good foundation!

    3. Lego DeLorean Time Machine – Because, if you had a time machine, up-front database design wouldn't be all that necessary. Just go back and fix the database from the start, without rework. Of course, we don't have time machines, so the more right you get it the first time, the better. (And it is a really cool set!)

    4. Lego Space Needle – For anyone pining for Seattle, they can get this and put it in their hotel room and pretend it is next year.

    5. 5 copies of the book The Imagineering Workout – A book by some Disney Imagineers about how to stretch and use your imagination. Imagination is very important when you are designing a database (or any code). Compare the vision of the engineers of early computers ("no one will want one of these in their house") to that of science fiction writers (a wrist radio, eh Dick Tracy?): who was more right? Neither was particularly right, but look at the companies that flourish; they have a great mix of technology, customer service, and imagination on their side.

    As architects, being able to envision the final product in action is of great help. Being able to write and execute code mentally means you don't have to write a lot of actual code to test out every idea you have. Not that your mental query processor is going to find every architecture error ahead of time, but the more it eliminates, the better.

    6. 1 copy of MVP Deep Dives 1 – A great book by a lot of great authors. I had an extra copy in my swag stash, so it is coming along too.  There is a chapter or two on database design in there penned by myself and Paul Nielsen, so it fits!

    7. Lego Star Wars X-Wing Fighter – Well, no deep symbolism here, and I don't mention Star Wars in the slides, but Star Wars is cool, and a job (or class) in technology ought to be fun. (And don't start a debate about the prequels in the comments. The prequels were not as great as the original trilogy, but I still like them and the Clone Wars series! A bad day in the Star Wars universe is better than a day at work in any case.)

    8. 2 Lego Brickleys – I use pictures of Brickley, the Lego dragon, to illustrate early in the session how the amount of design and requirements work scales with the size of the project. Brickley is a great example to me because he comes in three sizes: this tiny one, one an order of magnitude greater (in number of pieces), and the “life”-sized version at Downtown Disney in Orlando. The process of designing each, and then explaining to someone else how to build it, is very similar, but the time and documentation investment is very, very different.

    9. A Lego set of a demolition man blowing something up – Sometimes you just have to blow things up and start afresh. Many databases out there make Frankenstein's monster seem like a very natural creature, with bits and pieces bolted onto something that ought to have been dead for years.

    10. 4 small Lego “Swamp Creature” sets – This set is really in there for two reasons: 1) because this is October, and 2) to make a simply awful joke that (if you get this set and put it on your desk) should remind you, via a bit of rhyme association, what the job of the data architect really is.

    11. A copy of Performance Tuning with SQL Server Dynamic Management Views, my book with Tim Ford for Red-Gate, because in 7 and a half hours (less lunch and breaks) we just won't have time to spend much on the physical side of things.

    12. A copy of The Manga Guide to Databases. A fun book with some “interesting” art, but with a lot of decent explanations of database design principles. In some ways it is very comparable to what I want to do: strip away the complexities of theory and leave only the practical application. Of course, it is a lot lower level and feels like it was written for a 12-year-old, but it was kind of fun too.

    13. 5 copies of SQL Server Transaction Log Management – I have read half of this book by Gail Shaw and Tony Davis, and it is excellent. I learned quite a few things about the transaction log that I didn't know (nor had ever really thought about :)

    14. 5 copies of SQL Server Concurrency – A book by one of my SQL heroes, Kalen Delaney. Wish she was going to be here at the Summit this year.

    So while it does in fact look like just a bunch of toys and books (for good reason, actually, since that is what it is), all of the items were chosen for reasons beyond just being cool or fun, in hopes that the person who ends up taking each one home will flash back to the class whenever they see it. Am I spoiling the fun by sharing the list before the session? Maybe, but it will help me determine how many people read the blog about such things, and, well, it will keep me from changing my mind on #3 and #7 (both sets I haven't bought for myself yet and really do quite want).

    So I hope to see you next Tuesday, and perhaps you will get something cool to ship home, because your luggage is already too full as it is!
