|
|
|
|
-
Haven’t posted anything useful in a while, so I figured I should. This was something I was building for a coworker the other day, and I find the best way to flesh out a solution is to think about how it will work for the thousands who might see this post, and the hundreds that might actually read it. My next post figures to be an updating of my calendar table queries here on my sqlblog.com blog (since I have pretty much abandoned my spaces blog for now. Another new feature I am going to start is to include a download of the pertinent source code for posts like this. So the create table and insert statement will be in a link at the end of the post. One of the posts that I am kind of proud of was my initial post on how to build a date table here. I use that script any time I want to create a date dimension for a warehouse, or for some other application. The last use was a time entry system, where I extended it to include work_week, which instead of being Saturday - Sunday, as a normal week is, is defined as Monday - Sunday (Yeah, that way extra time on the weekend is part of the same week.) But I digress. Recently, the need came up to do groupings of data on the hour of the day, for call center tracking. When I heard the person who would be doing the coding start to say something about using datepart and such, I said, "I know a better way". If you can calculate the number of minutes past midnight, then I could do the rest with relative ease, much like their current experience with the date dimension. In the end, the hardest part of using the date dimension is the calculating of seconds past midnight. Not hard so much that it is really difficult, but hard to explain. Basically you use the old stripping the time off of the datetime trick: DATEADD(DAY, 0, DATEDIFF(DAY, 0, <referencedColumnName>)) This is a neat trick, datediff(day,0 returns a number of days since the base date for the type, either smalldatetime or datetime, and dateadd(day,0 converts the integer to a date value. Quite fast, and you end up with a datetype without having to cast the value (I have to explain this to everyone I introduce it to, so if you already got it, I am sorry). Then use datediff to see how many seconds have passed since midnight: DATEDIFF(mi,DATEADD(DAY, 0, DATEDIFF(DAY, 0, <referencedColumnName>)), <referencedColumnName>) Simple enough, really, just a bit untidy to type. In reality, if this is a commonly used value for a table (particularly if you have a datetime value in one of your dimension tables), you can just add a computed column to your dimension (probably essential to use the PERSISTED keyword to ensure it is only calculated once). alter table <tableName> add <columnName> as (datediff(mi,DATEADD(DAY, 0, DATEDIFF(DAY, 0, <referencedColumnName>)), <referencedColumnName>)) persisted Now, the basics are a table that has a key of the number of seconds past midnight, with the types of attributes you might typically calculate, as usual, it is very likely that you will have some more you might want, or less, depending on your organization. Like if everyone had a fixed lunch time, you could add lunch_time, and include the minutes of the lunch time in it. Then you could compare, say internet bytes/second during lunch versus work hours. First create your table. I have called it time_of_day, using my typical naming convention of having reporting tables named with underscores. I have commented on each column as to its meaning. set nocount on go create table time_of_day ( time_of_day_key smallint primary key, hour_of_day_24 tinyint, --0-23, military/European time hour_of_day_12 tinyint, --1-12, repeating for AM/PM, for us American types am_pm char(2), --AM/PM minute_of_hour tinyint, --the minute of the hour, reset at the top of each hour. 0-59 half_hour tinyint, --1 or 2, if it is the first or second half of the hour half_hour_of_day tinyint, --1-24, incremented at the top of each half hour for the entire day quarter_hour tinyint, --1-4, for each quarter hour quarter_hour_of_day tinyint, --1-48, incremented at the tope of each half hour for the entire day string_representation_24 char(5), --military/European textual representation string_representation_12 char(5) --12 hour clock representation sans AM/PM ) go Then we load the table from --digits gives you a set of 10 numbers 0-9 with digits (i) as( select 1 as i union all select 2 as i union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 union all select 0) --sequence produces a set of integers from 0 - 9999 ,sequence (i) as ( SELECT D1.i + (10*D2.i) + (100*D3.i) + (1000*D4.i) FROM digits AS D1 CROSS JOIN digits AS D2 CROSS JOIN digits AS D3 CROSS JOIN digits AS D4) insert into time_of_day(time_of_day_key, hour_of_day_24, hour_of_day_12, am_pm, minute_of_hour, half_hour, half_hour_of_day, quarter_hour, quarter_hour_of_day, string_representation_24, string_representation_12) --calculates the different values for the time table SELECT i as time_of_day_key ,datepart(hh, dateval) as hour_of_day_24 ,datepart(hh, dateval) % 12 + case when datepart(hh, dateval) % 12 = 0 then 12 else 0 end as hour_of_day_12 ,case when datepart(hh, dateval) between 0 and 11 then 'AM' else 'PM' end as am_pm ,datepart(mi, dateval) AS minute_of_hour ,((i/30) % 2) + 1 AS half_hour --note, I made these next 4 values 1 based, not 0. So the first half hour is 1, the second is 2 ,(i/30) + 1 AS half_hour_of_day --and for the whole day value, they go from ,((i/15) % 4) + 1 AS quarter_hour ,(i/15) + 1 AS quarter_hour_of_day ,right('0' + cast(datepart(hh, dateval) as varchar(2)),2)+ ':' + right('0' + cast(datepart(mi, dateval) as varchar(2)),2) as string_representation_24 ,right('0' + cast(datepart(hh, dateval) % 12 + case when datepart(hh, dateval) % 12 = 0 then 12 else 0 end as varchar(2)),2)+ ':' + right('0' + cast(datepart(mi, dateval) as varchar(2)),2) as string_representation_12 FROM ( SELECT dateadd(minute,i,'20000101') AS dateVal, i FROM sequence AS sequence WHERE i between 0 and 1439 --number of minutes in a day = 1440 ) as dailyMinutes Go You can see the values by using SELECT top 5* FROM time_of_day returns: time_of_day_key hour_of_day_24 hour_of_day_12 am_pm minute_of_hour half_hour half_hour_of_day quarter_hour quarter_hour_of_day string_representation_24 string_representation_12 --------------- -------------- -------------- ----- -------------- --------- ---------------- ------------ ------------------- ------------------------ ------------------------ 0 0 12 AM 0 1 1 1 1 00:00 12:00 1 0 12 AM 1 1 1 1 1 00:01 12:01 2 0 12 AM 2 1 1 1 1 00:02 12:02 3 0 12 AM 3 1 1 1 1 00:03 12:03 4 0 12 AM 4 1 1 1 1 00:04 12:04 To use the values, I will create a test table, in this case to simulate a phone call for a call center: create table testPhoneCall ( TestPhoneCallId int identity primary key, callType varchar(10), callTime datetime --the point in time of the call ) Obviously a real table would have more columns with more descriptive information, but this will do for our purposes. Using the following insert, I will load the table. Basically I am trying to set about 70 percent of calls to incoming, and then at some random time of day, within 2000 days of 20010101. The go 1000 will load the table with 1000 rows. insert into testPhoneCall(callType,callTime) select case when rand()> .7 then 'Incoming' else 'Outgoing' end ,dateadd(minute, rand() * 1439, dateadd(day,(rand() * 2000),('20010101'))) go 1000 Taking a look at the data select top 10 * from testPhoneCall You should see something along the lines of: TestPhoneCallId callType callTime --------------- ---------- ----------------------- 1 Outgoing 2002-10-23 06:46:00.000 2 Outgoing 2004-03-23 14:27:00.000 3 Outgoing 2001-01-13 11:45:00.000 4 Incoming 2002-04-29 03:41:00.000 5 Outgoing 2005-06-30 01:19:00.000 6 Outgoing 2002-06-30 13:15:00.000 7 Outgoing 2005-11-13 06:03:00.000 8 Outgoing 2003-01-01 04:56:00.000 9 Outgoing 2004-09-04 12:26:00.000 10 Outgoing 2005-09-28 08:47:00.000 To use this data with our time_of_day table, we need to get the number of seconds past mighnight the data is. Using the trick to get the date without time: DATEADD(DAY, 0, DATEDIFF(DAY, 0, <datevalue>)), I datediff by minutes the timeless date, and the date value with time. You can do this in a query, or a view, or in my case, I am choosing to add this to the table as a computed column (which is what I am doing with my data warehouse table that sparked this need: alter table testPhoneCall add callTimeOfDay as (datediff(mi,DATEADD(DAY, 0, DATEDIFF(DAY, 0, callTime)), callTime)) persisted --persisted means it calculates at save time, not run time Now, we can join to the time_of_day table with our new callTimeOfDay columm, and find out how many calls were in the AM, and which were in the PM select time_of_day.am_pm, COUNT(*) as numCalls from testPhoneCall join time_of_day on time_of_day.time_of_day_key = testPhoneCall.callTimeOfDay group by time_of_day.am_pm And since we used a random number, it should be about even, and it seems to be: am_pm numCalls ----- ----------- AM 494 PM 506 And you can mix in data from your table, like the call type, and then see what percentage of calls were in the which half of the hour and if they were incoming or outgoing calls: select testPhoneCall.callType, time_of_day.half_hour, COUNT(*) as numCalls from testPhoneCall join time_of_day on time_of_day.time_of_day_key = testPhoneCall.callTimeOfDay group by testPhoneCall.callType, time_of_day.half_hour This returns: callType half_hour numCalls ---------- --------- ----------- Incoming 1 157 Outgoing 1 352 Incoming 2 155 Outgoing 2 336 It is a simple as that. Keep in mind that using a table to do time calculations will not always be faster. Using the functions can be faster, though whenever you are doing groupings/joins a time/date table will often be faster but will always be more convienient. To download the code for creating and loading the time_of_day table, follow this link: timeTableCreateAndLoad.sql
|
-
I was tagged by Glenn Berry in the meme that Paul Randal started a few days ago, so I need to tell my story to keep it going. So here goes. A Girl, A Guy, and a Tennis Racket When I was a young fellow, I liked a girl. She liked me. Unfortunately (at the time) and fortunately (in clear hindsight, knowing what I know now), the definition of like was much much different. Eventually, she introduced me to her boyfriend, who became my best friend throughout college. I met my wife through a friend of his (whom she was dating at the time.) His friendship changed me from a super introverted geeky guy who would probably have been first on the list to be interviewed if some incendiary device was used on my high school to a simple introverted nerd. He got me my first job in IT working with databases, and really helped me see how computer science was the place to be (though he wouldn’t consider what I do “programming” as last I remember he was a hardcore C++ programmer). He wasn’t like a perfect influence of course, and we spent many school hours playing Gauntlet (the video game) and/or Tennis (the real sport), and later when I was finishing my degree, and he was finishing his Master’s, golf. And honestly I spent more time with him than I did with my own family at times, causing a lot of grief. We lost touch due to some “stuff” that happened, and I haven’t seen him in 17 years now, but honestly his influence on me continues today. A Move to a Beach The day I lost touch with him was the day we moved from Tennessee to Virginia Beach to work for CBN (whom I have worked for since then except for a short stint at a dot-com in Nashville. This was the day my family truly came together. We moved from our families, friends, etc, and started to only influence one another. The first years there were the best years in many ways, as it was before my wife or I got important/time consuming day or night jobs. So in part it was just a great thing for us as a family. The other half of the equation was that I was given an amazing chance to learn and grow. I started as a DB developer in Cleveland, TN, with a guy who was pretty good, so I was a solid newbie. But CBN let me go to several database design/programming classes, and was instrumental in encouraging us to speak at conferences. Another factor was that several people I worked with who really knew relational databases and I was able to really educate myself in ways that are often impossible for some people. Add to that spending a few nights and a few weekends doing this stuff over time, and my career was given a great push… Being Critical of Other People’s Work Led to Other People Being Critical of Mine About 11 years ago, I was attending one of those conferences where I was speaking (which was due to my bosses influencing me to speak) I met the guys at WROX press, who needed tech reviewers. I got involved and by doing a good job, in an email exchange with the editor, something was said like “would like to work with you again” and I said “Sure, and if you ever get a book on database design, I would be more than happy to review it”. The reply was “Good idea. Why don’t you write it.” The rest is painful, glorious, painful, painful, wonderful history. Looking back on the first version of my book, you can see that I didn’t know enough to write a book. Stuff like terminology was all over the place, and some of the stuff in the book was not super. But it wasn’t bad either. Granted it took me 9+ months working 30+ hours a week (plus quite a few weeks of vacation, though I didn’t miss any of my daughter’s school activities!) to finish the book, and the brutality of the comments from tech reviewers was very painful at times (some were pretty snotty, actually, but in retrospect most were spot on.) That book was that last turn in the road that delivered me to where I am today (which is good, because every time I think about how long it took, I realize that I could have went back and worked for Wendy’s and made a lot more money than I did on the book!) So am I suggesting you all go out and write a technical book? No way, and it has nothing to do with not needing the competition (well, not *everything* to do with it). The fact is, unless you are truly gung ho and ready to spend lots of time on a book, you will hate yourself. I forget where I first heard it, but some famous person was quoted to have said that it is a wonderful thing to have written a book. It is like being a parent in many ways. It is great to have a 20 year old daughter. And while the process up to that age was a labor of love, it was always a little painful, worrisome, scary, etc, etc. Worth it? Of course. But if I woke up tomorrow 20 years ago knowing what I know today, after I put a bed down on Superbowl 24, I would be pretty frightened. Back then it was an adventure, just like every camping trip you go on when it is sunny when you leave and raining actual no fooling cats and dogs by the time 10 minutes after you are in your tent, which promptly blows away. I can sum up my advice in two sentences: 1. “The only thing worse than writing that book you have in you, is not writing it.” 2. “If you can think of nothing you would rather do than write a book, then write it, otherwise run away” Runner’s up Mostly because I didn’t include my wife or child in the list as much as they deserved (if it was a top 100, 50+ would include some action from them), I need to include my list of runner’s up items. There have been so many other events that made me what I am, and I wouldn’t have survived without my wife and child, who are my best friends of the past 17 years. Barely a day goes by that I speak with the child (who isn’t a child anymore, as I previously mentioned) and certainly none goes by without talking to my wife unless one of us is REALLY busy. So that list includes, but is not limited to: Day I got Married; Day I became a Father; Day I became a Grandfather, Day I first ate hot chicken; Day I become an MVP; Flunking out of Engineering; Give up on becoming a Mathematician; Working with Paul Nielsen; Meeting Arnie Roland; Meeting Kevin Kline; Meeting Joe Webb; First PASS Conference; Day I ate my first McRib. Honestly, these are almost as important as those three, but in terms of being influential to my career not quite as much. I will tag Paul Nielsen, because working with him has really elevated my game and speaking abilities, and I wonder what got him here. Also Kevin Kline, because with a name like his, I should get a few more hits to my blog. Plus, I will bet his are pretty interesting
|
-
Darn it, when I get my reviews back from a conference, I want them to say “Perfect”, “Wonderful”,”The best conference I have ever gone to, bar none”, and no matter how many times I offer to fill in the evaluation forms from people, well, let’s just say that there are always a few people who don’t say these things. I saw Grant Fritchey’s post, and started mine, then saw Marco Russo and Andy Leonard’s while mine stewed overnight… So I am clearly not the pioneer here. In reality, this was probably my best set of evals ever. One year, I was the fifth lowest in the list. Fifth…lowest. I was very much on the verge of quitting at the time. But frankly my friend Wayne Snyder gave me the best advice ever. Who cares what they think, just speak about what you care about. (At least I think it was Wayne, we have had a few memorable conversations over time, and I think we were standing in the front of a bus at the time, but I digress). As a person of (ahem) alternate shape, I have a lot of self consciousness pent up left over from my high school years, and to be honest, I often get that fear of people laughing at me going around and around in my head. In the end I realized that I wanted to try to end the poor database craptacular explosion that (from the examples I have seen) has been spreading throughout the world. Imagining that the audience is all dressed in bright blue bunny eared parkas doesn’t hurt either. In the end I know I write better than I speak, and one comment was “'Work on the delivery; otherwise, good content.”. I am just glad this person wasn’t leaving comments five years ago :) The session was database design, and on a five scale, the ratings this time were (giving a 1 to “very poor” and a 5 to “excellent”: | How would you rate the usefulness of the session information in your day-to-day environment? | How would you rate the Speaker's presentation skills? | How would you rate the Speaker's knowledge of the subject? | How would you rate the accuracy of the session title, description, and experience level to the actual session? | How would you rate the amount of time allocated to cover the topic/session? | | 4.475 | 4.675 | 4.85 | 4.7 | 4.525 | Not shabby, and I am certainly happy that the highest score was for knowledge :). A few “poor” ratings, and one very poor (a 1) for the usefulness..Fair enough. Lots of excellent. I am never quite sure what people are expecting from a topic of Database Design, and I generally talk at a high level, briefly cover normalization, give some examples, and “bam!” an hour is gone. I tried to do more in the past, but we always got hung up and never made enough progress. At SQL Saturday events (which I have done two, and am doing the Richmond event Jan 30) I add a patterns session where I talk about the patterns I like to use to solve common problems. Still high level stuff, admittedly. I have commonly used a lot of jokes/lists to keep the mood light and keep people from leaving the room after they realize they had gone to sleep and were drooling on their conference materials. (That is really not cool!) And to be honest, the old adage of making be laugh as a deflection is quite a useful tool. But, a couple of the comments were: "'Funny but a lot of stop and go." and "I would get rid of our 'lists.'" So they are gone! Okay, not gone, but I have been thinking of de-emphasizing them for a while. I am possibly doing a keynote later this year and I want to replace some of the lists with a more sermon-esque half-rant about the value of database design. I certainly am not going to lose the humor, but lowering the content will be good. In any case, thank you to everyone who fills out your evals at conferences in a reasonable manner and leaves useful comments. We do read them, and we listen when they are not shouting/whining/insane. Like the person who commented “I hate to give all 5's, but Louis deserves them.”, that is the kind of constructive criticism a person likes to hear. In any case, hope to see you in November, or at least sometime soon at an event.
|
-
My family used to send out Christmas cards with a yearly report of "where we are from whenst we came from" report, but some years we depressed people, but too often it sounded like bragging. This year I could neither brag nor depress, for it has been a pretty even Steven kind of year. Some medical stuff, but had a really good year SQLwise. Kept the MVP, wrote some good blogs that will make up a chapter of my next design book, and finally hit a nice stride when speaking (big thanks to Paul Nielsen, as speaking with him has really helped me see what I did right (and wrong!) But here at the end of the year, it is definitely the custom to send out cards to people. As a guy, I am not really into the whole “physical” card thing, but I like writing SQL. So I figured I would combine the greeting thing with the SQL thing. The following script, when run, will, Mad Libs style, spit out a personalized holiday message from me to you. A little personal, a little SQL. If you are not so equipped to run the script, email me at louis@drsql.org and I will translate. Of course, I am more than happy to take any other types of messages there too (other than ones that pertain to my lottery winnings or any sort of personal products to enhance my life by emptying wallet. Those are right out.) For those who lack the desire to spend the effort, Merry Christmas, Happy New Year, etc. I love this community of SQL folks as a really interesting extended family (frankly I see a lot of you more than my family at various events :) But hey, if I was gonna say much more then the following SQL would be a waste of time, huh. I will note, if you say you don’t celebrate Christmas, but you do believe in Santa Claus, you won’t cause your computer to burn up with a paradox, but I did default it to a lack of belief in Santa, cause frankly, I figure the largest contingency of my readers are the ones who have to buy the gifts, not wonder where they came from :) See you in a week with my SQL New Year’s Resolutions! declare @name varchar(30) declare @celebrateChristmasFlag bit declare @presentYouWant varchar(100) declare @believeInSantaFlag bit set @name = 'Friend' --<Your name here set @celebrateChristmasFlag = 1 set @presentYouWant = 'Quad Core Netbook with eternal battery' set @believeInSantaFlag = 0 set nocount on select 'Hello ' + @name +',' union all select case when @celebrateChristmasFlag = 1 then CAST(0x4D65727279204368726973746D61732C204861707079204E6577205965617221 as varchar(40)) else CAST(0x536561736F6E73204772656574696E677321 as varchar(30)) end union all select '' union all select CAST(0x492077616E7420746F207769736820796F7520616E6420796F7572732061207665727920686170707920686F6C6964617920736561736F6E as varchar(60)) union all select CAST(0x66726F6D20616C6C206F6620757320617420647273716C2E6F7267202877656C6C206974206973206A757374206D652C207265616C6C7929 as varchar(60)) union all select '' union all select CAST(0x5468616E6B20796F7520666F722072656164696E67206D7920626C6F672C20616E64204920686F706520796F752077696C6C20636F6E74696E7565 as varchar(60)) union all select CAST(0x746F207265616420696E20746865206E657720796561722C20616E6420796561727320746F20636F6D652E204920686F70652074686174 as varchar(60)) union all select CAST(0x796F7520676574207468617420 as varchar(20)) + @presentYouWant union all select case when @believeInSantaFlag = 1 then CAST(0x66726F6D2074686520626967206A6F6C6C7920677579206E616D65642053616E746120436C61757321 as varchar(60)) else cast(0x66726F6D20736F6D656F6E6520796F7520646F6E27742073686172652061206372656469742063617264207769746821 as varchar(60)) end union all select '' union all select cast(0x53656520796F7520696E2074686520636F6D696E6720796561722120204C6F756973 as varchar(60))
|
-
Ever since the early days of laptops, I have looked for the “perfect” device. Something with excellent battery life, very portable, and useful for writing. I like to write, you might have noticed. I also like gadgets. The goal being that whenever I get an idea, I can get it down quick. My first try was to acquire some form of PDA device. I have tried almost ever version of Pocket PC, including the one the looked a lot like a PC, though much much smaller (It was made by Phillips, I think it was called a Handheld PC). None of these has ever exactly hit the mark, though they all have had good points. My current Samsung i760 phone does pretty well, but not great for writing long stuff though it does a very good job at quick note-taking/short writing device because it is with me 99% of the time. I can whip out a few paragraphs, and on rare occasion a full blog post, but no matter how good of a thumb typist I might be, it is generally overly tedious to do more than a page or so on the device. Include the fact that the form factor makes a little text seem much larger than it actually is, and it just isn’t even vaguely a good enough writing device. (I am hoping the next version of Windows Mobile 6.5 or 7 will give me more satisfaction, but who knows.) On the laptop side, I have always used one as my work computer. The problem there has always been weight to power to battery life ratios, I have a DELL M4400 with a 3 cell internal and a 12 cell slice battery, and I still only get about 4 hours, tops. All that battery makes it heavy. Really heavy, like Van Gogh with a side of Escher mixed with a bit of Dal.. oh wait, that is a different kind of heavy. I mean it weighs like 9+ pounds. The Dell is a very solid workhorse, dual core, 4GB, 1920X1080 video, but it is just a big old thing to lug around (and no, quad core was just too much when I got it a year ago). The only real feature that makes it a great mobile device is backlit keys. This is the feature I truly love, (and the awesome resolution) but it is just too much trouble to lug this thing around just to write a blog post…It is great when I travel for business, as the 1920X1080 video rivals even decent workstation video, and at home I have a writing tablet that it is attached to that make it a great workstation (plus I use it to remote control a quad core 8GB workstation a lot of the time.) So recently I resolved to get a machine that is small, powerful enough to run Word, Live Writer and an edition of SQL Server that I could use all day long. So I started looking towards the mini / netbook class machines. When I started this post, I had just purchased a HP MINI, 110-1125NR netbook as the answer. I was pretty impressed with it, but I struggled a bit with the resolution at 1024 x 600. Even the keyboard was quite nice and I got used to it in a few hours (I never thanked my dad nearly enough for making me take typing!) I installed Office 2010 pre-release, SQL Server 2008 Express Edition, Windows Live tools, and it all ran pretty nicely. I carried it around PASS for the most part, though I didn’t have everything completely set up on the new mini so I had to lug the monster around. I did do most of my Powerpoint editing on the device which worked pretty nicely. And this was with 1 GB and Windows 7 Starter edition, which had one limitation that irked me… the inability to change your desktop theme. That is going to annoy many netbook users, as their entire goal is a status symbol thing. I do like having vacation location pictures on my desktop, but I would have lived with that… When I got home, the wife coveted the new device, and as I was keen to up the resolution, I gave it to her and I got to search again. My theory is that way more than 50% of the delight in any task is the anticipation, and so it was fun. I narrowed it down to a 11.6’ Gateway machine with an Athlon processor, the same HP device, just upgraded to the HD video, and a Toshiba Satellite T115-S1105 (The Dell machines seemed pretty nice too, but still had to order it custom to get what I wanted.) Battery life on the Gateway was just 5 hours, and I wanted more battery. The HP had to be ordered, and to get it right made it more like 450 than the original 350, then tax and shipping made it too expensive. The Toshiba won out as it had ridiculous 9 hours + battery, DDR3 memory, 2 GB of ram (with an open slot available for another 2 GB), N wireless, and a 250 GB hard disk, 1366x768 video, HDMI output (excellent for hotels with LCD TVs) and Win 7 Home Premium instead of Starter. All for 479 from Amazon. The keyboard is very nice, and I didn’t have to get used to it much at all. It has a nice trackpad, a Pentium class processor (single core, low power… A Celeron Edition is available for ~30 less too), and runs quite nicely. I don’t have any really large databases on the machine, as I mostly test syntax and stuff for forum posts and blogs when I use SQL Server on the machine. The only con is size. The step up in video size makes the machine seem almost laptop sizes rather than a mini PC size. It only weighs 3.5 pounds with a 6 cell battery, and they make a 12 cell…18 hours between charges? Might have to get into that. I am even getting 6 hours with the addition of my USB device from Verizon. The lack of a DVD drive is an expectation of something with 9 hours of battery on 6 cells, and frankly with the excellent MagicCD tool and the fact that everything I need I generally get from the web either in Isos (from MSDN) or installers from the web. Since a device like this will often need to entertain you as well as let me write (sometimes simultaneously), I have tested a lot of the video sites. It does streaming video great from the Move family of players (ESPN360, ABC), as well as Silverlight (Netflix) in excellent HD quality. Hulu video is decent, but not at the highest quality (low quality is smooth, and looks good enough on the 11.6 screen). With VideoLAN and Zune both, video looks great. Like all computers, it of course comes with a bunch of weird bloatware apps that almost immediately get removed, but also a few useful ones.. The coolest that is worth sharing is the hard disk protection service. It includes a 3D viewer that lets you watch the heads of the drive move, and shake the PC a bit and the heads/platters move to a safe position: -
 -
Cool stuff, though frankly I haven’t even thought about the hard disk heads in quite a long time. I know the Dell I have also has something like this to protect against a drop harming your precious data. All in all it seems like it is going to be a good friend for a few years to serve as my writing notebook and travel companion. Since I can keep it lean and not install most of the stuff you need to program/do your job, it should be fast enough for most anything. I have already noticed that I carry this laptop far more places than any of my previous ones, due to portability, and battery life. So if you see a guy sitting in a restaurant with a computer with too many stickers on it like so: -
-
Stop and say “howdy”. -
|
-
Yeah, yeah, I know. PASS ended two weeks ago. Well, not for me. The effects of 5 days of standing and walking and standing and walking took their toll on my post operative hip and I went down for the count. Finally things have calmed down and it is time to do what I should have done before. Basically, this was quite a good conference, with lots of great connections made, and a chance to reconnect with what are becoming old friends after 10 years of attending. The effects of budget cuts were very apparent, with no volunteer gathering this year, far less snacks, etc. Luckily, one great thing about the Seattle location is that there are several places to get your own food including a Subway, and a pretty good Baja Mexican restaurant. Next year, if there isn’t a gathering paid for by PASS, we need to do it anyhow. Bowling the year before was a lot of fun, and I would have happily dropped a Jackson or two to hang out and bowl or play pool before the conference. I arrived on Sunday evening, still with plenty of stuff to do to get ready for the week. Ate some dinner at the Daily Grill, Paul Nielsen and I went back to the room we were sharing and continued prep for our Friday Post Con on database design. Biggest lesson learned that day, use your sinus meds when sharing a room. Apparently if not you snore a “bit”. Monday Monday was a busy day, with a snippet of an Insider session and a couple of PASS meetings. Really good stuff that I need to relay back to my Nashville SQL Server User group buddies. The we moved on to Quiz Bowl. After all of the years, different configurations, etc, we finally hit upon a method that worked best. New software to run the Quiz Bowl was purchased (here) and it worked really nicely. And add to that that we used “celebrities” instead of random people from the crowd, we had a much better set of contestants. In the past we had tried many different configurations and unfortunately had always ended up with one really good contestant, and some not so decent ones. Paul Randal and Kimberly Tripp proved they were up to the task, and beat the other teams of Kevin Kline and Brent Ozar, as well as Joe Webb and Grant Fritchey. Next year it will likely be more of the same, but who knows. Tuesday Slow day, with the biggest thing for me was the “Birds of a Feather” lunch, where I hosted a table for database design, along with Paul who tried to do a starting an ISV table, but abandoned it quickly to talk database design. The day was finished off with an Insiders party, but I left early to go back and get some work in prep for Wednesday. Wednesday This was a really long day for me, and started my downward slide mentioned in the opening italicized paragraph. It started off rough, with a Quest sponsored breakfast where four speakers tried to do way too much with little prep and even less bacon that we were promised (the sausages were good :). I especially tried too much, and left the crowd less than satisfied (I feel). Got plenty of good comments (and a few negative ones in the feedback,) and I immediately (during the keynote on my new mini-PC, which I will review in my next blog, and which I am typing right now) put together a blog entry on the Disk IO stuff I tried to cover. Will blog the performance monitor DMVs in the upcoming days. Next up was the most rockstar moment of my life. We had a book signing for the SQL Server MVP Deep Dives book. I signed (along with the other nerdiariti who wrote and/or edited the rest of the book) like 150+ copies. Very definitely a cool project and a cool feeling to sign so many books (even a few for the higher ups in Microsoft.) Then I had my database design session. I have finally become comfortable enough with the material (which you can download here) that it goes smoothly and even fits into the allotted time span…I just to run way short of time, but now the presentation seems short (hence the reason why did a post-con on Friday.) Didn’t make any sessions because I ended up standing around talking for the entire day. Wore me out, but one of the best things about PASS is the community, meeting new people and keeping up face to face relationships with everyone you have met throughout the years. From there I bailed on the party at Gameworks and got some rest. Thursday On Thursday, I hit a couple of Insider’s sessions, and then the second rockstar feeling moment of the summit. Did an interview with Joe Celko and Paul Nielsen for the Midnight DBA crew. Oh yeah, and Buck Woody happened in as well for a section of the interview. Will post more when (if?) it shows up on their site. It was pretty wild at times, and Sean and Jennifer McCown were a lot of fun to hang around with for a couple of hours. Frankly I still don’t feel like I am worthy of being interviewed with the likes of Joe Celko. Why? Because like most technical people, I am not a pioneer of anything. Joe Celko was a part of getting SQL to where it is, and even Paul has some nifty new stuff with his Nordic Object Relational framework. Rather, I take the work of others and attempt to make it accessible to the masses. When I was in college, the academic stuff went over my head, until I finally tried it and got it. Won’t say more because I have said it all before. I actually went to a couple of sessions this day, including a few on virtualization (something we are getting ready to do in a major way at my employer.), but again mostly a lot of socialization, then headed in early since we had an early day on Friday for the post conference session. The downside of the day was that I was in considerable pain by the end of the day and had to bail on the Red-Gate dinner. I really hated to miss it, as I am very appreciative to them for what they do for the SQL community, not to mention I ended up eating a 44 dollar hamburger in my room for dinner…really good, but nothing compared to what they had planned. Friday The post con session was a great deal of fun. Paul and I have made a habit of presenting together at several conferences over the past few years, and one thing is always the case. Not enough time. So this year Paul suggested we put in for a post conference session and while I thought we had a snowball’s chance in Bermuda of getting it, and well, I was wrong. We had a good crowd of 47 with a lot of good conversation and in the end, 6.5 hours was a decent amount of time to do an advanced session on database design, though still with no active involvement doing design. It did give us time to use the white board and show some designs when people wanted more, or the places where we didn’t write good enough slides (you can get the slides here: PASS2009_AdvRelDBDesign_PostCon.pptx). Hopefuly we get to do the session again next year, as it was a tremendous amount of fun. Finally, many of the MVPs headed out to a fantastic meal that Arnie Roland has arranged for the past two years. This year, he managed to get Quest to sponsor the proceedings, and they not only fed us, but gave us a bag of t-shirts as well. Good food and nice people are always appreciated. Afterwards, back to the room to finish packing. Then it was a matter of making it home, which was a pretty interesting experience. Southwest Airlines has some definite software problems that cause troubles every time I fly. And they always point to the “new” system. That was a reasonable excuse during last PASS, but a year later? Mercy. At the very least I was able to get my luggage in at exactly 50 pounds after swapping a book for an umbrella. I am home now, and now somewhat getting over the physical experience of it all. The mental experience is something I will never get over, at least until next November.
|
-
For any of you who attended the Quest breakfast at PASS (and anyone else really) I wanted to go ahead and post the code and give a little wrap up/explanation. (I will post the stuff on perf counters later. They are cool, but this is really what I wanted to make sure people saw.) I had a plan for my presentation, but time was less than I expected, and the good stuff kinda fell off. So here is the stuff around the Disk IO DMV section that I really messed up and didn’t show enough. I read a few of the comments that my presentation wasn’t that useful to you, and that is clearly on me. If the index ones are useful, these are too, and can be used to help spread load around your disks. So, here is the slide I presented: By knowing the amount of data written to a database file, you can see how busy the file is, the filegroup is, the database, the disk drive, etc, by aggregating the data in the following query in various way. The query is pretty simple and will return all files in all databases (there are parms on the function to change that, but check BOL for more info): --since reboot SELECT db_name(mf.database_id) as databaseName, mf.physical_name, divfs.num_of_reads, divfs.num_of_bytes_read, divfs.io_stall_read_ms, divfs.num_of_writes, divfs.num_of_bytes_written, divfs.io_stall_write_ms, divfs.io_stall,size_on_disk_bytes FROM sys.dm_io_virtual_file_stats(null,null) as divfs JOIN sys.master_files as mf ON mf.database_id = divfs.database_id and mf.file_id = divfs.file_id The “stall” columns give you how long any processes have waited on your disk drives to serve up data for this file. This information plus the information in _bytes_written and _bytes_read columns can give you how much data was moved, and how long your CPU had to wait for the data before it could get working on it. Very powerful stuff, and can really give you a look at opportunities for performance tuning. That query gives you data from the last reboot of the server, but usually you want to see up to date values. So you have to get a baseline of values, then compare them at a later point. You can build tools that do this, and you can use the performance data warehouse in 2008, and I will demonstrate the former. Using a temp table and a WAITFOR DELAY command, I build a quick tool to see waits and stats over 20 seconds. Note that I use a #temp table to allow you to run the last query over and over to see the comparison to the baseline until you want to drop the #temp table. --over the following delay declare @delay varchar(10) = '00:00:20' SELECT db_name(mf.database_id) as databaseName, mf.physical_name, divfs.num_of_reads, divfs.num_of_bytes_read, divfs.io_stall_read_ms, divfs.num_of_writes, divfs.num_of_bytes_written, divfs.io_stall_write_ms, divfs.io_stall,size_on_disk_bytes, getdate() as baselineDate INTO #baseline FROM sys.dm_io_virtual_file_stats(null,null) as divfs JOIN sys.master_files as mf ON mf.database_id = divfs.database_id and mf.file_id = divfs.file_id WAITFOR DELAY @delay ;WITH currentLine as ( SELECT db_name(mf.database_id) as databaseName, mf.physical_name,num_of_reads, num_of_bytes_read, io_stall_read_ms, num_of_writes, num_of_bytes_written, io_stall_write_ms, io_stall,size_on_disk_bytes, getdate() as currentlineDate FROM sys.dm_io_virtual_file_stats(null,null) as divfs JOIN sys.master_files as mf ON mf.database_id = divfs.database_id and mf.file_id = divfs.file_id) SELECT DATEDIFF(s,#baseline.baselineDate,currentlineDate), currentLine.databaseName, left(currentLine.physical_name,1) as drive, currentLine.physical_name ,currentLine.io_stall - #baseline.io_stall as io_stall_ms ,currentLine.io_stall_read_ms - #baseline.io_stall_read_ms as io_stall_read_ms ,currentLine.io_stall_write_ms - #baseline.io_stall_write_ms as io_stall_write_ms ,currentLine.num_of_reads - #baseline.num_of_reads as num_of_reads ,currentLine.num_of_writes - #baseline.num_of_writes as num_of_writes ,currentLine.num_of_bytes_written - #baseline.num_of_bytes_written as num_of_bytes_written FROM currentLine join #baseline on #baseLine.databaseName = currentLine.databaseName and #baseLine.physical_name = currentLine.physical_name go --drop table #baseline The most interesting part of this query is often the stall columns. When I see really bad IO, the number of seconds waited can end up being way more than the actual number of seconds. Disk IO is more or less serial (particularly on my mini HP with Win7 Starter Edition), so if you have many queries running concurrently, every one of them needs data from the same IO source, and if one query used the disk for 1 second, the other processes might wait for 1 second each. When wait stats are really low, that means the disk were ready, willing, and able to serve all processes. On your test server, you can use a query like: --create database DMV --go use DMV go create table testDMV ( testDMVid int identity, bigun char(8000) ) go set nocount on insert into testDMV(bigun) select REPLICATE('a',8000) go 5000 --drop table testDmv select * from testDMV To create some data. Note that data will be written to the disk in various counts. I set all of the rows to take a full data page, and each of the rows is an individual transaction. But it won’t be a 1-1 match with the number of rows written or read. SQL Server flushes data to the disk in the manner it wants to, and some manner of caching will change the number over time. Run the select * from testDMV query over and over and you will not see any change in the disk IO, because it is in cache now. I hope this is better. Sometimes as a presenter I am a good writer, and frankly the lack of bacon really threw me off this morning. Feel free to email me at louis@drsql.org or tweet me at drsql and I will be happy to share more. All of this will be in our RedGate book that will be coming out before the end of the millennium, at the very minimum.
|
-
Mercy, PASS is going to be a busy week for me. My basic schedule is: Sunday: Arrive just after 4 PM. Bus to the Sheraton. Monday: Some Insiders stuff, some meetings, etc 6:30 or so, the Quiz Bowl 2009 (Celebrity Edition). We have 3 star studded teams this year, and instead of winning prizes for themselves, they are going to win prizes for you…So come on out for some good food, friends, and fellowship with your fellow nerds. Tuesday: 11:30 – 1:30 Birds of a Feather Lunch. I have a table where I hope to meet with people and help them with design (so they can help me with examples for my next book) and just discussing normalization or whatever. This will be in the Expo Hall 4B Wednesday: 7:00 – 8:30 Quest PASS Breakfast “Dine and DMV’s” . I am on the panel and will be showing a couple of the DMVs for the Performance Counters and Disk IO. More info: http://www.brentozar.com/archive/2009/10/quest-pass-breakfast-event-dine-dmvs/. This event will be in Room 608 11:30 – 12:00 Book signing with 53 of my coauthors on the MVP Deep Dives Book. Don’t have a copy? They will be on sale there. More info, check here: www.sqlservermvpdeepdives.com. This will be in the Spotlight Theater. With 30 minutes and 30-50 people signing books, this should be a hoot and a holler. 1:30 – 2:45 Database Design Session. With room for 360, there should be some elbow room if you find your session is packed. It is that kind of self-deprecating humor, coupled with light doses of database theory and practical advice that will make this session the can’t miss session of PASS! We will be room 615-617, but unlike breakfast, there will be no bacon involved, and please don’t bring eggs either. 7-9, The party at Gameworks. Not sure if I will be there, cause frankly I will have been up at 6:30 in the morning and am not a big fan of being awake in the first place. Friday: All day – Post Con with Paul Nielsen on Relational Database Design. Finally, a session length that is long enough to get through an overview of Relational Design. We have lots of good stuff to talk about, and lots of interaction planned. Not sure if there is a ton of space left, so hurry up and sign up if you wanna! Saturday: 11:00, get back on a plane and head to Nashville, giving up the two bonus hours I received in route. I will blog some and tweet some more during PASS, so if you want to meet up, follow me/direct message me at www.twitter.com/drsql and let me know. Frankly the whole thing seems exhausting to me, so you might find me slumped in a corner asleep…luckily, I am not a bingo square, but it should be fun anyhow: http://codegumbo.com/index.php/2009/10/21/sql-pass-twitter-bingo-the-rules-so-far/
|
-
The final numbered post in this version of my “pillar” series of posts ends in the most contestable part of the design/implementation process. Encapsulation. The concept of encapsulation is not contested (or even contestable by sane programmers in any field of the art of system creation. Every time you use a Windows API call you are participating in encapsulation. Every language that doesn’t look like: 0101000101001010101010100101010101010101010101010101010100000001110010101 is using some level of encapsulation to make life easier. The problem isn’t encapsulation, it is who do we allow to perform the encapsulation. It is at this point in the process that I have to make something clear. As a writer in a non-religious instructional topic, you need to be reasonably open to other people’s ideas and concepts and ideas, and no where in database design does this get contentious like when talking about encapsulation. I defined encapsulated in this concepts in the following terms: “Changes to the structures cause only changes to usage where a table/column directly accessed it” The obvious answer to any long term dba is that by establishing a solid layer of stored procedures to allow others to access the data is the easiest way to achieve these goals. Just like the Windows API that you call to get a message box on a 64 bit server or this little mini laptop I am typing on right now, stored procedures give you an API the encapsulates all of the joins, inserts, etc away from the other layers of the application. Now the data programmer can adjust the API if desired, but can also remove the entire db and replace all the tables with new tables, if desired. Rarely does this occur, but it is not uncommon to have multiple apps using the same database structures, and sometimes the needs differ (perhaps one app only needs one address for a person, the other can use all of them…). So at this point, the common argument is “why go through all of this when I can let my ORM deal with the database?” and it is a not uncompelling argument. As a bonafide lazy person, I hate to do more work than necessary. The problem is, the future. I learned a long time ago to honor future-me in as many ways as I can (well, eating right and exercising not withstanding) and future-me wants to hang out and admire all of the work current-me did and either relax or build something new for even later in the future me. I hate doing rework. The motto of my website (drsql.org) “an ounce of design is worth a pound of rework” is a direct reflection of that. Design now…enjoy later. Some of my favorite production moves were done on stored procedure access systems. I had built and tested the heck out of the database, and was ready for nearly anything, the other layers not so much. I sat around and taunted them mostly, though I am not perfect, the problem is far easier to manage on the data layer. It is at this point we pause to admit that the database layer is so much easier to do right than the other layers. There are fewer moving parts, fewer things to go wrong, and honestly, the SQL Server Dev team gives us a platform that just freaking works. And the foundational principals that Codd came up with 30 years ago constantly amaze me at how well they just work. And look, the code we wrote 16 years ago will still run with minor modifications (that code can be enhanced with new features, but T-SQL strongly resembles the code we wrote all those years ago. That having been said, to be honest, I said that the argument for ORM’s was not uncompelling, and with some reservations, I find it all too frequent that I have to give in to the ORM craze. The only concern I have is that the database is becoming less of a platform that we code to, and more of a storage platform only. While initially this sounds like I am trying to protect my job, it really isn’t that at all. My problem is that there are a few consumers of the data that ORM's won’t do an adequate job for unless the database is built for the database server, and not for the ORM. - Reporters – Best case, we have a BI solution such that users don’t have to access the OLTP database. But someone has to write queries to get the data out
- Future ORMs – No one likes to get nailed down to one programming methodology other than SQL programmers. Most of the products that were around when we were doing 4.21 programming are gone now or morphed into something bizarrely different (like VB). What is good for ORM A might not be good for ORM B. So now we need to do data conversion? Too costly.
- Importers/Outside sources – Poorly normalized databases make it harder to import data from outside sources without using the API, and most shops are going to do bulk loading of data at some point.
Note I said poorly normalized databases. Is this guaranteed to be the case? No, but any database I have seen designed for the needs of the screen/programmer always looks vastly “different” than a database that was designed for the data platform. So, unlike the other pillars, this one turned into an op-ed piece, rather than straight facts. Sorry, but it is just that type of problem. There are good points on both sides of the equation, but the real problem is that where data platform pays off is over time. The average turnover for a programmer (any type) is said to be around 2 years. Who cares about the future when you plan to be somewhere else? Personally I have been with my current employer for 12 years, and hope to go another 20+. And I see/feel the mistakes of the past, while many of my coworkers are somewhere else, grousing about how horrible someone else’s code is… Final comments on this topic. Any solution that meets the criteria “Changes to the structures cause only changes to usage where a table/column directly accessed it”, or even one that recognizes the problem is a major step in the right direction. I always joke with people in my sessions that if you are attending my session, you probably care about doing things right. Same with this concept. If you are paying attention to the future, at the very least you probably won’t leave future-you in a bind, even if future-you is a totally different person.
|
-
I guess eventually I had to admit that performance matters. Whenever I speak, and in all of my books, I try to stress over and over that performance is NOT the only thing. The fact is all too much time is spent trying to make database applications run faster when the real goal should be to architect data structures that solve the problems of the user in a natural way with proper integrity. But performance is important. Extremely important. Like a fancy sports car, it is the first second thing that a user will notice about the system…The first will be the UI (the paint job). The problem is that it isn’t the most important thing. Just like a sports car, the most important thing is that everything works. If the car runs 100 miles an hour in first gear, it doesn’t matter if the seats aren’t bolted in and the steering wheel works backwards. I initially defined well performing as “Gives you answers fast”, and in the context of performance, that is true, but in reality, that is a very simplistic attitude. Certainly taken out of context, this is definitely NOT a good explanation of well performing. Maybe: “Does all operations required in a minimal amount of time in consideration of the realities of physics?” When considering performance, there are many facets to the problem: - Initial Database design – Sometimes the reality of the problem to be solved cannot be done quickly, but often it is just poorly designed data structures that get in the way. Designing the database with the concepts of Normalization close at hand is the first step in getting things right. The relational engine of SQL Server is named relational because it likes data to be formed in a relational manner. The concepts of Normalization are the backbone of relational design (sorry, I have to make mention of Normalization every time I blog, I think).
- Concurrency – Hey, if all I ever had to work on were single user systems, life would be easy. But the fact is, on the main database system I work with, there are FAR more than one user. And most of the “users” aren’t humans (no, they aren’t zombies either) but rather machines, reading in data from mail, internet, and phone calls and processing the data. Actual human beings work slowly compared to what a computer can pump in data. And to make sure that the data isn’t complete rubbish by the time the humans actually look at it, we have to use locks, and locks slow things down…But the people want to feel like they come first…it is a difficult task, but good design and decent throttling mechanisms can be used to make it happen. (And don’t get me started on all of the moving parts, disk, memory, etc, etc. Beyond data level locks, you have hardware/resource locks called latches that make certain that the laws of physics are honored.)
- Indexing – The first thing most people think of when they are adjusting performance, and for a good reason. An index can turn a full table scan operation into a single row operation, simply by adding a primary or unique constraint (usually part of the logical database design) or any other unique or non-unique indexes that may be needed for the queries. But indexes aren’t free, and you have to be careful not to put useless and irrelevant indexes on “just in case.”
- Hardware – SQL Server can be run on simple cheap hardware, and for most situations you wouldn’t notice if it was run on a server that could easily be a file server. Unfortunately, the problem is that as you start to need “real” power, the hardware configuration cannot be so…simplistic. Adding CPU power and RAM is simple, but the worst part of the process is disks. Until solid state drives really hit it big (and even afterwards in many ways) disk drive speed and redundancy is big in making your server run fast.
- Good code that accesses your data – No matter how well your database is designed, if you code in a crappy manner, using loops where queries would work better, lots of unnecessary temp tables, poorly formatted search arguments like WHERE datediff(day, columnName, getdate()) > 1 (would not use an index on columnName), you are hosed. If you have the foresight to use stored procedures, you can go back and tune later in the process as you need to (in case you get stuck with programmers who aren’t really relational “yet”), or at least have an architecture where you can adjust the code being executed from your app, you can fix poorly performing code…if not, well, good luck (and get ready to throw away a lot of bucks on hardware.)
- Adjusted Database Design – sometimes you just can’t use the design you have, and as a good designer/architect, you have to be big enough to admit that. Denormalization can be a solution (especially if you are trying to optimize reports…) but often you just have to go in and adjust the design in other ways. The better your code/encapsulation layer, the easier it is to do (it might take a bit longer, but it will be straightforward/safe).
- Etc – Network speed, faulty hardware, poorly performing applications…I could go on for days, but I won’t. The fact is, there are many problems that can make performance bad, and sometimes they are database related, and sometimes not.
I am not numb to the fact that performance is an issue. It is really a major pain point, because it changes so much with all of the aforementioned factors, but also because it is negotiable. Data integrity is not a negotiable factor. You can’t say “bah!” to users because they complain that “their numbers aren’t adding up”, but “the server seems kind of slow” can easily be written off as whining (mostly because it usually is.) The problem is, whether your company admits it to itself or not, time is money, and if a user has to do a task 200 times a day and performance issues makes it take 10 seconds instead of 4 seconds, it doesn’t seem like much… but 1200 seconds is not a small amount of time in the least. And if there are 100 people doing these same repetitive tasks…that’s a lot of time (and no uncommon in say, a call center.) So what’s to be done? Planning, testing, having proper environments for testing performance are great things but to do it right you will probably need to double your spending on hardware to have a production and a “pre” production environment to test out your designs under load. Most smaller companies (particularly if the hardware isn’t the backbone of what they do) don’t have that. Production is where the load is tested, and they hope for the best. Well stop hoping for the best, and design for great performance. Whether you can test performance or not, you can still be prepared by building your database for performance and just as important, for tuning in the easiest manner.
|
-
As I have mentioned in all of the previous posts, basic functionality is the foundation of any system. So it goes without saying that if you have just implemented a payroll system, everyone is getting paid. To meet the basic bar that EVERYONE agrees upon, to be useful things have to work. Frankly, this is generally the only criteria which needs to be met for most systems to be considered complete, and since I don’t want get off on a rant, that is all that I will say (for now at least, it will be in the final introduction to the pillars when I get around to finishing them up.) So as we move on to this first implementation pillar, we assume that even the worst system out there does what it is expected to do. I realize this is probably a stretch in reality, but the best part of writing is that while reality figures into it, the goal is a practical version of perfection, not strictly attainable, but certainly approachable by any half decent programmer who knows the rules. This next pillar deals with keeping eyes out of the places where eyes shouldn’t be. This pillar is Secure. In basic terms, everything is hunky dory until the Janitor finds out that the CEO is making 2.5 million dollars a month because he clicks on the wrong place in the app, or that “devious” employee who is quitting starts digging around and gives himself a raise…and it is impossible to track who did it. In the initial post, I defined Secure simply as users being able to only see data they are privy to. Sounds easy enough, but when you start to consider what it means it is considerable more complicated than you might imagine. In my books, I take the “easy” way out and simply cover security internal to a single database, and even that is not easy. For the most part, the applications that I see/hear about tend to leave security to the applications that are written, and just give a lot of power to the application. Is that a good idea? Does it sound like a good idea as you read it? No. And if you said yes, then you probably either have no responsibility for the security of your data, nor should you. The fact is, even when using the application to enforce security, you are going to most likely need tables, columns, and coordination. As an example, a time entry system is an easy thing to implement, until you hit security. Everyone (except perhaps one person) has someone they report to. So rights to see everyone who works for you seems like a given, but this starts to be a hierarchy. Then people have dotted line bosses, people who enter time for them (like an assistant or timekeeper), and if you do any cost analysis, all of these people probably have some entry as to how much they are costing to work there, particularly if they get any overtime. In the end you have to build tables and queries that enforce lots of row-level security that is not easily done, and if you try to do this without tables and data, it is completely inflexible….so then you have to deal with overrides…. Let’s imagine then, that you get the app straight, and anyone who legitimately gets into the application is not accidentally or purposefully seeing data they shouldn’t. You are finally done, right? Well, of course not. Security needs to be considered in terms of access from external sources, and internal sources. Meaning not just a novice hacker who works for you who might just whine about his salary in comparison to a coworker, but now you have to consider that Harry the Hacker out there gets access to the data, what then? Could your company fall apart because they get your customer list? It doesn’t matter if you are top secret company, if Mr Hacker sends all of your customer an email stating “I have your name because I stole ________’s database,” how many more times will you give your name to that company? When I said I took the easy way out by only considering database security I meant it. To get it right, you have to build an impenetrable fortress around your data, making completely certain that while friends can use the data from inside the fortress and outside too, (like customers accessing their data on a portal, or you with VPN), baddies can’t get in from any layer. How you do that is a book in and of itself, and certainly cannot be covered in a post like this, or a book like mine. In the end, you just have to work with system administrators and network guys and make sure that you are protected, all activity logged, etc, etc. Of course, if enough administrators get together, you cannot stop or track them. When you control the logging, and the access, you control the world. So the greater your need for security, the greater emphasis you had best be placing on checking background and checks and balances so that enough people are required to turn off logging/security that it would be impossible to get that many people involved without someone informing on the crowd. Does this make you feel less secure? It should, unless the security of your data scares the crud out of you, you are not thinking hard enough. But if you take the time to make sure your implementation design considers the needs of the users and their rights to access data while keeping others out, you will be fine…probably.
|
-
On October 1, I received my email stating that I was again renewed as an MVP. One of the greatest honors I have ever received is the Microsoft MVP Award, certainly it is the greatest professional honor. And while there are some benefits to getting the award, I am not honored because of the benefits. While there are a few benefits that are awesome, the simple fact is that is is more or less a “thank you”. Just as it is nice when someone helps you to say "Thank you," this is what it means to me. * Is it necessary for me to do what it is I am doing? No. Does my company really care that I write, speak, answer questions in the forum? No. I won't lie and say that it never crosses my mind when I don't feel like doing much. But generally speaking I do what I do out of a sense of duty. Not to get overly spiritual or anything, but I feel led to help others. I really only have one skill set. Designing and coding for SQL...and perhaps writing about that topic (I don't count video games or watching TV as skills :) so as far as helping "normal" people, I am pretty useless. If I ever volunteered for something like Habitat for Humanity, they would end up coming to my house and helping me out, thinking I probably lived in a hovel (I don’t but I rarely fix my own stuff around the house, that is for sure. One question that is always coming is up is “How do I become an MVP?” Which really is how do you become and MVP, but I was speaking as if I was the person asking, not me (did I say I was a writer?) Microsoft has a page on the MVP site here: http://mvp.support.microsoft.com/gp/mvpbecoming, but my feeling is that the best MVP types meet the following criteria: - You have a heart to help others
- You will master your technology and keep up with how it continually changes
- You don't mind that the work you do will not always be rewarded with money (sometimes it will be really low paying jobs like writing a book)
I won't lie, it will not be bad for your career, but the opposite cannot be guaranteed. Nor will the time lost with your family (and/or video games) be replaced. To paraphrase a quote a president once said: "Ask not what your community can do for you, ask what you can do for your community" is how you get there. To be reasonable, I painted a picture that seems pretty perfect, and that is how I feel about the other MPVs. Me personally I am never quite sure how I get here, but it is a great honor that I definitely do not take lightly. *Some day I will figure out the logic of us helping out a company the size of Microsoft by doing a good deal of their education and tech support, basically for free. And the fact that the people we help are often our competitors. It works really well, but it still amazes me. For example, I work for the Christian Broadcasting Network, and I have never checked the credentials of people I have helped who might work for the Atheist Broadcasting Network. Can you imagine a Ford mechanic sharing secrets with a Chevy mechanic?
|
-
A project that I worked on a while back is finally coming to fruition, and it is one of the coolest around. Usually when I see a book that has 5+ authors, I am a bit concerned about quality. In this case however, there are 53 authors and even if the quality was not all that great (which it will be excellent, by the way), the proceeds will be going to a good charity (WarChild.org). At last year’s MVP Summit, the annual gathering of people with too much free time on their hands and a desire to help out other Microsoft using technologists with that free time, Bill Gates offered us a challenge to “do philanthropy where you are.” So an idea was hatched that we would all write a book and give the proceeds to charity. In the end 53 MVPs, largely from the SQL Server community (but not all) participated and produced a book that covers a lot of ground. Paul Nielsen and I wrote the first chapter called “Louis and Paul's 10 relational database design rules” based on our 24 Hours of PASS presentation, and I wrote the second chapter entitled “SQL Server tools for maintaining data integrity”. Then a lot of people wrote 57 other chapters on topics from design, development, administration, tuning, and even BI development. This is an all-volunteer book. All author proceeds are going to WarChild.org – an organization that helps children traumatized by war. Because this is a book for charity, Manning Publications wanted to also donate and gave us a higher than normal royalty. In addition, if you purchase the book through this link: www.SQLServerMVPDeepDives.com then the purchase will also count toward Warchild's Manning affiliate account and Warchild will receive an extra 10% of the purchase. As an additional bonus, you can save 50%: purchase from the website by the end of Sept (that is the day after I am posting this, so HURRY!) and use the code pop0928 at the checkout If you’re going to PASS: The first printed copies will be shipped directly to the PASS Summit Bookstore. We’re planning an official book launch at PASS. If you want to get your copy signed by every author at PASS then you must purchase a copy at the PASS Bookstore. And hey, if you get a copy and want to have it signed the slow way, one author at a time as you bump into us on the street, I will be happy to sign one any time. For the rest you will have to do some world travelling as the authors come from the whole world around. And some of them are always travelling to boot. So good luck and thanks for making this a success!
|
-
If you were at our 24 Hours of PASS session which we called the 10 Big Ideas in Database Design (and a few of you were), you witnessed something that no one else will. That is because the demons of the Internet decided that for whatever reason, the audio we were speaking needn’t be recorded. If you were one of the attendees, you will also know that without an audio feed, our presentation would be of little or no value. We basically have a set of pictures that illustrate some humorous take on each idea, some just to be funny, and others to help remind us of stories to tell. If you were not at our session, and you want to make sure you get to see us, next Tuesday at 8 PM Eastern, we are going to do it again. Will it be the exact same session? Probably not. We generally have points to cover, but not much written down. The Top 10 ideas are stuff we spend our nights, weekends, days, holidays, birthdays, and any other time that doesn’t fit into any of those categories talking about this stuff. It will be pretty close to the same thing, for sure, but who knows. So go register here: https://www.livemeeting.com/lrs/8000181573/Registration.aspx?pageName=qh37c3hrckwswsj5 And get your popcorn popped and your soft drinks on ice for our session next Tuesday. We will keep you up to date on twitter via my handle @drsql and Paul’s @PaulNielsen. “Wait a minute” you must be thinking. He said “make sure you get to see us”. Since this is our second try, who knows if the demons of the Internet will be against us again. Would we do it a third time? Probably not. If all goes well, our session will be up on the web along with all of the other 24 Hours of PASS sessions of pre- and post- conference presenters right now here: https://www323.livemeeting.com/lrs/8000181573/Registration.aspx?pageName=cw4mt2pg632crwfv With some luck, ours will be there next week too. But can you really take the chance and miss it… again?
|
-
First off, let’s just say that Devlink (www.devlink.net) is a fantastic conference. It returned to the Lipscomb campus this year, and that was a great blessing. MTSU last year was not as nice of a venue, and some of the way it had to be organized was not as nice as both years I have been at Lipscomb. Part of it is that it is a private college, but another part is that it is a smaller college. The food that I ate was good, the seating was nice, and the layout was awesome. The first day we had comedian Rik Roberts during lunch, and two years ago he was the party. The party this year was a baseball game, with a small amount you had to pay extra. I actually missed the party to go to a different gathering, but still it was a good idea for the party. Probably the only thing negative I could say about the conference was that it was harder to find a drink, and the refrigerators weren’t as cold as the barrels of ice from previous years. And if that is the worst thing I can think of, it was a great conference. Second, the sessions I attended ranged from great to excellent. Paul Nielsen and I kicked off the conference for the database track (after lots of last minute prep…worst misstep were a few slides that had black text on blue). In the afternoon I attended Tommy Norman’s Scrum/XP in Team System Deep Dive session and learned more and more about Scrum and some things you can do with Team System (I learned more about Scrum and the Team System was kind of out of my league…Why can’t we use management studio yet for Team System connectivity?). After the first day of expanded sessions, I attended the Opening Keynote by Josh Holmes, a developer evangelist with Microsoft whose topic was whether or not software development has gotten too complex. That was the first thing that stuck in my mind. Over the next two days, I attended lots of sessions by Brad McGehee, Kevin Kline, Joe Webb, Jessica Moss, Craig Berntson and Wally McClure. All of the sessions were great, but one of them was the second thing that really stuck in my mind. In addition, saw Tim Ford, Jeremiah Peschka, Robert Cain, Stephen Russell and many others. Finally, we had a taping of DotNetRocks Live. Carl Franklin and Richard Campbell led their show (http://www.dotnetrocks.com/) “live” in front of a studio audience. Again, complexity was the topic… And that is when I decided that I had to do a bigger write up than what I had gotten to so far. Complexity? Complexity? Complexity?!? As a data architect/programmer, this started to bug me… In the data driven design session, Craig Berntson had really started to get me cranked up by starting to imply that data wasn’t that important. Design “process” first. His words sounded like every developer who I have know that wanted to build the database themselves. I think we eventually go to the point where he liked having strong data assistance, but one thing I think that was presented was this: In my database design sessions, one of the most important things I try to get through to people is that you cannot have a screen that maps to a class that maps to a table perfectly. If you do, your database will stink, or your UI will stink. Your classes are also going to be a mess…But wait you say…this is exactly what we do. It is what most people do because it takes less work. He was coming from the class side of the equation, as most people design the data first, then everything else. I would note that this is not a good idea, even if you are by trade the data architect. First comes the processes and needs of the client, then the data and everything shakes out from there. Of course, as a data guy, I want to put up a few layers to deal with this. One a set of stored procedures that encapsulates all interaction, and then a solid layer of software that works with the stored procedures. Many business rules (those without any override for sure) would reside minimally in the data structures, and again in the UI to make the user happy (we all need users to be happy), and any rules that aren’t naturally handled in the structure of the data or in simple constraints, moved out to the functional classes. Many non-data programmers would prefer that all the database did was hold the data. Even things like UNIQUE constraints that fail whenever duplicated data is inserted are too much. What I really find interesting is that while stored procedures have been reasonably steady for 15 years, I couldn’t keep up with the number of different languages that were bantered around in the 4 or 5 hours of non-data topics I heard over the week. Ruby, PHP, VB 6, VB.NET, C#, Ajax, Python, F#, ASP, AUGH! I mean, seriously. And these are just a few. We have T-SQL. Maybe a little VB or C# for a CLR object now and again, but basically T-SQL has been around forever and is just getting better over time. Databases that we create in 4.21 would pretty much work now. And the stored procedures we created could have been upgraded in minutes. The problem is partially because it seems like it is really hard to change the database, but that is more about the fact that databases have state, unlike code. Why do you think there is COBOL code still in use? Because the code is so hard to replace? No, because the code is strongly coupled with data, which is hard to change because it is so valuable. But even if the data store could not be changed, if a proper encapsulation layer had been created, well, we might have a COBOL data access layer, but would we have a text terminals still in use? I doubt it. So if we could just work together and fortify the database and an encapsulation layer, replacing the code wouldn’t be so big of a deal. Look, I am all about getting things done fast. But it should say a lot that as data architects we want to do MORE work. I mean, in the short run, just getting the data normalized is win enough. And putting on CHECK constraints, UNIQUE constraints, and Triggers in some cases would be win enough (and very little work.) But if we do the more work up front, then later we can save our data administrating brothers (and sisters!) from having to demand code changes to optimize a query. Which we can’t because, based on Josh Turner’s keynote, the average tenure of a programmer is about 18 months and no one seems to understand how the code works anymore. So, if the definition of a great conference is to get people thinking… well, I guarantee you it was a great conference for me.
|
|
|
|
|
|