THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | Join | Help
in Search

James Luetkehoelter

Nearly any SQL topic presented at times in a slightly eclectic manner.

  • A counter-counter-counter(?) argument: Specialization vs Generalization

    This post is a reply to a reply to a post from Brent Ozar from K. Brian Kelly. The basic argument going on is that you if you want to be successful in the future, you should:

    1) Specialize in a specific area

    or

    2) Generalize and gain as much exposer to as many technologies as possible

    (This are overy simplifications of both views, I do this just for the sake of the argument)

    My own view is that both viewpoints are incorrect. The key to maintaining your marketability in the future is being both a generalist AND a specialist. And to make the argument more confusing, the specialist part will shift continually.

    Example 1: The pure specialist

    Let's say person A is extremely adept at RGP programming for an AS400. Right now, that remains an excellent skillset as far as hirability. However, perhaps 5 years from now IBM abandons the AS400 platform altogether and moves to a Unix-based, or MS-based or 3rd-party OS to go along with its hardware. RGP is gone. Now everything is writting in IBM-SQL, or IBM-Java, or something completely different. Now person A must completely retool just to get a job. Or, change professions all together. Yes, there will be the legacy shops that are reluctant to change, but that work will last only so long. And with today's economy, can anyone rely on retirement anymore?

    Example 2: The generlist

    I fall a bit into this area. I know an aweful lot about datababase systems, network, servers, business process, etc. (and I mean etc., I have a wide range of exposure). However, without focusing on specific area, I limit my usefulness. Let's say I'm hired to work with SSIS. Yes, I know it, and pretty well. But the really esoteric knowledge - know - I have to look it up. It doesn't matter to my employer that I'm answer questions to others in IT about SQL Server in general, suggesting different security schemes, interacting (successfully) with other business groups, identifying concrete business rules that previously remained ephemeral - if fix the very specific issues with SSIS quickly (which doesn't rule out Google, but knowing what to look for - Google now returns so much information to wade through, you need to know your subject someone before using it), I'm fired. Period. Fair enough.

    Example 3: The generalizing specialist

    So let's take how I try to focus, not always successfully. I consider myself a data specialist (as in data quality, efficient reporting, storage, etc) on the SQL platform. I also know a significant amount about programming, SSIS, SSAS, SSRS, operating systems (Windows and Unix), hardware, network, business process, etc. Theoretically I could jump into any job in any area and be productive from the start and become a specialist very shortly. I may need to change my specialization very quickly at any point to maintain my marketability. Right now SQL Server maintains a very competitive piece of the database market. I'm specializing more and more in SSAS, SSIS, and SSRS (also known as business intelligence, a term I despise). But let's say in 10 years Oracle wins the database wars over IBM, MS and open source. I also know the Oracle platform and continue to try to follow it. By generalizing a significant amount of my skills, I keep a foothold into other technology areas to have the tools to cross over faster than the COBOL program in Example 1. And unlike the Example 2, I've demonstrated the ability to be a specialist and have more credibility if I build my Oracle skills ( in this fictional example :) ) and proclaim myself an expert. Heck, I can probably take many of the things I've learned as an expert into the new area.

    I believe Example 3 is the way everyone should view their career. Another great example is language. Most of us on this site are "experts" (if you're from Great Britain you'd debate that) in English. But what if most of the business out there comes from China? Wouldn't be in my best interest to become as proficient in Mandarin as possible? What if most of the work moves to India, and India now decides that business should be done in Hindi rather than English as it is now? Shouldn't I learn Hindi? How about Quebec conquers Canada, and becomes a superpower? Shouldn't I learn French (or as the French would say, Quebecois)? Adaptability is important. Now if I had to learn these languages from scratch as things change, I have a huge hill to climb. Luckily I study languages, including French, Hindi and Mandarin (and 3 others - I'm trying to surpass my father who spoke 9 languages - for the most part, fluently). Now right now I stink at most of them (French is my most fluent), but I have an advantage over others should China take over the world. Don't I?

    Specialization and generalization can *not* be mutually exclusive. I would advise those entering the industry to learn as much as possible, while attempting to focus on a specific area. Yes, you may not have the depth in a specific technology like our RPG programmer in Example 1, but you should be able to handle 90% of what comes your way and have the knowledge to find the answers to what you don't know off the top of your head.

    That's my take. Am I wrong? Crazy? Just drunk?


  • Book Review: Pro Full-Text Search in SQL Server 2008

    Those of you that know me know that I don't endorse anything lightly. MVP Michael Coles sent me a number of his books in exchange for mine (still coming Michael), and I just finished perusing the first one on my list: Pro Full-Text Search in SQL Server 2008. I highly endorse this book to anyone jumping into the Full-Text arena - and my hope is that more and more are, it is a great technology. Michael and Hilary Cotter (who contributed, not sure where, but it matters not) did a fine job with it. Make no mistake, this is a difficult topic to tackle and a complicated feature in SQL Server 2008. Even if you have worked with Full-Text Search in 2005, there's a lot to learn since it changes dramatically in SQL Server 2008. It's well written, organized and very detailed. Will it tell you everything you need to know about Full-Text Search - of course not. But it covers quite a bit.

    This is not a book for a novice. The first few chapters focus on the basics of administration, but the book quickly moves into some fairly complicated ways in which the technology may be used. One of my favorite sections is a discussion on multilingual databases. This is an extremely important chapter - if you don't work with multilingual data yet, trust me, you will. Even for languages with a Latin character set can create confusion for Full-Text Search if it isn't aware of how the language uses accent marks. For example, in French there is the accent a gu (or aigu) - é and the accent grave - è. There can be words that are very similar in spelling, but use different accents. How does FTS know how to decide what matches? And those same accent symbols in Spanish may have completely different rules. Now through in languages like Arabic (reverse order) and Chinese, and the ability to truly match words and phrases can be very difficult. This book shows how this can be accomplished with real examples.

    More practical examples follow showing how to full-text-index blob data, blogs, create custom thesauri, etc. It even covers the dynamic management views for Full-Text Search. I highly recommend this book, which I don't say often about anything :)


  • An open letter to Microsoft: Listen to the real world please

    Before I start, I want to say I have a lot of respect for the MS development people, and many of those in MSC. But Microsoft, you have a fundamental disconnect with what is real and what you're moving towards.

    After hearing of the layoffs and the the end of Performance Point:Planning (I'm not entirely pleased with Performance Point as an end-user multi-dimensional data consumption tool), I started thinking more about the product and features MS is marketing. I've also been recalling encounters with MCS people brought in to provide an optimal "solution" or fix an issue with something with SQL Server or Sharepoint or Biztalk (the areas that I know). In almost every instance (almost), I see MS marketing, product development and consulting very out of touch with what people really need. Initially when Bill et alia started MS, no one knew what to do with computers, so they needed to very much steer the technology to show how it can drive business. I believe we've reached the turning point with technology and business. I understand that when MS comes in to a company, they're focused on their product and their product alone. What needs to change - look at the business.

    So what does a buiness really need (and we'll stick with databases since this is SQLBlog):

    • A business usually needs dire help organizing, managing and understanding its own data: One truism that most of us know is what a business asks for and what they really need are often two very, very different things. This entails not simply performing tasks but asking questions like "Why do you need this data? What questions are you trying to answer or what are you hoping to learn?" Then the true need often reveals itself.
    • A business needs people that truly understand their data: I'm not talking just about the data structure of the database, but the business meaning of the data, and what I usually refer to as the "application" meaning - how is this used or exposed in an application? This starts us down the road of Master Data Management, which MS is far, far behind in facilitating tools for this. Yes, there is something they acquired, but it is slow in coming and looks to me to be "cookie-cutter". Right now my best tool would be Excel or another SQL database...
    • Instead of buying a really generic tool like Sharepoint or Biztalk (or SQL and all it entails), a business needs to understand how these tools can be used to improve their business: I don't know how many times I've run into a SQL BI implementation with no thought into how it can really help the business - the assumption is the business knows. The same goes for Sharepoint (an information nightmare if not organized consistently) or Biztalk (usually overkill). Seldom does anyone selling or installing the product take the time to discover what might be useful for the business.
    • A business needs a way to quickly consume their data: I'm sorry PerformancePoint (and Proclarity) - you aren't cutting it. PerformancePoint is basically Proclarity rehashed - I believe if Proclarity was still in business they would have a had a new, more intuitive, useful version of the product. And things look now that PerformancePoint is close to the chopping block. Excel? Excell services? Um, no. If anyone has used some other data visualization products, you can see what is really possible from multi-dimensional data.

    What's the common theme here? Business understanding is necessary for a successful product. It's time MS starts thinking this way and stop trying to assume a business process with any of their products, or assume a business understands how to use their own information (or even what it means). Does that mean MS needs to have a business consulting wing? Maybe. But the days of "here's our product, buy it and all will be well" are long, long gone.

    Anyone disagree? Am I drunk or insane?


  • 5 Reasons why I think MDX is the Devil

    First, an up front apology to Mosha - this is half in jest, half real. MDX is a powerful and useful language, but I find that its very existence has made my life very difficult when it comes to developing a solid Analysis Services solution. That's not to say I don't use it, or endorse it, but I worry greatly about its overall usage.

    1) It tempts you by looking like SQL, but once your brought into it you realize how complex it is (if you don't have Mosha's MDX Studio, get it).

    2) MDX allows you to do TOO much when it comes to a multi-dimensional structure, without first understanding multi-dimensional concepts. I've seen code that creates filters or subset cubes that overlap or end up cutting out measures implicitly. You have no warning about possibly undermining previous dimensional modeling - that would be a nice addition to see.

    3) Along those lines, it enables sloppy dimensional modeling in the first place. "Throw it together, will add in things later". With a multi-dimensional database, it is very important to model it correctly, just like you would a relational structure. Unfortunately there are no Codd/Date rules of normalization, and often very little work done up front understanding the actually relationships and what the data is meant to represent. For example, instead of using a many-to-many cube dimension (what I call the dimesion to measure tie) to represent something like a bill of materials, I often see this done afterwards somehow via MDX calculations and such. Instead of going back to the dimensional modeling, the answer is usually "we can do this in MDX later".

    4) Yet another language with its own syntactical oddities - damn you syntax, you will be the death of me!

    5) I can't count the number of calculations I've seen in a cube that could have been done before the cube is loaded in the first place. An MDX calculation that does something like PRICE * QUANTITY = TOTAL SALES is just a waste of processing on the cube side. Remember, MDX is post aggregation - so the large the cube, the more dimensions it has and the level of drill down it allows all play a roll in the speed of things.

    Those are 5 to start - anyone have anymore?


  • A response to Linchi Shea's excellent posts on fragmentation

    In case this is your first visit to SQLBlog.com, Linchi Shea does very methodical analysis of the performance and behaviour of disk arrays. His most recent listings specifically explore file fragmentation and its affects on performance on storage. The posts sort of give one the impression that there is no significant impact on performance when the disk storage is a SAN. I'd be inclined to agree, and his data certainly backs that up. But in many in the posts he does compare direct attached storage, where we do see degradation.

    I'm writing this little blurb because my hunch is, at least from what I've encountered, the majority of SQL installations use DAS or "quasi-SANs" (Linchi, I'm sure you've run into some of these devices that claim to be SANs but are sorely lacking). In those cases, especially ones where databases are unmonitored and auto-grow occurs on a regular basis (or transaction log auto-grow/manual shrink accordianism), Linchi actually demonstrates that there will be performance degradation. This is one of the most common performance issues I run into out in the "jungle". Fragmentation.

    My message - those of you on direct storage or if you aren't certain of fragmentation affects, error on the safe side and defragment. Or at the very least investigate how physically fragmented your arrays are. Please.


  • Kalen's latest editorial from SQL Mag newsletter

    If you haven't already read it, Kalen has another great editorial in the latest copy of the SQL Mag e-newsletter. What I love about this is not just pointing out some behaviours that others may not be aware of (she focus an database recovery issues, near and dear to my heart), but in what she is really trying to say with the post. MS gets tons of requests for more information (and we see them on newsgroup posts all the time) for information not just on when someone has done something wrong, but a warning before they do it. I know that's true. The more I think about it, my reaction is - what the hell????

    If you were an Oracle DBA (ohoh, I said the O word), you are expected (or at least were from versions 9i back) to truly understand the platform. And if you weren't sure about something, you would invoke support. There is still this common mindset out there that SQL Server is this simple database platform where you really don't need a DBA, nor really understand the platform. I've seen it over and over again. I know most of you have too. I think MS is partly to blame for this, with marketing approaches (SQL 7 - it tunes itself!) to gearing the primary user of SQL Server to be Visual Studio. But it is also a fundamental problem with those that manage SQL Servers. Yes, I know a lot of you are DBAs by default. What everyone must understand is that SQL Server is a complex system, and if there's a way to make a mistake, you can do it without the system conveniently popping up to say "Danger Will Robinson!". Education is a never-ending thing - there isn't one person out there that can't learn something from someone else. And for pete's sake, if you aren't sure what is going to happen, FIND OUT first. Let's step up people. Take some personal responsibility rather than blame MS for everything. SQL Server isn't a TV you can simply turn on and off, and adjust the volume, or change the channel...and it never will be.

    Kudos Kalen, another great editorial.

    BTW, for those of you that don't have it, Kalen has some intense training DVDs out there for unbelieable prices - check her posts to get the address (sorry Kalen, I don't have it off hand).

    And for those that might be offended by my position - tough. This is reality. Expecting all the answers to come from someone else without you lifting a finger is, well, irresponsible.

    End rant.


  • The devil in the details...

    I haven't written a post in awhile, mostly because my fridge keeps falling on me and trapping me for months. Seriously, I've had some changes and have moved back into a permament position doing a plethora of activities (DBA to BI to AppDev and maybe even Sharepoint). I've found it at times frustrating, but very fulfilling experience. I've been out as a consultant/trainer/speaker/whatever for quite some time, and I often found myself dealing with very esoteric issues, or training people on fundamentals like how locking works, or just being thrown the most difficult scenario possible to try and resolve. I enjoy that, I always have. But what I discovered recently is that by doing just that, my skills to do day to day or basic code writing (syntax!!! - you will be my downfall!) diminished significantly. Juggling knowing PL/SQL vs TSQL vs ANSI SQL vs VB vs C# vs SSIS vs DTS vs Informatica vs SSAS vs Hyperion has really challenged me to get back to writing, well, very basic stuff.

    I find myself improving, slowly, and my colleague Steve (who will probably respond to this with a sardonic comment :) ) will attest to the silly things I do. Miss a comma. Miss a closing quote. Forget basic techniques that I've taught years ago. He constantly hear's me mutter "Stupid James". We have a beer tally going, and let's just say he'll never have to by beer again in his life :)

    Why am I saying this? It's to all you out there are that aren't authors, or trainers or consultants or MVPs (nothing against those that are - please don't come back and punch me in the nose any MVPs) really DO know the product quite well. Yes, there are others that need help - we see it on newsgroup posts all the time with questions like "Why is my transaction log filling the disk?". But there are so many out there that do the detailed work day in, day out that really deserve a shout out. Never, NEVER believe that there isn't something you can learn from a colleague. And companies and managers out there - your employees opinions about what to do are often more applicable to your situation than a consultant coming in for a day - value the people who work for you and get things done. Yes, we all know there are nightmare situations out there in the SQL world, but I think the people I'm addressing know who they are. Kudos to you all.


  • The concept of a data steward

    At PASS this year, Adam Machanic and Peter DeBetta gave a great session presenting things as sort of "anti-patterns". They would have slides that have would say things like "Naming standards are irrelevant" or "always use heaps". It was a blast. The audience chimed in with equally bad ideas (including myself). Kudos guys!

    What really struck me though is when they were talking about security and pointing out that most security issues happen "outside" of the database. A backup tape disappearing, a web application exposing things with no protection, or an employee walking out with credit card data. Adam then quite eloquantly talked about how we DBAs are the "stewards" of the data. We need to understand what is happening before the data hits the database, after it is in the database and once it leaves the database. I couldn't agree more. Someone has to be responsibility for data, and if the not the DBA, who else?

    What does everyone thing - is this the role of the DBA or someone else?


  • An open letter to all 3rd-party vendors: DO NOT USE SA ANYWHERE IN YOUR APPLICATION

    I've run into this problem again and again. Sometimes I've had luck in convince clients that if a 3rd-party application is hard-coded to use SA is shouldn't even be considered. Sometimes not. With all of the issues that have come up with the SA account over the last 10 years, I find it inexcusable that vendors still hard-code their application to use SA. Some at least let you pick your password, which you can make absurdly complex and then throw out. Others still go with SA and no password (yes, even under SQL 2005). Far too many vendors for this just to be a James-like rant.

    How do we solve this? DEMAND that software be changed. Refuse to purchase software where SA is used at all. Only two weeks ago I saw a dictionary attack against the SA account, and that on a SQL 2005 box behind a firewall (meaning it was an employee in all likelihood doing the hacking). I urge all of you - do NOT LET COMPANIES GET AWAY WITH THESE NIAVE SECURITY PRACTICES!

    Who's with me? Or am I just insane (a true possibility)


  • OT: I can't catch a break

    So I was suppose to give a webcast - no connectivity to the Live Meeting website. Oh, I can get here, yeah, but not the MS site. Apparently my ISP sets up routing via carrier pidgeon. So, if you were looking for that webcast, I apologize. I'll post the topic here instead (hey, this works).
  • Restoring Master - beyond the technical

    Thank Greg Low for reviewing my book and pointing out one of the copy-edit issues in it (I wasn't involved in that process by choice and schedule, a mistake I regret). As Greg pointed out in his post, one of the most imporant aspects to dealing with backup and recovery is restoring Master. In my book, it makes reference to a screenshot that should show a basic restore statement in action - instead it is a screenshot of setting the permissions for instant file initialization?? Silly James, I don't know how that got in there, but was probably my fault to start with (and mine for not seeing it first).

    The best resource for is BOL - seriously, the best documentation for any software product anywhere - MSDN also has a copy. SQL Server has to be started in single user mode (sqlserver.exe -m), then a simple restore statement works:

    RESTORE DATABASE MASTER from disk='D:\SQLbackup\master.bak'

    You're forced to restart - but what's next. Guess what happens to your user databases if they're in a new location (such as a new drive letter or directory) - in SQL 2000, they're all suspect. In 2005 they're just unable to go through recovery because the files can't be found and are unavailable. Scary, but you can always restore them with a "with move" statement to change the location (again, BOL people for syntax - if you don't live there, you should; memorizing syntax should come second to understanding process).

    Now what happens if you happen to try to restore to an instance that isn't the same build level? With all of the post-SP2 cumulative updates, if you have a significant installation base it is quite possible (I've been there). It isn't pretty. There is a KB article on that, but the better question might be - before I start restoring this master.bak file, do I know how recent it is? Is it on the same build? Of course its always recommend that whenever a change is to the instance itself, especially a patch, Master should be backed up. This is one of the practical things I talk about in my book - in a disaster situation, the key is not to panic and get your bearings (it drives CEOs crazy if they see you trying to be calm though :) ). This particular issue is also relavant to the Model and MSDB databases.

    We also know what happens with users in other databases when Master is restored. If they are SQL authentication logins, they can be out of sync with the user databases, requiring either a synchronizing script to move the secure IDs randomly created (SID) between the Master and user databases or with a script like sp_change_users_login (BOL people if you don't know it - if you do, you've earned a DBA stripe :) ). Common thought is that if Windows Authentication is used, this mismatched users issue doesn't occur because the SID is the actual Windows SID, so it would be the same in the Master database in the logins as well as all user databases. Dangerous line of thought!! What if you are restoring to a new domain? Guess what, the SIDs mismatch. Again, take time to think about the situation before diving in; once you restore Master, it isn't pretty to clean up if you have to...

    So what's your Master horror story?


  • It's all about the data...

    I read a very interesting post on Kimberly Tripp's blog which was about indexing, but she had a very interesting set of requirements for good indexing that I think apply to everything when designing, tuning, backing up, etc., any database. What strikes me the most are 1) and 2) - how can you tune any sort of system if you don't really know what the data *is* and how it is used? As Kim says:

    "So, what is “finding the right balance” in indexing? In my opinion, there are three requirements/pre-requisites:

    1. knowing the data
    2. knowing how the users use the data
    3. knowing how the underlying structures and database stores/manipulates and uses indexes"

    I can't count how many times I've run into clients where I'm asked to tune or create a disaster recovery plan or secure for a database where no one is really sure what things mean and how it is used. Everyone should start here, and I'd take it to the step of actually know not just the data, but the information; that is, what is the business meaning behind the data. It's a mantra to adopt - "Know thy data".


  • Do we need a SQL Developer-only certification exam?

    Recently Oracle announced a new certification - a "SQL Expert". The target of that certification is for developers that write a significant amount of SQL, but aren't "database developers" (something I'm finding more and more common these days). I guess the idea is that it pushes developers to be sure that if they do write SQL, it's at least good SQL.

    How many times have you run across SELECT *, or copious dynamic SQL, or poorly structured SARGs? I do all the time, and in the MS world it feels like it is getting worse, not better. Since everything revolves around data, I'm wondering if some sort of core TS exam like this should be required not just for the DBAs and database developers (man, we really need better names for what we do), but also the MSPD certification. It feels like really understanding SQL is becoming a "nice to have" for a VB/C# whatever developer, not a "must have". I see it in job postings as well.

     I think (gulp), Oracle has the right idea with this certification. Personally I'd urge MS to do the same. Thoughts?


  • The "meaning" of NULL - a different approach

    I know we've all beaten the discussion of NULL to death on this blog, but I had an epiphany of sorts when flying to Germany for the European PASS conference. Thus, I'm going to pick up the topic just one more time...

    It occurred to me that the word "Null" has a very specific meaning depending on the language you are speaking - TSQL, "academic"-SQL, English or...German, where null literally means zero. Thus, I would naturally expect some confusion when seeing the word "NULL" come up in a result if it is shown to your average German users.

    That brings me to what I think is the fundamental problem with the term NULL. In a very practical sense in SQL Server NULL is:

    • A recording in the NULL Bitmap for a specific row that a specific field (a tuple in the academic lingo) has the property of NULL, i.e., the tuple has a property, it has nothing to do with the value or lack of value in the column

    The problem is that language comes in to play. No one likes trying to explain NULL to users in an academic sense, so we come up with definitions. Here are some of the more popular ones (correct or incorrect):

    • "The absence of value" - I think Codd would have approved of this, as I think C.J. Date and others would endorse. The problem I have with this definition is that there is a connotation with this phrase. To me (and others that I've experienced), the implication is "Why is there the absence of value? Did someone ignore the field? Did the clear it out deliberately? Is it not applicable? Did the consumer not supply an answer to the question?" Thus, NULL chaos ensues, where people take it to mean whatever the conceive of (regardless of the intent of the DBA or developer). Here we get reports supplying questionable information.
    • "Empty Set" - I wouldn't agree that this is an accurate definition, but again it implies - "Why is the set empty?" There must be some reason why there is a lack of data within the "tuple". Thus we get the practical affect of converting NULL to an empty string ('') or a zero (0). Was the logic of the NULL transferred properly from the DBAs and Developers to the end user - no. Again, what I call NULL chaos ensues.
    • "Nothing" - Again I would disagree a bit with the definition, but the connotations remain the same - "Why is it nothing? Did no one enter data? Did they remove data? Is the "tuple" not applicable?" All of these things come to mind naturally to me (althought one could question whether I'm an accurate representation of a "normal" person).
    • "Unknown" - As far as SQL Server goes, this is the best definition. The connotations are almost child like - "It is unknown. - Why is it unknown? - I don't know why....Why doing you know why?...ad infinitum". "Unknown" is also a very practical definition when it comes to understanding things like mathematics where NULLs are involved (6+NULL=NULL -->6 + "unknown" must equal "unknown"

    In any event, we need to keep in mind as database professionals that the actual description used for NULL will naturally carry with it specific connotations, for good or for bad. Language plays a great role in the understanding and ultimate usage of NULL - let's remain aware of that.


  • OT: (again) My book description is finally correct!

    I'll say something technical soon, I promise :) After numerous tries my editor (thank you Jonathan) has gotten the correct description for my book out there on Amazon and such. It's amazing how long changes like that can take. The previous description was almost 3 years old (when I started the book). By the way, if you're thinking of writing a book, be prepared to be steam-rollered. It is a very difficult process, and a complicated balance of business and creativity, deadlines and quality writing.
More Posts Next page »
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement