<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tags 'Design' and 'Writing'</title><link>http://sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Design,Writing&amp;orTags=0</link><description>Search results matching tags 'Design' and 'Writing'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>Chapter 8–Patterns and Anti-Patterns</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2011/07/10/chapter-8-patterns-and-anti-patterns.aspx</link><pubDate>Sun, 10 Jul 2011 17:29:42 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:36759</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;In this last kind of “creative” chapter, I will look at some of the ways you implement common problems in your relational database, and some of the ways you probably shouldn’t. The “should” sections will deal with:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;strong&gt;Uniqueness &lt;/strong&gt;– Beyond the simple uniqueness we have covered in the first chapters of the book, looking at some very realistic patterns of solutions that cannot be implemented with a simple uniqueness constraint.&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Data Driven Design &lt;/strong&gt;– The goal of data driven design is that you never hard code values in your code that don’t have a fixed meaning. You break down your programming needs into situations that can be based on sets of data values that can be modified without affecting code.&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Hierarchies&lt;/strong&gt; – A very common need is to implement hierarchies in your data. The most common example is the manager-employee relationship. In this section I will demonstrate the two simplest cases, and summarize other methods that you can explore &lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Images, Documents, and Other Files &lt;/strong&gt;– There are quite often a need to store documents in the database, like for a web users’ avatar picture, or a security photo to identify an employee, or even documents of many types. We will look at some of the methods available to you in SQL Server.&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Storing User-Specified Data &lt;/strong&gt;– You can’t always design a database to cover every known future need. In this section I will cover some of the possibilities for letting the user extend their database themselves in a manner that can be somewhat controlled by the administrators.&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Generalization &lt;/strong&gt;– In this section we will look at some ways that you will need to be careful with how specific you make your tables so that you fit the solution to the needs of the user.&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;This marks an increase of a 4 sections from the last book, when I added this Patterns chapter.&amp;#160; I did take away a few bits about sequence and calendar tables, but I do plan to move this to a later chapter on development, where I will discuss the sorts of objects that I find nice to have in each database and why.&lt;/p&gt;  &lt;p&gt;For the anti-patterns, I am adding one more in this edition, on undecipherable data.&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;strong&gt;Undecipherable data&lt;/strong&gt; – Too often you find the value 1 in a column with no idea what 1 means without looking into copious amounts of code. &lt;/li&gt;    &lt;li&gt;&lt;strong&gt;One-size-fits-all domain&lt;/strong&gt; – One domain table to implement all domains rather than individual tables that are smaller and more precise&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Generic key references&lt;/strong&gt; – Having one column where the data in the column might be the key from any number of tables, requiring you to decode the value rather than know what it is.&lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Overusing unstructured data&lt;/strong&gt; – The bain of existances of the dba, the blob of text column that the&amp;#160; user swears they put well structured data in for you to parse out. Can’t eliminate a column for notes here and there, but overuse of such constructs lead to lots of dba pain.      &lt;br /&gt;&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;This is one of my favorite chapters because it really gets to the core of design. Up until now we have stuck mostly to basics and fundamentals, building very basic structures and working with the object types available to us. Here we build practical solutions to solve common problems.&amp;#160; &lt;/p&gt;  &lt;p&gt;From here, the next chapter we will move along to security, structures, and then putting the finishing touches on things. Hopefully soon I will have some exciting new about a final chapter that will tie it all together once I get final approval and acceptance from the guest writer on this chapter.&lt;/p&gt;</description></item><item><title>Normalization and How to Know When You Are Done… The short version…</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2011/05/29/normalization-and-how-to-know-when-you-are-done-the-short-version.aspx</link><pubDate>Sun, 29 May 2011 20:54:15 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:35960</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;A while back, I was working on a short article about Normalization for a book that never got published (admittedly I wasn’t getting paid for the article, and it wasn’t for charity, so I wasn’t &lt;em&gt;&lt;strong&gt;that&lt;/strong&gt;&lt;/em&gt; broken up over it.)&amp;#160; The task at hand was to, in 2 pages or less, describe the process of normalization and help you to know when you have finished. In my upcoming book Pro SQL Server 2000 + N (where N &amp;gt; 10) Relational Database Design and Implementation, it takes about 45 pages. So it wasn’t really a realistic task, especially considering I have spent about a full paragraph letting you know how hard the task is going to be. The most important thing that is missing from this short introduction is examples, which I include in the book in truck loads.&lt;/p&gt;  &lt;p&gt;There are two distinct ways that Normalization is approached. In a very formal manner, there are a progressive set of “rules” that specify “forms” that you are working to achieve. There is nothing wrong with that definition, but progressing through the forms in a stepwise manner is certainly not how any seasoned data architect is likely to approach the problem of designing data storage. Instead, you design with the principles of normalization in mind, and use the normal forms as a test to your design. &lt;/p&gt;  &lt;p&gt;The problem with getting a great database design is compounded with how natural the process seems. The first database that the past uneducated version of me built had 10+ tables, all of obvious ones like customer, orders, etc. set up so the user interface could be produced to satisfy the client. However, tables for address and even order items were left as part of the main tables, making it a beast to work with for queries, and as my employer wanted more and more out of the system, the design became more and more taxed. The basics were there, but the internals were all wrong and the design could have used about 50 or so tables to flesh out the correct solution. Soon after (at my next company, sorry Terry), I gained a real education in the basics of database design, and the little 1000 watt halogen light bulb went off… &lt;/p&gt;  &lt;p&gt;That light bulb was there because what had looked like a more complicated in the college database class that no normal person would have created (bet you can’t guess what my grade was in &lt;strong&gt;&lt;em&gt;that &lt;/em&gt;&lt;/strong&gt;class!) was really there to help my design fit in with the tools that I was using. Turns out that the people who create relational database engines use the same concepts of normalization to help guide how the engine is created that I needed to for a database to work well. So if the relational engine vendors are using a set of concepts to guide how they create the engine, it turns out to be actually quite helpful if you follow along.&lt;/p&gt;  &lt;p&gt;First, lets look at the “formal” rules. The normalization rules are stated in terms of “forms”, starting at First Normal Form, and including several others some of which are numbered, some are named for the creators of the rule. (Note that in the strictest terms, to be in a greater form, you ought to also conform to the lesser form. So you can’t be in third normal form and not give in to the definition of the First). To be honest, it is rare that a data architect will actually refer to the normal forms&amp;#160; in conversation specifically unless they are having a nerd argument with a developer that is trying to design an entire customer relationship management system in a single table, but understanding the basics of normalization is essential to understanding why it is needed. What follows is a very quick restatement of the normal forms:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;b&gt;First Normal form/Definition of a Table&lt;/b&gt; – Attribute and row “shape”       &lt;ul&gt;       &lt;li&gt;All columns must be atomic—one value per column &lt;/li&gt;        &lt;li&gt;All rows of a table must contain the same number of values – no arrays &lt;/li&gt;        &lt;li&gt;Each row should be different from all other rows in the table – unique rows          &lt;br /&gt;&lt;/li&gt;     &lt;/ul&gt;   &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Boyce-Codd Normal Form – &lt;/b&gt;Every&lt;b&gt; &lt;/b&gt;candidate key is identified, and all attributes are fully dependent on a key, and all columns must identify a fact about a key and nothing but a key.       &lt;ul&gt;       &lt;li&gt;Encompasses:          &lt;ul&gt;           &lt;li&gt;Second Normal Form - All attributes must be a fact about the entire primary key and not a subset of the primary key &lt;/li&gt;            &lt;li&gt;Third Normal Form - All attributes must be a fact about the primary key and nothing but the primary key              &lt;br /&gt;&lt;/li&gt;         &lt;/ul&gt;       &lt;/li&gt;     &lt;/ul&gt;   &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Fourth Normal Form&lt;/b&gt; - There must not be more than one multivalued dependency represented in the entity. That is to say that every attribute relates to the key with a cardinality of one. Not a common rule to violate, but it definitely does occur.       &lt;br /&gt;&lt;/li&gt;    &lt;li&gt;&lt;b&gt;Fifth Normal Form&lt;/b&gt; - All relationships are broken down to binary relationships when the decomposition is lossless. Very rarely violated in typical designs. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;There are other, more theoretical forms that I won’t mention, but they are rare to even encounter the definition. In the reality of the development cycle of life, the stated rules are not hard and fast rules, but merely guiding principles that can be useful to help you avoid certain pitfalls. In practice, we end up with denormalization, (meaning purposely violating a normalization principle for a stated, understood purpose, not ignoring the rules to get done faster) mostly to satisfy some programming or performance need from the consumer of the data (programmers/queriers/etc)&lt;/p&gt;  &lt;p&gt;Once you deeply “get” the concepts of normalization, you really will find that you build a database like a well thought out Lego creation, desiring how each piece will fit in to the creation before putting pieces together, because disassembling 1000 Lego bricks to make a small change makes Legos more like work than fun. Some rebuilding based on keeping agile can be needed, but the more you plan ahead, the less data you will have to reshuffle. &lt;/p&gt;  &lt;p&gt;In actual practice, the formal definition of the rules aren’t thought of at all, but instead the guiding principles that they encompass are.&amp;#160; In my mind, I use the following four concepts in the back of my mind to guide the database I am building, falling back to the more specific rules for the really annoying/complex problem I am trying to solve:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;&lt;b&gt;Columns&lt;/b&gt; - One column, one value &lt;/li&gt;    &lt;li&gt;&lt;strong&gt;Table/row uniqueness&lt;/strong&gt; – Tables have independent meaning, rows are distinct from one another. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Proper relationships between columns&lt;/b&gt; – Columns either are a key or describe something about the row identified by the key. &lt;/li&gt;    &lt;li&gt;&lt;b&gt;Scrutinize dependencies &lt;/b&gt;- Make sure relationships between three values or tables are correct. Reduce all relationships to binary relationships if possible. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;The question in the title still has yet to be conquered. “How to&amp;#160; know when you are done?” What I left out of the description of Normalization was the granularity you go with. The word “atomic” is a common way to describe a table or column that is normalized enough. Atomic would tend to indicate something that is broken down to its absolute lowest form. But unless you are not a nerd (and would you really be reading this if you weren’t?) we know that there are lots of particles smaller than an atom. When you try to mess with particles smaller than the atom, you get a mushroom cloud that even Timothy Leary would not have approved of.&lt;/p&gt;  &lt;p&gt;It is the same way with databases. Tables and columns split to their atomic level have one and only one meaning. Deal with them at a higher level, and you will suffer with lots of substrings, switching attributes that you use to find out what a table means in a situation. But break things down too far, and you will suffer even more. My best example of this is a column that holds a large quantity of text. If you never need to us part of the data using SQL, a single column is perfect (a set of notes that the user uses on a screen is a good example.) You wouldn’t want a paragraph, sentence, and character table to store this information. On the other hand, that same character column is abused when the users start putting coded information (because users WILL find a way to work if your software fails them). Then and you have to search for, you will need to begin working with the less comfortable string manipulation functions in SQL… And just try to index a part of a large text column. Possible? Sometimes. Best way to go? Never. &lt;/p&gt;  &lt;p&gt;The key to knowing what is normalization and what is an academic exercise for a nerd is to understand the needs of the users (commonly referred to as requirements, as in “Why don’t we ever have good requirements before we code!?!”). If it is clear that the user is planning on maintaining a list of values and will need to update them programmatically, then it is your job to make each value a row in a table. But if there is no requirement to ever search on a value in that list or programmatically access part of the value, then it might be overkill to do anything other than leave the value alone.&amp;#160; It is often best to err on the side of caution, but the ideal relational storage for a document would be minimally at the word/punctuation level. If you are read this far and are convinced that would be the proper solution, then you need to get a complete book or take a class on the subject before you start creating a relational database.&lt;/p&gt;  &lt;p&gt;The reasonable answer to when you are done normalization is when users have exactly the right number of places to store the data they need and you can query/use the data without parsing it… Easy enough until the user changes their mind, huh?&lt;/p&gt;</description></item><item><title>Design Book–Dimensional or No Dimensional, that is..the question</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2010/11/30/design-book-dimensional-or-no-dimensional-that-is-the-question.aspx</link><pubDate>Wed, 01 Dec 2010 02:29:31 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:31199</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;So, it is right there in the title of the book “Relational Database Design” etc (the title is kinda long :)&amp;#160; But as I consider what to cover and, conversely, what not to cover, dimensional design inevitably pops up. So I am considering including it in the book. One thing I try to do is to cover topics to a level where you can start using it immediately, and I am not sure that I could get a deep enough coverage of the subject to do that. I don’t really feel like it has to be the definitive source on all topics, but it should be usable.&lt;/p&gt;  &lt;p&gt;For example, I have a chapter on physical structures, indexes, files, partitions, etc. After reading the chapter, you will have a good enough feeling for the structure of the database to know the basics of applying indexes, partitions, filegroups, etc and a good amount of when and where to apply them.&amp;#160; In the end, I usually direct readers to go elsewhere for the truly advanced coverage of the SQL Server Internals (like the books of the owner of the &lt;a title="http://www.sqlserverinternals.com/" href="http://www.sqlserverinternals.com/"&gt;http://www.sqlserverinternals.com/&lt;/a&gt; domain, in fact).&lt;/p&gt;  &lt;p&gt;In this case, I am not really thinking that a full chapter is possible, but more of a quick intro. Fact and Dimension tables, and an example or two. The real payoff will be in the case study chapters where I could include a basic set of tables to demonstrate how offloading reporting to a dimensional structure could be done and how it can be used to produce a reporting model that takes the complexity out of writing reports.&amp;#160; No ETL coverage of course, and I would use a complete reload script to simulate the data.&lt;/p&gt;  &lt;p&gt;Admittedly, the point of these design book blogs is to talk myself into or out of doing something, and usually I pretty much have the answer by the time I am finished. In this case I am still not sure.&amp;#160; On the good side of thing, I won’t have to just say “data warehousing, good idea, denormalization, bad idea, get another book and find out”.&amp;#160; On the other hand, can I really cover the topic deep enough to make it worthwhile?&amp;#160; I don’t think I can include an example in the case studies and not introduce it in the skills chapters. &lt;/p&gt;</description></item><item><title>What is a physical database?</title><link>http://sqlblog.com/blogs/louis_davidson/archive/2009/06/11/what-is-a-physical-database.aspx</link><pubDate>Thu, 11 Jun 2009 22:26:36 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:14601</guid><dc:creator>drsql</dc:creator><description>&lt;p&gt;A bit of terminology that gets beaten to death is that of the “physical” database.&amp;#160; I would think most every DBA uses this term (I do), but…to mean what?&amp;#160; I think there are two common utilizations:&lt;/p&gt;  &lt;ol&gt;   &lt;li&gt;The layer of tables, constraints, indexes, etc used to store data &lt;/li&gt;    &lt;li&gt;The actual on-disk structures. &lt;/li&gt; &lt;/ol&gt;  &lt;p&gt;Frankly, until 3 years ago, I used the first interpretation.&amp;#160; However, I was beaten up pretty badly by a few people whom I don’t really remember (I think &lt;a href="http://www.simple-talk.com/author/anith-sen/" target="_blank"&gt;Anith Sen&lt;/a&gt; was one of them.)&amp;#160; The problem is, I was scolded, &lt;strong&gt;“physical”&lt;/strong&gt; already had a meaning, given it by the “founder” himself, EF Codd. &lt;/p&gt;  &lt;p&gt;So, checking his 12 Rules, Codd stated the following two things:&lt;/p&gt;  &lt;p&gt;&lt;b&gt;Rule 8:&lt;/b&gt; &lt;i&gt;Physical data independence&lt;/i&gt;: &lt;/p&gt; Changes to the physical level (how the data is stored, whether in arrays or linked lists etc.) must not require a change to an application based on the structure.   &lt;p&gt;&lt;b&gt;Rule 9:&lt;/b&gt; &lt;i&gt;Logical data independence&lt;/i&gt;: &lt;/p&gt; Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data independence.   &lt;p&gt;And actually, the implementation layer really is the logical model if you follow his terminology since his rules were pertaining to the relational model and not the entire design process.&amp;#160; This article says it better than I can in a long blog, but I am not sure about that URL (mac.com?): &lt;/p&gt;  &lt;p&gt;&lt;a href="http://homepage.mac.com/s_lott/iblog/architecture/C465799452/E20080301143528/"&gt;http://homepage.mac.com/s_lott/iblog/architecture/C465799452/E20080301143528/&lt;/a&gt; &lt;/p&gt;  &lt;p&gt;The physical layer of a relational database occurs down at the file system level.&amp;#160; Codd's &amp;quot;Rule 8&amp;quot; (Physical Data Independence) says that the things we're designing in ERwin (and similar tools) are the things our application depends on.&amp;#160; These are not physical in nature, but are the relational implementation.&amp;#160; &lt;/p&gt;  &lt;p&gt;So the thing I am trying to say is that physical means that a little 5 volt charge is sitting there representing a bit of data in the physical world.&amp;#160; I like the term logical to mean implementation platform non-specific.. The thing in the middle is the SQL Server/Relational&amp;#160; implementation specific model.&amp;#160; It may take liberties to optimize for SQL Server, but it is not physical. That is were partitioning. indexing, filegroups, etc come in. Changes to this layer ought never be noticable by the application.&amp;#160; &lt;/p&gt;  &lt;p&gt;I guess in the comments, I ought to expect a good number of replies that might start to answer the question.&amp;#160; Does it matter? Is it only semantics? Hey if you don’t think semantics matter, I hope that when you find yourself drowning that the person who has the choice of tossing you a life preserver or a sack of door knobs interprets the meaning of your cry for help in the way you intended. You would hate to find yourself at the bottom of a lake thinking “hmm, I wonder why they did that? Did they hate me, of just mis-interpret the meaning of my sentence?&amp;quot; &lt;/p&gt;</description></item></channel></rss>