THE SQL Server Blog Spot on the Web

Welcome to - The SQL Server blog spot on the web Sign in | |
in Search

Buck Woody

Carpe Datum!

The Database Design Process

I need your help. I know how I create databases, and I’ve watched a lot of other data professionals follow their own processes for that, but I want to know how YOU do it.


I’ve written about the process I follow for a complete database design on InformIT (use the "Next" button at the bottom of these to see them all). Beyond starting with the business requirements and REALLY hammering that out, here is the general outline I use:


1.      Pull out the nouns from the business requirements (“Client”, “FirstName”, “LastName”, “Business Name”, “Business Street” )

2.      Group the nouns into “parent | child” sets (“Client: First Name, Last Name”, “Business: Business Street”)

3.      Continue Steps 1 and 2 until you can’t do it any further. These are your tables and columns.

4.      Set a value that uniquely identifies every record (line). This becomes the Primary Key. I normally use a “surrogate” key, but a natural key also works if you need it. I don’t like compound keys, but I’ll use them where they make sense. But the overall goal is that every key should be able to identify one “Buck Woody” from another “Buck Woody” record. If it doesn’t, I haven’t done steps 1-2 enough, or perhaps I’ve done it too much.

5.      Examine each and every column and ensure that they are in the proper type. Check the business requirements to make sure. (dates are always dates, money is money, numbers are always numbers and so on) Repeat until everyone agrees.

6.      Examine the relationships between the tables. Can there be many clients in a company? Can there be many companies for each client? This sets up Foreign Keys, and potentially other tables to solve many-to-many relationships.


There’s a bit more to it than that, and the business requirements side of things are where I actually spend the most time. If you get that wrong, the most beautiful design in the world won’t work over time.


I currently use DBDesignerFork for my design documents to coordinate with my business and development teams,  which is not a perfect tool. But Microsoft doesn’t have a good one (we did in Visio, but we messed that up) and so this is what I have to work with. I don’t keep up with this diagram after I create it; it’s just a tool to help me communicate from business to dev to DBA’s.


So now here is where I need your help. Will you post a response here (if you design databases very often) that tells me:


1.      What process do you follow?

2.      How important are the business requirements?

3.      What tool do you use to create the design, do you need it to diagram, do you even care about diagrams?

4.      What’s your biggest pain-point about designing? Not with SQL Server, mind you, just the designing part.



Published Thursday, January 14, 2010 7:02 AM by BuckWoody



AaronBertrand said:

Once I have the base entities straight, I design the schema using CREATE TABLE statements.  I like to see tables and columns representing the "nouns" just because that's the way I think.  I am not a huge fan of ER tools or SQL Server's diagrams, as I find they are just an extra step on the way to building the database, though I do resort to them sometimes when I need to communicate the schema to others.

My main problem is that I have yet to be involved in a project where the business requirements (which are crucial!) are completely known up front, or at least remained stable throughout the design process.

Sometimes building out the schema has led to questions that challenge the business requirements, often leading to re-hashing the requirements or at least trying to make more sense of them.  Other times the requirements change during the design phase for other reasons - more client feedback, allowing "nice-to-haves" into the hard requirement bullet list, scope creep, etc.

Unfortunately these are the realities of most software projects that I don't think I'll have many opportunities to escape.

January 14, 2010 10:33 AM

Brent Ozar said:

My thoughts:

1. I haven't done from-scratch "formal" datamodeling in years.  These days when I do it, it's for my own pet projects, sites or tools that I'm building mostly just for myself.  As such, I have to pull steps 1 & 2 & 3 out of thin air, but I like doing it on paper first before I approach a datamodeling tool.  I do step 6 after step 3 though.  Seems like when I skip that, I find that I didn't catch a normalization mistake, and I find that an attribute is really many-to-many, and then I have to redo my datamodel.  That's why I really like doing v1 on paper.  After I've done several iterations on paper, then I hit Toad Data Modeler.  

2. Important, and question everything.  Never take just one person's answer.  Just because the project manager doesn't think a person can have two companies doesn't mean it's true.  Ask the people who are currently using the system you're about to replace, whether it's a paper system or a computerized one.  Ask them, and they'll know the exceptions to the rule.

3. Quest Toad Data Modeler.  I work for Quest, but I can give this answer with a straight face because I used the original tool (Case Studio) before it was bought by Quest, and long before I came to work for Quest.  That tool is awesome for the price.  I like diagrams because it helps me communicate ETL issues to people who manage other databases.

4. Right now, visualizing partitions.  If I want to shard or partition, it's painful to show different ways of doing that.

January 14, 2010 10:37 AM

peschkaj said:

I was going to do a blog post on something new I've been trying but I've been too lazy to do it.

I do everything on paper before I go near a computer. High level boxes and arrows stuff. Then I start decomposing the top level nouns that I came up with into their component parts. Sometimes, I'll actually write it all out long hand. If I can't figure out how something works and explain it clearly via prose, how can I hope to effectively model it and write code to deal with it?

Only after I really understand things do I get to the point of modeling nouns and verbs and then writing SQL.

January 14, 2010 10:46 AM

Ozziemedes said:

Hi Buck,

1) I'm pretty much a traditionalist when it comes to DB Modelling.  I'm sure these approaches will be pretty familiar to you.

My DB Design work routinely starts with a big (A3 or bigger) piece of paper or a whiteboard and a domain map - for example, in a traditional order processing system I'd have a big box for Sales, another box for Products, another for Order Fulfilment, one for Employees, one for Suppliers, etc.

I then put boxes inside (or spanning) those regions, starting at the intersects with bridging entities e.g. Orders bridges Order Fulfilment and Sales, Products connects to Orders, Suppliers connects to Products, Employee Sales, etc.  I'll then work back towards the centre of each domain box with more specific entities (e.g. Order Details, Product Options, Employee Type, Sales Channel Type, etc.

Once I've sketched out the top level entities (Len Silverston's reference models are really handy for this) I then sit down with the clients/end-users and attack the model.  What bits have I missed?  What bits don't you need?  What are the business processes that are supported by each of these subject areas?  That usually leads to more entities, types, sub-types, etc.. and tends to surface parent-child hierarchies.

With the extra detail about the entities, I'll then start adding some meat to the bones.  What do you need to know about an order?  What information about an order do you need to fulfill it correctly?  What are the decision points around order processing?  Likewise for the other core entities identified.

By the time I'm done with this, my model will typically be in ERwin or Visio.  I usually have a pretty good handle on cardinality, required business information fields (e.g. Names, phone numbers, order quantities, etc), business metadata fields required to support business automation (e.g. order IDs, invoice numbers, payment IDs).  At this point I'll do a normalization and refactoring sweep to do things like convert subtypes to type detail tables with a type-parent intersect table.

Next step is adding unique keys.  I'll probably already have an Int/Decimal(N, 0) of some description doing duty as a surrogate key which also has a physical PK constraint defined on it.  I'll define unique constraints against natural keys.  I'll also start thinking about data domains (check constraints, defaults, rules, etc) and start adding additional "state" metadata - e.g. IsCourierDeliveryOnly on Product, Is401KEligible on Employee, etc.  I'll also have data-types on pretty much all of the fields in the model.

At this point, I'll put the model back in front of the stakeholders and do a walkthrough.  I'll correct clarified requirements, add new columns/constraints, refactor where necessary and make sure that all auditing requirements are clearly captured.  This usually gives me enough to freeze the model at a point where I can hand it to a dev team to start coding against it.

This is where I'll do a cut to SQL Files, import it into VSTS Dev/DB edition and start working on code-gen for standard CRUD procs, triggers and functions.  I'll keep tinkering with the model - adding metadata to support the generation of CRUD procs/functions/triggers (e.g. extended properties, etc), and move from managing the model as a single artefact to managing individual physical objects, synching the changes through source control on a per-item basis.

2) Business requirements are critical, but you can't do it all at once.  In my experience, end-users and stakeholders typically don't understand enough about their processes to come up with all of the required information at once.  I'm convinced you need at least 2, and up to as many as 5 detailed passes (if it's a big app) on the model to get to a point where the model stands up to whatever the devs and users throw at it.

3) Not only will I put the model into a diagram, but I'll try to make the diagram as readable as possible by eliminating crossing relationship lines where possible.  Readability is really important, because when a business user comes to me and says "Did we capture XYZ" I can find the entities concerned and trace their relationships quickly without having to navigate a huge snarl of overlapping relationship lines.  As discussed, ERwin and Visio are tools of choice - but if you have ANY pull with the Visio people, can you please get them to unseal the DB modelling classes so we can do automation against them in the same way as we can do in ERwin.

4. Biggest pain points?  

A) The lack of policy based design in most modelling tools - e.g. It would be fantastic to have Visio not only telling me that I have a subtype relationship that needs refactoring, but automatically refactoring it for me when I move to physical model mode using a configurable refactoring policy for which I can control the templates.  

B) The lack of XMI formatted export capabilities in most modelling tools.  From memory, it was possible to export SQL Server 2000 Metadata Services models to XMI, but since that product has been deprecated, we're stuck with forward engineering to SQL and having to export XSD's via "SELECT ... FOR XML SCHEMA".  

If the "M" language in Oslo fixes these issues, then I'll be thrilled.  As a data architect, I'm excited by what SQL Server Metadata Services offers, not only for SQL Server data design, but for pushing data designs deeper into the solution stack - e.g. generating biztalk schemas and XSL transformation defintions for common transactions based on updateable views, generating WSDL method signatures for those XSDs, generating data-grid controls from those views, generating data access layers from SQL (or model) metadata, etc.


Jeremy Huppatz


January 14, 2010 11:05 AM

Alexander Kuznetsov said:

The following is a great write up from Martin Fowler:

"Evolutionary Database Design"

"Is Design Dead?"

January 14, 2010 2:37 PM

Giammarco Schisani said:

Nice and simple process. Even a newbie can understand it :-)

I personally go (more or less) through the normalization process, steps 1 to 3.

Not sure, but now it's kind of a second nature. Database design just happens :-)

January 15, 2010 12:09 PM

Arnoudvan Bers said:

Fully Communication Oriented Modeling

January 19, 2010 2:36 PM

Peter said:

I use MS VEA (Viso for Enterprise Architects) to create and evolving diagram. I defne Keys, Relations, Indexes, Constraints, and all the comments there and it can generate script for me (comments will be generated as extended properties).

One problem in Viso is that it has limited flexibility in name convention for Keys and constraints etc.

January 21, 2010 10:09 AM

Feodor Georgiev said:

1.      What process do you follow? - I start with a pen and paper. I sketch out the entities and their properties (tables and columns). Then I think about the relationships between them. Immediately after that I start thinking about the performance of my database and I adjust the database model to my query design; I make sure that my database design is completely agreeing with the performance requirements for my queries. In other words, I make sure that the database design serves the query performance. Additionally I design the indexes and the statistics with conjunction with the query performance and space used.

2.      How important are the business requirements? - The business requirement is key, however performance has priority.

3.      What tool do you use to create the design, do you need it to diagram, do you even care about diagrams? - I actually always use the SSMS tool for Database Diagramming. It works great for me. Furthermore, I always enforce colleagues and even business decision makers to look at the database diagrams.

4.      What’s your biggest pain-point about designing? Not with SQL Server, mind you, just the designing part. - Sometimes the deadlines. :)

March 3, 2010 2:18 PM

Buck Woody said:

Some time back I posted a blog entry (mirrored here and here ) asking you how you design databases. Out

March 24, 2010 10:12 AM
New Comments to this post are disabled

About BuckWoody

This Blog


Privacy Statement