THE SQL Server Blog Spot on the Web

Welcome to SQLblog.com - The SQL Server blog spot on the web Sign in | |
in Search

Arnie Rowland

Discussion of issues related to SQL Server, the MSDN SQL Support Forums, the complex interplay between Developers and SQL Server Administrators, and our sometimes futile attempts to have a 'normal' life.

Through the Looking Glass: Elegant -or Not?

SQL Server as an EAV Database -Good Idea? 

It seems like I am getting more and more inquires from potential clients asking for help solving performance related issues with what, after some investigation, is using SQL Server as an EAV database.

First, for those unfamiliar with the concept, an Entity-Attribute-Value (EAV) database seems disarmingly simple and seductive to many developers. In its simplest form, an EAV database requires only one table, and that table has only four columns: an Identifier, the Entity, the Attribute, and the Value of the Attribute. The entity might be a Customer, the Attribute ‘Customer Name’, and the Value would be the actual customer's name. The same table could also contain Products, or Invoices, or Employees, etc.

The beauty of the design is that the developer is free from ever having to consider a table schema. Need another column, that’s just another Attribute/Value pair. Need another table -just a different type of entity. Add as many new entities, or new columns as is desired. One never has to negotiate with a DBA again. There is never a need to have to write that messy and sometimes problematic ALTER TABLE code again. Wonderful concept if you have to distribute changes to the data model in situations where the user (or application) does not have permissions to muck around with the database, the schema, and all of that DBA stuff. There is only one data model, and it is only one table that has four columns. What could go wrong with that?

And, of course, it is possible to select the ATTRIBUTE/VALUE pairs for an ENTITY (or group of entities) and PIVOT the data to simulate a table. It seems simple enough –doesn’t it? (If you want more information, just search for Common Table, EAV Database, Table of Tables, etc.)

In fact, the beauty is in the elegance of the model. Frankly, before the advent of SPARSE columns, you could save a lot of storage when there was a long list of potential attributes (columns), and most entities would only use a small portion of the potential attributes. And it works wonderfully for many situations: obviously, property ‘bags’, somewhat small and static tables –you know, the types of data that will be used to populate various UI controls, or be used for data validation, or internal processes.

I’m not going to address the issues of the increasing complex code required to pivot the faux tables –I’ll leave that to later discussions. Today, I’m interested in performance issues. One client has an EAV database that is approaching two billion rows in that single table –and the performance is starting to become quite troublesome. They really didn’t like my even broaching the subject of a data model redesign –too much ego involved in this elegant design.

I’m interested in your experiences, cautions, and concerns surrounding performance issues of using SQL Server as an EAV database.

So tell me, when does something that is elegant from one perspective become burdensome and inelegant from another?

 

Published Friday, October 03, 2008 12:19 AM by ArnieRowland

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

 

unclebiguns said:

Elegance is in the eye of the beholder.  In my eyes an elegant solution is on that is flexible, yet can be understood by new person coming into the department fairly quickly.  I think the furthest you should go with the EAV model is an EAV table per entity.  For example, in an order entry application, you may have customers, orders, order_details, and products.  I think it is valid to have an attributes table for each of these entities (customer_attributes, etc...) and an base attributes table that stores the entity (customer, order, etc...) and the attribute (contact_name for cutsomer, color and size for product, etc...).   This supplies flexibility, but is also simple to understand.  As soon as you get beyond this you sacrifice simplicity and maintainability for flexibility.  And, as you note likely performance.

October 3, 2008 8:04 AM
 

Paul Nielsen said:

Hi Arnie, and welcome to SQLBlog.com!!!

I've played with the EAV pattern a few times, even built a production database for material specs using it back in SQL 6.5.

and I blogged a few thoughts on this pattern a while ago...

http://sqlblog.com/blogs/paul_nielsen/archive/2006/11/27/374.aspx

Someday I'd like to flesh out a cool abstration layer for EAV that code gens all the ugly crud work. It's all generic enough that it's the perfect code gen pattern. I got somethign like this workign for one iteration of Nordic, but then didn't like it so I threw it away.

But tell me Arnie,

How do you solve the referential integrity and data type issues?

-Paul

October 3, 2008 1:43 PM
 

Paul Nielsen said:

I think the performance issues come when code pivots to create the fake table and then filters for joins or where clauses. The filtering has to happen before or in the pivot.

October 3, 2008 1:45 PM
 

Adam Tappis said:

I think EAV has it's place, but it must be used with care. A single table database is neither elegant nor is it a good design.

As mentioned, it's particularly good as a space saver (if implemented correctly) but more so to handle fairly dynamic changes to the schema.

SPARSE columns in SQL server 2008 adress the first issue, but you're still left with ALTER TABLE statements to add new attributes. ALTER TABLE in itself is not an issue, but DBA's certainly wouldn't want to grant DDL_admin to applications let alone end users. ETL processes on the other hand are a different story.

I've recently implemented an EAV model in a system where users are given a tool to define new custom dimension attributes. This is perfect for EAV because to store the definitions and meta daat the application doesn't need to make schema changes and it can interface nicely with SSIS data flows which are meta data hungry beasts that don't like changing schemas. After the periodic DW build, the user customizations are extracted from teh EAV model and loaded into the DW. This is where SPARSE columns come in, once you PIVOT the data. I use Itzik Ben Gan's dynamic pivot solution to pivot the data that has been extracted in EAV format to look like a dimension table and hence no impact on SSIS.

So, is EAV evil? Absolutely not.

Is it dangerous in the srong hands? Absolutely yes.

Is it a good idea to implement it in SQL 2008? I don't think it matters what version of database you consider it's the technical requirements and the design that should drive the implemetation. SQL Server 2008 SPARSE columns address only part of the problem (storage space which is increasingly cheaper).

EAV should definately not be percieved as a "one design fits all" solution.

March 13, 2009 2:35 PM

Leave a Comment

(required) 
(required) 
Submit
Powered by Community Server (Commercial Edition), by Telligent Systems
  Privacy Statement