Conventional wisdom says to “normalize to 3rd normal form then denormalize for performance.” Poppycock! I say for an OLTP operational database (not a reporting or BI database), a well normalized design will outperform a denormalized design for three good reasons:
1) If the denormalization duplicates data, then DML operations have to write to multiple tables, which means extra code and extra work.
2) If the denormalization duplicates data, the duplicated columns will typically be keys, or at least candidate keys, so they'll be indexed, which adds to the index bloat problem.
3) If the denormalization repeats columns (item1, item2, item3, etc.), then the data has to be massaged before being inserted or updated, and when reading it back it typically needs to be unpivoted for set-based code; both operations mean extra work.
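To make point 3 concrete, here is a minimal sketch using Python's sqlite3 module (the table and column names are hypothetical, chosen just for illustration). The repeated-column design forces a UNION ALL unpivot on every set-based read, while the normalized design is read directly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Denormalized: repeated columns item1..item3 on one row.
cur.execute("CREATE TABLE order_flat (order_id INTEGER, item1 TEXT, item2 TEXT, item3 TEXT)")
cur.execute("INSERT INTO order_flat VALUES (1, 'apple', 'pear', NULL)")

# Every set-based read must unpivot the repeated columns first -- extra work.
unpivoted = cur.execute("""
    SELECT order_id, item1 AS item FROM order_flat WHERE item1 IS NOT NULL
    UNION ALL
    SELECT order_id, item2 FROM order_flat WHERE item2 IS NOT NULL
    UNION ALL
    SELECT order_id, item3 FROM order_flat WHERE item3 IS NOT NULL
""").fetchall()

# Normalized: one row per item, read directly with no unpivoting.
cur.execute("CREATE TABLE order_item (order_id INTEGER, item TEXT)")
cur.executemany("INSERT INTO order_item VALUES (?, ?)",
                [(1, "apple"), (1, "pear")])
normalized = cur.execute("SELECT order_id, item FROM order_item").fetchall()

# Same data either way, but the normalized access path is far simpler.
assert sorted(unpivoted) == sorted(normalized)
```

The writes suffer symmetrically: adding a fourth item to the flat design means finding the first empty itemN slot (or altering the table), whereas the normalized design is just another INSERT.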
Those who promote denormalization argue that it trades write performance for read performance. That's true for a reporting database, but as issue 3 above shows, even read performance suffers.
Since denormalization tends to go hand in hand with SQL code in the app layer (no database abstraction in the form of stored procedures), it also hurts extensibility: modifying the database schema becomes even more expensive.
In production databases I’ve built, I’ve tested a pure normalized design against the best denormalized design I could come up with. Having a clearly defined database abstraction layer meant that I could run one script to modify the table structure, correct the data, and alter the stored procedure, and the app could continue to make the same stored procedure calls without breaking. In every case, the normalized design outperformed the denormalized design by about 15%, due to the normalized design’s more efficient code.
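The migration workflow described above can be sketched as follows. This is a minimal illustration, not the author's actual setup: it uses Python functions as stand-ins for stored procedures and sqlite3 for the database, with hypothetical table names. The point is that callers depend only on the call signature, so one migration script can restructure the table, move the data, and swap the procedure body together:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_item (order_id INTEGER, item TEXT)")

# v1 "stored procedure": the app knows only this call signature.
def get_items(conn, order_id):
    return [r[0] for r in conn.execute(
        "SELECT item FROM order_item WHERE order_id = ?", (order_id,))]

conn.execute("INSERT INTO order_item VALUES (1, 'apple')")
assert get_items(conn, 1) == ["apple"]

# One migration script: new structure, data correction, new procedure body.
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE order_item2 (order_id INTEGER, item_id INTEGER)")
conn.execute("INSERT INTO item (name) SELECT DISTINCT item FROM order_item")
conn.execute("""INSERT INTO order_item2
                SELECT o.order_id, i.item_id
                FROM order_item o JOIN item i ON i.name = o.item""")
conn.execute("DROP TABLE order_item")

def get_items(conn, order_id):  # same signature, new body
    return [r[0] for r in conn.execute(
        """SELECT i.name FROM order_item2 o
           JOIN item i ON i.item_id = o.item_id
           WHERE o.order_id = ?""", (order_id,))]

# The app's call site is untouched by the schema change.
assert get_items(conn, 1) == ["apple"]
```

With real stored procedures the mechanics differ (ALTER TABLE plus ALTER PROCEDURE in one deployment script), but the contract is the same: the procedure signature is the API, and everything behind it is free to change.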