Hacker News new | past | comments | ask | show | jobs | submit login
Why Normalization Failed to Become the Ultimate Guide for Database Designers? (lambda-the-ultimate.org)
29 points by blasdel on Jan 9, 2010 | hide | past | favorite | 15 comments



Reaching the perfect database schema is only possible if you have static requirements. Out here in the real world, things change, and the more normalized your database model is, the harder it is to change that to reflect your new business needs.

Another opposing force is real-world performance. In most cases we're fine with a non-perfect model, we're fine with duplicated data, and we're fine with inconsistent data, because the problems they cause are much easier to deal with than performance bottlenecks. Having a slow frontpage because getting your inventory list requires a five-way join is gonna put you out of business really, really fast.


In my experience it is easier to change a normalized scheme then a non-normalized. The redundancy and integrity issues of non-normalized schemas are really difficult to deal with when the schema has to evolve due to changing business needs. I have certainly never observed that un-normalized data should make it easier to evolve a schema - can you provide an example of when this would be the case?

Regarding performance optimization, I think it is valuable to distinguish between the logical model and the physical model. Any kind of caching is a kind of redundancy, but that is not a problem if the database engine guarantees the integrity of the logical model. Relational database engines provides ways to optimize the physical storage and querying of data without compromising the logical model. For example materialized views can provide the same kind of optimization as a denormalized table in the 5-way join case - but without the integrity issues.


Olavk said most of what I wanted to, but he did it better. I want to further emphasize a couple of things though:

the more normalized your database model is, the harder it is to change that to reflect your new business needs.

Like Olavk said,this is not my experience. It is easier to start with a normalized database and adjust. Now when adjust I sometimes have permitted exceptions to the normal form to minimize the number of code changes, but I always prefer to start normalized even if he does not stay that way.

we're fine with inconsistent data,

In the databases I have worked with, this is absolutely never acceptable. I know in some areas it may be, but not where you really care about being right. Every application I have been a DBA for has wanted all of its data to be correct, and if it is inconsistent you know that some of it is wrong. As a DBA I certainly could not guarantee that the data was correct, but I could normally guarantee that it was consistent and that helps.


Whenever Z-Bo writes I feel like there is a thought there, screaming to get out, but muffled by a blanket of words. I wish he would be more concise.


I got the sense that he didn't grasp the point of Kent's paper (c.f. his remark regarding "N+1 schema"). I personally agree with Kent's view regarding the necessity for yet another (but final) abstraction layer to decouple semantic and logical abstractions. At some point in the future DSLs and Schemas will meet in a happy union to achieve this.


To quote one of the first engineers I ever worked with: "all problems can be solved with an additional layer of abstraction."

except performance problems, of course.


"except performance problems, of course."

Well, in some ways, even those can be solved with abstraction. The key is where that abstraction layer is placed. :) The assumption is you add them "on top" of what came before; but if you go in the other direction you may be able to improve performance by implementing, say, a smarter CPU with higher level abstractions built into the silicon. Things like that.


High level languages is also an example. C is really an abstraction layer upon machine code, but in most cases compilers produce faster code than what you would do if you wrote assembler. So C is an abstraction layer that helps you improve performance.


If a problem is so critical that one of its side-effects earns the title "the vietnam of computer science", then I think it is 'ok' to put "performance" considerations on a secondary tier and first see if there is a solution to the problem, in principle.


Then you seem to misunderstand the point. It is not a N+1 approach. There is a bound to N and that is 4.


I didn't see a lot of insight here, but I upvoted it anyway for at least dealing with some of the trickier issues of programming.

If you think of programming in mathematical terms, it's almost like you're the polar opposite of the pragmatic practitioner. This whole idea of a perfect form is inane. Relational set theory gives us conceptual ways of talking about constructs. It's a tool, not a way of life. Somebody build a bad relational model? Well -- what do you mean by "bad"? Did the resulting code achieve the goals set at the beginning?

And when I got to the point in the abstract where it said no books had shown how to use it well, I was ready to bail out. I'm not sure what books the author has been reading. The ones I've seen have practical real-world examples of normalization all through them.


I wonder if it's possible to have some sort of controlled de-normalization. Like, the ability to create a read-only column whose values mimic that of a column in some other table. A "view column" if you will.

I would definitely opt for that as a "real-world" practice.


Joins don't scale.


There need not be any relationship between the schema, which is purely logical, and how it is implemented. In practice, there is, but why this is still the case is baffling.


Let me guess twitter's bigtable:

    - id, nick, mail, pwd, time, mytwithere
    - id, nick, mail, pwd, time, anothertwithere
    - id, nick, mail, pwd, time, mythirdtwithere
    - id, nick, mail, pwd, time, mylasttwithere
hmm, no, that won't work

normalization still wins




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: