|
|
 |
Blogs Toad and Database Commentaries |
Toad World blogs are a mix of insightful how-tos from Quest experts as well as their commentary on experiences with new database technologies. Have some views of your own to share? Post your comments! Note: Comments are restricted to registered Toad World users.
Do you have a topic that you'd like discussed? We'd love to hear from you. Send us your idea for a blog topic.

 |
 |
|
|
 |
 |
Location: Blogs
Bert Scalzo's Blog
|
|
| Bert |
Thursday, October 22, 2009 3:48 AM |
When designing and constructing a successful (i.e. effective and efficient) relational database, there are two fundamental sets of rules or design principles that are universally accepted and generally practiced by database architects:
While no set of rules is ever absolute, these nonetheless provide the fundamental foundation by which successful relational databases are designed and built. There are times when “minor compromises” or deviations from these rules are justifiable, but these are more the exception – plus there are some well known design mistakes to avoid at all costs. Nonetheless these mistakes happen with great frequency.
The use of “Concatenated” or “Compound” fields, that is where a “single column” contains multiple values separated by some special character (e.g. a semi colon), violates key essential rules from both design paradigms and constitutes one of the worst possible database design mistakes that can occur.
Codd's Rule #2 requires that every individual scalar (i.e. simple) value in the database must be logically addressable by specifying the name of the table and the name of the containing column. When one has to “parse a column” to access different values within that column, all report performance will suffer for three reasons. First, most databases cannot utilize indexes when retrieving rows where the report has to perform a SUBSTRING operation within a WHERE clause on such values. Second, the sheer time to perform said parsing for millions to billions of rows is far too expensive. Third, most databases can't provide referential integrity for JOIN operations on such values.
Database Normalization defines “normal forms” that database designers should strive to achieve within their relational databases. Most database architects are familiar with 1st, 2nd and 3rd normal forms – but there are several others, including 4th, Boyce Codd or 4.5 and 5thnormal forms. The basic design starting point of first normal form requires that a table's rows not contain repeating groups of values, that those repeating groups instead be separated as separate rows in a child related table. However for reporting purposes and data warehouses, it's not uncommon to relax this rule. The preferred method for those repeating groups would be as several columns rather than a single column containing multiple values separated by a special character. Again, the numerous well known performance ramifications for such a poor database design choice are well known, documented and to be avoided at all costs.
There are numerous articles, papers and books on this topic – including these examples:
|
|
| Permalink |
Trackback |
|
 |
 |
|
 |
|
 |
|
|