Thursday, August 16, 2012

What is Normalization? OR What are the Different Types of Normalization?

Answer:
Note: - A regular .NET programmer working on projects often stumbles on this question, which is but obvious. The bad part is sometimes the interviewer can take this as a very basic question to be answered and it can be a turning point for the interview. So let's cram it.
It is set of rules that have been established to aid in the design of tables that are meant to be connected through relationships. This set of rules is known as Normalization.
Benefits of Normalizing your database include:
  • Avoiding repetitive entries
  • Reducing required storage space
  • Preventing the need to restructure existing tables to accommodate new data
  • Increased speed and flexibility of queries, sorts, and summaries
Note: - During an interview, people expect to answer a maximum of three normal forms and that's what is expected practically. Actually you can normalize database to fifth normal form. But believe this book, answering three normal forms will put you in a decent shape during an interview.
The three normal forms as follows:

First Normal Form

For a table to be in first normal form, data must be broken up into the smallest units possible. In addition to breaking data up into the smallest meaningful values, tables in first normal form should not contain repetitions groups of fields.
Repeating groups example
In the above example, city1 and city2 are repeating. In order for these tables to be in First normal form, you have to modify the table structure as follows. Also note that the Customer Name is now broken down to first name and last name (First normal form data should be broken down to the smallest unit).
Customer table normalized to first normal form

Second Normal Form

The second normal form states that each field in a multiple field primary key table must be directly related to the entire primary key. In other words, each non-key field should be a fact about all the fields in the primary key.
In the above table of customer, city is not linked to any primary field.
Normalized customer table.

 
City is now shifted to a different master table.

That takes our database to a second normal form.

Third Normal Form

A non-key field should not depend on another Non-key field. The field Total is dependent on Unit price and qty.
Fill third normal form
So now the Total field is removed and is the multiplication of Unit price * Qty.

Fourth Normal Form

Note: - Whenever the interviewer is trying to go above the third normal form, there can be two reasons, ego or to fail you. Three normal forms are really enough, practically anything more than that is an overdose.
In fourth normal form, it should not contain two or more independent multi-valued facts about an entity and it should satisfy “Third Normal form”.
So let us try to see what multi-valued facts are. If there are two or more many-to-many relationship in one entity and they tend to come to one place, it is termed as “multi-valued facts”.
Multi-valued facts
In the above table, you can see that there are two many-to-many relationships between Supplier / Product and “Supplier / Location (or in short multi-valued facts). In order for the above example to satisfy the fourth normal form, both the many-to-many relationships should go in different tables.
Normalized to Fourth Normal form.


Fifth Normal Form

Note: - UUUHHH if you get this question after joining the company, do ask him if he himself really uses it?
Fifth normal form deals with reconstructing information from smaller pieces of information. These smaller pieces of information can be maintained with less redundancy.
Example: Dealers sell Product which can be manufactured by various Companies. Dealers in order to sell the Product should be registered with the Company. So these three entities have a mutual relationship within them.
Not in Fifth Normal Form.
The above table shows some sample data. If you observe closely, a single record is created using lot of small information. For instance: JM Associate can sell sweets under the following two conditions:
  • JM Associate should be an authorized dealer of Cadbury
  • Sweets should be manufactured by Cadbury company
These two smaller bits of information form one record of the above given table. So in order for the above information to be “Fifth Normal Form” all the smaller information should be in three different places. Below is the complete fifth normal form of the database.
 Complete Fifth Normal Form

 

No comments:

Post a Comment