Normalization
- Normalization is the process of organizing the data in the database.
- Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate the undesirable characteristics like Insertion, Update and Deletion Anomalies.
- Normalization divides the larger table into the smaller table and links them using relationship.
- The normal form is used to reduce redundancy from the database table.
Why Data Normalization?
-
Increased consistency. Information is stored in one place and one place only, reducing the possibility of inconsistent data.
-
Easier object-to-data mapping. Highly-normalized data schemas in general are closer conceptually to object-oriented schemas because the object-oriented goals of promoting high cohesion and loose coupling between classes results in similar solutions (at least from a data point of view).
You typically want to have highly normalized operational data stores (ODSs) and data warehouses (DWs).
The primary disadvantage of normalization is slower reporting performance. You will want to have a denormalized schema to support reporting, particularly in data marts.
Anamolies of DB
Example: Suppose a manufacturing company stores the employee details in a table named employee that has four attributes: emp_id for storing employee's id, emp_name for storing employee's name, emp_address for storing employee's address and emp_dept for storing the department details in which the employee works. At some point of time the table looks like this
emp_id | emp_name | emp_address | emp_dept |
---|---|---|---|
101 | Rick | Delhi | D001 |
101 | Rick | Delhi | D002 |
123 | Maggie | Agra | D890 |
166 | Glenn | Chennai | D900 |
166 | Glenn | Chennai | D004 |
The above table is not normalized. We will see the problems that we face when a table is not normalized.