Normalization in DBMS
Normalization in databases is a process used to organize data to reduce redundancy and improve data integrity. The main goal is to divide a database into two or more tables and define relationships between them, ensuring that data is stored logically and consistently.
Normalization typically involves several stages, known as normal forms, each with specific rules to follow. Let’s discuss the primary normal forms along with examples:
1. First Normal Form (1NF)
A table is in 1NF if:
- All columns contain atomic (indivisible) values.
- Each column contains values of a single type.
- Each column has a unique name.
- The order in which data is stored does not matter.
Example: Before 1NF:
StudentID | Name | Courses |
---|---|---|
1 | Alice | Math, English |
2 | Bob | Science, History |
After 1NF:
StudentID | Name | Course |
---|---|---|
1 | Alice | Math |
1 | Alice | English |
2 | Bob | Science |
2 | Bob | History |
2. Second Normal Form (2NF)
A table is in 2NF if:
- It is in 1NF.
- All non-key attributes are fully dependent on the primary key (i.e., no partial dependency).
Example: Before 2NF:
StudentID | CourseID | Grade |
---|---|---|
1 | 101 | A |
1 | 102 | B |
In this example, we assume that StudentID
and CourseID
together form the primary key.
After 2NF:
StudentID | Name |
---|---|
1 | Alice |
2 | Bob |
CourseID | Course |
---|---|
101 | Math |
102 | English |
StudentID | CourseID | Grade |
---|---|---|
1 | 101 | A |
1 | 102 | B |
3. Third Normal Form (3NF)
A table is in 3NF if:
- It is in 2NF.
- All the attributes are functionally dependent only on the primary key (i.e., no transitive dependency).
Example: Before 3NF:
StudentID | CourseID | Grade | Instructor |
---|---|---|---|
1 | 101 | A | Dr. Smith |
1 | 102 | B | Dr. Johnson |
After 3NF:
StudentID | Name |
---|---|
1 | Alice |
2 | Bob |
CourseID | Course |
---|---|
101 | Math |
102 | English |
InstructorID | Instructor |
---|---|
1 | Dr. Smith |
2 | Dr. Johnson |
StudentID | CourseID | Grade |
---|---|---|
1 | 101 | A |
1 | 102 | B |
CourseID | InstructorID |
---|---|
101 | 1 |
102 | 2 |
4. Boyce-Codd Normal Form (BCNF)
A table is in BCNF if:
- It is in 3NF.
- For every functional dependency (X → Y), X should be a super key.
Example: Before BCNF:
StudentID | CourseID | InstructorID | InstructorName |
---|---|---|---|
1 | 101 | 1 | Dr. Smith |
1 | 102 | 2 | Dr. Johnson |
If an InstructorID is uniquely assigned to an InstructorName, then the above table would not satisfy BCNF as InstructorID should be a candidate key.
After BCNF:
StudentID | Name |
---|---|
1 | Alice |
2 | Bob |
CourseID | Course |
---|---|
101 | Math |
102 | English |
InstructorID | Instructor |
---|---|
1 | Dr. Smith |
2 | Dr. Johnson |
StudentID | CourseID | Grade |
---|---|---|
1 | 101 | A |
1 | 102 | B |
CourseID | InstructorID |
---|---|
101 | 1 |
102 | 2 |
5. Fourth Normal Form (4NF)
A table is in 4NF if:
- It is in BCNF.
- It has no multi-valued dependencies.
Example: Consider a situation where a student can have multiple projects and multiple advisors independently.
Before 4NF:
StudentID | Project | Advisor |
---|---|---|
1 | Math | Dr. Smith |
1 | Science | Dr. Jones |
1 | Math | Dr. Brown |
After 4NF:
StudentID | Project |
---|---|
1 | Math |
1 | Science |
StudentID | Advisor |
---|---|
1 | Dr. Smith |
1 | Dr. Jones |
1 | Dr. Brown |
These normal forms provide a framework for organizing data and ensure that the database is free from unnecessary redundancies and inconsistencies. As the normal forms progress from 1NF to 4NF (and beyond), they impose more rigorous requirements.
References
- Description of the database normalization basics
- Database normalization – Wikipedia
- Overview of Data, Database, DBMS, and RDBMS