Database normalization is a crucial process that aims to enhance database efficiency, consistency, and maintainability by reducing data redundancy and minimizing anomalies.
In this blog post, we will explore the different levels of database normalization – 1NF, 2NF, 3NF, and BCNF – with examples to help you better understand this essential technique.
So, let’s dive in! 😃
First Normal Form (1NF)
The first level of normalization, known as First Normal Form (1NF), eliminates duplicate data and enforces a consistent structure for each row in a table. To achieve 1NF, a table must follow these rules:
1.1. Atomic values: Each cell must contain a single, indivisible value.
1.2. Unique column names: Each column must have a distinct name.
1.3. Consistent data types: Each column must store data of the same type.
Example:
Before 1NF:
OrderID | Customer | ItemsOrdered |
1 | Alice | Apples, Bananas |
2 | Bob | Grapes, Oranges |
3 | Alice | Apples, Grapes |
After 1NF:
OrderID | Customer | Item |
1 | Alice | Apples |
1 | Alice | Bananas |
2 | Bob | Grapes |
2 | Bob | Oranges |
3 | Alice | Apples |
3 | Alice | Grapes |
As you can see, the table now conforms to 1NF, with atomic values in each cell and a consistent data structure.
Second Normal Form (2NF)
Second Normal Form (2NF) builds upon 1NF, ensuring that all non-prime attributes (attributes that aren’t part of the primary key) are fully dependent on the primary key.
To achieve 2NF, a table must:
2.1. Be in 1NF.
2.2. Have no partial dependency: Non-prime attributes must depend entirely on the primary key, rather than just a part of it.
Example:
Before 2NF (Already in 1NF):
OrderID | Customer | Item | CustomerAddress |
1 | Alice | Apples | 123 Apple St. |
1 | Alice | Bananas | 123 Apple St. |
2 | Bob | Grapes | 456 Orange Ave |
2 | Bob | Oranges | 456 Orange Ave |
3 | Alice | Apples | 123 Apple St |
3 | Alice | Grapes | 123 Apple St |
After 2NF:
Table 1 – Orders
OrderID | Customer | Item |
1 | Alice | Apples |
1 | Alice | Bananas |
2 | Bob | Grapes |
2 | Bob | Oranges |
3 | Alice | Apples |
3 | Alice | Grapes |
Table 2 – Customers
Customer | CustomerAddress |
Alice | 123 Apple St. |
Bob | 456 Orange Ave. |
By separating the tables, we’ve eliminated partial dependencies and achieved 2NF.
Third Normal Form (3NF)
Third Normal Form (3NF) goes a step further by removing any transitive dependencies. This means that non-prime attributes must not depend on other non-prime attributes through the primary key.
To achieve 3NF, a table must:
- Be in 2NF.
- Have no transitive dependencies: Non-prime attributes must not depend on other non-prime attributes indirectly through the primary key.
Example:
Before 3NF (Already in 2NF):
Table 1 – Orders
OrderID | CustomerID | ItemID |
1 | 1 | 1 |
1 | 1 | 2 |
2 | 2 | 3 |
2 | 2 | 4 |
3 | 1 | 1 |
3 | 1 | 3 |
Table 2 – Customers
CustomerID | CustomerName | CustomerAddress | Country |
1 | Alice | 123 Apple St. | USA |
2 | Bob | 456 Orange Ave. | USA |
After 3NF:
Table 1 – Orders
OrderID | CustomerID | ItemID |
1 | 1 | 1 |
1 | 1 | 2 |
2 | 2 | 3 |
2 | 2 | 4 |
3 | 1 | 1 |
3 | 1 | 3 |
Table 2 – Customers
CustomerID | CustomerName | AddressID |
1 | Alice | 1 |
2 | Bob | 2 |
Table 3 – Addresses
AddressID | CustomerAddress | Country |
1 | 123 Apple St. | USA |
2 | 456 Orange Ave. | USA |
By creating a separate table for addresses, we’ve eliminated transitive dependencies and achieved 3NF.
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF) is a stronger version of 3NF, ensuring that every determinant (a set of attributes that determines another attribute) is a candidate key (a minimal superkey, a set of attributes that uniquely identifies a tuple).
To achieve BCNF, a table must:
- Be in 3NF.
- Ensure that every determinant is a candidate key.
Example:
Before BCNF (Already in 3NF):
Table 1 Course Registrations
StudentID | CourseID | InstructorID |
1 | 101 | 1001 |
2 | 102 | 1002 |
3 | 101 | 1001 |
In this table, the combination of StudentID and CourseID uniquely identifies each record. However, InstructorID is also dependent on CourseID, and CourseID alone is not a candidate key. This violates BCNF.
After BCNF:
Table 1 – Registrations
StudentID | CourseID |
1 | 101 |
2 | 102 |
3 | 101 |
Table 2 – CourseInstructors
CourseID | InstructorID |
101 | 1001 |
1021 | 1002 |
By creating a separate table for course instructors, we’ve ensured that every determinant is a candidate key, and the tables now conform to BCNF.
Final Thoughts
Database normalization is a crucial process to optimize and maintain database efficiency.
By understanding and implementing 1NF, 2NF, 3NF, and BCNF, you can minimize data redundancy, reduce anomalies, and improve overall database performance.
Thank you for reading our blog, we hope you found the information provided helpful and informative. We invite you to follow and share this blog with your colleagues and friends if you found it useful.
Share your thoughts and ideas in the comments below. To get in touch with us, please send an email to dataspaceconsulting@gmail.com or contactus@dataspacein.com.
You can also visit our website – DataspaceAI