Database Normalization and Normal Forms

Database Normalization

What do you mean by Database Normalization?

Database normalization is a process used to organize a database into tables and columns to minimize redundancy and dependency. The main goal is to divide large tables into smaller, more manageable ones, reducing data duplication and ensuring data integrity. The process involves several stages, known as normal forms (NF), each with specific criteria. Here’s a breakdown:

First Normal Form (1NF) in Database Normalization

  • Atomicity: Each column must contain atomic (indivisible) values, meaning each column should contain only one value per row.
  • Uniqueness: Each column should have a unique name.
  • Order irrelevance: The order in which data is stored does not matter.

Example

Suppose we have a table with the following structure:

OrderIDCustomerNameProducts
1John DoeApple, Banana
2Jane SmithBanana, Orange

This table violates 1NF because the “Products” column contains multiple values. To normalize it to 1NF, we split it into separate rows:

OrderIDCustomerNameProducts
1John DoeApple
1John DoeBanana
2Jane SmithBanana
2Jane SmithOrange

Second Normal Form (2NF) in Database Normalization

  • 1NF compliance: The table must already be in 1NF.
  • Elimination of partial dependency: All non-key attributes must be fully functionally dependent on the primary key.

Example

Consider a table in 1NF:

OrderIdProductCustomerNameCustomerAddress
1AppleJohn Doe123 Elm St
2BananaJohn Doe123 Elm St

The “CustomerAddress” depends only on “CustomerName” (partial dependency), not the whole primary key (“OrderID” and “Product”). To normalize to 2NF, we create separate tables:

Orders Table:

OrderIDProduct
1Apple
1Banana

Customers Table:

CustomerNameCustomerAddress
John Doe123 Elm St

Third Normal Form (3NF) in Database Normalization

  • 2NF compliance: The table must already be in 2NF.
  • Elimination of transitive dependency: Non-key columns should not depend on other non-key columns.

Example

A table in 2NF might look like this:

OrderIDProductCustomerIDCustomerAddress
1Apple1001123 Elm St
1Banana1001123 Elm St

Here, “CustomerAddress” depends on “CustomerID”, not directly on the primary key. To normalize it to 3NF, we separate the dependent attributes into a new table:

Orders Table:

OrderIDProductCustomerId
1Apple1001
1Banana1001

Customers Table:

CustomerIDCustomerAddress
1001123 Elm St

Boyce-Codd Normal Form (BCNF) in Database Normalization

  • 3NF compliance: The table must already be in 3NF.
  • More stringent condition: Every determinant must be a candidate key.

Example

Consider a table:

CourseIDInstructorDepartment
C101Dr. SmithPhysics
C102Dr. JonesChemistry

If each department has only one instructor, and each course is taught by one instructor, we might face anomalies. To normalize to BCNF, we ensure that every determinant is a candidate key:

Courses Table:

CourseIDInstructor
C101Dr. Smith
C102Dr. Jones

Instructors Table:

InstructorDepartment
Dr. SmithPhysics
Dr. JonesChemistry

Fourth Normal Form (4NF) in Database Normalization

  • BCNF compliance: The table must already be in BCNF.
  • No multi-valued dependencies: A record should not have two or more independent multi-valued facts about an entity.

Example

Consider a table:

StudentIDCourseHobby
1001MathChess
1001ScienceChess
1001MathPainting

Here, “Course” and “Hobby” are independent multi-valued facts about “StudentID”. To normalize to 4NF, we separate them into different tables:

StudentCourses Table:

StudentIdCourse
1001Math
1001Science

StudentHobbies Table:

StudentIdHobby
1001Chess
1001Painting

Fifth Normal Form (5NF) in Database Normalization

  • 4NF compliance: The table must already be in 4NF.
  • No join dependency: Decomposing the table should not result in any loss of data when the tables are joined back together.

Summary

Database normalization involves organizing the columns and tables of a database to reduce redundancy and improve data integrity. The stages of normalization are designed to ensure that the data is stored efficiently and logically, reducing anomalies and improving the consistency of the database.

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *