Normalisation
👍 Key Requirements:
All columns contain atomic values (no lists, sets, or composite fields).
Each row is unique (typically enforced by a primary key).
No repeating groups or arrays within a row.
Each column contains values of a single data type.
Suppose you have a table tracking customer purchases, where the “Purchased Products” column contains a comma-separated list of products:
101
John Doe
Laptop, Mouse
102
Jane Smith
Tablet
103
Alice Brown
Keyboard, Monitor, Pen
Why is this not in 1NF?
Non-atomic values: “Purchased Products” contains multiple items per cell.
Querying and updating are complex: Searching for customers who bought “Mouse” requires string parsing.
Data integrity risks: No way to enforce referential integrity between products and customers.
Inconsistent data entry: Different delimiters or typos can creep in.
Real-World Impact:
Reporting (e.g., “Who bought a Laptop?”) is error-prone.
Updates (e.g., renaming “Mouse” to “Wireless Mouse”) are tedious and unreliable.
Referential integrity cannot be enforced.
Real-World Issues:
Reporting challenges: Generating reports such as “How many customers bought a Laptop?” becomes complicated, as you cannot simply filter a column for “Laptop”, you must parse the string.
Update anomalies: If a product name changes (e.g., “Mouse” to “Wireless Mouse”), you must update every occurrence in every cell, increasing the risk of missing some entries.
Data integrity risks: There is no way to enforce referential integrity between products and customers, which can lead to orphaned or inconsistent data.
Transformation Steps to Achieve 1NF:
Identify columns with non-atomic values: In this case, “Purchased Products” contains multiple values.
Split the multi-valued column into separate rows: Each product purchased by a customer should be represented as a separate row, ensuring that every field contains only a single value.
Transformed Table in 1NF:
101
John Doe
Laptop
101
John Doe
Mouse
102
Jane Smith
Tablet
103
Alice Brown
Keyboard
103
Alice Brown
Monitor
103
Alice Brown
Pen
Definition: A table is in 2NF if it is in 1NF and every non-prime attribute (i.e., non-primary key attribute) is fully functionally dependent on the entire primary key. This addresses partial dependencies, where a non-key attribute depends only on part of a composite key.
Example Transformation to 2NF
1NF Table:
(student_id, course_id) | student_name | course_name
⚠️ Issue:
“course_name
” depends only on “course_id
”, not the full primary key (“course_id
”, “student_id
”). This is a partial dependency.
Normalization to 2NF:
table
students(student_id, student_name)
table
courses(course_id, course_name)
table
student_courses(student_id, course_id)
Definition: A table is in 3NF if it is in 2NF and all the attributes are functionally dependent only on the primary key, there are no transitive dependencies
201
101
Laptop
HP
202
101
Mouse
Logitech
203
102
Tablet
Apple
Supplier does not directly depend on primary key. Normalization to 3NF:
Move product and supplier information to separate tables.
table orders(order_id, customer_id, product_id)
table products(product_id, product_name, supplier_id)
table suppliers(supplier_id, supplier_name)
Last updated
Was this helpful?