Normalization – a systematic approach of decomposing tables to eliminate
data redundancy and undesirable characteristics
Normalization forms
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form
- BCNF (Boyce-Codd Normal Form)
Possible
Database Anomalies
- Update anomaly
- Insertion anomaly
Ex: Course table contains fields CourseID, teacherID,
A
teacher who is not yet assigned to a course is added to table with CourseID as
null- Deletion anomaly
Ex: “Faculty and Courses”
table has both lecturer and faculty. If “MEDICAL” faculty is closed, and if we
delete all records belonging to “MEDICAL” faculty, all the lecturers in the
same faculty will also get deleted (will cease to exist)
Normalization explained
Normalization explained
1st
Normal Form
No
2 rows of the table should contain repeating information
Each set of
column must have a unique value (columns should not have comma-separated
/ list of values)
Before
normalization
Adam
|
15
|
Biology, Maths
|
After
normalization
Adam
|
15
|
Biology
|
Adam
|
15
|
Maths
|
Cons – Data redundancy
increases
2nd
normal form
For a table that
has concatenated primary key, each column in the table that is not part of the
primary key must depend upon the entire concatenated key for its existence.
If any column depends only on one part of the concatenated key, then the table
fails Second
normal form.
Ex:
Name Age Subject
Mike 15 Maths
Mike 15 Science
Andrew 20 History
Above
table has a concatenated primary key (i.e. to search a unique value the
concatenated value has to be considered.
The
field “Age” is not dependent on Name + Age + Subject. It depends only on the
name. So Age should be separated out with a primary key
Above
table will be decomposed into 2 tables
One
has columns Name, Age & the other has columns Name, Subject where name will
be the primary key of both tables
3rd
normal form
There
should not be the case that a non-prime attribute is determined by another
non-prime attribute
Ex:
Student_id
|
Student_name
|
DOB
|
Street
|
city
|
State
|
Zip
|
Street,
City & State depends on Zip. So the 3 fields should be moved to a different
table as follows (decomposed into 2 tables)
Student_id
|
Student_name
|
DOB
|
Zip
|
Zip
|
city
|
State
|
Street
|
BCNF - <TO BE INCLUDED>
Normalization vs De-normalizations
Normalization
Normalization vs De-normalizations
Normalization
- Writing operations (insert, update) are easy to perform (since no duplicate columns need to be updated)
- Reading operations are difficult to perform (since data span across multiple tables so joint statements should be used)
De-normalization
- Writing operations (insert, update) are difficult to perform since more than 1 columns need to be updated
- Reading operations are easy to perform since data is contained in same table so no need to use joint statements
No comments:
Post a Comment