Thursday, February 16, 2017

Databases : Normalization vs De normalization (studytonight.com)

Normalization – a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics

Normalization forms
  • 1st Normal Form
  • 2nd Normal Form
  • 3rd Normal Form
  • BCNF (Boyce-Codd Normal Form)


Possible Database Anomalies
  • Update anomaly
Ex: Both Employee and EmployeeContact tables have address columns. When updating the contact information, one table is not updated so the same employee has 2 different address values when queried
  • Insertion anomaly
Ex: Course table contains fields CourseID, teacherID,
A teacher who is not yet assigned to a course is added to table with CourseID as null
  • Deletion anomaly
Ex: “Faculty and Courses” table has both lecturer and faculty. If “MEDICAL” faculty is closed, and if we delete all records belonging to “MEDICAL” faculty, all the lecturers in the same faculty will also get deleted (will cease to exist)


Normalization explained

1st Normal Form
No 2 rows of the table should contain repeating information

Each set of column must have a unique value (columns should not have comma-separated / list of values)
Before normalization
Adam
15
Biology, Maths
After normalization
Adam
15
Biology
Adam
15
Maths

Cons – Data redundancy increases
2nd normal form
For a table that has concatenated primary key, each column in the table that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second normal form.
Ex:
Name      Age        Subject
Mike        15           Maths
Mike        15           Science
Andrew   20           History

Above table has a concatenated primary key (i.e. to search a unique value the concatenated value has to be considered.
The field “Age” is not dependent on Name + Age + Subject. It depends only on the name. So Age should be separated out with a primary key

Above table will be decomposed into 2 tables
One has columns Name, Age & the other has columns Name, Subject where name will be the primary key of both tables

3rd normal form
There should not be the case that a non-prime attribute is determined by another non-prime attribute

Ex:
Student_id
Student_name
DOB
Street
city
State
Zip
Street, City & State depends on Zip. So the 3 fields should be moved to a different table as follows (decomposed into 2 tables)

Student_id
Student_name
DOB
Zip


Zip
city
State
Street


BCNF - <TO BE INCLUDED>



Normalization vs De-normalizations
Normalization

  • Writing operations (insert, update) are easy to perform (since no duplicate columns need to be updated)
  • Reading operations are difficult to perform (since data span across multiple tables so joint statements should be used)
De-normalization
  • Writing operations (insert, update) are difficult to perform since more than 1 columns need to be updated
  • Reading operations are easy to perform since data is contained in same table so no need to use joint statements

No comments:

Post a Comment