Tuesday, June 2, 2015

SQL - Normalization: First Normal Form (1NF)


Normalization is a set of rules that are to be applied while designing the database tables which are to be connected with each other by relationships. Database normalization is a technical term to improve the data accuracy and efficiency and reduces the data redundancy and inconsistent data dependency.
Definition of Normalization
We organize the data into database tables by using normal forms rules or conditions. Normal forms help us to make a good database design. Generally in the real practice, we organize the data up to third normal form. We rarely use the fourth and fifth normal form. These are the Normal forms of the normalization –
4)     Boyce Code Normal Form (BCNF)
5)     Fourth Normal Form (4NF)
6)     Fifth Normal Form (5NF)

For the examples, we’ll use table Reservation Staff Information as shown below UNF as a starting point.  As we pointed out in the last post’s modification anomalies section, there are several issues to keeping the information in this form.  By normalizing the data you see we’ll eliminate duplicate data as well as modification anomalies.

Reservation Staff
Project Code
Project Name
Project Manager
Project Budget
Employee1
Employee2
Employee3
PC001
Reservation
Mike Towery
$125,000
Ryan Arjun
Will Smith
Bill Gray
PC002
Attendance
Chris Gray
$155,000
Tony Towery
Chao Milk
Red Bill
PC003
Human Resources
Kimmy Wang
$225,000
Lucy Gray
Chris Crow
Tom Ramsay

First Normal Form (1NF)
A database table is said to be in 1NF if it contains no repeating fields/columns. Once a table is in first normal form it is easier to search, filter, and sort the information. The process of converting the UNF table into 1NF is as follows:
1)  Separate the repeating fields into new database tables along with the key from unnormalized database table.
2)     The primary key of new database tables may be a composite key.
When a value is atomic, the values cannot be further subdivided.  For example, the value “Reservation” is atomic; whereas “Reservation; Attendance; Human Resources” is not. Related to this requirement is the concept that a table should not contain repeating groups of columns such as Employee1, Employee2 and Employee3.
This example table is transformed to first normal form by placing the repeating customer related columns into their own table.  This is shown below:
The repeating groups of columns now become separate rows in the Reservation Staff table linked by the Emp ID foreign key.  As we know that a foreign key is a value which matches back to another table’s primary key.
In our case, the Reservation Staff table contains the corresponding Emp Id for the Employee Master now.
Employee Master
Emp Id
Employee Name
EmployeeDept
EmployeeRate
E0001
Ryan Arjun
Database
240
E0002
Tony Towery
Testing
230
E0003
Lucy Gray
IT
250
E0004
Will Smith
Database
245
E0005
Chao Milk
Testing
225
E0006
Chris Gyal
IT
210
E0007
Bill Gray
Database
190
E0008
Red Bill
Testing
210
E0009
Tom Ramsay
IT
200
Primary Key for the above table is Emp Id 

Reservation Staff
Project Code
Emp Id
Project Name
Project Manager
Project Budget
PC001
E0001
Reservation
Mike Towery
$125,000
PC002
E0002
Attendance
Chris Gray
$155,000
PC003
E0003
Human Resources
Kimmy Wang
$225,000
PC001
E0004
Reservation
Mike Towery
$125,000
PC002
E0005
Attendance
Chris Gray
$155,000
PC003
E0006
Human Resources
Kimmy Wang
$225,000
PC001
E0007
Reservation
Mike Towery
$125,000
PC002
E0008
Attendance
Chris Gray
$155,000
PC003
E0009
Human Resources
Kimmy Wang
$225,000
Composite Key  ( Unique Key) = Project Code + Emp Id

This design is superior to our original table in several ways:
1. The original design limited each Reservation Staff entry to three employees.  In the new design, the number of employees associated to each design is practically unlimited.
2. It was nearly impossible to Sort the original data by Reservation Staff.  You could, if you used the UNION statement, but it would be cumbersome.  Now, it is simple to sort customers.
3. The same holds true for filtering on the Reservation Staff table. 
Modification anomalies remain in both tables, but these are fixed once we reorganize them as 2nd normal form.

2 comments:

Popular Posts