Database Design - 1st, 2nd and 3rd normal form

In a relational database you can design separate tables and columns to hold sets of related data.

This is vaguely similar to a spreadsheet with multiple tabs holding different data but with three key extra benefits:

  1. You can strictly define the type of data to be stored in each place (Numbers/dates/text)
  2. Dates and numbers can be restricted to sensible values - Date_of_Birth must be in the past, Zip_code must be 20 characters or less etc.
  3. You can define exactly how the tables are linked (related) to one another.

It has been shown; in a series of systematic field audits and lab experiments; that around 88% of all spreadsheets contain errors (Raymond R. Panko, University of Hawai’i, 1998, Revised 2008)
Moving data into a structured relational database can prevent many of these errors.

First Normal Form

The  First Normal Form states that all column values must be atomic (singular)
So the value for one column (in a record) must contain a single value, not a list of items or a coded value that has more than one meaning

For example an order_type of 'A' meaning Overseas+ Airmail, then order_type of 'B' meaning Overseas+ Ship by boat would fail to meet First Normal Form. Changing to separate columns will fix this and allow for the case of Airmail within the same country.

First Normal Form also prohibits the presence of repeating groups even if they are stored in composite columns.
This means that a table for sales Orders containing the columns:
Order_No, Staff_id, Item1_description, Item2Description, Item3Description
would fail to meet First Normal Form, even though the descriptions are in separate columns.

Changing the design so that every item gets it's own row in the table means that we can have any number of items from 1 upwards, the table columns then become:
Order_No, Item_no, Staff_id, Description
with a primary key of Order_No + Item_no (concatenated) this is a valid First Normal Form table.

Second Normal Form

The  Second Normal Form implies that a tableis in First Normal Form and every non-key column is fully dependent on the entire primary key. In plain language the table should contain data relating to one 'thing' and the primary key will uniquely identify each 'thing'.

In the example above the Staff_id field is not fully dependent on the entire primary key, assuming only one member of staff would enter all the items in each order, then if you know the Order_No you will know the Staff_id without having to know the Item_no

This typically means that some data items need to be split into a separate table, in this case an Order_items table which would link back to the main Orders table.

Orders table:
Order_No, Staff_id

Order_Items table:
Order_No, Item_no, Quantity, Description

Third Normal Form

To meet the requirements of Third Normal Form a table must meet the requirements of Second normal form plus all non-key columns are mutually independent.

Orders table:
Order_No, Staff_id

Order_Items table:
Order_No, Item_no, Description

 

 

“My favorite things in life don't cost any money. It's really clear that the most precious resource we all have is time” ~ Steve Jobs

Related:

Naming conventions - The standard 'Reddick VBA Naming convention' for Access.
Number Data Types - Access data Types.


 
Copyright © SS64.com 1999-2019
Some rights reserved