Not all databases are good! A bad schema has a lot of redundancy
We can normalize a database by decomposing it into smaller parts

The tuple determines if uniquely defines a value of
This is called a functional dependency, and arises from the application of a database

There are some axioms to determine functional dependencies

  • Reflexivity: determines any subset of
  • Augmentation: If then
  • Transitivity: If and then
  • Union: If and then
  • Decomposition: If then and

Attribute closure is the set of all attributes that can be logically derived by an attribute
To calculate,

  1. Start with an empty set
  2. Use reflexivity on the LHS attributes
  3. Start from the first FD given if attributes on LHS are in closure, add attributes on RHS to closure
  4. Repeat

If the set of attributes determines all other attributes then it is a superkey and is a candidate key if none of its subsets are superkeys.

FD closure is the set of all functional dependencies that can be logically derived by a given set of FDs
The minimal cover of is the smallest subset such that where denotes the FD closure of
To find the minimal cover,

  1. Put all the FDs in standard form, with one attribute on the RHS
  2. Check if equivalence is preserved after removing an attribute
  3. Delete redundant FDs

A decomposition breaks into and

A decomposition is lossless-join if it can be recovered back through joins
To test if a decomposition and is lossless-join,

  1. Compute
  2. Compute
  3. If OR then it’s a lossless-join

A decomposition is dependency preserving if the functional dependencies are preserved

  1. Split & into and
  2. Compute &
  3. Check if

The reason these are NOT the same is because functional dependencies are determined from the application level, not from the data. Even though a lossless-join decomposition can be joined together to retrieve the same data, the individual tables will lack the original functional dependencies

Normalization

  1. 0NF is not normalized
  2. 1NF and 2NF are simple restrictions that any well-structured DBMS meets
  3. 3NF and BCNF are stricter restrictions that any good DBMS should have
  4. 4NF and 5NF are out of scope

In BCNF, all FDs are either trivial or LHS is a super-key
BCNF decomposition is always lossless but may not be dependency preserving

If you need to keep your dependencies, you probably have to add redundancy.
In 3NF, all FDs are either trivial, LHS is a super-key, or RHS is part of a candidate key
3NF is lossless and dependency preserving

To decompose to 3NF while preserving dependencies,

  1. Apply BCNF decomposition until in 3NF
  2. Compute minimal cover of
  3. For all non-preserved FD in add a new relation

2NF does this too but also guarantees that no non-key attribute can be derived from a partial key (part of a key)

In 1NF, every field must be atomic or single-valued