1. What is Normalization in a database application

    1. Normalization in a database is a process of organizing data to reduce redundancy and improve data integrity. It divides large tables into smaller ones and links them using relationships. The main objective is to make the database more efficient by eliminating data anomalies like duplication or inconsistencies.
    2. There are several "normal forms" (NF), which are steps that help achieve normalization. Here's a simplified explanation:
    3. 1NF (First Normal Form)
      1. Goal: Ensure that the table has atomic values (no multi-valued attributes) and all rows are unique.

      2. Rules:

        1. Each column should contain only atomic values (a single value, no lists or sets).
        2. There should be no repeating groups (no multiple columns for the same type of data).
        3. All entries in a column should be of the same data type.
        4. Each record (row) must be unique (no duplicate rows).
      3. Example (Non-1NF table):

        OrderID CustomerName PhoneNumbers ProductName
        1 John 123-4567, 234-5678 TV, Mobile
        2 Mary 345-6789 Laptop
      4. Problems:

        1. The PhoneNumbers column has multiple values.
        2. The ProductName column has multiple values.
        3. To achieve 1NF:
      5. PhoneNumbers and ProductName should have only one value per row.

        OrderID CustomerName PhoneNumber ProductName
        1 John 123-4567 TV
        1 John 234-5678 Mobile
        2 Mary 345-6789 Laptop
      6. Now, each column contains atomic values, and there are no repeating groups.

    4. 2NF (Second Normal Form)
      1. Goal: Ensure that the table is in 1NF, and all non-key columns are fully dependent on the entire primary key (not partially dependent).

      2. Rules:

        1. The table must be in 1NF.
        2. Every non-key attribute must be fully dependent on the whole primary key (if the primary key is composite, every non-key column must depend on the combination of all parts of the key).
      3. Example (Non-2NF table):

        OrderID ProductID ProductName CustomerName
        1 101 TV John
        2 102 Mobile John
        3 103 Laptop Mary
        1. The table is in 1NF, but ProductName depends only on ProductID, not on OrderID.
        2. CustomerName depends only on OrderID, not on ProductID.
      4. To achieve 2NF:

        1. Split the table into smaller tables to remove partial dependencies.
      5. Products Table:

        ProductID ProductName
        101 TV
        102 Mobile
        103 Laptop
      6. Orders Table:

        OrderID ProductID CustomerName
        1 101 John
        2 102 John
        3 103 Mary
      7. Now, every non-key attribute is fully dependent on the entire primary key.

    5. 3NF (Third Normal Form)
      1. Goal: Eliminate transitive dependencies (non-key columns should not depend on other non-key columns).

      2. Rules:

        1. The table must be in 2NF.
        2. There should be no transitive dependencies (no non-key attribute should depend on another non-key attribute).
      3. Example (Non-3NF table):

        OrderID CustomerID CustomerName CustomerAddress
        1 201 John 123 Main St
        2 202 Mary 456 Oak St
        1. Here, CustomerAddress depends on CustomerName, and CustomerName depends on CustomerID.
        2. This is a transitive dependency, as CustomerAddress depends indirectly on CustomerID via CustomerName.
        3. However, if you assume that CustomerAddress is determined by CustomerName (which is not typically the case), you could mistakenly think there's a transitive dependency. Instead, CustomerAddress should directly depend on CustomerID.
      4. To achieve 3NF:

        1. Split the table to remove the transitive dependency.
      5. Customers Table:

        CustomerID CustomerName CustomerAddress
        201 John 123 Main St
        202 Mary 456 Oak St
      6. Orders Table:

        OrderID CustomerID
        1 201
        2 202
      7. Now, there are no transitive dependencies:

        1. Each non-key attribute in both tables depends only on the primary key:
          1. In the Customers table, CustomerName and CustomerAddress depend on CustomerID.
          2. In the Orders table, there's no non-key attribute that depends on another non-key attribute.
    6. BCNF (Boyce-Codd Normal Form)
    7. 4NF (Fourth Normal Form)
    8. 5NF (Fifth Normal Form)

  2. Summary