Data Normalization

Data normalization is the standard way to store data in an OLTP system, where operations run as ACID-compliant transactions. Normalization results in high consistency and low data redundancy.
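
As a minimal sketch of what an ACID-compliant transaction against such a system can look like, the snippet below uses Python's built-in sqlite3 module with a hypothetical accounts table; the table, IDs, and amounts are illustrative assumptions, not part of any particular system.

```python
import sqlite3

# An in-memory database stands in for a hypothetical OLTP store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

try:
    # Both updates commit together or not at all (atomicity + consistency).
    with conn:  # the sqlite3 module wraps this block in a transaction
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 1")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 2")
except sqlite3.Error:
    pass  # on any failure the whole transaction is rolled back automatically

print(conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall())
# [(1, 70), (2, 80)]
```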

3NF, the third normal form, is the form most widely used in practice.

  • Normalized data is leveraged by OLTP (online transaction processing) systems.
  • De-normalized data is leveraged by OLAP (online analytical processing) systems.
  • OLTP: transactional workloads, normalized data, write-heavy systems.
  • OLAP: analytical workloads, de-normalized data, read-heavy systems.
Data normalization is the process of structuring an RDBMS schema through the first, second, third, fourth, fifth, and sixth normal forms.

Data normalization reduces redundancy and enhances data integrity and consistency. The normalization process ensures tables are structured so that dependencies are properly enforced, making database operations faster, more reliable, and better organized. OLTP systems rely on database joins to recombine normalized tables; joins can be complex, and in complex database operations it is often better to avoid or minimize them because they can consume a lot of CPU power.
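
To make the join trade-off concrete, here is a small, hedged sketch using Python's sqlite3 module and hypothetical customers and orders tables: customer details are stored once, and a join reassembles them with the orders at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Customer details are stored once (normalized) and referenced by orders.
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 75.5), (12, 2, 40.0);
""")

# The join reassembles the full picture at read time; this is the CPU cost
# that join-heavy queries pay in exchange for non-redundant storage.
rows = conn.execute("""
    SELECT c.name, c.city, o.order_id, o.amount
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
""").fetchall()
print(rows)
```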

These are the normal forms that have been standardized.

  • UNF (un-normalized form)
  • 1NF (first normal form)
  • 2NF (second normal form)
  • 3NF (third normal form)
  • 4NF (fourth normal form)
  • 5NF (fifth normal form)
  • 6NF (sixth normal form)

Normalization-optimized data storage ensures fast, efficient writes; a de-normalized database will have less optimized write operations.

Every normal form has specific rules and constraints through which it operates; 3NF is the most widely implemented.

For each normal form there is a set of constraints that must be applied to the database tables, along with defined steps that must be followed, before the schema qualifies as being in that form.
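
As one illustrative, assumption-heavy sketch of that progression, the schema below takes a hypothetical ordering domain through 1NF, 2NF, and 3NF; the table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- UNF: a single row held repeating groups, e.g. 'item1;item2' in one column.
    -- 1NF: atomic values only -> one row per order line.
    -- 2NF: no partial dependency on the (order_id, product_id) key ->
    --      product details move out of the order-line table.
    -- 3NF: no transitive dependency -> the supplier's city depends on the
    --      supplier, not the product, so it lives in its own table.

    CREATE TABLE suppliers (
        supplier_id INTEGER PRIMARY KEY,
        city        TEXT NOT NULL
    );
    CREATE TABLE products (
        product_id  INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        supplier_id INTEGER NOT NULL REFERENCES suppliers(supplier_id)
    );
    CREATE TABLE order_lines (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
""")
print("3NF schema created:", [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")])
```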

Several normal forms have been added over time: 3NF was introduced in 1971, 6NF in 2003, and ETNF (essential tuple normal form) more recently.

Normalization is a storage schema in which the data has these properties:

  • Redundancy is reduced (data is stored non-redundantly).
  • A high level of data consistency is ensured.
  • Normalized schemas are integrated with OLTP systems to handle transactional workloads.
  • Query execution for writes is faster (see the sketch after this list).
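
A brief sketch of the faster-write claim, again with hypothetical tables: because the customer's city is stored exactly once, an update touches a single row no matter how many orders reference that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER NOT NULL REFERENCES customers(customer_id));
    INSERT INTO customers VALUES (1, 'Pune');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 1);
""")

# Normalized write: the city is stored once, so one UPDATE is enough.
cur = conn.execute("UPDATE customers SET city = 'Mumbai' WHERE customer_id = 1")
print("rows touched:", cur.rowcount)  # 1, regardless of how many orders exist
```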

Pros of normalization

Reduced redundancy: Normalization minimizes data duplication by storing information only once. This reduces storage requirements and improves efficiency.

Improved data integrity: Normalization eliminates insertion, update, and deletion anomalies, ensuring that the database remains accurate and consistent.

Enhanced consistency: Normalization enforces consistency in data representation across tables, leading to a more coherent and standardized database structure.
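
One way to see the integrity benefit in practice is through constraint enforcement. The sketch below (hypothetical tables, using SQLite's foreign-key support) shows an insertion anomaly, an order pointing at a non-existent customer, being rejected outright.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    INSERT INTO customers VALUES (1, 'Asha');
""")

try:
    # An order for a customer that does not exist: the constraint blocks it,
    # keeping the database consistent.
    conn.execute("INSERT INTO orders VALUES (99, 42)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```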

Nature of De-normalization:

De-normalization is a schema in which data can be queried easily, as in an OLAP cube. De-normalized schemas are integrated with OLAP systems to handle analytical processing, so query execution for reads is faster than for writes.
Denormalized data combines information from multiple tables into a single table to speed up data retrieval in read-heavy scenarios. In other words, denormalization accepts some redundancy and weaker data integrity in exchange for quicker query execution.
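
For contrast, here is a hedged sketch of the kind of wide, denormalized table an OLAP-style workload favours: customer attributes are copied onto every order row, so an analytic read is a single-table scan with no join. The schema and values are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One wide table: customer fields are repeated on every order row
    -- (redundant), but a read needs no join.
    CREATE TABLE order_facts (
        order_id      INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        customer_city TEXT NOT NULL,
        amount        REAL NOT NULL
    );
    INSERT INTO order_facts VALUES
        (10, 'Asha', 'Pune', 250.0),
        (11, 'Asha', 'Pune', 75.5),
        (12, 'Ravi', 'Delhi', 40.0);
""")

# Analytic read: a single-table scan and aggregation, no join required.
print(conn.execute(
    "SELECT customer_city, SUM(amount) FROM order_facts GROUP BY customer_city"
).fetchall())
```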

The article above is rendered by integrating outputs of 1 HUMAN AGENT & 3 AI AGENTS, an amalgamation of HGI and AI to serve technology education globally.

(Article By : Himanshu N)