Lec-20: Introduction to Normalization | Insertion, Deletion & Updation Anomaly
Normalization in DBMS
Introduction to Normalization
- The speaker introduces normalization as a crucial concept in Database Management Systems (DBMS), emphasizing its role in reducing redundancy within tables.
- Normalization is defined as a method to eliminate or minimize data duplicacy, which can occur at both row and column levels.
Row Level Duplicacy
- An example of row level duplicacy is provided using a student table containing SID, student name, and age. The first and third rows are identical, illustrating this type of duplicacy.
- To address row level duplicacy, the speaker suggests implementing a primary key. In this case, the SID is designated as the primary key to ensure uniqueness and prevent null values.
Column Level Duplicacy
- The discussion shifts to column level duplicacy with another table that includes various attributes like Course ID and faculty details.
- Although no two rows are identical due to the primary key constraint, columns may still contain duplicate values across different rows.
Types of Anomalies Caused by Duplicacy
Insertion Anomaly
- The speaker explains three types of anomalies: insertion anomaly, deletion anomaly, and updation anomaly.
- An insertion anomaly occurs when adding new data (e.g., a new course or faculty member) becomes problematic because it requires an existing primary key value that may not yet exist.
Deletion Anomaly
- A deletion anomaly arises when removing data from the database inadvertently deletes essential information about other entities (e.g., deleting a student record also removes associated course information).
- This issue highlights how deleting one piece of data can lead to loss of critical context regarding related records.
Conclusion on Normalization's Importance
Understanding Database Anomalies and Normalization
Introduction to Database Operations
- The speaker discusses the removal of student details, leading to loss of course and faculty information, highlighting the issue of data redundancy.
- Three primary operations in databases are identified: insertion, deletion, and updation. These operations are fundamental in database design.
Updation Anomaly Explained
- A query example is provided where a student's name is updated from "Amrit" to "Amritpal," demonstrating a straightforward update process.
- The speaker illustrates another query for changing a faculty member's salary from 30,000 to 40,000, emphasizing potential issues with repeated updates due to data duplication.
Consequences of Updation Anomaly
- The speaker notes that if the faculty ID (F1) appears multiple times in the database, updating the salary will occur as many times as it appears rather than just once.
- This leads to an important realization: despite having only one faculty member with ID F1, the salary change could be incorrectly applied multiple times due to column-level duplicacy.
Introduction to Normalization
- The concept of normalization is introduced as a method for removing redundancy within databases through specific rules.
- A proposed solution involves dividing a single table into multiple tables based on attributes like Student ID (SID), Course ID (CID), and Faculty ID (FID).
Benefits of Normalization
- By restructuring tables into separate entities for students, courses, and faculty members, future data entries become more efficient without redundancy issues.