Normalization is a systematic approach of decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics like Insertion, Update, and Deletion Anomalies. It is a multi-step process that puts data into tabular form, removing duplicated data from the relation tables. Let's discuss how normalization is applied within the movie_production_companies database, focusing on the primary stages up to the Third Normal Form (3NF), which is often sufficient for most databases.
In the movie_production_companies database, 1NF is seen through in the tables in the following ways:
Each record is identified by a unique column (the primary key),
Values in each column of a table are of the same data type. See the Data Dictionary for Tables,
Each table cell is indivisible and contains a single value,
Lastly, a separate table for each group of related data is created; this is seen how, in the company table, information like the kind of organization a company is, the registration body, the city a company is located in and of course the country are broken down and stored in a table of their own and then linked to the company table via a foreign key. Imagine this was not the case and, say a country name is inserted directly into the company table, picture five companies are located in the same country say Macedonia at the time but this country eventually gets renamed to North Macedonia, this update is made in the database to only 3 out of the 5 records of movie companies located here, the inconsistency and shortage of database integrity here calls for concern. I, thereby, saw it as a diligent practice of the 1NF to separate the city, country, registration_body, kind_of_organization tables from company table.
This is the whole point of Atomicity, ensuring that either ALL the changes made are committed to the database, or NONE of them are.
In 2NF, a table should have NO partial dependencies. In other words, all non-prime attributes (attributes not part of any candidate key) must be fully functionally dependent on the entire primary key.
Application in the database
In the crew_info table, fields like scene_bonus, hourly_rate, daily_bonus, etc, are non-key attributes and must be fully functionally dependent on the Composite Key - employee_id, role_id & movie_code as a whole because any partial dependencies on only either of the composite key can lead to anomalies, and that's what aimed to address with 2NF.
In 3NF, no non-key fields depend on any fields that are not the primary key.
Application in the Database
For example, it was noted in the Application Scenario to properly document a description for each employee that has one or several phone numbers. This description field (a non-key attribute) does not depend on the employee id (the primary key) but on the phone number (a fellow non-key attribute). To solve this, the description field was removed, and an additional table was created with the field it depends on, phone_number and then linked back to the employee table via the employee_id.
In this database, the process of normalization helps in minimizing duplication, ensuring data integrity, and simplifying the data model making it easier to maintain. The adherence to normalization principles in the movie_production_companies database ensures efficient storage, quick data retrieval, and clear relationships between entities, which facilitates both simple and complex queries while maintaining the integrity and consistency of the data.