What follows is a comprehensive list of definitions for the Databases and Normalization section. It is in your best interest to simply learn these off by heart. Memorizing them means that you get free marks for any definition questions and you’ll have a step by step guide as to how to normalize a database. The definitions of 1nf+2nf+3nf are literally instructions on how to normalize a database.
Database:
“A database is a collection of information that is organized so that it can easily be accessed, managed, and updated.” (http://searchsqlserver.techtarget.com/definition/database)
SQL:
Structured Query Language
Relational Database Management System (RDBMS):
a type of database management system (DBMS) that stores data in the form of related tables. (http://www.webopedia.com/TERM/R/RDBMS.html)
Table:
A table is a collection of related data held in a structured format within a database. It consists of columns, and rows. (https://en.wikipedia.org/wiki/Table_(database))
Field:
A field is a column in a table in a database
Record:
A record is a row in a table in a database.
Normalisation:
Normalisation is a technique for designing relational database tables to minimize duplication (data redundancy) and in so doing reduce the risk of data anomalies (ie. deletion, update and insertion anomalies).
Super Keys:
any combination of fields that can uniquely identify a record
Candidate Keys:
minimal super key (basically whittling down super keys to find a decent primary key)
Primary Keys:
A primary key is a unique key which can uniquely identify each row in a table. A unique key can comprise of one or more columns. No two distinct rows in a table can have the same value (or combination of values) in those columns
Composite/Compound/Concatenated Key (we call them composite keys):
A primary key made using more than one field. A primary key is used to uniquely identify a record in a table. Often a single field will not by itself uniquely identify a row so two or more fields will need to be chosen.*
*If you have been asked to define a composite key, and have not previously been asked the definition of a primary key, then give the definition of a primary key as well just to be safe.
Foreign key:
A key used in one table to represent the value of a primary key in a related table. (usually a 1 to many relationship)
Repeating Groups:
Any situation where multiple pieces of information are stored in a single field, or span multiple columns.
Type 1:
multiple pieces of data stored in the same field (e.g. netball, cricket, hockey in one cell).
Type 2:
when you have multiple fields in the table that all store the same information ( e.g sportColumn1: hockey, sportColumn2: cricket).
Type 3:
when you have a single field that spans multiple rows or records (this is essentially type 1 but looks very different).
Data redundancy (OR redundant data):
When data is repeated unnecessarily. This is usually most evident in 1nf:
Difference between repeating groups and data redundancy:
Repeating groups are multiple entries stored in a single cell or in a single row. Data redundancy is information that is duplicated across multiple rows/records.
Anomalies (Update, insert, delete):
Update Anomaly:
You want to change(update) something in the database but you need to change it in multiple places because of data redundancy. If you don’t change it in all instances then your database will become inconsistent.
Insert Anomaly:
When you can’t add something to the database because you require more information than is present.
Delete Anomaly:
when removing something from the database inadvertently causes data to be lost from the database.
Dependencies:
Full dependencies are when the full primary (composite) key is required (all columns of the key) to determine another attribute.
Partial dependencies are when the key is composite and some but not all of the columns of the key determine another attribute. (This may still be more than one column.)
Transitive dependencies are when a non-key (i.e. not a primary key) attribute determines another attribute
Derived data: Using fields to calculate other fields - usually not stored.
Atomic data: only 1 piece of information should be stored in each field.
0nf:
0nf is a table that has repeating groups and no primary keys. For some reason this pops up in exams and I have no idea why, because it is hardly ever used. When you see this just assume that it is a table that is not yet in 1nf.
Rules for 0nf:
* it has repeating groups of fields
* Positional dependence of data
* non-atomic data
http://www.blueclaw-db.com/database_0th_normal_form.htm
Rules for 1nf, 2nf and 3nf:
First Normal Form (1nf):
no repeating groups
has a primary key (or composite key)
Second Normal Form (2nf):
satisfies 1nf
no partial dependencies
Third Normal Form (3nf):
satisfies 2nf
no transitive dependencies
Blob:
If you're like me, you picture a blob as a enigmatic green object that lacks a defined shape or size. In the computer world, however, blobs are a bit easier to define. The term "blob" actually stands for "Binary Large Object" and is used for storing information in databases.
A blob is a data type that can store binary data. This is different than most other data types used in databases, such as integers, floating point numbers, characters, and strings, which store letters and numbers. Since blobs can store binary data, they can be used to store images or other multimedia files. For example, a photo album could be stored in a database using a blob data type for the images, and a string data type for the captions.
Because blobs are used to store objects such as images, audio files, and video clips, they often require significantly more space than other data types. The amount of data a blob can store varies depending on the database type, but some databases allow blob sizes of several gigabytes. Now that is a big blob!