Explain what is meant by data consistency, data redundancy and data independence.
Explain what is meant by relational database organisation and data normalisation (first, second and third normal forms).
Restructure data into third normal form.
Explain and apply entity relationship modelling and use it to analyse problems.
Explain how the data can be manipulated to provide the user with useful information.
Data validation and Verification - Explain and apply appropriate techniques for data validation and verification of data in databases.
Searching data - Explain the purpose of query languages.
Searching data - Construct and run queries using Structured Query Language (SQL).
Database management Systems - Explain the purpose of a database management system (DBMS) and data dictionaries.
Big Data - Explain what is meant by Big Data, predictive analytics, data warehousing and data mining
Distributed systems - Explain that distribution can apply to both data and processing
Distributed systems - Describe distributed databases and the advantages of such distribution.
Benefits of a relational database
Avoids data duplication (storing of the same data more than once)
Avoids inconsistent records (copies of the same data stored more than once, but some copies have been updated, whilst others have not, leading to inconsistency between the data stored)
Data can be readily updated and edited
Data can be added and deleted as required. If some data is deleted, the unwanted side effect may be that other data is lost. A relational database does not allow data to be deleted, if it is related to data in other tables.
Can support complex queries such as SQL
Enables different access levels to different data. Enables a DBMS to be used
An un-normalised database is where a table contains repeating groups/entities i.e. it is not in 1NF.
Data Normalisation is the mathematical way of structuring data according to theoretical rules
Benefits of Data normalisation are:-
Reduces data duplication/redundancy
Avoids danger of inconsistency/maintain integrity
Avoids danger of data being lost during updates
Avoids wasting processing time
Probably enables easier maintenance of the database
Allows different view of data
The stages of normalisation are:
Unnormalised-1st - Remove repeating groups of data
1st-2nd Normal form - Any partial dependencies must be removed (if data depends upon only part of the primary key)
2nd - 3rd Normal form - Any transitive dependencies need to be removed to convert from 2NF to 3NF. It needs to be ensured that each attribute / field depends only on the primary key i.e. if a field depends upon another field which is not the primary key, then it should be in a separate table.
3NF - All data items depend on nothing but the primary key
A primary key is a field which uniquely identifies a record in a database.
A foreign key in a database is a field in a table which links to or establishes a relationship with another table. It enables the data in different tables to be linked together.
Database management Systems - Explain the purpose of a database management system (DBMS) and data dictionaries
A database management system stores data accessed by multiplies types of user.
Different users may have different levels of access to different data sets.
With different users having different access rights (e.g. read only, read/write).
Security is improved as access is password protected
IT staff may be asked to carry out the following tasks on a DBMS:
Setting up tables, queries and reports for different classes of user
Database maintenance and performance management
Allcoating user names, managing password and access rights
Making regular backups of the data and restoring data in the event of loss
Monitoring the network via acess logs.
Maintaining security through installing virus checking and firewall
Providing training facilities for users and access to a help desk
Data mining is the analysis of a large amount of data in a data warehouse to provide new information or to find new patterns in the existing data.
A supermarket could use the intelligence derived from data mining on data extracted from loyalty card data to increase its profits by attracting customers to make additional purchases via targeted special offers, etc and to reward customers for previous purchases
Big Data refers to data sets so large and complex that it becomes difficult to process using standard relational database techniques.
Predictive analytics consist of a variety of statistical techniques including modelling, machine learning and data mining. For example In business, predictive models analyses patterns found in historical and transactional data to identify patterns that may present risks and opportunities.
As a databases often contain huge amounts of data it is often more efficient to store data on a number of different computers (probably in different locations) in order to maximise performance.
Advantages are:-
Resilient. A problem in one site will not stop other sites from working. In the case of stabase failure, the total system of centralised databases comes to a halt. However, in distributed systems, when a computer fails, the functioning of the system continues.
Security - Staff can be limited to access of only their portion of the database. A single site will still work even if the connection between sites is temporarily broken
Scaling - If demand increase then it is straight forward to add an extra node to the distributed database (A large company, the demand will likely increase/decrease when a market changes and the system, can be scaled quickly). The work simply required adding new computers and local data for the new site and then connecting them to the distributed system, with no interruption in the current functions.
High performance - Queries and updates are largely local so that there is no network bottleneck (The queries/updates are likely to be local to the sites, and there will be no local bottlenecking)
Lower Communication Cost - In distributed database systems, if data is located locally where it is mostly used, then the communication costs for data manipulation can be minimised,
Issues are that it is difficult to ensure that all the data in all the computers is always up-to-date i.e. difficult to maintain integrity as the data is stored on a number of different computers.
These are all the combinations of commands that you may need to use to write an SQL command, on your Unit 4 paper.
The following commands are the fundamental functions you may need to use:
CREATE TABLE (tablename) ...
INSERT INTO (tablename)VALUES ...
SELECT (fieldname(s))FROM (tablename(s) WHERE ...
SELECT * FROM (tablename)WHERE ...
UPDATE (tablename) SET (fieldname)= ... WHERE ....
Please take note of the following:
the use of capital letters in the SQL commands - they are important! You also need to write your table names in CAPITALS too.
for the SELECT, INSERT and UPDATE commands, take note of how apostrophes (or speech marks - you can use either) are used around any data (either numeric or not)
for the INSERT command, all data to be entered is separated by commas
for the CREATE command, notice how a string is defined (char(7)) and how an integer is defined (int) - no number required for int data type. Note you can also declare a data type as numeric(m,n). As an example, numeric(4,2) would be used for a number with the format 2412.34.
for the CREATE command note how you define a PRIMARY KEY, and put NOT NULL against your PRIMARY KEY.
you may have to join to tables together using their primary and foreign keys, in a SELECT statement
write the fieldnames and table names exactly as they are given in the question
The questions below give you examples of all types of SQL commands you may need.
Mark Scheme:
Mark Scheme: