Learning Outcomes
Students should be able to:
• demonstrate understanding of and explain basic database concepts such as table, record, field, key field, query, form, report, macro, relationship and importing data;
• identify and use appropriate data types when creating a database structure;
• demonstrate understanding of the need for data validation; and
• describe the following types of validation checks: presence, length, type, format and range.
A database is a (large) collection of data items and links between them, structured in a way that allows it to be accessed by a number of different application programs. The term is also used loosely to describe any collection of data. In the past these may have been paper based but nowadays most will be computerised.
Databases are very widely used and many of us will have interacted with one – without necessarily being aware of it. If you have entered your personal details into a social networking site, they have almost certainly been stored in a database; if you have browsed an online retail site, product details will almost certainly have been retrieved from a database.
Here are some other contexts in which a database might be used:
• Your school might have a database that contains data about pupils, attendance, subjects and assessment marks;
• An employer might have a database that contains data about employees, job titles and salaries;
• The organisers of the 2016 Rio Olympic Games might have a database that contains data about athletes, events and medals.
The most common way to structure a database is as a collection of interrelated tables. This kind of database is known as a relational database.
A table is the name for each group of similar data with rows for each instance of an entity and columns for each attribute. We use a different table for each object in our database and then we link them together.
An entity is is simply an object that we hold data about and an attribute is a piece of information about that entity.
For example in a database about cars, the enity is car, the attributes would be the data we hold about the car, for example brand, model, colour, number of doors, miles on clock. An instance of an object would be the details for one particular car, for example Vauxhall, Insignia, Black, 4, 40,000.
A Record is the basic unit of data stored in a data file. It is a collection of data items, which may be of different data types, all relating to the individual or object that the record describes and is treated as a unit for processing. In our previous example it would be all of the information about the car. A record is made up of many fields.
A field is one single piece of information about an object. It is another word for attribute as explained previously.
A primary key is a field that uniquely identifies an individual record in a table. For example if we have a table containing information about students, what information might we collect? Can any of these fields be used to uniquely identify each record? In some cases we will have to create an additional field such as candidate number to uniquely identify each record.
A composite key is one that consists of two or more fields. In the Event table, if we know both the name of an event and the gender of the competitors we can uniquely identify an event. This combination of fields is therefore a suitable choice for primary key of the table. As the key consists of two fields, we say that it is a composite key.
When the primary key from one table appears as a field in a second table, it is known as a Foreign Key of the second table.
To create a relationship we move the primary key from one table to another, where it is known as a foreign key. This piece of data can be used to retrieve all of the other information about an object if necessary.
There are three different types of relationships possible between objects.
1)One to One- A husband can have one wife, a wife can have one husband.
2)One to many- A football team can have many players, a player can play for one team.
3)Many to many- A subject can have many students, a student can have many subjects.
We use entity relationship diagrams to show how different tables in our database are related.
Create an Entity-Relationship Diagram for the car rental database you have been working on.
A query is a question used to retrieve selected information from a database.
For example we might want to ask the following questions in relation to our car database
Find all of the Ferraris.
Find all black cars.
Find all cars with less than 5000 miles.
Who won gold in the Men’s Road Race? Special computer languages, called query languages, are used to specify queries in a form that can be processed by a computer. A special piece of software called a query processor analyses the query and computes the answer.
A report is the presentation of selected data from a database.
Query results are not very presentable particularly when printed. We use a report to make the query results more visually appealing.
Once we have created a table we can also create a form to aid data entry. This improves the visual appearence of the screen and also will decrease the likelihood of data entry errors.
A macro is a small program to perform a repetitive task and which can be created and stored for later use by a user. This can cut down the amount of time the user spends clicking on options on a menu and assign all of those selections to one simple button. These can be used to print out pages or even link to other pages/documents.
The term Big Data has been used to refer to a range of problems and technologies that relate to the management of very large data sets. The amount of data that we currently generate is enormous, and is increasing each year. “IBM say that ‘every day, we create 2.5 quintillion bytes of data – so much that 90 per cent of the data in the world today has been created in the last two years alone.’”
http://www.bcs.org/upload/pdf/big-data.pdf Sources of this data include:
• text, images and videos uploaded to social media sites;
• medical records;
• fitness monitors;
• financial markets – share price data, currencies, etc.;
• web server logs;
• mobile phone records;
• ecommerce transactions;
• device logs (internet of things).
There is no generally agreed definition of big data, and it is clearly not just about the size of the data sets. There is, however, widespread recognition that big data problems are characterised by what have become known as the three Vs: volume, velocity and variety.
It is difficult to put a size limit in what counts as big data; indeed what was considered big yesterday may not be considered big tomorrow. It is clear, however, that big data problems involve data sets that are so large and complex that traditional tools, such as relational database management systems, are not able to deal with them effectively.
Big data problems frequently have to deal with real time flows of data. Events happen in the real world and the data generated is real time, and is often time sensitive.
For example:
• It has been estimated that, in September 2016, there were almost 891 thousand transactions per day on the London Stock Exchange (Statista, 2016). Each transaction results in money moving between accounts, as well as shareholder registers being updated. https://www.statista.com/statistics/325326/uk-lse-average-daily-trades/
• It has been estimated that there are between 4.5 and 9 million security closed circuit television cameras (CCTV) in the UK. While there is debate about the desirability of all this surveillance, it is hard to ignore the potential to enhance security and combat crime. http://www.bbc.co.uk/news/uk-30978995
Much of the data being produced is unstructured – this is not data that can easily be put into relational database tables. It is video, audio, and image, as well as text and numeric data. The effective management of the volume, velocity and variety of all this data is an important challenge for the big data industry.
The management of big data is, however, only one part of the problem. There is no value in managing data unless we can interpret it and extract useful information. This is the role of data analytics.
Data analytics applies algorithms to raw data in order to spot patterns, relationships and trends. The algorithms may include statistical analysis as well as artificial intelligence techniques.
Potential uses of Big Data include:
• Healthcare. – https://www.newscientist.com/article/dn28340-big-data-better-health/ – http://www.bbc.co.uk/news/health-38055509
• Crime prevention combat crime. – http://www.bbc.co.uk/guides/zqsg9qt
http://graymattersystems.com/big-data-crime-patterns/
http://www.bbc.co.uk/news/uk-30978995
• Energy Management. – http://www.bbc.co.uk/news/business-35722324
You can find more applications of big data and data analytics here:
http://www.bbc.co.uk/news/business-29147254
Keywords
Possible Past Paper Questions
What is a database? (2 Marks)
What is a relational database? (2 Marks)
3.Explain the term field. (2 Marks)
4. Explain the term record. (2 Marks)
5.Describe the purpose of a form within a database. (2 Marks)
6.Describe the purpose of a query within a database. (2 Marks)
7.Describe the purpose of a report within a database. (2 Marks)
8. Why do we use Macros within a database. (2 Marks)
9.Explain the purpose of the key field(primary key). (2 Marks)
10. What is big data?[2]
11. Explain the three V's in relation to Big Data? [6]