Check your understanding of Data Structures at AS level first. This included the ability to describe, interpret, manipulate records, one dimensional arrays and two dimensional arrays.
Describe, interpret and manipulate data structures including arrays (up to three dimensions), stacks, queues, trees, linked lists and hash tables.
Describe the manipulation of arrays (up to three dimensions).
Represent the operation of stacks and queues using pointers and arrays.
Represent the operation of linked lists and trees using pointers and arrays
You need to be able to describe, interpret and manipulate data structures including arrays (up to three dimensions), stacks, queues, trees, linked lists and hash tables.
What is the definition of a data structure?
A data structure is a specialised format for organising, processing, retrieving and storing data. There are several basic and advanced types of data structures, all designed to arrange data to suit a specific purpose. Data structures make it easy for users to access and work with the data they need in appropriate ways.
Represent the operation of linked lists and binary trees using pointers and arrays.
What is the difference between a balanced and unbalanced tree?
A balanced tree allows searching to be much more efficient. It can be quicker to find elements in the tree as you should be able to go through the search algorithm less times.
Searching an unbalanced tree can take longer, as each time you chop the list in half, you are unable to reduce the number of items by half.
Binary Trees use a recursive algorithm to traverse the data structure.
What is an algorithm?
An algorithm is an (unambiguous) set of instructions to solve a problem.
An alternative methods to pseudocode of expressing an algorithm is a flowchart/structured English.
What is a recursive algorithm?
A recursive algorithm is one which calls itself (with a parameter which changes) and it must have a terminating condition (base case) to stop the calls.
What are the disadvantages of a recursive algorithm?
Memory overheads of stack use with many recursive procedural calls
It can be difficult to program, run and bug if there is a fault
It can cause stack overflows
Has a tendency to have infinite loops.
What is a linked list?
A linked list is a set of data elements, where each element contains the data itself and a pointer to the next element
They are not necessarily held in contiguous locations (i.e. do not have to be next to one another in memory)
Each node in a linked list contains the data item, plus a pointer (also known as a link field) to the position of the next node (in memory) in the list
The pointer in the last item in the list indicates there are no further items in the list (can be 0, -1 or null)
A variable pointer is used to point to the first node/item in the list
A linked list is known as a dynamic data structure, where memory is allocated to the list as it is needed. The data struture (list) can shrink and grow during the life of the list
What are the benefits of a linked list compared to an array?
New items can be inserted into a linked list without rearranging all the other elements
If programmed dynamically they use memory more efficiently, as memory is only allocated as needed (as opposed to static data structures, where the memory is allocated at the start of processing and is not changed throughout the running of the program
You would use a linked list over an array, when you don't know how many items you may be storing in your list
What are the drawbacks of a linked list compared to an array?
A linked list requires extra memory for the pointers to the next item in the list
In a linked list, items can only be accessed sequentially from the start of the list, whilst arrays can be accessed randomly
Represent the operation of stacks and queues using pointers and arrays.
What is the difference between a stack and a queue?
A stack is a container of objects that uses the last in first out (LIFO) principle.
In a stack the last or most recent item of data to be added to the stack is removed first
Adding data to a stack is known as pushing, whilst removing data from a stack is know as popping.
It is a limited access data structure - elements can be added and removed from the stack only at the top
Push adds an item to the top of the stack, pop removes the item from the top.
A stack can be used as a recursive data structure.
A stack is either empty, or it consists of a top and the rest which is a stack.
Underflow occurs when an attempt is made to pop (remove) from an empty stack. Overflow occurs when an attempt is made to push an item (add) to a full stack.
A queue uses a first in first out (FIFO) principle.
In a queue the last or most recent item of data added to a queue is the last to be removed.
Alternatively, the first item to be added to the queue is also the first item to leave the queue.
All new items are added to the end of the queue and items leave from the front of the queue.
Enqueue adds an item to the queue and dequeue removes an item from the queue.
A queue can be empty or it can be full. You cannot add items to a full queue (Overflow) and you cannot remove items from an empty queue (Underflow).
NOTE: You DO NOT need to know about Priority Queues described in the video below.
Describe the manipulation of arrays (up to three dimensions).
What is a hash table?
A hash table is just an array of linked lists. Each linked list holds all the items in the table that share the same hash code. Initially, all the lists are empty (represented as null in the array). We need to be able to add and delete items in the list.
In hash tables the positions within an array are sometimes called buckets; each bucket is used to store data. This can be a single piece of data (such as in our examples) or can be a more complex object such as a record. The key must be stored as part of, or alongside, the data.
What are the advantages of hash tables over other data structures?
The main advantage of hash tables over other data structures is speed . The access time of an element is on average O(1), therefore lookup could be performed very fast. Hash tables are particularly efficient when the maximum number of entries can be predicted in advance.
It doesn't matter how large a hash table gets, the time taken to insert an item is still the same (i.e. calculate hash value and insert into correct place).
What do I need to do to implement these data structures, when using them in programming?
When writing programs to manipulate data (search, sort, procedure calling, keyboard buffers, printer queues, etc.), you need to utilise memory to hold the data and various other pointers. A combination of arrays and variables are used to make sure these data structures operate correctly.
How do I implement a stack?
These are the things you need:
an array to store the stack items
a variable to store the current position of the top of the stack (stack pointer)
a variable to store the maximum number of items that can be stored in the stack (max value)
How do I implement a queue?
These are the things you need:
an array to store the queue items
a variable to store the current position of the front of the queue (front or start pointer)
a variable to store the current position of the last item in the queue (rear or end pointer)
a variable to store the maximum number of items that can be stored in the stack (max value)
How do I implement a binary tree?
These are the things you need:
a two dimensional array to store the items in the binary tree
a left pointer for each item in the binary tree, to indicate the position of the node in the left hand subtree
a right pointer for each item in the binary tree, to indicate the position of the node in the right hand subtree
a variable to store the current position of the start of the binary tree (root node)
if the node is a leaf (no children), then the left hand an right hand pointers can be set to 0,or -1, or null
How do I implement a linked list?
These are the things you need:
a two-dimensional array to store the items/nodes in the list, plus a pointer to the next node in the list, alongside the data in each node
a variable which points to the start node in the list
How do I implement a hash table?
These are the things you need:
an array to store the items required
if there is a collision, you can use linked lists from the original hash value in the array, to store further items (known as daisy chaining)
a hash function, which takes the key field of the item to be stored, and calculates a hash value, which corresponds to a position in the hash table (array)
What is the difference between searching for an item in an ordered list and an unordered list?
When searching an ordered list in a linear manner, the search can be terminated when an item greater than the search value (or less that) is reached
When searching an unordered list the search cannot be terminated until the last item has been reached (or the search value not found)
For an ordered list a binary search can be used
For an unordered list a linear search can be used - you cannot use a binary search on an unordered list
Traversal of Data Structures:
When you are asked to traverse a data structure, you are being asked to follow an algorithm to access all the items in a data structure, in some sort of order, to maybe find a particular item, or may be to retrieve all items. Different data structures will have to be accessed using different traversal algorithms, due to the way in which they must be processed depending on their use.
Traversal of an Array:
Use a linear search to access an array, starting at (usually) position zero, through to the end of the array
NumberArray()
Search Algorithm:
SearchValue = 79
For count = 0 to 5
If SearchValue = NumberArray (count) then
Output 'Search Value found at position ', count
End If
Next count
NOTE: The output from this algorithm would be:
'Search Value found at position 4'
Retrieve all Items Algorithm:
For count = 0 to 5
Output NumberArray (count)
Next count
NOTE: The output from this algorithm would be:
2
5
12
1
79
11
Traversal of an Queues and Stacks:
If you wanted to traverse queues and stacks, you would use the methods above for arrays, as queues and stacks are implemented using arrays. However queues and stacks are more likely to be accessed by enqueueing, dequeueing, pushing and popping, rather than traversal.
Traversal of a Binary Tree:
There are three ways to traverse a binary tree:
PreOrder Traversal (Remember: ROOT, LEFT, RIGHT)
InOrder Traversal (Remember: LEFT, ROOT, RIGHT)
PostOrder Traversal (Remember: LEFT, RIGHT, ROOT)
The 'Order' refers to the order in which each node and its subtree is visited
Note: Traversal of a Binary Tree is a recursive algorithm i.e. one which calls itself.
Example Binary Tree
1. PreOrder Traversal:
The 'Order' refers to the order in which each node and its subtree is visited. So for PreOrder, the root node is visited first, before the left and right nodes
ROOT, LEFT, RIGHT - visit root node, then visit left node, then visit right node
Using the example Binary Tree above, we start at the root (always), which is Legg. By following the algorithm, and recursively calling itself as needed, the data items will be retrieved in the following order:
Legg, Charlesworth, Illman, Hawthorne, Jones, Todd, Ravage, Youngman
What could it be used for?
Clone a tree
Count the number of leaves (i.e. nodes with no children)
What does the algorithm look like?
Algorithm PreOrder(tree)
Visit the root
Call PreOrder(left->subtree) i.e. traverse the left subtree
Call PreOrder(right->subtree) i.e. traverse the right subtree
2. InOrder Traversal:
The 'Order' refers to the order in which each node and its subtree is visited. So for InOrder, the root node is visited in between the left and right nodes
LEFT, ROOT, RIGHT - visit left node, then visit root node, then visit right node
NOTE: InOrder traversal ALWAYS retrieves the data items in alphabetical or ascending numerical order.
Using the example Binary Tree above, we start at the root (always), which is Legg. By following the algorithm, and recursively calling itself as needed, the data items will be retrieved in the following order:
Charlesworth, Hawthorne, Illman, Jones, Legg, Ravage, Todd, Youngman
What could it be used for?
Sort / search a binary tree
Traverse / retrieve items alphabetically (or ascending numerically)
What does the algorithm look like?
Algorithm InOrder(tree)
Call InOrder(left->subtree) i.e. traverse the left subtree
Visit the root
Call InOrder(right->subtree) i.e. traverse the right subtree
3. PostOrder Traversal:
The 'Order' refers to the order in which each node and its subtree is visited. So for PostOrder, the root node is visited last, after the left and right nodes
LEFT, RIGHT, ROOT - visit left node, then visit right node, then visit root node
Using the example Binary Tree above, we start at the root (always), which is Legg. By following the algorithm, and recursively calling itself as needed, the data items will be retrieved in the following order:
Hawthorne, Jones, Illman, Charlesworth, Ravage, Youngman, Todd, Legg
What could it be used for?
Deleting / Undoing a binary tree
Stack-based programming
What does the algorithm look like?
Algorithm PostOrder(tree)
Call PostOrder(left->subtree) i.e. traverse the left subtree
Call PostOrder(right->subtree) i.e. traverse the right subtree
Visit the root
Traversal of a Linked List:
A linked list can only be traversed sequentially
You have to start at the beginning of the linked list and follow the pointers in each node to the next item/node in the list
You cannot jump into the middle of the list
What does the Algorithm look like?
p=start {start at beginning of list}
While p <> 0 {loop while not end of linked list}
currentNode = node(p)
output currentNode.data
p = currentNode.pointer
Endwhile
Using the diagram above, the output from this algorithm would be:
1
2
3
4
5
6
7