19.1 Algorithms

Specification

The textbook has a good section (Chapter 23 Langfield and Duddell ) on the different algorithms. This page will add, rather than replace that information.

Linear/Binary Search

Sort Comparison

Here is an interesting video comparing different sort algorithms.  You'll notice bubble sort is the worst performing.  CIE only require you to know insertion and bubble sort from these.

Here are some of the same algorithms again, but in colour and with sound representing the sorted elements.

Finally, a more spaced out, hypnotic version.



If you are interested in making your own visualisation, check out the Coding Train video (it uses Processing (Java)).

Bubble Sort

Here is my implementation of the bubble sort in VB,NET (from lesson 4).  I have also included an output of the elements following the sort.

Dim Array() = {33, 6, 5, 8, 3, 11, 1}

Dim OuterLoop, InnerLoop, Temp, Length As Integer

Length = Array.Length

For OuterLoop = Length - 1 To 1 Step -1

     For InnerLoop = 0 To OuterLoop - 1

         If Array(InnerLoop) > Array(InnerLoop + 1) Then  ' Compare elements

             Temp = Array(InnerLoop)

             Array(InnerLoop) = Array(InnerLoop + 1)

             Array(InnerLoop + 1) = Temp

         End If

     Next InnerLoop

 Next OuterLoop

For Each a As Integer In Array

     Console.WriteLine(a)

Next

Console.ReadLine()

The textbook on page 149 gives a more improved version of the algorithm that replaces the outer loop with a WHILE loop that ends execution once the inner loop completes without making any swaps.  Try and re-create this with the code above.

Insertion Sort

Efficiency of bubble sort vs. insertion sort 

Insertion sort can perform much more efficiently than bubble sort as the routine will stop once a given value is in position, whereas bubble sort cannot do this.  However, a well written bubble sort (as shown in the text book) can abort if no swaps have been made, which insertion sort will not.  Therefore, in best case (pre-sorted), bubble sort is far more efficient; however, given an almost-sorted list, insertion sort will blitz bubble sort .Insertion sort is generally more efficient, although both are O(n2).  

Insertion sort will also perform faster asymptotically, despite having the same O(N^2). Specifically, Insertion is faster than Bubble because of what occurs in each pass: Bubble Sort swaps through all remaining unsorted values, moving one to the bottom. Insertion Sort swaps a value up into already-sorted values, stopping at the right place.

Once a pass starts, Bubble Sort must see it through to the end. Insertion bails early. They’re both adaptive, as they both sense when they don’t need another pass.

However, having the same Big-O doesn’t mean identical performance. Big-O ignores everything except the most-significant growth factor. It ignores constants, lesser factors, even coefficients on that most-significant factor.

If Algorithm A consumes half as much time as Algorithm B in all cases, they have the same Big-O. We don’t (for example) call them O(n^2) and O(2n^2) — both are O(n^2).  If Algorithm C is identical to Algorithm D, but in all cases Algorithm D has additional 20 seconds of one-time setup cost, they have the same Big-O (but we’d consider C faster). We wouldn’t (for example) call them O(n) and O(n+20000) — both are O(n).

Detail

The following video from Harvard shows an example of insertion sort.

Here is my implementation in VB of the insertion sort.

Dim Array() As Integer = {15, 8, 22, 43, 5, 9, 11, 3}

Dim iLoop, j, Element As Integer

For iLoop = 1 To Array.Length - 1

    Element = Array(iLoop)

    j = iLoop

    While j > 0 AndAlso Array(j - 1) > Element

        Array(j) = Array(j - 1)

        j = j - 1

    End While

    Array(j) = Element

Next iLoop

For Each num As Integer In Array

    Console.WriteLine(num)

Next

Console.ReadLine()

Another insertion sort VB video

Insertion Sort Videos

These are two videos from Kevin Drumm.  You can follow through to the video to watch the other videos in his series.

Stacks and Queues

This Computerphile video shows how stacks and pointers are used for managing subroutines.

Stack Algorithms

CIE sometimes uses two pointers for stacks, this assumes one will be used to indicate the current top of stack.  Using one pointer assumes that the base of stack pointer is simply the first element in the structure (e.g. array).  In addition, stacks generally do not implement circular references, as queues do.  We'll also assume any structures have a lower bound of 1, not 0.

Inserting an item onto a stack (pushing)

IF ToS = ARRAY.LENGTH THEN

    OUTPUT "STACK FULL"

ELSE

    ToS <-- ToS + 1

    ARRAY[ToS] <-- ITEM

ENDIF

Removing an item from a stack (popping)

IF ToS = 0 THEN

    OUTPUT "STACK EMPTY"

ELSE

    ITEM <-- ARRAY[ToS]

    ToS <-- ToS - 1

    RETURN ITEM

ENDIF

Stack Code

The worksheet requires you to use my existing class code to implement a basic stack.  The code is below.  The worksheet (found at the bottom of the page in the files section) will also ask you to extend the functionality of the class.

Class myStack

        Private sta() As String ' array used to simulate memory - we will not use element 0 to keep things simple

        Private BottomOfStackPointer As Integer ' represents a pointer indicating the start of the stack in memory (in our case, the array)

        Private TopOfStackPointer As Integer ' top of stack pointer

        Private StackTop As Integer = 20 ' sets the upper stack limit

        Public Sub New()

            ' constuctor procedure to initialise the stack

            ReDim sta(StackTop)

            TopOfStackPointer = 0 ' indicates nothing in stack

            BottomOfStackPointer = 1 ' the bottom of the stack

        End Sub

        Public Function Peek() As String

            ' will let the user view (without popping) the top of stack

            If isEmpty() Then

                Throw New Exception("Stack is empty!")

            Else

                Return sta(TopOfStackPointer)

            End If

        End Function

        Public Function Pop() As String

            ' pops the stack

            If isEmpty() Then ' the stack is empty

                Throw New Exception("Stack is empty!") ' generate an exception

            Else

                Pop = sta(TopOfStackPointer)

                TopOfStackPointer -= 1

            End If

        End Function

        Public Sub Push(Item As String)

            ' push an item onto the stack

            TopOfStackPointer += 1

            sta(TopOfStackPointer) = Item

        End Sub

        Public Function Size() As Integer

            ' returns the number of items stored in the stack

            Return TopOfStackPointer

        End Function

        Public Function isEmpty() As Boolean

            Return TopOfStackPointer < BottomOfStackPointer ' evaluate expression and return Bool result

        End Function

End Class

Here is the code that uses the class.  Both of these should be pasted within your Module 1 region.  The code is basic and you should definitely attempt to test it more thoroughly.  

Dim N As New MyStack

Console.WriteLine("# items: " & N.Size)

N.Push("Hello")

Console.WriteLine("# items: " & N.Size)

Console.WriteLine("peek: " & N.Peek)

Console.WriteLine("pop: " & N.Pop)

Console.WriteLine("# items: " & N.Size)

Console.WriteLine("pop: " & N.Pop)

Console.ReadKey()

Queue Algorithms

CIE sometimes uses two pointers for queues, and always if implementing a circular queue (which is often the examiner's preferred structure).  In addition, as with stacks, we'll assume any structures have a lower bound of 1, not 0.  The two pointers are the head or FrontOfQueue pointer and the tail or EndOfQueue pointer.  Due to the circular nature of this implementation, checking for a full queue is not as simple as with a stack.

Inserting an item into a queue (enqueuing)

IF (FoQ = 1 AND EoQ = ARRAY.LENGTH) OR (FoQ = EoQ + 1) THEN

    OUTPUT "Overflow! Queue Full"

ELSE

    IF EoQ = NullPointer THEN // Queue empty

        FoQ <-- 1

        EoQ <-- 1

    ELSEIF EoQ = ARRAY.LENGTH THEN // wrap around to front

        EoQ <-- 1

    ELSE

        EoQ <-- EoQ + 1

    ENDIF

    ARRAY[EoQ] <-- ITEM

ENDIF

Removing an item from a queue (dequeuing)

IF FrontOfQueue = NullPointer THEN // Queue empty

    OUTPUT "Queue Empty"

ELSE

    ITEM <-- ARRAY[FoQ]

    IF FoQ = EoQ THEN // Last item in queue

        FoQ <-- NullPointer

        EoQ <-- NullPointer

    ELSEIF FoQ = ARRAY.LENGTH THEN // Need to wrap around to start

        FoQ <-- 1

    ELSE

        FoQ <-- FoQ + 1

    ENDIF

    RETURN ITEM

ENDIF

Linked List

Linked lists are a fairly standard topic and both the lesson PowerPoint and textbook give enough background to get you started.  Therefore, this section will offer additional reading on the topic.

The following Wikibooks site gives detail on both linear lists and linked lists.

Stanford University have produced a document on problems with linked lists.  This is additional reading and not required for the specification.

Finally, the MSDN website has information on the actual linked list class available from the .NET platform.

The following video is one of a series looking at linked lists in VB.NET.  Only the first video is given below.

Algorithms

A linked list is often sorted by a key.  An unsorted linked list could be used to implement a stack or queue, but if you are able to handle a sorted list, an unsorted one (should it, unlikely come up in the exam), be trivial to implement.  Another consideration when dealing with linked lists, is the grouping of the data and pointer (as these are not stored in a contiguous manner).  To accommodate this, we usually create a user-defined data type (UDT) using TYPE in pseudocode, a structure/class in VB and a class in Python.  Also, in the real world, the data can be anything, including another data structure.  

When implementing linked lists in arrays, not only do we need to consider which node is the starting node (as you would using other methods), but also which elements are free for use.  Two external pointers are used:  StartPointer and FreePointer.  The NullPointer is used to signify the end of the chain, or that the list is either empty or full.

It should be noted, that linked lists, binary trees, dictionaries and hash tables hold more powerful applications than the way they are simplified for the exam.  We often just store one value, which has limited use.  In the real world, the value would be a record structure, class, list or other useful item and searching done against a key value.

The pseudocode for these algorithms can be found in the revision textbook (pp 222-224) or main textbook (pp 323-328)

Initialising a linked list

Sometimes, exam questions will ask you to initialise a linked list or binary tree, etc.  This is a simple task whereby all items in the array (usually the preferred underlying data structure) are chained together to form a linked list of free nodes.  The StartPointer is set to the NullPointer value and the FreeListPointer set to the first item in the array.  The final item in the array has its pointer set to NullPointer so as to indicate the end of the list.

An example of initialisation and declarations can be found in the 2015 CIE A Level book on page 326 and copied below for reference.   This assumes an array with 7 elements.  You can use array length to make this generic, or adapt as necessary.

// NullPointer should be set to -1 if using array element with index 0

CONSTANT NullPointer = 0

// Declare record type to store data and pointer

TYPE ListNode

    DECLARE Data     : STRING

    DECLARE Pointer  : INTEGER

ENDTYPE

DECLARE StartPointer : INTEGER

DECLARE FreeListPtr  : INTEGER

DECLARE List[l : 7] OF ListNode

PROCEDURE InitialiseList

    StartPointer <- NullPointer

    FreeListPtr <- 1

    FOR Index <-- 1 TO 6

    // set start pointer

    // set starting position of free list

    // link all nodes to make free list

    List [Index] .Pointer <-- Index + 1

    ENDFOR

    List [7]. Pointer +-- Null Pointer 11 last node of free list

END PROCEDURE

Searching a linked list

Recursively, it could be done as follows (taken and adapted from a past exam paper using CIE's nomenclature).  The function returns the position of the node which matches the parameter 'Find'.  The linked list has a global scope.

FUNCTION SearchList(Find, Position) RETURNS INTEGER

    IF List[Position].Data = Find THEN

        RETURN Position

    ELSE

        IF List[Position].Data <> -1 THEN

            Position ← SearchList(Find, List[Position].Pointer)

            RETURN Position

        ELSE

            RETURN -1 // a value to indicate no result was found

        ENDIF

    ENDIF

ENDFUNCTION

Inserting a node

Deleting a node

This could be done recursively, so be prepared to implement this is required.  QP 41 Q6  in summer 2018 had such a question, but moved away from having a separate set of free nodes, so do be prepared for the chief examiner to do whatever they like.

Binary Tree

From the same author as the linked list video, this goes into detail (ignore traversal, this is not in the CIE spec) on binary trees.  This video too is only one of a series of videos on binary trees.

Algorithms

A binary tree is an ordered data structure. Another consideration when dealing with binary trees, is the grouping of the data and pointer (as these are not stored in a contiguous manner), as with a linked list.  To accommodate this, as with linked lists, we create a user-defined data type (UDT) to hold the data and the two pointers (left and right sub-tree).

When implementing binary trees in arrays, not only do we need to consider which node is the starting node (as you would using other methods), but also which elements are free for use (using the left pointer). Two external pointers are used:  StartPointer and FreePointer.  The NullPointer is used to signify the end of the chain, or that the list is either empty or full.

A binary tree can be seen as a linked list, with each node chaining to successive nodes.

Initialising a binary tree

Sometimes, exam questions will ask you to initialise a linked list or binary tree, etc. This is a simple task whereby all items in the array (usually the preferred underlying data structure) are chained together to form a linked list of free nodes. The StartPointer is set to the NullPointer value and the FreeListPointer set to the first item in the array. The final item in the array has its left pointer set to NullPointer so as to indicate the end of the list.

An example of initialisation and declarations can be found in the 2015 CIE A Level book on page 330 and copied below for reference. This assumes an array with 7 elements. You can use array length to make this generic, or adapt as necessary.

// NullPointer should be set to -1 if using array element with index 0

CONSTANT NullPointer = 0

// Declare record type to store data and pointers

TYPE TreeNode

    DECLARE Data : STRING

    DECLARE LeftPointer : INTEGER // we only need LeftPointer to set up free list

    DECLARE RightPointer : INTEGER

ENDTYPE

DECLARE RootPointer : INTEGER

DECLARE FreePtr : INTEGER

DECLARE Tree[l : 7] OF TreeNode

PROCEDURE InitialiseTree

    RootPointer <-- NullPointer // set start pointer

    FreePtr <-- 1 // set starting position of free list

    FOR Index = 1 TO 6 // link all nodes to make free list

        Tree[Index].LeftPointer <-- Index + 1

    ENDFOR

    Tree[7].LeftPointer <-- NullPointer // last node of free list

END PROCEDURE

Searching a binary tree

Searching can be done recursively or non-recursively. However, with each node having two pointers, the search technique is slightly different to that of a linked list.  However, given that binary trees are ordered, deciding which sub-tree to traverse is a simple decision. 

The structured English below is more verbose, so you can follow the logic of searching a binary tree. 

The above can simplify the checking for NullPointers by only looping while the current node is not NULL.  This example is taken from the CIE textbook (p331).

FUNCTION FindNode(Searchitem) RETURNS INTEGER // returns pointer to node

    ThisNodePtr <-- RootPointer // start at the root of the tree

    

    WHILE ThisNodePtr <> NullPointer // while there's a pointer to follow

    AND Tree[ThisNodePtr].Data <> Searchitem // and search item not found

        IF Tree[ThisNodePointer].Data > Searchitem THEN // follow left pointer

            ThisNodePtr <-- Tree[ThisNodePointer].LeftPointer

        ELSE // follow right pointer

            ThisNodePtr <-- Tree[ThisNodePointer].RightPointer

        ENDIF

    ENDWHILE

    RETURN ThisNodePointer // will return null pointer if search item not found

END FUNCTION

Inserting into a binary tree

inserting a node must take account of position, which involves navigating around the tree until there is a leaf node (a node with no children) or a node with a null pointer at the position that will then have the new node added.  This can be done recursively, so ensure you understand the procedure, as the exam could go either way lately.

You can also find another example taken from the 2015 textbook on page 331.

Structured English

2015 text book

PROCEDURE InsertNode(Newitem)

    IF FreePtr <> NullPointer THEN // there is space in the array

        // take node from free list, store data item and set null pointers

        NewNodePtr <-- FreePtr

        FreePtr <-- Tree[FreePtr].LeftPointer

        Tree[NewNodePtr].Data <-- Newitem

        Tree[NewNodePtr].LeftPointer <-- NullPointer

        Tree[NewNodePtr].RightPointer <-- NullPointer

        

        // check if empty tree

        IF RootPointer = NullPointer THEN //insert new node at root

            RootPointer <-- NewNodePtr

        ELSE //find insertion point

            ThisNodePtr <-- RootPointer //start at the root of the tree

            WHILE ThisNodePtr <> NullPointer //while not a leaf node

                PreviousNodePtr <-- ThisNodePtr //remember this node

                IF Tree[ThisNodePtr].Data > NewItem THEN //follow left pointer

                    TurnedLeft <-- TRUE

                    ThisNodePtr <-- Tree[ThisNodePtr].LeftPointer

                ELSE //follow right pointer

                    TurnedLeft <-- FALSE

                    ThisNodePtr <-- Tree[ThisNodePtr].RightPointer

                ENDIF

            ENDWHILE

            

            IF TurnedLeft = TRUE THEN

                Tree[PreviousNodePtr].LeftPointer <-- NewNodePtr

            ELSE

                Tree[PreviousNodePtr].RightPointer <- NewNodePtr

            ENDIF

        ENDIF

    ENDIF

ENDPROCEDURE

Deleting a binary tree node

Deleting a binary tree node is reasonably complex, and may require re-organising the tree.  For this reason, it IS NOT in the specification and you WILL NOT be required to do this in your exam.

Outputting all items in a binary tree

This process can be done recursively and simply. The procedure looks like this:

PROCEDURE TraverseTree(BYVALUE Pointer : INTEGER)

    IF Pointer <> NULL THEN

        TraverseTree(Tree[Pointer].LeftPointer)

        OUTPUT Tree[Pointer].Data

        TraverseTree(Tree[Pointer].RightPointer)

    ENDIF

ENDPROCEDURE

Notice how there are no checks for nulls when traversing, only when the procedure is called.  This way, if any call is made with a NullPointer, the routine simply completes and nothing is output.

Graph (including Tree) Traversal Methods (new)

CAIE seem to have moved to ask for specific traversal methods (which are not listed in the specification). You will have done work on traversal with Dykstra and A*, but when searching a tree, there are numerous other ways to access nodes. This is a link to Isaac Computing (an excellent CS resource for all specifications). It will talk you through the simple types of graph traversal.

Isaac Computing Site

Hash tables & Dictionaries (NOT in new spec)

This is included to provide extra information on this ADT. It is NOT in the 9618 spec.

This video explains what an associative array is, aka hash table.  The videos after this focus specifically on hashing and their associations.

As hash tables/dictionaries are relatively simple, I have not gone to the trouble of outlining the procedures for inserting and sorting.  You can find these, in more detail, in the 2015 textbook on pages 332-334.

A more basic video to start with, looking at the concepts of, and how to derive a hash value and table.  The second and third video go into much more detail.

LEFT: Continuation video from Harvard, going into further detail.  E.g. hash functions, etc. 

RIGHT: The following video looks at open addressing (closed hashing).  it also covers double-hashing, which is beyond the scope of the specification.

Algorithms

A dictionary (and hash table - derived from a dictionary) is an abstract data type using a key-value relationship. The specification only considers hash tables for the algorithm work, so we will use these. 

A consideration with a hash table is open addressing (closed hashing) or closed addressing (open hashing).  The former requires you to start at the hashed value and then keep moving down until there is a free slot in which to insert.  Closed addressing must use a linked list, or other mechanism, but which multiple data items can hang from the same position in the table.

Remember, there is a difference between a hash value and a hash table. 

Graphs

Watson and Williams textbook page 487

Langfield and Duddell textbook pages 424-426

Algorithmic Efficiency (Big O)

This can be found on sub page here.