Arrays

Data structures are used to combine related data in a computer program. The structure you need for IGCSE is the array, which is one of the most fundamental data structures in computing.

The key attributes of arrays are:

fixed length
single data type
contiguous memory storage
(therefore) fast random access of elements

The latter works because the single data type and contiguous memory allows the calculation of the memory address of any element in the array

(nth element address) = (start address) + n * (element size)

There are a range of standard algorithms you need to know about arrays.

Array Basics

An array consists of a fixed number of elements of the same data type. E.g., [1, 4, 5, 6, 2] or ['a', 'd', 'e'] or [1.1, 2.3, 3.5, 4.7]. Elements are indexed from 1 to length(array)

The official IGCSE way to declare an array and then populate it is like:

DECLARE marks: ARRAY[1:4] OF Integers

marks[1] ← 60

marks[2] ← 78

...

You could also declare and initialize an array using an array literal (the DECLARE might be optional)

DECLARE marks ← [60, 78, 67, 92]

Indexing and reassigning elements:

OUTPUT marks[1]    // prints 60

marks[1] ← 62      // [62, 78, 67, 92]

In Python, the similar structure that is most commonly used is a list, which is sequence of data, but not necessarily of the same type and the list is allowed to change length. Elements are indexed from 0 to length(array)-1

In pseudocode, stick to using arrays of a fixed length and type.

You can then initialise lists to be "blank" such as

marks = [0]*4    # [0, 0, 0 ,0]

Or just initialize with an list literal

marks = [60, 78, 67, 92]

Indexing and reassigning elements:

print(marks[0])    # prints 60

marks[0] = 62      # [62, 78, 67, 92]

Appending, inserting, deleting and slicing elements from an list

Don't do this in IGCSE pseudocode!

marks.append(81) # [62, 78, 67, 92, 81]

marks.remove(78) # [62, 67, 92, 81]

print(marks[1:3])# is [67, 92]

Iterating through an Array

A key skill with arrays is looping through them. Given their size is known, they are a perfect example of where to use a FOR loop. For example, looping through a linked pair of arrays

DECLARE names ← ["anika", "barb", "cat"]

DECLARE marks ← [60, 78, 92]

FOR idx ← 1 TO length(marks) STEP 1

  OUTPUT names[idx], " scored ", marks[idx]

NEXT idx

Python can loop through arrays using similar for loops

names = ["anika", "bob", "cat"]marks = [60, 78, 92]for idx in range(len(marks)): print(names[idx], "scored", marks[idx])

However, it can also use nicer methods like enumerate and zip for such loops

names = ["anika", "bob", "cat"]marks = [60, 78, 92]for name, mark in zip(names, marks): print(name, "scored", mark)

Standard Algorithms

Many of the algorithms we've seen before can be altered to work with arrays.

Total and/or Average:

DECLARE marks ← [60, 81, 78, 92, 65]

total ← 0

FOR idx ← 1 to length(marks)

  total ← total + marks[idx]

NEXT idx

average ← total / length(marks)

OUTPUT total, average

Count, e.g., marks above 70

DECLARE marks ← [60, 81, 78, 92, 65]

count ← 0

FOR idx ← 1 to length(marks)

  IF marks[idx] > 70 THEN

    count ← count + 1

  END IF

NEXT idx

OUTPUT count

Max (Exercise: modify to Min)

DECLARE marks ← [60, 81, 78, 92, 65]

max ← 0 // or use value of 1st element

FOR idx ← 1 to length(marks)

  IF marks[idx] > max THEN

    max ← marks[idx]

  END IF

NEXT idx

OUTPUT max

Array Algorithms

New things we can do is say where in an array an item is found.

For example, modify the Max/Min algorithm to also return the index of the max/min element:

DECLARE marks ← [60, 81, 78, 92, 65]

max ← marks[1]

maxPos ← 1

FOR idx ← 2 to length(marks)

  IF marks[idx] > max THEN

    max ← marks[idx]

    maxPos ← idx

  END IF

NEXT idx

OUTPUT max, " at index ", maxPos

Linear Search

The linear search walks through an array looking for a particular element

DECLARE marks ← [60, 81, 78, 92, 65]

item = 78 // what we're looking for

FOR idx ← 1 to length(marks)

  IF marks[idx] > item THEN

    OUTPUT "Found at index ", idx

  END IF

NEXT idx

OUTPUT count

This can be optimised if you only care about the first occurrence of item (a common use case) by breaking out of the loop early using BREAK, or a flag variable and a WHILE loop, or by returning the index from a function/subprocedure.

Extension: Binary Search

Linear search can be very slow for large data sets. Provided your data is sorted then binary search is the much better choice! It exploits a sorted list by checking the middle element and then discarding the side that does not contain the element being searched for. This halves the list at each step and makes the algorithm really fast.

NOTE: Although this is not in the curriculum, it did turn up in October 2019 Paper 22 as an algorithm to analyze with a trace table, so it is worth seeing.

left ← 1

right ← length(array)

found ← false

WHILE left < right AND NOT found DO

  mid ← (left + right) DIV 2 // integer division

  IF item ← array(mid) THEN

    found ← true

  ELSE IF item < array(mid) THEN

    right ← mid - 1

  ELSE

    left ← mid + 1

  END IF

END WHILE

IF found THEN

  OUTPUT "Found at", mid

ELSE

  OUTPUT "Item not in array"

END IF

Extension: Bubble Sort

Binary Search (and lots of other things) require the data is sorted.

A bubble sort is not the best choice in almost any case, but it is simple! The idea of a bubble sort is that

you walk through the array swapping any adjacent element out of order
you keep doing that walk-through until no more swaps are made (each walk through is called a pass)

The following algorithm is not a bad bubble sort but can be optimised by using fact that top of array becomes sorted so the internal FOR loop does not need to go to the end of the array each time.

DECLARE data ← [60, 81, 78, 92, 65]

REPEAT

  swapped ← false // flag to stop the sort

  FOR idx ← 1 TO length(data)-1

    IF data[idx] > data[idx+1] THEN

      temp ← data[idx]

      data[idx] ← data[idx+1]

      data[idx+1] ← temp

    END IF

  NEXT idx

UNTIL NOT swapped

OUTPUT data

Extension: Selection Sort

Selection sort works by searching the array for the biggest element and moving it to the top, then the next biggest to the second top and so on.

TODO

Report abuse