1.2 Memory and Storage

1.2.4 Data storage (Characters)

This section covers:

Characters

  • The use of binary codes to represent characters

  • The term ‘character set’

  • The relationship between the number of bits per character in a character set, and the number of characters which can be represented, e.g.:

    • ASCII

    • Unicode

Numbers are covered in the previous section.

Images and Sound are covered on the next two pages.

Other sections in this topic:

🔗 1.2.1 Primary storage (Memory)

🔗 1.2.2 Secondary storage

🔗 1.2.3 Units

🔗 1.2.5 Compression

This page looks at how we store text on a computer system. In Computer Science, letters and symbols are known as characters.

Back in the Wild Wild West

When America started to construct its railway system, the stations needed a way to communicate with each other. The simplest way was to build a system where a wire was hung from posts at the side of the track. Basically, a switch was turned on and off at one end of the cable, and this produced a "click" at the other end.

If just one click (or no click) is used, then we can send two different signals, which is not really much use. If we use two signals (or no signals), then we have four possibilities. You should be able to see that we have a binary system: each extra signal doubles the number of possibilities. If we use five signals, then we can represent 32 different values, which means that we can now start to send every letter of the alphabet plus a few extras such as full stops and spaces.
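The doubling pattern described above can be checked with a short sketch. It is written in Python purely for illustration, and the function name signal_combinations is made up for this example.

```python
def signal_combinations(signals: int) -> int:
    """Return how many different values a number of on/off signals can represent."""
    # Each signal is either on or off, so every extra signal doubles the total.
    return 2 ** signals


for signals in (1, 2, 5, 7):
    print(signals, "signals give", signal_combinations(signals), "different values")
```

Five signals give 32 values and seven signals give 128, which matches the figures used on this page.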

This system was extended to include 128 characters (7 signals), so we could use upper case and lower case letters, numbers, and a large number of symbols which we are all now very familiar with. The system was standardised so that everyone used the same number for the same character, and was known as ASCII, the American Standard Code for Information Interchange. This was one of the first standard character sets and used seven bits to represent each character.

A very old railroad telex machine.

The Use of Binary Codes to Represent Characters

When computers started to be able to display text, it seemed obvious to use a ready-made system to represent the characters that they would have to use. ASCII was made into an eight-bit system called extended ASCII.

If you look at the binary versions of both ASCII and extended ASCII, you will see that the last five bits of the letter A hold the number 1, those of B hold 2, etc. The diagram on the right shows that two other bits are also used, and you can see the difference between how upper case and lower case letters are stored.
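To see these bit patterns for yourself, here is a minimal Python sketch. It uses the built-in ord function, which returns the code number of a character; the helper names to_binary and letter_position are hypothetical and exist only for this example. Note that the full seven-bit ASCII code for A is 65; it is the last five bits that hold the value 1.

```python
def to_binary(character: str, bits: int = 7) -> str:
    """Return the character's code as a binary string padded to the given width."""
    return format(ord(character), f"0{bits}b")


def letter_position(character: str) -> int:
    """Return the value held in the last five bits of the character's code."""
    return ord(character) & 0b11111


for character in "AaBb":
    print(character, ord(character), to_binary(character), letter_position(character))
```

A is stored as 1000001 and a as 1100001, so the upper case and lower case versions of a letter differ by a single bit (worth 32).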

Character Set

In your exam, you will need to be able to say what we mean by a character set: a character set is one particular system of codes used to represent the characters that a system uses.

You may also be asked to explain how binary codes are used. Basically, this is what the diagram on the right and the text above explain.

The Relationship Between the Number of Bits used and the Number of Characters that can be Represented

Back to the Modern World

The railroad system's method of communicating worked well for English-speaking countries, and was used on most countries' computer systems. There was a massive problem though: how could we include the different letters and symbols that are used in other countries?

The only way to store more characters is to use more bits. If we used two extra bits (ten bits in total), we would still need to use two bytes to store each character, so we may as well use the whole sixteen bits!
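The reasoning about bits and bytes can be worked through in a few lines. This is only a sketch; bytes_needed and characters_available are illustrative names, and it assumes that storage is always handed out in whole bytes of eight bits.

```python
def bytes_needed(bits: int) -> int:
    """Return the number of whole 8-bit bytes needed to hold the given number of bits."""
    # Round up, because part of a byte cannot be allocated on its own.
    return (bits + 7) // 8


def characters_available(bits: int) -> int:
    """Return how many different characters the given number of bits can represent."""
    return 2 ** bits


for bits in (7, 8, 10, 16):
    print(bits, "bits:", bytes_needed(bits), "byte(s),",
          characters_available(bits), "characters")
```

Ten bits and sixteen bits both take two bytes, but sixteen bits allow 65,536 characters instead of 1,024.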

Unicode is a clever system that can use up to four bytes for each character and is continuously being developed to include new languages. It has also been adapted to include emoji, and new symbols are being introduced every year.
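How many bytes a Unicode character actually takes depends on which encoding is used, which this page does not name. As an illustration only, the sketch below assumes two common encodings, UTF-8 and UTF-16, and uses Python's built-in str.encode method; the function name encoded_length is made up for this example.

```python
def encoded_length(character: str, encoding: str) -> int:
    """Return how many bytes the character takes in the given encoding."""
    return len(character.encode(encoding))


# Letter A, e with an acute accent, the euro sign, and a smiley face emoji.
samples = ("A", "\u00e9", "\u20ac", "\U0001F600")

for character in samples:
    code_point = f"U+{ord(character):04X}"
    # "utf-16-be" is used so that no byte order mark is added to the count.
    print(code_point,
          "UTF-8:", encoded_length(character, "utf-8"), "byte(s)",
          "UTF-16:", encoded_length(character, "utf-16-be"), "byte(s)")
```

In UTF-16 every character takes either two or four bytes, while UTF-8 stores the original ASCII characters in a single byte and uses up to four bytes for everything else.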

Other Resources for this topic

This section has four pages:

🔗 Binary representation of Numbers

Binary representation of Characters (this page)

🔗 Binary representation of Images

🔗 Binary representation of Sound

Next section:

🔗 1.2.5 Compression