Data Compression

Learning Outcomes

• explain the need for data compression;

• describe how zipping is used to compress data;

• evaluate common data file formats: txt, wav, bitmap, Joint Photographic Experts Group (JPEG), Moving Picture Experts Group (MPEG) and Graphics Interchange Format (GIF).

Data Compression

Data compression is the process of reducing the memory or storage space required by large files. This matters when data is transmitted across a network and when it is stored on secondary storage devices such as hard drives and USB drives. Compression algorithms reduce the amount of space the data actually takes up on a storage device.

Lossy compression gives the greatest reduction in file size but permanently discards some of the data. This is suitable for image and music files but not for word-processed documents or programs, where every bit must be preserved.

Lossless compression usually reduces the file size by less, but the file can be restored exactly to its original form at a later date.
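The difference between the two approaches can be seen in a small Python sketch. It is illustrative only: the lossless part uses the standard `zlib` module, while `lossy_round_trip` and its `step` parameter are hypothetical names for a toy scheme that simply rounds values down, not a real audio or image codec.

```python
import zlib


def lossless_round_trip(data):
    """Compress with zlib, decompress again, and report what happened."""
    compressed = zlib.compress(data)
    restored = zlib.decompress(compressed)
    # The restored bytes should match the original exactly.
    return len(compressed), restored == data


def lossy_round_trip(samples, step=16):
    """Toy lossy scheme (an assumption for illustration): round each value
    down to a multiple of `step`, so fine detail is permanently discarded."""
    return [(s // step) * step for s in samples]


text = b"compression " * 200
size, identical = lossless_round_trip(text)
print(len(text), size, identical)

samples = [3, 18, 35, 250]
print(lossy_round_trip(samples))  # -> [0, 16, 32, 240]
```

After the lossy step there is no way to recover the original values 3, 18, 35 and 250, which is why lossy methods are unsuitable for documents and programs.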

Zipping

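Zipping is the process of compressing data files. It is described here using the LZW algorithm, which reduces the memory or storage required by large files. (Most current zip tools use a related method called DEFLATE, but the principle of replacing repeated patterns is the same.)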

The LZW algorithm looks for repeating patterns in the data being compressed and replaces each repeated pattern with a single, shorter code (in the simplest picture, a single character). For example, in the first two sets of paragraphs and titles on this page there are 10 instances of ‘ss’; if each were replaced by *, this would save 10 characters.
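The idea can be sketched in Python as a basic textbook form of LZW. This is a minimal sketch, not the code used by any particular zip program: the function names are chosen for illustration, the dictionary is allowed to grow without limit, and the input is assumed to contain only characters with code points below 256.

```python
def lzw_compress(text):
    """Return a list of integer codes for `text` using basic LZW."""
    # Start with one dictionary entry for every single character (0-255).
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending the pattern while it is already known.
            current = candidate
        else:
            # Output the code for the known pattern, then learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original text from the list of LZW codes."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the pattern being built.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


message = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
print(len(message), len(codes))        # -> 24 16
print(lzw_decompress(codes) == message)  # -> True
```

Here 24 characters are represented by 16 codes, because repeated patterns such as "TO" and "TOB" are replaced by single dictionary codes, and decompression restores the text exactly.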

File Formats

All files written to a storage medium must have a unique file name. The first part of the file name will be determined by the user and will help them identify the file at a later date; the second part (the file extension) will help identify that file as being a certain file type. Some of the more common file extensions are referenced below.

txt – files which are presented as lines of electronic text. These files are readable to the human eye but contain no formatting information and may often be used to store information which requires further processing by another application. Some operating systems will place an EOF (End of file character) after the last line in the text file to denote the end of the txt file.

wav – a file format widely used for professional recording and editing. These files use a process known as sampling to store a digital representation of a recorded analogue sound signal. During sampling, the amplitude of the incoming waveform is measured at very short, regular intervals and each value is recorded (see diagram below). It is the angular waveform shown below that is played back to the listener. The quality of the sound depends on how often samples are taken (the sampling rate) and the number of bits used to store the digital value of each sample (the sample resolution).

Generally a wav file is uncompressed, although the format can sometimes also hold compressed audio. A header in the file indicates whether the data is compressed or uncompressed.
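How sampling rate and sample resolution affect an uncompressed recording can be shown with a short Python sketch. The function names are hypothetical, the figures (44,100 samples per second, 16 bits, two channels) are example values only, and the size calculation ignores the file header.

```python
import math


def sample_sine(frequency_hz, sample_rate_hz, bits_per_sample, count):
    """Sample a sine wave and store each amplitude as a whole number."""
    # Largest value that fits in a signed sample of this resolution.
    max_level = 2 ** (bits_per_sample - 1) - 1
    return [
        round(max_level * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz))
        for n in range(count)
    ]


def uncompressed_audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Approximate size of the raw sample data, ignoring any file header."""
    bits = sample_rate_hz * bits_per_sample * channels * seconds
    return bits // 8


# Eight samples of a 440 Hz tone at 8,000 samples per second, 8 bits each.
print(sample_sine(440, 8000, 8, 8))

# One minute of two-channel, 16-bit audio at 44,100 samples per second.
print(uncompressed_audio_bytes(44100, 16, 2, 60))  # -> 10584000
```

Doubling either the sampling rate or the sample resolution doubles the amount of data, which is why higher quality recordings need more storage.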

bmp – a method of storing images in which the details of each pixel forming part of the overall image are held as a bit map in memory. In a bit map, a pattern of bits holds data relating to the state of an individual pixel in the image, including any text that appears in it. Because a bitmap image is built from tiny squares of colour arranged to produce the effect of a picture, it is a good method of reproducing ‘continuous tone’ images, such as photographs, and freehand drawings.
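Because every pixel is stored, the size of an uncompressed bitmap can be estimated from its dimensions and the number of bits per pixel. The Python sketch below uses a hypothetical function name and example dimensions, and it ignores the file header and any padding a real bmp file adds.

```python
def bitmap_pixel_bytes(width, height, bits_per_pixel):
    """Approximate bytes needed for the pixel data of an uncompressed bitmap."""
    total_bits = width * height * bits_per_pixel
    # Round up to a whole number of bytes.
    return (total_bits + 7) // 8


# An 800 x 600 image with 24 bits (3 bytes) of colour per pixel.
print(bitmap_pixel_bytes(800, 600, 24))  # -> 1440000

# The same image in black and white, 1 bit per pixel.
print(bitmap_pixel_bytes(800, 600, 1))   # -> 60000
```

This is why bitmap files are large compared with compressed formats holding the same picture.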

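jpeg – a lossy compressed image format. Simplified accounts say that jpegs identify arrangements of pixels which are repeated elsewhere in the image so that this data need only be saved once. More precisely, JPEG works on small blocks of pixels, discards fine detail that the eye is unlikely to notice, and then encodes the remaining data compactly so that storage space is not taken up unnecessarily.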

mpeg – mpegs use a method called delta compression to record and transmit the data representing audio and video files. It works by sending only what has changed since the last recording or transmission of data. For example, in the transmission of TV signals where 25 frames are transmitted per second, a full frame (picture) is sent only occasionally; in between, the data sent relates only to changes from the full frame.
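The delta compression idea described for mpeg can be sketched in Python by treating a frame as a simple list of pixel values. This is a deliberately simplified illustration with hypothetical function names; real MPEG encoders are far more sophisticated, and both frames are assumed to be the same length.

```python
def frame_delta(previous, current):
    """Record only the pixels that differ from the previous frame."""
    return {
        index: new
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def apply_delta(previous, delta):
    """Rebuild the next frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in delta.items():
        frame[index] = value
    return frame


full_frame = [10, 10, 10, 10, 10, 10]
next_frame = [10, 10, 99, 10, 10, 12]

delta = frame_delta(full_frame, next_frame)
print(delta)                                        # -> {2: 99, 5: 12}
print(apply_delta(full_frame, delta) == next_frame)  # -> True
```

Only two values need to be sent instead of six, and the saving grows when little changes between frames.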

gif – a method developed to support the compression and storage of images using bit-mapped data; a simple animated version is also available. Eight bits are used to represent the data for each pixel, so only 256 distinct colours can be represented, which helps to minimize file size.
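The 256-colour limit follows from the eight bits per pixel: 2 to the power 8 is 256. One way to picture this is a colour table (palette), with each pixel stored as a small index into the table. The Python sketch below illustrates that idea only; the function name is hypothetical and it is not a GIF encoder.

```python
def to_indexed(pixels, bits_per_pixel=8):
    """Replace each (red, green, blue) pixel with an index into a palette."""
    max_colours = 2 ** bits_per_pixel  # 8 bits -> 256 possible index values
    palette = []
    lookup = {}
    indices = []
    for colour in pixels:
        if colour not in lookup:
            if len(palette) >= max_colours:
                raise ValueError("too many distinct colours for this bit depth")
            lookup[colour] = len(palette)
            palette.append(colour)
        indices.append(lookup[colour])
    return palette, indices


pixels = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (255, 0, 0)]
palette, indices = to_indexed(pixels)
print(2 ** 8)    # -> 256
print(palette)   # -> [(255, 0, 0), (0, 0, 255)]
print(indices)   # -> [0, 1, 0, 0]
```

Each pixel now needs one byte for its index rather than three bytes for a full colour value.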

Possible Exam Questions

1. Why is the ability to compress files important? (4)

More space is available on the storage device.

Files can be transferred more quickly across the internet or by e-mail.

2. Explain how the LZW algorithm is used to compress data. (4)

The LZW algorithm looks for repeating patterns in the data being compressed and will then replace these repeating patterns with a single character.

3. An image can be stored on a PC using bit-mapped and jpeg file formats.

a. Identify two characteristics of bit-mapped images. [2]

High quality image made up from pixels.

Become pixelated when enlarged.

b. i. Describe how file size of the same image stored in jpeg format might differ from its bit-mapped version. [1]

It will have a smaller file size and be of a slightly lower quality.

ii. Give one reason for this difference in file size. [1]

Jpegs use lossy compression, which discards some image data, so the file is smaller and of lower quality.

4. A professional photographer is importing digital images to a PC for detailed editing.

a. Which image file type would be most appropriate for this task? [1]

Bitmap images would be of a higher quality for detailed editing by the photographer.

b. Give an explanation for your choice of file type in this instance. [2]

The image will have a larger file size and be of higher quality, so it is more suitable for the purpose than a lower-quality compressed jpeg.

5. Images can be stored using data compression.

Describe one advantage and one disadvantage of using data compression for an image inserted in a web page.

Advantage

The image will have a smaller file size and so will be downloaded by viewers in less time, improving the user experience.

Small file size; reduced transfer time (2 × [1]) [2]

Disadvantage [4]

Compression and decompression … increase processing time (2 × [1])

May be ‘lossy’ … resulting in loss of image quality (2 × [1])
