1.3.1 Compression, Encryption & Hashing

What you need to know

(a) Lossy vs Lossless compression.

(b) Run length encoding and dictionary coding for lossless compression.

(c) Symmetric and asymmetric encryption.

(d) Different uses of hashing.

Lossy Vs Lossless Compression

What is Compression

File compression: reducing the size of a file so it occupies less space on a storage medium. To compress a file you can use either utility software or an application that provides compressed output formats (e.g. Photoshop can convert .png to .jpg).

Advantages:

Compressed files take up less space, so you can store more files on a given storage device

Smaller files are faster to send/receive, particularly when transmitting across a low bandwidth network

Disadvantages:

Processing resources are required to decompress files

Often dependent on specific software when receiving/archiving (e.g. WinRAR to unzip .rar files)

Compression software reduces the size of files using either lossy or lossless compression.

Image from BBC Bitesize - Which image represents Lossy and Lossless compression? Why?

Lossy compression:

With lossy compression the quality is affected because data is lost; the final quality depends on the level of compression.
Lossy compression takes a series of close-together values and approximates them to a single value (e.g. multiple similar pixels in an image become one colour).
Lossy compression is used when the original quality is unnecessary or when bandwidth/disk space is the main concern.
Lossy compression achieves a better compression ratio than lossless compression because more data is discarded (e.g. JPEG files use lossy compression to let the user compromise between size and quality).
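
The idea of approximating close-together values can be sketched in a few lines of Python. This is only an illustration, not how JPEG actually works: the function name `quantise` and the step size of 8 are assumptions made for the example.

```python
def quantise(pixels, step):
    """Round each value down to the nearest multiple of `step`.

    Similar values collapse to one value, so detail is lost for good.
    """
    return [(p // step) * step for p in pixels]


# Eight slightly different brightness values (made-up sample data)
original = [100, 101, 102, 103, 200, 201, 203, 204]
approximated = quantise(original, 8)

print(approximated)  # [96, 96, 96, 96, 200, 200, 200, 200]
```

After this step only two distinct values remain, which compresses far better, but the original eight values cannot be recovered.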

Video: OCR A Level (H046-H446) Lossy vs lossless

Craig 'n' Dave explain lossy and lossless compression

Lossless compression:

With lossless compression the quality is unaffected because no data is lost; decompression produces an exact duplicate of the original
All the original data is readable after decompression
Lossless compression is used when the data being archived is important (e.g. documents such as bank statements, important raw photos, program files, etc.)

Lossless compression uses algorithms to compress files. Two common lossless algorithms are RLE (Run Length Encoding) and Dictionary Compression.

Run Length Encoding (RLE)

Run length encoding (RLE) is an example of a compression algorithm that converts runs of consecutive identical values into a code. This code consists of the repeated value and the number of times it is repeated.

The computer stores binary value 1 for white and binary value 0 for black for each row of the image.
The first row in the image (figure 1) can be represented as 2 0, 5 1, 1 0. This code represents 2 black pixels, 5 white pixels and 1 black pixel.
Similarly, the second row in the image (figure 1) is represented as 1 1, 6 0, 1 1.
In each pair, the first number is the run length and the second is the data value to be repeated.
This type of coding is not efficient if the file does not have many runs. In some cases, the file size may increase instead of getting smaller.
RLE is therefore best suited to simple images with large areas of the same colour.

Figure 1.
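
The rows described above can be encoded with a short Python sketch. The function names `rle_encode` and `rle_decode` are made up for this example, and each row is assumed to be a string of "0" (black) and "1" (white) characters.

```python
def rle_encode(row):
    """Return a list of (run_length, value) pairs for a row of pixels."""
    runs = []
    for value in row:
        if runs and runs[-1][1] == value:
            # Same value as the previous pixel: extend the current run
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            # Different value: start a new run of length 1
            runs.append((1, value))
    return runs


def rle_decode(runs):
    """Rebuild the original row from (run_length, value) pairs."""
    return "".join(value * length for length, value in runs)


print(rle_encode("00111110"))  # [(2, '0'), (5, '1'), (1, '0')]
print(rle_encode("10000001"))  # [(1, '1'), (6, '0'), (1, '1')]
print(rle_decode(rle_encode("00111110")))  # 00111110
```

A row with no runs, such as "01010101", encodes to eight pairs (sixteen numbers) for eight pixels, which shows how RLE can make a file bigger.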

Dictionary Compression

Another method used for lossless compression is called dictionary compression. A ‘dictionary’ is created of repeated code. Each piece of repeated code is given a key and placed in the dictionary. The key then replaces all instances of that code within the file.

Old MacDonald had a farm E-I-E-I-O And on his farm he had a cow E-I-E-I-O With a moo moo here And a moo moo there Here a moo, there a moo Everywhere a moo moo Old MacDonald had a farm E-I-E-I-O

Dictionary

1 = “E-I-E-I-O”

2 = “Old MacDonald had a farm”

3 = “ a moo moo ”

4 = “ a moo ”

2 1 And on his farm he had a cow 1 With3here And3there Here4, there4 Everywhere3 2 1
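
A minimal Python sketch of the same idea is shown below. It is a simplification under stated assumptions: the dictionary is written by hand rather than discovered automatically, the phrases are stored without their surrounding spaces, and keys are written as markers such as `<1>` (assuming the text never contains `<` or `>`). The names `compress` and `decompress` are made up for the example.

```python
def compress(text, dictionary):
    """Replace every dictionary phrase in the text with its key marker."""
    for key, phrase in dictionary.items():
        text = text.replace(phrase, f"<{key}>")
    return text


def decompress(text, dictionary):
    """Replace every key marker with the phrase it stands for."""
    for key, phrase in dictionary.items():
        text = text.replace(f"<{key}>", phrase)
    return text


# Longer phrases must come before shorter ones that they contain,
# so "a moo moo" is listed before "a moo".
dictionary = {
    1: "E-I-E-I-O",
    2: "Old MacDonald had a farm",
    3: "a moo moo",
    4: "a moo",
}

song = (
    "Old MacDonald had a farm E-I-E-I-O And on his farm he had a cow "
    "E-I-E-I-O With a moo moo here And a moo moo there Here a moo, "
    "there a moo Everywhere a moo moo Old MacDonald had a farm E-I-E-I-O"
)

packed = compress(song, dictionary)
print(packed)
print(len(song), "characters before,", len(packed), "after")
print(decompress(packed, dictionary) == song)  # True: nothing was lost
```

The saving only counts if the dictionary itself is small, because it has to be stored or sent along with the compressed text.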

Video: Compression (Crash Course Computer Science 21)

Task

Use the Internet to find the uses of the following file formats: JPEG, PNG, GIF, MP3 and MP4. You will also need to specify whether each file format is lossy or lossless.
Answer the remaining questions about compression.

To get a copy of the worksheet CLICK HERE (remember to make a copy if you do not have one in your Google Classroom)

Symmetric and Asymmetric Encryption

Encryption

The plaintext of a message is encrypted into equivalent ciphertext using a cipher algorithm and a key
When received, the ciphertext is decrypted back to plaintext using either the same key or a different key

Caesar cipher

The Caesar cipher is one of the most basic types of encryption and one of the least secure
Each letter of the alphabet is shifted by the same fixed amount
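
The shift can be sketched in Python as follows. The function name `caesar` is made up for this example, and it assumes the message uses the 26-letter English alphabet; anything else (spaces, punctuation) is left unchanged.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar(text, shift):
    """Shift every letter by `shift` places, wrapping round from Z to A."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)  # leave spaces and punctuation alone
    return "".join(shifted)


ciphertext = caesar("HELLO WORLD", 3)
print(ciphertext)              # KHOOR ZRUOG
print(caesar(ciphertext, -3))  # HELLO WORLD

# Only 25 useful shifts exist, so every possible key can simply be tried.
guesses = [caesar(ciphertext, -shift) for shift in range(1, 26)]
print("HELLO WORLD" in guesses)  # True
```

The last three lines show why the cipher is insecure: an attacker without the key only has to read through 25 candidate messages.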

Task

Print out the worksheet (CLICK HERE) and create your own Caesar Cipher - can you:

Encrypt your own message
Give the message to a friend and see if they can decipher it without a key (Was it easy to do? If so, why?)

Algorithmic security

Ciphers are based on computational security
Keys are determined using a computer algorithm
Any key generated by an algorithm can be unpicked given enough ciphertext, computing power and time (the exception is the one-time pad)

Symmetric encryption

Symmetric encryption is also known as private key encryption. The same key is used to encrypt and decrypt data, so the key must also be transferred to the recipient. As a result, the key can be intercepted as easily as the message, which is an obvious security issue.
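
The defining property, one key for both directions, can be illustrated with a toy XOR cipher in Python. This is a teaching sketch only and is not secure; the function name `xor_cipher`, the key and the message are all made up for the example.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed.

    Running the function again with the same key undoes it, so the
    same function and the same key both encrypt and decrypt.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


shared_key = b"secret"      # both sender and recipient must hold this
message = b"Meet at noon"

ciphertext = xor_cipher(message, shared_key)
print(ciphertext)                          # unreadable bytes
print(xor_cipher(ciphertext, shared_key))  # b'Meet at noon'
```

Anyone who intercepts `shared_key` can run the second call themselves, which is exactly the key distribution problem described above.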

Asymmetric encryption

Asymmetric encryption uses two separate but related keys. One key is made public so that others can use it to encrypt the data they wish to send to you. The public key cannot decrypt that data. A private key, known only to you, is used to decrypt it.
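
The relationship between the two keys can be sketched with a toy version of RSA, one well-known asymmetric algorithm (the page does not name a specific algorithm, so RSA is an assumption here). The primes below are far too small to be secure, and real systems add padding and use keys thousands of bits long. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
# Key generation with tiny, insecure textbook primes
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: 2753

public_key = (e, n)        # safe to publish
private_key = (d, n)       # known only to the owner


def encrypt(m, key):
    """Encrypt a number m (smaller than n) with the public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)


def decrypt(c, key):
    """Decrypt a number c with the private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)


message = 65
ciphertext = encrypt(message, public_key)
print(ciphertext)                        # 2790
print(decrypt(ciphertext, private_key))  # 65
```

Only the public key ever needs to travel across the network, so there is no shared secret for an eavesdropper to intercept.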

Video: The Internet: Encryption & Public Keys

Task

Compare the following websites by putting the same message through their encryption mechanisms.

Comment on the differences.

a) [Simple] http://encryption.online-toolz.com/tools/text-encryption-decryption.php

b) [More sophisticated:] https://www.infoencrypt.com

On one of these websites the encoded message is always the same, but on the other it changes every time. Look at the help section of the site to find out how this effect is achieved.

Create a report using Google Docs to explain what encryption is and the difference between symmetric and asymmetric encryption, giving examples in your answer. (Remember to hand your report in via Google Docs.) Teacher Presentation (CLICK HERE)

Hashing

Hashing is the process of changing plain text or a key into a hashed value by applying a hash function. The output has a fixed length, which is usually shorter than the input. Hashing is a one-way process: unlike encryption, there is no key that turns a hash value back into the original plain text. Hashing is used, for example, to store passwords without keeping the plain text and to check that information shared between two parties has not been altered.

Some Common Hashing Algorithms

The following are the most used hashing functions:

1) Message Digest (MD5)

MD5 generates a 128-bit output for an input of any length (no matter how long your message is, the MD5 output is always 128 bits, usually written as 32 hexadecimal characters). Though still widely used, MD5 is no longer considered secure. The main issues are lookup databases (MD5 is so old that databases now store the hash outputs of common inputs) and collisions (two different inputs can produce an identical hash value).
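
A short Python sketch using the standard library `hashlib` module shows the fixed output length. The helper name `md5_hex` is made up for this example.

```python
import hashlib


def md5_hex(message: str) -> str:
    """Return the MD5 hash of a message as 32 hexadecimal characters."""
    return hashlib.md5(message.encode("utf-8")).hexdigest()


digest = md5_hex("hello")
print(digest)       # 5d41402abc4b2a76b9719d911017c592
print(len(digest))  # 32 hex characters = 128 bits

# A much longer input still gives a 32-character output
print(len(md5_hex("hello" * 1000)))  # 32
```

Because "hello" is such a common input, its hash is exactly the kind of value that online lookup databases already hold.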

Task

Using the MD5 Python code above, create your own MD5 hashed message. Then, using Google, can you reverse the MD5 hash? Try https://md5.gromweb.com/

2) SHA - Secure Hashing Algorithm

SHA stands for Secure Hashing Algorithm; it was first developed by the US National Security Agency (NSA). The algorithm has been updated repeatedly to fix security flaws in earlier versions.

SHA-2 is now used by many organisations. SHA-2 is a family of six cryptographic hash functions: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224 and SHA-512/256.
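
Four of the SHA-2 functions are always available in Python's standard `hashlib` module, so their output sizes can be compared with a short sketch. The helper name `sha2_digest_bits` is made up for this example; SHA-512/224 and SHA-512/256 are left out because their availability depends on how Python was built.

```python
import hashlib

SHA2_NAMES = ("sha224", "sha256", "sha384", "sha512")


def sha2_digest_bits(message: str) -> dict:
    """Return the output size in bits of each SHA-2 function for a message."""
    data = message.encode("utf-8")
    return {name: len(hashlib.new(name, data).digest()) * 8 for name in SHA2_NAMES}


print(sha2_digest_bits("hello"))
# {'sha224': 224, 'sha256': 256, 'sha384': 384, 'sha512': 512}

# Changing one letter of the input gives a completely different hash
print(hashlib.sha256(b"hello").hexdigest())
print(hashlib.sha256(b"hellp").hexdigest())
```

The number in each function's name is its output size in bits, so every SHA-2 hash is longer than MD5's 128 bits.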

Video: SHA: Secure Hashing Algorithm - Computerphile

Secure Hashing Algorithm (SHA1) explained. Dr Mike Pound explains how files are used to generate seemingly random hash strings. YouTube channel: Computerphile https://www.youtube.com/channel/UC9-y-6csu5WGm29I7JiwpnA

PAST PAPER QUESTIONS

Try to answer the past paper exam questions. You can write your answers on paper or print out the exam paper. The mark scheme is provided at the end of the paper. (Try not to look at the answers before attempting all the questions.)

Website_questions_3.1.1_Compression_Encryption_ampampamp_Hashing.pdf
