Remember!
Computers do not understand words or decimal numbers and they do not understand natural languages such as English. Every data item and every instruction must be converted into binary digits (bits) to be understood by the machine.
Bit = 0 or 1. The word bit is a contraction of binary digit.
Byte = the number of bits used to represent letters, numbers, and special characters such as $, #, /, and &. One byte is equal to 8 bits.
Word = the number of bytes the CPU can process at one time.
A number of coding schemes have been developed to translate characters into a series of bits.
Taken together these bits form a byte, so that one character is stored as a single byte in memory.
THE BYTE, NIBBLE, AND WORD
Bytes
Most microcomputers handle and store binary data and information in groups of eight bits, so a special name is given to a string of eight bits: it is called a byte. A byte always consists of eight bits, and it can represent any of numerous types of data or information.
Nibbles
Binary numbers are often broken down into groups of four bits, as we have seen with BCD codes and hexadecimal number conversions. In the early days of digital systems, a term caught on to describe a group of four bits. Because it is half as big as a byte, it was named a nibble.
Words
Bits, nibbles, and bytes are terms that represent a fixed number of binary digits. As systems have grown over the years, their capacity (appetite?) for handling binary data has also grown. A word is a group of bits that represents a certain unit of information. The size of the word depends on the size of the data pathway in the system that uses the information. The word size can be defined as the number of bits in the binary word that a digital system operates on. For example, the computer in your microwave oven can probably handle only one byte at a time. It has a word size of eight bits. On the other hand, the personal computer on your desk can handle eight bytes at a time, so it has a word size of 64 bits.
DoubleWords
A doubleword is simply double the size of the system's word. A system with a 4-byte word has an 8-byte doubleword.
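As a quick check on these sizes, here is a minimal Python sketch (illustrative only; the 64-bit word size is an assumption matching the desktop PC mentioned above) that prints how many distinct patterns each unit can hold, since n bits give 2^n combinations:

# How many distinct bit patterns fit in each storage unit?
units = {
    "bit": 1,
    "nibble": 4,
    "byte": 8,
    "word (64-bit PC)": 64,
    "doubleword (64-bit PC)": 128,
}
for name, bits in units.items():
    print(f"{name:24} {bits:>3} bits -> {2**bits:,} possible values")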
ALPHANUMERIC CODES
In addition to numerical data, a computer must be able to handle nonnumerical information. In other words, a computer should recognize codes that represent letters of the alphabet, punctuation marks, and other special characters as well as numbers. These codes are called alphanumeric codes. A complete alphanumeric code would include the 26 lowercase letters, 26 uppercase letters, 10 numeric digits, 7 punctuation marks, and anywhere from 20 to 40 other characters, such as ., /, #, %, *, and so on. We can say that an alphanumeric code represents all of the various characters and functions that are found on a computer keyboard.
ASCII Code
The most widely used alphanumeric code is the American Standard Code for Information Interchange (ASCII). The ASCII (pronounced "askee") code is a seven-bit code, and so it has 2^7 = 128 possible code groups. This is more than enough to represent all of the standard keyboard characters as well as control functions such as RETURN and LINEFEED. The table below shows a listing of the standard seven-bit ASCII code. The table gives the hexadecimal and decimal equivalents. The seven-bit binary code for each character can be obtained by converting the hex value to binary.
Many applications of computers require the processing of data which contains numbers, letters, and other symbols such as punctuation marks. In order to transmit such alphanumeric data to or from a computer or store it internally in a computer, each symbol must be represented by a binary code. One common alphanumeric code, which you learned in the Programming Course, is the ASCII code (American Standard Code for Information Interchange). This is a 7-bit code, so 2^7 (128) different code combinations are available to represent letters, numbers, and other symbols.
The tables below show the ASCII code in hex and in binary.
The word “Start” is represented in ASCII code as follows:
1010011 1110100 1100001 1110010 1110100
S t a r t
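The same conversion can be reproduced with a few lines of Python (a sketch added for illustration, not part of the original example):

# Print the 7-bit ASCII code of each character in "Start".
word = "Start"
print(" ".join(format(ord(ch), "07b") for ch in word))
# Output: 1010011 1110100 1100001 1110010 1110100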
The ASCII code is used for the transfer of alphanumeric information between a computer and the external devices such as a printer or another computer. A computer also uses ASCII internally to store the information that an operator types in at the computer’s keyboard. The following example illustrates this.
An operator is typing in a C language program at the keyboard of a certain microcomputer. The computer converts each keystroke into its ASCII code and stores the code as a byte in memory. Determine the binary strings that will be entered into memory when the operator types in the following C statement:
if (x>3)
Locate each character (including the space) in the ASCII table and record its ASCII code (hex and binary).
i 69 0110 1001
f 66 0110 0110
space 20 0010 0000
( 28 0010 1000
x 78 0111 1000
> 3E 0011 1110
3 33 0011 0011
) 29 0010 1001
Note that a 0 (zero) was added to the leftmost bit of each ASCII code because the codes must be stored as bytes (eight bits). This adding of an extra bit is called padding with 0s.
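A short Python sketch of the same padding step (illustrative only):

# Store each keystroke of the C statement as a padded 8-bit byte.
statement = "if (x>3)"
for ch in statement:
    code = ord(ch)  # 7-bit ASCII value of the keystroke
    print(f"{ch!r:9} {code:02X}  {code:08b}")  # hex and padded 8-bit binary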
More info on ASCII at http://www.asciitable.com/
EBCDIC (Extended Binary Coded Decimal Interchange Code)
Another character coding scheme is EBCDIC, shown below.
Computer professionals often prefer ASCII because the pattern is easier to understand. Also, because of the mathematical ordering of numbers and letters, tasks such as sorting and alphabetizing are possible, since computers are really comparing numbers rather than letters. Because both ASCII and EBCDIC were based on English, their value as a worldwide coding system has decreased dramatically over time. Standard 7-bit ASCII allows at most 128 different codes (256 in extended 8-bit ASCII), which is barely sufficient to encode English and most European languages. But what about languages such as Chinese, Japanese, and Hebrew, whose character sets contain far more than 256 symbols? How do computers handle these languages, and what sort of code is needed?
Unicode
The world of computers is not limited to European languages. In 1980 the Chinese Character Code for Information Interchange (CCCII) was created in Taiwan to code characters from Chinese, Taiwanese and Japanese.
Interesting reading http://unicode.org/
Unicode provides a single code for every character regardless of the natural language from which it comes. This is critical in using the Web. Unicode can code information in two directions so it can deal with languages like Hebrew and Arabic. Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language. The Unicode Standard has been adopted by such industry leaders as Apple, HP, IBM, JustSystems, Microsoft, Oracle, SAP, Sun, Sybase, Unisys and many others. Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. The emergence of the Unicode Standard, and the availability of tools supporting it, are among the most significant recent global software technology trends.
Incorporating Unicode into client-server or multi-tiered applications and websites offers significant cost savings over the use of legacy character sets. Unicode enables a single software product or a single website to be targeted across multiple platforms, languages and countries without re-engineering. It allows data to be transported through many different systems without corruption.
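As a small illustration (a sketch added here, not taken from the standard itself), Python exposes these unique numbers, called code points, directly, and a string can be turned into bytes with UTF-8, one of the Unicode encodings:

# One code point per character, regardless of the language it comes from.
for ch in ("A", "é", "中", "א"):
    print(ch, "U+%04X" % ord(ch), ch.encode("utf-8").hex())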
Errors Happen
- When bytes/characters are sent from place to place, transmission errors can happen.
- Computers are only as reliable as the data and information they contain
- Garbage In = Garbage Out (GIGO)
PARITY METHOD FOR ERROR DETECTION
The movement of binary data and codes from one location to another is the most frequent operation performed in digital systems. Here are just a few examples:
■ The transmission of digitized voice over a microwave link
■ The storage of data in and retrieval of data from external memory devices such as magnetic and optical disk
■ The transmission of digital data from a computer to a remote computer over telephone lines (i.e., using a modem).
This is one of the major ways of sending and receiving information on the Internet.
Whenever information is transmitted from one device (the transmitter) to another device (the receiver), there is a possibility that errors can occur such that the receiver does not receive the identical information that was sent by the transmitter. The major cause of any transmission errors is electrical noise, which consists of spurious fluctuations in voltage or current that are present in all electronic systems to varying degrees. The Figure below is a simple illustration of a type of transmission error.
The transmitter sends a relatively noise-free serial digital signal over a signal line to a receiver. However, by the time the signal reaches the receiver, it contains a certain degree of noise superimposed on the original signal. Occasionally, the noise is large enough in amplitude that it will alter the logic level of the signal, as it does at point x. When this occurs, the receiver may incorrectly interpret that bit as a logic 1, which is not what the transmitter has sent.
Most modern digital equipment is designed to be relatively error-free, and the probability of errors such as the one shown in Figure above is very low. However, we must realize that digital systems often transmit thousands, even millions, of bits per second, so that even a very low rate of occurrence of errors can produce an occasional error that might prove to be bothersome, if not disastrous. For this reason, many digital systems employ some method for detection (and sometimes correction) of errors. One of the simplest and most widely used schemes for error detection is the parity method.
Parity Bit or Check Bit
A parity bit is an extra bit that is attached to a code group that is being transferred from one location to another. The parity bit is made either 0 or 1, depending on the number of 1s that are contained in the code group. Two different methods are used. In the even-parity method, the value of the parity bit is chosen so that the total number of 1s in the code group (including the parity bit) is an even number. For example, suppose that the group is 1000011. This is the ASCII character C. The code group has three 1s. Therefore, we will add a parity bit of 1 to make the total number of 1s an even number. The new code group, including the parity bit, thus becomes
11000011.
If the code group contains an even number of 1s to begin with, the parity bit is given a value of 0. For example, if the code group were 1000001 (the ASCII code for A), the assigned parity bit would be 0, so that the new code, including the parity bit, would be
01000001.
The odd-parity method is used in exactly the same way except that the parity bit is chosen so the total number of 1s (including the parity bit) is an odd number. For example, for the code group 1000001, the assigned parity bit would be a 1. For the code group 1000011, the parity bit would be a 0.
Regardless of whether even parity or odd parity is used, the parity bit becomes an actual part of the code word. For example, adding a parity bit to the seven-bit ASCII code produces an eight-bit code. Thus, the parity bit is treated just like any other bit in the code.
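The rule for choosing the parity bit can be written compactly; the following Python sketch (the helper name is ours, not from the text) attaches the parity bit as the MSB of a seven-bit ASCII code:

def add_parity(code7, even=True):
    """Return an 8-bit code: parity bit (MSB) followed by the 7-bit code."""
    ones = bin(code7).count("1")
    parity = ones % 2 if even else (ones + 1) % 2
    return (parity << 7) | code7

print(format(add_parity(0b1000011, even=True), "08b"))  # 'C' -> 11000011
print(format(add_parity(0b1000001, even=True), "08b"))  # 'A' -> 01000001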
The parity bit is used to detect any single-bit errors that occur during the transmission of a code from one location to another. For example, suppose that the character "A" is being transmitted and odd parity is being used. The transmitted code would be
1 1 0 0 0 0 0 1
When the receiver circuit receives this code, it will check that the code contains an odd number of 1s (including the parity bit). If so, the receiver will assume that the code has been correctly received. Now, suppose that because of some noise or malfunction the receiver actually receives the following code:
1 1 0 0 0 0 0 0
The receiver will find that this code has an even number of 1s. This tells the receiver that there must be an error in the code because presumably the transmitter and receiver have agreed to use odd parity. There is no way, however, that the receiver can tell which bit is in error because it does not know what the code is supposed to be.
It should be apparent that this parity method would not work if two bits were in error, because two errors would not change the “oddness” or “evenness” of the number of 1s in the code. In practice, the parity method is used only in situations where the probability of a single error is very low and the probability of double errors is essentially zero.
When the parity method is being used, the transmitter and the receiver must agree, in advance, on whether odd or even parity is being used. There is no advantage of one over the other, although even parity seems to be used more often. The transmitter must attach an appropriate parity bit to each unit of information that it transmits. For example, if the transmitter is sending ASCII-coded data, it will attach a parity bit to each seven-bit ASCII code group. When the receiver examines the data that it has received from the transmitter, it checks each code group to see that the total number of 1s (including the parity bit) is consistent with the agreed-upon type of parity. This is often called checking the parity of the data. In the event that it detects an error, the receiver may send a message back to the transmitter asking it to retransmit the last set of data. The exact procedure that is followed when an error is detected depends on the particular system.
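On the receiving side the check is just a count of 1s; a minimal sketch (the function name is ours):

def parity_ok(byte, even=True):
    # Check whether an 8-bit code (parity bit included) has the agreed parity.
    ones = bin(byte).count("1")
    return (ones % 2 == 0) if even else (ones % 2 == 1)

# 'A' sent with odd parity as 11000001; then the same byte with one bit flipped:
print(parity_ok(0b11000001, even=False))  # True  -> accepted
print(parity_ok(0b11000000, even=False))  # False -> error detected, ask for retransmission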
When one computer is transmitting a message to another, the information is usually encoded in ASCII. What actual bit strings would a computer transmit to send the message HELLO, using ASCII with even parity?
First, look up the ASCII codes for each character in the message. Then for each code, count the number of 1s. If it is an even number, attach a 0 as the MSB. If it is an odd number, attach a 1. Thus, the resulting eight-bit codes (bytes) will all have an even number of 1s (including parity).
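A sketch of that procedure in Python (added for illustration):

# HELLO in ASCII with an even-parity bit attached as the MSB of each byte.
for ch in "HELLO":
    code = ord(ch)                     # 7-bit ASCII value
    parity = bin(code).count("1") % 2  # 1 if the count of 1s is odd
    print(ch, format((parity << 7) | code, "08b"))
# Output: H 01001000, E 11000101, L 11001100, L 11001100, O 11001111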
Let’s look at doing error checking on a telephone number
Using ASCII to translate a phone number 645 3180 into bits (binary digits) would give the following:
6 --> 011 0110
4 --> 011 0100
5 --> 011 0101
3 --> 011 0011
1 --> 011 0001
8 --> 011 1000
0 --> 011 0000
Using odd parity, remember, the computer counts the number of 1s in the list of bits. If the number of 1-bits is already odd, a 0 is placed in the leftmost position, keeping the number of 1-bits odd. If the number of 1-bits is even, a 1 is placed in front of the number, making the number of 1-bits odd. The receiving computer checks the transmission and then removes (strips off) the extra bit before using the remaining bits as data. The above binary translation for 645 3180, with the odd-parity bit attached, would look like the following:
6 --> 1011 0110
4 --> 0011 0100
5 --> 1011 0101
3 --> 1011 0011
1 --> 0011 0001
8 --> 0011 1000
0 --> 1011 0000
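The same odd-parity construction, together with the receiver stripping off the extra bit, can be sketched in Python (illustrative only):

# Attach an odd-parity bit (MSB) to each digit of the phone number, then strip it.
for digit in "6453180":
    code = ord(digit)                        # 7-bit ASCII code of the digit
    parity = (bin(code).count("1") + 1) % 2  # make the total number of 1s odd
    sent = (parity << 7) | code
    data = sent & 0b01111111                 # receiver strips off the parity bit
    print(digit, format(sent, "08b"), "->", format(data, "07b"))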
SUMMARY: Parity Bit
Homework-03: (1 week deadline)
2-20. How many bits are required to represent the decimal numbers in the range from 0 to 999 using
(a) straight binary code? (b) Using BCD code?
2-21. The following numbers are in BCD. Convert them to decimal.
(a) 1001011101010010 (d) 0111011101110101
(b) 000110000100 (e) 010010010010
(c) 011010010101 (f) 010101010101
2-22.*(a) How many bits are contained in eight bytes?
(b) What is the largest hex number that can be represented in four bytes?
(c) What is the largest BCD-encoded decimal value that can be represented in three bytes?
2-23. (a) What is the most significant nibble of the ASCII code for the letter X?
(b) How many nibbles can be stored in a 16-bit word?
(c) How many bytes does it take to make up a 24-bit word?
2-24. Represent the statement “ ” in ASCII code. Attach an odd parity bit.
2-25.*Attach an even-parity bit to each of the ASCII codes for Problem 2-24, and give the results in hex.
2-26. The following bytes (shown in hex) represent a person’s name as it would be stored in a computer’s memory. Each byte is a padded ASCII code. Determine the name of each person.
(a) 42 45 4E 20 53 4D 49 54 48
(b) 4A 6F 65 20 47 72 65 65 6E
2-27. Convert the following decimal numbers to BCD code and then attach an odd parity bit.
(a) 74 (c) 8884 (e)* 65
(b) 38 (d) 275 (f) 9201
2-28.*In a certain digital system, the decimal numbers from 000 through 999 are represented in BCD code. An odd-parity bit is also included at the end of each code group. Examine each of the code groups below, and assume that each one has just been transferred from one location to another. Some of the groups contain errors. Assume that no more than two errors have occurred for each group. Determine which of the code groups have a single error and which of them definitely have a double error. (Hint: Remember that this is a BCD code, and that the parity bit is the last bit, shown separated at the end of each group.)
(a) 100101011000 0 parity bit
(b) 010001110110 0
(c) 011111000001 1
(d) 100001100010 1
2-37.*In a microcomputer, the addresses of memory locations are binary numbers that identify each memory circuit where a byte is stored. The number of bits that make up an address depends on how many memory locations there are. Since the number of bits can be very large, the addresses are often specified in hex instead of binary.
(a) If a microcomputer uses a 20-bit address, how many different memory locations are there?
(b) How many hex digits are needed to represent the address of a memory location?
(c) What is the hex address of the 256th memory location? (Note: The first address is always 0.)
2-38. In an audio CD, the audio voltage signal is typically sampled about 44,000 times per second, and the value of each sample is recorded on the CD surface as a binary number. In other words, each recorded binary number represents a single voltage point on the audio signal waveform.
(a) If the binary numbers are six bits in length, how many different voltage values can be represented by a single binary number?
Repeat for eight bits and ten bits.
(b) If ten-bit numbers are used, how many bits will be recorded on the CD in 1 second?
(c) If a CD can typically store 5 billion bits, how many seconds of audio can be recorded when ten-bit numbers are used?
2-39. A black-and-white digital camera lays a fine grid over an image and then measures and records a binary number representing the level of gray it sees in each cell of the grid. For example, if four-bit numbers are used, the value of black is set to 0000 and the value of white to 1111, and any level of gray is somewhere between 0000 and 1111. If six-bit numbers are used, black is 000000, white is 111111, and all grays are between the two. Suppose we wanted to distinguish among 254 different levels of gray within each cell of the grid. How many bits would we need to use to represent these levels?
2-40. A 3-Megapixel digital camera stores an eight-bit number for the brightness of each of the primary colors (red, green, blue) found in
each picture element (pixel). If every bit is stored (no data compression), how many pictures can be stored on a 128-Megabyte memory
card? (Note: In digital systems, Mega means 2^20.)