Internal Representations

How does the computer store different kinds of information?

Bits, bytes, & words:

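The smallest unit of information a computer can store is a bit, represented as a 0 or a 1. In hardware the 0 and 1 values correspond to two different voltage levels. A byte is a collection of 8 bits, and can be used for instance to represent a single character, such as 'A'. A word is the unit of data a particular machine handles at once, so its size depends on the machine. In these notes a word is two bytes, for a total of 16 bits, though 32-bit and 64-bit words are common on current hardware.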

Number Bases

  • Base 10 (decimal) uses digits 0 through 9 and is what we're used to.

    • 456 = (4 * 100) + (5 * 10) + (6 * 1).

    • Note the powers of ten (100, 10, 1) used in each place.

  • Base 2 (binary) uses only digits 0 and 1. A binary number is sometimes prefaced with the '%' symbol.

    • %1011 = (1 * 8) + (0 * 4) + (1 * 2) + (1 * 1) = 8 + 0 + 2 + 1 = 11 (decimal).

    • Note the powers of two (8, 4, 2, 1) used in each place.

  • Base 16 (hexadecimal, or simply "hex") uses digits 0 through 9 and additionally symbols A,B,C,D,E,F to represent values 10 through 15. A hexadecimal number is sometimes prefaced with the '$' symbol.

    • $9A = (9 * 16) + (A * 1).

    • Remember that the symbol 'A' stands for the value 10, giving:

    • (9 * 16) + (10 * 1) = 144 + 10 = 154 (decimal).
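The place-value arithmetic above can be checked with Java's standard library. This is only a small sketch: the class name NumberBases and the helper parse are made up for illustration. Note that Java writes binary and hex literals with the prefixes 0b and 0x rather than the '%' and '$' used in these notes.

```java
class NumberBases {
    // Hypothetical helper: interpret a digit string in the given base.
    // Integer.parseInt does the place-value sum for us.
    static int parse(String digits, int base) {
        return Integer.parseInt(digits, base);
    }

    public static void main(String[] args) {
        System.out.println(0b1011);            // binary literal, prints 11
        System.out.println(0x9A);              // hex literal, prints 154
        System.out.println(parse("1011", 2));  // prints 11
        System.out.println(parse("9A", 16));   // prints 154
        System.out.println(parse("456", 10));  // prints 456
    }
}
```

All of these are the same kind of value inside the machine; the base only affects how the number is written down.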

Conversions between bases

      1. Binary to Decimal

          1. Let's say we want to convert %1010 to Decimal. Rewrite the binary number with the place values above each position as follows:

          2. 8 4 2 1
             1 0 1 0

          3. This gives us the place values, which are powers of two (starting with the right-most value of 2 to the 0th power, or simply 2^0 = 1), on the top line, with the number itself on the bottom line. Wherever there is a 1 on the bottom line, add the place value on the top line to your sum. This gives 8 + 2 = 10, which is the answer in decimal.
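The same procedure can be sketched in Java. The class and method names (BinaryToDecimal, toDecimal) are assumptions for illustration, and the binary number is passed as a string of '0' and '1' characters without the '%' prefix.

```java
class BinaryToDecimal {
    // Walk the bits from right to left, keeping track of the place value
    // (1, 2, 4, 8, ...) and adding it to the sum wherever the bit is 1.
    static int toDecimal(String bits) {
        int sum = 0;
        int placeValue = 1; // 2^0 for the right-most position
        for (int i = bits.length() - 1; i >= 0; i--) {
            char c = bits.charAt(i);
            if (c == '1') {
                sum += placeValue;
            } else if (c != '0') {
                throw new IllegalArgumentException("not a binary digit: " + c);
            }
            placeValue *= 2; // move one position to the left
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("1010")); // prints 10
    }
}
```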

      1. Decimal to Binary

          1. The process itself is simple, but trying to write it down makes it appear complicated. Let's say we want to convert 25 to binary. First write down binary place values (powers of 2) for approximately the size of number in question:

          2. 32 16  8  4  2  1
              _  _  _  _  _  _

          3. Note that we stopped writing down the powers of two once we got to 32, since 25 < 32. Now we need to figure out whether to put a 0 or 1 in each position. Starting from the left, write down a 1 if that place value is less than or equal to the number in question, in this case 25, and a 0 otherwise. Since 32 is greater than 25, we put down a 0 in that position. 16 is less than 25, so we put down a 1 there, giving:

          4. 32 16  8  4  2  1
              0  1  _  _  _  _

          5. We then subtract this place value from 25, giving: 25 - 16 = 9. We now continue using 9 as the number in question. Is there an "8" in 9? Yes, so we put down a 1 in that position, giving:

          6. 32 16  8  4  2  1
              0  1  1  _  _  _

          7. 9 - 8 gives 1. Both 4 and 2 are larger than 1, so we put down a 0 in both those positions, leaving us with a 1 in the last position for the final answer of:

          8. 32 16  8  4  2  1
              0  1  1  0  0  1

          9. So 25 = %11001.

          10. Note that the same approach is used when converting between decimal and hexadecimal, except you would use powers of 16 rather than powers of 2.
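Here is one way the subtract-the-place-value method might look in Java. It is an independent sketch, not a program from the course; the names DecimalToBinary and toBits are made up, and it assumes the input is non-negative. Unlike the hand method above, it starts at the largest place value that fits, so it does not produce the leading 0 under the 32.

```java
import java.util.Arrays;

class DecimalToBinary {
    // Returns the binary digits of n, most significant digit first.
    static int[] toBits(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // Find the largest power of two that is <= n (or 1 when n is 0).
        long place = 1;
        int width = 1;
        while (place * 2 <= n) {
            place *= 2;
            width++;
        }
        int[] bits = new int[width];
        for (int i = 0; i < width; i++) {
            if (place <= n) {   // the place value "fits" in what is left
                bits[i] = 1;
                n -= place;     // continue with the remainder
            }
            place /= 2;         // move one position to the right
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBits(25))); // prints [1, 1, 0, 0, 1]
    }
}
```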

      1. Sample Program

          1. See this sample program (in Java) that takes an integer and converts it to binary, stored as an array of 0's and 1's.

      1. Binary to Hexadecimal

        1. When converting between binary and hex the key is to note that each hex digit corresponds with 4 binary digits. Given the seven digits %1011110 we first add a leading zero so that the number of digits is evenly divisible by four. (You can add leading zeros without changing the value.) Then group by four starting from the right, giving us the groupings: %0101 and %1110. These respectively correspond to the hex digits $5 and $E, so %1011110 = $5E.

      1. Hexadecimal to Binary

          1. Again simply group each four binary digits to correspond to a single hex digit. For instance $7 = %0111, or simply %111, since leading zeros don't give any additional information. We also know that $B = %1011. This idea of grouping by four then shows us that $7B = %01111011.
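The grouping-by-four idea works in both directions, and can be sketched in Java as follows. The class and method names (HexGrouping, binaryToHex, hexToBinary) are assumptions for illustration, and the inputs are plain digit strings without the '%' or '$' prefixes.

```java
class HexGrouping {
    // Pad with leading zeros to a multiple of four digits, then
    // translate each group of four bits into one hex digit.
    static String binaryToHex(String bits) {
        StringBuilder padded = new StringBuilder(bits);
        while (padded.length() % 4 != 0) {
            padded.insert(0, '0');
        }
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < padded.length(); i += 4) {
            int value = Integer.parseInt(padded.substring(i, i + 4), 2);
            hex.append(Character.toUpperCase(Character.forDigit(value, 16)));
        }
        return hex.toString();
    }

    // Translate each hex digit into exactly four bits.
    static String hexToBinary(String hex) {
        StringBuilder bits = new StringBuilder();
        for (char c : hex.toCharArray()) {
            int value = Character.digit(c, 16);
            if (value < 0) {
                throw new IllegalArgumentException("not a hex digit: " + c);
            }
            // Adding 16 forces a five-digit result ("1xxxx"); dropping the
            // leading 1 leaves the four bits with their leading zeros intact.
            bits.append(Integer.toBinaryString(16 + value).substring(1));
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(binaryToHex("1011110")); // prints 5E
        System.out.println(hexToBinary("7B"));      // prints 01111011
    }
}
```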

Names of binary number formats

  1. Unsigned binary: This format means that all the bits are used to represent the magnitude of a number.

  2. Signed magnitude: In this format the left-most bit represents the sign of the number only, and the remaining bits give its magnitude. A 0 means the number is positive, and a 1 means the number is negative. For example, in 4 bits %0101 = 5 and %1101 = -5.

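  3. Two's complement: This is a format used to facilitate the use of negative numbers. Don't worry about all the reasons why for now. To take the two's complement of a binary number, reverse (invert) every bit and add one, keeping the result to a fixed number of bits. The width matters because the left-most bit acts as the sign, so the positive number needs a leading 0. For example, in 4 bits we know that 6 = %0110. To represent -6 as a two's complement number, reverse the bits (giving %1001) and add one, giving %1001 + %1 = %1010. (Note that %1 + %1 = %10, so a carry moves into the next position.)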
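The reverse-and-add-one rule can be tried out with a short Java sketch. The names TwosComplement and negate are made up for illustration; the width is taken from the length of the input string, which is assumed to be between 1 and 30 bits.

```java
class TwosComplement {
    // Returns the two's complement of the given bit string,
    // keeping the same number of bits as the input.
    static String negate(String bits) {
        int width = bits.length();
        int value = Integer.parseInt(bits, 2);
        int mask = (1 << width) - 1;        // e.g. %1111 for 4 bits
        int result = (~value + 1) & mask;   // invert, add one, trim to width
        String s = Integer.toBinaryString(result);
        StringBuilder padded = new StringBuilder();
        for (int i = s.length(); i < width; i++) {
            padded.append('0');             // restore leading zeros
        }
        return padded.append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(negate("0110")); // 6 -> -6, prints 1010
        System.out.println(negate("1010")); // -6 -> 6, prints 0110
    }
}
```

Applying the operation twice gets back the original pattern, which is one reason the format is convenient.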

Representing Characters

Certain bit values can be interpreted to stand for characters. For instance, %01000001 = 65, which when treated as an ASCII character is 'A'. Appendix A in our text shows the correspondence between characters and code values. On a Unix system you can also run the command man ascii to display a character table.
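In Java the correspondence between code values and characters is just a cast between int and char. The sketch below uses made-up names (CharCodes, toChar, codeOf) to show both directions.

```java
class CharCodes {
    // Interpret a string of bits as a character code.
    static char toChar(String bits) {
        return (char) Integer.parseInt(bits, 2);
    }

    // The numeric code stored for a character.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(toChar("01000001"));          // prints A
        System.out.println(codeOf('A'));                 // prints 65
        System.out.println(Integer.toBinaryString('A')); // prints 1000001
    }
}
```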

One set of bits, different interpretations:

A single bit string could mean different things, depending on how it is interpreted, i.e. the context in which it is used.

E.g. %1000001 could be:

        • 65 as a 7-bit unsigned integer

        • -1 as a 7-bit signed-magnitude number

        • -63 as a 7-bit two's complement number

        • 'A' as an ASCII character
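The four readings of %1000001 can be reproduced with a Java sketch, one small method per interpretation. All names here (Interpretations and its methods) are made up for illustration, and the bit string is assumed to be between 1 and 30 bits long.

```java
class Interpretations {
    // All bits are magnitude.
    static int unsigned(String bits) {
        return Integer.parseInt(bits, 2);
    }

    // Left-most bit is the sign, the rest is the magnitude.
    static int signedMagnitude(String bits) {
        int magnitude = bits.length() > 1
                ? Integer.parseInt(bits.substring(1), 2)
                : 0;
        return bits.charAt(0) == '1' ? -magnitude : magnitude;
    }

    // If the left-most bit is set, the value is the unsigned
    // reading minus 2^width.
    static int twosComplement(String bits) {
        int value = Integer.parseInt(bits, 2);
        return bits.charAt(0) == '1' ? value - (1 << bits.length()) : value;
    }

    // Treat the unsigned value as a character code.
    static char ascii(String bits) {
        return (char) Integer.parseInt(bits, 2);
    }

    public static void main(String[] args) {
        String bits = "1000001";
        System.out.println(unsigned(bits));        // prints 65
        System.out.println(signedMagnitude(bits)); // prints -1
        System.out.println(twosComplement(bits));  // prints -63
        System.out.println(ascii(bits));           // prints A
    }
}
```

The bits never change; only the rule used to read them does.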

Compiling

This is a process used to translate high-level language into a machine's low-level language.

The first electronic computers were programmed by hand: people plugged in or unplugged wires to represent 0's and 1's, which gave instructions in machine language, e.g. 01100111. This was tedious, so assembly-language commands were developed (e.g. ADD R3,R5), which a program called an assembler translates into machine language. Subsequently high-level languages were developed, with statements such as x = x + y, which allow mathematical ideas to be expressed more directly.

The translation process of going from a higher-level language into machine language is called compiling.