Bit manipulation: Introduction

A bit is the smallest unit of data in a computer; it can take one of two states, zero or one. Bits can represent many things. The simplest form is when each bit is a digit of a non-negative (unsigned) integer in base 2. The same bits, however, can represent signed integers, floating-point numbers, or even information other than numbers. The format, or the "encoding," of the bitstring determines what that bitstring represents. For example, consider the 4-bit bitstring 1100. If the encoding is an unsigned integer, the value of the bitstring is 12. If the encoding is a two's-complement signed integer, the same bits represent -4.

One can also encode non-numerical data into a bitstring. For example, if you had eight LEDs, you could assign each LED to a respective bit in a bitstring (i.e., Bit 0 represents LED #0, Bit 1 represents LED #1, etc.). You could also say that 0 indicates an LED is off and 1 indicates it is on. Then, the bitstring 00101101 tells us that LEDs #0, #2, #3, and #5 are on, whereas the rest are off.

One advantage of this encoding is that the bits are easy to manipulate directly. If we wanted to turn LED #1 on, we could simply flip Bit 1, resulting in 00101111. Similarly, flipping Bit 3 of that result turns LED #3 off, giving 00100111. In C, we have bitwise operators and bitmasks to accomplish this task of directly manipulating bits.

Interesting to know

In a computer, each instruction is a string of bits that carries a particular meaning. The designers divide the instruction into several bit fields, each representing one part of the instruction. Let's say you had a simple computer that could add or subtract two four-bit signed integers. Since the computer can only perform one of two operations, you could represent the operation as a single bit (let's say 0 for add and 1 for subtract). If you wanted to instruct the computer to add 3 and -1, your instruction could look like this: 000111111, which can be broken up into 0 0011 1111. 0 signifies the operation to be performed (addition), and 0011 and 1111 are 3 and -1, respectively, in two's complement form. Thus, 000111111 translates to "add 3 and -1." Instruction encoding is the basis of computer architecture.

Ask yourself