binary representation

#medium

Understanding bitwise operators, playing in Python

Binary representation consider numbers as its bit representation, where integer numbers larger than 0 can be expressed as a sum of smaller numbers (each of these numbers being the multiplication of a same number, called "base") For a "base 2" representation, any integer can be represented as the sum of 2 multiplied by itself n times (2**n=2*2*2... n times, base=2, exponent=n)

To prove my point:

  0 = you're special
  1 = 2**0 (a math rule)
  2 = 2**1
 15 = 2**0 + 2**1 + 2**2 + 2**3
100 = 2**2 + 2**5 + 2**6

If we want negative numbers, we multiply by -1.

So far so good? Let's use the base-2 binary representation, noticing we count from left to right. I will use 8 "bits" (number of 0/1) just for simplicity

 0 = 00000000
 1 = 00000001 = 2**0
 2 = 00000010 = 2**1
 3 = 00000011 = 2**1 + 2**0
 4 = 00000100 = 2**2
 5 = 00000101 = 2**2 + 2**0
 6 = 00000110 = 2**2 + 2
 7 = 00000111 = 2**2 + 2**1 + 2**0
 8 = 00001000 = 2**3
 9 = 00001001 = 2**3 + 2**0
10 = 0001010 = 2**3 + 2**1

If you want to use Python to print the binary representation, bin() does the trick:

>>> bin(3)
>>> '0b11'
>>> bin(4)
>>> '0b100'
>>> bin(5)
>>> '0b101'
>>> bin(-1)
>>> '-0b1'
>>> bin(-2)
>>> '-0b10'
>>> bin(-5)
>>> '-0b101'

Cool... but what can I do with it?

1. Lung radiograph example

Let's use this representation to do some boolean operations. Why? Well, imagine you have different features in a lung radiograph, and you categorize them in 5 different apparently non-related classes: 1=broken rib, 2=pleural diffusion, 4=nodules on periphery, 8=nodules in the inner region, 16=thick lung surface. Then, you can classify a patient exam having just a broken rib with a class=1. For a patient with nodules in lung periphery and a broken rib, we can combine both classes and do class=1+4=5.

NIH image

In a single integer we combine the information of many scenarios. If we use this schema, each integer represents a unique set of combinations. Note that we need to set a restriction, namely, you can only sum once each integer, a example:

  • we can do 2 + 4 + 8 = 14, then 14 represents 3 classes
  • we cannot do 2 + 2 + 2 + 8 = 14, because 14 is not representing a set of single classes.

1.a Bitwise Boolean Operators

Let's explore one of the most used two boolean: "and", "or" (there are others, as "shift bit", "exclusive or", and "complement"). We'll focus our problem in the lung example, imaging the scenario: We want to know, which radiographs has "nodules on the inner region" (class=8) among the following classes: 23 (classes 1+2+4+16=23), 15 (classes 1+2+4+8=15), and 26 (classes 2+8+16=26). We compare using "AND" because we want class=8 AND the current class, we do it bit-by-bit and write a 1 when the bit is the same in both, and a 0 when is not:

  • class = 23 -> 23 & 8 = 0, in binary 10111 & 01000 = 00000
  • class = 15 -> 15 & 8 = 8, in binary 01111 & 01000 = 01000
  • class = 26 -> 26 & 8 = 8, in binary 11010 & 01000 = 01000

For the OR operator you must keep in mind that the result will be the bitwise sum of the two integers being compared, because you request one or the another. Example: 12 | 6 = 14, in binary is easier to see why 1100 | 0110 = 1110, this is because we compare each position and set a 1 if at least one of them has a 1, and a 0 otherwise.

Hope you found this useful for future coding!



Useful resource: here is a webpage when you can put any integer and it will give you the bit representation (in base 2): http://bitwisecmd.com/