logic, arithmetic shift, rounding and optimization

Logic shift is defined as a shift of bits either left or right with vacant bits filled up with zeros.

E.g, assume integer is 4 bytes long,

0xFFFF1234 logic left shift 4 bits is 0xFFF12340

0xFFFF1234 logic right shift 4 bits is 0x0FFFF123

0x12345678 logic left shift 4 bits is 0x23456780

0x12345678 logic right shift 4 bits is 0x01234567

Arithmetic shift is defined as shift of bits either left or right with sign bit preserved if possible.

Arithmetic shift diffs from logic shift only when negative numbers are involved. C99 requires that a signed integer can be either represented as two's complement, or one's complement ( C99 6.2.6.2). Therefore the MSB ( most significant bit ) of an integer would be the sign bit.

E.g., assume integer is 4 bytes long, 0xffff1234 would be a negative number.

0xFFFF1234 arithmetic left shift 4 bits is 0xFFF12340. ( left shift we continue to fill in 0s for vacant bits )

0xFFFF1234 arithmetic right shift 4 bits is 0xFFFFF123. ( right shift we fill in 1s for vacant bits to preserve sign bit )

0x12345678 arithmetic left shift 4 bits is 0x23456780.

0x12345678 arithmetic right shift 4 bits is 0x01234567. ( right shift we fill in 0s for vacant bits to preserve sign bit)

So basically, left shit, regardless logic or arithmetic, and regardless signed integer or unsigned integer, we just fill in 0s for the vacant bits.

How logic and arithmetic shift is related to C99 shift operators?

1. is "<<" operators logic shift or arithmetic shift?

The answer is: it does not matter. for left shift, logic, or arithmetic, we just fill in 0s for vacant bits.

However, C99 does have one corner case:

C99 6.5.7:

"The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 * 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 * 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined."

Above basically says: normally "<<" is left shift with vacant bits filled up with 0s. However,

( signed) 0x1FFFFFFF << 8, resulting value is bigger than maximal signed integer value, hence the result is "undefined". The compiler may not generate 0xFFFFFF00 as one expected above. The compiler may generate any value, or a trap.

However, (unsigned) 0x1FFFFFFF << 8 will always generate result 0xFFFFFF00 as it still smaller than the maximal unsigned integer value.

The above also suggest we should not use "<<" operator to try to multiply something by 2, or 2 power to something, because the operator is not meant to do the multiplication. It just meant to shift bits for an "unsigned integer", which might be a status register of a peripheral device.

2. is ">>" operators logic shift or arithmetic shift?

The answer is: depending on compiler, it can be logic, or arithmetic.

C99 6.5.7:

"The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined."

Above basically says that for 0x1234578, regardless it is signed or unsigned, regardless shifting preserver sign bit or not, result is the same ( because sign bit is 0 anyway, filling in 0s for vacant bit is also preserving sign bit)

However, for 0xFFFF1234,

(unsigned) 0xFFFF1234 >> 4 is 0x0FFFF123.

(signed ) 0xFFFF1234 >> 4, result could be 0x0FFFF123 ( logic right shift), or 0xFFFFF123 (arithmetic right shift), depending on compiler.

For GCC, it opts for arithmetic shift.

https://gcc.gnu.org/ml/gcc/2000-04/msg00152.html

"An arithmetic left shift is equivalent to a logical left shift (as far as GCC is concerned)" "For right shifts, if the type is signed, then we use an arithmetic shift; if the type is

unsigned then we use a logical."

Above also suggest that we should not use ">>" to do the division of 2, or 2 power to something. We should use ">>" only to shift an "unsigned integer", which might be a status register of a peripheral device.

Summary: for "<<" operator, we just fill in 0s for the vacant bits. for unsigned integer, we also just fill in 0s for ">>" operator. However, for signed integer, result of ">>" depends on compiler. Therefore, we should really avoid using shift operators on signed integer, we should always use shift operator on unsigned integer. In fact, we should only use shift operator for bit operation, we should avoid using shift operator to do the mulitiplciation or division as the result can depend on compiler for signed integers.

3. should we divide by 2 or right shift by 1?

One conclusion above is that shift operator is really not meant for mathematical calculation. They are meant for bit operation where the semantics of each bit is just value 0 or 1, sign bit should be treated as an ordinary bit with value either 0 or 1.

If we want to divide some integer by 2, we should use division operator instead of right shift operator. Some may argue that right shift operator might be faster than a division operator, there are three reasons against it.

  • code readability. "/2" suggest you want to do a mathematical calculation. ">>1" suggest you want to do a bit operation. Those are two totally different things, and putting in the context, may confuse reader.
  • shift operator may not necessarily faster than division operator. It all depends on compiler and underneath hardware. See http://stackoverflow.com/questions/6357038/is-multiplication-and-division-using-shift-operators-in-c-actually-faster
  • if the operand is a negative integer, then right shift may not be the same as division, depending on the compiler doing logic shift or arithmetic shift.
  • even if the operand is a positive integer, and shift operator is faster, such optimization is still not desirable. The compiler will most likely replace division with shift internally. In another word, this is a premature optimization. See http://cowboyprogramming.com/2007/01/04/mature-optimization-2/

4 rounding

Last note on the difference between "division" and "right shift" is that the former is "round to the nearest", the latter is "round to zero". This difference is not related to the earlier discussion since the earlier discussion is focused on integer and there is no decimal point involve, however it worth noting the difference between the two operators in turns of rounding.

IEEE 754/IEC 60559 defined four type of rounding rules:

  • round to the nearest: chose the nearest number that can be represented by the chosen precision.
  • round to zero: truncate the extra digits that cannot be represented by the chosen precision.
  • round to positive infinity: round to the next closest bigger number that can be represented by the chosen precision. also called ceil.
  • round to negative infinity: round to the next closest smaller number that can be represented by the chosen precision. also called floor.

Rounding to the nearest is the default setting for C99 conforming floating point library.

Note that right shift operator will discard the value of the bits that got shifted out, therefore it is "round to zero".

Division operator clearly will follow whatever the rounding setting in the floating library, which default is "round to the nearest".