Part 1: Data Representation (003.Data.Representation.pdf)
The Fundamentals & Why It Matters (Conceptual)
Why is "thinking like the hardware" (binary, hex, decimal) essential for Assembly programmers, unlike high-level developers?
Your notes say, "If you can't mentally switch between 0b1010, 10, and 0xA... you're gonna have a bad time." Elaborate on why this mental fluidity is so critical in debugging and memory analysis.
Explain the concept of a "base" in numbering systems using a real-world analogy beyond the phone number one, emphasizing "carrying over."
Why is hexadecimal considered the "Real MVP" in low-level programming? What specific characteristics make it superior to raw binary for common tasks like reading memory dumps?
If a single bit flip can lead to "A corrupted address, A wrong jump, A freaking crash," what does this tell you about the precision required when working with raw bytes and memory?
You note that "Assembly doesn't sugarcoat anything—it deals in raw data." How does this fundamental principle influence the debugging process compared to debugging in a high-level language?
Why do we use prefixes/suffixes like 0x, 0b, or a leading 0 for numbers in code, rather than just raw numbers? What ambiguity are we preventing?
Binary & Decimal (The Hard Skills)
A 4-bit number is 0b1011. Convert it to decimal, showing your step-by-step thinking for each place value.
A 5-bit number is 0b11011. Convert it to decimal. What is the maximum decimal value that a 5-bit unsigned number can hold?
A memory address in a simple 8-bit system is 214. Convert this to binary. Why is it easier for the CPU to process this as bits rather than the decimal value?
An 8-bit register stores the value 0b00111100. If you "flip all the bits" (NOT operation), what is the new binary value, and what is its decimal equivalent?
Explain the "Rule of 2" (powers of 2) and why it's the foundation of everything in digital electronics. List the first 10 powers of 2.
How would you represent the decimal number 127 in binary? What happens if you add 1 to it in an 8-bit register? (Briefly touch on overflow).
When you see a binary pattern like 1111 0000, why is it helpful to mentally group it into 4-bit "nibbles" rather than 8 individual bits?
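The conversions asked for above can be sketched in a few lines of Python, used here purely as a bitwise calculator (the helper names are my own):

```python
# Place-value conversion and an 8-bit NOT, worked by hand.

def bin_to_dec(bits: str) -> int:
    """Sum each bit times its power-of-2 place value."""
    total = 0
    for i, bit in enumerate(reversed(bits)):
        total += int(bit) * (2 ** i)
    return total

def flip_bits_8(value: int) -> int:
    """8-bit NOT: flip all bits, masked to stay within one byte."""
    return ~value & 0xFF

print(bin_to_dec("1011"))         # the 4-bit question
print(bin_to_dec("11011"))        # the 5-bit question
print(format(214, "08b"))         # decimal 214 as an 8-bit address
print(format(flip_bits_8(0b00111100), "08b"), flip_bits_8(0b00111100))
```

Grouping the output of `format(..., "08b")` into two nibbles mentally is exactly the habit the last question is about.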
Hexadecimal (The Reverse Engineering MVP)
Convert the hex value 0x4F to binary. Now convert that binary to decimal.
A pointer in memory is 0xDEADC0DE. Break this down: how many bytes is this? How many bits? (Note: each hex digit is 4 bits).
Why is it so much faster to convert Hex to Binary (and vice versa) than Decimal to Binary? Explain the "4-to-1" relationship.
Convert the binary value 1101 1010 0101 1110 into hexadecimal.
You're looking at a memory dump and see 0x7F. What is this in binary? If this represented an ASCII character, why is hex more readable than binary?
A color in a UI is represented as #FF5733 (Hex). Convert each of the three components (R, G, B) to decimal.
In a debugger, you see a register change from 0x0F to 0x10. What happened in binary? What happened in decimal?
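A quick Python sketch of the nibble-by-nibble relationship these questions lean on (the helper name is mine):

```python
# Each hex digit maps to exactly one 4-bit nibble -- the "4-to-1" rule.

def hex_to_bin(h: str) -> str:
    return " ".join(format(int(d, 16), "04b") for d in h)

print(hex_to_bin("4F"))                  # 0x4F, nibble by nibble
print(int("4F", 16))                     # ...and as decimal
print(hex(int("1101101001011110", 2)))   # the 16-bit pattern from above
# The register step 0x0F -> 0x10: in binary 0000_1111 -> 0001_0000,
# in decimal 15 -> 16 (an increment whose carry ripples through a nibble).
for comp in ("FF", "57", "33"):          # #FF5733 -> R, G, B
    print(comp, int(comp, 16))
```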
Data Sizes & Memory (The Infrastructure)
Explain the hierarchy: Bit → Nibble → Byte → Word → Doubleword → Quadword. Provide the bit-count for each.
Why is a "Byte" (8 bits) the fundamental unit of addressable memory in most modern systems? Why not 4 or 16?
On a 64-bit system, what is the size of a "Word" in bits? Is this always the same across all architectures (x86 vs. ARM)? Explain.
You're analyzing a binary and see a "DWORD" being moved into a register. How many bytes of data are being moved?
If a program is "32-bit," what does that technically mean for the size of its memory addresses and registers?
Why do reverse engineers often focus on "alignment" (e.g., data starting at addresses ending in 0, 4, 8, or C in hex)?
What is a "NULL byte" (0x00), and why is it so significant in C-style strings and memory analysis?
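The size hierarchy can be captured in a small table. The x86 naming is assumed here; as one of the questions above points out, "word" varies by architecture:

```python
# Bit counts for the data-size hierarchy (x86 convention).

SIZES = {
    "bit": 1,
    "nibble": 4,
    "byte": 8,
    "word": 16,        # x86 WORD; other ISAs define "word" differently
    "doubleword": 32,  # DWORD -> 4 bytes
    "quadword": 64,    # QWORD -> 8 bytes
}

for name, bits in SIZES.items():
    print(f"{name:10} {bits:3} bits = {bits // 8 or '<1'} byte(s)")
```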
Signed vs. Unsigned & Two’s Complement (The Brain Melters)
Explain the core difference between "Unsigned" and "Signed" integers. How does the CPU interpret the "Most Significant Bit" (MSB) differently?
What is the "Sign-Magnitude" method, and why is it inefficient for computer hardware compared to Two's Complement?
Describe the Two's Complement conversion process ("Flip and Add 1") using the decimal number -5 as an example (in an 8-bit system).
Why does Two's Complement eliminate the problem of having "two zeros" (+0 and -0)?
An 8-bit unsigned integer can range from 0 to 255. What is the range for an 8-bit signed integer? Why is the positive range one less than the absolute value of the negative range?
You see 0xFF in an 8-bit register. If the code treats this as an unsigned number, what is the value? If it treats it as a signed (Two's Complement) number, what is the value?
Explain "Sign Extension." If you move a signed 8-bit value (0x80) into a 16-bit register, what does the resulting 16-bit value look like?
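"Flip and add 1" and sign extension can be sketched on plain Python ints (function names are mine; Python's `& 0xFF` produces the same byte as the manual flip-and-add-1 procedure for negative values):

```python
def twos_complement_8(n: int) -> int:
    """Encode a signed value as an 8-bit two's-complement byte."""
    return n & 0xFF

def to_signed_8(b: int) -> int:
    """Reinterpret an 8-bit byte as a signed value (MSB decides)."""
    return b - 256 if b & 0x80 else b

def sign_extend_8_to_16(b: int) -> int:
    """Copy the sign bit into the upper 8 bits."""
    return b | 0xFF00 if b & 0x80 else b

print(format(twos_complement_8(-5), "08b"))   # -5 in 8 bits
print(to_signed_8(0xFF))                      # 0xFF read as signed
print(hex(sign_extend_8_to_16(0x80)))         # 0x80 widened to 16 bits
```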
Real-World RE/Malware Scenarios (Data Representation)
In a malware sample, you see a string being XORed with the constant 0x41. Why is 0x41 a common choice? (Hint: Think ASCII).
You're reversing a network protocol and see a 4-byte value sent over the wire as 0x12 0x34 0x56 0x78. The documentation says the value is 0x78563412. Explain "Endianness" (Little-Endian vs. Big-Endian) and which one this is.
How can a malware author hide a "malicious URL" by encoding it as a series of hex bytes rather than plain text? Provide a simple example.
You find a "Magic Header" in a file: 0x4D 0x5A (MZ). What does this tell you about the file type, and how did you use your knowledge of hex to identify it?
A packer uses a simple "substitution cipher" where each byte is added to 0x05. If the original byte was 0xFD, what is the "encrypted" byte? (Explain the overflow/wrap-around).
Why is understanding "Two's Complement" critical when you see a conditional jump like JL (Jump if Less) vs. JB (Jump if Below) in a debugger?
How can an "Integer Overflow" (e.g., adding 1 to 0xFF in an 8-bit unsigned register) be exploited by an attacker to bypass a security check (like a buffer size check)?
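Three of these scenarios sketched in Python (the XOR'd string and the cipher byte are illustrative, not from a real sample):

```python
import struct

# XOR with a constant key is symmetric: applying it twice restores the data.
secret = bytes(b ^ 0x41 for b in b"evil.example")   # hypothetical string
print(bytes(b ^ 0x41 for b in secret))

# Little-endian: the least significant byte travels first on the wire.
wire = bytes([0x12, 0x34, 0x56, 0x78])
print(hex(struct.unpack("<I", wire)[0]))            # 0x78563412

# Byte-wise add-0x05 "cipher" with 8-bit wrap-around:
print(hex((0xFD + 0x05) & 0xFF))                    # sum wraps past 0xFF
```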
Part 2: Shifts and Rotates (004.Shifts.and.Rotates.pdf)
Logic Foundations (Conceptual)
What is a "Bitwise Shift," and how does it fundamentally differ from standard arithmetic operations like addition?
Explain the difference between a "Shift" and a "Rotate." What happens to the bits that "fall off" the end in each case?
Why are shifts and rotates considered "insanely fast" compared to multiplication or division in low-level programming?
In your notes, you call SHL/SHR the "Power of 2 Multiplier/Divider." Explain this mathematical shortcut.
What is the "Carry Flag" (CF), and how does it act as a "temporary landing pad" during a shift operation?
Logical Shifts (SHL / SHR)
A register AL contains 0b00001101 (13 decimal). Perform SHL AL, 1. What is the new binary value, the new decimal value, and the state of the Carry Flag?
Perform SHR AL, 2 on the value 0b10100000. What is the result?
Why does a SHR (Logical Shift Right) always fill the vacated bits with zeros? What does this imply about the "sign" of the number?
If you perform SHL AL, 4 on an 8-bit register, you are effectively multiplying the value by what power of 2?
You have the value 0b11000000. If you perform SHL 1, what happens to the MSB? Where does it go?
In a high-level language like C++, how are the operators << and >> related to SHL and SHR?
What happens if you shift a value by more bits than the register size (e.g., SHL AL, 9 on an 8-bit register)? Why is this generally avoided?
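A Python model of SHL/SHR with a hand-tracked carry flag (Python ints are unbounded, so the 8-bit mask is explicit; names are mine):

```python
def shl8(value: int, count: int = 1):
    """Logical shift left; CF receives the last bit shifted out of the MSB."""
    cf = (value >> (8 - count)) & 1
    return (value << count) & 0xFF, cf

def shr8(value: int, count: int = 1):
    """Logical shift right; vacated high bits fill with zeros."""
    cf = (value >> (count - 1)) & 1
    return value >> count, cf

print(shl8(0b00001101))      # 13 doubles to 26, CF = 0
print(shr8(0b10100000, 2))   # high bits zero-filled
print(shl8(0b11000000))      # the MSB lands in CF
```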
Arithmetic Shifts (SAL / SAR)
Explain the "Golden Rule" of SAR (Arithmetic Shift Right): Why is the sign bit preserved?
Perform SAR on the signed 8-bit value 0b11110000 (-16 decimal) by 1. What is the result in binary and decimal?
Compare SHR and SAR using the binary value 0b10000000. Show the results of both operations shifted right by 1.
Why is SAL (Arithmetic Shift Left) technically identical to SHL in modern x86 architectures?
How does SAR handle the "rounding" of negative numbers during division? (e.g., -5 divided by 2).
You're looking at a piece of code and see SAR instead of SHR. What does this tell you about the nature of the data being processed?
Why is SAR essential for maintaining the mathematical integrity of signed integers in low-level calculations?
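SAR versus SHR on the same bit patterns, emulated explicitly on 8-bit values (Python's `>>` is already arithmetic on signed ints, which the first helper exploits; names are mine):

```python
def arith_shr8(value: int, count: int = 1) -> int:
    """SAR: replicate the sign bit into the vacated bits."""
    signed = value - 256 if value & 0x80 else value
    return (signed >> count) & 0xFF

def logic_shr8(value: int, count: int = 1) -> int:
    """SHR: always zero-fill from the left."""
    return value >> count

print(format(arith_shr8(0b11110000), "08b"))   # -16 -> -8, sign preserved
print(format(logic_shr8(0b10000000), "08b"))   # SHR vs...
print(format(arith_shr8(0b10000000), "08b"))   # ...SAR on the same byte
print(arith_shr8(0xFB) - 256)                  # -5 >> 1 rounds toward -inf
```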
Rotates (ROL / ROR)
Explain the "Circular Logic" of a Rotate operation. Why is no data truly "lost"?
An 8-bit register contains 0b10110001. Perform ROL 1. What is the new value?
Perform ROR 3 on the value 0b00000111. What is the result?
How can a ROL operation be used to "reconstruct" a value if you know the original rotation count?
What is the relationship between ROL AL, 1 and ROR AL, 7 in an 8-bit register?
Explain the "Rotate Through Carry" (RCL / RCR) instructions. How does the Carry Flag participate in the circular movement?
Why are rotates rarely used for simple arithmetic but frequently used in "scrambling" data?
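The circular behavior is easy to model: bits that fall off one end re-enter on the other (8-bit sketch, names mine):

```python
def rol8(value: int, count: int) -> int:
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value: int, count: int) -> int:
    return rol8(value, 8 - (count % 8))   # ROR n == ROL (8 - n)

print(format(rol8(0b10110001, 1), "08b"))
print(format(ror8(0b00000111, 3), "08b"))
print(rol8(0xB1, 1) == ror8(0xB1, 7))     # same landing spot
print(rol8(0x5A, 8) == 0x5A)              # full rotation: identity
```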
Advanced RE/Malware Scenarios (Shifts & Rotates)
In a malware sample, you see the instruction SHL EAX, 4 followed by ADD EAX, EBX. What kind of complex arithmetic (e.g., array indexing) might be happening here?
You're reversing a "Custom Packing Algorithm" and see a sequence: XOR, ROL, ADD, ROR. Why do malware authors use these instructions instead of standard encryption libraries?
How is SHR used to extract specific "bitfields" from a large data structure (e.g., getting bits 8-15 from a 32-bit header)?
Explain how a "Bitmask" (using AND) is often combined with a SHR to isolate a single bit of information.
A common technique for "zeroing out" a register is XOR EAX, EAX. How could you use a shift to achieve a similar result (though less efficiently)?
You find a "Bitwise Loop" in a binary that rotates a key and XORs it with a byte stream. What is this a classic sign of?
How can SHL and SHR be used to implement "Endianness Swapping" (converting Big-Endian to Little-Endian) manually in Assembly?
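Two sketches for the scenarios above: a manual 32-bit endianness swap using only shifts and masks, and a toy rotating-key XOR loop of the kind described (key value is hypothetical):

```python
def bswap32(v: int) -> int:
    """Swap byte order of a 32-bit value with shift-and-mask ops."""
    return (((v & 0x000000FF) << 24) |
            ((v & 0x0000FF00) << 8)  |
            ((v & 0x00FF0000) >> 8)  |
            ((v & 0xFF000000) >> 24))

def rol8(v: int, n: int) -> int:
    n %= 8
    return ((v << n) | (v >> (8 - n))) & 0xFF

def keystream_xor(data: bytes, key: int) -> bytes:
    """Classic packer-style loop: XOR each byte, then rotate the key."""
    out = bytearray()
    for b in data:
        out.append(b ^ key)
        key = rol8(key, 1)
    return bytes(out)

print(hex(bswap32(0x12345678)))          # 0x78563412
blob = keystream_xor(b"payload", 0x3C)   # hypothetical key
print(keystream_xor(blob, 0x3C))         # symmetric: decrypts itself
```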
Bitwise Puzzles & Logic Challenges
Write the binary result of: (0b10101010 >> 2) | (0b11110000 << 1). (Note: use 8-bit logic).
You need to multiply a number by 10 using only SHL and ADD. How would you do it? (Hint: 10 = 8 + 2).
Given a 16-bit value 0xABCD, use only shifts and rotates (and perhaps an AND) to move the "BC" nibbles to the beginning of the value (0xBC??).
If you rotate an 8-bit value 8 times, what is the result? What does this tell you about "periodicity" in rotates?
A piece of code does: SHL EAX, 1, SHL EAX, 1, SHL EAX, 1. How could this be optimized into a single instruction?
In a 32-bit register, you want to check if the 15th bit is set. How can you use SHR and the Carry Flag to do this?
What is the difference between "Logical" and "Arithmetic" shifts when the value is zero?
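The puzzles above can be checked numerically in a small Python scratchpad (8-bit masking applied by hand; "15th bit" is taken here as bit 15 counting from 0):

```python
# (0b10101010 >> 2) | (0b11110000 << 1), kept to 8 bits:
result = (0b10101010 >> 2) | ((0b11110000 << 1) & 0xFF)
print(format(result, "08b"))

# Multiply by 10 with shifts and one add: n*10 = n*8 + n*2
def times_ten(n: int) -> int:
    return (n << 3) + (n << 1)

print(times_ten(7))

# Test bit 15 of a 32-bit value: shift it down to bit 0 and mask.
def bit15_set(v: int) -> bool:
    return bool((v >> 15) & 1)

print(bit15_set(0x00008000))
```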
Deep Thinking & Synthesis (Part 3 - Combined Knowledge)
Why is the "Mental Translation" between Hex, Binary, and Shifts considered the "Holy Trinity" of Reverse Engineering?
A malware author wants to hide the string "CMD.EXE". They store it as 0x43, 0x4D, 0x44, 0x2E, 0x45, 0x58, 0x45. They then ROL each byte by 2. What are the resulting hex bytes?
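The hidden-string transform can be checked mechanically; ROR by the same count inverts it (8-bit rotate helpers as before):

```python
def rol8(v: int, n: int) -> int:
    n %= 8
    return ((v << n) | (v >> (8 - n))) & 0xFF

def ror8(v: int, n: int) -> int:
    return rol8(v, 8 - (n % 8))

plain = bytes([0x43, 0x4D, 0x44, 0x2E, 0x45, 0x58, 0x45])   # "CMD.EXE"
obfuscated = bytes(rol8(b, 2) for b in plain)
print(obfuscated.hex(" "))
print(bytes(ror8(b, 2) for b in obfuscated))   # recovers b"CMD.EXE"
```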
You're analyzing a "Virtual Machine" (VM) based obfuscator. The VM's "Bytecode" uses the first 3 bits for the opcode and the last 5 bits for the register index. How would the dispatcher use SHR and AND to decode this?
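A sketch of that dispatcher decode, assuming "first 3 bits" means the most significant 3 bits of each bytecode byte:

```python
def decode(byte: int):
    opcode = (byte >> 5) & 0b111   # SHR by 5, then mask 3 bits
    reg    = byte & 0b11111        # AND with a 5-bit mask
    return opcode, reg

print(decode(0b101_00110))   # opcode in the top 3 bits, register below
```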
In cryptographic algorithms (like AES or ChaCha20), why are ROL and ROR used so extensively compared to SHL and SHR? (Think about "diffusion" and "information loss").
Explain how "Arithmetic Shift Right" (SAR) can be used to "smear" the sign bit across a whole register to create a "mask" (e.g., 0xFFFFFFFF if negative, 0x00000000 if positive).
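The "sign smear" in Python terms: arithmetic-shift a 32-bit value right by 31 so the sign bit fills the whole register (emulated, since Python ints are unbounded):

```python
def sign_mask32(v: int) -> int:
    """SAR v, 31 on a 32-bit value: all-ones if negative, zero if not."""
    signed = v - (1 << 32) if v & 0x80000000 else v
    return (signed >> 31) & 0xFFFFFFFF

print(hex(sign_mask32(0xFFFFFFF6)))   # negative input -> 0xffffffff
print(hex(sign_mask32(0x0000000A)))   # positive input -> 0x0
```

This branchless mask is a common compiler idiom for `abs()` and conditional selects, which is why it shows up so often in disassembly.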
You're reversing a game and find that the player's "Health," "Mana," and "Level" are all packed into a single 32-bit integer. Health (10 bits), Mana (10 bits), Level (12 bits). How do you extract the "Mana" value?
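One plausible extraction, assuming the fields are packed low to high as Health (bits 0-9), Mana (bits 10-19), Level (bits 20-31); the real game could order them differently:

```python
def get_mana(packed: int) -> int:
    """Shift the Mana field down to bit 0, then mask off 10 bits."""
    return (packed >> 10) & 0x3FF

# Assumed layout: Level=55, Mana=300, Health=87 packed into one DWORD.
stats = (55 << 20) | (300 << 10) | 87
print(get_mana(stats))
```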
How might an attacker abuse precise ROL or ROR operations on pointers stored in critical system structures (like the Interrupt Descriptor Table) to redirect execution to malicious code, and how could a "Control Flow Integrity" (CFI) mechanism detect or prevent this?
If a program uses SHR on an input value to derive an index for a sensitive lookup table (e.g., API calls), how could an attacker craft an input value that, when shifted, results in an unintended or malicious index?
You've identified a "custom hash function" in malware that uses a non-standard rotation count (e.g., ROR by 5 bits) and XORs. How would you determine if this is a known cryptographic hash or a custom, weaker one?
A common anti-analysis technique is "control flow flattening," where complex jumps are replaced with linear code and dispatchers. How might bitwise operations (including shifts) be used in the dispatcher to determine the next block of code to execute?
Given your journey through 003.Data.Representation.pdf and 004.Shifts.and.Rotates.pdf, and your goal of reversing software from the inside out for "understanding and mental curiosity": how will mastering these 200+ concepts (data representation, shifts, rotates, and their RE/malware applications) directly empower you to break down and comprehend the most sophisticated layers of software obfuscation and virtualization, such as VMProtect and Themida? Be specific, and integrate multiple concepts in your answer.