DATA TRANSFER IN ASSEMBLY: WHAT’S REALLY GOING ON
At its core, data transfer in assembly is about moving values around so the CPU can work with them.
The processor itself can only operate on data that’s in the right place, usually a register, so most assembly programs spend a lot of time copying values between:
Registers (inside the CPU)
Memory (RAM)
The stack (a special area of memory used for function calls and temporary storage)
Nothing “magical” happens during data transfer. The CPU simply copies bits from one location to another. Understanding where data comes from and where it goes is the foundation of everything else in assembly.
The Three Basic Data Transfer Instructions
MOV
Copies data from a source to a destination
Does not modify the original source
Does not perform calculations
Think of MOV as a straight copy-paste operation.
PUSH
Places a value onto the stack
Automatically adjusts the stack pointer
POP
Removes a value from the stack
Stores it somewhere (usually a register)
Automatically adjusts the stack pointer back
Together, PUSH and POP are essential for function calls, saving registers, and managing temporary data.
Operand Types: What Instructions Work With
Every assembly instruction operates on operands. An operand is simply the thing the instruction uses or modifies.
There are three basic operand types in x86 assembly:
1️⃣. Immediate Operands
Immediate operands are literal values written directly in the instruction.
Examples:
10
-255
0FFh
Here, 10 is not stored in memory or a register beforehand—it’s embedded directly in the instruction.
Why this matters:
Immediate values are fast and convenient, but they’re fixed constants. You can’t change them at runtime.
2️⃣. Register Operands
Register operands refer to CPU registers, such as:
EAX
EBX
ECX
EDX
Registers are:
Extremely fast
Very limited in number
Where almost all real computation happens
Key idea:
If the CPU is going to do math or logic, the data usually has to be in registers first.
3️⃣. Memory Operands
Memory operands reference locations in RAM.
Memory is:
Much larger than registers
Slower to access
Where most program data lives long-term
Assembly forces you to be explicit about memory access.
You never “accidentally” touch memory, you have to say exactly where.
ADDRESSING MODES: HOW MEMORY IS REACHED
When an instruction refers to memory, the CPU needs to know how to find that memory address. This is where addressing modes come in.
1. Direct Addressing
Direct addressing specifies the memory location explicitly.
What this means:
myValue represents a fixed memory address
The CPU goes directly to that address and reads the value
Key characteristics:
The address is fixed
Very clear and readable
Mostly used with labels and global variables
In real programs, direct addressing is common when working with named data defined in the data segment.
2. Immediate Addressing (Not Memory!)
Immediate addressing does not access memory at all.
What’s happening:
The value 10 is placed directly into EAX
No memory lookup occurs
This is included here because beginners often confuse it with memory access — but it isn’t.
Rule of thumb: No brackets → no memory access
3. Indirect Addressing
Indirect addressing uses a register that contains a memory address.
Step-by-step:
EBX holds a memory address
The CPU looks at that address
The value stored there is loaded into EAX
This is exactly how pointers work at the assembly level.
Important distinction:
ebx → the number inside the register
[ebx] → the data stored at the address contained in EBX
4. Indexed Addressing
Indexed addressing calculates a memory address using a base register plus an offset.
What’s happening:
Start at the address in EBX
Move forward by 4 bytes
Read the value stored there
This addressing mode is commonly used for:
Arrays
Structures
Walking through memory in loops
Mental model:
“Start here, then move forward this many bytes.”
Indexed addressing is the foundation of data structures in assembly.
DATA TRANSFER INSTRUCTIONS: MOV, PUSH, AND POP
With operands and addressing modes understood, data transfer instructions become much clearer.
MOV — Copy Data
MOV copies data from a source to a destination.
Examples:
Important rule:
❌ You cannot move memory directly to memory and
✅ One operand must be a register.
PUSH and POP: Stack Transfers
The stack is a special region of memory managed using the stack pointer (ESP).
What PUSH does:
Decreases ESP
Stores the value at the new top of the stack
What POP does:
Reads the value at the top of the stack
Increases ESP
The stack is heavily used for:
Function calls
Passing parameters
Saving registers
Operators That Help with Memory
Assembly provides operators that help calculate and interpret memory addresses.
I. OFFSET
OFFSET gives the address of a variable, not its value.
This loads the memory address of myVar into EAX.
II. PTR
PTR tells the assembler how to treat a memory operand.
This forces the assembler to treat the memory as a WORD.
This matters because:
Assemblers do not perform strict type checking
The CPU needs to know how many bytes to read
III. LENGTHOF
LENGTHOF calculates how many elements are in a data structure.
This is commonly used when writing loops.
LOOPS AND ARITHMETIC (PREVIEW)
With data transfer understood, you can now:
Create loops using JMP and LOOP
Perform arithmetic with ADD, SUB, MUL, and DIV
Move through arrays and structures using indexed addressing
All of these depend on correct data movement.
FLAT MEMORY MODEL AND STDCALL (WINDOWS CONTEXT)
When writing 32-bit Windows programs, you’ll often see:
I. Flat Memory Model
One continuous 32-bit address space
No segment juggling
Memory is treated as a single linear block
This simplifies memory access and matches how modern Windows works.
II. STDCALL Calling Convention
STDCALL defines:
How parameters are passed (right to left on the stack)
Who cleans up the stack (the callee)
How functions interact with the Windows API
This consistency is critical for Windows compatibility.
III. Big Picture Summary
Data transfer moves values between registers, memory, and the stack
Operands define what data is used
Addressing modes define how memory is reached
MOV, PUSH, and POP are the core transfer instructions
OFFSET, PTR, and LENGTHOF help manage memory correctly
Flat memory and STDCALL define the Windows execution environment
Once this chapter clicks, you’re no longer guessing...
DIRECT MEMORY OPERANDS
The real concepts being discussed here are:
Direct memory operands (a form of direct addressing)
The difference between a value, an address, and the contents at an address
How notation (var, [var], hex literals) changes meaning
Direct Memory Operands: Talking to Memory by Name
A direct memory operand means:
“Access the contents of a specific memory location whose address is known at assembly time.”
In plain English:
The assembler knows exactly where this variable lives in memory
The instruction hardcodes that address into the machine code
I. What “direct” really means here
When you write:
You are saying: “Go to the memory location associated with var1, read the byte stored there, and copy it into AL.”
var1 is not the value itself.
var1 is a label that represents a fixed memory address.
II. Declaring a variable: clearing the confusion
This line means three separate things:
var1 → a label (a name for a memory location)
BYTE → the size of memory being reserved (1 byte)
10h
→ the initial value stored in that byte
→ hexadecimal 10h = decimal 16
So what does var1 contain?
It contains one byte
That byte’s value is 16 (decimal)
What it does not mean
❌ It does NOT mean “a string”
❌ It does NOT mean “hex data is a string”
❌ It does NOT mean “10 characters”
Hexadecimal is just a number format, not a data type.
III. Hex ≠ string (important mental reset)
This is where many people get tripped up.
Hexadecimal → a way to write numbers
String → a sequence of characters stored as numeric codes (ASCII / Unicode)
Example:
Stores:
That byte does not represent text unless you interpret it as text — and even then, ASCII 16 is a non-printable control character.
So no, a BYTE holding 10h cannot secretly be a string.
IV. Direct memory operand in action
What the CPU actually does:
The assembler replaces var1 with its memory address
That address is embedded into the instruction
At runtime, the CPU:
goes to that address
reads 1 byte
loads it into AL
V. Why machine code looks like this
We mentioned this:
Breaking it down:
A0 → opcode (MOV AL, moffs8)
00010400 → 32-bit memory address of var1
This is why it’s called direct:
the address is literally baked into the instruction.
VI. Direct memory operands with other instructions
Anywhere a memory operand is allowed, direct memory operands can be used:
These all mean: “Use the contents of the memory location named var1.”
VII. Direct memory + expressions
This is called a direct-offset operand.
Meaning:
Take the address of var1
Add 5 bytes
Access the memory at that computed address
This is commonly used for:
arrays
structure fields
table lookups
VIII. Why brackets matter (this is HUGE)
No brackets → value or address
This means:
“Load the number 10000438h into EAX.”
The CPU does not touch memory.
Brackets → dereference (go to memory)
This means:
“Go to memory address 10000438h and load what’s stored there.”
Golden rule:
Brackets mean “treat this as an address and go there.”
This applies whether the thing inside is:
a label → [var1]
a register → [ebx]
a hex literal → [10000438h]
IX. Can a hex number be a value or an address?
Yes — and this is why context matters.
Hex is just a number.
These are different even though the hex is the same:
Same number.
Completely different meaning.
X. BYTE vs DWORD (size clarity)
BYTE → 1 byte (0–255)
DWORD → 4 bytes (32 bits)
The type:
tells the assembler how much space to reserve
tells the CPU how many bytes to read/write
It does not decide whether something is a value or an address — brackets do.
XI. Opcode vs Operand (clean mental split)
Opcode → what the CPU should do
(MOV, ADD, SUB)
Operands → what data the operation works on
(registers, memory, immediates)
Example:
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
✅✅✅✅✅✅✅✅✅✅✅
Opcode → ADD
Operands → EAX, 3
XII. Big picture takeaway (this is the anchor)
Direct memory operands mean:
You name a variable
The assembler knows its exact address
The instruction directly accesses that memory location
Brackets decide everything:
No brackets → value
Brackets → memory dereference
Once this clicks, pointers, arrays, and structures stop feeling mysterious — they’re just address + size + interpretation.
MOV - THE BACKBONE OF ASSEMBLY
The MOV instruction as a controlled copy operation, governed by operand roles, sizes, and allowed combinations.
The MOV instruction is used to copy data from one place to another.
What MOV actually does
MOV copies data.
It does not:
add
swap
compare
or modify the source
After a MOV:
the destination changes
the source stays exactly the same
Think of it like copying text:
You paste the text somewhere else, but the original is untouched.
Destination vs Source (this matters)
Destination operand
The place where the data ends up
This operand is modified
Source operand
The data being copied
This operand is not modified
Example:
EAX → destination (changed)
EBX → source (unchanged)
Operand sizes must match
Both operands must be the same size. Valid:
Invalid:
The CPU needs to know exactly how many bytes to move.
MOV cannot move memory to memory (and why)
This is one of the most important rules:
❌ MOV cannot copy data directly from one memory location to another.
This is not a syntax limitation — it’s a CPU rule.
Memory-to-memory moves would require:
two memory reads
one memory write
extra internal buffering
x86 keeps things simple and fast by requiring at least one operand to be a register.
How to move data from memory to memory (the correct way)
You use a register as a temporary step.
What’s happening:
The value of var1 is loaded into AX
The value in AX is stored into var2
Result: var2 now contains the same value as var1
Register → Memory example
Step-by-step:
1024 is loaded into EAX
The contents of EAX are copied into memory at var3
Important idea:
var3 is just a name for a memory address
The assembler reserves 4 bytes because it is a DWORD
After execution: var3 holds the value 1024
Memory → Register example
What happens:
The CPU goes to the memory location named var4
Reads 4 bytes
Copies them into EAX
After this instruction: EAX = 1024
Immediate values with MOV
You can also move immediate (literal) values into registers or memory.
Rules:
Immediate → register ✔
Immediate → memory ✔
Immediate → immediate ❌ (doesn’t make sense)
Summary: All valid MOV forms
And the one that’s not allowed:
Why MOV is so important
Almost everything in assembly depends on MOV:
loading values
storing results
passing function parameters
working with memory
preparing data for arithmetic
If you understand:
destination vs source
size matching
register involvement
then you understand how data flows through the CPU.
Big picture takeaway
MOV is not “move” — it’s copy.
It copies:
values
addresses
memory contents
But it always follows strict rules so the CPU knows:
how much data to copy
where it’s coming from
where it’s going
Master MOV, and assembly stops feeling chaotic — it becomes deliberate and predictable.
OVERLAPPING REGISTERS AND PARTIAL REGISTER WRITES IN X86
More specifically:
How AL, AX, and EAX share the same physical storage
How writing to a smaller part of a register affects (or does not affect) the rest
Overlapping Values: How Partial Register Writes Work
In x86 assembly, many registers overlap.
This means that smaller registers are not separate storage — they are views into a larger register.
The best example is EAX.
I. The EAX register layout (mental model)
A 32-bit register like EAX is divided like this:
AL → lowest 8 bits
AH → next 8 bits
AX → lowest 16 bits (AH + AL)
EAX → all 32 bits
They all refer to the same physical register.
II. Data declarations (what we’re loading)
oneByte → 8 bits → 78h
oneWord → 16 bits → 1234h
oneDword → 32 bits → 12345678h
Step-by-step: how EAX changes
I. Clear EAX
All 32 bits are zeroed: EAX = 00000000h
II. Move a BYTE into AL
What happens:
Only AL (lowest 8 bits) is overwritten
The upper 24 bits remain unchanged
EAX = 00000078h
Key idea: Writing to AL affects only 1 byte.
III. Move a WORD into AX
What happens:
The lower 16 bits are overwritten
The upper 16 bits remain unchanged
EAX = 00001234h
Notice:
The previous 78h in AL is gone
Because AL is part of AX
IV. Move a DWORD into EAX
What happens:
All 32 bits are replaced - EAX = 12345678h
No overlap concerns here — the entire register is rewritten.
V. Zero AX only
What happens:
Lower 16 bits → set to zero
Upper 16 bits → unchanged
EAX = 12340000h
This is where people get surprised.
VI. Why this matters (the dangerous part)
Because partial writes do not clear the rest of the register, you can easily end up with garbage in the upper bits if you’re not careful.
Example bug:
Reality: EAX = ??????01h
Unless EAX was cleared earlier, the upper bits contain whatever was there before.
VII. Golden rules for overlapping registers
Rule 1: Smaller writes do not clear larger registers
mov al, x → changes 8 bits
mov ax, x → changes 16 bits
mov eax, x → changes all 32 bits
Rule 2: If you care about the full register, write the full register
Instead of:
Now you know the upper bits are zero.
Rule 3: Overlap works both ways
Writing to AX overwrites AL
Writing to AL does not preserve meaningful values in AX
Everything overlaps downward
VII. Why x86 works this way
This design exists for:
backward compatibility (8-bit and 16-bit CPUs)
performance
flexibility
It’s powerful — but sharp.
VIII. Big picture takeaway
AL, AX, and EAX are not separate registers.
They are different-sized windows into the same storage.
When you move smaller data into a larger register:
it goes into the lower portion
the upper bits are untouched
Understanding this prevents:
subtle bugs
incorrect comparisons
broken arithmetic
mysterious values
Once this clicks, you start controlling the CPU.
SIGNED AND ZERO EXTENSION
Zero Extension (moving a smaller unsigned value into a larger register)
Zero extension is used when you move a smaller unsigned value (for example, 8-bit or 16-bit) into a larger register (like a 32-bit register), and you want the extra space to be filled with zeros.
This is important because unsigned values do not have a sign.
So, when we make them bigger, we must not copy or invent a sign bit.
We simply add zeros to the left.
The basic idea is simple:
First, clear the larger register (set it to zero).
Then, move the smaller value into the lower part of that register.
The upper bits stay zero, which is exactly what we want.
Example setup
Let’s say we have a variable called count:
count is a 16-bit unsigned integer
Its value is 1
In hexadecimal, that value is 0001h
Assembly code example (Zero Extension)
What happens step by step
Step 1: Clear ECX
This sets all 32 bits of the ECX register to zero: ECX = 00000000h
This step is important because it guarantees that the upper 16 bits are zero before we move anything into ECX.
Step 2: Move the 16-bit value into CX
CX is the lower 16 bits of ECX
The value of count is 0001h
This instruction copies 0001h into the lower 16 bits only
After this instruction:
Upper 16 bits of ECX: 0000h
Lower 16 bits of ECX: 0001h
So, the full 32-bit value of ECX becomes: ECX = 00000001h
We took a 16-bit unsigned value and placed it into a 32-bit register.
The extra 16 bits on the left were filled with zeros, not sign bits.
That is why this is called zero extension.
Final result (important correction)
Hex value in ECX: 00000001h
Decimal value in ECX: 1
We copied the number 1 into ECX and padded it on the left with zeros.
The value stays 1, not 16.
MOVZX and MOVSX Instructions
(Move with Zero Extension / Move with Sign Extension)
When moving smaller values (8-bit or 16-bit) into larger registers (16-bit or 32-bit), we often need to extend the value.
Intel provides two special instructions that do this automatically:
MOVZX → for unsigned values
MOVSX → for signed values
These instructions save us from manually clearing registers and make our code shorter, safer, and easier to read.
Instruction Forms
I. MOVZX (Zero Extension)
II. MOVSX (Sign Extension)
MOVZX: Move with Zero Extension
MOVZX copies a smaller unsigned value into a larger register and fills all the extra bits with zeros.
It does not care about signs.
It treats the value as always positive.
I. Example: Zero extension with MOVZX
Data section
Code section
What happens step by step
count is 16 bits and equals 0001h
ECX is 32 bits
MOVZX:
Copies count into the lower 16 bits of ECX
Fills the upper 16 bits with zeros
Final result: ECX = 00000001h
Decimal value: 1
This is zero extension — the value is extended by adding zeros on the left.
MOVSX: Move with Sign Extension
MOVSX copies a smaller signed value into a larger register and preserves the sign.
If the value is positive, it fills the upper bits with 0
If the value is negative, it fills the upper bits with 1
This keeps the number mathematically correct after the move.
I. Example: Negative signed value
Data section and code section
What happens step by step
signedVal = -16
Hex value (16-bit): FFF0h
Sign bit = 1 (negative)
After MOVSX: ECX = FFFFFFF0h
The upper bits are filled with 1s, preserving the negative sign.
Important Question:
Do signed values have to be negative for MOVSX?
No. Absolutely not.
MOVSX works for both positive and negative signed values.
Example: Positive signed value
Data section and Code section
What happens here?
42 in hex is 002Ah
Sign bit = 0 (positive)
After MOVSX: ECX = 0000002Ah
The value stays positive, and the upper bits are filled with zeros.
Sign preserved correctly
Key Rule for Sign Extension (Very Important)
When using MOVSX, always look at the most significant bit (sign bit):
Sign bit = 0 → positive → extend with zeros
Sign bit = 1 → negative → extend with ones
Why do we need MOVZX if MOVSX exists?
Great question — and this is where many learners get confused.
Short answer: Because MOVSX can break unsigned values.
MOVZX — for unsigned data
Always fills upper bits with zeros
Safe for values like counters, sizes, indexes
Example:
MOVSX — for signed data
Copies the sign bit
Correct for negative numbers
Dangerous for unsigned values
Example problem:
That’s why MOVZX exists.🚨
Final Summary (Simple and Clear)
MOVZX
Used for unsigned values
Upper bits are filled with zeros
Prevents accidental negative values
MOVSX
Used for signed values
Upper bits copy the sign bit
Preserves positive or negative meaning
👉 Always choose based on the data type, not the instruction size.
Movzx Repeated
What is happening here:
MOVZX, which stands for Move with Zero Extension, is an instruction that lets us take a value that is stored in a smaller register or memory location and copy it into a bigger register.
When we do this, the extra space in the bigger register (the upper bits) is filled with zeros so that the value stays the same. Let’s look at some examples using registers to understand how this works step by step.
1. Example: Register Operands
In this example, we used the MOVZX instruction to copy smaller pieces of data from one place into bigger registers, making sure that any extra space is filled with zeros. Let’s take it one step at a time:
mov bx, 0A69Bh – Here, we put the 16-bit value 0A69Bh into the BX register. Think of BX as a small container that can hold 16 bits (2 bytes).
movzx eax, bx – Now, we want to copy that 16-bit value from BX into the larger 32-bit register EAX. Because EAX is bigger, MOVZX fills the extra 16 bits at the top with zeros. After this, EAX contains 0000A69Bh.
movzx edx, bl – BL is the lower 8 bits of BX (just the last byte). We copy this into the 32-bit register EDX using MOVZX. The top 24 bits of EDX are filled with zeros, so the final value is 0000009Bh.
movzx cx, bl – Finally, we copy that same 8-bit value from BL into the 16-bit register CX. Since CX has space for 16 bits, the upper 8 bits are filled with zeros, giving 009Bh.
2. Example: Memory Operands
In this example, we used MOVZX again, but this time the data comes from memory instead of registers. We have two variables in memory: byte1 (8 bits) and word1 (16 bits). Here’s what happens step by step:
movzx eax, word1 – We take the 16-bit value stored in word1 and copy it into the 32-bit register EAX. Since EAX is bigger than word1, MOVZX fills the extra 16 bits at the top with zeros. After this, EAX contains 0000A69Bh.
movzx edx, byte1 – Now we take the 8-bit value in byte1 and copy it into the 32-bit register EDX. The top 24 bits are filled with zeros, so EDX becomes 0000009Bh.
movzx cx, byte1 – Finally, we copy the same 8-bit value from byte1 into the 16-bit register CX. The upper 8 bits of CX are filled with zeros, giving 009Bh.
The important idea here is that MOVZX always fills the extra bits with zeros when moving smaller values into larger registers. This keeps the value correct and avoids accidentally changing the sign of the number.
MOVSX Repeated
What is happening here?
Here, we are looking at MOVSX, which copies smaller values into larger registers while preserving the sign (positive or negative). We’re using the hexadecimal value 0A69Bh as an example. Here’s what happens step by step:
mov bx, 0A69Bh – This puts the 16-bit value 0A69Bh into the BX register. The first digit "A" in hexadecimal means that the highest bit (the sign bit) is 1, so this value is considered negative in signed representation.
movsx eax, bx – We copy the 16-bit value in BX into the 32-bit register EAX. Because we are using MOVSX, it keeps the sign the same. The upper 16 bits of EAX are filled with ones (F in hex) to preserve the negative sign. After this, EAX = FFFFA69Bh.
movsx edx, bl – Now we take the lower 8 bits of BX (called BL) and copy them into the 32-bit EDX register. MOVSX again preserves the sign. The upper 24 bits of EDX are filled with ones, so EDX = FFFFFF9Bh.
movsx cx, bl – Finally, we copy the 8-bit BL into the 16-bit register CX. The upper 8 bits of CX are filled with ones because the sign bit is 1, giving CX = FF9Bh.
Key point: The "A" in 0A69Bh shows that the number is negative in signed form. MOVSX makes sure that when we copy smaller numbers into larger registers, the sign is preserved by filling the extra bits with ones for negative numbers.
LAHF AND SAHF INSTRUCTIONS
The LAHF (Load AH from Flags) and SAHF (Store AH into Flags) instructions are used to move specific CPU flags to and from the AH register.
These instructions are useful when you want to save or restore the status of certain flags without affecting the rest of the EFLAGS register.
LAHF — Load AH from Flags
I. What it does
The LAHF instruction takes the low byte of the FLAGS register (specifically the SF, ZF, AF, PF, and CF flags) and copies it into the AH register.
This allows you to save the state of these flags for later use, such as storing them in memory or comparing them without changing the CPU state.
II. Flags included in LAHF
Note: LAHF does not load all EFLAGS, only these selected flags.
III. Example: Using LAHF
Note: LAHF does not load all EFLAGS, only these selected flags.
Step-by-step explanation:
lahf copies the low byte of FLAGS into the AH register.
mov [savedFlags], ah stores the value of AH into a memory variable called savedFlags.
Later, you can restore these flags using SAHF.
In this example, after executing LAHF, the AH register contains the values of the specified flags, and you can save these values in the saveflags variable for future reference.
SAHF (Store AH into Flags)
The SAHF instruction works in the opposite direction. It copies the value from the AH register into the low byte of the EFLAGS (or RFLAGS) register.
This allows you to restore saved flag values.
Here's an example of how to use SAHF to retrieve saved flag values from a variable:
EFLAGS / RFLAGS Register
The EFLAGS register (called RFLAGS in 64-bit mode) is like a control panel for the CPU.
It’s 32 bits (4 bytes) in 32-bit mode, and 64 bits (8 bytes) in 64-bit mode.
It stores status flags that tell the CPU what happened after each instruction.
It also has control bits that affect how the CPU works.
Some of the important flags you’ll see in EFLAGS/RFLAGS are:
Carry Flag (CF): Tells if a calculation carried out of the highest bit.
Zero Flag (ZF): Tells if the result of a calculation was zero.
Sign Flag (SF): Tells if the result was negative.
Overflow Flag (OF): Tells if a calculation overflowed the allowed range.
…and many others that help the CPU track what’s going on.
In 64-bit mode, the RFLAGS register works the same way as EFLAGS, just bigger (64 bits).
XCHG (Exchange) Instruction
The XCHG instruction in x86 assembly is used to swap the contents of two operands.
Unlike doing a manual swap using a temporary register, XCHG can handle this directly, which makes your code shorter, cleaner, and sometimes faster.
There are three main variants:
XCHG reg, reg – exchanges contents between two registers.
XCHG reg, mem – exchanges contents between a register and a memory location.
XCHG mem, reg – exchanges contents between a memory location and a register.
Let’s break each one down, with examples and practical notes.
1. XCHG reg, reg
What it does: Swaps the contents of two registers.
No memory access is required—everything stays inside the CPU registers.
Useful when you need to reorder values quickly, e.g., swapping two counters.
Example:
Result after execution:
AX = 10
BX = 5
Key Points:
Fast, because it only involves registers.
Atomic on most processors, which means it can safely be used in some synchronization tasks (like implementing locks).
2. XCHG reg, mem
What it does: Swaps the value of a register with a value in memory.
Now the CPU has to read from and write to RAM, which is slower than register-only operations.
This is very handy when you need to update a variable without using an extra temporary register.
Example:
Result:
BX now contains the previous value of var1.
var1 now contains 20.
Why it matters:
Helps in swapping array elements directly in memory.
Can also be used in simple locking mechanisms, because XCHG is atomic when working with a register and memory.
3. XCHG mem, reg
What it does: Swaps a memory location with a register. This is technically the same as XCHG reg, mem because the CPU treats memory-register swaps identically, but conceptually it’s about memory being the first operand.
Example (swapping two memory locations using a temporary register):
Result:
val1 and val2 are swapped.
AX temporarily holds the value during the swap.
Why it matters:
Direct memory-to-memory swap isn’t supported on x86, so this pattern (using a register as a “bridge”) is essential.
Shows how XCHG can be combined with MOV to manipulate memory efficiently.
Practical Tips & Takeaways
Atomicity: XCHG is inherently atomic when one operand is a memory location. This makes it useful in multithreading scenarios.
Performance: Register-register swaps are fastest; memory involvement slows things down.
Common Uses:
Swapping variables (obvious!)
Sorting algorithms (like bubble sort)
Implementing simple locks or semaphores in low-level concurrent code
Direct-Offset Operands
Direct-offset operands let you access memory at a specific location by starting from a variable’s base address and adding an offset.
They’re useful for arrays or structures, where you want to pick a specific element.
The offset is just a number that tells the CPU how far from the base address to go.
Example concept:
If you have arrayB, then arrayB + 20 is the effective address of the second element (first element is at arrayB + 0).
Adding brackets, like [arrayB + 20], tells the assembler: “I want the value stored at this memory location, not the address itself.”
Tip:
You don’t have to use brackets in MASM, but it’s much clearer if you do.
MASM doesn’t check if your address is in range, so if your array has 20 bytes and you access arrayB + 20, you’ll read memory outside the array. This won’t cause an immediate error but can create a sneaky logic bug.
I. Word and Doubleword Arrays
When accessing word and doubleword arrays, you need to take into account the size of the array elements.
For example, the following code shows how to access the second element in an array of 16-bit words:
To put it simply, [arrayW + 2] is a way of telling the computer:
"Start at the beginning of the list called arrayW, skip the first item, and look at the second one."
II. Why do we add 2?
Computers don't see "items" in a list; they see bytes (tiny units of storage).
In this specific list (arrayW), every single item takes up 2 bytes of space.
To get to the second item, you have to jump over those first 2 bytes.
The code [arrayW+2] tells the computer to skip the first 2 bytes of the list. Since each item in this list is 2 bytes big, this jump moves you exactly to the start of the second item.
In the same way, if the items are "doublewords" (4 bytes big), you would add 4 to reach the second item.
The code [arrayD+4] tells the computer to skip the first 4 bytes to reach the second item in the list. This is because each item in this list is 4 bytes wide.
You must be careful: the computer will not stop you if you jump too far.
If you use a number that is too large, you might accidentally look at memory that doesn't belong to your program, which can cause errors.
III. Growing Your Data (MOVZX vs MOVSX)
When you move a small piece of data into a bigger box, you have to decide what to do with the empty space.
MOVZX (Zero Extension): This is the "clean" one. It just fills the empty space with zeros. It’s perfect for positive numbers.
MOVSX (Sign Extension): This is the "smart" one. It looks at the very first bit of your number. If it’s a negative number, it fills the empty space with 1s to keep the number negative. If it's positive, it uses 0s.
IV. The Swap (XCHG)
Usually, to swap two things, you need a third hand. XCHG is like a magic trick—it swaps the values in two places at the exact same time without needing a middleman.
V. Jumping Through Arrays
Computers don't see "lists"; they see a long line of bytes.
To get to the next Byte, you move 1 step.
To get to the next Word, you move 2 steps.
To get to the next Doubleword, you move 4 steps.
VI. The Exit
ExitProcess is just the "Power Off" button. It tells Windows, "I'm done, you can take your memory back now."
DEBUGGING
Here, we compiled the program and started debugging it in x64dbg instead of visual studio community, coz it doesn't have the registers option:
You can see the registers, flag registers, memory dumps, stack etc.
Each flag is assigned a value of 0 (clear) or 1 (set).
Here’s an example:
Yes, it’s completely normal for a disassembled EXE to have more lines of code than what you actually wrote.
That’s because the compiler adds extra code to handle things like library functions, operating system calls, and other behind-the-scenes tasks.
In the disassembly, you can see the compiler has added instructions to set up the stack, call main(), and exit the program.
It also includes code to handle errors, like division by zero or invalid memory access.
On top of that, any libraries your program uses—like the C standard library—bring in their own code, which gets included in the EXE.
And sometimes, other programs, like debugging tools or malware, can add code too.
If you want to dig deeper into what’s in the disassembled EXE, a debugger lets you step through the code and inspect registers and memory. You can also use a disassembler to get a detailed view of the assembly instructions.
x64dbg – Human-Friendly Window & Operand Guide
Debugging with x64dbg can feel overwhelming at first because there are a lot of windows and registers to keep track of.
Let’s break it down step by step, in plain English, explaining what each window shows, why it matters, and how you can use it.
1. Call Stack Window
What it is:
Think of the call stack as a breadcrumb trail of function calls that led to where the program is now.
How it works:
Functions are listed in reverse order. The top entry is the most recent function that was called.
You can see which function called it, and which function called that one, all the way back to the start of the program.
Why it matters:
Helps you figure out how you got here in the program.
Useful for tracking nested function calls and debugging crashes (like “segmentation faults”).
How to open:
View > Call Stack
2. Memory Map Window
What it is:
The Memory Map shows a bird’s-eye view of the program’s memory.
What you see:
Loaded modules (DLLs, EXEs)
Data segments (variables)
Stack and heap areas
Why it matters:
Helps you understand where different pieces of the program are in memory.
Essential when working with pointers, arrays, or investigating crashes due to invalid memory access.
How to open:
View > Memory Map
3. Command Window
What it is:
This is a text-based control panel where you type debugger commands directly.
Things you can do:
Start and stop the debugger
Set and remove breakpoints
Step through code one instruction at a time
Inspect or change values in registers or memory
Why it matters:
Gives you full control of the debugger
Great for advanced users who want precision over what the debugger is doing
4. Registers Window
What it is:
Registers are like tiny, ultra-fast memory slots inside the CPU. They store:
Current instruction pointer
Calculation results
Addresses for memory operations
Why it matters:
By watching registers, you can see exactly what the CPU is thinking.
Essential for understanding how the program is executing at the machine level.
5. Disassembly Window
What it is:
Shows machine code translated into assembly instructions.
Why it matters:
Lets you see what the CPU is actually doing
Critical for reversing programs or debugging tricky bugs
6. Variables Window
What it is:
Displays program variables and their current values.
Why it matters:
Lets you track how data changes over time
Useful for debugging logic errors or tracking unexpected program behavior
7. Threads Window
What it is:
Shows all threads (independent paths of execution) in a program.
Why it matters:
Lets you pause, resume, or switch between threads
Essential for debugging multithreaded programs
8. Breakpoints Window
What it is:
Lists all breakpoints, which are markers telling the debugger to pause execution at specific points.
Why it matters:
Lets you inspect program state at critical moments
Makes it easier to debug step by step instead of running blindly
Assembly Operand Basics (Human-Friendly Version)
Assembly instructions operate on operands, which are the “things” being manipulated. There are three basic types:
1. Register operands – stored in CPU registers (e.g., EAX, ECX)
2. Memory operands – stored in memory locations (e.g., [var1])
3. Immediate operands – fixed values built into the instruction itself (e.g., 42)
MOV Instruction Tips
Destination and Source:
The first operand = destination (where the data goes)
The second operand = source (where the data comes from)
Limitations:
Destination cannot be a segment register or the instruction pointer (EIP)
You can’t move data directly into special registers like EIP
Operand Notation
reg/mem32 → can be a 32-bit register or a 32-bit memory location
imm16 → a 16-bit immediate value (a number hard-coded into the instruction)
ADDITION AND SUBTRACTION
I. INC (Increment)
The INC instruction increments the value of a register or memory operand by 1. Here's the syntax:
For example, if you have a data segment with a variable myWord initialized to 1000h, you can use INC like this:
II. DEC (Decrement)
The DEC instruction decrements the value of a register or memory operand by 1. Its syntax is similar to INC:
For instance, you can use DEC to decrement the value stored in the bx register:
III. Flags Affected by INC and DEC
When you use the INC (increment) or DEC (decrement) instructions, they don’t just change a number—they also play with the CPU’s status flags. Here’s what happens:
Overflow Flag (OF): Changes if the operation causes a signed overflow.
Sign Flag (SF): Reflects the new sign (positive or negative) of the result.
Zero Flag (ZF): Turns on if the result ends up being zero.
Auxiliary Carry Flag (AF): Updates for certain lower-nibble carries—useful for BCD arithmetic.
Parity Flag (PF): Indicates whether the number of set bits in the result is even or odd.
Fun twist: The Carry Flag (CF) doesn’t budge! That’s a little unexpected since we usually associate arithmetic operations with the carry. So, INC and DEC can change a lot of flags, but CF stays put.
💡 Pro tip: In assembly, tiny details matter. One increment or decrement can subtly shift the CPU’s state. Knowing exactly which flags are affected helps you predict behavior and write programs that behave like clockwork.
ADD INSTRUCTION
The ADD instruction is used to add a source operand to a destination operand of the same size. Here's the syntax:
or
Assembler doesn't care about capital letters!
The source operand remains unchanged.
The sum of the operands is stored in the destination operand.
For example, let's add two 32-bit integers:
The instruction affects various CPU flags, including Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity.
How these flags change depends on the result placed in the destination operand.
SUB INSTRUCTION
The SUB instruction subtracts a source operand from a destination operand. The syntax is the same as for ADD:
Like ADD, the source operand remains unchanged.
The result of the subtraction is stored in the destination operand.
For example, let's subtract two 32-bit integers:
Again, this instruction affects CPU flags such as Carry, Zero, Sign, Overflow, Auxiliary Carry, and Parity based on the value stored in the destination operand.
The SUB instruction subtracts var2 from var1. So, in this case:
var1 is the source operand, and it contains the value 30000h.
var2 is the destination operand, and it contains the value 10000h.
The SUB instruction subtracts var2 from var1, resulting in EAX being set to 20000h. Therefore, var2 is subtracted from var1.
NEG INSTRUCTION
The NEG (negate) instruction reverses the sign of a number by converting it to its two's complement.
It can be applied to registers or memory. Here's the syntax:
To find the two's complement, reverse all the bits in the destination operand and add 1.
For example, to negate the value in EAX:
As with ADD and SUB, the NEG instruction also affects CPU flags based on the result.
IMPLEMENTING THE ARITHMETIC EXPRESSIONS
With the ADD, SUB, and NEG instructions, you can implement arithmetic expressions in assembly language.
You can break down an expression into individual operations and combine them.
For instance, if you want to calculate Rval = -Xval + (Yval - Zval), you can:
NEGATING A VARIABLE WITHOUT REGISTERS IN ASM
Good news: you don’t need a separate register just to negate a variable. You can do it directly!
In my earlier example, I used extra registers to make the steps super clear—but in real-world coding, you can streamline things for efficiency.
Here’s a leaner version: it negates Xval and calculates Rval without using any extra register:
💡 Tip: Using fewer registers keeps your code clean and can make it run faster. But sometimes, extra registers help with readability—so pick what works best for your situation.
SIGNED NUMBERS
Signed numbers let us represent both positive and negative values in binary. They do this by reserving one bit for the sign and using the rest to show the number’s magnitude (absolute value). The most common system for this is Two’s Complement.
Here’s the magic behind it:
Sign Bit: The leftmost bit (most significant bit) decides the sign.
0 → positive
1 → negative
Magnitude Bits: The remaining bits show the number’s size.
Negating a number in Two’s Complement is simple:
Flip all the bits (turn 0s → 1s and 1s → 0s).
Add 1 to the result.
Examples:
0010 → +2 (positive, because the sign bit is 0)
1010 → -2 (negative, sign bit is 1)
Step 1: Invert all bits → 0101
Step 2: Add 1 → 0101 + 1 = 0110 → -2
💡 Tip: Think of Two’s Complement as a clever trick: it lets addition and subtraction work seamlessly, even with negative numbers, without needing separate rules.
UNSIGNED NUMBERS
Unsigned numbers are the “always-positive” kind. They only represent zero or positive values, because they use all their bits just to show the magnitude—no sign bit needed.
Examples:
0010 → 2
There’s no way to represent negative numbers here, so things like negation just don’t exist.
Quick takeaway:
Signed numbers: have a sign bit, so they can go positive or negative.
Unsigned numbers: all positive, max value bigger for the same number of bits.
💡 Tip: Pick signed or unsigned based on what your program actually needs—don’t waste bits on a negative that will never show up!
SIGNED VS UNSIGNED NUMBERS: AVOIDING CONFUSION
Sometimes people see a binary number starting with 1 and immediately think, “Ah! It must be a negative signed number!” 🤔
This is a classic mix-up if you don’t know whether the number is signed or unsigned.
Here’s how to prevent that confusion:
1. Label it clearly
Use descriptive variable names: signedValue vs unsignedValue.
Add comments in your code:
2. Stick to standard conventions
Signed numbers → Two’s Complement
Unsigned numbers → plain binary
Knowing the convention tells you how to interpret the leading bit.
3. Use the correct data types
If your language supports it, pick int (signed) or uint (unsigned).
Be consistent: don’t mix signed and unsigned in the same calculation.
4. Document & educate
Always note your assumptions about signed vs unsigned numbers.
Teach your team: the leading 1 doesn’t automatically mean negative—it depends on the representation.
5. Test and validate
Run small examples to check your assumptions.
Example: 0010 → 2 unsigned, +2 signed; 1010 → 10 unsigned, -6 signed.
6. Keep operations separate
Don’t mix signed and unsigned arithmetic.
Separate them in code to avoid mistakes.
💡 Memory Trick:
Think of signed numbers as “the leading bit is a mood indicator”:
0 → happy/positive
1 → grumpy/negative
Unsigned numbers? They’re always happy—no grumpiness allowed!
CPU STATUS FLAGS/ FLAG REGISTERS: WHY THEY MATTER
When your program does arithmetic, you often want to know:
Is the result positive, negative, or zero?
Does it fit within the range of the variable storing it?
This is where CPU status flags come in—they’re tiny indicators that give you the answers instantly.
These flags are super important:
They help detect errors.
They let your program make decisions based on the result (like branching or looping).
Here's a straightforward overview of the main status flags.
First, remember this, MSB is the leftmost bit:
CARRY FLAG
Set when the result of an unsigned arithmetic operation is too large to fit into the destination operand.
In this example, when you add 1111 and 1101, you have a carry-out from the MSB (leftmost bit), resulting in a final carry-out value of 11000. The Carry flag would be set to indicate this carry-out condition.
For subtraction, there's typically no carry-out from the MSB, as subtraction involves taking away one value from another:
In this subtraction, there's no carry-out from the LSB, so the Carry flag would not be set. The carry-out occurs from the least significant bit (LSB), and propagates towards the most significant bit (MSB) during addition, not from the MSB.
This flag indicates unsigned integer overflow. For instance, if an operation has an 8-bit destination operand, but the result exceeds 255 in binary (11111111), the Carry flag is set.
Another example, subtracting 2 from 1 sets the carry flag.
The Carry Flag (CF) is like the CPU’s little “overflow detector” for the most significant bit (MSB). It tells you whether a number got too big to fit in the bits you have.
CF = 1 → Carry happened
This means the sum overflowed the MSB during an addition.
In plain English: the number got too big to fit, so a carry was “pushed out” of the leftmost bit.
CF = 0 → No carry
Nothing overflowed; the result fits perfectly in the allocated bits.
💡 Think of it like this:
CF = 1 → “Oops! Too big!” 😬
CF = 0 → “All good, fits just right!” 😎
In short, CF helps your program know when numbers spill over the size limit of their binary container.
AUXILIARY CARRY FLAG
This flag is set when a 1 bit carries out of position 3 in the least significant byte of the destination operand.
Suppose we want to add 1 to the BCD value 0Fh:
Here's the binary representation and the arithmetic:
1. Binary Representation of 0Fh: 00001111
2. Binary Representation of 1: 00000001
Now, let's add them together, keeping track of the AC flag:
The Auxiliary Carry (AC) flag is a bit like the “bit 3 whisperer” of the CPU—it tells you if a carry happened from bit 3 to bit 4 (counting from the right, starting at bit 0).
This is especially handy in BCD (Binary-Coded Decimal) arithmetic, where numbers are handled in a decimal-like way using binary.
Example:
Adding 0Fh + 1 → the result is 10h.
Notice that a carry goes from bit 3 to bit 4.
That’s exactly what the AC flag tracks:
AC = 1 → Carry from bit 3 to bit 4 occurred
AC = 0 → No carry from bit 3 to bit 4
💡 Think of it like this:
Bit 3 is having a party. If someone “overflows” into bit 4, AC says,
“Hey, bit 4 got some action!” 🎉
If not, AC stays 0, calm and quiet.
In short, AC is a tiny but important helper for operations where carries between the lower nibbles matter, like in BCD arithmetic.
PARITY FLAG (PF)
The Parity Flag (PF) is the CPU’s way of counting 1s in the least significant byte (the rightmost 8 bits) of a result after an arithmetic or logical operation.
PF = 1 → Even parity
The byte has an even number of 1s.
PF = 0 → Odd parity
The byte has an odd number of 1s.
💡 Think of it like this:
PF is like a little “even-odd checker” for your bits.
After each operation, it asks: “Do we have an even number of 1s?” ✅
If yes → PF = 1 (party’s even!)
If no → PF = 0 (party’s odd!)
It’s mostly used for error checking and low-level debugging, but knowing it can also help you understand subtle CPU behaviors.
💡 Example 1: ADD Instruction
After the ADD instruction, AL contains the binary value 10001110.
In this binary representation, there are four 0 bits and four 1 bits.
Since there is an even number of 1 bits (four), the Parity flag (PF) is set to 1.
This means that PF indicates that the result has an even parity.
💡 Example 2: SUB Instruction
Since there is an odd number of 1 bits (one), the Parity flag (PF) is set to 0. (odd parity)
Parity Flag (PF) – Quick Example
Think of PF as the CPU’s “even-odd bit detector.”
After ADD, AL = 1111 (four 1s → even) → PF = 1 ✅
After SUB, AL = 0001 (one 1 → odd) → PF = 0 ❌
In short: PF tells you whether the number of 1 bits in the result is even or odd.
It’s handy for error detection or any situation where parity matters.
OVERFLOW FLAG
I. Signed Numbers and Overflow
An overflow is set when the result of a signed arithmetic operation is too large or too small to fit into the destination operand.
Signed Numbers
A signed number is typically an 8-bit value (it can also be 16-bit) that is evenly divided between positive and negative numbers.
The Most Significant Bit (MSB) determines the sign:
MSB = 0 → Positive number
MSB = 1 → Negative number
For example, the binary number 1011 0011 has MSB = 1, so it is negative.
Taking its two’s complement gives 0100 1101, which is 4D in hexadecimal and represents −77 in decimal.
II. Range of Signed Numbers
For an 8-bit signed number:
The range is −128 to +127 (hexadecimal 80 to 7F).
This includes:
128 positive values (0 to +127)
128 negative values (−1 to −128)
III. Overflow Detection
Overflow occurs when the result of an arithmetic operation exceeds the valid signed number range.
If a positive result produces MSB = 1, a positive overflow has occurred.
If a negative result produces MSB = 0, a negative overflow has occurred.
IV. Overflow Detection Mechanism
To detect overflow, the processor performs an XOR operation on the carry flags C6 and C7 during arithmetic operations.
C0 is the carry from bit 0 to bit 1, C1 from bit 1 to bit 2, and so on.
If the XOR of C6 and C7 equals 1, it indicates that the result has exceeded the limits of signed numbers and an overflow has occurred.
The image represents the carry flags from bit 0 to bit 7 that are generated during an arithmetic operation.
The Overflow flag (OF) is set when the result of a signed arithmetic operation is too large or too small to fit into the destination operand. In the case of 8-bit signed numbers, overflow occurs when the result is:
Greater than +127, or
Less than −128
V. How Overflow Is Detected
To detect overflow, the processor performs an XOR operation on the carry flags C6 and C7:
C6 is the carry into the sign bit (bit 7)
C7 is the carry out of the sign bit
If the result of C6 XOR C7 = 1, then an overflow has occurred, and the Overflow flag is set.
VII. Example
Consider adding two 8-bit signed numbers: 127 + 1
The mathematical result is 128
However, 128 is outside the valid range of an 8-bit signed number
As a result, overflow occurs, and the Overflow flag is set.
The table shows how the carry flags (C6 and C7) and the Overflow flag are affected during this addition. Since the XOR of C6 and C7 equals 1, the processor correctly detects an overflow condition.
VIII. Why the Overflow Flag Is Useful
The Overflow flag helps detect errors in signed arithmetic operations.
For example, if a program expects the result of an operation to stay within a specific signed range, it can check the Overflow flag:
OF = 0 → Result is valid
OF = 1 → Overflow occurred, and corrective action may be needed
IX. Using the Overflow Flag in Assembly Language
Here is an example of how the Overflow flag can be used in assembly language:
X. Using the Overflow Flag in Assembly Language
The assembly code shown is a simple example of how the Overflow flag (OF) can be used to detect overflow during an addition operation. Here’s how it works:
The instruction
add al, 1
adds the value 1 to the AL register.
After the addition, the processor updates the status flags. The instruction jno no_overflow (Jump if No Overflow) transfers control to the no_overflow label if the Overflow flag is not set.
If the Overflow flag is set, execution continues to the overflow label, where the program can handle the error or take corrective action.
XI. Why Overflow Occurs
The Overflow flag is set when the result of a signed arithmetic operation is too large (or too small) to fit into the destination operand.
In this example, the destination operand is the AL register, which is an 8-bit signed register.
The valid range for an 8-bit signed value is −128 to +127.
If the result of the addition exceeds +127 or goes below −128, an overflow occurs, and the Overflow flag is set.
XII. Key Takeaway
The Overflow Flag (OF) is used to detect overflow in signed arithmetic operations.
It plays a crucial role in ensuring correct results when working with signed data types.
The processor detects overflow by performing an XOR operation on carry flags C6 and C7.
If the result of this XOR operation is 1, the Overflow flag is set, indicating that the signed result is invalid
ZERO FLAG
Set when the result of an arithmetic operation is zero.
For example, subtracting two equal values will set the Zero flag.
SIGN FLAG
The Sign flag (SF) is set when the result of an arithmetic operation is negative.
It directly reflects the value of the Most Significant Bit (MSB) of the destination operand.
In simple terms, the Sign flag copies the MSB of the result:
MSB = 1 → SF = 1, indicating a negative result
MSB = 0 → SF = 0, indicating a positive result
So, the Sign flag gives a quick way to determine whether the result of an operation is positive or negative.
I. Control of Status Flags
Status flags, including the Sign flag, are controlled automatically by the Arithmetic Logic Unit (ALU).
This means:24861
The user cannot directly set or clear these flags.
Flags are updated automatically based on the result of arithmetic or logical operations.
This behavior ensures that flags always accurately represent the outcome of the last operation.
II. Using Status Flags to Detect Errors
Status flags are commonly used to check for errors or special conditions after arithmetic operations.
Some important examples include:
Overflow Flag (OF):
If OF is set after an addition, it means the result is too large or too small to fit into the destination operand (signed overflow).
Zero Flag (ZF):
If ZF is set, the result of the operation is zero. This is useful for comparisons and decision-making in programs.
III. Role of the Sign Flag in Error Detection
The Sign flag can also help detect unexpected results. For example:
If two positive numbers are added together, the result should also be positive.
If the Sign flag is set after such an operation, it indicates that something has gone wrong (such as overflow).
This makes SF useful not just for identifying negative values, but also for validating arithmetic results.
IV. Determining the Sign of a Result
The Sign flag can be directly used to determine the sign of a number:
SF = 1 → Result is negative
SF = 0 → Result is positive
This is especially useful in conditional branching and decision-making instructions.
V. Effect of Subtraction on the Sign Flag
The provided code demonstrates how the Sign flag is affected by a subtraction operation.
First, a value is moved into a register.
Then, a larger value is subtracted from it.
Since the subtracted value is greater than the original value, the result becomes negative.
As a result, the Sign flag is set to 1, indicating a negative outcome.
In the second example, subtraction is performed on an 8-bit register. The result of the subtraction is −1, which is represented as FFh in hexadecimal (two’s complement form).
Because the MSB of this result is 1, the Sign flag is set.
VI. Final Note
The Sign flag is a powerful and simple tool for:
Identifying negative results
Validating arithmetic operations
Supporting decision-making in programs
By understanding how the Sign flag works alongside other status flags, you gain a much clearer picture of how the CPU processes arithmetic operations and handles errors.
CONTROL FLAGS
Control flags are special flags that control how the CPU operates. In the 8086 microprocessor, there are three control flags:
I. Trap Flag (TF)
When TF = 1, the CPU enters single-step mode.
In this mode, the CPU executes one instruction at a time and then pauses.
This is especially useful for debugging, as it allows programmers to closely observe how each instruction affects registers and flags.
II. Interrupt Flag (IF)
When IF = 1, the CPU allows interrupts.
Interrupts are signals from external devices (such as keyboard, timer, or I/O devices) that require the CPU’s attention.
When an interrupt occurs, the CPU:
Temporarily stops the current program
Executes the interrupt service routine
Returns to the original program after handling the interrupt
III. Direction Flag (DF)
The Direction flag controls the direction of string operations.
When DF = 0, the CPU increments the string pointer after each operation.
When DF = 1, the CPU decrements the string pointer after each operation.
(Image showing TF, IF, and DF in the 8086 flag register)
By default, these flags are cleared (set to 0), which means:
The CPU is not in single-step mode
Interrupts are disabled
String operations move in the forward direction
IV. Uses of Control Flags
Debugging
The TF flag can be used to execute a program one instruction at a time.
This makes it easier to trace program flow and identify errors.
For example, TF can help observe how string instructions behave when the DF flag changes.
Disabling or Enabling Interrupts
• The IF flag allows programmers to disable interrupts when needed.
• This is useful during debugging or while executing critical code sections where interruptions are undesirable.
Controlling String Operations
The DF flag determines whether string operations move forward or backward in memory.
For example, setting DF = 1 can be useful when reversing a string.
V. Summary
TF → Controls single-step execution (debugging)
IF → Enables or disables interrupts
DF → Controls the direction of string operations
Control flags play a crucial role in managing CPU behavior. Understanding how they work helps in debugging programs, handling interrupts efficiently, and performing advanced string operations.
OFFSET
The image shows myByte variable, an offset, and the data segment.
The data segment is a section of memory that stores program data eg variables, constants, strings.
The offset is the distance from the beginning of the data segment to the variable myByte.
I. Data Segment, Offset, and Memory Addressing
If the data segment starts at memory address 1000h and the offset of a variable myByte is 200h, then the variable will be stored at:
1000h + 200h = 1200h
So, myByte is located at memory address 1200h.
II. What Is an Offset?
The offset is the distance of a variable from the start of its segment.
The processor uses this offset to locate variables in memory.
When the CPU needs to access myByte, it:
Takes the starting address of the data segment
Adds the offset of myByte
Uses the result as the memory address
The operating system keeps track of:
The starting address of each program’s data segment
The size of that data segment
This helps protect one program’s data from being accessed by another program.
IV. Correct Assembly Example (8086-style)
Here is a clean and correct example showing how to access myByte:
V. How This Code Works (Step-by-Step)
mov ax, data_segment
Loads the starting address of the data segment into the AX register.
add ax, offset myByte
Adds the offset of myByte to AX, giving the exact memory address of the variable.
mov al, [ax]
Loads the value stored at that memory address into AL.
The [ax] notation means “use the memory location whose address is stored in AX”.
This is called dereferencing the register.
After execution, AL contains the value of myByte.
Is the Statement About Offset Correct?
Yes — it is correct. ✅
The offset is added to the data segment’s starting address to get the memory address of a variable. This is exactly how the 8086 microprocessor accesses data in memory.
VI. Key Points About Offsets
The offset is a 16-bit value
It allows access to up to 64 KB of memory
This matches the maximum size of a data segment in the 8086
VII. Instructions That Use Offsets
mov — moves data between registers and memory
lea — loads the effective address (offset) of a variable into a register
Example:
VIII. Final Summary
A variable’s memory address = data segment address + offset
The offset tells the CPU where the variable is located inside the segment
[ax] means “read the value from the memory address stored in AX”
Understanding offsets is essential for working with segments, pointers, and memory in assembly language
HOW OFFSETS WORK IN ASSEMBLY
The assembly statement: pArray DWORD bigArray, defines a doubleword (DWORD) variable named pArray and initializes it with the address of bigArray.
In simple terms, pArray acts as a pointer that points to the first element of the bigArray.
The array bigArray contains 500 doubleword (DWORD) elements, so pArray holds the starting address of this block of memory.
I. Using the OFFSET Operator
The OFFSET operator is commonly used to obtain the offset address of a variable in memory.
When accessing a variable, the processor adds this offset to the starting address of the data segment to calculate the variable’s actual memory address.
In other words:
Memory Address = Data Segment Base + OFFSET of the Variable
II. Accessing a Variable Using OFFSET
To access a variable like bVal in memory, you first load its offset and then read the value from that address.
When the code executes:
The OFFSET operator provides the location of bVal within the data segment.
The processor uses this offset, along with the data segment’s base address, to locate bVal in memory.
The value stored at that memory location can then be accessed or moved into a register.
III. Dereferencing and the ESI Register
Dereferencing means accessing the data stored at a memory address.
In this case, the address is stored in the ESI register, and the CPU uses that address to read the actual data from memory.
(If this feels confusing, revisiting pointer book in my C Github repo can really help.)
IV. Effect of the MOV Instruction with OFFSET
After the code executes, the ESI register contains the offset of the variable specified in the MOV instruction.
For example:
After the first MOV instruction, ESI contains the offset of the variable bVal.
This offset is 00404000h, which is the memory location where bVal is stored.
After the second MOV instruction, ESI contains the offset of the variable wVal.
This offset is 00404001h, which is the memory location where wVal is stored.
The same pattern continues for subsequent variables:
each MOV instruction loads the offset of the specified variable into ESI.
Once the offset is in ESI, the processor can dereference it (for example, using [ESI]) to access the actual data stored at that address.
V. Why the OFFSET Operator Is Important
The OFFSET operator is a powerful and commonly used tool in assembly language. It is especially useful for:
Accessing variables in memory
Initializing pointer registers like ESI
Working with arrays, structures, and buffers
In short, OFFSET gives you the address of a variable, and dereferencing that address lets you access the value stored there.
VI. Accessing an Array Element Using OFFSET and ESI
The first line of code in the data segment declares an array named myArray that contains five 16-bit (WORD) values.
In the code segment, the next instruction moves the offset of myArray plus 4 into the ESI register.
The OFFSET operator returns the offset of a variable or label, which is the distance from the beginning of the data segment to that variable or label.
In this case, since myArray is the first variable in the data segment, its offset is 0000h.
Each element in myArray is a WORD (2 bytes):
First element → offset 0000h
Second element → offset 0002h
Third element → offset 0004h
By adding 4 to the offset of myArray, we obtain the offset of the third integer in the array.
VII. Using ESI as a Pointer
The ESI register is a general-purpose register that can store addresses.
By loading the offset of the third element of myArray into ESI, we are effectively creating a pointer to the third integer in the array.
Once ESI holds this address, it can be used to access the value stored at that location.
For example, the following instruction loads the value of the third integer in myArray into the AL register:
VIII. Calculating offsets
For arrays…
BYTE Arrays:
If you are working with a BYTE array, where each element is 1 byte in size, you can access the next element by adding +1 to the current element's address.
WORD Arrays:
For WORD arrays, where each element is 2 bytes (16 bits) in size, you should add +2 to move to the next element's address.
DWORD Arrays:
DWORD arrays have elements that are 4 bytes (32 bits) in size. To access the next element, add +4 to the current element's address.
QWORD Arrays:
If you are dealing with QWORD arrays, where each element is 8 bytes (64 bits) in size, you should add +8 to move to the address of the next element.
ALIGN DIRECTIVE
The ALIGN directive is used to place a variable or label on a specific memory boundary, such as a byte, word, doubleword, or paragraph boundary.
Syntax: ALIGN bound
Where bound can be 1, 2, 4, or 16.
For example:
ALIGN 1 → byte boundary
ALIGN 2 → word boundary
ALIGN 4 → doubleword boundary
ALIGN 16 → paragraph boundary
Using ALIGN 4 ensures that the next variable starts at a memory address that is a multiple of 4.
I. Why ALIGN Is Important
Proper alignment can improve performance, especially for certain instructions and hardware units.
For example, the FPU (Floating Point Unit) works more efficiently when data is aligned on a doubleword boundary.
II. PTR Operator and Data Types
The PTR operator is used when moving data between memory and registers to explicitly tell the assembler how much data should be accessed.
Common data types used with PTR include:
BYTE, SBYTE
WORD, SWORD
DWORD, SDWORD
FWORD, QWORD, TBYTE
The PTR operator ensures that the processor interprets the data with the correct size, especially when the variable’s declared type does not match the destination register size.
III. Moving Smaller Data into Larger Registers
Sometimes, you may want to move smaller memory values into a larger register. This is where the PTR operator becomes very useful.
Example:
What’s Happening Here?
wordList contains two WORD values
DWORD PTR tells the assembler to treat the first two WORDs (4 bytes) as a single doubleword
These 4 bytes are loaded into EAX, combining them into one 32-bit value
IV. ALIGN vs PTR — Not the Same Thing
Even though they’re often mentioned together, ALIGN and PTR serve completely different purposes:
Example of ALIGN Directive:
Here:
ALIGN 4 ensures myDouble starts at an address divisible by 4
This improves access efficiency and guarantees proper alignment
V. Final Takeaway
ALIGN decides where data lives in memory
PTR decides how big the data is when you access it
They solve different problems but are both essential for clean, efficient assembly code
BYTE PTR OPERATOR
Think of BYTE PTR as a way to tell the computer, "I know this variable is big, but I only want to look at one byte of it right now."
By default, if you have a large variable (like a 4-byte Doubleword), the assembler expects you to move all 4 bytes at once.
Using the PTR operator lets you override that. It’s like using a magnifying glass to pick out a specific "chunk" of data rather than grabbing the whole box.
You'll see this a lot when moving data between memory and registers—it's basically a size-check to make sure the data fits perfectly into the register you're using.
Here is how you'd use BYTE PTR to pluck a single byte out of a larger myDouble variable:
myDouble is a doubleword(4 bytes), so the first byte is 78h.
This code will move the value of the least significant byte of the variable myDouble into the register BL.
Memory Layout of the myDouble Variable. The following image shows the memory layout of the myDouble variable:
The least significant byte of the variable myDouble is at offset 0000h. The most significant byte of the variable myDouble is at offset 0003h.
I. Moving Smaller Values into Larger Destinations
We can use the BYTE PTR operator to move smaller values into larger destination operands.
For example, the following code moves two words from memory to the register EAX:
II. Grabbing Data with PTR
This code will move the two words from the wordList array into the lower half and upper half of the register EAX, respectively.
Think of PTR as a size filter.
If you have a big 32-bit variable (a DWORD), the computer usually wants to grab the whole thing. PTR lets you change the rules:
DWORD PTR: You’re saying, "Give me the whole 4-byte chunk."
This fills the entire EAX register.
BYTE PTR: You’re saying, "Just give me the very first byte (the 'LSB')."
This fits perfectly into the tiny AL register.
III. ALIGN vs. PTR: What’s the difference?
They aren't even doing the same job.
ALIGN is for Organization: Imagine you’re parking cars. ALIGN makes sure every car starts exactly at a line so they’re easy to find and the computer can "drive" to them faster. It lives in your data section.
PTR is for Execution: This is used inside your code (the instructions). It tells the computer exactly how much data to "pick up" right now—whether it's a tiny 1-byte crumb or the whole 4-byte cookie.
TLDR: ALIGN organizes the shelf; PTR decides how much you’re taking off the shelf.
TYPE, LENGTHOF, SIZEOF, LABEL
The TYPE operator returns the size, in bytes, of a single element of a variable.
Think of the TYPE operator as a quick way to ask the computer, "How much space does one single item in this list take up?"
Instead of you having to remember if a variable is a Byte or a DWORD, you just use TYPE. It returns a simple number representing the byte count:
Byte: Returns 1
Word: Returns 2
Doubleword (DWORD): Returns 4
Quadword (QWORD): Returns 8
I. Why use it?
It’s much safer than hard-coding numbers. If you ever change your array from WORD to DWORD, using TYPE means your code updates itself automatically without you having to hunt down every "+2" and change it to a "+4."
II. The Rest of the "Size" Family:
LENGTHOF: Counts how many items are in the array.
SIZEOF: Multiplies LENGTHOF by TYPE to give you the total bytes the whole array uses.
LABEL: Gives a name to a memory location without actually taking up any space itself.
Think of LENGTHOF as a simple counter that tells you how many "items" are in a list, rather than the total number of bytes.
It’s smart with DUP: If you tell the computer to repeat an item 50 times using 50 DUP(0), LENGTHOF knows there are 50 items there.
The "First Line" Catch: This is the part that trips people up. If you start an array on one line and continue it on the next line (without using a comma or specific formatting), LENGTHOF only counts what it sees on that very first line. It’s like it stops reading the list as soon as it hits the end of the row.
III. The Cheat Sheet
If you have myArray WORD 10, 20, 30:
TYPE is 2 (because it's a Word).
LENGTHOF is 3 (because there are 3 numbers).
SIZEOF is 6 (3 items × 2 bytes each).
IV. SIZEOF OPERATOR
The SIZEOF operator returns a value that is equivalent to multiplying LENGTHOF by TYPE. For example:
IntArray is indeed an array of 32 words. Each word is 16 bits or 2 bytes in size. Therefore, the correct calculation for SIZEOF intArray should be:
SIZEOF intArray = 32 * 2 = 64
So, the correct value for EAX should be 64.
I know for beginners your brains have bluescreened right here, let me re-explain properly.
SIZEOF, LENGTHOF, and TYPE — Explained Like a Human
At first glance, SIZEOF, LENGTHOF, and TYPE sound intimidating… but they’re really just answering three very simple questions about your data.
Let’s break them apart before we put them together.
I. LENGTHOF — “How many items are there?”
LENGTHOF tells you how many elements are in the array.
If you have:
That means:
intArray has 32 elements
Each element is one WORD
So: LENGTHOF intArray = 32
👉 Important:
LENGTHOF does NOT care about size in bytes.
It only counts how many items exist.
II. TYPE — “How big is ONE item?”
TYPE tells you the size of a single element, in bytes.
BYTE = 1 byte
WORD = 2 bytes
DWORD = 4 bytes
QWORD = 8 bytes
Since intArray is made of WORDs:
👉 Think of TYPE as the size of one slot, not the whole array.
III. SIZEOF — “How much space does this take TOTAL?”
This is where beginners usually lose it 😅
SIZEOF is NOT magic. It literally just does this:
So, for intArray:
IV. Why EAX Ends Up Being 64
If your code does something like:
What’s happening mentally is:
“Hey assembler, how many BYTES does this entire array occupy?”
And the assembler answers: “32 items × 2 bytes each = 64 bytes”
So: EAX = 64
✔ Correct
✔ Logical
✔ No black magic involved
V. The Mental Model That FINALLY Makes It Click
Think of an egg carton 🥚:
LENGTHOF → how many eggs
TYPE → size of one egg
SIZEOF → how much space the whole carton takes
You don’t argue with the carton.
You just multiply.
VI. Why Beginners Get Wrecked Here
Because they:
Confuse elements with bytes
Think SIZEOF “counts items”
Forget that WORD ≠ 1 byte
Once they realize: “SIZEOF is always bytes”
Everything snaps into place.
VII. One-Line Rule to Remember Forever
SIZEOF always tells you how many BYTES the data occupies.
LENGTHOF counts items.
TYPE tells you the size of one item.
That’s it. No more suffering, my beginners…😌
NB: AI is a mega-brain use it as a partner, not a slop generation machine….
LABEL DIRECTIVE
The LABEL directive lets you give a name to a memory location and specify its size without actually reserving any storage.
You’ll often see it used to create an alternative name for a variable in the data segment, or to attach a size attribute to a variable that’s declared immediately after it.
In this example, val16 is just another name for the same memory location as val32. The LABEL directive doesn’t create any new storage on its own—it simply gives an alternate name and size to an existing variable.
The LABEL directive is like giving an existing variable a nickname.
You can also attach a size to that nickname without reserving any extra memory.
This is handy when you want to refer to the same data in a slightly different way—maybe because you’re working with different sizes or just want a clearer name in your code.
For example, you might have a 32-bit variable called val32, but sometimes you only need to look at it as a 16-bit value.
With LABEL, you can create a “shortcut” called val16 that points to the same spot in memory. No extra space is taken—val16 is just another name for val32.
In this example, the label val16 lets you peek at just the first two bytes of the 32-bit variable val32.
It’s like looking at a slice of a bigger cake without cutting the whole thing.
You can also use a LABEL to build bigger numbers from smaller ones.
For instance, imagine you have two 16-bit variables, val1 and val2, and you want to treat them together as a single 32-bit value.
By creating a label called LongValue, you can access both of them as if they were one larger integer—again, without allocating any extra memory.
In this example, the LongValue label is used to access the 32-bit integer that is constructed from the val1 and val2 variables.
The LABEL directive does not actually allocate any storage.
It simply creates a label with a specific size attribute.
This can be useful for creating aliases for variables, constructing larger integers from smaller integers, and other purposes.
I. Understanding Labels and Memory Access: The LongValue Example
The LongValue label in assembly doesn’t actually store the values of val1 and val2. Instead, it acts as a pointer to a memory location. Here’s how it works step by step:
Memory Reservation
In the .data section, LongValue is declared as a LABEL DWORD. This doesn’t assign a value—it just reserves 4 bytes of memory (a doubleword) and gives it a name.
Declaring Values
val1 and val2 are declared as 16-bit words with values 5678h and 1234h. They are stored in memory consecutively, right after LongValue.
Loading Data
In the .code section, the instruction:
…moves 4 bytes of data from the memory location pointed to by LongValue into the EAX register.
Since LongValue itself doesn’t contain a specific value, the CPU reads the next 4 bytes in memory, which happen to be the bytes of val1 and val2.
II. Memory Layout Visualization
Let’s assume little-endian storage (least significant byte first):
When you do: mov eax, LongValue
EAX gets loaded as: EAX = 12345678h
So, effectively, the consecutive bytes from memory (val1 and val2) are loaded into EAX.
The assembler doesn’t need LongValue to hold actual data—it just provides a reference point in memory.
III. Key Takeaways
Labels are memory references, not variables that must hold values.
MOV with a label loads data from the memory starting at that address.
In little-endian systems, lower bytes come first, so multi-byte values like words and doublewords are concatenated accordingly.
IV. Section Review: Operators Recap
DIRECT ADDRESSING MODE
Direct addressing is a memory addressing mode in which the operand of an instruction is the actual address of the data to be accessed.
This allows the CPU to reach the data directly, without performing any extra calculations.
Because of this, direct addressing is one of the simplest and most efficient addressing modes.
Limitations of Direct Addressing
While direct addressing is fast and straightforward, it does have some limitations:
I. Address Must Be Known at Compile Time
The memory address of the data must be known when the program is assembled.
As a result, direct addressing cannot be used for dynamically allocated data, such as data read from a file or created at runtime.
II. Limited to the CPU’s Address Space
Direct addressing can only access data that resides within the CPU’s addressable memory space.
It cannot be used to directly access data stored in external storage (such as a hard disk).
Where Direct Addressing Is Commonly Used
Despite these limitations, direct addressing is widely used because of its simplicity and speed.
It is commonly used to access:
Global variables
Static data
Constants stored in memory
Examples of Direct Addressing
Here are some examples of direct addressing in assembly language:
Direct addressing can also be used in high-level programming languages.
The following C code uses direct addressing to access an element of an array:
Think of Direct Addressing as telling the computer exactly where to go using a specific "home address."
When you load something like the 5th item in an array into a register, the computer doesn't have to do any math or "think" about where it is—it just goes straight to that spot in one shot. It’s fast because there are no extra steps to calculate the address.
The Catch
The big downside is that you have a very limited amount of "room" to write out that address.
Imagine trying to write a really long, specific address on a tiny sticky note; eventually, you run out of space.
In the same way, direct addressing often can't handle very large or complex memory locations because the instruction itself only has so many bits to hold the address.
It’s powerful for quick, simple jumps, but you hit a wall once your program gets too big or complex.
Quick Summary
Direct addressing uses the actual memory address of the operand
It requires no extra calculations
It is fast, simple, and efficient
Best suited for static and known-at-compile-time data
INDIRECT ADDRESSING MODE
Indirect addressing is a memory addressing mode in which the operand of an instruction is not the data itself, but a pointer to the data.
The actual address of the data is stored in a register or memory location, so the CPU must first dereference the pointer to access the data.
Because of this extra step, indirect addressing is a bit more complex than direct addressing, but it comes with several powerful advantages.
Advantages of Indirect Addressing
I. Dynamic Data Access
The address of the data doesn’t need to be known at compile time.
This means indirect addressing can be used to access data that is created at runtime, such as:
Data read from files
Data dynamically allocated in memory
II. Reusable Code
Since the pointer can be changed at runtime, the same instructions can operate on different data items.
This makes indirect addressing ideal for:
Loops
Arrays
Functions that process multiple variables
III. Examples of Indirect Addressing
Here are some examples in assembly language:
Indirect addressing is a total game-changer, but it’s got some quirks you should know about.
One catch is that it’s a bit slower than direct addressing.
This happens because the CPU can't just jump to the data—it has to take an extra step to look up where the pointer is actually aiming first.
Another thing is that it can be a bit risky. If your pointer isn't set up right, or if it’s pointing to a spot in memory that doesn't exist, your program is going to crash or act weird.
As for the code:
Is it actually necessary to include that [5]? If you just write &array, you're grabbing the address of the very first item. By adding the [5], you're telling the computer,
"I don't want the start; I want to point specifically at the sixth item."
Technically, you don't have to use brackets to get there. These two lines do the exact same thing:
The reason why you can omit the 5 in brackets is because arrays are implicitly converted to pointers to their first element.
This means that the expression &array is equivalent to the expression &array[0].
However, there are some cases where it is good practice to explicitly specify the array index.
For example, if you are declaring a pointer to the last element of an array, it is more clear to write something like this:
Writing out the index makes it crystal clear that your pointer is aiming at the end of the array rather than the beginning. It's a "tell, don't hide" approach to coding.
It’s also a smart move when you're working with functions or libraries. If a specific tool expects to start working from a certain spot in your list, being explicit prevents the program from getting lost.
Ultimately, it’s your call whether to include the index or not. Just keep in mind that leaving it out changes where the pointer lands, so you have to be sure you're aiming at the right target.
Another Example:
Think about a string of text like "HELLO".
If you use &string, you’re pointing at the 'H'.
If you use &string[4], you’re pointing specifically at the 'O'.
If you pass just &string to a function that's supposed to delete the last letter, the function has to do extra work to find the end. If you pass &string[4], you’ve handed it the exact spot it needs to work on.
Another Example:
Basically, you’re just setting a marker in the computer's memory.
The first line creates a tiny piece of data (a byte) named byteVal and gives it the value 10h.
The second line takes the "home address" of that data and saves it into the esi register.
Think of the offset as a distance: it’s the exact number of bytes you’d have to walk from the very start of the data section to reach the front door of byteVal.
Since byteVal is sitting in the data segment, its offset is just its position relative to the start of that "storage room."
By putting that address into esi, you’re essentially giving the computer a pointer so it knows exactly where to look later.
The following assembly language instruction moves the offset of the variable byteVal into the register esi:
After this instruction is executed, the register esi will contain the offset of the variable byteVal.
The offset of a variable can be used to load or store the value of the variable.
For example, the following assembly language instruction loads the value of the variable byteVal into the register al:
To put it simply, an offset is just a GPS coordinate for your data.
When you run an instruction like mov al, [byteVal], you're telling the CPU to go to a specific "distance" from the start of memory to find your value.
How the Compiler Uses Offsets for Functions
When your code calls a function, the compiler has to do some heavy lifting behind the scenes using these offsets:
Setting the Table: Before the function starts, the compiler uses offsets to "push" your arguments onto the stack. It needs to know exactly where each piece of data sits so it can line them up correctly.
The Big Jump: It uses the function's own offset to know where the actual instructions for that function are located in memory.
Cleaning Up: Once the function is done, the compiler uses offsets again to "pop" those arguments off the stack and grab the return value.
The Bottom Line
Think of the offset as the "starting point" for everything.
Whether you're just moving a single byte into a register or handling a complex function call with multiple variables, the computer relies on these relative distances to make sure it’s always grabbing the right data from the right drawer.
Using OFFSET to Access Memory
In this code, we have a variable byteVal declared in the .data section with the value 10h (16 in decimal).
The OFFSET keyword calculates the memory address of byteVal and stores it in the ESI register. Here’s what’s happening step by step:
byteVal is a byte variable, so it occupies 1 byte in memory.
OFFSET byteVal tells the assembler to compute the address of byteVal relative to the start of the data segment.
Since byteVal is the first item in the data segment, its offset is 0x0.
mov esi, OFFSET byteVal loads the offset 0x0 into the ESI register.
The next instruction, mov al, [esi], reads the byte at the memory address stored in ESI (which is the address of byteVal) and stores it in the AL register. Now, AL contains 10h (16 decimal).
Summary:
The OFFSET keyword gives the relative memory address of a variable in the data segment, allowing registers like ESI to act as pointers for accessing the variable.
Accessing Arrays with OFFSET
The same concept works for arrays. Suppose we have an array of three doublewords called arrayD:
The OFFSET keyword calculates the starting address of arrayD.
The ESI register is used as a pointer to traverse the array.
To move to the next element, you add 4 to ESI (because each doubleword is 4 bytes):
This allows you to sequentially access each element of the array in memory.
For example, if arrayD starts at offset 10200h, ESI will first point to 10200h, then 10204h, then 10208h, and so on, allowing each element to be loaded into EAX one by one.
INDEXED ADDRESSING
Indexed addressing is a way of finding the address of a value in memory. It works by adding a constant (called a displacement) to the contents of a special register called an index register.
Index register: A CPU register that stores a base value used to offset an address.
Displacement: A constant added to the index register to calculate the final memory address.
This mode is especially handy for working with arrays or other data structures like linked lists and trees, because it lets you jump directly to a specific element without knowing the absolute memory address.
I. Example: Accessing an array element
R2 = base address of the array
5 = index/displacement
This loads the 6th element (index 5) into register R1
The effective address of the element is calculated as:
II. Example: Accessing a linked list element
EAX = address of the current element
4 = offset to the next element
Loads the next element into R1
Indexed addressing makes it easy to navigate data structures, but it has some limitations:
Can be slower than direct addressing (CPU needs extra calculation).
If the index register isn’t initialized correctly, it can lead to errors.
INDEXED OPERANDS
Indexed operands are the combination of a register (index) and a constant (displacement) to access memory.
They’re very useful for arrays because you don’t need to know an element’s exact memory address—just its index.
I. Examples in assembly:
Equivalent C code:
II. Indexed Operands with Scale Factors
When working with arrays where elements vary in size (like words, doublewords, etc.), scale factors are used to calculate the correct offset.
Scale factor: The size of one array element in bytes
Word = 2 bytes
Doubleword = 4 bytes
III. Two common forms of indexed operands:
IV. Example: Byte array
V. Example: Word array with displacements
VI. Example: Doubleword array with scale factor
TYPE arrayD gives the size of each element (4 bytes here).
3 * 4 = 12 → the fourth element (index 3) is 12 bytes from the start.
Important tip: Arrays in assembly are zero-indexed:
arrayD[0] → first element
arrayD[3] → fourth element
So, setting esi = 12 correctly accesses the element containing 400h.
Quick Tips
Always initialize the index register before using it.
Use displacements or scale factors to navigate arrays of different sizes.
Double-check your offsets to avoid accessing the wrong memory location.
Indexed addressing is powerful but can be slower than direct addressing.
IMMEDIATE ADDRESSING MODE
Immediate addressing is the simplest way to provide data to an instruction.
Instead of going to memory or a register to fetch a value, you put the constant value directly into the instruction.
Used for instructions like MOV, ADD, SUB, etc.
The operand is embedded in the instruction itself, not stored elsewhere.
Here, 100 is the immediate value.
It’s directly loaded into the EAX register.
Compare that with direct addressing:
direct_value is a memory location.
The instruction fetches the value stored at that memory address and loads it into EBX.
So, the key difference:
Comparing the Five Common Addressing Modes
Here’s a simple breakdown using a small assembly snippet:
What each does:
Direct addressing: mov eax, [array]
Loads the value at the memory location named array into EAX.
Indirect addressing: mov ebx, [ptr]
Loads the value from the memory location pointed to by ptr into EBX.
Indexed addressing: mov ecx, [array + 4]
Adds an offset (4 bytes here) to array and loads that element into ECX.
Great for accessing array elements.
Immediate addressing: mov edx, 100
Loads the constant value 100 directly into EDX.
Stack addressing: push 200 / pop stackVal
push 200 puts the value 200 on top of the stack.
pop stackVal retrieves the top value from the stack into the variable stackVal.
Quick Tip
Immediate = in the instruction
Direct = value at memory location
Indirect = memory address in a register
Indexed = base address + offset
Stack = value on the stack
This table really helps visualize the differences:
STACK ADDRESSING
Stack addressing is a key concept in computer architecture and assembly language, especially in stack-oriented architectures like x86.
It uses a special memory region called the stack to manage data and control program flow.
I. What is the Stack?
The stack is a temporary storage area in memory.
It works as a Last-In, First-Out (LIFO) structure: The last item pushed onto the stack is the first one popped off.
II. Stack Pointer (SP)
The stack is managed with a special register called the Stack Pointer (SP).
The SP points to the top of the stack.
Operations:
Push: Decrement SP, then store data at the address SP points to.
Pop: Retrieve data from the address SP points to, then increment SP.
III. Stack Operations
Stack operations are fundamental actions used to manage data in a stack, which follows the Last-In, First-Out (LIFO) principle. The first operation is the push operation, which is used to place a value on top of the stack. For example, when the instruction PUSH 200 is executed, the value 200 is added to the top of the stack. As this happens, the Stack Pointer (SP) moves downward to point to the new top position where the value is stored.
The second operation is the pop operation, which is used to remove the value at the top of the stack. For instance, the instruction POP stackVal takes the value from the top of the stack and stores it in the variable named stackVal. After removing the value, the Stack Pointer moves upward, returning to its previous position before the value was pushed.
In x86 assembly language, these stack operations are commonly used to temporarily store data, pass parameters to functions, and save register values during program execution.
IV. Example in x86 Assembly
PUSH adds a plate to the top, and POP takes the top one off. It’s the primary way assembly handles function calls:
The Call: The computer pushes the return address (so it knows where to go back to) and any parameters the function needs onto the stack.
The Work: The function uses the stack to store its own temporary local variables.
The Return: Once finished, it pops those values off to clean up and uses the return address to jump back to the main code.
V. The "CALL" and "RET" Loop
CALL: Pushes the return address (the next instruction) onto the stack, then jumps to the function.
Function: Can use the stack to save registers or store local variables. Remember to pop what you pushed before returning.
RET: Pops the return address from the stack and jumps back to it.
TL;DR: CALL saves your place; RET returns to it.
VI. Passing Data via the Stack
Push arguments: Before calling a function, push the function’s arguments onto the stack.
Call the function: CALL pushes the return address onto the stack and jumps to the function.
Return value: After the function finishes, the return value is pushed onto the stack.
Pop result: Retrieve the return value by popping it into a register.
Quick Tips for Stack Addressing:
The stack is Last-In, First-Out (LIFO): the last value pushed is the first one popped.
The stack pointer (SP) keeps track of the top of the stack.
Used for:
Function arguments
Return values
Temporary storage during operations
Step-by-Step:
PUSH EAX / PUSH EBX: Save the values of EAX and EBX onto the stack.
ADD EAX, [ESP]: Add the value at the top of the stack (ESP points here) to EAX.
POP EBX: Restore the original value of EBX from the stack.
POP EAX: Retrieve the result of the addition back into EAX.
Notes:
This is an unconventional way of doing addition using the stack.
Typically, you would add registers directly (ADD EAX, EBX).
The stack here is used to temporarily store operands and results.
Precondition: EAX and EBX must already contain values before pushing.
Data Section
Defines two 32-bit integers: value1 = 10 and value2 = 20.
Code Section
main PROC marks the program’s entry point.
Stack Operations
PUSH is used to place value1 and value2 onto the stack as 32-bit integers.
CALL add_values calls a function to add the two numbers.
After the function returns, the result is left on top of the stack.
Retrieving and Using the Result
The result is moved from the stack into a register (e.g., EAX).
Stack cleanup is done with ADD ESP, 8 to remove the two pushed values.
Function (add_values)
POP retrieves value2 into EBX and value1 into EAX.
ADD EAX, EBX computes the sum.
The result is pushed back onto the stack for the caller.
RET returns control to the calling code.
Program Exit
Exit code is set with a register (e.g., EAX = 1) and a system call (e.g., INT 20h).
Summary
This program demonstrates stack-based addition: push arguments, call a function, compute, push result, retrieve result, clean stack, and exit.
Key stack concepts: LIFO order, using PUSH/POP, and function return with RET.
The function addValues takes two integers, adds them, and returns the result.
This corresponds to the add_values function in assembly.
In main:
value1 = 10 and value2 = 20 correspond to the assembly variables value1 and value2.
addValues(value1, value2) performs the addition, just like the assembly function.
The result is stored in result and printed with printf.
Summary:
This C code performs the same addition operation as the MASM assembly code.
The difference is that C uses higher-level syntax and function calls instead of manual stack operations.
POINTERS IN ASSEMBLY LANGUAGE
A pointer is a variable that stores the memory address of another variable. Pointers are essential in assembly for direct memory manipulation, efficient array access, and dynamic data handling.
POINTER BASICS
Pointers allow you to access and modify memory at runtime.
In 32-bit mode, pointers are typically doublewords (DWORD).
The PTR keyword tells the assembler the type of the pointer (BYTE PTR, WORD PTR, DWORD PTR).
The OFFSET operator returns the address of a variable.
I. Example: Pointer to a Byte Array
ptrB stores the memory address of arrayB.
Using OFFSET makes it clear that ptrB points to arrayB.
II. Near Pointers
In 32-bit assembly, pointers are usually stored in DWORDs (4 bytes).
Example with bytes and words:
III. Pointer Operations
Declaring and Using a Pointer
Storing an Address
Accessing the Value
Modifying the Value
Explanation:
Declare myPointer as a pointer to a byte.
Store the address of myByte in myPointer using OFFSET.
Load the value at that address into EAX by dereferencing the pointer ([myPointer]).
Increment the value at the pointer’s address using INC [myPointer].
Why Pointers Are Useful
Efficiently manipulate arrays and data structures.
Dynamically allocate or modify memory at runtime.
Pass arguments to functions by reference.
Implement function pointers or jump tables.
TYPEDEF OPERATOR
The TYPEDEF operator allows you to create a user-defined type that has all the status of a built-in type when defining variables.
This is useful for creating pointer types, as it makes the code more readable and maintainable.
For example, the following declaration creates a new data type called PBYTE that is a pointer to bytes:
This declaration would usually be placed near the beginning of a program, before the data segment. Then, variables could be defined using the PBYTE type:
Advantages of using the TYPEDEF operator to create pointer types:
It makes the code more readable and maintainable, as it is clear what type of data the pointer is pointing to.
It can help to prevent errors, as the compiler will check that the pointer is only used to access the type of data that it is pointing to.
It can make the code more portable, as the same TYPEDEF declaration can be used on different platforms.
The following program demonstrates how to use the TYPEDEF operator to create pointer types:
The TYPEDEF operator is a powerful tool that can be used to create pointer types. This can make the code more readable, maintainable, portable, and error-prone.
JMP AND LOOP INSTRUCTIONS
The JMP and LOOP instructions in assembly language are used to implement conditional and unconditional transfers of control, respectively.
The JMP instruction causes an unconditional transfer to a destination, identified by a code label.
The LOOP instruction repeats a block of statements a specific number of times, using the ECX register as a counter.
Here is an example of how to use the JMP and LOOP instructions to implement a simple loop:
This loop will execute as long as the ECX register is not equal to zero. When ECX reaches zero, the loop will exit.
The LOOP instruction is a more concise and efficient way to implement a loop than using a JMP instruction.
The LOOP instruction automatically decrements the ECX register and performs the jump to the loop label, so you don't have to explicitly write these instructions yourself.
Here is an example of how to use the LOOP instruction to implement the same loop as above:
This loop is equivalent to the previous loop, but it is more concise and efficient.
You can also use the LOOP instruction to implement more complex loops, such as loops that iterate over an array or loops that exit based on a different condition.
The loop count is initialized to 100 by moving the value 100 into the ECX register.
The loop counter is saved in the count variable by moving the value of the ECX register to the count variable.
The ECX register is modified. The loop counter is restored by moving the value of the count variable to the ECX register.
The LOOP instruction is executed, which decrements the ECX register and jumps to the top label if the ECX register is not equal to zero.
The loop exit code is executed if the ECX register is equal to zero. The following is a more detailed explanation of each step:
This label marks the beginning of the loop.
This instruction decrements the ECX register and jumps to the top label if the ECX register is not equal to zero. This causes the loop to repeat until the ECX register is equal to zero.
NESTED LOOPS
The nested loop you provided is an example of a combination of an outer loop and an inner loop, a common construct in programming. Let me explain what this nested loop does:
Outer Loop (L1):
The outer loop is controlled by the ecx register and is initialized with the value 100.
Inside the outer loop: The current value of ecx (the outer loop counter) is saved into the count variable.
The inner loop is initialized by setting ecx to 20.
Inner Loop (L2):
The inner loop is also controlled by the ecx register and is initialized with the value 20.Inside the inner loop:
There is a placeholder comment for the loop body, where actual operations or instructions would be performed.
dec ecx is used to decrement the inner loop counter.
jne L2 is a conditional jump instruction. It checks if the inner loop counter (ecx) is not equal to zero. If it's not zero, the program jumps back to the beginning of the inner loop (L2) to continue iterating.
Outer Loop Continuation:
After the inner loop completes (when ecx becomes zero), the program restores the original value of ecx from the count variable.
dec ecx is used to decrement the outer loop counter.
jne L1 is another conditional jump instruction. It checks if the outer loop counter (ecx) is not equal to zero. If it's not zero, the program jumps back to the beginning of the outer loop (L1) to continue iterating.
Summary:
The inner loop runs 20 times per outer loop iteration.
The outer loop runs 100 times, giving a total of 100 × 20 = 2000 iterations.
The loop body executes repeatedly for each inner loop iteration.
Summing an Array in Assembly
This example shows how to sum the elements of an array using registers and a simple loop in x86 assembly. Let’s go through it step by step.
I. Setting Up the Array and Main Function
The code first defines intArray, which is an array of four 16-bit integers.
The program defines a main() function—this is the entry point where execution begins.
II. Initializing Registers for the Loop
Before starting the loop, we prepare a few CPU registers:
EDI – Holds the address of the array. We’ll use it as a pointer to the current element in the array.
EAX – Initialized to 0. This register will accumulate the sum of the array elements.
ECX – Set to the length of the array. This acts as the loop counter, telling the CPU how many elements to process.
III. The Loop (Label: L1)
The loop is where the actual summing happens. Here’s what each step does:
Add current array element to EAX
The CPU reads the value at the memory address in EDI and adds it to EAX.
EAX now contains the sum of all elements processed so far.
Move to the next element
Increment EDI by the size of one array element (2 bytes for a 16-bit integer).
This updates EDI so it points to the next element in the array.
Loop control
Decrement ECX automatically each time through the loop.
If ECX is not zero, jump back to the L1 label to process the next element.
This continues until all elements have been added.
IV. Exiting the Program
After the loop finishes:
The program calls ExitProcess to exit cleanly, returning a success code.
At this point, EAX holds the total sum of the array elements.
V. Key Takeaways
Registers as helpers:
EDI → points to array elements
EAX → accumulates the sum
ECX → counts iterations
Looping in assembly:
Use a label (L1) as a loop start point.
Use a conditional jump (loop or jnz) to repeat until the counter reaches zero.
Memory addressing: Incrementing EDI by the element size moves the pointer to the next array element.
COPYING STRINGS
Using a loop that copies a string, represented as an array of bytes with a null terminator value.
Indexed addressing works well for this type of operation because the same index register references both strings.
The target string must have enough available space to receive the copied characters, including the null byte at the end:
When you move a 32-bit constant to a 64-bit register, the upper 32 bits (bits 32–63) of the destination are cleared (equal to zero):
When you move a 16-bit constant or an 8-bit constant into a 64-bit register, the upper bits are also cleared:
When you move memory operands into 64-bit registers, however, the results are mixed. For example, moving a 32-bit memory operand into EAX (the lower half of RAX) causes the upper 32 bits in RAX to be cleared:
But when you move an 8-bit or a 16-bit memory operand into the lower bits of RAX, the highest bits in the destination register are not affected:
The following assembly code example shows how moving an 8-bit or 16-bit memory operand into the lower bits of RAX does not affect the highest bits in the destination register:
1. Moving Data into RAX
Think of RAX as a massive 64-bit storage unit. When you move a smaller piece of data into it, like AX (16-bit) or AL (8-bit), you are only touching the "bottom floor" of that unit.
mov ax, myWord: Only the last 16 bits of RAX change. The other 48 bits stay exactly as they were.
mov al, myByte: Only the very last 8 bits change.
2. The MOVSXD Trick (Sign Extension)
MOVSXD is the "Smart Copy" for moving a 32-bit value into a 64-bit register.
It doesn't just copy the number; it looks at the sign bit (the very first bit of the 32-bit number).
If the number is negative, it fills all the new empty space in the 64-bit register with 1s to keep it negative.
In our example, ebx starts with 9..., which in binary starts with a 1. So, MOVSXD floods the rest of RAX with Fs (all ones), turning the whole register into FFFFFFFFFFFFFFFFh.
3. Why bother?
We do this because computers are picky about math. You can't easily add a 32-bit number to a 64-bit number.
MOVSXD "inflates" the smaller number so it fits the 64-bit math operations without changing its value.
Here is an example of how the MOVSXD instruction can be used to prepare data for further processing:
This code can then be used to perform a mathematical operation on the value in RAX, such as:
The ADD instruction expects a 64-bit operand, so the sign extended value in RAX is used in the operation.
The MOVSXD instruction is a powerful tool that can be used to prepare data for further processing in a variety of ways.
The OFFSET operator generates a 64-bit address, which must be held by a 64-bit register or variable. In the following example, we use the RSI register:
Data Section Allocates memory for 4 words (8 bytes total).
Code Section:
OFFSET myArray calculates the memory address of the array.
mov rsi, OFFSET myArray stores this address in the 64-bit RSI register.
RSI now points to the start of myArray in memory.
4. Why This Matters
Storing the memory address in a register allows you to manipulate the array efficiently.
Enables pointer-based operations, indexed addressing, or iteration in loops.
Important in 64-bit mode, since registers like RSI, RDI, RCX, and RAX can hold full 64-bit addresses.
5. Additional Notes on 64-Bit Assembly
Loop Counters: RCX is often used with the LOOP instruction.
64-bit integers and registers simplify arithmetic and memory operations.
ASCII strings (byte arrays) commonly use indirect or indexed addressing to access elements.
This sets the stage for our 64-bit version of SumArray, which can iterate through the array using the RSI pointer.
The SumArray program is a straightforward way to see how 64-bit assembly handles math on a larger scale.
Instead of the old 32-bit registers, this version uses the 64-bit heavyweights: RAX for the running total, RDI to point to the data, and RCX as the loop counter. It uses the QWORD directive to tell the assembler that every number in the array is a full 64 bits wide.
The logic is simple: the code walks through the array, adding each quadword into RAX one by one. The loop keeps spinning until RCX hits zero. Once the math is done, the program clears out ECX to signal a clean exit and calls ExitProcess.
6. Addition and Subtraction in 64-Bit Mode
In 64-bit mode, ADD and SUB work exactly like they do in 32-bit mode, just with more "elbow room."
You can manipulate massive numbers directly in the 64-bit registers without having to worry about breaking them into smaller chunks.
The ADD, SUB, INC, and DEC instructions in 64-bit mode affect CPU status flags similarly to 32-bit mode.
In an example, adding 1 to a 32-bit number in RAX causes bit 32 to receive a 1, resulting in RAX becoming 100000000h.
When using partial register operands, be aware that the remainder of the register is not modified. For example, adding 16-bit values in AX and BX does not affect the upper bits in RAX.
The same principle applies to subtraction. Subtracting 1 from zero in EAX or AX modifies the lower bits of RAX while leaving the upper bits unchanged.
These concepts outline the behavior of instructions and registers in 64-bit mode, particularly with regard to handling integers and loops.
Use of 64-bit Registers with Indirect Operands
Memory Increment and Decrement Instructions in 64-Bit Assembly
In 64-bit mode, when working with indirect operands, you must use a 64-bit general-purpose register to hold the memory address.
Additionally, it's essential to use the PTR operator to specify the size of the target operand. Here are some examples:
The following instructions demonstrate how to manipulate data directly in memory for different operand sizes using 64-bit registers:
I. dec BYTE PTR [rdi]
Instruction: dec – decrements the operand by 1.
Operand size: BYTE PTR indicates that the target is a single byte (8 bits).
Memory reference: [rdi] points to the memory location addressed by the 64-bit register RDI.
Explanation:
This instruction subtracts 1 from the byte stored at the memory location pointed to by RDI. It is commonly used when working with 8-bit data in memory, such as decrementing a counter or a byte variable.
II. inc WORD PTR [rbx]
Instruction: inc – increments the operand by 1.
Operand size: WORD PTR indicates that the target is a word (16 bits).
Memory reference: [rbx] points to the memory location addressed by the 64-bit register RBX.
Explanation:
This instruction adds 1 to the 16-bit value stored at the memory location pointed to by RBX. It is used for operations on 16-bit data, like incrementing a short integer in memory.
III. inc QWORD PTR [rsi]
Instruction: inc – increments the operand by 1.
Operand size: QWORD PTR indicates that the target is a quadword (64 bits).
Memory reference: [rsi] points to the memory location addressed by the 64-bit register RSI.
Explanation:
This instruction adds 1 to the 64-bit value stored at the memory location pointed to by RSI. It is suitable for operations involving 64-bit integers in memory, like counters or accumulators in 64-bit programs.
✅ Summary
These instructions allow direct memory manipulation of values of different sizes:
SCALE FACTORS IN INDEXED OPERANDS
In 64-bit mode, you can also use scale factors in indexed operands, similar to 32-bit mode.
If you're working with an array of 64-bit integers, you can use a scale factor of 8 to correctly calculate offsets. Here's an example:
In this example, we're accessing a 64-bit integer in the array using a scale factor of 8 to account for the size of each element.
A scale factor in assembly language is a multiplier used in indexed addressing to calculate the correct memory address of an element in an array. It accounts for the size of each element in the array.
For example, if you have an array of 64-bit integers (8 bytes each), using a scale factor of 8 ensures that when you access array[i], the CPU correctly computes the memory offset as base_address + i * 8. Without the scale factor, the CPU would access the wrong memory location.
64-BIT POINTERS IN ASSEMBLY
In 64-bit mode, pointers are 64-bit values. That means they can hold the memory address of anything in your program—like pointing to a variable, an array, or any data in memory.
I. Example: Pointing to an Array
arrayB is just a simple array of 4 bytes: 10h, 20h, 30h, 40h.
ptrB is a 64-bit variable (QWORD) that holds the address of arrayB.
Think of it like this:
arrayB = a row of boxes with numbers in them: [10][20][30][40]
ptrB = a sticky note that says: “Start looking at box number X”
The CPU doesn’t copy the array into ptrB; it just stores the location of the array.
II. Making It Clearer with OFFSET
You can make the pointer’s relationship to the array more explicit using the OFFSET operator:
OFFSET arrayB literally says: “Store the address of arrayB in ptrB”
Functionally, it’s the same as before, but it’s clearer when reading the code.
Beginners often think ptrB stores the array itself—it does not! It just points to where the array lives in memory.
III. Visualization
Imagine memory like a street with houses:
Imagine memory like a street with houses:
1000h → 10h
1001h → 20h
1002h → 30h
1003h → 40h
ptrB (at 2000h) = 1000h
ptrB doesn’t copy the contents of arrayB.
It points to the first byte of arrayB.
If you want to read the array using ptrB, you tell the CPU:
Now AL = 10h.
Increment the pointer to get the next byte:
IV. Key Takeaways for Beginners
64-bit pointers = 64-bit memory addresses
A pointer never stores the data itself, only the location of the data
Using OFFSET makes the pointer’s target explicit and easier to understand
You can then use the pointer to read/write array elements in memory
V. Beginner Analogy
arrayB = row of numbered mailboxes
ptrB = address written on a sticky note pointing to the first mailbox
Reading/writing via ptrB = going to the mailbox the note points to and checking the content
CHAPTER SUMMARY: WHAT WE COVERED
In this chapter, we've covered various essential concepts related to data transfers, addressing, arithmetic, and loops in assembly language programming. Here's a concise summary of the key points:
MOV Instruction:
The MOV instruction is used for transferring data from a source operand to a destination operand.
Zero and Sign Extension:
MOVZX extends a smaller operand to a larger one by zero-filling.
MOVSX extends a smaller operand to a larger one by sign extension.
XCHG Instruction:
XCHG swaps the contents of two operands, with at least one of them being a register.
Operand Types:
Direct operand represents the address of a variable.
Direct-offset operand adds a displacement to a variable's name, generating a new offset.
Indirect operand uses a register containing the address of data.
Indexed operand combines a constant with an indirect operand for effective memory access.
Arithmetic Instructions:
INC adds 1 to an operand.
DEC subtracts 1 from an operand.
ADD adds a source operand to a destination operand.
SUB subtracts a source operand from a destination operand.
NEG reverses the sign of an operand.
Status Flags:
Various CPU status flags are affected by arithmetic operations, including Sign, Carry, Parity, Auxiliary Carry, Zero, and Overflow flags.
Operators:
OFFSET operator returns the distance of a variable from the beginning of its segment.
PTR operator overrides a variable's declared size.
TYPE operator returns the size of a variable or array element.
LENGTHOF operator returns the number of elements in an array.
SIZEOF operator returns the number of bytes used by an array initializer.
TYPEDEF operator creates user-defined types.
Loops:
JMP instruction unconditionally branches to another location.
LOOP instruction is used in counting-type loops, with ECX as the counter in 32-bit mode and RCX in 64-bit mode.
64-Bit Mode:
In 64-bit mode, a 64-bit general-purpose register must be used when working with indirect operands.
Scale factors can be used in indexed operands, useful for arrays of 64-bit integers
QUESTIONS
Question 9: What value will RAX contain after the following instruction executes? mov rax,44445555h
Answer: RAX will contain the hexadecimal value 44445555h.
Question 10: What value will RAX contain after the following instructions execute?
Answer: RAX will contain the value FFFFFFFF84326732h.
Question 11: What value will EAX contain after the following instructions execute?
Answer: EAX will contain the value 00030000h.
Question 12.5: Is it possible to set the Overflow flag if you add a positive integer to a negative integer?
Answer: Yes, it is possible to set the Overflow flag when adding a positive integer to a negative integer if the result is out of range for the destination operand.
Question 13: Will the Overflow flag be set if you add a negative integer to a negative integer and produce a positive result?
Answer: No, the Overflow flag will not be set when adding two negative integers that result in a positive value.
Question 14: Is it possible for the NEG instruction to set the Overflow flag?
Answer: Yes, the NEG instruction can set the Overflow flag if the operation results in an overflow.
Question 15: Is it possible for both the Sign and Zero flags to be set at the same time?
Answer: No, it is not possible for both the Sign and Zero flags to be set simultaneously. If the result is zero, the Zero flag is set, indicating that the sign is not negative, so the Sign flag is not set. If the result is non-zero, the Sign flag is set to indicate whether it's positive or negative.
Exercise 1: Converting from Big Endian to Little Endian
Indented:
This program copies the value from bigEndian to littleEndian while reversing the order of the bytes.
Exercise 2: Exchanging Pairs of Array Values
This program exchanges every pair of values in the array with an even number of elements.
Exercise 3: Summing the Gaps between Array Values
This program calculates the sum of all the gaps between successive array elements in the array variable.
You can assemble and run these programs using an x86 assembler like NASM or MASM in either 32-bit or 64-bit mode as per your preference.
That was chapter 2