x86 PROCESSOR BASICS (HOW THE CPU ACTUALLY RUNS THE SHOW)
Imagine the CPU as the brain of your computer.
But not a chill brain — a cracked-out microsecond freak that runs everything on caffeine and electricity. Here’s how it works:
🏛️ The CPU – Central Processing Unit
This is where all the thinking, math, and decision-making happens.
It has:
Registers – tiny super-fast storage slots (think 32-bit pockets for numbers)
Clock – keeps time like a heartbeat so stuff happens in sync
Control Unit (CU) – the boss that decides what happens next
ALU (Arithmetic Logic Unit) – the muscle that does all the math and logic ops (ADD, SUB, AND, OR, NOT, etc.)
🔌 How the CPU Connects to the World
The CPU talks to the rest of the PC through pins on its socket. These pins connect it to buses — long electric highways carrying signals.
🔌 The 3 main buses:
Data Bus
Moves the actual data and instructions between the CPU, memory and I/O devices.
The data bus is bidirectional, meaning information can flow in both directions.
The "width" of the data bus (how many parallel wires it has) determines how much data can be transferred at once.
A 64-bit data bus can move 64 bits of data simultaneously.
Analogy: The data bus is like a fleet of delivery trucks that transport goods (data) and mail (instructions) between the city hall (CPU), the library (memory), and various businesses (I/O devices). These trucks can deliver or pick up cargo.
🟨 Address Bus
Says where in memory we’re looking.
The address bus is unidirectional, meaning information flows only from the CPU to other components.
It carries the memory addresses or I/O port addresses where data is to be read from or written to.
When the CPU wants to access a specific piece of data or instruction, it places its memory address on the address bus, telling the memory unit exactly where to find or store that information.
The width of the address bus determines the maximum amount of memory the CPU can access.
A 32-bit address bus can address 2³² unique memory locations (4 gigabytes).
Imagine your computer's RAM as a massive library, and each book in that library has a unique shelf and position. When the CPU wants to read a specific piece of information (a "book"), it doesn't just shout out the book's title.
Instead, it sends out the exact "shelf number" and "position" through the address bus. This "shelf number and position" is what we call a memory address.
This one-way communication ensures that the CPU can accurately request data from, or send data to, a specific spot in memory.
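The width limit is simple math: n address lines give 2ⁿ distinct bit patterns, so 2ⁿ unique addresses. A quick sketch:

```python
# How address-bus width caps addressable memory: n wires -> 2**n addresses.
def max_memory_bytes(bus_width_bits):
    return 2 ** bus_width_bits   # one unique address per bit pattern

print(max_memory_bytes(16))  # 65536       (64 KB)
print(max_memory_bytes(20))  # 1048576     (1 MB)
print(max_memory_bytes(32))  # 4294967296  (4 GB)
```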
🟥 Control Bus
Uses binary signals (on/off) to tell devices when to send or receive. It synchronizes the actions and manages the flow of information among all devices attached to the system bus.
Think: "Hey RAM — CPU wants to read now!"
It carries control signals that dictate operations like "memory read," "memory write," "I/O read," "I/O write," "interrupt request," and "bus grant."
These signals ensure that devices don't try to use the buses simultaneously or perform conflicting operations.
The control bus is like the city's traffic light system.
Other Buses
⚫ I/O Bus
Handles data moving between CPU and input/output devices (keyboard, mouse, etc.)
Also called the Peripheral bus. It's considered part of the system bus, but, yeah, it's a bit different coz it's dedicated to transferring data between the CPU and the system's I/O devices.
Modern systems often use high-speed serial buses like PCI Express (PCIe) for this purpose.
This bus is all about getting data to and from your input/output devices. Imagine:
Keyboard Input: When you type "hello," that information needs to travel from your keyboard into the computer. The I/O bus is the route that data takes, like supplies being delivered to a restaurant. ⌨️➡️💻
Printer Output: When you hit "print," the document data needs to go from your computer out to the printer. The I/O bus handles this, much like official documents being sent out to residents. 💻➡️🖨️
Memory: Where Programs & Data Live
All your running programs and variables are stored in RAM. But here’s the kicker:
The CPU can’t run them straight from RAM, coz RAM is just a temporary storage locker for the CPU.
It always does this:
Grabs the instruction from memory.
Brings it into the CPU’s registers — those small, temporary storage locations we covered in the previous chapter.
Executes it.
Maybe sends a result back to memory.
So, your code doesn’t run in RAM; it runs inside the CPU — one piece at a time, or in chunks.
We’re going to see more about this Fetch, Decode, Execute cycle ahead.
📦 Buses Summary (Quick Table):

Bus           Direction               Carries
Data Bus      Bidirectional           Actual data and instructions
Address Bus   CPU → memory/devices    Memory and I/O port addresses
Control Bus   Mostly CPU → devices    Timing and command signals (read, write, interrupts)
TLDR – Reverse Engineering Focus:
Know the ALU is where bitwise ops live (AND, OR, SHL, etc.)
Know that registers are the CPU’s playground — what you see in disasm (like eax, edx, rsi, etc.)
Remember: instructions run inside the CPU, not memory. Memory just holds them until they’re needed.
Buses = wires that move the ops around. If you’re watching malware move code into memory and jump to it — that’s this system in action.
A register's purpose often becomes clear from the instructions around it. Is it being used as a counter in a loop? An argument for a function? The return value? The context will clue you in.
Learn by Doing: The more assembly code you read and write (even small snippets!), the more you'll see how registers are actually used in real programs. This hands-on experience beats rote memorization any day.
CLOCK & CLOCK CYCLE (X86 CPU TIMING EXPLAINED)
The Unseen Rhythm: What’s the Clock?
The CPU clock is like the relentless, precisely timed heartbeat of your processor — ticking at a fixed speed (e.g., 1 GHz = 1 billion ticks per second).
It is an internal electronic signal that oscillates at a fixed, extremely high frequency.
This isn't just a simple timer; it's the master synchronizer that orchestrates every single operation within the processor and its interactions with the rest of the computer system.
This clock ticks at a specific, fixed speed, often measured in Gigahertz (GHz). For example, a 3 GHz CPU means the clock "ticks" 3 billion times every second. This incredible speed allows for billions of individual operations to occur in a mere blink of an eye.
The clock keeps the CPU, RAM, and buses perfectly in sync. It ensures data moves smoothly and at the right time — no timing chaos, no crashes. Without it, everything would fall apart.
What’s a Clock Cycle?
One clock cycle = one complete tick = the smallest unit of time the CPU understands.
It represents the smallest indivisible unit of time the CPU utilizes to perform any action. Nothing, absolutely nothing, can happen for a duration shorter than one clock cycle.
The duration of a single clock cycle is simply the inverse of the clock speed.
For a CPU running at 1 GHz (1,000,000,000 cycles per second), one clock cycle lasts 1 ÷ 1,000,000,000 seconds = 1 nanosecond.
This is an incredibly tiny slice of time, emphasizing the sheer speed at which modern processors operate.
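The inverse relationship is a one-liner:

```python
# Clock period = 1 / clock frequency, converted to nanoseconds.
def clock_period_ns(ghz):
    hz = ghz * 1_000_000_000       # GHz -> cycles per second
    return (1 / hz) * 1_000_000_000  # seconds -> nanoseconds

print(clock_period_ns(1))  # 1.0 ns per cycle at 1 GHz
print(clock_period_ns(3))  # ≈ 0.333 ns per cycle at 3 GHz
```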
Clock Cycle in Action
Every CPU instruction takes at least 1 clock cycle to run.
Thanks to pipelining, modern CPUs can crunch simple operations super fast — even finishing one per cycle.
But older CPUs? Different story.
On something like the Intel 8088, a single MUL instruction could eat up tens or even hundreds of cycles. 🐢
Meet the 8088 – The OG PC Chip
Released in 1979, the Intel 8088 powered the first IBM PCs in 1981.
That moment? Kicked off the whole personal computer era.
It was a cost-cut version of the 8086 — same 16-bit CPU inside, but with an 8-bit external data bus instead of 16.
Why? So, IBM could use cheaper 8-bit parts and simpler motherboard designs. 💸
Downside? To move 16-bit data, the 8088 had to do two 8-bit transfers. Slower memory and I/O — but it was worth it for the cost savings at the time.
✅ Segmented Memory (Remember this?)
The 8088, like the 8086, used segment:offset addressing to reach more memory than the 64KB a single 16-bit register can address.
It combined:
A 16-bit segment register (points to a 64KB block)
A 16-bit offset
Together = a 20-bit address → Boom, access to 1MB of RAM.
(16-bit segment shifted left by 4 bits + 16-bit offset = 20-bit address) – we discussed this before.
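The calculation in Python form (the segment and offset values are arbitrary examples):

```python
# segment:offset -> 20-bit physical address on the 8086/8088:
# shift the 16-bit segment left by 4 bits (multiply by 16), then add the offset.
def physical_address(segment, offset):
    return (segment << 4) + offset

print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0x000F)))  # 0xfffff — the top of the 1 MB space
```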
✅ Compatibility Bonus
The 8088 ran the same instructions as the 8086 — full instruction set compatibility.
So devs didn’t have to rewrite anything. If it ran on 8086, it ran on 8088.
That made adoption easy and fast — crucial for software devs.
Modern x86 CPUs are incredibly sophisticated. They employ techniques like pipelining and out-of-order execution.
✅ Pipelining
Imagine an assembly line. Instead of one worker building an entire car from start to finish, different workers perform different stages simultaneously on different cars.
In a CPU, this means that while one instruction is in its "execute" phase, another might be in "decode", and a third in "fetch."
✅ Out-of-Order Execution:
The CPU skips stalled instructions and runs independent ones first, then reorders the results.
It’s like working on what’s ready instead of waiting — keeps the clock cycles busy.
If the CPU has to wait for slow memory? That gap = wait states (empty cycles where CPU chills while memory catches up).
If someone’s confused about how a CPU doesn’t freeze when one instruction stalls, here’s the reason:
“The CPU looks at its queue like this. If one instruction’s waiting on RAM, it just hops to the next one that’s ready. Keeps that pipeline moving.”
Wait States – When the CPU’s Just… Waiting
🚧 The Problem:
The CPU’s insanely fast — like sprinting ahead at gigahertz speeds.
But RAM? RAM’s out here jogging. 🐢
So when the CPU asks RAM for some data, it’s gotta wait... and wait...
...because RAM’s still looking for it in its dusty file cabinet.
🚧 The Result:
While waiting, the CPU basically just sits idle, burning through clock cycles doing nothing.
These wasted cycles?
They’re called wait states — and yeah, they suck. It’s like revving a Ferrari just to sit in traffic.
🚧 The Fix: Enter the Caches
Modern CPUs fight back using caches — tiny, super-fast memory layers:
L1 – Small but lightning fast, closest to the core
L2 – Bigger, a bit slower
L3 – Even bigger, shared across cores
These caches stash frequently used data so the CPU doesn’t always have to bug slow RAM.
If the data’s in cache? Boom — no wait states.
If not? Well... back to traffic.
🔁 TLDR:
Wait states = CPU stalls for the RAM to catch up. Wasted Clock Cycles. Like a 140wpm giga-typist server admin waiting for some document from an intern who types at 20wpm, in order to reboot the server.
Happens when RAM’s too slow to deliver data on time.
Cache acts as the CPU's ultra-fast, on-site mini-warehouse. It’s a small, incredibly quick memory buffer that stores frequently accessed data and instructions.
Repeat: Cache is fast, tiny, and loaded with the stuff the CPU uses most so it doesn’t have to keep calling slowpoke RAM.
THE INSTRUCTION EXECUTION CYCLE: THE CPU’S ETERNAL GRIND
Modern CPUs may have deep pipelines, out-of-order logic, and all kinds of secret sauce —
but at the core?
They still run the same ancient loop over and over:
Fetch → Decode → Execute → Store
Billions of times per second. Non-stop.
⚙️ Step 1: Fetch – Go Get the Next Command
What Happens:
The Control Unit checks the Instruction Pointer (IP / EIP / RIP) — this register holds the address of the next instruction to run.
That address is slapped onto the address bus.
CPU sends a READ signal on the control bus.
RAM (like a good librarian) grabs the binary instruction from that address — say 0100101010101010 — and sends it back through the data bus.
The instruction lands in the Instruction Register (IR) — ready to be decoded next.
CPU bumps the Instruction Pointer forward, pointing it to the next instruction for the next cycle.
TLDR:
Instruction Pointer → Address Bus → RAM → Data Bus → Instruction Register.
Analogy Time:
Imagine a factory worker following a checklist.
They look at the next task on their list (Instruction Pointer).
Head over to the supply room (RAM) to grab the specific blueprint for the task (instruction).
Bring it back to their station (Instruction Register) and get ready to work.
Then? Flip the page to the next item on the checklist — ready for the next fetch.
Step 2: Decode – “Alright, What Are We Even Doing?”
Once the instruction is loaded into the Instruction Register, the CPU begins the decode stage, where it interprets the raw binary instruction. During this phase, the Control Unit analyzes the instruction to determine exactly what needs to be done.
First, the CPU identifies the opcode (operation code), which specifies the type of operation to perform. This could be an instruction such as ADD, MOV, or JMP, and each opcode corresponds to a unique binary pattern that tells the CPU what action is required.
Next, the CPU determines the operands, which indicate the data involved in the operation. These operands may refer to values stored in registers, specific memory addresses, or immediate values that are directly embedded within the instruction itself. At this point, the CPU figures out where the data will come from and where the result should be placed.
After identifying the opcode and operands, the CPU translates the instruction into a sequence of micro-operations. These are small, precise internal steps that the hardware can execute, such as sending data from a register to the ALU, instructing the ALU to perform a calculation, and storing the result back into a register.
An easy way to understand this stage is through a factory analogy. After fetching a blueprint, the worker studies it carefully to understand what needs to be built. The opcode represents the final product (like “Widget X”), the operands are the required parts (such as Part A and Part B), and the micro-operations are the specific tools and steps used to assemble everything. At this stage, no actual building happens yet—the system is simply preparing and understanding the task before execution begins.
🔨 Step 3: Execute – “Do the Work!”
During the execute stage, the CPU carries out the action that was identified in the decode phase. At this point, there is no more interpretation or preparation—the system simply performs the required operation. The Control Unit sends signals to activate the appropriate components of the CPU based on the instruction.
One possible operation involves math or logic processing. In this case, the Arithmetic Logic Unit (ALU) takes control. The Control Unit supplies the operands to the ALU, which then performs operations such as addition, subtraction, or logical comparisons like AND and OR. Once the calculation is complete, the result is produced and prepared for storage. This process is powered by countless transistors switching on and off, allowing binary values to flow through logic gates at extremely high speeds.
Another possibility is data movement. For instructions such as MOV or LOAD, the CPU simply transfers data between registers or between memory and registers. No calculations are performed here; instead, the CPU focuses on efficiently routing data to the correct destination.
A third scenario involves control flow operations. Instructions like JMP or CALL cause the CPU to change the sequence of execution. The Instruction Pointer is updated to a new memory address, allowing the program to jump to a different section, such as a loop or function. This enables programs to make decisions and repeat tasks.
Physically, all of these operations are executed through the rapid switching of transistors inside the CPU. These tiny electronic switches work in perfect synchronization with the system clock, coordinating the flow of electrical signals. What appears as complex computation is actually the precise movement of electrons through circuits.
A helpful analogy is a factory worker who has already understood the blueprint and gathered the necessary parts. Now, they begin assembling the product by using the required tools, combining components, and completing the task. This stage represents the hands-on work, where the CPU actively processes data and produces results.
📥 Step 4: Store (Write-Back) – “Save the Result!”
What Happens:
The CPU just finished running the instruction — now it needs to put that result somewhere useful.
If it’s needed immediately?
→ Stored in a register for the next instruction to grab.
If it’s meant for long-term use?
→ Sent over the data bus to a specific spot in RAM (picked out using the address bus).
→ CPU also fires a WRITE signal through the control bus to tell memory, “Hey, store this here.”
TLDR:
Output goes to a register or memory, depending on where it’s needed next.
Analogy:
The factory worker just finished building Widget X.
If another worker needs it right away?
→ It goes into the "active bin" at their workstation (register).
If it’s going to storage or shipping?
→ They box it up and send it off to the warehouse (memory).
♻️ CPU Life:
This 4-step loop — Fetch, Decode, Execute, Store — never stops.
From the moment you power on to the second you shut down.
Billions of instructions per second. No breaks. No excuses.
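The four-step loop can be sketched as a toy simulator (the instruction format, register names, and values here are invented for illustration — real CPUs do all of this in hardware):

```python
# Toy model of the Fetch -> Decode -> Execute -> Store cycle.
memory = [
    ("MOV", "eax", 5),   # eax = 5
    ("ADD", "eax", 6),   # eax += 6
    ("MOV", "ebx", 2),
    ("ADD", "ebx", 3),
    ("HLT",),
]
registers = {"eax": 0, "ebx": 0}
ip = 0  # instruction pointer

while True:
    instr = memory[ip]              # Fetch: read the instruction at IP
    ip += 1                         # bump IP to the next instruction
    opcode = instr[0]               # Decode: identify the operation
    if opcode == "HLT":
        break
    reg, value = instr[1], instr[2] # Decode: identify the operands
    if opcode == "MOV":             # Execute: do the work
        result = value
    elif opcode == "ADD":
        result = registers[reg] + value
    registers[reg] = result         # Store: write the result back

print(registers)  # {'eax': 11, 'ebx': 5}
```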
♻️ Real Talk:
If the CPU clock is the beat, then the instruction cycle(F-D-E-S) is the dance.
Every instruction is a dancer. Same steps every time, just with different moves.
The Grand Choreography – From Boot to Shutdown
The Fetch → Decode → Execute → Store cycle isn’t optional.
It’s the heartbeat of your computer.
Every game you’ve played, every piece of malware you’ve reverse engineered, every compiler you’ve used —
all of them are just riding this loop. Billions of times per second.
🎵 So, What’s the Vibe?
If the CPU’s clock is a drumbeat, the instruction cycle is the choreo.
Fetch: Scope out the next move.
Decode: Understand what the move means.
Execute: Process the move.
Store: Save it and prepare for the next beat.
Even the wildest zero-day malware is just flipping bits in this same rhythm.
It’s the logic that runs the digital universe.
🕵️ Why This Matters for You:
As a reverse engineer, this is your map.
When you’re analyzing disassembly, you’re watching this dance play out — step by step.
When there’s lag? You’re spotting missed steps (wait states, cache misses).
When code’s obfuscated? You're untangling its footwork.
The better you understand this cycle, the more x-ray vision you get into what software really does underneath all the GUI fluff.
This isn’t just theory — it’s the grind behind every syscall, jump, XOR, and function call you’ll ever break down.
Once assembly becomes second nature, you’ll find it really easy to work with big tools like:
Binary Ninja:
x64dbg:
Reading from Memory: Why It’s Slower Than Just Grabbing from a Register
🐢 Why So Slow?
Accessing memory (RAM) ain’t like snatching something from your pocket (registers). It’s more like texting someone, waiting for a reply, and then saving that reply.
Here’s the real sequence when your CPU wants to read from memory:
Address Bus: CPU places the memory address it wants to read from.
Read Signal (RD Pin): It triggers a read signal — basically asking “Yo RAM, give me that.”
Wait: Gotta pause one or more clock cycles while memory gets its act together.
Data Bus: RAM finally sends the requested data back to the CPU, which copies it to a register or operand.
Each step takes time. Multiply by billions of reads? That lag adds up.
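The sequence above can be modeled as a cycle counter (the latency numbers here are invented for illustration; real figures vary by hardware):

```python
# Toy bus transaction: count clock cycles for one memory read,
# including wait states while "RAM" catches up.
RAM = {0x1000: 42}
RAM_LATENCY = 3  # wait-state cycles before data is ready (assumed)

def memory_read(address):
    cycles = 0
    cycles += 1            # place the address on the address bus
    cycles += 1            # assert the RD (read) signal on the control bus
    cycles += RAM_LATENCY  # wait states: the CPU idles here
    data = RAM[address]    # data finally arrives on the data bus
    cycles += 1
    return data, cycles

print(memory_read(0x1000))  # (42, 6)
```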
⚡ Registers vs Memory — Quick Comparison

Storage     Where it lives     Access speed      Capacity
Registers   Inside the CPU     ~1 clock cycle    A handful of slots
Cache       On/near the CPU    A few cycles      KBs to MBs
RAM         Separate chips     Tens of cycles+   GBs
Cache to the Rescue:
Because RAM is slow and registers are limited, CPUs got smart and added a middleman: Cache.
🟢 Level 1 (L1) Cache: Smallest, fastest, and lives inside the CPU.
🟡 Level 2 (L2) Cache: Slightly slower, but larger. Connected to CPU with high-speed buses.
🔴 Level 3 (L3) Cache (optional in some CPUs): Even bigger, shared between cores.
If the data’s in the cache? That’s a cache hit → super fast access.
If not? Cache miss → gotta go all the way out to RAM.
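The hit/miss logic is just lookup-before-fallback, which a toy sketch can show (sizes and values invented; real caches work on fixed-size lines with eviction policies):

```python
# Minimal cache lookup sketch: a dict as the "cache" in front of slow "RAM".
RAM = {addr: addr * 2 for addr in range(0, 100, 4)}
cache = {}

def read(address):
    if address in cache:
        return cache[address], "hit"   # cache hit: fast path, no wait states
    value = RAM[address]               # cache miss: go all the way to RAM
    cache[address] = value             # stash it for next time
    return value, "miss"

print(read(8))   # (16, 'miss') — first access pays the RAM trip
print(read(8))   # (16, 'hit')  — now it's cached
```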
💾 Loading and Executing a Program
Step-by-Step: How Programs Go From File to Running
Before your code can run, the Operating System (OS) has to do a whole setup behind the scenes:
Find the Program: OS locates the file on disk by scanning the file system (directory).
Load It into RAM: The actual binary gets pulled into memory.
Allocate Memory: OS sets aside a chunk of RAM just for that program to play in.
Track the Program: It creates a Process ID (PID) and adds it to its tracking system.
Set Entry Point: OS points the CPU to the start of the program’s instructions.
Run It: CPU begins executing from the program’s entry point — instruction cycle begins.
Clean Up: When the program ends, OS clears its memory and removes the process from its tables.
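Those OS steps can be mimicked in a few lines (the addresses, names, and structures here are invented for illustration — a real loader deals with executable formats, page tables, and much more):

```python
# Toy sketch of the OS program-loading steps.
import itertools

ram = {}                 # pretend RAM: address -> byte
process_table = {}       # the OS's process tracking
next_pid = itertools.count(1)

def load_program(name, code):
    base = 0x400000                       # pretend load address
    for i, byte in enumerate(code):       # load the binary into RAM
        ram[base + i] = byte
    pid = next(next_pid)                  # track the program with a PID
    process_table[pid] = {"name": name, "entry": base}  # set entry point
    return pid

pid = load_program("game.exe", b"\x90\x90\xc3")
print(pid, hex(process_table[pid]["entry"]))  # 1 0x400000
```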
🧃 Analogy:
Think of it like opening a game:
You click the icon → OS finds it.
Game files get loaded into RAM.
OS gives it some desk space to work on.
CPU is told: “Start here.”
CPU starts reading instructions one-by-one like it’s reading off a game manual at insane speed.
Linkers & Loaders: The Final Plug That Makes Code Run
⚒️ First up: The Linker
You wrote code. You compiled it. Now what?
When your source code is compiled (.c → .o), it’s not fully standalone yet. It’s like having puzzle pieces — but they haven’t been snapped together.
👉 What the Linker does:
Takes all those object files (.o, .obj) from different modules
Resolves external references (like when one file calls a function from another file)
Pulls in libraries (like printf() from libc)
Glues it all together into one executable file (.exe, .out, etc.)
Example:
Imagine main() in one source file calling a greet() function defined in another. Each file gets compiled into its own .o file. The linker merges them into one file and makes sure main() knows exactly where greet() lives in memory. Output of the linker = a complete executable with all the pieces in place.
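At its core, symbol resolution is just bookkeeping. A toy sketch (the object-file structure here is invented for illustration; real linkers work with formats like ELF and COFF):

```python
# Toy "linker": lay out two object files and resolve an external reference.
main_obj = {
    "defines": {"main": 0x00},  # symbols this module provides
    "needs": ["greet"],         # external references to resolve
    "code_size": 0x20,
}
greet_obj = {
    "defines": {"greet": 0x00},
    "needs": [],
    "code_size": 0x10,
}

def link(objs):
    symbols, base = {}, 0
    for obj in objs:                          # lay out each module in order
        for name, offset in obj["defines"].items():
            symbols[name] = base + offset     # final address of each symbol
        base += obj["code_size"]
    for obj in objs:                          # resolve external references
        for name in obj["needs"]:
            assert name in symbols, f"undefined reference to {name}"
    return symbols

print(link([main_obj, greet_obj]))  # {'main': 0, 'greet': 32}
```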
Now Enter: The Loader
The loader is part of the Operating System, and it comes into play when you run the program.
👉 What the Loader does:
Reads the executable file from disk
Allocates memory for the code, stack, heap, etc.
Sets up the process environment (process ID, file descriptors, etc.)
Maps libraries into memory if needed (e.g., dynamic/shared libs)
Fixes up addresses if relocations are needed
Tells the CPU: “Alright, start here at the entry point”
It’s the person backstage setting up the mic, the lights, and the props right before the band walks on.
👉 Comparison:

Linker: runs at build time. Combines object files and libraries into one complete executable.
Loader: runs at launch time. Places that executable into memory, sets up the process, and starts execution at the entry point.
🕶️ Analogy: Compiler, Linker, and Loader
Think of the whole process like building and driving a car. The compiler creates individual car parts, turning source code into machine-level pieces. The linker then assembles those parts into a complete car, connecting everything so it works as one unit. Finally, the loader takes that finished car out of the garage, places it on the track, starts the engine, and hands control over so it can actually run.
❓ Why This Matters (Especially in Reverse Engineering)
Understanding how programs are linked and loaded is extremely important in reverse engineering. When you know how a program is linked, you can identify its dependencies, such as external libraries or functions it relies on. This helps you understand the structure of the program and where key functionality might reside.
Knowing how a program is loaded into memory allows you to identify entry points, memory regions, and how execution begins. This is critical when analyzing or modifying program behavior. You can locate sections like .text (code), .data (initialized data), and .reloc (relocation info), which are essential when inspecting or patching binaries.
In more advanced scenarios, such as malware analysis, attackers may manipulate the loading process itself. They can inject code before execution begins or alter how memory is initialized, making it crucial to understand these stages in detail.
🔄 Are Crack “Loaders” the Same as OS Loaders?
The short answer is no—they are not the same. However, they rely on similar underlying concepts.
🖥️ The Legitimate OS Loader
The operating system’s loader is a built-in component responsible for preparing a program to run. It loads the executable into memory, assigns it a process space, resolves dependencies, and transfers execution to the program’s entry point. In simple terms, it acts like a stage crew that sets everything up before the performance begins.
🕵️ The Crack Loader (Warez/Gaming Context)
A crack loader is a separate tool designed to manipulate how a program runs. Instead of simply launching the executable, it interferes with or modifies execution to bypass restrictions such as licensing or trial limitations.
These loaders often intercept the program at runtime and alter its behavior in memory. For example, they may modify functions responsible for license verification so that they always return a “valid” result. In some cases, they inject additional code, bypass protections, or disable security checks.
🔫 What Crack Loaders Do
Crack loaders typically modify execution in several ways. They may intercept how the original program starts, inject code into memory, or patch logic related to licensing and restrictions. Some loaders hook into internal functions and force them to return favorable values, effectively unlocking premium features.
They can also bypass anti-debugging or anti-tampering mechanisms, allowing deeper inspection or modification of the program. In many cases, the loader runs the original executable in memory, applies patches dynamically, and then starts execution—leaving the actual file on disk unchanged.
💣 Why Use a Loader Instead of Cracking the EXE Directly?
Using a loader can be more effective than modifying the executable file itself. Some programs include integrity checks, such as checksums, that detect file modifications and prevent execution if changes are found. Others are packed or encrypted, making static patching difficult without first unpacking them in memory.
Advanced protections like anti-tamper systems further complicate direct modification. Loaders avoid these issues by applying changes at runtime, which makes them harder to detect. Because the original file remains untouched, anti-cheat or anti-crack systems are less likely to flag the program.
🧪 Example Flow of a Crack Loader
When you run a crack loader, the process begins when you click the loader executable. This program acts as a middleman between you and the actual application you want to run.
First, the loader silently launches the original game or software in the background. However, before the program fully starts, the loader intervenes and begins modifying its behavior in memory. It may patch specific bytes, effectively changing how certain instructions behave. It can also skip or disable license verification checks, ensuring that any validation logic is bypassed.
In some cases, the loader goes further by faking responses, such as simulating a successful login or license validation. As a result, when the program continues execution, it behaves as if everything is legitimate. From the program’s perspective, it appears that the user has valid access, even though the loader has manipulated the outcome behind the scenes.
💀 So… Same Loader?
Although both OS loaders and crack loaders deal with loading programs into memory, they serve very different purposes. An operating system loader is designed for normal, secure execution of software, ensuring everything is properly initialized and ready to run.
A crack loader, on the other hand, is a custom tool built to interfere with that process. It leverages the same fundamental idea—controlling how a program is loaded into memory—but uses it to alter execution in ways the original developers did not intend.
At the core, both rely on the same principle: if you can control what gets placed into memory, you can influence what the CPU executes.
🎯 In Reverse Engineering and Malware Analysis
This concept is especially important in reverse engineering and malware analysis. Analysts study executable formats like the Portable Executable (PE) structure to understand how programs are organized and loaded.
They also learn techniques such as dumping unpacked binaries from memory, which allows them to analyze code after it has been decrypted or unpacked at runtime. This is crucial when dealing with protected or obfuscated software.
Additionally, analysts watch for advanced techniques like code caves, manual mapping, and reflective loading. These methods allow code to be injected or executed without following the standard loading process used by the operating system.
Understanding these behaviors helps analysts detect, analyze, and counteract software that manipulates execution, since such loaders often bypass normal system rules and operate in less visible ways.
DLL Linking — Static vs Dynamic
🧱 Static Linking (Old School, But Solid)
All the required library code gets copied directly into your final .exe during compilation.
The final executable becomes self-contained.
Bigger file size, but no dependency on external DLLs.
Example:
If you statically link math.lib, the functions like pow() are baked into the EXE. No DLL needed at runtime.
👊 Pro: No external DLL problems
Con: Bigger EXEs, can’t patch/update libraries easily
Dynamic Linking (Welcome to the Modern World)
Your EXE doesn’t contain the actual function code.
Instead, it says: “Yo OS, when I run, grab this function from some.dll please.”
Code is loaded at runtime from DLL files.
Makes your EXE lighter and more modular.
Example:
Your code calls MessageBoxW() from user32.dll; the function’s code isn’t copied into your EXE.
Instead, Windows loads user32.dll when your app runs and links the call then.
Pro: Smaller files, easier library updates
Con: If DLL is missing, wrong version, or hacked... chaos.
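You can watch dynamic linking happen using Python’s ctypes, which asks the OS loader to map a shared library at runtime and resolve a function from it. This sketch assumes a Unix-like system with a standard C math library (on Windows you’d load a DLL such as msvcrt instead):

```python
# Dynamic linking in action: load a shared library at runtime and
# resolve a function from it. Assumes a Unix-like system with libm.
import ctypes
import ctypes.util

path = ctypes.util.find_library("m") or "libm.so.6"  # fallback is an assumption
libm = ctypes.CDLL(path)            # the OS loader maps the library into memory
libm.pow.restype = ctypes.c_double  # declare the C signature: double pow(double, double)
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]

print(libm.pow(2.0, 10.0))  # 1024.0
```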
And Now… Cracks, Loaders, and DLLs
🔁 DLL Injection:
Malware or game cracks use dynamic linking to their advantage.
Here’s how:
A “loader” injects a custom DLL into a target process (like a game).
That DLL might:
Hook system functions.
Bypass checks.
Unlock features.
Or even replace existing DLLs.
🥷 DLL Hijacking:
If your EXE expects libX.dll and it finds your fake libX.dll in the same folder, guess what?
💥 It’ll load yours.
Crackers use this to:
Replace original DLLs with modded ones.
Force dynamic linking to their own payloads.
Intercept or reroute legit game functions.
That's why you sometimes see “Put cracked DLLs in game folder” — it's hijacking the load path.
🧵 TLDR Recap:

Static linking: library code baked into the EXE at build time. Bigger file, zero DLL dependencies.
Dynamic linking: the EXE references DLLs that the OS loads at runtime. Smaller file, easier updates, but hijackable.
Bonus Thought:
In reverse engineering or malware analysis:
Look for Import Address Tables (IAT) in PE files to see dynamic links
Use tools like CFF Explorer, x64dbg, or Ghidra to trace loaded DLLs
Trace calls like LoadLibraryA, GetProcAddress, VirtualAlloc, etc. — that’s where dynamic magic (or evil) happens
Advanced stuff we won’t touch over here
This rabbit hole just keeps going deeper 🐇🔍
CFF Explorer:
Ghidra:
Cutter v2.0:
FIRST ASSEMBLY PROGRAM 👶💻
We are done with theory. Let's write code.
We will look at a simple program that takes two numbers, adds them together, and saves the result in a Register (a tiny, super-fast storage slot inside the CPU).
The Basic Structure
main PROC: This marks the beginning. Think of PROC (Procedure) as the start of a function in Python or C++. It tells the computer, "Start executing here."
MOV eax, 5: This is the assignment operator. We are putting the value 5 into the register named EAX.
Note: MOV stands for "Move," but it really means "Copy." The 5 doesn't disappear from where it came from; it just gets copied into EAX.
ADD eax, 6: The math happens here. The CPU takes the value currently in EAX (which is 5), adds 6 to it, and stores the result (11) back into EAX.
INVOKE ExitProcess, 0: This is a call to the Operating System (Level 2!). It tells Windows, "I am done here, shut it down." Without this, the program might crash or hang.
main ENDP: The "End Procedure" marker. It closes the block we opened with main PROC.
Introducing Variables and Segments
Real programs need to store data, not just hard-coded numbers.
To do this, we divide our program into Segments.
Think of segments as different rooms in a house, each with a specific purpose.
Here is the upgraded program with variables:
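A sketch of what that might look like in MASM syntax (the .model and PROTO lines are standard boilerplate, assumed here rather than taken from this section):

```asm
.386
.model flat, stdcall
ExitProcess PROTO, dwExitCode:DWORD

.data
sum DWORD 0                 ; variable: 32 bits (a DWORD), starting value 0

.code
main PROC
    mov eax, 5
    add eax, 6
    mov sum, eax            ; store the result into the variable
    INVOKE ExitProcess, 0
main ENDP
END main
```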
I. The .data Segment
This is where you declare variables. It is a specific area in memory reserved just for storage.
sum DWORD 0:
Name: sum
Size: DWORD (Double Word). This means 32 bits.
Value: 0 (The initial value).
II. The .code Segment
This is where your instructions (logic) live. This area is usually "Read-Only" so you don't accidentally overwrite your own program code while it's running.
III. The .stack Segment
(Mentioned briefly) We will cover this later, but this is a scratchpad area for temporary storage during function calls.
The Wild West of Data Types
In high-level languages like C++ or Java, data types are strict. You must clearly say whether something is an integer, a floating number, or a character.
If you try to store a letter in an integer, the compiler immediately throws an error and stops you.
Assembly language works very differently. Assembly does not enforce data types at all. It does not protect you or correct your mistakes.
In Assembly, size is what matters, not meaning. When you write something like DWORD, you are only telling the computer to reserve 32 bits of memory. You are not saying what kind of data will be stored there.
There is no type checking. The CPU does not know or care whether those 32 bits represent a number, a letter, or a memory address. It will process the data exactly as you tell it to.
This gives you total control, but also total responsibility. You can treat a number like a character or an address if you want, and Assembly will allow it. If you make a mistake, the program will crash or behave incorrectly. There are no safety rails.
Big Idea to Remember
Memory is organized into segments. The .code segment holds the program logic, while the .data segment holds variables.
Registers, such as EAX, are the CPU’s working space. They temporarily hold data while the processor performs operations.
Instructions tell the CPU what to do. MOV copies data, ADD performs math, and INVOKE communicates with the operating system.
Assembly does not understand data types. It only understands how many bits something uses, not what those bits are meant to represent.
INTEGER LITERALS
An integer literal (also called an integer constant) is a number written directly in a program.
An integer literal can have:
an optional sign (+ or -)
one or more digits
an optional radix letter at the end that tells us what base the number is written in
General form:
[{+ | -}] digits [radix]
Examples
26 - This is a valid integer literal. It has no radix letter, so we assume it is decimal (base 10).
26h - This means 26 in hexadecimal (base 16).
1101 - This is treated as decimal, not binary, because there is no radix letter.
1101b - The b tells us this number is binary (base 2).
So, without a radix letter, the number is always assumed to be decimal.
Radix Table
Here is the table:
h       hexadecimal (base 16)
q or o  octal (base 8)
d       decimal (base 10)
b       binary (base 2)
r       encoded real (no base value; an IEEE floating-point bit pattern)
Important note about Encoded Real
Encoded Real does not have a specific base value.
It is a binary format used to represent floating-point numbers, not normal integers.
Examples of Integer Literals with Radixes
Each line below shows an integer literal, followed by a comment explaining its base:
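A few representative literals (the specific values are illustrative):

```asm
26          ; decimal (no radix letter, so base 10 is assumed)
26d         ; decimal, radix letter d
11010011b   ; binary
42q         ; octal
1Ah         ; hexadecimal
0A3h        ; hexadecimal (leading zero required: it starts with a letter)
```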
HEXADECIMAL BEGINNING WITH A LETTER
In assembly language, a hexadecimal number that starts with a letter must have a leading zero.
Why?
Because the assembler might think the value is a name (identifier) instead of a number.
Example that causes an error
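A minimal sketch of the mistake (the surrounding instruction is illustrative):

```asm
mov eax, A123h      ; assembler reads A123h as an identifier, not a number
```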
This causes an undefined symbol error.
Why this happens:
The value starts with the letter A
The assembler assumes A123h is the name of a variable or label
Since no such name exists, it throws an error
Correct version (with leading zero)
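The same line with the leading zero added:

```asm
mov eax, 0A123h     ; the leading 0 marks this as a hexadecimal number
```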
Now it works correctly.
The leading zero tells the assembler:
“This is a hexadecimal number, not an identifier.”
Rule to remember
Any hexadecimal literal that begins with a letter must start with 0.
Examples:
0A3h ✅
0FFh ✅
A3h ❌
CONSTANT INTEGER EXPRESSIONS
A constant integer expression is a math expression made using:
integer literals
arithmetic operators
These expressions are calculated at assembly time, not while the program is running.
From now on, we’ll just call them integer expressions.
Important rule: the final result must be an integer that fits in 32 bits (valid range: 0 to FFFFFFFFh).
Arithmetic Operators and Precedence
Operator precedence means the order in which operations are done.
Here is the table, from highest priority to lowest priority:
( )           parentheses
+, -          unary plus and minus
*, /, MOD     multiply, divide, modulus
+, -          add and subtract
What does unary mean?
Unary means the operator works on one value only.
Examples:
-5 → unary minus (one number)
+3 → unary plus (one number)
This is different from: 5 - 2 → subtraction (two numbers)
Unary operators explained
Unary plus (+)
Just returns the value +5 → 5
Unary minus (-)
Changes the sign -5 → negative five
Why unary has higher precedence
Unary plus and minus are done before multiplication and division.
Example: -2 * 3
What happens:
-2 is evaluated first (unary minus)
Then -2 * 3
Result is -6
Operator Precedence Examples
Each example shows an expression, then how precedence resolves it:
4 + 5 * 2 → multiply first, then add. Result: 14
12 - 1 MOD 5 → 1 MOD 5 first → 1. Then subtraction. Result: 11
-5 + 2 → unary minus first → -5. Then add. Result: -3
(4 + 2) * 6 → parentheses first, then multiply. Result: 36
Using Parentheses (Best Practice)
Even if you know the rules, use parentheses.
Why?
Makes expressions easier to read
Prevents mistakes
You don’t have to remember precedence rules
Modulus Operator (mod or %)
The modulus operator gives the remainder of a division.
Example: 12 mod 5 = 2 (because 12 ÷ 5 = 2 with remainder 2)
That’s all it does—no magic.
REAL NUMBER LITERALS
A real number literal is just a number that can have:
a decimal point
or a fraction
or a very large / very small value
These are also called floating-point numbers.
In assembly, real numbers can be written in two ways:
Decimal reals (the normal way humans write numbers)
Encoded reals (hexadecimal form, using IEEE format)
Decimal Real Numbers
A decimal real looks like a normal decimal number.
A decimal real number is a number written in base-10 (decimal) notation, the same format used in everyday arithmetic.
It represents a value on the real number line and may include a fractional part and, optionally, an exponent. Examples include 3.14, -0.5, and 6.02 × 10²³.
General form: [{+ | -}] digits . [digits] [E [{+ | -}] digits]
Let’s break that into plain English.
A decimal real number can be broken into several components. Some parts are required, while others are optional, depending on how the number is written.
Parts of a decimal real
⭐ The sign indicates whether the number is positive or negative.
Represented by + or -
If no sign is written, the number is assumed to be positive
The sign applies to the entire value of the number
Examples:
+7.25 → positive
-4.6 → negative
9.1 → implicitly positive
⭐ The integer part (also called the whole number part) is the sequence of digits to the left of the decimal point.
Represents the whole units of the number
Can be 0 if the value is less than 1
Must contain at least one digit if a decimal point is present
Examples:
123.45 → integer part is 123
0.75 → integer part is 0
-8.9 → integer part is 8
⭐ The decimal point separates the integer part from the fractional part.
Indicates that digits to the right represent fractions of a whole
In decimal real numbers, a dot (.) is used (not a comma)
Without a decimal point, the number is an integer, not a decimal real
Example: In 45.67, the dot separates 45 and 67
⭐ The fractional part consists of digits to the right of the decimal point.
Represents values less than one (tenths, hundredths, thousandths, etc.)
Each digit has a place value based on powers of 10
Can be omitted if the number is a whole number
Examples:
3.14 → fractional part is 14
10.0 → fractional part is 0
6. → fractional part omitted (still valid in many contexts)
⭐ The exponent is used in scientific notation to scale the number by a power of 10.
Written using × 10ⁿ or e notation (e.g., 1.5e3)
Allows compact representation of very large or very small numbers
The exponent indicates how many places the decimal point is shifted
Examples:
6.02 × 10²³ → very large number
3.1 × 10⁻⁴ → very small number
7.5e2 → same as 750
⭐ Why Decimal Reals Are Used
Decimal real numbers are especially useful because they:
Accurately represent fractions and continuous values
Are intuitive and easy for humans to read
Can represent very large or very small quantities when combined with exponents
Are widely used in science, engineering, finance, and computing
Exponent format
E [+ or -] integer
The exponent means:
“Multiply this number by 10 raised to some power.”
I. What “Exponent format” means
Exponent format is a shortcut way of writing big or small decimal numbers.
It looks like this:
e.g., 44.2E5
This does NOT mean a new kind of number.
It simply means:
Take the number and multiply it by 10 raised to a power
What the E actually means
The letter E stands for “× 10 to the power of”. So:
Examples:
E5 means × 10⁵
E-3 means × 10⁻³
How to Think About Exponents
Golden Rule (memorize this)
👉 The exponent never changes the digits.
👉 It only moves the decimal point.
That’s it. Nothing else.
Step-by-Step Examples (Slow and Clear)
Example 1: 2.
Means 2.0
The decimal point is present
Any number with a decimal point is a real number
Value = 2
Example 2: +3.0
+ means positive
Same value as 3.0
Value = 3
Example 3: -44.2E+05 (this looks scary, but it’s not)
Step 1: Ignore the sign for now. Start with 44.2
Step 2: Understand the exponent - E+05 means × 10⁵
So, we are doing: 44.2 × 10⁵
Step 3: Move the decimal point
Power is +5
Move the decimal 5 places to the right
44.2 → 4,420,000
Step 4: Apply the sign – The original sign was negative
✅ Final answer: -4,420,000
Example 4: 26.E5 (this confuses many beginners)
Step 1: Look carefully at the number - 26.
There are no digits after the decimal point.
👉 This is allowed.
👉 It is automatically assumed to be: 26.0
Step 2: Apply the exponent - E5 means × 10⁵
So, 26.0 × 10⁵
Step 3: Move the decimal point 5 places to the right - 26.0 → 2,600,000
✅ Final answer: 2,600,000
“But there are no digits after the dot!”
That’s okay.
26. means 26.0
Missing fractional digits are assumed to be zero
So: 26.E5 = 26.0 × 10⁵
This is 100% valid.
Another Example: 44.2E05
E05 still means 10⁵
Leading zeros in the exponent do not change the value
So: 44.2E05 = 44.2 × 10⁵ = 4,420,000
⭐ 26.E5 → valid
⭐ 44.2E05 → valid
Both are correct scientific notation.
The “Aha” Idea (Most Important Part)
🔑 The exponent does NOT change the digits.
🔑 It only moves the decimal point left or right.
Positive exponent → move right
Negative exponent → move left
Once this clicks, exponent format becomes easy.
Encoded Real Numbers (Beginner Explanation)
Why this exists?
Humans and computers do not store numbers the same way.
Humans write numbers like: 1.0
Computers cannot store decimals directly
Computers store numbers as binary patterns (0s and 1s)
An encoded real number is: A real number converted into a binary pattern so the computer can store and process it.
An encoded real is a real number that has been:
Converted into binary
Stored using a fixed standard format
Written in hexadecimal to make it easier for humans to read
This standard format is called: IEEE floating-point format.
Why Hexadecimal Is Used
Binary numbers are very long and hard to read. For example, the 32-bit pattern for 1.0 is: 00111111100000000000000000000000
So, we group the bits into chunks of 4 and write them in hexadecimal: 0011 1111 1000 0000 0000 0000 0000 0000
That gives: 3F800000
Important Idea (Very Important)
3F800000 is NOT a normal number⚠️
It does not mean “three million something”.
It is: A code that represents the real number 1.0
Humans vs Computers (Clear Comparison)
A human writes: 1.0
The computer stores: 3F800000r
They represent the same value, just in different forms.
The r at the End (Assembler Hint)
When writing encoded reals in assembly language, you may see: 3F800000r
The r tells the assembler:
“This hexadecimal value is an encoded real number, not an integer.”
Without the r, the assembler would treat it as a normal hex integer.
Example 1: Encoded Real for 1.0
Step 1: Binary representation - 0 01111111 00000000000000000000000
This binary pattern follows the IEEE 32-bit floating-point layout.
Step 2: Convert to hexadecimal - Group bits into 4s: 0011 1111 1000 0000 0000 0000 0000 0000
Final hex: 3F800000
Step 3: Mark it as a real number
3F800000r
This tells the assembler:
“Store the real number 1.0 using IEEE floating-point encoding.”
Summary
An encoded real is how a computer stores a real number
It is written in hexadecimal
It follows the IEEE floating-point format
The hex value is a bit pattern, not a normal number
The suffix r tells the assembler it is a real number
Encoded reals are not numbers — they are instructions for how the computer should interpret bits as a real value.
IEEE Floating-Point (Short Real)
A short real uses 32 bits, split like this: 1 sign bit, then 8 exponent bits, then 23 fraction (mantissa) bits.
Example 2: Decimal +1.0
Binary representation:
0 01111111 00000000000000000000000
Breakdown:
0 → positive number
01111111 → exponent for 1.0
000... → mantissa
Converting to hexadecimal
Group bits into 4s: 0011 1111 1000 0000 0000 0000 0000 0000
Convert each group to hex: 3F800000
So, the encoded real is: 3F800000
Important note (and a relief)
We won’t use real-number constants for a while.
Why?
Most x86 instructions work with integers
Floating-point math is more advanced
You’ll come back to this later (Chapter 12), when it actually makes sense and feels useful.
Big-picture summary (don’t skip this)
Decimal reals → for humans
(3.0, -44.2E5, 26.E5)
Encoded reals → for the computer
(3F800000r, IEEE format)
You are not expected to memorize the binary layouts right now
Just understand what they are, not how to build them by hand
CHARACTER LITERALS
A character literal is one single character written inside single quotes or double quotes. Examples: 'a', "d"
How characters are stored
Even though a character looks like a letter, the computer stores it as a number.
This number comes from the ASCII table.
Example: 'A'
ASCII value (decimal): 65
ASCII value (hex): 41h
So, when you write: 'A'
What actually goes into memory is: 65 or 41h
Important Reminder: Characters Are Just Numbers
The core idea (say this slowly)
👉 A computer does not understand letters or symbols.
👉 It only understands numbers (stored in binary).
So, when you see a character like A, the computer actually stores a number that stands for A.
When we say: “Characters are not magic inside the computer”
It means:
The computer does not store the shape of the letter
It does not store meaning
It stores a number code
Example: 'A' is stored as the number 65
The computer treats 65 as just a number.
Humans interpret that number as the letter A.
Why We Need a Table (ASCII)
Because characters are just numbers, everyone must agree on:
“Which number represents which symbol?”
That agreement is called ASCII.
The ASCII table is simply a lookup chart that says which number stands for which character.
What the ASCII Table Contains (With Meaning)
1. Letters
Numbers assigned to alphabet characters.
Examples:
A → 65
a → 97
Uppercase and lowercase have different numbers.
2. Digits
Characters that look like numbers, but are still characters.
Examples:
'0' → 48
'1' → 49
⚠️ Important: '5' ≠ 5
'5' is a character
5 is a numeric value
3. Symbols
Punctuation and special characters.
Examples:
+ → 43
# → 35
@ → 64
4. Control Characters
Characters that do not print anything, but control behavior.
Examples:
New line
Tab
Backspace
They tell the computer how to format text, not what to display.
“You’re Expected to Recognize Common Ones”
This does not mean memorise the whole ASCII table.
It means:
Know a few important examples
Understand the idea, not the entire list
Common ones to recognize: 'A' → 65, 'a' → 97, '0' → 48, space → 32
That’s usually enough for exams and understanding code.
💡 Characters are just numbers.
💡 ASCII is the dictionary that maps numbers to symbols.
💡 The computer only sees numbers — humans see letters.
STRING LITERALS
A string literal is more than one character written inside quotes.
It can include:
letters
numbers
symbols
spaces
Examples: 'ABC', "Hello, world!", '4096'
📦 Notice About Strings
When working with strings, every character inside the quotes matters, including spaces. For example, '4096' is treated as a string, not a number, because it is enclosed in quotes. If you write ' 4096 ', the spaces before and after the digits are also part of the string and will be stored in memory just like any other character.
📦 How Strings Are Stored in Memory
A string is stored in memory as a sequence of bytes, where each byte represents a single character. Each character is converted into its corresponding numeric value based on a character encoding standard.
For example, the string "ABCD" is stored as four separate bytes. Each character is translated into its ASCII hexadecimal value:
A → 41h
B → 42h
C → 43h
D → 44h
So in memory, "ABCD" becomes a series of bytes: 41h 42h 43h 44h.
📦 Why Characters and Strings Are Stored as Integers
Computers can only store and process numbers. At the lowest level, memory is made up of bits, and bits represent binary values (0s and 1s). Because of this, everything—including text—must be converted into numerical form before it can be stored or processed.
This is why characters are represented as integers behind the scenes.
📦 Encoding Schemes
To make text representation possible, computers rely on encoding schemes such as ASCII and Unicode. These systems assign a unique numeric value to each character and provide a standard that all systems can follow.
For example, in ASCII:
'A' is represented by the number 65
'B' is represented by 66
'a' is represented by 97
These mappings allow computers to consistently interpret and display characters.
📦 Strings in Memory
A string is stored as a sequence of these character codes placed one after another in memory. In many programming languages (like C), strings are typically followed by a special value called a null terminator, which is 0.
This null terminator signals the end of the string. It tells the program where the string stops, so it doesn’t continue reading into unrelated memory.
For example, the string "ABC" in memory would look like: 41h 42h 43h 00h
The final 00h is the null terminator indicating the end of the string.
📦 Big idea
Characters look like letters
Strings look like words
But in memory:
characters = integers
strings = sequences of integers
At the memory level, everything is numbers.
"CAT" → 67 65 84
Each number is the ASCII code of one character.
Characters and strings are stored as integers because all data in computer memory is represented as numbers, using encoding schemes like ASCII.
👉 There is no special “text” storage inside the computer.
What changes is how the numbers are interpreted.
ASCII says:
65 means A
66 means B
97 means a
So, the computer stores numbers, and software decides:
“These numbers should be treated as characters.”
Text is not special inside a computer — it is just numbers that we choose to read as letters.
Characters and strings are stored as integers because computer memory can only represent numbers, and encoding schemes such as ASCII define how numeric values correspond to characters.
RESERVED WORDS
A reserved word is a word that the assembler has already claimed.
Think of it like this:
The assembler says: “This word already has a job. You can’t reuse it for something else.”
So:
You cannot use reserved words as variable names
You must use them only where they are meant to be used
Case Sensitivity
Reserved words are not case-sensitive.
That means mov, MOV, and Mov all mean the same instruction.
Types of Reserved Words (With Meaning)
1. Instruction Mnemonics
These are the actual commands the CPU understands.
Examples:
MOV → move data
ADD → add values
MUL → multiply values
You must not use these as identifiers.
2. Register Names
Registers are small storage locations inside the CPU.
Examples:
AX, BX, CX
EAX, EBX
These names are reserved because they refer to real hardware.
3. Directives
Directives tell the assembler, not the CPU, what to do.
They control how the program is built, not how it runs.
Examples:
.data → start of data section
.code → start of code section
4. Attributes
Attributes describe size or type of data.
Examples:
BYTE → 1 byte
WORD → 2 bytes
They help the assembler know how much memory to use.
5. Operators
Operators are symbols or words used in constant expressions.
Examples:
+, -, *
AND, OR
They are reserved because they perform calculations.
6. Predefined Symbols
These are special names that already have values.
Example: @data → returns a constant integer at assembly time
You don’t define them — the assembler provides them.
7. Summary: Reserved Words
Reserved words have special meaning
They can only be used in their intended context
They are not case-sensitive
You cannot use them as identifiers
IDENTIFIERS
I. What Is an Identifier?
An identifier is a name you choose.
You use identifiers to name:
Variables
Constants
Procedures
Labels
Identifiers exist to make code readable and understandable.
II. Rules for Forming Identifiers
1. Length
Must be between 1 and 247 characters
Long names are allowed
Short, meaningful names are recommended
2. Case Sensitivity
Identifiers are not case-sensitive. So myVar, MYVAR, and MyVar all refer to the same identifier.
3. First Character Rule
The first character must be one of these:
A letter (A–Z or a–z)
_ (underscore)
@
?
$
Cannot start with a digit.
Valid: var1, _main, $first, @count
❌ Invalid: 2count (starts with a digit)
4. Remaining Characters
After the first character, you may also use:
Letters
Digits (0–9)
Valid: count1, total_sum, buffer2
5. Cannot Be a Reserved Word
You cannot use reserved words such as MOV, ADD, EAX, or BYTE as names.
These already belong to the assembler.
Good Identifier Naming (Style Matters)
Even though assembly looks cryptic, your names don’t have to be.
Invalid identifiers (why they are wrong): two words (contains a space), 2ndValue (starts with a digit).
Spaces are not allowed in identifiers, and the first character cannot be a digit.
Legal but Not Desirable
These work, but are discouraged: _temp, $count, @value
Why?
_, $, and @ are often used internally by assemblers
Using them can cause confusion or conflicts
Reserved words have predefined meanings in assembly and cannot be used as identifiers, while identifiers are programmer-defined names that follow specific rules to improve code readability.
The assembler already owns some words — you choose names for everything else.
ASSEMBLER DIRECTIVES: THE BLUEPRINTS
If instructions (like MOV and ADD) are the bricks and actions of your program, Directives are the blueprints.
Directives are special commands for the Assembler (the software building your program), not for the CPU.
👉 Directives:
Are read only when assembling
Do not run when you click the .exe file.
Do not generate machine code instructions
Think of directives as setup instructions:
“Assembler, here’s how to build my program.”
They tell the assembler how to set up memory, where to put variables, and how much space to reserve before the program ever starts.
Directives are generally not case-sensitive. .data, .DATA, and .Data are all the same thing.
Directives vs. Instructions
The Directive (DWORD): Tells the assembler,
"Hey, reserve 4 bytes of space right here and call it myVar." (During assembly time).
Talks to the assembler.
The Instruction (MOV): Tells the CPU,
"Hey, go grab the data inside myVar and move it to a register." (Happens during run time).
Talks to the CPU.
DWORD is a directive.
It tells the assembler: “Reserve 4 bytes and store the value 26”
No CPU action happens here.
MOV is an instruction.
It runs at runtime.
It copies data into a register.
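Side by side, the difference might look like this (myVar is the hypothetical variable from above):

```asm
.data
myVar DWORD 26      ; DIRECTIVE: assembler reserves 4 bytes holding 26

.code
mov eax, myVar      ; INSTRUCTION: at run time, the CPU copies myVar into EAX
```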
Important Properties of Directives
Directives are not case-sensitive
These all mean the same thing: .data, .DATA, and .Data.
Why Directives Exist
Directives are used to:
Define variables
Allocate memory
Organize program sections
Define constants
Set up the stack
Control how code is assembled
PROGRAM SEGMENTS
A program is divided into segments.
Each segment has a specific purpose.
Directives tell the assembler: “This part of the program is for X.”
Common Assembly Directives
.data — Initialized Data Segment
The .data directive marks the section where variables with known initial values are stored.
“The following lines define data that already has values.”
Memory is reserved
Values are stored immediately
This is where you put constants and variables that have a starting value.
Defining known values (like a high score starting at 0, or a username).
.bss — Uninitialized Data Segment
BSS stands for "Block Started by Symbol" (an old historical name); the .bss segment is just empty space.
It's used for variables that exist but start with no value (like a buffer for user input).
“Reserve memory, but don’t store values yet.”
It saves space in the executable file. You don't need to store 1,000 zeros; you just tell the OS, "I need 1,000 bytes of empty space here."
Space is reserved
Contents are undefined (garbage)
Used for arrays and large buffers
.text or .code — Code Segment
This is the Read-Only zone where your actual code instructions live.
The CPU fetches commands from here.
“The CPU will run what comes next.”
This is actual program logic.
.equ — Define a Constant Symbol
A symbol is a name that represents:
A constant value
A memory location
An address
Symbols make code readable and maintainable.
The .equ directive defines a constant.
Once defined, it cannot change.
It works like Find and Replace.
It does not use any memory; it just helps you read the code.
Why .equ is useful
Avoids magic numbers
Easy to change values
Makes code portable
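A sketch of a constant definition (in MASM the directive is spelled EQU, without the dot; the name and value here are made up):

```asm
MAX_TRIES EQU 10        ; constant symbol: uses no memory, cannot change

mov ecx, MAX_TRIES      ; assembler substitutes 10 here, like find-and-replace
```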
.stack — Define the Runtime Stack
What is the stack?
The stack is a special, dynamic area of memory used for temporary storage.
It manages subroutine calls (keeping track of where to return to) and local variables.
LIFO Structure: It works like a stack of plates. The last plate you put on top (Push) is the first one you take off (Pop).
Growth: Weirdly, the stack usually grows downwards in memory (from high addresses to low addresses).
Setting the Size: You must tell the assembler how big this scratchpad should be using the .STACK directive.
The runtime stack:
Stores return addresses
Stores local variables
Grows downward in memory
Uses LIFO (Last In, First Out)
What .STACK does
The .STACK directive:
Reserves memory for the stack
Sets its maximum size
For example, .STACK 100 allocates 100 bytes for the stack
Prevents stack overflow (if sized correctly)
Why Stack Size Matters
If the stack grows beyond its allocated space:
Memory gets overwritten
Program may crash
Behavior becomes unpredictable
This is called stack overflow.
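A small sketch putting the segments together (sizes and names are illustrative; MASM calls the code segment .code, while some assemblers call it .text):

```asm
.STACK 100              ; directive: reserve 100 bytes for the runtime stack

.data
msg BYTE "Hello", 0     ; directive: store text in the data segment

.code
main PROC
    mov eax, 0          ; instruction: program logic lives here
main ENDP
```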
What happens here
.STACK → sets stack size
.data → stores text
.text → contains instructions
Directives prepare the program
Instructions run the program
All x86 assemblers target the same instruction set, but directives differ between assemblers.
Example:
Microsoft assembler supports REPT
Other assemblers may not
Directives are assembler commands that control program structure, memory allocation, and symbol definition, and they do not generate executable machine instructions.
Directives build the program. Instructions run the program.
Directives (start with .) = Instructions for the Assembler (Setup).
Instructions (like MOV) = Commands for the CPU (Action).
Segments:
.data = Variables with values.
.bss = Empty variables.
.code = The actual program logic.
.stack = Temporary scratchpad.
INSTRUCTIONS
Think of an instruction as a single, clear command given to the computer’s brain (the CPU).
When you write a line of assembly code, you are basically writing a to-do list for the processor.
However, the CPU doesn't speak English; it only understands bits and bytes.
So, when you assemble your code, a program called an Assembler acts as a translator, turning your written instructions into the machine language the computer actually runs.
An instruction isn't just one big lump of text; it’s usually broken down into four specific parts.
1. The Label (The "Bookmark")
A label is completely optional, but it’s incredibly useful.
Think of it like a bookmark or a signpost in your code.
It’s just a name you give to a specific spot in the program so you can find it easily later.
If you want the computer to "jump" back to a certain spot or repeat a section of code (like a loop), you give that spot a label.
How it looks: You write a word and follow it with a colon (like loop:).
The Golden Rule: Every label name has to be unique. You can't have two spots named "Step1," or the computer won't know which one you're talking about.
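Here is the sketch the next few points refer to; the loop body and counter are my own illustrative choices:

```asm
start:                  ; label: marks the beginning of the program
    mov ecx, 5          ; illustrative loop counter
print_msg:              ; label: a spot we can jump back to
    ; ... print the message here ...
    dec ecx             ; count down by 1 (sets the zero flag when ecx hits 0)
    jnz print_msg       ; if not zero, jump back to print_msg
end:                    ; label: marks where the program finishes
```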
How this explains labels
start:
This is a label. It marks the beginning of the program.
The program does not have to have it, but it’s useful.
print_msg:
This label marks a spot we want to jump back to.
It acts like a bookmark.
jnz print_msg
This tells the computer:
“If the condition is true, go back to the place named print_msg.”
That’s the label being used.
end:
Another label marking where the program finishes.
2. The Mnemonic
The mnemonic is the actual command in an assembly instruction.
It is the only part that must exist.
A mnemonic is just a short, easy-to-remember name for something the CPU knows how to do.
Before mnemonics existed, programmers had to write long strings of numbers to control the computer. That was slow, hard to read, and easy to mess up.
Mnemonics fixed that by giving those numbers names.
Think of the mnemonic as the verb in a sentence:
MOV → move data
ADD → add numbers
SUB → subtract
JMP → jump to another place in the code
If an instruction has no mnemonic, it is not an instruction.
The CPU cannot guess what you want—it needs a command.
In mov eax, 5:
mov → mnemonic (the command)
eax, 5 → operands (what the command works on)
This tells the CPU: "Move the value 5 into the register EAX."
In add eax, ebx, add is the mnemonic. It tells the CPU to perform addition
Without the mnemonic, this would mean nothing.
The mnemonic is the name of the operation being performed.
It tells the CPU what action to take and is the only required part of an assembly instruction.
3. The Operands (The “Targets”)
If the mnemonic is the verb, then the operands are the nouns.
Operands are the things the instruction works on.
Most instructions don’t make sense without them. If you tell the CPU to ADD, its next question is:
“Add what to what?”
That’s what operands answer.
What operands can be
Operands can be different kinds of things, depending on the instruction:
Constants - A fixed number written directly in the code: 5
Registers - Small, fast storage locations inside the CPU: eax, ebx
Memory locations - A specific place in RAM where data is stored: [value]
Labels - A named location in the program, used mainly with jump instructions: loop
In mov eax, 5:
mov → mnemonic
eax and 5 → operands
Meaning: move the constant 5 into the register eax
In add eax, ebx:
Operands: eax, ebx
Meaning: add the value in ebx to the value in eax
In jmp loop:
Operand: loop (a label)
Meaning: jump to the place in the code named loop
Code label (for jumps/loops):
Data label (for variables):
Array example with offset:
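Sketches of each kind (names and values are illustrative):

```asm
; Code label (a jump target):
loop_start:
    jmp loop_start      ; the operand is the label loop_start

; Data label (names a variable in the .data segment):
value DWORD 10
    mov eax, value      ; the operand is a memory location

; Array with an offset:
array DWORD 1024, 2048, 4096
    mov eax, array+4    ; second element: each DWORD is 4 bytes wide
```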
Important detail
Some instructions have one operand
Some have two
A few have none
But when operands are present, they always tell the CPU:
where the data comes from and where the result goes
Operands are the values, registers, memory locations, or labels that an instruction acts upon.
4. The Comments (Notes for Humans)
Comments are only for humans.
The CPU and the assembler completely ignore them.
Comments do not become machine code
They do not take up memory
They exist only to help you and anyone else reading the code
Assembly can get confusing fast, so comments explain why something is done, not just what is done.
A comment starts with a semicolon:
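Two comments on the same (illustrative) instruction:

```asm
mov ecx, 10     ; put 10 in ecx                 <- says WHAT (the code already shows this)
mov ecx, 10     ; retry the download 10 times   <- says WHY (far more helpful later)
```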
The second one is more useful when you come back later or need to fix a bug.
5. Putting it all together
Here is a full instruction showing all parts working together:
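Assembled from the four parts described below:

```asm
loop_start: add eax, 1    ; increase the counter for each loop
```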
loop_start: → Label (marks a location in the code)
add → Mnemonic (the action)
eax, 1 → Operands (what the action works on)
; increase the counter for each loop → Comment (human explanation)
Comments don’t affect the program at all, but they make the code understandable and easier to maintain.
NOP (No Operation)
The NOP instruction stands for No Operation.
When executed, it does absolutely nothing.
It takes 1 byte of memory.
It’s mostly used as a placeholder or for aligning code in memory.
Why use NOP?
Alignment: Some processors work faster if instructions start at specific memory addresses (like multiples of 4).
Padding: To maintain the size of an instruction stream.
Debugging: You can insert NOPs temporarily to test timing or skip over instructions without changing program behavior.
Example 1: Simple NOP
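A minimal sketch (the surrounding instructions are illustrative):

```asm
mov eax, 1
nop               ; does nothing; occupies 1 byte
mov ebx, 2
```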
The nop instruction does nothing.
It’s just a placeholder between the two instructions.
Example 2: Alignment Example
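A conceptual listing (offsets and byte counts assume the code starts at offset 0; the encodings shown are the standard 32-bit ones):

```
00000000  66 8B C3   mov ax, bx     ; 3-byte instruction
00000003  90         nop            ; 1 byte of padding
00000004  8B D1      mov edx, ecx   ; now starts at a multiple of 4
```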
The nop ensures that the next instruction starts at a multiple-of-4 address.
This can improve performance because the processor accesses memory more efficiently.
Key Points
NOP does nothing when executed.
It’s used for padding, alignment, or debugging.
Takes 1 byte of memory.
Does not affect registers or memory.
x86 PROCESSORS AND SPEED
x86 processors work faster when code and data start at even doubleword addresses
(that means addresses that are multiples of 4 bytes).
Why this matters
The x86 CPU moves data in 4-byte chunks
If data starts at a 4-byte boundary, the CPU can fetch it in one step
If it’s not aligned, the CPU needs two memory accesses
Two accesses = slower program
Aligned vs unaligned (idea)
Aligned address: 0, 4, 8, 12, ...
Unaligned address: 1, 2, 3, 5, 6, ...
When data is unaligned, performance drops.
How programmers fix this
To avoid slowdown, programmers:
align code and data to 4-byte boundaries
use padding
use NOP instructions to push code to the correct address
x86 processors load code and data faster from even doubleword (4-byte aligned) addresses because aligned data can be fetched in a single memory access.
ANATOMY OF A 32-BIT ASSEMBLY PROGRAM
Here is the full, working source code for addTwo.asm
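Reconstructed from the pieces walked through in the rest of this section (directives, PROTO, and the sum variable all appear below), a sketch of what addTwo.asm looks like:

```asm
; addTwo.asm - adds two 32-bit integers
.386                                ; enable 32-bit instructions/registers
.model flat, stdcall                ; flat memory, Windows calling convention
.stack 4096                         ; reserve 4 KB for the stack
ExitProcess PROTO, dwExitCode:DWORD ; declare the Windows API function

.data
sum DWORD 0                         ; 4-byte variable, initialized to 0

.code
main PROC
    mov eax, 5                      ; eax = 5
    add eax, 6                      ; eax = 11
    mov sum, eax                    ; store the result
    INVOKE ExitProcess, 0           ; return exit code 0 (success)
main ENDP
END main
```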
SETUP DIRECTIVES (THE RULES)
Before writing any instructions, we must tell the assembler what kind of program we are writing.
These directives do not generate machine code. They only set rules.
I. Processor Directive — .386
This tells the assembler:
“Generate code for an Intel 80386 processor or newer.”
Why this matters:
The 80386 was the first 32-bit x86 processor
This directive enables 32-bit registers such as EAX, EBX, ECX
Without it, the assembler assumes 16-bit mode
Bottom line:
.386 is required for 32-bit assembly programs.
II. Memory Model — .model flat, stdcall
This directive defines how memory is addressed and how functions are called.
Flat memory model
Memory is treated as one continuous address space
You can access any memory location directly
This is the standard model used by 32-bit Windows
Stdcall calling convention
Used by Windows API functions
Function arguments are passed on the stack
The called function cleans up the stack
Bottom line:
32-bit Windows programs require flat memory and stdcall function calls.
III. Stack Directive — .stack 4096
This reserves 4096 bytes (4 KB) for the program stack.
Why 4096 bytes:
4 KB is the size of a standard memory page
Enough space for local variables and function calls in small programs
Bottom line: The stack is required for function calls and parameter passing.
TALKING TO THE OPERATING SYSTEM
A Windows program must tell the OS when it finishes and whether it succeeded.
I. Function Prototype — PROTO
This tells the assembler:
There is a function named ExitProcess
It takes one parameter
The parameter is a DWORD
Why this is required:
The INVOKE instruction needs to know the function’s parameters
Prevents calling functions with the wrong number or type of arguments
Bottom line:
You must declare a function prototype before using INVOKE.
II. Exit Code — dwExitCode
When a program ends, it returns an exit code to the operating system.
Common values:
0 → Program completed successfully
Non-zero → Program failed or encountered an error
III. Why the Exit Code Matters
Operating systems and scripts check the program’s exit code.
Example:
A batch file runs several programs
It checks %ERRORLEVEL%
If the exit code is 0, it continues
If the exit code is non-zero, it stops or reports an error
Bottom line: Always return 0 if your program finishes correctly.
BUILDING THE PROGRAM (COMPILE & LINK)
Assembly language programs are built in two steps.
Step 1: Assembly
What happens:
MASM converts .asm source code into machine code
Output is an object file (.obj)
Options:
/c → assemble only (do not link)
/coff → use Common Object File Format
Step 2: Linking
What happens:
The linker combines your object file with system libraries
Produces the final executable file (.exe)
Links against Windows libraries such as kernel32.lib
/subsystem:console: Tells Windows this is a console application
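The two build steps might look like this on the command line (a sketch; the exact library list depends on which API functions your program calls):

```
ml /c /coff addTwo.asm
link /subsystem:console addTwo.obj kernel32.lib
```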
BIG IDEAS TO REMEMBER
Use .386 to enable 32-bit registers
Use .model flat, stdcall for 32-bit Windows programs
Declare external functions with PROTO
Always return 0 to indicate success
Build process: .asm → .obj → .exe
THE STACK
You already know this idea from C, so we’ll use C just to confirm what the stack does.
What happens on the stack when main() calls factorial(5)
Each function call creates a stack frame.
When factorial(5) is called:
The return address is pushed onto the stack
(where to go back after the function finishes)
The parameter (n) is pushed onto the stack
Control jumps to the factorial function
The function runs
When it returns:
parameters are removed
return address is popped
execution continues where it left off
Key idea (this is all you need)
The stack remembers where to return and stores function data.
Stack Frames and Local Variables
Local variables live on the stack.
result and i are local variables
They exist only while the function runs
When the function returns, the stack frame is destroyed
The runtime stack stores return addresses, parameters, and local variables for function calls.
ASSEMBLY PROGRAM STRUCTURE
Now let’s connect this to assembly, very simply.
.CODE Directive
Marks the start of executable instructions
Everything after this is code
Usually followed by the program’s entry point, commonly main.
Procedures: PROC and ENDP
PROC marks the start of a procedure
ENDP marks the end
The names must match
END Directive (Very Important)
What this means:
Marks the end of the entire program
Tells the assembler where execution starts
Difference between ENDP and END
Important note: Any lines written after END are ignored by the assembler.
You can put comments there — it won’t matter.
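The directives above fit together like this (a bare skeleton; the procedure name main is the conventional entry point):

```asm
.code
main PROC
    ; instructions go here
main ENDP          ; ends the procedure named main
END main           ; ends the whole file; entry point is main
; anything written after END is ignored by the assembler
```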
RUNNING AND DEBUGGING (SHORT & PRACTICAL)
Assembly programs run inside a console window
Same window as cmd.exe
Breakpoints (Visual Studio)
Click in the gray bar next to a line
A red dot appears
Program pauses before executing that line
If you place a breakpoint on a non-executable line:
VS moves it to the next executable instruction
Debug Mode Visual Cues
Orange bar → debugger is running
Blue bar → edit mode
You cannot edit code while debugging.
Registers While Debugging
Registers window shows CPU registers
Registers that change turn red
EAX = 0000000B → hex for decimal 11
(New VS versions hide some of these by default — that’s normal.)
PROGRAM TEMPLATE IDEA
Assembly programs follow a fixed structure, so we use templates.
Why?
Avoid rewriting setup code
Reduce mistakes
Faster development
Always comment:
program purpose
author
date
changes
This helps future you, not just others.
INCLUDE Directive
Copies another file into your program
Often used for:
macros
procedures
library code
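A common example (the file name is illustrative; Irvine32.inc is the include file used by many MASM textbooks):

```asm
INCLUDE Irvine32.inc   ; the file's contents are pasted in at this point
```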
ASSEMBLE → LINK → RUN (Final Form)
Assembly programs cannot run directly in their raw .asm form, so they must go through a series of steps before the CPU can execute them.
🔗 1. Edit
In this step, you write your assembly source code in a .asm file. This file contains human-readable instructions written using mnemonics and labels.
🔗 2. Assemble
The assembler takes the .asm file and converts it into an object file (.obj). This process translates the human-readable assembly instructions into machine code, but the program is not yet complete or ready to run.
🔗 3. Link
The linker combines one or more .obj files with any required libraries to produce a final executable file (such as .exe). During this step, all references to external functions and data are resolved, and the program becomes a complete unit.
🔗 4. Run
The operating system loads the executable into memory, sets up the necessary environment, and transfers control to the program’s entry point. From there, the CPU begins executing the instructions.
🔗 Final One-Line Summary
The stack manages function calls, .CODE defines executable instructions, PROC/ENDP define procedures, and END marks the program entry point, while assembly programs must be assembled, linked, and then executed before they can run.
LISTING FILES & SYMBOL TABLES
What is a Listing File?
A listing file is a detailed report created by the assembler.
It shows:
Your original source code
Line numbers
The memory address of each instruction
The machine code bytes (in hex)
A symbol table
Think of it as: “Show me exactly what the assembler generated.”
Who actually needs listing files?
Beginners → to understand how assembly becomes machine code
Advanced programmers → to debug performance or instruction layout
For normal programs, you usually don’t need it.
The Symbol Table (The Important Part)
Early programmers had to manually decide memory locations:
This was:
hard to remember
extremely error-prone
The assembler fixes this.
Instead of using raw addresses, we use symbols.
Symbolic Addressing (This is the key idea)
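A sketch of the definition discussed below (the value shown is illustrative; note that a DB initializer must fit in a single byte):

```asm
PayRate DB 64h        ; reserve 1 byte, give it a name and a value
```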
What’s happening here?
PayRate is a symbolic name
DB tells the assembler:
allocate 1 byte of memory
initialize it to the given value (note: a DB initializer must fit in 8 bits, so a value like 64h works, while 100h, which is 256, would be rejected)
The assembler:
assigns a real memory address
stores it in the symbol table
When it sees:
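For example, an instruction that reads the variable by name:

```asm
mov al, PayRate       ; the assembler substitutes PayRate's real address
```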
It silently replaces PayRate with the correct address.
Why this is powerful
You don’t care where the data lives
You only care that the name stays consistent
Code becomes:
readable
maintainable
safe
What does the Symbol Table store?
A symbol table keeps track of:
variables and their addresses
labels (jump targets)
procedures
constants
segments
In short: Every name in your program gets an address.
Listing File Example (Simple Explanation)
A listing file shows lines like this (conceptually):
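A sketch of one such line (the exact column layout varies between assembler versions):

```
00000000  B8 00000005      mov eax, 5
```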
What this tells you:
00000000 → memory offset
B8 → opcode for mov eax, imm32
00000005 → value being moved
Instructions are stored as hex bytes
INVOKE in the listing file
In the listing file, this expands to:
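For a one-argument stdcall function, the expansion looks like this:

```asm
INVOKE ExitProcess, 0     ; what you write

; what the listing shows it became:
push 0                    ; argument goes on the stack
call ExitProcess          ; transfer control to the function
```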
So, INVOKE is just a shortcut — the assembler writes the real instructions.
Why listing files are useful
Listing files help you:
verify machine code generation
see instruction sizes
understand how macros expand
learn how the CPU really sees your program
They are learning and debugging tools, not everyday tools.
80386 Reminder (Very Short)
When you see:
It means the target is an 80386 or newer processor, which enables the 32-bit registers; every modern CPU qualifies.
That’s it. Nothing more needed.
END vs ENDP
🎯 Procedure and Program Endings
In assembly, main ENDP is used to mark the end of a procedure. It tells the assembler that the block of instructions belonging to that procedure is finished.
On the other hand, END main signals the end of the entire program and also specifies the entry point. This tells the system where execution should begin when the program is loaded into memory.
🎯 Generating a Listing File (Optional)
A listing file is something you can generate if you want deeper visibility into how your assembly code is translated. In environments like Visual Studio, you can enable it through the project settings by navigating to the Microsoft Macro Assembler options and turning on listing file generation.
However, in most cases, you don’t actually need a listing file unless you are debugging, learning, or doing low-level analysis.
🎯 Final Chapter Takeaway
Listing files help you see the direct relationship between your source code and the resulting machine code. Symbol tables allow you to use meaningful names instead of raw memory addresses, making programs easier to write and understand.
The assembler plays a much bigger role than simply translating instructions. It allocates memory for variables and data, keeps track of all symbols such as labels and procedures, and replaces those symbolic names with actual memory addresses during assembly.
Because of this, assembly language becomes far more usable and readable. The assembler essentially acts as a bridge between human-friendly code and the low-level machine instructions that the CPU executes.
INTRINSIC DATA TYPES
What does “intrinsic data types” mean?
Intrinsic data types are the built-in data sizes that the assembler understands.
They answer three simple questions:
How big is the data? (8 bits, 16 bits, 32 bits, etc.)
Is it signed or unsigned? (can it be negative?)
Is it an integer or a real (floating-point) number?
That’s it. No magic.
What the assembler actually cares about
Here’s the key idea:
The assembler mainly cares about size.
It needs to know:
how many bytes to reserve
how many bytes an instruction will read or write
The assembler does NOT strongly enforce: signed vs unsigned
That distinction is mostly for humans.
Signed vs Unsigned (Important but subtle)
DWORD → 32-bit unsigned
SDWORD → 32-bit signed
Both:
are 32 bits
take up 4 bytes
look identical in memory
The only difference is how you interpret the bits
That’s why programmers often use SDWORD:
not because the assembler demands it
but because it makes intent clear
Why intrinsic data types matter
Intrinsic data types help you:
choose the correct operand size
avoid reading or writing the wrong number of bytes
understand how values are stored in memory
If you get the size wrong, the CPU will still execute, but your result may be wrong or corrupted.
Key Takeaways
Intrinsic data types describe the size, signed/unsigned nature, and whether the value is an integer or real number.
The assembler cares about operand size, but does not enforce signed vs unsigned.
Programmers often use SDWORD to indicate signedness, but it is not required.
Intrinsic data types help explain how data is stored and used in assembly.
About overlapping types (Very important concept)
Some types overlap in functionality.
Example:
DWORD → 32-bit unsigned
SDWORD → 32-bit signed
Same size. Same memory.
Different meaning.
The assembler sees “32 bits”.
The programmer sees “signed” or “unsigned”.
So when I say “intrinsic data types”…
These are the basic building blocks of all data in a computer.
Let’s walk through them naturally.
🌊 Bit-Level Building Blocks (From Smallest to Largest)
At the lowest level, all data in a computer is built from simple binary units that scale up into larger, more useful forms.
A bit is the smallest unit of data and can only hold a value of 0 or 1. It represents a single binary state.
A nibble consists of 4 bits. It is half a byte and is commonly used to represent a single hexadecimal digit.
A byte is made up of 8 bits and is one of the most important basic units in computing. A byte can store a single character (like a letter) or a small number, which is why it is widely used.
A word is 16 bits, or two bytes. It allows the storage of larger numbers and is often used in older or lower-level systems.
A double word (DWORD) is 32 bits, or four bytes. This size is very common in 32-bit systems and programs.
A quad word (QWORD) is 64 bits, or eight bytes. It is used for very large numbers and is standard in modern 64-bit systems.
All data structures, variables, and memory layouts are ultimately built from these fundamental units.
🌊 Intrinsic Data Types in Assembly
Assembly language provides several built-in data types that map directly to these bit sizes, especially for integers.
For integer types, a BYTE is an 8-bit unsigned value with a range from 0 to 255, while an SBYTE is also 8 bits but signed, allowing values from –128 to 127.
A WORD is a 16-bit unsigned integer ranging from 0 to 65,535, while an SWORD is signed and ranges from –32,768 to 32,767.
A DWORD is a 32-bit unsigned integer with a range from 0 to 4,294,967,295, and an SDWORD is signed, ranging from –2,147,483,648 to 2,147,483,647.
🌊 Larger and Special Integer Types
Some data types are less common but still important in specific contexts.
An FWORD is 48 bits and was mainly used for far pointers in older protected-mode systems.
A QWORD is 64 bits and is used for very large integers, especially in modern systems.
A TBYTE is 80 bits and is rarely used. It is mostly associated with specialized operations in the floating-point unit.
🌊 Floating-Point (Real Number) Types
For handling decimal (real) numbers, assembly provides floating-point types with different levels of precision.
A REAL4 is a 32-bit floating-point number and is commonly used for basic decimal values.
A REAL8 is a 64-bit floating-point number, offering much higher precision and accuracy.
A REAL10 is an 80-bit floating-point number that provides very high precision but is rarely used in typical applications.
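The types above look like this as declarations (variable names are illustrative; each value shown is near the limit of its type's range):

```asm
u8  BYTE   255               ; unsigned 8-bit
s8  SBYTE  -128              ; signed 8-bit
u16 WORD   65535             ; unsigned 16-bit
s16 SWORD  -32768            ; signed 16-bit
u32 DWORD  4294967295        ; unsigned 32-bit
s32 SDWORD -2147483648       ; signed 32-bit
q64 QWORD  12345678900       ; 64-bit integer
f32 REAL4  3.14              ; 32-bit float
f64 REAL8  3.14159265358979  ; 64-bit float
```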
🌊 Key Idea
Even though this may seem like a lot of different types, they all come down to the same fundamental concept: everything in a computer is stored as binary. These data types simply define how many bits are used and how those bits should be interpreted—whether as integers, signed values, or floating-point numbers.
The assembler cares about how many bytes.
The programmer cares about what those bytes mean.
That’s why intrinsic data types exist.
DATA DEFINITIONS (ASSEMBLY VARIABLES)
A data definition in assembly is how you create a variable.
It answers two questions:
How much memory do I need?
What value should it start with?
General syntax
label → the variable name (optional, but almost always used)
directive → the data type / size
value → the initial value
Example
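The example described below:

```asm
count DWORD 12345    ; reserve 4 bytes named count, starting value 12345
```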
This means:
create a variable named count
reserve 4 bytes (32 bits)
store the value 12345 in it
Equivalent C code: int count = 12345; (same idea, different language).
More examples
📦 What’s Happening Here
In assembly, each variable declaration tells the assembler how much memory to reserve and what kind of data will be stored.
The variable message uses DB (Define Byte), which allocates one byte per character. When you store "Hello, world!", each character takes one byte, so the full string occupies 13 bytes in memory (including punctuation and spaces).
The variable age is declared as a BYTE, which reserves 1 byte of memory. This is enough to store a small number, and in this case, it holds the value 25.
The variable salary is declared as an SDWORD, which reserves 4 bytes (32 bits). This allows it to store a signed integer, meaning it can handle both positive and negative values within a much larger range than a single byte.
Each declaration controls two things: how much memory is reserved and how the stored value is interpreted.
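The three declarations just discussed might look like this (the salary value is an assumption for illustration):

```asm
message BYTE   "Hello, world!"   ; 13 bytes, one per character
age     BYTE   25                ; 1 byte
salary  SDWORD 50000             ; 4 bytes, signed
```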
📦 Why the data type matters
The assembler must know the size of the variable:
how many bytes to reserve
how many bytes instructions should read or write
If you don’t specify the type, the assembler has no idea what to do.
📦 Assembly vs C (Same concept)
Both:
reserve memory
assign an initial value
give the memory a name
Assembly just makes the size explicit.
Short forms (Just aliases)
These are short names, not new types:
BYTE → DB
WORD → DW
DWORD → DD
QWORD → DQ
TBYTE → DT
They all do the same job: reserve memory.
Legacy Data Directives (Still used in 2026?)
Yes — absolutely still used.
Directives like DB, DW, DD, DQ, and DT are:
still supported
still common
still the standard way to define data in MASM
They are called “legacy” only because they’ve been around forever, not because they’re obsolete.
The Core Data Directives (Explained Clearly)
1. DB — Declare Byte (8 bits)
Reserves 1 byte per value.
Common uses:
characters
small numbers
strings (byte-by-byte)
2. DW — Declare Word (16 bits)
Reserves 2 bytes.
Used for:
16-bit values
older or compact data
3. DD — Declare Doubleword (32 bits)
Reserves 4 bytes.
This is one of the most common directives in 32-bit programs.
4. DQ — Declare Quadword (64 bits)
Reserves 8 bytes.
Used for:
large integers
64-bit values
5. DT — Declare Ten Bytes (80 bits)
Reserves 10 bytes.
Used for:
extended precision floating-point (FPU)
rare, but valid
ABOUT STRINGS AND NULL TERMINATORS
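A sketch of the two forms compared below (the string content is illustrative):

```asm
str1 BYTE "Hello"        ; 5 bytes, no terminator
str2 BYTE "Hello", 0     ; 6 bytes; the trailing 0 marks the end
```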
Both are valid.
The second one:
adds a null terminator
is better when interacting with C-style functions
MASM does not automatically add 0 for you.
Big Idea to Remember
Data definition directives:
reserve memory
define size
optionally initialize values
The assembler:
assigns addresses
tracks them in the symbol table
replaces variable names with real memory locations
You write names.
The assembler handles addresses.
Data definitions are how assembly creates variables, by explicitly stating how many bytes to reserve and what value to store in them.
DEFINING DATA TYPES (PART 1 – BEGINNER EXPLANATION)
Big Picture: What This Section Is About
This section explains:
How variables are defined in assembly
How variables are initialized
What happens if variables are not initialized
How different byte-sized data types work
Main Rules for Data Definitions
1. At Least One Initializer Is Required
When you define a variable, the assembler expects a value.
Example:
DWORD → data type (4 bytes)
0 → initializer
Even zero counts as a valid initializer.
2. Multiple Initializers Use Commas
You can define multiple values at once by separating them with commas. Example:
This creates four bytes in memory.
3. Integer Initializers Must Match the Data Size
For integer data types, the value must fit in the size of the variable. Example:
4. Leaving a Variable Uninitialized (?)
If you want to reserve memory without giving it a value, use ?.
Example:
This means:
Memory is reserved
The value is unknown (garbage) at program start
⚠️ Important: Uninitialized variables must not be used before assigning a value.
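The four rules above, sketched with illustrative names:

```asm
total  DWORD 0                ; rule 1: zero is a valid initializer
list   BYTE  10, 20, 30, 40   ; rule 2: commas create four bytes
small  BYTE  255              ; rule 3: value must fit (256 would be an error)
buffer BYTE  ?                ; rule 4: memory reserved, value unknown
```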
5. Everything Becomes Binary
No matter how you write an initializer:
Decimal
Hex
Character literal
The assembler converts it into binary before storing it in memory.
6. Worked Example: Adding Two Numbers
Defines a variable: sum DWORD 0
sum is a 4-byte integer initialized to 0; the program loads 5 into eax, adds 6 to it so eax becomes 11, and then stores the result: mov sum, eax
The program exits and final value is 11.
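The steps above can be sketched as:

```asm
.data
sum DWORD 0          ; 4-byte variable, initialized to 0
.code
    mov eax, 5       ; eax = 5
    add eax, 6       ; eax = 11
    mov sum, eax     ; sum now holds 11
```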
7. Debugging Tip
To observe the variable, set a breakpoint after mov sum, eax, step through the instructions, and watch sum in the debugger to see the memory value change in real time.
BYTE-SIZED DATA TYPES (Very Important)
BYTE / DB (Unsigned, 8 bits)
Size: 1 byte (8 bits)
Range: 0 to 255
Used for: small numbers, characters, raw data
Examples:
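Illustrative BYTE definitions (names and values assumed):

```asm
count BYTE 255     ; largest unsigned byte value
grade BYTE 'A'     ; a character is just one byte (ASCII 65)
```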
SBYTE is a signed 8-bit data type that occupies 1 byte of memory, can store values from −128 to +127, and is commonly used for small numbers that may be negative (for example: temp SBYTE -10 or change SBYTE 5).
Signed vs Unsigned
Unsigned → only positive values (and zero)
Signed → positive and negative values
Uninitialized Variables (Important Warning)
Declaring a variable with ? reserves 1 byte of memory but does not initialize it, so the value stored is random garbage, just like an uninitialized local variable in C, which is why you must always initialize variables before using them.
Character Initialization Example
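The example described below (the variable name is assumed):

```asm
char1 BYTE 'B'     ; stored as the single byte 66
```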
'B' is a character
ASCII value of 'B' = 66
Stored as one byte
Signed Byte Example
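The example described below (the variable name is assumed):

```asm
val1 SBYTE -12     ; one byte, stored in two's-complement form
```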
Stores -12
Uses signed representation
Can hold negative values
Key Takeaways (Exam-Ready)
Variables must have an initializer (or ?)
? means uninitialized (garbage value)
BYTE / DB = unsigned 8-bit
SBYTE = signed 8-bit
Character literals are stored as ASCII values
All data becomes binary in memory
Defining a variable means reserving memory and deciding how the bits should be interpreted.
DATA DEFINITION PART 2: ARRAYS & SIZES
In high-level languages like C++ or Python, you create an array with brackets []. In Assembly, you just list values one after another.
Creating Arrays (The "Label" Trick)
When you define multiple values under one name, you are creating an array.
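The definition discussed below:

```asm
list BYTE 10, 20, 30, 40   ; four bytes; the label names only the first
```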
You are creating 4 bytes in memory:
The label list only points to the first value, which is 10.
The assembler doesn’t automatically give names to the other values (20, 30, 40).
To access them, you have to calculate their position relative to list.
For example:
list → gives you 10
list + 1 → gives you 20
list + 2 → gives you 30
list + 3 → gives you 40
So, the label is like the “starting address” of your array, and the other elements are reached by adding an offset in bytes.
The Memory Map
If list starts at memory offset 0000:
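A sketch of the layout (offsets assume the array begins at 0000):

```
Offset   Contents
0000     10        <- list points here
0001     20
0002     30
0003     40
```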
Contiguous Memory
When you write:
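A sketch of the multi-line definition being described (the values are illustrative; continuation lines simply omit the label):

```asm
list BYTE 10, 20, 30, 40      ; offsets 0000-0003
     BYTE 50, 60, 70, 80      ; offsets 0004-0007
     BYTE 90, 100, 110, 120   ; offsets 0008-000B
```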
Here’s what’s happening:
The assembler doesn’t care about line breaks.
As long as you don’t give a new label, it just keeps placing the numbers right after the previous ones in memory.
So, all 12 numbers are stored one after another in memory.
Memory layout looks like this:
The label list points only to the first number (10 at offset 0).
To access the others, you use offsets: list + 1 → 20, list + 4 → 50, etc.
To the computer, this is just one long strip of memory, like a long row of boxes.
BYTE vs INTEGER Confusion 🤯
Many beginners get confused because:
In C++/Java, int is always 4 bytes (32 bits).
In Assembly, numbers don’t have a fixed size by default. They are stored in a container (data type) you choose.
Think of it like boxes:
Number 10 fits easily in a BYTE (8-bit box).
You don’t need a DWORD (4-byte box) for such a small number.
The S prefix marks the signed variants (SBYTE, SWORD, SDWORD); the plain types (BYTE, WORD, DWORD) are unsigned.
Why use BYTE instead of DWORD?
Memory efficiency: 1,000 small numbers (like ages 0–100) → 1,000 bytes with BYTE, but 4,000 bytes with DWORD. Saving 75% of memory!
Compatibility: Some old hardware or file formats expect data to be in bytes.
⚠️ The Catch
If you try to put a number bigger than 255 into a BYTE:
The assembler will give an error, or
It might silently chop off the extra bits, giving you the wrong value.
✅ In short:
You can spread your data across multiple lines; the assembler just packs them in a row.
BYTE is just a small container, use it when the number is small.
Integers in assembly are as big as you declare (BYTE, WORD, DWORD, etc.), unlike high-level languages.
MIXING RADIXES (THE "SALAD BOWL")
Assembly doesn't care how you write the number.
You can mix Hex, Decimal, Binary, and Character literals in the same list.
They all get converted to binary in the end.
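A sketch of a mixed-radix list (the values are illustrative):

```asm
salad BYTE 50, 20h, 00100010b, 'A'   ; decimal, hex, binary, character
```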
Big Idea to Remember
Labels point to the start: list is just the address of the first item. To get the rest, you add to the address (Offset).
Contiguous Memory: Data defined sequentially sits sequentially in RAM.
Size matters, not type: You can store an "integer" in a BYTE as long as it fits (0-255). You don't always need a DWORD.
STRINGS
Strings Are Just Arrays of Bytes
In assembly, there is no “string type” like in high-level languages (C, Python, etc.).
A string is just a sequence of bytes.
Each character in the string is stored in one byte.
The byte holds the ASCII value of the character.
Example:
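A sketch of the string definitions discussed below (the labels names1-names3 come from the text; the string contents are assumed):

```asm
names1 BYTE "Dave", 0      ; 5 bytes: 4 characters + null terminator
names2 BYTE "Sara", 0
names3 BYTE "Ahmed", 0
```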
✅ Notice that:
Each character takes 1 byte.
The null terminator (0) is also a single byte marking the end of the string.
Labels Are Just Starting Addresses
names1, names2, names3 are labels.
A label is just a pointer to the first byte of the string in memory.
The computer uses the label as a starting reference, but it doesn’t know the length of the string unless you tell it.
Everything after the first byte is contiguous memory (like we discussed with list BYTE 10,20…).
Why We Use BYTE
We write BYTE because each character fits in 1 byte.
Strings are really arrays of bytes, not a special datatype.
Think of it like this:
Each character is stored in one box (byte).
The null byte (0) is the stop signal for string functions, like printf in C or WinAPI string routines.
Multi-line Strings & Special Characters
You can split strings across multiple lines or add special characters:
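A sketch (the text is illustrative; the trailing backslash continues the logical line):

```asm
msg BYTE "First line",  0Dh, 0Ah,  \
         "Second line", 0Dh, 0Ah, 0
```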
0Dh = carriage return (CR) → moves cursor to start of line
0Ah = line feed (LF) → moves cursor down a line
\ → line continuation character (lets you break one string across multiple lines)
Memory layout is still just a sequence of bytes, now including CR/LF:
Everything remains a byte.
Putting It All Together
Each string is a contiguous sequence of bytes in memory.
The label points to the first byte.
Each character = 1 byte (ASCII code).
Null terminator (0) = 1 byte marks the end.
Multi-line strings or special characters like CR/LF are just additional bytes in the same array.
So even the biggest sentence like "Learning Reverse Engineering then C#" is just a row of bytes:
✅ Key Insight
Strings in assembly are not magical objects.
They are arrays of bytes.
The label is the pointer.
The assembler only cares about memory.
Null terminators allow functions to know where the string ends.
✅ DUP Operator (Duplicate Made Easy)
The DUP operator in assembly is all about making copies—it lets you allocate multiple pieces of memory and optionally initialize them with the same value.
Think of it as a “memory copy machine” for variables, arrays, strings, or even structures.
How it works:
Count: How many times you want to repeat something.
Value: What you want to repeat (it can be a number, a string, or even an uninitialized placeholder).
The syntax looks like this:
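```
label  <data type>  <count> DUP ( <value> )
```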
<data type> could be BYTE, WORD, DWORD, etc.
<count> is how many times you want to repeat.
<value> is what you want to fill each slot with. If you leave it as ?, the memory is just reserved but contains random “garbage” values until you set it.
Examples:
Allocate 20 bytes, all zero:
This creates a block of 20 bytes, each containing 0.
Allocate 20 bytes, uninitialized:
Memory is reserved for 20 bytes, but the values are undefined.
Think of it like an empty box, you can fill it later.
Create a repeated string:
This repeats the sequence "STACK" four times in memory, effectively making "STACKSTACKSTACKSTACK".
Allocate an array of 10 integers, initialized to zero:
Here, you get 10 integers, each 4 bytes, all set to 0.
Allocate an array of structures:
This reserves space for 10 structures, each containing a 4-byte integer and a 4-byte string.
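The first four examples above, sketched with illustrative names:

```asm
buffer   BYTE  20 DUP(0)       ; 20 bytes, all zero
buffer2  BYTE  20 DUP(?)       ; 20 bytes, reserved but uninitialized
stackMsg BYTE  4 DUP("STACK")  ; 20 bytes: "STACKSTACKSTACKSTACK"
numbers  DWORD 10 DUP(0)       ; ten 32-bit zeros (40 bytes)
```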
Key idea:
Yes, DUP literally means “duplicate”. It’s your way to repeat a value or pattern efficiently in memory without writing it out multiple times.
Whether you’re filling arrays, initializing strings, or creating structures, DUP saves time, space, and effort.
Think of it like telling the assembler: “Hey, make 10 of this, or 20 of that, all lined up in memory, and set them to this value—or leave them blank for now.”
WORD and SWORD
In assembly language, WORD and SWORD are used to work with 16-bit numbers. Each 16-bit number takes 2 bytes of memory.
WORD (Unsigned 16-bit Integer)
WORD is for unsigned numbers, meaning only positive numbers from 0 to 65535.
Each WORD reserves 2 bytes in memory.
SWORD (Signed 16-bit Integer)
SWORD is for signed numbers, meaning it can store negative and positive numbers from -32768 to 32767.
Each SWORD also takes 2 bytes in memory.
Example:
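Illustrative definitions (names assumed; the values sit at the edges of each range):

```asm
count WORD  65535    ; unsigned 16-bit, maximum value
temp  SWORD -32768   ; signed 16-bit, minimum value
```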
Key Idea:
Think of WORD as a box that only holds positive numbers, and SWORD as a box that can hold negative numbers too. Both boxes are 16 bits (2 bytes) wide, so the memory size is the same, only the interpretation changes.
WORD Arrays
You can create arrays of 16-bit numbers in assembly, just like arrays in C, using either explicit listing or the DUP operator.
Memory layout: Each 16-bit element occupies 2 bytes. So if your array starts at memory offset 0000, the next element is at 0002, then 0004, and so on.
Example with explicit listing:
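```asm
myWords WORD 10, 20, 30, 40   ; offsets 0000, 0002, 0004, 0006
```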
Example with DUP (uninitialized array):
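```asm
array WORD 5 DUP(?)    ; five uninitialized 16-bit slots (10 bytes)
```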
Here, ? means the elements are uninitialized. They have random “garbage” values until your code sets them.
Visualizing Memory (Conceptual):
Each element takes 2 bytes, so to access the next element, you increment the offset by 2.
Summary:
WORD: Unsigned 16-bit number (0 to 65535)
SWORD: Signed 16-bit number (-32768 to 32767)
Arrays: Use listing or DUP to store multiple words, remembering each takes 2 bytes in memory.
DWORD and SDWORD
In assembly language, DWORD and SDWORD are used to work with 32-bit integers. Each 32-bit number takes 4 bytes of memory.
I. DWORD (Unsigned 32-bit Integer)
DWORD is for unsigned numbers, meaning only positive numbers from 0 to 4,294,967,295.
Each DWORD reserves 4 bytes in memory.
Usage Tip: You can also use DD (Define Doubleword) as a legacy directive. It works the same as DWORD:
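```asm
val1 DD 12345678h      ; identical to: val1 DWORD 12345678h
```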
II. SDWORD (Signed 32-bit Integer)
SDWORD is for signed numbers, meaning it can store negative and positive numbers from -2,147,483,648 to 2,147,483,647.
Each SDWORD also takes 4 bytes in memory.
III. Arrays of 32-bit Numbers
You can create arrays of DWORDs or SDWORDs either by listing values explicitly or using the DUP operator:
Explicit initialization:
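A sketch with illustrative values:

```asm
.data
list1 DWORD  1, 2, 3, 4          ; four 32-bit elements (16 bytes)
list2 SDWORD -3, -2, -1, 0, 1    ; signed values work the same way
```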
Uninitialized array using DUP:
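For example:

```asm
.data
array DWORD 10 DUP(?)   ; ten uninitialized doublewords (40 bytes)
```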
Memory layout concept:
Each element occupies 4 bytes, so if the first element is at offset 0000, the next is at 0004, then 0008, and so on.
Arrays let you easily store multiple 32-bit numbers in contiguous memory.
IV. Extra Tip: DWORD for Offsets
You can also use DWORD to store the 32-bit memory offset of another variable:
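A minimal sketch — when a DWORD is initialized with a variable name, MASM stores that variable's offset:

```asm
.data
myVar DWORD 12345678h
pVal  DWORD myVar       ; holds the 32-bit offset of myVar
```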
This is useful for pointers or referencing other variables in memory.
V. Summary:
DWORD: Unsigned 32-bit integer, 4 bytes, 0 → 4,294,967,295
SDWORD: Signed 32-bit integer, 4 bytes, -2,147,483,648 → 2,147,483,647
Arrays: Use listing or DUP to store multiple DWORDs
Legacy DD directive works the same as DWORD
QWORD (Quadword)
The QWORD directive in assembly language is used to allocate storage for 64-bit values, meaning each QWORD takes 8 bytes of memory.
Think of it as a really big box that can hold very large numbers.
1. Syntax and Usage
You can define QWORD values in two ways:
Standard directive:
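```asm
.data
quad1 QWORD 1234567812345678h   ; 8 bytes
```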
Short form (DQ – Define Quadword):
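```asm
.data
quad2 DQ 1234567812345678h      ; identical to QWORD
```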
Tip: The value must fit in 64 bits, otherwise the assembler will throw an error.
2. Memory Organization
Each QWORD takes 8 bytes, so if you define multiple QWORDs in an array, memory offsets increase by 8 each time:
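For example, assuming the list starts at offset 0000, the elements land at 0000, 0008, and 0010 (hex):

```asm
.data
quadList QWORD 1, 2, 3   ; offsets 0000h, 0008h, 0010h
```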
This is just like how DWORD arrays worked, but each element is double the size.
3. Arrays of QWORDs
Just like with BYTE or DWORD, you can use the DUP operator to define multiple QWORDs at once:
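For example:

```asm
.data
qArray QWORD 10 DUP(0)   ; 10 elements × 8 bytes = 80 bytes, all zeroed
```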
Each element is 8 bytes, so this reserves 80 bytes total (10 × 8).
Using ? instead of 0 leaves them uninitialized:
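```asm
.data
qArray QWORD 10 DUP(?)   ; 80 bytes reserved, contents undefined
```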
4. QWORD and Registers
In 32-bit mode, your registers like EAX are 32 bits, so storing a 64-bit QWORD might need two 32-bit registers or special memory instructions.
In 64-bit mode, the RAX register can hold a full QWORD directly.
Example:
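A minimal sketch — this only assembles in 64-bit mode (ml64), where the full-width RAX register is available:

```asm
.data
bigNum QWORD 0A1B2C3D4E5F60708h

.code
mov rax, bigNum   ; 64-bit mode: RAX holds all 8 bytes at once
```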
This is important if you start working with large numbers, addresses, or high-precision calculations.
5. Summary Notes
QWORD = 64 bits = 8 bytes
QWORD can be initialized directly or with DUP
Short form DQ is equivalent to QWORD
Arrays increment in memory by 8 bytes per element
32-bit registers can’t hold QWORDs directly; use 64-bit registers or split into two 32-bit halves
💡 Memory efficiency tip:
Use QWORD only when you need numbers bigger than 32 bits, otherwise DWORD is enough and takes half the memory.
Never forget this concept in assembly: the directive only decides how many bytes are reserved and how their bits are interpreted. Let's continue….
PACKED BCD AND TBYTE
Packed BCD (Binary Coded Decimal) is a special way to represent decimal numbers in binary, designed for efficiency and precision, especially in financial or scientific applications.
1. What is Packed BCD?
Packed BCD stores decimal digits in pairs, with two decimal digits per byte.
Example: The decimal number 1234 in packed BCD is stored as 34 12 (hex representation in memory).
The lower nibble of a byte stores one digit.
The higher nibble stores the next digit.
Sign byte: The highest byte of a packed BCD variable indicates the sign.
00h → Positive
80h → Negative
Think of it like a tightly packed stack of digits, with the sign sitting on top.
Why use it?
Compact storage of decimal digits — two per byte, instead of one digit per byte as in unpacked BCD.
Accurate decimal arithmetic — important for money calculations, scientific data, and some embedded systems.
2. The TBYTE Directive
In MASM, TBYTE is used to declare variables that can store packed BCD data.
Even though TBYTE is 80 bits (10 bytes), it isn’t just “10 bytes of storage” — it can also hold floating-point numbers or other data formats.
Memory layout for a TBYTE BCD number:
Most significant byte: the sign (00h positive, 80h negative). It is written first in the hex literal, but on little-endian x86 it sits at the highest memory address.
Remaining 9 bytes: up to 18 decimal digits, 2 digits per byte.
Example: Declaring a packed BCD variable
The correct way:
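For example (names illustrative — the hex digits of the literal spell out the decimal digits):

```asm
.data
posVal TBYTE 000000000000001234h   ; +1234 (sign byte = 00h)
negVal TBYTE 800000000000001234h   ; -1234 (sign byte = 80h)
```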
Important: MASM does not automatically convert decimal numbers to BCD. You must write them in hexadecimal BCD form.
3. Packed BCD in Memory
Let’s look at 1234 as an example:
Each byte after the sign byte stores two decimal digits.
Positive numbers start with 00h. Negative numbers start with 80h.
Visualizing storage:
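For +1234 (declared as `TBYTE 000000000000001234h`), the 10 bytes in memory, lowest address first, would look like:

```text
Offset:  00  01  02  03  04  05  06  07  08  09
Byte:    34  12  00  00  00  00  00  00  00  00
          ^ digits      (leading zeros)      ^ sign (00h = +)
```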
If the number were larger, you’d continue storing 2 digits per byte.
4. Declaring Arrays of Packed BCD
You can use DUP with TBYTE too:
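For example (name illustrative; each initializer is an 18-digit hex zero so it fits the BCD format):

```asm
.data
bcdArray TBYTE 5 DUP(000000000000000000h)   ; five 10-byte BCD values
```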
Each element takes 10 bytes.
Total memory: 5 × 10 = 50 bytes
Uninitialized array:
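```asm
.data
bcdArray TBYTE 5 DUP(?)   ; 50 bytes reserved, contents undefined
```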
5. Converting Real Numbers to Packed BCD
Sometimes, you have floating-point numbers (REAL4, REAL8, etc.) and want them as packed BCD.
This is done using FPU instructions:
FLD → Load floating-point number onto FPU stack
FBSTP → Convert the value to packed BCD and store it in bcdVal
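The two steps above look like this in code (variable names illustrative):

```asm
.data
posVal REAL8 1.5
bcdVal TBYTE ?

.code
fld   posVal   ; load 1.5 onto the FPU stack
fbstp bcdVal   ; round to integer and store as packed BCD
```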
Example: If posVal = 1.5, then bcdVal will store 02 in packed BCD.
6. Why Packed BCD Matters
Efficiency: Stores two digits per byte instead of wasting 8 bits for a single decimal digit.
Accuracy: No rounding errors when doing decimal math, unlike floating-point binary.
Applications: Financial apps, calculators, scientific measurements, embedded systems.
Analogy: Think of Packed BCD as a neatly packed number stack, where each box holds 2 digits, and the top box holds the sign. Computers can easily read, write, and calculate with these numbers without wasting memory.
7. Quick Reference
Directive: TBYTE
Size: 10 bytes (80 bits)
Sign byte: First byte, 00h positive, 80h negative
Digits: Next 9 bytes, 2 digits per byte
Initialization: Must be in hexadecimal
Arrays: Use DUP operator for multiple variables
✅ Example: Complete Packed BCD Declaration
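A sketch matching the memory-usage notes below (values illustrative):

```asm
.data
myBCD1     TBYTE 000000000000001234h   ; +1234
myBCD2     TBYTE 800000000000005678h   ; -5678
myBCDArray TBYTE 3 DUP(?)              ; three uninitialized 10-byte slots
```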
Memory usage:
myBCD1 → 10 bytes
myBCD2 → 10 bytes
myBCDArray → 30 bytes (3 × 10)
Packed BCD is essentially a super-efficient way to store decimal numbers where every byte counts.
The TBYTE directive is just your tool for declaring variables that can hold packed BCD or other 10-byte data types.
DEFINING FLOATING-POINT TYPES IN MASM
Floating-point numbers are used to represent real numbers, meaning numbers with fractional parts, like 1.23456789.
In MASM, there are three main floating-point types:
I. Single-Precision: REAL4
Size: 4 bytes (32 bits)
Precision: ~7 significant digits
Range: ±3.4×10³⁸ to ±1.2×10⁻³⁸
Example:
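```asm
.data
shortVal REAL4 3.14159   ; 4 bytes, ~7 digits of precision
```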
Memory usage: 4 bytes
Good for general-purpose calculations where moderate precision is enough.
II. Double-Precision: REAL8
Size: 8 bytes (64 bits)
Precision: ~15 significant digits
Range: ±1.7×10³⁰⁸ down to ±2.2×10⁻³⁰⁸
Example:
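```asm
.data
doubleVal REAL8 3.14159265358979   ; 8 bytes, ~15 digits of precision
```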
Memory usage: 8 bytes
Use when high precision is required, such as scientific calculations or very small/large numbers.
III. Extended-Precision: REAL10
Size: 10 bytes (80 bits)
Precision: ~19 significant digits
Range: ±1.2×10⁴⁹³² down to ±3.4×10⁻⁴⁹³²
Example:
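```asm
.data
extVal REAL10 3.141592653589793238   ; 10 bytes, ~19 digits of precision
```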
Memory usage: 10 bytes
Ideal for high-precision math, financial, or scientific computations where single- or double-precision isn’t enough.
IV. Arrays of Floating-Point Numbers
You can use the DUP operator to declare arrays of floating-point variables:
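For example:

```asm
.data
shortArray REAL4 20 DUP(0.0)   ; 20 × 4 = 80 bytes, all zeroed
```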
Memory usage: 20 × 4 = 80 bytes
Efficient way to initialize large arrays of floating-point numbers.
V. Using DD, DQ, and DT Directives
MASM also allows you to declare floating-point numbers with DD, DQ, and DT, which are legacy equivalents:
Examples:
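A sketch, assuming the assembler accepts real constants with the legacy directives (MASM does):

```asm
.data
shortVal  DD 3.14159                ; same as REAL4
doubleVal DQ 3.14159265358979       ; same as REAL8
extVal    DT 3.141592653589793238   ; same as REAL10
```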
This is equivalent to using REAL4, REAL8, REAL10. Use whichever style you prefer, but REALx directives are clearer for readability.
VI. Precision vs Range
Precision: Number of significant digits the type can represent.
Range: Maximum and minimum values it can store.
Tip: Extended-precision is overkill for most applications but is useful for scientific or financial computing.
VII. Real Numbers vs Floating-Point Numbers
Real Numbers (math concept): Infinite precision and size. Can include fractions, irrational numbers (like π), etc.
Floating-Point Numbers (computer representation): Approximation of real numbers.
Finite precision and range, limited by storage size (4, 8, or 10 bytes).
✅ Key points:
Floating-point types approximate real numbers.
Precision is limited.
They can represent extremely large or small numbers, but not perfectly.
✅ Quick Example Summary
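A compact sketch (names illustrative):

```asm
.data
sVal REAL4  1.5          ; 4 bytes
dVal REAL8  1.5          ; 8 bytes
xVal REAL10 1.5          ; 10 bytes
rArr REAL4 10 DUP(0.0)   ; 10 × 4 = 40 bytes
```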
Memory usage: 4, 8, 10 bytes for each variable
Arrays: multiply size by element count
Summary:
Floating-point types in MASM (REAL4, REAL8, REAL10) let you store real numbers of varying precision.
Choose REAL4 for normal calculations, REAL8 for high-precision scientific data, and REAL10 for extreme precision.
Arrays can be initialized using DUP, and legacy directives DD, DQ, DT are equivalent but less readable.
ADD NUMBERS PROGRAM (ADDING INTEGER VARIABLES)
This program shows how to add three 32-bit integers stored in memory and store the result in a fourth variable.
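The program opens with the directives explained next (a standard 32-bit MASM/Windows skeleton):

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD
```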
Explanation:
.386: Enables 80386 instructions, meaning we can use 32-bit registers like EAX.
.model flat, stdcall: Flat memory model (all memory in one linear space) with the stdcall calling convention.
.stack 4096: Reserves a 4 KB stack for function calls and local variables.
ExitProcess PROTO: Declares a prototype for the Windows API function ExitProcess, which terminates the program.
Declaring Data (Variables)
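The data section referenced in the walkthrough:

```asm
.data
firstval  DWORD 20002000h
secondval DWORD 11111111h
thirdval  DWORD 22222222h
sum       DWORD 0
```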
.data section: Where all global variables and constants are stored.
DWORD: Each variable is 4 bytes (32 bits).
Hexadecimal values like 20002000h are base-16 numbers, which the CPU stores as binary in memory.
Memory Layout (example):
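Assuming the first variable sits at offset 0000:

```text
Offset  Variable    Value
0000    firstval    20002000h
0004    secondval   11111111h
0008    thirdval    22222222h
000C    sum         (set at run time)
```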
Each value occupies 4 bytes in memory, stored consecutively unless alignment or padding is introduced.
Code Section (Adding the Variables)
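The code section that the step-by-step walkthrough below explains:

```asm
.code
main PROC
    mov eax, firstval      ; EAX = 20002000h
    add eax, secondval     ; EAX = 20002000h + 11111111h = 31113111h
    add eax, thirdval      ; EAX = 31113111h + 22222222h = 53335333h
    mov sum, eax           ; write the result back to memory
    INVOKE ExitProcess, 0  ; return control to Windows
main ENDP
END main
```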
🔑 Step-by-Step Explanation
The instruction mov eax, firstval copies the value stored in firstval (which is 20002000h) into the 32-bit register EAX. At this point, EAX is holding that value for further operations.
Next, add eax, secondval adds the value of secondval (11111111h) to whatever is already in EAX. After this instruction, EAX now contains the result of 20002000h + 11111111h.
Then, add eax, thirdval adds thirdval (22222222h) to the current value in EAX. Now, EAX holds the total sum of all three values.
The instruction mov sum, eax takes the final result stored in EAX and writes it back into the variable sum in memory. This preserves the result outside the register.
Finally, INVOKE ExitProcess, 0 calls a system function to terminate the program. The 0 indicates that the program ended successfully.
🔑 Key Points to Understand
Registers and memory serve different purposes. Registers like EAX are very fast and are used for temporary storage during calculations, while memory variables such as firstval or sum are slower but used to store data more permanently during program execution.
Hexadecimal values are commonly used in assembly because they map neatly to binary. Each hex digit represents 4 bits, so an 8-digit hex number corresponds to 32 bits, which is exactly the size of a DWORD.
In x86 assembly, you cannot directly perform operations between two memory locations. This is why one value must first be moved into a register before performing arithmetic operations.
Once the computation is complete, the result is typically moved back into memory so it can be used later or preserved after the register is reused.
🔑 Optional Visual Memory Diagram
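Conceptually, the data flows like this:

```text
firstval  (20002000h) ──► EAX
secondval (11111111h) ──► EAX  (add)
thirdval  (22222222h) ──► EAX  (add)
EAX ──► sum
```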
After execution: sum = 20002000h + 11111111h + 22222222h = 53335333h
✅ Summary:
Define variables in .data using DWORD for 32-bit integers.
Use a register (EAX) to perform arithmetic.
Use mov and add instructions to manipulate and sum values.
Store the result back to memory.
This is the memory + register way to add numbers in assembly.