x86 PROCESSOR BASICS (HOW THE CPU ACTUALLY RUNS THE SHOW)
Imagine the CPU as the brain of your computer.
But not a chill brain — a cracked-out microsecond freak that runs everything on caffeine and electricity. Here’s how it works:
🏛️ The CPU – Central Processing Unit
This is where all the thinking, math, and decision-making happens.
It has:
Registers – tiny super-fast storage slots (think 32-bit pockets for numbers)
Clock – keeps time like a heartbeat so stuff happens in sync
Control Unit (CU) – the boss that decides what happens next
ALU (Arithmetic Logic Unit) – the muscle that does all the math and logic ops (ADD, SUB, AND, OR, NOT, etc.)
🔌 How the CPU Connects to the World
The CPU talks to the rest of the PC through pins on its socket. These pins connect it to buses — long electric highways carrying signals.
🔌 The 3 main buses:
Data Bus
Moves the actual data and instructions between the CPU, memory and I/O devices.
The data bus is bidirectional, meaning information can flow in both directions.
The "width" of the data bus (how many parallel wires it has) determines how much data can be transferred at once.
A 64-bit data bus can move 64 bits of data simultaneously.
Analogy: The data bus is like a fleet of delivery trucks that transport goods (data) and mail (instructions) between the city hall (CPU), the library (memory), and various businesses (I/O devices). These trucks can deliver or pick up cargo.
🟨 Address Bus
Says where in memory we’re looking.
The address bus is unidirectional, meaning information flows only from the CPU to other components.
It carries the memory addresses or I/O port addresses where data is to be read from or written to.
When the CPU wants to access a specific piece of data or instruction, it places its memory address on the address bus, telling the memory unit exactly where to find or store that information.
The width of the address bus determines the maximum amount of memory the CPU can access.
A 32-bit address bus can address 2³² unique memory locations (4 gigabytes).
Imagine your computer's RAM as a massive library, and each book in that library has a unique shelf and position. When the CPU wants to read a specific piece of information (a "book"), it doesn't just shout out the book's title.
Instead, it sends out the exact "shelf number" and "position" through the address bus. This "shelf number and position" is what we call a memory address.
This one-way communication ensures that the CPU can accurately request data from, or send data to, a specific spot in memory.
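The width limit is simple math: n address lines give 2ⁿ distinct bit patterns, so 2ⁿ unique addresses. A quick sketch:

```python
# How address-bus width caps addressable memory: n wires -> 2**n addresses.
def max_memory_bytes(bus_width_bits):
    return 2 ** bus_width_bits   # one unique address per bit pattern

print(max_memory_bytes(16))  # 65536       (64 KB)
print(max_memory_bytes(20))  # 1048576     (1 MB)
print(max_memory_bytes(32))  # 4294967296  (4 GB)
```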
🟥 Control Bus
Uses binary signals (on/off) to tell devices when to send or receive. It synchronizes the actions and manages the flow of information among all devices attached to the system bus.
Think: "Hey RAM — CPU wants to read now!"
It carries control signals that dictate operations like "memory read," "memory write," "I/O read," "I/O write," "interrupt request," and "bus grant."
These signals ensure that devices don't try to use the buses simultaneously or perform conflicting operations.
The control bus is like the city's traffic light system.
Other Buses
⚫ I/O Bus
Handles data moving between CPU and input/output devices (keyboard, mouse, etc.)
Also called the Peripheral bus. It's considered part of the system bus, but, yeah, it's a bit different coz it's dedicated to transferring data between the CPU and the system's I/O devices.
Modern systems often use high-speed serial buses like PCI Express (PCIe) for this purpose.
This bus is all about getting data to and from your input/output devices. Imagine:
Keyboard Input: When you type "hello," that information needs to travel from your keyboard into the computer. The I/O bus is the route that data takes, like supplies being delivered to a restaurant. ⌨️➡️💻
Printer Output: When you hit "print," the document data needs to go from your computer out to the printer. The I/O bus handles this, much like official documents being sent out to residents. 💻➡️🖨️
Memory: Where Programs & Data Live
All your running programs and variables are stored in RAM. But here’s the kicker:
The CPU can’t run them straight from RAM, coz RAM is just a temporary storage locker for the CPU.
It always does this:
Grabs the instruction from memory.
Brings it into the CPU’s registers — those small, temporary storage locations we covered in the previous chapter.
Executes it.
Maybe sends a result back to memory.
So, your code doesn’t run in RAM; it runs inside the CPU — one piece at a time, or in chunks.
We’re going to see more about this Fetch, Decode, Execute cycle ahead.
📦 Buses Summary (Quick Table):

Bus           Direction               Carries
Data Bus      Bidirectional           Actual data and instructions
Address Bus   CPU → memory/devices    Memory and I/O port addresses
Control Bus   Mostly CPU → devices    Timing and command signals (read, write, interrupts)
TLDR – Reverse Engineering Focus:
Know the ALU is where bitwise ops live (AND, OR, SHL, etc.)
Know that registers are the CPU’s playground — what you see in disasm (like eax, edx, rsi, etc.)
Remember: instructions run inside the CPU, not memory. Memory just holds them until they’re needed.
Buses = wires that move the ops around. If you’re watching malware move code into memory and jump to it — that’s this system in action.
A register's purpose often becomes clear from the instructions around it. Is it being used as a counter in a loop? An argument for a function? The return value? The context will clue you in.
Learn by Doing: The more assembly code you read and write (even small snippets!), the more you'll see how registers are actually used in real programs. This hands-on experience beats rote memorization any day.
CLOCK & CLOCK CYCLE (X86 CPU TIMING EXPLAINED)
The Unseen Rhythm: What’s the Clock?
The CPU clock is like the relentless, precisely timed heartbeat of your processor — ticking at a fixed speed (e.g., 1 GHz = 1 billion ticks per second).
It is an internal electronic signal that oscillates at a fixed, extremely high frequency.
This isn't just a simple timer; it's the master synchronizer that orchestrates every single operation within the processor and its interactions with the rest of the computer system.
This clock ticks at a specific, fixed speed, often measured in Gigahertz (GHz). For example, a 3 GHz CPU means the clock "ticks" 3 billion times every second. This incredible speed allows for billions of individual operations to occur in a mere blink of an eye.
The clock keeps the CPU, RAM, and buses perfectly in sync. It ensures data moves smoothly and at the right time — no timing chaos, no crashes. Without it, everything would fall apart.
What’s a Clock Cycle?
One clock cycle = one complete tick = the smallest unit of time the CPU understands.
It represents the smallest indivisible unit of time the CPU utilizes to perform any action. Nothing, absolutely nothing, can happen for a duration shorter than one clock cycle.
The duration of a single clock cycle is simply the inverse of the clock speed.
For a CPU running at 1 GHz (1,000,000,000 cycles per second), one clock cycle lasts 1 ÷ 1,000,000,000 seconds = 1 nanosecond.
This is an incredibly tiny slice of time, emphasizing the sheer speed at which modern processors operate.
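The inverse relationship is a one-liner:

```python
# Clock period = 1 / clock frequency, converted to nanoseconds.
def clock_period_ns(ghz):
    hz = ghz * 1_000_000_000       # GHz -> cycles per second
    return (1 / hz) * 1_000_000_000  # seconds -> nanoseconds

print(clock_period_ns(1))  # 1.0 ns per cycle at 1 GHz
print(clock_period_ns(3))  # ≈ 0.333 ns per cycle at 3 GHz
```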
Clock Cycle in Action
Every CPU instruction takes at least 1 clock cycle to run.
Thanks to pipelining, modern CPUs can crunch simple operations super fast — even finishing one per cycle.
But older CPUs? Different story.
On something like the Intel 8088, a single MUL instruction could eat up tens or even hundreds of cycles. 🐢
Meet the 8088 – The OG PC Chip
Released in 1979, the Intel 8088 powered the first IBM PCs in 1981.
That moment? Kicked off the whole personal computer era.
It was a cost-cut version of the 8086 — same 16-bit CPU inside, but with an 8-bit external data bus instead of 16.
Why? So, IBM could use cheaper 8-bit parts and simpler motherboard designs. 💸
Downside? To move 16-bit data, the 8088 had to do two 8-bit transfers. Slower memory and I/O — but it was worth it for the cost savings at the time.
✅ Segmented Memory (Remember this?)
The 8088, like the 8086, used segment:offset addressing to reach more memory than the 64KB a single 16-bit register can address.
It combined:
A 16-bit segment register (points to a 64KB block)
A 16-bit offset
Together = a 20-bit address → Boom, access to 1MB of RAM.
(16-bit segment shifted left by 4 bits + 16-bit offset = 20-bit address) – we discussed this before.
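The calculation in Python form (the segment and offset values are arbitrary examples):

```python
# segment:offset -> 20-bit physical address on the 8086/8088:
# shift the 16-bit segment left by 4 bits (multiply by 16), then add the offset.
def physical_address(segment, offset):
    return (segment << 4) + offset

print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0x000F)))  # 0xfffff — the top of the 1 MB space
```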
✅ Compatibility Bonus
The 8088 ran the same instructions as the 8086 — full instruction set compatibility.
So devs didn’t have to rewrite anything. If it ran on 8086, it ran on 8088.
That made adoption easy and fast — crucial for software devs.
Modern x86 CPUs are incredibly sophisticated. They employ techniques like pipelining and out-of-order execution.
✅ Pipelining
Imagine an assembly line. Instead of one worker building an entire car from start to finish, different workers perform different stages simultaneously on different cars.
In a CPU, this means that while one instruction is in its "execute" phase, another might be in "decode", and a third in "fetch."
✅ Out-of-Order Execution:
The CPU skips stalled instructions and runs independent ones first, then reorders the results.
It’s like working on what’s ready instead of waiting — keeps the clock cycles busy.
If the CPU has to wait for slow memory? That gap = wait states (empty cycles where CPU chills while memory catches up).
If someone’s confused about how a CPU doesn’t freeze when one instruction stalls, here’s the reason:
“The CPU looks at its queue like this. If one instruction’s waiting on RAM, it just hops to the next one that’s ready. Keeps that pipeline moving.”
Wait States – When the CPU’s Just… Waiting
🚧 The Problem:
The CPU’s insanely fast — like sprinting ahead at gigahertz speeds.
But RAM? RAM’s out here jogging. 🐢
So when the CPU asks RAM for some data, it’s gotta wait... and wait...
...because RAM’s still looking for it in its dusty file cabinet.
🚧 The Result:
While waiting, the CPU basically just sits idle, burning through clock cycles doing nothing.
These wasted cycles?
They’re called wait states — and yeah, they suck. It’s like revving a Ferrari just to sit in traffic.
🚧 The Fix: Enter the Caches
Modern CPUs fight back using caches — tiny, super-fast memory layers:
L1 – Small but lightning fast, closest to the core
L2 – Bigger, a bit slower
L3 – Even bigger, shared across cores
These caches stash frequently used data so the CPU doesn’t always have to bug slow RAM.
If the data’s in cache? Boom — no wait states.
If not? Well... back to traffic.
🔁 TLDR:
Wait states = CPU stalls for the RAM to catch up. Wasted Clock Cycles. Like a 140wpm giga-typist server admin waiting for some document from an intern who types at 20wpm, in order to reboot the server.
Happens when RAM’s too slow to deliver data on time.
Cache acts as the CPU's ultra-fast, on-site mini-warehouse. It’s a small, incredibly quick memory buffer that stores frequently accessed data and instructions.
Repeat: Cache is fast, tiny, and loaded with the stuff the CPU uses most so it doesn’t have to keep calling slowpoke RAM.
THE INSTRUCTION EXECUTION CYCLE: THE CPU’S ETERNAL GRIND
Modern CPUs may have deep pipelines, out-of-order logic, and all kinds of secret sauce —
but at the core?
They still run the same ancient loop over and over:
Fetch → Decode → Execute → Store
Billions of times per second. Non-stop.
⚙️ Step 1: Fetch – Go Get the Next Command
What Happens:
The Control Unit checks the Instruction Pointer (IP / EIP / RIP) — this register holds the address of the next instruction to run.
That address is slapped onto the address bus.
CPU sends a READ signal on the control bus.
RAM (like a good librarian) grabs the binary instruction from that address — say 0100101010101010 — and sends it back through the data bus.
The instruction lands in the Instruction Register (IR) — ready to be decoded next.
CPU bumps the Instruction Pointer forward, pointing it to the next instruction for the next cycle.
TLDR:
Instruction Pointer → Address Bus → RAM → Data Bus → Instruction Register.
Analogy Time:
Imagine a factory worker following a checklist.
They look at the next task on their list (Instruction Pointer).
Head over to the supply room (RAM) to grab the specific blueprint for the task (instruction).
Bring it back to their station (Instruction Register) and get ready to work.
Then? Flip the page to the next item on the checklist — ready for the next fetch.
Step 2: Decode – “Alright, What Are We Even Doing?”
Once the instruction is loaded into the Instruction Register, the CPU begins the decode stage, where it interprets the raw binary instruction. During this phase, the Control Unit analyzes the instruction to determine exactly what needs to be done.
First, the CPU identifies the opcode (operation code), which specifies the type of operation to perform. This could be an instruction such as ADD, MOV, or JMP, and each opcode corresponds to a unique binary pattern that tells the CPU what action is required.
Next, the CPU determines the operands, which indicate the data involved in the operation. These operands may refer to values stored in registers, specific memory addresses, or immediate values that are directly embedded within the instruction itself. At this point, the CPU figures out where the data will come from and where the result should be placed.
After identifying the opcode and operands, the CPU translates the instruction into a sequence of micro-operations. These are small, precise internal steps that the hardware can execute, such as sending data from a register to the ALU, instructing the ALU to perform a calculation, and storing the result back into a register.
An easy way to understand this stage is through a factory analogy. After fetching a blueprint, the worker studies it carefully to understand what needs to be built. The opcode represents the final product (like “Widget X”), the operands are the required parts (such as Part A and Part B), and the micro-operations are the specific tools and steps used to assemble everything. At this stage, no actual building happens yet—the system is simply preparing and understanding the task before execution begins.
🔨 Step 3: Execute – “Do the Work!”
During the execute stage, the CPU carries out the action that was identified in the decode phase. At this point, there is no more interpretation or preparation—the system simply performs the required operation. The Control Unit sends signals to activate the appropriate components of the CPU based on the instruction.
One possible operation involves math or logic processing. In this case, the Arithmetic Logic Unit (ALU) takes control. The Control Unit supplies the operands to the ALU, which then performs operations such as addition, subtraction, or logical comparisons like AND and OR. Once the calculation is complete, the result is produced and prepared for storage. This process is powered by countless transistors switching on and off, allowing binary values to flow through logic gates at extremely high speeds.
Another possibility is data movement. For instructions such as MOV or LOAD, the CPU simply transfers data between registers or between memory and registers. No calculations are performed here; instead, the CPU focuses on efficiently routing data to the correct destination.
A third scenario involves control flow operations. Instructions like JMP or CALL cause the CPU to change the sequence of execution. The Instruction Pointer is updated to a new memory address, allowing the program to jump to a different section, such as a loop or function. This enables programs to make decisions and repeat tasks.
Physically, all of these operations are executed through the rapid switching of transistors inside the CPU. These tiny electronic switches work in perfect synchronization with the system clock, coordinating the flow of electrical signals. What appears as complex computation is actually the precise movement of electrons through circuits.
A helpful analogy is a factory worker who has already understood the blueprint and gathered the necessary parts. Now, they begin assembling the product by using the required tools, combining components, and completing the task. This stage represents the hands-on work, where the CPU actively processes data and produces results.
📥 Step 4: Store (Write-Back) – “Save the Result!”
What Happens:
The CPU just finished running the instruction — now it needs to put that result somewhere useful.
If it’s needed immediately?
→ Stored in a register for the next instruction to grab.
If it’s meant for long-term use?
→ Sent over the data bus to a specific spot in RAM (picked out using the address bus).
→ CPU also fires a WRITE signal through the control bus to tell memory, “Hey, store this here.”
TLDR:
Output goes to a register or memory, depending on where it’s needed next.
Analogy:
The factory worker just finished building Widget X.
If another worker needs it right away?
→ It goes into the "active bin" at their workstation (register).
If it’s going to storage or shipping?
→ They box it up and send it off to the warehouse (memory).
♻️ CPU Life:
This 4-step loop — Fetch, Decode, Execute, Store — never stops.
From the moment you power on to the second you shut down.
Billions of instructions per second. No breaks. No excuses.
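The four-step loop can be sketched as a toy simulator (the instruction format, register names, and values here are invented for illustration — real CPUs do all of this in hardware):

```python
# Toy model of the Fetch -> Decode -> Execute -> Store cycle.
memory = [
    ("MOV", "eax", 5),   # eax = 5
    ("ADD", "eax", 6),   # eax += 6
    ("MOV", "ebx", 2),
    ("ADD", "ebx", 3),
    ("HLT",),
]
registers = {"eax": 0, "ebx": 0}
ip = 0  # instruction pointer

while True:
    instr = memory[ip]              # Fetch: read the instruction at IP
    ip += 1                         # bump IP to the next instruction
    opcode = instr[0]               # Decode: identify the operation
    if opcode == "HLT":
        break
    reg, value = instr[1], instr[2] # Decode: identify the operands
    if opcode == "MOV":             # Execute: do the work
        result = value
    elif opcode == "ADD":
        result = registers[reg] + value
    registers[reg] = result         # Store: write the result back

print(registers)  # {'eax': 11, 'ebx': 5}
```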
♻️ Real Talk:
If the CPU clock is the beat, then the instruction cycle(F-D-E-S) is the dance.
Every instruction is a dancer. Same steps every time, just with different moves.
The Grand Choreography – From Boot to Shutdown
The Fetch → Decode → Execute → Store cycle isn’t optional.
It’s the heartbeat of your computer.
Every game you’ve played, every piece of malware you’ve reverse engineered, every compiler you’ve used —
all of them are just riding this loop. Billions of times per second.
🎵 So, What’s the Vibe?
If the CPU’s clock is a drumbeat, the instruction cycle is the choreo.
Fetch: Scope out the next move.
Decode: Understand what the move means.
Execute: Process the move.
Store: Save it and prepare for the next beat.
Even the wildest zero-day malware is just flipping bits in this same rhythm.
It’s the logic that runs the digital universe.
🕵️ Why This Matters for You:
As a reverse engineer, this is your map.
When you’re analyzing disassembly, you’re watching this dance play out — step by step.
When there’s lag? You’re spotting missed steps (wait states, cache misses).
When code’s obfuscated? You're untangling its footwork.
The better you understand this cycle, the more x-ray vision you get into what software really does underneath all the GUI fluff.
This isn’t just theory — it’s the grind behind every syscall, jump, XOR, and function call you’ll ever break down.
Once assembly becomes second nature, you’ll find it really easy to work with big tools like:
Binary Ninja:
x64dbg:
Reading from Memory: Why It’s Slower Than Just Grabbing from a Register
🐢 Why So Slow?
Accessing memory (RAM) ain’t like snatching something from your pocket (registers). It’s more like texting someone, waiting for a reply, and then saving that reply.
Here’s the real sequence when your CPU wants to read from memory:
Address Bus: CPU places the memory address it wants to read from.
Read Signal (RD Pin): It triggers a read signal — basically asking “Yo RAM, give me that.”
Wait: Gotta pause one or more clock cycles while memory gets its act together.
Data Bus: RAM finally sends the requested data back to the CPU, which copies it to a register or operand.
Each step takes time. Multiply by billions of reads? That lag adds up.
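The sequence above can be modeled as a cycle counter (the latency numbers here are invented for illustration; real figures vary by hardware):

```python
# Toy bus transaction: count clock cycles for one memory read,
# including wait states while "RAM" catches up.
RAM = {0x1000: 42}
RAM_LATENCY = 3  # wait-state cycles before data is ready (assumed)

def memory_read(address):
    cycles = 0
    cycles += 1            # place the address on the address bus
    cycles += 1            # assert the RD (read) signal on the control bus
    cycles += RAM_LATENCY  # wait states: the CPU idles here
    data = RAM[address]    # data finally arrives on the data bus
    cycles += 1
    return data, cycles

print(memory_read(0x1000))  # (42, 6)
```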
⚡ Registers vs Memory — Quick Comparison

Storage     Where it lives     Access speed      Capacity
Registers   Inside the CPU     ~1 clock cycle    A handful of slots
Cache       On/near the CPU    A few cycles      KBs to MBs
RAM         Separate chips     Tens of cycles+   GBs
Cache to the Rescue:
Because RAM is slow and registers are limited, CPUs got smart and added a middleman: Cache.
🟢 Level 1 (L1) Cache: Smallest, fastest, and lives inside the CPU.
🟡 Level 2 (L2) Cache: Slightly slower, but larger. Connected to CPU with high-speed buses.
🔴 Level 3 (L3) Cache (optional in some CPUs): Even bigger, shared between cores.
If the data’s in the cache? That’s a cache hit → super fast access.
If not? Cache miss → gotta go all the way out to RAM.
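The hit/miss logic is just lookup-before-fallback, which a toy sketch can show (sizes and values invented; real caches work on fixed-size lines with eviction policies):

```python
# Minimal cache lookup sketch: a dict as the "cache" in front of slow "RAM".
RAM = {addr: addr * 2 for addr in range(0, 100, 4)}
cache = {}

def read(address):
    if address in cache:
        return cache[address], "hit"   # cache hit: fast path, no wait states
    value = RAM[address]               # cache miss: go all the way to RAM
    cache[address] = value             # stash it for next time
    return value, "miss"

print(read(8))   # (16, 'miss') — first access pays the RAM trip
print(read(8))   # (16, 'hit')  — now it's cached
```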
💾 Loading and Executing a Program
Step-by-Step: How Programs Go From File to Running
Before your code can run, the Operating System (OS) has to do a whole setup behind the scenes:
Find the Program: OS locates the file on disk by scanning the file system (directory).
Load It into RAM: The actual binary gets pulled into memory.
Allocate Memory: OS sets aside a chunk of RAM just for that program to play in.
Track the Program: It creates a Process ID (PID) and adds it to its tracking system.
Set Entry Point: OS points the CPU to the start of the program’s instructions.
Run It: CPU begins executing from the program’s entry point — instruction cycle begins.
Clean Up: When the program ends, OS clears its memory and removes the process from its tables.
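Those OS steps can be mimicked in a few lines (the addresses, names, and structures here are invented for illustration — a real loader deals with executable formats, page tables, and much more):

```python
# Toy sketch of the OS program-loading steps.
import itertools

ram = {}                 # pretend RAM: address -> byte
process_table = {}       # the OS's process tracking
next_pid = itertools.count(1)

def load_program(name, code):
    base = 0x400000                       # pretend load address
    for i, byte in enumerate(code):       # load the binary into RAM
        ram[base + i] = byte
    pid = next(next_pid)                  # track the program with a PID
    process_table[pid] = {"name": name, "entry": base}  # set entry point
    return pid

pid = load_program("game.exe", b"\x90\x90\xc3")
print(pid, hex(process_table[pid]["entry"]))  # 1 0x400000
```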
🧃 Analogy:
Think of it like opening a game:
You click the icon → OS finds it.
Game files get loaded into RAM.
OS gives it some desk space to work on.
CPU is told: “Start here.”
CPU starts reading instructions one-by-one like it’s reading off a game manual at insane speed.
Linkers & Loaders: The Final Plug That Makes Code Run
⚒️ First up: The Linker
You wrote code. You compiled it. Now what?
When your source code is compiled (.c → .o), it’s not fully standalone yet. It’s like having puzzle pieces — but they haven’t been snapped together.
👉 What the Linker does:
Takes all those object files (.o, .obj) from different modules
Resolves external references (like when one file calls a function from another file)
Pulls in libraries (like printf() from libc)
Glues it all together into one executable file (.exe, .out, etc.)
Example:
Imagine main() in one source file calling a greet() function defined in another. Each file gets compiled into its own .o file. The linker merges them into one file and makes sure main() knows exactly where greet() lives in memory. Output of the linker = a complete executable with all the pieces in place.
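At its core, symbol resolution is just bookkeeping. A toy sketch (the object-file structure here is invented for illustration; real linkers work with formats like ELF and COFF):

```python
# Toy "linker": lay out two object files and resolve an external reference.
main_obj = {
    "defines": {"main": 0x00},  # symbols this module provides
    "needs": ["greet"],         # external references to resolve
    "code_size": 0x20,
}
greet_obj = {
    "defines": {"greet": 0x00},
    "needs": [],
    "code_size": 0x10,
}

def link(objs):
    symbols, base = {}, 0
    for obj in objs:                          # lay out each module in order
        for name, offset in obj["defines"].items():
            symbols[name] = base + offset     # final address of each symbol
        base += obj["code_size"]
    for obj in objs:                          # resolve external references
        for name in obj["needs"]:
            assert name in symbols, f"undefined reference to {name}"
    return symbols

print(link([main_obj, greet_obj]))  # {'main': 0, 'greet': 32}
```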
Now Enter: The Loader
The loader is part of the Operating System, and it comes into play when you run the program.
👉 What the Loader does:
Reads the executable file from disk
Allocates memory for the code, stack, heap, etc.
Sets up the process environment (process ID, file descriptors, etc.)
Maps libraries into memory if needed (e.g., dynamic/shared libs)
Fixes up addresses if relocations are needed
Tells the CPU: “Alright, start here at the entry point”
It’s the person backstage setting up the mic, the lights, and the props right before the band walks on.
👉 Comparison:

Linker: runs at build time. Combines object files and libraries into one complete executable.
Loader: runs at launch time. Places that executable into memory, sets up the process, and starts execution at the entry point.
🕶️ Analogy: Compiler, Linker, and Loader
Think of the whole process like building and driving a car. The compiler creates individual car parts, turning source code into machine-level pieces. The linker then assembles those parts into a complete car, connecting everything so it works as one unit. Finally, the loader takes that finished car out of the garage, places it on the track, starts the engine, and hands control over so it can actually run.
❓ Why This Matters (Especially in Reverse Engineering)
Understanding how programs are linked and loaded is extremely important in reverse engineering. When you know how a program is linked, you can identify its dependencies, such as external libraries or functions it relies on. This helps you understand the structure of the program and where key functionality might reside.
Knowing how a program is loaded into memory allows you to identify entry points, memory regions, and how execution begins. This is critical when analyzing or modifying program behavior. You can locate sections like .text (code), .data (initialized data), and .reloc (relocation info), which are essential when inspecting or patching binaries.
In more advanced scenarios, such as malware analysis, attackers may manipulate the loading process itself. They can inject code before execution begins or alter how memory is initialized, making it crucial to understand these stages in detail.
🔄 Are Crack “Loaders” the Same as OS Loaders?
The short answer is no—they are not the same. However, they rely on similar underlying concepts.
🖥️ The Legitimate OS Loader
The operating system’s loader is a built-in component responsible for preparing a program to run. It loads the executable into memory, assigns it a process space, resolves dependencies, and transfers execution to the program’s entry point. In simple terms, it acts like a stage crew that sets everything up before the performance begins.
🕵️ The Crack Loader (Warez/Gaming Context)
A crack loader is a separate tool designed to manipulate how a program runs. Instead of simply launching the executable, it interferes with or modifies execution to bypass restrictions such as licensing or trial limitations.
These loaders often intercept the program at runtime and alter its behavior in memory. For example, they may modify functions responsible for license verification so that they always return a “valid” result. In some cases, they inject additional code, bypass protections, or disable security checks.
🔫 What Crack Loaders Do
Crack loaders typically modify execution in several ways. They may intercept how the original program starts, inject code into memory, or patch logic related to licensing and restrictions. Some loaders hook into internal functions and force them to return favorable values, effectively unlocking premium features.
They can also bypass anti-debugging or anti-tampering mechanisms, allowing deeper inspection or modification of the program. In many cases, the loader runs the original executable in memory, applies patches dynamically, and then starts execution—leaving the actual file on disk unchanged.
💣 Why Use a Loader Instead of Cracking the EXE Directly?
Using a loader can be more effective than modifying the executable file itself. Some programs include integrity checks, such as checksums, that detect file modifications and prevent execution if changes are found. Others are packed or encrypted, making static patching difficult without first unpacking them in memory.
Advanced protections like anti-tamper systems further complicate direct modification. Loaders avoid these issues by applying changes at runtime, which makes them harder to detect. Because the original file remains untouched, anti-cheat or anti-crack systems are less likely to flag the program.
🧪 Example Flow of a Crack Loader
When you run a crack loader, the process begins when you click the loader executable. This program acts as a middleman between you and the actual application you want to run.
First, the loader silently launches the original game or software in the background. However, before the program fully starts, the loader intervenes and begins modifying its behavior in memory. It may patch specific bytes, effectively changing how certain instructions behave. It can also skip or disable license verification checks, ensuring that any validation logic is bypassed.
In some cases, the loader goes further by faking responses, such as simulating a successful login or license validation. As a result, when the program continues execution, it behaves as if everything is legitimate. From the program’s perspective, it appears that the user has valid access, even though the loader has manipulated the outcome behind the scenes.
💀 So… Same Loader?
Although both OS loaders and crack loaders deal with loading programs into memory, they serve very different purposes. An operating system loader is designed for normal, secure execution of software, ensuring everything is properly initialized and ready to run.
A crack loader, on the other hand, is a custom tool built to interfere with that process. It leverages the same fundamental idea—controlling how a program is loaded into memory—but uses it to alter execution in ways the original developers did not intend.
At the core, both rely on the same principle: if you can control what gets placed into memory, you can influence what the CPU executes.
🎯 In Reverse Engineering and Malware Analysis
This concept is especially important in reverse engineering and malware analysis. Analysts study executable formats like the Portable Executable (PE) structure to understand how programs are organized and loaded.
They also learn techniques such as dumping unpacked binaries from memory, which allows them to analyze code after it has been decrypted or unpacked at runtime. This is crucial when dealing with protected or obfuscated software.
Additionally, analysts watch for advanced techniques like code caves, manual mapping, and reflective loading. These methods allow code to be injected or executed without following the standard loading process used by the operating system.
Understanding these behaviors helps analysts detect, analyze, and counteract software that manipulates execution, since such loaders often bypass normal system rules and operate in less visible ways.
DLL Linking — Static vs Dynamic
🧱 Static Linking (Old School, But Solid)
All the required library code gets copied directly into your final .exe during compilation.
The final executable becomes self-contained.
Bigger file size, but no dependency on external DLLs.
Example:
If you statically link math.lib, the functions like pow() are baked into the EXE. No DLL needed at runtime.
👊 Pro: No external DLL problems
Con: Bigger EXEs, can’t patch/update libraries easily
Dynamic Linking (Welcome to the Modern World)
Your EXE doesn’t contain the actual function code.
Instead, it says: “Yo OS, when I run, grab this function from some.dll please.”
Code is loaded at runtime from DLL files.
Makes your EXE lighter and more modular.
Example:
Your code calls MessageBoxW() from user32.dll; the function’s code isn’t copied into your EXE.
Instead, Windows loads user32.dll when your app runs and links the call then.
Pro: Smaller files, easier library updates
Con: If DLL is missing, wrong version, or hacked... chaos.
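You can watch dynamic linking happen using Python’s ctypes, which asks the OS loader to map a shared library at runtime and resolve a function from it. This sketch assumes a Unix-like system with a standard C math library (on Windows you’d load a DLL such as msvcrt instead):

```python
# Dynamic linking in action: load a shared library at runtime and
# resolve a function from it. Assumes a Unix-like system with libm.
import ctypes
import ctypes.util

path = ctypes.util.find_library("m") or "libm.so.6"  # fallback is an assumption
libm = ctypes.CDLL(path)            # the OS loader maps the library into memory
libm.pow.restype = ctypes.c_double  # declare the C signature: double pow(double, double)
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]

print(libm.pow(2.0, 10.0))  # 1024.0
```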
And Now… Cracks, Loaders, and DLLs
🔁 DLL Injection:
Malware or game cracks use dynamic linking to their advantage.
Here’s how:
A “loader” injects a custom DLL into a target process (like a game).
That DLL might:
Hook system functions.
Bypass checks.
Unlock features.
Or even replace existing DLLs.
🥷 DLL Hijacking:
If your EXE expects libX.dll and it finds your fake libX.dll in the same folder, guess what?
💥 It’ll load yours.
Crackers use this to:
Replace original DLLs with modded ones.
Force dynamic linking to their own payloads.
Intercept or reroute legit game functions.
That's why you sometimes see “Put cracked DLLs in game folder” — it's hijacking the load path.
🧵 TLDR Recap:

Static linking: library code baked into the EXE at build time. Bigger file, zero DLL dependencies.
Dynamic linking: the EXE references DLLs that the OS loads at runtime. Smaller file, easier updates, but hijackable.
Bonus Thought:
In reverse engineering or malware analysis:
Look for Import Address Tables (IAT) in PE files to see dynamic links
Use tools like CFF Explorer, x64dbg, or Ghidra to trace loaded DLLs
Trace calls like LoadLibraryA, GetProcAddress, VirtualAlloc, etc. — that’s where dynamic magic (or evil) happens
Advanced stuff we won’t touch over here
This rabbit hole just keeps going deeper 🐇🔍
CFF Explorer:
Ghidra:
Cutter v2.0:
FIRST ASSEMBLY PROGRAM 👶💻
We are done with theory. Let's write code.
We will look at a simple program that takes two numbers, adds them together, and saves the result in a Register (a tiny, super-fast storage slot inside the CPU).
The Basic Structure
main PROC: This marks the beginning. Think of PROC (Procedure) as the start of a function in Python or C++. It tells the computer, "Start executing here."
MOV eax, 5: This is the assignment operator. We are putting the value 5 into the register named EAX.
Note: MOV stands for "Move," but it really means "Copy." The 5 doesn't disappear from where it came from; it just gets copied into EAX.
ADD eax, 6: The math happens here. The CPU takes the value currently in EAX (which is 5), adds 6 to it, and stores the result (11) back into EAX.
INVOKE ExitProcess, 0: This is a call to the Operating System (Level 2!). It tells Windows, "I am done here, shut it down." Without this, the program might crash or hang.
main ENDP: The "End Procedure" marker. It closes the block we opened with main PROC.
Introducing Variables and Segments
Real programs need to store data, not just hard-coded numbers.
To do this, we divide our program into Segments.
Think of segments as different rooms in a house, each with a specific purpose.
Here is the upgraded program with variables:
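A sketch of what that might look like in MASM syntax (the .model and PROTO lines are standard boilerplate, assumed here rather than taken from this section):

```asm
.386
.model flat, stdcall
ExitProcess PROTO, dwExitCode:DWORD

.data
sum DWORD 0                 ; variable: 32 bits (a DWORD), starting value 0

.code
main PROC
    mov eax, 5
    add eax, 6
    mov sum, eax            ; store the result into the variable
    INVOKE ExitProcess, 0
main ENDP
END main
```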
I. The .data Segment
This is where you declare variables. It is a specific area in memory reserved just for storage.
sum DWORD 0:
Name: sum
Size: DWORD (Double Word). This means 32 bits.
Value: 0 (The initial value).
II. The .code Segment
This is where your instructions (logic) live. This area is usually "Read-Only" so you don't accidentally overwrite your own program code while it's running.
III. The .stack Segment
(Mentioned briefly) We will cover this later, but this is a scratchpad area for temporary storage during function calls.
The Wild West of Data Types
In high-level languages like C++ or Java, data types are strict. You must clearly say whether something is an integer, a floating number, or a character.
If you try to store a letter in an integer, the compiler immediately throws an error and stops you.
Assembly language works very differently. Assembly does not enforce data types at all. It does not protect you or correct your mistakes.
In Assembly, size is what matters, not meaning. When you write something like DWORD, you are only telling the computer to reserve 32 bits of memory. You are not saying what kind of data will be stored there.
There is no type checking. The CPU does not know or care whether those 32 bits represent a number, a letter, or a memory address. It will process the data exactly as you tell it to.
This gives you total control, but also total responsibility. You can treat a number like a character or an address if you want, and Assembly will allow it. If you make a mistake, the program will crash or behave incorrectly. There are no safety rails.
Big Idea to Remember
Memory is organized into segments. The .code segment holds the program logic, while the .data segment holds variables.
Registers, such as EAX, are the CPU’s working space. They temporarily hold data while the processor performs operations.
Instructions tell the CPU what to do. MOV copies data, ADD performs math, and INVOKE communicates with the operating system.
Assembly does not understand data types. It only understands how many bits something uses, not what those bits are meant to represent.
INTEGER LITERALS
An integer literal (also called an integer constant) is a number written directly in a program.
An integer literal can have:
an optional sign (+ or -)
one or more digits
an optional radix letter at the end that tells us what base the number is written in
General form:
[{+ | -}] digits [radix]
Examples
26 - This is a valid integer literal. It has no radix letter, so we assume it is decimal (base 10).
26h - This means 26 in hexadecimal (base 16).
1101 - This is treated as decimal, not binary, because there is no radix letter.
1101b - The b tells us this number is binary (base 2).
So, without a radix letter, the number is always assumed to be decimal.
Radix Table
Here is the table:
h       hexadecimal (base 16)
q or o  octal (base 8)
d       decimal (base 10)
b       binary (base 2)
r       encoded real (no base value; an IEEE floating-point bit pattern)
Important note about Encoded Real
Encoded Real does not have a specific base value.
It is a binary format used to represent floating-point numbers, not normal integers.
Examples of Integer Literals with Radixes
Each line below shows an integer literal, followed by a comment explaining its base:
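A few representative literals (the specific values are illustrative):

```asm
26          ; decimal (no radix letter, so base 10 is assumed)
26d         ; decimal, radix letter d
11010011b   ; binary
42q         ; octal
1Ah         ; hexadecimal
0A3h        ; hexadecimal (leading zero required: it starts with a letter)
```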
HEXADECIMAL BEGINNING WITH A LETTER
In assembly language, a hexadecimal number that starts with a letter must have a leading zero.
Why?
Because the assembler might think the value is a name (identifier) instead of a number.
Example that causes an error
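A minimal sketch of the mistake (the surrounding instruction is illustrative):

```asm
mov eax, A123h      ; assembler reads A123h as an identifier, not a number
```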
This causes an undefined symbol error.
Why this happens:
The value starts with the letter A
The assembler assumes A123h is the name of a variable or label
Since no such name exists, it throws an error
Correct version (with leading zero)
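The same line with the leading zero added:

```asm
mov eax, 0A123h     ; the leading 0 marks this as a hexadecimal number
```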
Now it works correctly.
The leading zero tells the assembler:
“This is a hexadecimal number, not an identifier.”
Rule to remember
Any hexadecimal literal that begins with a letter must start with 0.
Examples:
0A3h ✅
0FFh ✅
A3h ❌
CONSTANT INTEGER EXPRESSIONS
A constant integer expression is a math expression made using:
integer literals
arithmetic operators
These expressions are calculated at assembly time, not while the program is running.
From now on, we’ll just call them integer expressions.
Important rule: the final result must be an integer that fits in 32 bits (valid range: 0 to FFFFFFFFh).
Arithmetic Operators and Precedence
Operator precedence means the order in which operations are done.
Here is the table, from highest priority to lowest priority:
( )           parentheses
+, -          unary plus and minus
*, /, MOD     multiply, divide, modulus
+, -          add and subtract
What does unary mean?
Unary means the operator works on one value only.
Examples:
-5 → unary minus (one number)
+3 → unary plus (one number)
This is different from: 5 - 2 → subtraction (two numbers)
Unary operators explained
Unary plus (+)
Just returns the value +5 → 5
Unary minus (-)
Changes the sign -5 → negative five
Why unary has higher precedence
Unary plus and minus are done before multiplication and division.
Example: -2 * 3
What happens:
-2 is evaluated first (unary minus)
Then -2 * 3
Result is -6
Operator Precedence Examples
Each example shows an expression, then how precedence resolves it:
4 + 5 * 2 → multiply first, then add. Result: 14
12 - 1 MOD 5 → 1 MOD 5 first → 1. Then subtraction. Result: 11
-5 + 2 → unary minus first → -5. Then add. Result: -3
(4 + 2) * 6 → parentheses first, then multiply. Result: 36
Using Parentheses (Best Practice)
Even if you know the rules, use parentheses.
Why?
Makes expressions easier to read
Prevents mistakes
You don’t have to remember precedence rules
Modulus Operator (mod or %)
The modulus operator gives the remainder of a division.
Example: 12 mod 5 = 2 (because 12 ÷ 5 = 2 with remainder 2)
That’s all it does—no magic.
REAL NUMBER LITERALS
A real number literal is just a number that can have:
a decimal point
or a fraction
or a very large / very small value
These are also called floating-point numbers.
In assembly, real numbers can be written in two ways:
Decimal reals (the normal way humans write numbers)
Encoded reals (hexadecimal form, using IEEE format)
Decimal Real Numbers
A decimal real looks like a normal decimal number.
A decimal real number is a number written in base-10 (decimal) notation, the same format used in everyday arithmetic.
It represents a value on the real number line and may include a fractional part and, optionally, an exponent. Examples include 3.14, -0.5, and 6.02 × 10²³.
General form: [{+ | -}] digits . [digits] [E [{+ | -}] digits]
Let’s break that into plain English.
A decimal real number can be broken into several components. Some parts are required, while others are optional, depending on how the number is written.
Parts of a decimal real
⭐ The sign indicates whether the number is positive or negative.
Represented by + or -
If no sign is written, the number is assumed to be positive
The sign applies to the entire value of the number
Examples:
+7.25 → positive
-4.6 → negative
9.1 → implicitly positive
⭐ The integer part (also called the whole number part) is the sequence of digits to the left of the decimal point.
Represents the whole units of the number
Can be 0 if the value is less than 1
Must contain at least one digit if a decimal point is present
Examples:
123.45 → integer part is 123
0.75 → integer part is 0
-8.9 → integer part is 8
⭐ The decimal point separates the integer part from the fractional part.
Indicates that digits to the right represent fractions of a whole
In decimal real numbers, a dot (.) is used (not a comma)
Without a decimal point, the number is an integer, not a decimal real
Example: In 45.67, the dot separates 45 and 67
⭐ The fractional part consists of digits to the right of the decimal point.
Represents values less than one (tenths, hundredths, thousandths, etc.)
Each digit has a place value based on powers of 10
Can be omitted if the number is a whole number
Examples:
3.14 → fractional part is 14
10.0 → fractional part is 0
6. → fractional part omitted (still valid in many contexts)
⭐ The exponent is used in scientific notation to scale the number by a power of 10.
Written using × 10ⁿ or e notation (e.g., 1.5e3)
Allows compact representation of very large or very small numbers
The exponent indicates how many places the decimal point is shifted
Examples:
6.02 × 10²³ → very large number
3.1 × 10⁻⁴ → very small number
7.5e2 → same as 750
⭐ Why Decimal Reals Are Used
Decimal real numbers are especially useful because they:
Accurately represent fractions and continuous values
Are intuitive and easy for humans to read
Can represent very large or very small quantities when combined with exponents
Are widely used in science, engineering, finance, and computing
Exponent format
E [+ or -] integer
The exponent means:
“Multiply this number by 10 raised to some power.”
I. What “Exponent format” means
Exponent format is a shortcut way of writing big or small decimal numbers.
It looks like this:
e.g., 44.2E5
This does NOT mean a new kind of number.
It simply means:
Take the number and multiply it by 10 raised to a power
What the E actually means
The letter E stands for “× 10 to the power of”. So:
Examples:
E5 means × 10⁵
E-3 means × 10⁻³
How to Think About Exponents
Golden Rule (memorize this)
👉 The exponent never changes the digits.
👉 It only moves the decimal point.
That’s it. Nothing else.
Step-by-Step Examples (Slow and Clear)
Example 1: 2.
Means 2.0
The decimal point is present
Any number with a decimal point is a real number
Value = 2
Example 2: +3.0
+ means positive
Same value as 3.0
Value = 3
Example 3: -44.2E+05 (this looks scary, but it’s not)
Step 1: Ignore the sign for now. Start with 44.2
Step 2: Understand the exponent - E+05 means × 10⁵
So, we are doing: 44.2 × 10⁵
Step 3: Move the decimal point
Power is +5
Move the decimal 5 places to the right
44.2 → 4,420,000
Step 4: Apply the sign – The original sign was negative
✅ Final answer: -4,420,000
Example 4: 26.E5 (this confuses many beginners)
Step 1: Look carefully at the number - 26.
There are no digits after the decimal point.
👉 This is allowed.
👉 It is automatically assumed to be: 26.0
Step 2: Apply the exponent - E5 means × 10⁵
So, 26.0 × 10⁵
Step 3: Move the decimal point 5 places to the right - 26.0 → 2,600,000
✅ Final answer: 2,600,000
“But there are no digits after the dot!”
That’s okay.
26. means 26.0
Missing fractional digits are assumed to be zero
So: 26.E5 = 26.0 × 10⁵
This is 100% valid.
Another Example: 44.2E05
E05 still means 10⁵
Leading zeros in the exponent do not change the value
So: 44.2E05 = 44.2 × 10⁵ = 4,420,000
⭐ 26.E5 → valid
⭐ 44.2E05 → valid
Both are correct scientific notation.
The “Aha” Idea (Most Important Part)
🔑 The exponent does NOT change the digits.
🔑 It only moves the decimal point left or right.
Positive exponent → move right
Negative exponent → move left
Once this clicks, exponent format becomes easy.
Encoded Real Numbers (Beginner Explanation)
Why this exists?
Humans and computers do not store numbers the same way.
Humans write numbers like: 1.0
Computers cannot store decimals directly
Computers store numbers as binary patterns (0s and 1s)
An encoded real number is: A real number converted into a binary pattern so the computer can store and process it.
An encoded real is a real number that has been:
Converted into binary
Stored using a fixed standard format
Written in hexadecimal to make it easier for humans to read
This standard format is called: IEEE floating-point format.
Why Hexadecimal Is Used
Binary numbers are very long and hard to read. For example, the 32-bit pattern for 1.0 is: 00111111100000000000000000000000
So, we group the bits into chunks of 4 and write them in hexadecimal: 0011 1111 1000 0000 0000 0000 0000 0000
That gives: 3F800000
Important Idea (Very Important)
3F800000 is NOT a normal number⚠️
It does not mean “three million something”.
It is: A code that represents the real number 1.0
Humans vs Computers (Clear Comparison)
A human writes: 1.0
The computer stores: 3F800000r
They represent the same value, just in different forms.
The r at the End (Assembler Hint)
When writing encoded reals in assembly language, you may see: 3F800000r
The r tells the assembler:
“This hexadecimal value is an encoded real number, not an integer.”
Without the r, the assembler would treat it as a normal hex integer.
Example 1: Encoded Real for 1.0
Step 1: Binary representation - 0 01111111 00000000000000000000000
This binary pattern follows the IEEE 32-bit floating-point layout.
Step 2: Convert to hexadecimal - Group bits into 4s: 0011 1111 1000 0000 0000 0000 0000 0000
Final hex: 3F800000
Step 3: Mark it as a real number
3F800000r
This tells the assembler:
“Store the real number 1.0 using IEEE floating-point encoding.”
Summary
An encoded real is how a computer stores a real number
It is written in hexadecimal
It follows the IEEE floating-point format
The hex value is a bit pattern, not a normal number
The suffix r tells the assembler it is a real number
Encoded reals are not numbers — they are instructions for how the computer should interpret bits as a real value.
IEEE Floating-Point (Short Real)
A short real uses 32 bits, split like this: 1 sign bit, then 8 exponent bits, then 23 fraction (mantissa) bits.
Example 2: Decimal +1.0
Binary representation:
0 01111111 00000000000000000000000
Breakdown:
0 → positive number
01111111 → exponent for 1.0
000... → mantissa
Converting to hexadecimal
Group bits into 4s: 0011 1111 1000 0000 0000 0000 0000 0000
Convert each group to hex: 3F800000
So, the encoded real is: 3F800000
Important note (and a relief)
We won’t use real-number constants for a while.
Why?
Most x86 instructions work with integers
Floating-point math is more advanced
You’ll come back to this later (Chapter 12), when it actually makes sense and feels useful.
Big-picture summary (don’t skip this)
Decimal reals → for humans
(3.0, -44.2E5, 26.E5)
Encoded reals → for the computer
(3F800000r, IEEE format)
You are not expected to memorize the binary layouts right now
Just understand what they are, not how to build them by hand
CHARACTER LITERALS
A character literal is one single character written inside single quotes or double quotes. Examples: 'a', "d"
How characters are stored
Even though a character looks like a letter, the computer stores it as a number.
This number comes from the ASCII table.
Example: 'A'
ASCII value (decimal): 65
ASCII value (hex): 41h
So, when you write: 'A'
What actually goes into memory is: 65 or 41h
Important Reminder: Characters Are Just Numbers
The core idea (say this slowly)
👉 A computer does not understand letters or symbols.
👉 It only understands numbers (stored in binary).
So, when you see a character like A, the computer actually stores a number that stands for A.
When we say: “Characters are not magic inside the computer”
It means:
The computer does not store the shape of the letter
It does not store meaning
It stores a number code
Example: 'A' is stored as the number 65
The computer treats 65 as just a number.
Humans interpret that number as the letter A.
Why We Need a Table (ASCII)
Because characters are just numbers, everyone must agree on:
“Which number represents which symbol?”
That agreement is called ASCII.
The ASCII table is simply a lookup chart that says which number stands for which character.
What the ASCII Table Contains (With Meaning)
1. Letters
Numbers assigned to alphabet characters.
Examples:
A → 65
a → 97
Uppercase and lowercase have different numbers.
2. Digits
Characters that look like numbers, but are still characters.
Examples:
'0' → 48
'1' → 49
⚠️ Important: '5' ≠ 5
'5' is a character
5 is a numeric value
3. Symbols
Punctuation and special characters.
Examples:
+ → 43
# → 35
@ → 64
4. Control Characters
Characters that do not print anything, but control behavior.
Examples:
New line
Tab
Backspace
They tell the computer how to format text, not what to display.
“You’re Expected to Recognize Common Ones”
This does not mean memorise the whole ASCII table.
It means:
Know a few important examples
Understand the idea, not the entire list
Common ones to recognize: 'A' → 65, 'a' → 97, '0' → 48, space → 32
That’s usually enough for exams and understanding code.
💡 Characters are just numbers.
💡 ASCII is the dictionary that maps numbers to symbols.
💡 The computer only sees numbers — humans see letters.
STRING LITERALS
A string literal is more than one character written inside quotes.
It can include:
letters
numbers
symbols
spaces
Examples: 'ABC', "Hello, world!", '4096'
📦 Notice About Strings
When working with strings, every character inside the quotes matters, including spaces. For example, '4096' is treated as a string, not a number, because it is enclosed in quotes. If you write ' 4096 ', the spaces before and after the digits are also part of the string and will be stored in memory just like any other character.
📦 How Strings Are Stored in Memory
A string is stored in memory as a sequence of bytes, where each byte represents a single character. Each character is converted into its corresponding numeric value based on a character encoding standard.
For example, the string "ABCD" is stored as four separate bytes. Each character is translated into its ASCII hexadecimal value:
A → 41h
B → 42h
C → 43h
D → 44h
So in memory, "ABCD" becomes a series of bytes: 41h 42h 43h 44h.
📦 Why Characters and Strings Are Stored as Integers
Computers can only store and process numbers. At the lowest level, memory is made up of bits, and bits represent binary values (0s and 1s). Because of this, everything—including text—must be converted into numerical form before it can be stored or processed.
This is why characters are represented as integers behind the scenes.
📦 Encoding Schemes
To make text representation possible, computers rely on encoding schemes such as ASCII and Unicode. These systems assign a unique numeric value to each character and provide a standard that all systems can follow.
For example, in ASCII:
'A' is represented by the number 65
'B' is represented by 66
'a' is represented by 97
These mappings allow computers to consistently interpret and display characters.
📦 Strings in Memory
A string is stored as a sequence of these character codes placed one after another in memory. In many programming languages (like C), strings are typically followed by a special value called a null terminator, which is 0.
This null terminator signals the end of the string. It tells the program where the string stops, so it doesn’t continue reading into unrelated memory.
For example, the string "ABC" in memory would look like: 41h 42h 43h 00h
The final 00h is the null terminator indicating the end of the string.
📦 Big idea
Characters look like letters
Strings look like words
But in memory:
characters = integers
strings = sequences of integers
At the memory level, everything is numbers.
"CAT" → 67 65 84
Each number is the ASCII code of one character.
Characters and strings are stored as integers because all data in computer memory is represented as numbers, using encoding schemes like ASCII.
👉 There is no special “text” storage inside the computer.
What changes is how the numbers are interpreted.
ASCII says:
65 means A
66 means B
97 means a
So, the computer stores numbers, and software decides:
“These numbers should be treated as characters.”
Text is not special inside a computer — it is just numbers that we choose to read as letters.
Characters and strings are stored as integers because computer memory can only represent numbers, and encoding schemes such as ASCII define how numeric values correspond to characters.
RESERVED WORDS
A reserved word is a word that the assembler has already claimed.
Think of it like this:
The assembler says: “This word already has a job. You can’t reuse it for something else.”
So:
You cannot use reserved words as variable names
You must use them only where they are meant to be used
Case Sensitivity
Reserved words are not case-sensitive.
That means mov, MOV, and Mov all mean the same instruction.
Types of Reserved Words (With Meaning)
1. Instruction Mnemonics
These are the actual commands the CPU understands.
Examples:
MOV → move data
ADD → add values
MUL → multiply values
You must not use these as identifiers.
2. Register Names
Registers are small storage locations inside the CPU.
Examples:
AX, BX, CX
EAX, EBX
These names are reserved because they refer to real hardware.
3. Directives
Directives tell the assembler, not the CPU, what to do.
They control how the program is built, not how it runs.
Examples:
.data → start of data section
.code → start of code section
4. Attributes
Attributes describe size or type of data.
Examples:
BYTE → 1 byte
WORD → 2 bytes
They help the assembler know how much memory to use.
5. Operators
Operators are symbols or words used in constant expressions.
Examples:
+, -, *
AND, OR
They are reserved because they perform calculations.
6. Predefined Symbols
These are special names that already have values.
Example: @data → returns a constant integer at assembly time
You don’t define them — the assembler provides them.
7. Summary: Reserved Words
Reserved words have special meaning
They can only be used in their intended context
They are not case-sensitive
You cannot use them as identifiers
IDENTIFIERS
I. What Is an Identifier?
An identifier is a name you choose.
You use identifiers to name:
Variables
Constants
Procedures
Labels
Identifiers exist to make code readable and understandable.
II. Rules for Forming Identifiers
1. Length
Must be between 1 and 247 characters
Long names are allowed
Short, meaningful names are recommended
2. Case Sensitivity
Identifiers are not case-sensitive. So myVar, MYVAR, and MyVar all refer to the same identifier.
3. First Character Rule
The first character must be one of these:
A letter (A–Z or a–z)
_ (underscore)
@
?
$
Cannot start with a digit.
Valid: var1, _main, $first, @count
❌ Invalid: 2count (starts with a digit)
4. Remaining Characters
After the first character, you may also use:
Letters
Digits (0–9)
Valid: count1, total_sum, buffer2
5. Cannot Be a Reserved Word
You cannot use reserved words such as MOV, ADD, EAX, or BYTE as names.
These already belong to the assembler.
Good Identifier Naming (Style Matters)
Even though assembly looks cryptic, your names don’t have to be.
Invalid identifiers (why they are wrong): two words (contains a space), 2ndValue (starts with a digit).
Spaces are not allowed in identifiers, and the first character cannot be a digit.
Legal but Not Desirable
These work, but are discouraged: _temp, $count, @value
Why?
_, $, and @ are often used internally by assemblers
Using them can cause confusion or conflicts
Reserved words have predefined meanings in assembly and cannot be used as identifiers, while identifiers are programmer-defined names that follow specific rules to improve code readability.
The assembler already owns some words — you choose names for everything else.
ASSEMBLER DIRECTIVES: THE BLUEPRINTS
If instructions (like MOV and ADD) are the bricks and actions of your program, Directives are the blueprints.
Directives are special commands for the Assembler (the software building your program), not for the CPU.
👉 Directives:
Are read only when assembling
Do not run when you click the .exe file.
Do not generate machine code instructions
Think of directives as setup instructions:
“Assembler, here’s how to build my program.”
They tell the assembler how to set up memory, where to put variables, and how much space to reserve before the program ever starts.
Directives are generally not case-sensitive. .data, .DATA, and .Data are all the same thing.
Directives vs. Instructions
The Directive (DWORD): Tells the assembler,
"Hey, reserve 4 bytes of space right here and call it myVar." (During assembly time).
Talks to the assembler.
The Instruction (MOV): Tells the CPU,
"Hey, go grab the data inside myVar and move it to a register." (Happens during run time).
Talks to the CPU.
DWORD is a directive.
It tells the assembler: “Reserve 4 bytes and store the value 26”
No CPU action happens here.
MOV is an instruction.
It runs at runtime.
It copies data into a register.
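Side by side, the difference might look like this (myVar is the hypothetical variable from above):

```asm
.data
myVar DWORD 26      ; DIRECTIVE: assembler reserves 4 bytes holding 26

.code
mov eax, myVar      ; INSTRUCTION: at run time, the CPU copies myVar into EAX
```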
Important Properties of Directives
Directives are not case-sensitive
These all mean the same thing: .data, .DATA, and .Data.
Why Directives Exist
Directives are used to:
Define variables
Allocate memory
Organize program sections
Define constants
Set up the stack
Control how code is assembled
PROGRAM SEGMENTS
A program is divided into segments.
Each segment has a specific purpose.
Directives tell the assembler: “This part of the program is for X.”
Common Assembly Directives
.data — Initialized Data Segment
The .data directive marks the section where variables with known initial values are stored.
“The following lines define data that already has values.”
Memory is reserved
Values are stored immediately
This is where you put constants and variables that have a starting value.
Defining known values (like a high score starting at 0, or a username).
.bss — Uninitialized Data Segment
BSS stands for "Block Started by Symbol" (an old historical name); the .bss segment is just empty space.
It's used for variables that exist but start with no value (like a buffer for user input).
“Reserve memory, but don’t store values yet.”
It saves space in the executable file. You don't need to store 1,000 zeros; you just tell the OS, "I need 1,000 bytes of empty space here."
Space is reserved
Contents are undefined (garbage)
Used for arrays and large buffers
.text or .code — Code Segment
This is the Read-Only zone where your actual code instructions live.
The CPU fetches commands from here.
“The CPU will run what comes next.”
This is actual program logic.
.equ — Define a Constant Symbol
A symbol is a name that represents:
A constant value
A memory location
An address
Symbols make code readable and maintainable.
The .equ directive defines a constant.
Once defined, it cannot change.
It works like Find and Replace.
It does not use any memory; it just helps you read the code.
Why .equ is useful
Avoids magic numbers
Easy to change values
Makes code portable
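A sketch of a constant definition (in MASM the directive is spelled EQU, without the dot; the name and value here are made up):

```asm
MAX_TRIES EQU 10        ; constant symbol: uses no memory, cannot change

mov ecx, MAX_TRIES      ; assembler substitutes 10 here, like find-and-replace
```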
.stack — Define the Runtime Stack
What is the stack?
The stack is a special, dynamic area of memory used for temporary storage.
It manages subroutine calls (keeping track of where to return to) and local variables.
LIFO Structure: It works like a stack of plates. The last plate you put on top (Push) is the first one you take off (Pop).
Growth: Weirdly, the stack usually grows downwards in memory (from high addresses to low addresses).
Setting the Size: You must tell the assembler how big this scratchpad should be using the .STACK directive.
The runtime stack:
Stores return addresses
Stores local variables
Grows downward in memory
Uses LIFO (Last In, First Out)
What .STACK does
The .STACK directive:
Reserves memory for the stack
Sets its maximum size
For example, .STACK 100 allocates 100 bytes for the stack
Prevents stack overflow (if sized correctly)
Why Stack Size Matters
If the stack grows beyond its allocated space:
Memory gets overwritten
Program may crash
Behavior becomes unpredictable
This is called stack overflow.
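A small sketch putting the segments together (sizes and names are illustrative; MASM calls the code segment .code, while some assemblers call it .text):

```asm
.STACK 100              ; directive: reserve 100 bytes for the runtime stack

.data
msg BYTE "Hello", 0     ; directive: store text in the data segment

.code
main PROC
    mov eax, 0          ; instruction: program logic lives here
main ENDP
```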
What happens here
.STACK → sets stack size
.data → stores text
.text → contains instructions
Directives prepare the program
Instructions run the program
All x86 assemblers target the same instruction set, but directives differ between assemblers.
Example:
Microsoft assembler supports REPT
Other assemblers may not
Directives are assembler commands that control program structure, memory allocation, and symbol definition, and they do not generate executable machine instructions.
Directives build the program. Instructions run the program.
Directives (start with .) = Instructions for the Assembler (Setup).
Instructions (like MOV) = Commands for the CPU (Action).
Segments:
.data = Variables with values.
.bss = Empty variables.
.code = The actual program logic.
.stack = Temporary scratchpad.
INSTRUCTIONS
Think of an instruction as a single, clear command given to the computer’s brain (the CPU).
When you write a line of assembly code, you are basically writing a to-do list for the processor.
However, the CPU doesn't speak English; it only understands bits and bytes.
So, when you assemble your code, a program called an Assembler acts as a translator, turning your written instructions into the machine language the computer actually runs.
An instruction isn't just one big lump of text; it’s usually broken down into four specific parts.
1. The Label (The "Bookmark")
A label is completely optional, but it’s incredibly useful.
Think of it like a bookmark or a signpost in your code.
It’s just a name you give to a specific spot in the program so you can find it easily later.
If you want the computer to "jump" back to a certain spot or repeat a section of code (like a loop), you give that spot a label.
How it looks: You write a word and follow it with a colon (like loop:).
The Golden Rule: Every label name has to be unique. You can't have two spots named "Step1," or the computer won't know which one you're talking about.
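Here is the sketch the next few points refer to; the loop body and counter are my own illustrative choices:

```asm
start:                  ; label: marks the beginning of the program
    mov ecx, 5          ; illustrative loop counter
print_msg:              ; label: a spot we can jump back to
    ; ... print the message here ...
    dec ecx             ; count down by 1 (sets the zero flag when ecx hits 0)
    jnz print_msg       ; if not zero, jump back to print_msg
end:                    ; label: marks where the program finishes
```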
How this explains labels
start:
This is a label. It marks the beginning of the program.
The program does not have to have it, but it’s useful.
print_msg:
This label marks a spot we want to jump back to.
It acts like a bookmark.
jnz print_msg
This tells the computer:
“If the condition is true, go back to the place named print_msg.”
That’s the label being used.
end:
Another label marking where the program finishes.
2. The Mnemonic
The mnemonic is the actual command in an assembly instruction.
It is the only part that must exist.
A mnemonic is just a short, easy-to-remember name for something the CPU knows how to do.
Before mnemonics existed, programmers had to write long strings of numbers to control the computer. That was slow, hard to read, and easy to mess up.
Mnemonics fixed that by giving those numbers names.
Think of the mnemonic as the verb in a sentence:
MOV → move data
ADD → add numbers
SUB → subtract
JMP → jump to another place in the code
If an instruction has no mnemonic, it is not an instruction.
The CPU cannot guess what you want—it needs a command.
In mov eax, 5:
mov → mnemonic (the command)
eax, 5 → operands (what the command works on)
This tells the CPU: "Move the value 5 into the register EAX."
In add eax, ebx, add is the mnemonic. It tells the CPU to perform addition
Without the mnemonic, this would mean nothing.
The mnemonic is the name of the operation being performed.
It tells the CPU what action to take and is the only required part of an assembly instruction.
3. The Operands (The “Targets”)
If the mnemonic is the verb, then the operands are the nouns.
Operands are the things the instruction works on.
Most instructions don’t make sense without them. If you tell the CPU to ADD, its next question is:
“Add what to what?”
That’s what operands answer.
What operands can be
Operands can be different kinds of things, depending on the instruction:
Constants - A fixed number written directly in the code: 5
Registers - Small, fast storage locations inside the CPU: eax, ebx
Memory locations - A specific place in RAM where data is stored: [value]
Labels - A named location in the program, used mainly with jump instructions: loop
In mov eax, 5:
mov → mnemonic
eax and 5 → operands
Meaning: move the constant 5 into the register eax
In add eax, ebx:
Operands: eax, ebx
Meaning: add the value in ebx to the value in eax
In jmp loop:
Operand: loop (a label)
Meaning: jump to the place in the code named loop
Code label (for jumps/loops):
Data label (for variables):
Array example with offset:
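Sketches of each kind (names and values are illustrative):

```asm
; Code label (a jump target):
loop_start:
    jmp loop_start      ; the operand is the label loop_start

; Data label (names a variable in the .data segment):
value DWORD 10
    mov eax, value      ; the operand is a memory location

; Array with an offset:
array DWORD 1024, 2048, 4096
    mov eax, array+4    ; second element: each DWORD is 4 bytes wide
```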
Important detail
Some instructions have one operand
Some have two
A few have none
But when operands are present, they always tell the CPU:
where the data comes from and where the result goes
Operands are the values, registers, memory locations, or labels that an instruction acts upon.
4. The Comments (Notes for Humans)
Comments are only for humans.
The CPU and the assembler completely ignore them.
Comments do not become machine code
They do not take up memory
They exist only to help you and anyone else reading the code
Assembly can get confusing fast, so comments explain why something is done, not just what is done.
A comment starts with a semicolon:
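Two comments on the same (illustrative) instruction:

```asm
mov ecx, 10     ; put 10 in ecx                 <- says WHAT (the code already shows this)
mov ecx, 10     ; retry the download 10 times   <- says WHY (far more helpful later)
```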
The second one is more useful when you come back later or need to fix a bug.
5. Putting it all together
Here is a full instruction showing all parts working together:
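Assembled from the four parts described below:

```asm
loop_start: add eax, 1    ; increase the counter for each loop
```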
loop_start: → Label (marks a location in the code)
add → Mnemonic (the action)
eax, 1 → Operands (what the action works on)
; increase the counter for each loop → Comment (human explanation)
Comments don’t affect the program at all, but they make the code understandable and easier to maintain.
NOP (No Operation)
The NOP instruction stands for No Operation.
When executed, it does absolutely nothing.
It takes 1 byte of memory.
It’s mostly used as a placeholder or for aligning code in memory.
Why use NOP?
Alignment: Some processors work faster if instructions start at specific memory addresses (like multiples of 4).
Padding: To maintain the size of an instruction stream.
Debugging: You can insert NOPs temporarily to test timing or skip over instructions without changing program behavior.
Example 1: Simple NOP
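A minimal sketch (the surrounding instructions are illustrative):

```asm
mov eax, 1
nop               ; does nothing; occupies 1 byte
mov ebx, 2
```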
The nop instruction does nothing.
It’s just a placeholder between the two instructions.
Example 2: Alignment Example
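A conceptual listing (offsets and byte counts assume the code starts at offset 0; the encodings shown are the standard 32-bit ones):

```
00000000  66 8B C3   mov ax, bx     ; 3-byte instruction
00000003  90         nop            ; 1 byte of padding
00000004  8B D1      mov edx, ecx   ; now starts at a multiple of 4
```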
The nop ensures that the next instruction starts at a multiple-of-4 address.
This can improve performance because the processor accesses memory more efficiently.
Key Points
NOP does nothing when executed.
It’s used for padding, alignment, or debugging.
Takes 1 byte of memory.
Does not affect registers or memory.
x86 PROCESSORS AND SPEED
x86 processors work faster when code and data start at even doubleword addresses
(that means addresses that are multiples of 4 bytes).
Why this matters
The x86 CPU moves data in 4-byte chunks
If data starts at a 4-byte boundary, the CPU can fetch it in one step
If it’s not aligned, the CPU needs two memory accesses
Two accesses = slower program
Aligned vs unaligned (idea)
Aligned address: 0, 4, 8, 12, ...
Unaligned address: 1, 2, 3, 5, 6, ...
When data is unaligned, performance drops.
How programmers fix this
To avoid slowdown, programmers:
align code and data to 4-byte boundaries
use padding
use NOP instructions to push code to the correct address
x86 processors load code and data faster from even doubleword (4-byte aligned) addresses because aligned data can be fetched in a single memory access.
ANATOMY OF A 32-BIT ASSEMBLY PROGRAM
Here is the full, working source code for addTwo.asm
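Reconstructed from the pieces walked through in the rest of this section (directives, PROTO, and the sum variable all appear below), a sketch of what addTwo.asm looks like:

```asm
; addTwo.asm - adds two 32-bit integers
.386                                ; enable 32-bit instructions/registers
.model flat, stdcall                ; flat memory, Windows calling convention
.stack 4096                         ; reserve 4 KB for the stack
ExitProcess PROTO, dwExitCode:DWORD ; declare the Windows API function

.data
sum DWORD 0                         ; 4-byte variable, initialized to 0

.code
main PROC
    mov eax, 5                      ; eax = 5
    add eax, 6                      ; eax = 11
    mov sum, eax                    ; store the result
    INVOKE ExitProcess, 0           ; return exit code 0 (success)
main ENDP
END main
```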
SETUP DIRECTIVES (THE RULES)
Before writing any instructions, we must tell the assembler what kind of program we are writing.
These directives do not generate machine code. They only set rules.
I. Processor Directive — .386
This tells the assembler:
“Generate code for an Intel 80386 processor or newer.”
Why this matters:
The 80386 was the first 32-bit x86 processor
This directive enables 32-bit registers such as EAX, EBX, ECX
Without it, the assembler assumes 16-bit mode
Bottom line:
.386 is required for 32-bit assembly programs.
II. Memory Model — .model flat, stdcall
This directive defines how memory is addressed and how functions are called.
Flat memory model
Memory is treated as one continuous address space
You can access any memory location directly
This is the standard model used by 32-bit Windows
Stdcall calling convention
Used by Windows API functions
Function arguments are passed on the stack
The called function cleans up the stack
Bottom line:
32-bit Windows programs require flat memory and stdcall function calls.
III. Stack Directive — .stack 4096
This reserves 4096 bytes (4 KB) for the program stack.
Why 4096 bytes:
4 KB is the size of a standard memory page
Enough space for local variables and function calls in small programs
Bottom line: The stack is required for function calls and parameter passing.
TALKING TO THE OPERATING SYSTEM
A Windows program must tell the OS when it finishes and whether it succeeded.
I. Function Prototype — PROTO
This tells the assembler:
There is a function named ExitProcess
It takes one parameter
The parameter is a DWORD
Why this is required:
The INVOKE instruction needs to know the function’s parameters
Prevents calling functions with the wrong number or type of arguments
Bottom line:
You must declare a function prototype before using INVOKE.
II. Exit Code — dwExitCode
When a program ends, it returns an exit code to the operating system.
Common values:
0 → Program completed successfully
Non-zero → Program failed or encountered an error
III. Why the Exit Code Matters
Operating systems and scripts check the program’s exit code.
Example:
A batch file runs several programs
It checks %ERRORLEVEL%
If the exit code is 0, it continues
If the exit code is non-zero, it stops or reports an error
Bottom line: Always return 0 if your program finishes correctly.
BUILDING THE PROGRAM (COMPILE & LINK)
Assembly language programs are built in two steps.
Step 1: Assembly
What happens:
MASM converts .asm source code into machine code
Output is an object file (.obj)
Options:
/c → assemble only (do not link)
/coff → use Common Object File Format
Step 2: Linking
What happens:
The linker combines your object file with system libraries
Produces the final executable file (.exe)
Links against Windows libraries such as kernel32.lib
/subsystem:console: Tells Windows this is a console application
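The two build steps might look like this on the command line (a sketch; the exact library list depends on which API functions your program calls):

```
ml /c /coff addTwo.asm
link /subsystem:console addTwo.obj kernel32.lib
```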
BIG IDEAS TO REMEMBER
Use .386 to enable 32-bit registers
Use .model flat, stdcall for 32-bit Windows programs
Declare external functions with PROTO
Always return 0 to indicate success
Build process: .asm → .obj → .exe
THE STACK
You already know this idea from C, so we’ll use C just to confirm what the stack does.
What happens on the stack when main() calls factorial(5)
Each function call creates a stack frame.
When factorial(5) is called:
The return address is pushed onto the stack
(where to go back after the function finishes)
The parameter (n) is pushed onto the stack
Control jumps to the factorial function
The function runs
When it returns:
parameters are removed
return address is popped
execution continues where it left off
Key idea (this is all you need)
The stack remembers where to return and stores function data.
Stack Frames and Local Variables
Local variables live on the stack.
result and i are local variables
They exist only while the function runs
When the function returns, the stack frame is destroyed
The runtime stack stores return addresses, parameters, and local variables for function calls.
ASSEMBLY PROGRAM STRUCTURE
Now let’s connect this to assembly, very simply.
.CODE Directive
Marks the start of executable instructions
Everything after this is code
Usually followed by the program’s entry point, commonly main.
Procedures: PROC and ENDP
PROC marks the start of a procedure
ENDP marks the end
The names must match
END Directive (Very Important)
What this means:
Marks the end of the entire program
Tells the assembler where execution starts
Difference between ENDP and END
Important note: Any lines written after END are ignored by the assembler.
You can put comments there — it won’t matter.
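The directives above fit together like this (a bare skeleton; the procedure name main is the conventional entry point):

```asm
.code
main PROC
    ; instructions go here
main ENDP          ; ends the procedure named main
END main           ; ends the whole file; entry point is main
; anything written after END is ignored by the assembler
```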
RUNNING AND DEBUGGING (SHORT & PRACTICAL)
Assembly programs run inside a console window
Same window as cmd.exe
Breakpoints (Visual Studio)
Click in the gray bar next to a line
A red dot appears
Program pauses before executing that line
If you place a breakpoint on a non-executable line:
VS moves it to the next executable instruction
Debug Mode Visual Cues
Orange bar → debugger is running
Blue bar → edit mode
You cannot edit code while debugging.
Registers While Debugging
Registers window shows CPU registers
Registers that change turn red
EAX = 0000000B → hex for decimal 11
(New VS versions hide some of these by default — that’s normal.)
PROGRAM TEMPLATE IDEA
Assembly programs follow a fixed structure, so we use templates.
Why?
Avoid rewriting setup code
Reduce mistakes
Faster development
Always comment:
program purpose
author
date
changes
This helps future you, not just others.
INCLUDE Directive
Copies another file into your program
Often used for:
macros
procedures
library code
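A common example (the file name is illustrative; Irvine32.inc is the include file used by many MASM textbooks):

```asm
INCLUDE Irvine32.inc   ; the file's contents are pasted in at this point
```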
ASSEMBLE → LINK → RUN (Final Form)
Assembly programs cannot run directly in their raw .asm form, so they must go through a series of steps before the CPU can execute them.
🔗 1. Edit
In this step, you write your assembly source code in a .asm file. This file contains human-readable instructions written using mnemonics and labels.
🔗 2. Assemble
The assembler takes the .asm file and converts it into an object file (.obj). This process translates the human-readable assembly instructions into machine code, but the program is not yet complete or ready to run.
🔗 3. Link
The linker combines one or more .obj files with any required libraries to produce a final executable file (such as .exe). During this step, all references to external functions and data are resolved, and the program becomes a complete unit.
🔗 4. Run
The operating system loads the executable into memory, sets up the necessary environment, and transfers control to the program’s entry point. From there, the CPU begins executing the instructions.
🔗 Final One-Line Summary
The stack manages function calls, .CODE defines executable instructions, PROC/ENDP define procedures, and END marks the program entry point, while assembly programs must be assembled, linked, and then executed before they can run.
LISTING FILES & SYMBOL TABLES
What is a Listing File?
A listing file is a detailed report created by the assembler.
It shows:
Your original source code
Line numbers
The memory address of each instruction
The machine code bytes (in hex)
A symbol table
Think of it as: “Show me exactly what the assembler generated.”
Who actually needs listing files?
Beginners → to understand how assembly becomes machine code
Advanced programmers → to debug performance or instruction layout
For normal programs, you usually don’t need it.
The Symbol Table (The Important Part)
Early programmers had to manually decide memory locations:
This was:
hard to remember
extremely error-prone
The assembler fixes this.
Instead of using raw addresses, we use symbols.
Symbolic Addressing (This is the key idea)
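A sketch of the definition discussed below (the value shown is illustrative; note that a DB initializer must fit in a single byte):

```asm
PayRate DB 64h        ; reserve 1 byte, give it a name and a value
```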
What’s happening here?
PayRate is a symbolic name
DB tells the assembler:
allocate 1 byte of memory
initialize it to the given value (note: a DB initializer must fit in 8 bits, so a value like 64h works, while 100h, which is 256, would be rejected)
The assembler:
assigns a real memory address
stores it in the symbol table
When it sees:
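For example, an instruction that reads the variable by name:

```asm
mov al, PayRate       ; the assembler substitutes PayRate's real address
```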
It silently replaces PayRate with the correct address.
Why this is powerful
You don’t care where the data lives
You only care that the name stays consistent
Code becomes:
readable
maintainable
safe
What does the Symbol Table store?
A symbol table keeps track of:
variables and their addresses
labels (jump targets)
procedures
constants
segments
In short: Every name in your program gets an address.
Listing File Example (Simple Explanation)
A listing file shows lines like this (conceptually):
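A sketch of one such line (the exact column layout varies between assembler versions):

```
00000000  B8 00000005      mov eax, 5
```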
What this tells you:
00000000 → memory offset
B8 → opcode for mov eax, imm32
00000005 → value being moved
Instructions are stored as hex bytes
INVOKE in the listing file
In the listing file, this expands to:
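For a one-argument stdcall function, the expansion looks like this:

```asm
INVOKE ExitProcess, 0     ; what you write

; what the listing shows it became:
push 0                    ; argument goes on the stack
call ExitProcess          ; transfer control to the function
```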
So, INVOKE is just a shortcut — the assembler writes the real instructions.
Why listing files are useful
Listing files help you:
verify machine code generation
see instruction sizes
understand how macros expand
learn how the CPU really sees your program
They are learning and debugging tools, not everyday tools.
80386 Reminder (Very Short)
When you see:
It means the target is an 80386 or newer processor, which enables the 32-bit registers; every modern CPU qualifies.
That’s it. Nothing more needed.
END vs ENDP
🎯 Procedure and Program Endings
In assembly, main ENDP is used to mark the end of a procedure. It tells the assembler that the block of instructions belonging to that procedure is finished.
On the other hand, END main signals the end of the entire program and also specifies the entry point. This tells the system where execution should begin when the program is loaded into memory.
🎯 Generating a Listing File (Optional)
A listing file is something you can generate if you want deeper visibility into how your assembly code is translated. In environments like Visual Studio, you can enable it through the project settings by navigating to the Microsoft Macro Assembler options and turning on listing file generation.
However, in most cases, you don’t actually need a listing file unless you are debugging, learning, or doing low-level analysis.
🎯 Final Chapter Takeaway
Listing files help you see the direct relationship between your source code and the resulting machine code. Symbol tables allow you to use meaningful names instead of raw memory addresses, making programs easier to write and understand.
The assembler plays a much bigger role than simply translating instructions. It allocates memory for variables and data, keeps track of all symbols such as labels and procedures, and replaces those symbolic names with actual memory addresses during assembly.
Because of this, assembly language becomes far more usable and readable. The assembler essentially acts as a bridge between human-friendly code and the low-level machine instructions that the CPU executes.
INTRINSIC DATA TYPES
What does “intrinsic data types” mean?
Intrinsic data types are the built-in data sizes that the assembler understands.
They answer three simple questions:
How big is the data? (8 bits, 16 bits, 32 bits, etc.)
Is it signed or unsigned? (can it be negative?)
Is it an integer or a real (floating-point) number?
That’s it. No magic.
What the assembler actually cares about
Here’s the key idea:
The assembler mainly cares about size.
It needs to know:
how many bytes to reserve
how many bytes an instruction will read or write
The assembler does NOT strongly enforce: signed vs unsigned
That distinction is mostly for humans.
Signed vs Unsigned (Important but subtle)
DWORD → 32-bit unsigned
SDWORD → 32-bit signed
Both:
are 32 bits
take up 4 bytes
look identical in memory
The only difference is how you interpret the bits
That’s why programmers often use SDWORD:
not because the assembler demands it
but because it makes intent clear
Why intrinsic data types matter
Intrinsic data types help you:
choose the correct operand size
avoid reading or writing the wrong number of bytes
understand how values are stored in memory
If you get the size wrong, the CPU will still execute, but your result may be wrong or corrupted.
Key Takeaways
Intrinsic data types describe the size, signed/unsigned nature, and whether the value is an integer or real number.
The assembler cares about operand size, but does not enforce signed vs unsigned.
Programmers often use SDWORD to indicate signedness, but it is not required.
Intrinsic data types help explain how data is stored and used in assembly.
About overlapping types (Very important concept)
Some types overlap in functionality.
Example:
DWORD → 32-bit unsigned
SDWORD → 32-bit signed
Same size. Same memory.
Different meaning.
The assembler sees “32 bits”.
The programmer sees “signed” or “unsigned”.
So when I say “intrinsic data types”…
These are the basic building blocks of all data in a computer.
Let’s walk through them naturally.
🌊 Bit-Level Building Blocks (From Smallest to Largest)
At the lowest level, all data in a computer is built from simple binary units that scale up into larger, more useful forms.
A bit is the smallest unit of data and can only hold a value of 0 or 1. It represents a single binary state.
A nibble consists of 4 bits. It is half a byte and is commonly used to represent a single hexadecimal digit.
A byte is made up of 8 bits and is one of the most important basic units in computing. A byte can store a single character (like a letter) or a small number, which is why it is widely used.
A word is 16 bits, or two bytes. It allows the storage of larger numbers and is often used in older or lower-level systems.
A double word (DWORD) is 32 bits, or four bytes. This size is very common in 32-bit systems and programs.
A quad word (QWORD) is 64 bits, or eight bytes. It is used for very large numbers and is standard in modern 64-bit systems.
All data structures, variables, and memory layouts are ultimately built from these fundamental units.
🌊 Intrinsic Data Types in Assembly
Assembly language provides several built-in data types that map directly to these bit sizes, especially for integers.
For integer types, a BYTE is an 8-bit unsigned value with a range from 0 to 255, while an SBYTE is also 8 bits but signed, allowing values from –128 to 127.
A WORD is a 16-bit unsigned integer ranging from 0 to 65,535, while an SWORD is signed and ranges from –32,768 to 32,767.
A DWORD is a 32-bit unsigned integer with a range from 0 to 4,294,967,295, and an SDWORD is signed, ranging from –2,147,483,648 to 2,147,483,647.
🌊 Larger and Special Integer Types
Some data types are less common but still important in specific contexts.
An FWORD is 48 bits and was mainly used for far pointers in older protected-mode systems.
A QWORD is 64 bits and is used for very large integers, especially in modern systems.
A TBYTE is 80 bits and is rarely used. It is mostly associated with specialized operations in the floating-point unit.
🌊 Floating-Point (Real Number) Types
For handling decimal (real) numbers, assembly provides floating-point types with different levels of precision.
A REAL4 is a 32-bit floating-point number and is commonly used for basic decimal values.
A REAL8 is a 64-bit floating-point number, offering much higher precision and accuracy.
A REAL10 is an 80-bit floating-point number that provides very high precision but is rarely used in typical applications.
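The types above look like this as declarations (variable names are illustrative; each value shown is near the limit of its type's range):

```asm
u8  BYTE   255               ; unsigned 8-bit
s8  SBYTE  -128              ; signed 8-bit
u16 WORD   65535             ; unsigned 16-bit
s16 SWORD  -32768            ; signed 16-bit
u32 DWORD  4294967295        ; unsigned 32-bit
s32 SDWORD -2147483648       ; signed 32-bit
q64 QWORD  12345678900       ; 64-bit integer
f32 REAL4  3.14              ; 32-bit float
f64 REAL8  3.14159265358979  ; 64-bit float
```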
🌊 Key Idea
Even though this may seem like a lot of different types, they all come down to the same fundamental concept: everything in a computer is stored as binary. These data types simply define how many bits are used and how those bits should be interpreted—whether as integers, signed values, or floating-point numbers.
The assembler cares about how many bytes.
The programmer cares about what those bytes mean.
That’s why intrinsic data types exist.
DATA DEFINITIONS (ASSEMBLY VARIABLES)
A data definition in assembly is how you create a variable.
It answers two questions:
How much memory do I need?
What value should it start with?
General syntax
label → the variable name (optional, but almost always used)
directive → the data type / size
value → the initial value
Example
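The example described below:

```asm
count DWORD 12345    ; reserve 4 bytes named count, starting value 12345
```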
This means:
create a variable named count
reserve 4 bytes (32 bits)
store the value 12345 in it
Equivalent C code: int count = 12345; (same idea, different language).
More examples
📦 What’s Happening Here
In assembly, each variable declaration tells the assembler how much memory to reserve and what kind of data will be stored.
The variable message uses DB (Define Byte), which allocates one byte per character. When you store "Hello, world!", each character takes one byte, so the full string occupies 13 bytes in memory (including punctuation and spaces).
The variable age is declared as a BYTE, which reserves 1 byte of memory. This is enough to store a small number, and in this case, it holds the value 25.
The variable salary is declared as an SDWORD, which reserves 4 bytes (32 bits). This allows it to store a signed integer, meaning it can handle both positive and negative values within a much larger range than a single byte.
Each declaration controls two things: how much memory is reserved and how the stored value is interpreted.
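The three declarations just discussed might look like this (the salary value is an assumption for illustration):

```asm
message BYTE   "Hello, world!"   ; 13 bytes, one per character
age     BYTE   25                ; 1 byte
salary  SDWORD 50000             ; 4 bytes, signed
```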
📦 Why the data type matters
The assembler must know the size of the variable:
how many bytes to reserve
how many bytes instructions should read or write
If you don’t specify the type, the assembler has no idea what to do.
📦 Assembly vs C (Same concept)
Both:
reserve memory
assign an initial value
give the memory a name
Assembly just makes the size explicit.
Short forms (Just aliases)
These are short names, not new types:
BYTE → DB
WORD → DW
DWORD → DD
QWORD → DQ
TBYTE → DT
They all do the same job: reserve memory.
Legacy Data Directives (Still used in 2026?)
Yes — absolutely still used.
Directives like DB, DW, DD, DQ, and DT are:
still supported
still common
still the standard way to define data in MASM
They are called “legacy” only because they’ve been around forever, not because they’re obsolete.
The Core Data Directives (Explained Clearly)
1. DB — Declare Byte (8 bits)
Reserves 1 byte per value.
Common uses:
characters
small numbers
strings (byte-by-byte)
2. DW — Declare Word (16 bits)
Reserves 2 bytes.
Used for:
16-bit values
older or compact data
3. DD — Declare Doubleword (32 bits)
Reserves 4 bytes.
This is one of the most common directives in 32-bit programs.
4. DQ — Declare Quadword (64 bits)
Reserves 8 bytes.
Used for:
large integers
64-bit values
5. DT — Declare Ten Bytes (80 bits)
Reserves 10 bytes.
Used for:
extended precision floating-point (FPU)
rare, but valid
ABOUT STRINGS AND NULL TERMINATORS
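A sketch of the two forms compared below (the string content is illustrative):

```asm
str1 BYTE "Hello"        ; 5 bytes, no terminator
str2 BYTE "Hello", 0     ; 6 bytes; the trailing 0 marks the end
```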
Both are valid.
The second one:
adds a null terminator
is better when interacting with C-style functions
MASM does not automatically add 0 for you.
Big Idea to Remember
Data definition directives:
reserve memory
define size
optionally initialize values
The assembler:
assigns addresses
tracks them in the symbol table
replaces variable names with real memory locations
You write names.
The assembler handles addresses.
Data definitions are how assembly creates variables, by explicitly stating how many bytes to reserve and what value to store in them.
DEFINING DATA TYPES (PART 1 – BEGINNER EXPLANATION)
Big Picture: What This Section Is About
This section explains:
How variables are defined in assembly
How variables are initialized
What happens if variables are not initialized
How different byte-sized data types work
Main Rules for Data Definitions
1. At Least One Initializer Is Required
When you define a variable, the assembler expects a value.
Example:
DWORD → data type (4 bytes)
0 → initializer
Even zero counts as a valid initializer.
2. Multiple Initializers Use Commas
You can define multiple values at once by separating them with commas. Example:
This creates four bytes in memory.
3. Integer Initializers Must Match the Data Size
For integer data types, the value must fit in the size of the variable. Example:
4. Leaving a Variable Uninitialized (?)
If you want to reserve memory without giving it a value, use ?.
Example:
This means:
Memory is reserved
The value is unknown (garbage) at program start
⚠️ Important: Uninitialized variables must not be used before assigning a value.
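The four rules above, sketched with illustrative names:

```asm
total  DWORD 0                ; rule 1: zero is a valid initializer
list   BYTE  10, 20, 30, 40   ; rule 2: commas create four bytes
small  BYTE  255              ; rule 3: value must fit (256 would be an error)
buffer BYTE  ?                ; rule 4: memory reserved, value unknown
```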
5. Everything Becomes Binary
No matter how you write an initializer:
Decimal
Hex
Character literal
The assembler converts it into binary before storing it in memory.
6. Worked Example: Adding Two Numbers
Defines a variable: sum DWORD 0
sum is a 4-byte integer initialized to 0; the program loads 5 into eax, adds 6 to it so eax becomes 11, and then stores the result: mov sum, eax
The program exits and final value is 11.
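The steps above can be sketched as:

```asm
.data
sum DWORD 0          ; 4-byte variable, initialized to 0
.code
    mov eax, 5       ; eax = 5
    add eax, 6       ; eax = 11
    mov sum, eax     ; sum now holds 11
```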
7. Debugging Tip
To observe the variable, set a breakpoint after mov sum, eax, step through the instructions, and watch sum in the debugger to see the memory value change in real time.
BYTE-SIZED DATA TYPES (Very Important)
BYTE / DB (Unsigned, 8 bits)
Size: 1 byte (8 bits)
Range: 0 to 255
Used for: small numbers, characters, raw data
Examples:
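Illustrative BYTE definitions (names and values assumed):

```asm
count BYTE 255     ; largest unsigned byte value
grade BYTE 'A'     ; a character is just one byte (ASCII 65)
```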
SBYTE is a signed 8-bit data type that occupies 1 byte of memory, can store values from −128 to +127, and is commonly used for small numbers that may be negative (for example: temp SBYTE -10 or change SBYTE 5).
Signed vs Unsigned
Unsigned → only positive values (and zero)
Signed → positive and negative values
Uninitialized Variables (Important Warning)
Declaring a variable with ? reserves 1 byte of memory but does not initialize it, so the value stored is random garbage, just like an uninitialized local variable in C, which is why you must always initialize variables before using them.
Character Initialization Example
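The example described below (the variable name is assumed):

```asm
char1 BYTE 'B'     ; stored as the single byte 66
```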
'B' is a character
ASCII value of 'B' = 66
Stored as one byte
Signed Byte Example
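The example described below (the variable name is assumed):

```asm
val1 SBYTE -12     ; one byte, stored in two's-complement form
```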
Stores -12
Uses signed representation
Can hold negative values
Key Takeaways (Exam-Ready)
Variables must have an initializer (or ?)
? means uninitialized (garbage value)
BYTE / DB = unsigned 8-bit
SBYTE = signed 8-bit
Character literals are stored as ASCII values
All data becomes binary in memory
Defining a variable means reserving memory and deciding how the bits should be interpreted.
DATA DEFINITION PART 2: ARRAYS & SIZES
In high-level languages like C++ or Python, you create an array with brackets []. In Assembly, you just list values one after another.
Creating Arrays (The "Label" Trick)
When you define multiple values under one name, you are creating an array.
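The definition discussed below:

```asm
list BYTE 10, 20, 30, 40   ; four bytes; the label names only the first
```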
You are creating 4 bytes in memory:
The label list only points to the first value, which is 10.
The assembler doesn’t automatically give names to the other values (20, 30, 40).
To access them, you have to calculate their position relative to list.
For example:
list → gives you 10
list + 1 → gives you 20
list + 2 → gives you 30
list + 3 → gives you 40
So, the label is like the “starting address” of your array, and the other elements are reached by adding an offset in bytes.
The Memory Map
If list starts at memory offset 0000:
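A sketch of the layout (offsets assume the array begins at 0000):

```
Offset   Contents
0000     10        <- list points here
0001     20
0002     30
0003     40
```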
Contiguous Memory
When you write:
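A sketch of the multi-line definition being described (the values are illustrative; continuation lines simply omit the label):

```asm
list BYTE 10, 20, 30, 40      ; offsets 0000-0003
     BYTE 50, 60, 70, 80      ; offsets 0004-0007
     BYTE 90, 100, 110, 120   ; offsets 0008-000B
```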
Here’s what’s happening:
The assembler doesn’t care about line breaks.
As long as you don’t give a new label, it just keeps placing the numbers right after the previous ones in memory.
So, all 12 numbers are stored one after another in memory.
Memory layout looks like this:
The label list points only to the first number (10 at offset 0).
To access the others, you use offsets: list + 1 → 20, list + 4 → 50, etc.
To the computer, this is just one long strip of memory, like a long row of boxes.
BYTE vs INTEGER Confusion 🤯
Many beginners get confused because:
In C++/Java, int is always 4 bytes (32 bits).
In Assembly, numbers don’t have a fixed size by default. They are stored in a container (data type) you choose.
Think of it like boxes:
Number 10 fits easily in a BYTE (8-bit box).
You don’t need a DWORD (4-byte box) for such a small number.
The S prefix marks the signed variants (SBYTE, SWORD, SDWORD); the plain types (BYTE, WORD, DWORD) are unsigned.
Why use BYTE instead of DWORD?
Memory efficiency: 1,000 small numbers (like ages 0–100) → 1,000 bytes with BYTE, but 4,000 bytes with DWORD. Saving 75% of memory!
Compatibility: Some old hardware or file formats expect data to be in bytes.
⚠️ The Catch
If you try to put a number bigger than 255 into a BYTE:
The assembler will give an error, or
It might silently chop off the extra bits, giving you the wrong value.
✅ In short:
You can spread your data across multiple lines; the assembler just packs them in a row.
BYTE is just a small container, use it when the number is small.
Integers in assembly are as big as you declare (BYTE, WORD, DWORD, etc.), unlike high-level languages.
MIXING RADIXES (THE "SALAD BOWL")
Assembly doesn't care how you write the number.
You can mix Hex, Decimal, Binary, and Character literals in the same list.
They all get converted to binary in the end.
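A sketch of a mixed-radix list (the values are illustrative):

```asm
salad BYTE 50, 20h, 00100010b, 'A'   ; decimal, hex, binary, character
```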
Big Idea to Remember
Labels point to the start: list is just the address of the first item. To get the rest, you add to the address (Offset).
Contiguous Memory: Data defined sequentially sits sequentially in RAM.
Size matters, not type: You can store an "integer" in a BYTE as long as it fits (0-255). You don't always need a DWORD.
STRINGS
Strings Are Just Arrays of Bytes
In assembly, there is no “string type” like in high-level languages (C, Python, etc.).
A string is just a sequence of bytes.
Each character in the string is stored in one byte.
The byte holds the ASCII value of the character.
Example:
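A sketch of the string definitions discussed below (the labels names1-names3 come from the text; the string contents are assumed):

```asm
names1 BYTE "Dave", 0      ; 5 bytes: 4 characters + null terminator
names2 BYTE "Sara", 0
names3 BYTE "Ahmed", 0
```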
✅ Notice that:
Each character takes 1 byte.
The null terminator (0) is also a single byte marking the end of the string.
Labels Are Just Starting Addresses
names1, names2, names3 are labels.
A label is just a pointer to the first byte of the string in memory.
The computer uses the label as a starting reference, but it doesn’t know the length of the string unless you tell it.
Everything after the first byte is contiguous memory (like we discussed with list BYTE 10,20…).
Why We Use BYTE
We write BYTE because each character fits in 1 byte.
Strings are really arrays of bytes, not a special datatype.
Think of it like this:
Each character is stored in one box (byte).
The null byte (0) is the stop signal for string functions, like printf in C or WinAPI string routines.
Multi-line Strings & Special Characters
You can split strings across multiple lines or add special characters:
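A sketch (the text is illustrative; the trailing backslash continues the logical line):

```asm
msg BYTE "First line",  0Dh, 0Ah,  \
         "Second line", 0Dh, 0Ah, 0
```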
0Dh = carriage return (CR) → moves cursor to start of line
0Ah = line feed (LF) → moves cursor down a line
\ → line continuation character (lets you break one string across multiple lines)
Memory layout is still just a sequence of bytes, now including CR/LF:
Everything remains a byte.
Putting It All Together
Each string is a contiguous sequence of bytes in memory.
The label points to the first byte.
Each character = 1 byte (ASCII code).
Null terminator (0) = 1 byte marks the end.
Multi-line strings or special characters like CR/LF are just additional bytes in the same array.
So even the biggest sentence like "Learning Reverse Engineering then C#" is just a row of bytes:
✅ Key Insight
Strings in assembly are not magical objects.
They are arrays of bytes.
The label is the pointer.
The assembler only cares about memory.
Null terminators allow functions to know where the string ends.
✅ DUP Operator (Duplicate Made Easy)
The DUP operator in assembly is all about making copies—it lets you allocate multiple pieces of memory and optionally initialize them with the same value.
Think of it as a “memory copy machine” for variables, arrays, strings, or even structures.
How it works:
Count: How many times you want to repeat something.
Value: What you want to repeat (it can be a number, a string, or even an uninitialized placeholder).
The syntax looks like this:
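```
label  <data type>  <count> DUP ( <value> )
```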
<data type> could be BYTE, WORD, DWORD, etc.
<count> is how many times you want to repeat.
<value> is what you want to fill each slot with. If you leave it as ?, the memory is just reserved but contains random “garbage” values until you set it.
Examples:
Allocate 20 bytes, all zero:
This creates a block of 20 bytes, each containing 0.
Allocate 20 bytes, uninitialized:
Memory is reserved for 20 bytes, but the values are undefined.
Think of it like an empty box, you can fill it later.
Create a repeated string:
This repeats the sequence "STACK" four times in memory, effectively making "STACKSTACKSTACKSTACK".
Allocate an array of 10 integers, initialized to zero:
Here, you get 10 integers, each 4 bytes, all set to 0.
Allocate an array of structures:
This reserves space for 10 structures, each containing a 4-byte integer and a 4-byte string.
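The first four examples above, sketched with illustrative names:

```asm
buffer   BYTE  20 DUP(0)       ; 20 bytes, all zero
buffer2  BYTE  20 DUP(?)       ; 20 bytes, reserved but uninitialized
stackMsg BYTE  4 DUP("STACK")  ; 20 bytes: "STACKSTACKSTACKSTACK"
numbers  DWORD 10 DUP(0)       ; ten 32-bit zeros (40 bytes)
```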
Key idea:
Yes, DUP literally means “duplicate”. It’s your way to repeat a value or pattern efficiently in memory without writing it out multiple times.
Whether you’re filling arrays, initializing strings, or creating structures, DUP saves time, space, and effort.
Think of it like telling the assembler: “Hey, make 10 of this, or 20 of that, all lined up in memory, and set them to this value—or leave them blank for now.”
WORD and SWORD
In assembly language, WORD and SWORD are used to work with 16-bit numbers. Each 16-bit number takes 2 bytes of memory.
WORD (Unsigned 16-bit Integer)
WORD is for unsigned numbers, meaning only positive numbers from 0 to 65535.
Each WORD reserves 2 bytes in memory.
SWORD (Signed 16-bit Integer)
SWORD is for signed numbers, meaning it can store negative and positive numbers from -32768 to 32767.
Each SWORD also takes 2 bytes in memory.
Example:
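Illustrative definitions (names assumed; the values sit at the edges of each range):

```asm
count WORD  65535    ; unsigned 16-bit, maximum value
temp  SWORD -32768   ; signed 16-bit, minimum value
```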
Key Idea:
Think of WORD as a box that only holds positive numbers, and SWORD as a box that can hold negative numbers too. Both boxes are 16 bits (2 bytes) wide, so the memory size is the same, only the interpretation changes.
WORD Arrays
You can create arrays of 16-bit numbers in assembly, just like arrays in C, using either explicit listing or the DUP operator.
Memory layout: Each 16-bit element occupies 2 bytes. So if your array starts at memory offset 0000, the next element is at 0002, then 0004, and so on.
Example with explicit listing:
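```asm
myWords WORD 10, 20, 30, 40   ; offsets 0000, 0002, 0004, 0006
```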
Example with DUP (uninitialized array):
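```asm
array WORD 5 DUP(?)    ; five uninitialized 16-bit slots (10 bytes)
```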
Here, ? means the elements are uninitialized. They have random “garbage” values until your code sets them.
Visualizing Memory (Conceptual):
Each element takes 2 bytes, so to access the next element, you increment the offset by 2.
Summary:
WORD: Unsigned 16-bit number (0 to 65535)
SWORD: Signed 16-bit number (-32768 to 32767)
Arrays: Use listing or DUP to store multiple words, remembering each takes 2 bytes in memory.
DWORD and SDWORD
In assembly language, DWORD and SDWORD are used to work with 32-bit integers. Each 32-bit number takes 4 bytes of memory.
I. DWORD (Unsigned 32-bit Integer)
DWORD is for unsigned numbers, meaning only positive numbers from 0 to 4,294,967,295.
Each DWORD reserves 4 bytes in memory.
Usage Tip: You can also use DD (Define Doubleword) as a legacy directive. It works the same as DWORD:
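```asm
val1 DD 12345678h      ; identical to: val1 DWORD 12345678h
```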
II. SDWORD (Signed 32-bit Integer)
SDWORD is for signed numbers, meaning it can store negative and positive numbers from -2,147,483,648 to 2,147,483,647.
Each SDWORD also takes 4 bytes in memory.
III. Arrays of 32-bit Numbers
You can create arrays of DWORDs or SDWORDs either by listing values explicitly or using the DUP operator:
Explicit initialization:
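A sketch with illustrative values:

```asm
.data
list1 DWORD  1, 2, 3, 4          ; four 32-bit elements (16 bytes)
list2 SDWORD -3, -2, -1, 0, 1    ; signed values work the same way
```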
Uninitialized array using DUP:
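For example:

```asm
.data
array DWORD 10 DUP(?)   ; ten uninitialized doublewords (40 bytes)
```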
Memory layout concept:
Each element occupies 4 bytes, so if the first element is at offset 0000, the next is at 0004, then 0008, and so on.
Arrays let you easily store multiple 32-bit numbers in contiguous memory.
IV. Extra Tip: DWORD for Offsets
You can also use DWORD to store the 32-bit memory offset of another variable:
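A minimal sketch — when a DWORD is initialized with a variable name, MASM stores that variable's offset:

```asm
.data
myVar DWORD 12345678h
pVal  DWORD myVar       ; holds the 32-bit offset of myVar
```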
This is useful for pointers or referencing other variables in memory.
V. Summary:
DWORD: Unsigned 32-bit integer, 4 bytes, 0 → 4,294,967,295
SDWORD: Signed 32-bit integer, 4 bytes, -2,147,483,648 → 2,147,483,647
Arrays: Use listing or DUP to store multiple DWORDs
Legacy DD directive works the same as DWORD
QWORD (Quadword)
The QWORD directive in assembly language is used to allocate storage for 64-bit values, meaning each QWORD takes 8 bytes of memory.
Think of it as a really big box that can hold very large numbers.
1. Syntax and Usage
You can define QWORD values in two ways:
Standard directive:
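```asm
.data
quad1 QWORD 1234567812345678h   ; 8 bytes
```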
Short form (DQ – Define Quadword):
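```asm
.data
quad2 DQ 1234567812345678h      ; identical to QWORD
```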
Tip: The value must fit in 64 bits, otherwise the assembler will throw an error.
2. Memory Organization
Each QWORD takes 8 bytes, so if you define multiple QWORDs in an array, memory offsets increase by 8 each time:
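For example, assuming the list starts at offset 0000, the elements land at 0000, 0008, and 0010 (hex):

```asm
.data
quadList QWORD 1, 2, 3   ; offsets 0000h, 0008h, 0010h
```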
This is just like how DWORD arrays worked, but each element is double the size.
3. Arrays of QWORDs
Just like with BYTE or DWORD, you can use the DUP operator to define multiple QWORDs at once:
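For example:

```asm
.data
qArray QWORD 10 DUP(0)   ; 10 elements × 8 bytes = 80 bytes, all zeroed
```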
Each element is 8 bytes, so this reserves 80 bytes total (10 × 8).
Using ? instead of 0 leaves them uninitialized:
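```asm
.data
qArray QWORD 10 DUP(?)   ; 80 bytes reserved, contents undefined
```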
4. QWORD and Registers
In 32-bit mode, your registers like EAX are 32 bits, so storing a 64-bit QWORD might need two 32-bit registers or special memory instructions.
In 64-bit mode, the RAX register can hold a full QWORD directly.
Example:
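A minimal sketch — this only assembles in 64-bit mode (ml64), where the full-width RAX register is available:

```asm
.data
bigNum QWORD 0A1B2C3D4E5F60708h

.code
mov rax, bigNum   ; 64-bit mode: RAX holds all 8 bytes at once
```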
This is important if you start working with large numbers, addresses, or high-precision calculations.
5. Summary Notes
QWORD = 64 bits = 8 bytes
QWORD can be initialized directly or with DUP
Short form DQ is equivalent to QWORD
Arrays increment in memory by 8 bytes per element
32-bit registers can’t hold QWORDs directly; use 64-bit registers or split into two 32-bit halves
💡 Memory efficiency tip:
Use QWORD only when you need numbers bigger than 32 bits, otherwise DWORD is enough and takes half the memory.
Never forget this concept in assembly: the directive only decides how many bytes are reserved and how their bits are interpreted. Let's continue….
PACKED BCD AND TBYTE
Packed BCD (Binary Coded Decimal) is a special way to represent decimal numbers in binary, designed for efficiency and precision, especially in financial or scientific applications.
1. What is Packed BCD?
Packed BCD stores decimal digits in pairs, with two decimal digits per byte.
Example: The decimal number 1234 in packed BCD is stored as 34 12 (hex representation in memory).
The lower nibble of a byte stores one digit.
The higher nibble stores the next digit.
Sign byte: The highest byte of a packed BCD variable indicates the sign.
00h → Positive
80h → Negative
Think of it like a tightly packed stack of digits, with the sign sitting on top.
Why use it?
Compact storage of decimal digits — two per byte, instead of one digit per byte as in unpacked BCD.
Accurate decimal arithmetic — important for money calculations, scientific data, and some embedded systems.
2. The TBYTE Directive
In MASM, TBYTE is used to declare variables that can store packed BCD data.
Even though TBYTE is 80 bits (10 bytes), it isn’t just “10 bytes of storage” — it can also hold floating-point numbers or other data formats.
Memory layout for a TBYTE BCD number:
Most significant byte: the sign (00h positive, 80h negative). It is written first in the hex literal, but on little-endian x86 it sits at the highest memory address.
Remaining 9 bytes: up to 18 decimal digits, 2 digits per byte.
Example: Declaring a packed BCD variable
The correct way:
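For example (names illustrative — the hex digits of the literal spell out the decimal digits):

```asm
.data
posVal TBYTE 000000000000001234h   ; +1234 (sign byte = 00h)
negVal TBYTE 800000000000001234h   ; -1234 (sign byte = 80h)
```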
Important: MASM does not automatically convert decimal numbers to BCD. You must write them in hexadecimal BCD form.
3. Packed BCD in Memory
Let’s look at 1234 as an example:
Each byte after the sign byte stores two decimal digits.
Positive numbers start with 00h. Negative numbers start with 80h.
Visualizing storage:
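For +1234 (declared as `TBYTE 000000000000001234h`), the 10 bytes in memory, lowest address first, would look like:

```text
Offset:  00  01  02  03  04  05  06  07  08  09
Byte:    34  12  00  00  00  00  00  00  00  00
          ^ digits      (leading zeros)      ^ sign (00h = +)
```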
If the number were larger, you’d continue storing 2 digits per byte.
4. Declaring Arrays of Packed BCD
You can use DUP with TBYTE too:
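For example (name illustrative; each initializer is an 18-digit hex zero so it fits the BCD format):

```asm
.data
bcdArray TBYTE 5 DUP(000000000000000000h)   ; five 10-byte BCD values
```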
Each element takes 10 bytes.
Total memory: 5 × 10 = 50 bytes
Uninitialized array:
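```asm
.data
bcdArray TBYTE 5 DUP(?)   ; 50 bytes reserved, contents undefined
```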
5. Converting Real Numbers to Packed BCD
Sometimes, you have floating-point numbers (REAL4, REAL8, etc.) and want them as packed BCD.
This is done using FPU instructions:
FLD → Load floating-point number onto FPU stack
FBSTP → Convert the value to packed BCD and store it in bcdVal
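The two steps above look like this in code (variable names illustrative):

```asm
.data
posVal REAL8 1.5
bcdVal TBYTE ?

.code
fld   posVal   ; load 1.5 onto the FPU stack
fbstp bcdVal   ; round to integer and store as packed BCD
```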
Example: If posVal = 1.5, then bcdVal will store 02 in packed BCD.
6. Why Packed BCD Matters
Efficiency: Stores two digits per byte instead of wasting 8 bits for a single decimal digit.
Accuracy: No rounding errors when doing decimal math, unlike floating-point binary.
Applications: Financial apps, calculators, scientific measurements, embedded systems.
Analogy: Think of Packed BCD as a neatly packed number stack, where each box holds 2 digits, and the top box holds the sign. Computers can easily read, write, and calculate with these numbers without wasting memory.
7. Quick Reference
Directive: TBYTE
Size: 10 bytes (80 bits)
Sign byte: First byte, 00h positive, 80h negative
Digits: Next 9 bytes, 2 digits per byte
Initialization: Must be in hexadecimal
Arrays: Use DUP operator for multiple variables
✅ Example: Complete Packed BCD Declaration
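A sketch matching the memory-usage notes below (values illustrative):

```asm
.data
myBCD1     TBYTE 000000000000001234h   ; +1234
myBCD2     TBYTE 800000000000005678h   ; -5678
myBCDArray TBYTE 3 DUP(?)              ; three uninitialized 10-byte slots
```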
Memory usage:
myBCD1 → 10 bytes
myBCD2 → 10 bytes
myBCDArray → 30 bytes (3 × 10)
Packed BCD is essentially a super-efficient way to store decimal numbers where every byte counts.
The TBYTE directive is just your tool for declaring variables that can hold packed BCD or other 10-byte data types.
DEFINING FLOATING-POINT TYPES IN MASM
Floating-point numbers are used to represent real numbers, meaning numbers with fractional parts, like 1.23456789.
In MASM, there are three main floating-point types:
I. Single-Precision: REAL4
Size: 4 bytes (32 bits)
Precision: ~7 significant digits
Range: ±3.4×10³⁸ to ±1.2×10⁻³⁸
Example:
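```asm
.data
shortVal REAL4 3.14159   ; 4 bytes, ~7 digits of precision
```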
Memory usage: 4 bytes
Good for general-purpose calculations where moderate precision is enough.
II. Double-Precision: REAL8
Size: 8 bytes (64 bits)
Precision: ~15 significant digits
Range: ±1.7×10³⁰⁸ down to ±2.2×10⁻³⁰⁸
Example:
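```asm
.data
doubleVal REAL8 3.14159265358979   ; 8 bytes, ~15 digits of precision
```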
Memory usage: 8 bytes
Use when high precision is required, such as scientific calculations or very small/large numbers.
III. Extended-Precision: REAL10
Size: 10 bytes (80 bits)
Precision: ~19 significant digits
Range: ±1.2×10⁴⁹³² down to ±3.4×10⁻⁴⁹³²
Example:
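```asm
.data
extVal REAL10 3.141592653589793238   ; 10 bytes, ~19 digits of precision
```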
Memory usage: 10 bytes
Ideal for high-precision math, financial, or scientific computations where single- or double-precision isn’t enough.
IV. Arrays of Floating-Point Numbers
You can use the DUP operator to declare arrays of floating-point variables:
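For example:

```asm
.data
shortArray REAL4 20 DUP(0.0)   ; 20 × 4 = 80 bytes, all zeroed
```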
Memory usage: 20 × 4 = 80 bytes
Efficient way to initialize large arrays of floating-point numbers.
V. Using DD, DQ, and DT Directives
MASM also allows you to declare floating-point numbers with DD, DQ, and DT, which are legacy equivalents:
Examples:
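A sketch, assuming the assembler accepts real constants with the legacy directives (MASM does):

```asm
.data
shortVal  DD 3.14159                ; same as REAL4
doubleVal DQ 3.14159265358979       ; same as REAL8
extVal    DT 3.141592653589793238   ; same as REAL10
```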
This is equivalent to using REAL4, REAL8, REAL10. Use whichever style you prefer, but REALx directives are clearer for readability.
VI. Precision vs Range
Precision: Number of significant digits the type can represent.
Range: Maximum and minimum values it can store.
Tip: Extended-precision is overkill for most applications but is useful for scientific or financial computing.
VII. Real Numbers vs Floating-Point Numbers
Real Numbers (math concept): Infinite precision and size. Can include fractions, irrational numbers (like π), etc.
Floating-Point Numbers (computer representation): Approximation of real numbers.
Finite precision and range, limited by storage size (4, 8, or 10 bytes).
✅ Key points:
Floating-point types approximate real numbers.
Precision is limited.
They can represent extremely large or small numbers, but not perfectly.
✅ Quick Example Summary
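A compact sketch (names illustrative):

```asm
.data
sVal REAL4  1.5          ; 4 bytes
dVal REAL8  1.5          ; 8 bytes
xVal REAL10 1.5          ; 10 bytes
rArr REAL4 10 DUP(0.0)   ; 10 × 4 = 40 bytes
```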
Memory usage: 4, 8, 10 bytes for each variable
Arrays: multiply size by element count
Summary:
Floating-point types in MASM (REAL4, REAL8, REAL10) let you store real numbers of varying precision.
Choose REAL4 for normal calculations, REAL8 for high-precision scientific data, and REAL10 for extreme precision.
Arrays can be initialized using DUP, and legacy directives DD, DQ, DT are equivalent but less readable.
ADD NUMBERS PROGRAM (ADDING INTEGER VARIABLES)
This program shows how to add three 32-bit integers stored in memory and store the result in a fourth variable.
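The program opens with the directives explained next (a standard 32-bit MASM/Windows skeleton):

```asm
.386
.model flat, stdcall
.stack 4096
ExitProcess PROTO, dwExitCode:DWORD
```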
Explanation:
.386: Enables 80386 instructions, meaning we can use 32-bit registers like EAX.
.model flat, stdcall: Flat memory model (all memory in one linear space) with the stdcall calling convention.
.stack 4096: Reserves a 4 KB stack for function calls and local variables.
ExitProcess PROTO: Declares a prototype for the Windows API function ExitProcess, which terminates the program.
Declaring Data (Variables)
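The data section referenced in the walkthrough:

```asm
.data
firstval  DWORD 20002000h
secondval DWORD 11111111h
thirdval  DWORD 22222222h
sum       DWORD 0
```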
.data section: Where all global variables and constants are stored.
DWORD: Each variable is 4 bytes (32 bits).
Hexadecimal values like 20002000h are base-16 numbers, which the CPU stores as binary in memory.
Memory Layout (example):
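Assuming the first variable sits at offset 0000:

```text
Offset  Variable    Value
0000    firstval    20002000h
0004    secondval   11111111h
0008    thirdval    22222222h
000C    sum         (set at run time)
```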
Each value occupies 4 bytes in memory, stored consecutively unless alignment or padding is introduced.
Code Section (Adding the Variables)
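The code section that the step-by-step walkthrough below explains:

```asm
.code
main PROC
    mov eax, firstval      ; EAX = 20002000h
    add eax, secondval     ; EAX = 20002000h + 11111111h = 31113111h
    add eax, thirdval      ; EAX = 31113111h + 22222222h = 53335333h
    mov sum, eax           ; write the result back to memory
    INVOKE ExitProcess, 0  ; return control to Windows
main ENDP
END main
```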
🔑 Step-by-Step Explanation
The instruction mov eax, firstval copies the value stored in firstval (which is 20002000h) into the 32-bit register EAX. At this point, EAX is holding that value for further operations.
Next, add eax, secondval adds the value of secondval (11111111h) to whatever is already in EAX. After this instruction, EAX now contains the result of 20002000h + 11111111h.
Then, add eax, thirdval adds thirdval (22222222h) to the current value in EAX. Now, EAX holds the total sum of all three values.
The instruction mov sum, eax takes the final result stored in EAX and writes it back into the variable sum in memory. This preserves the result outside the register.
Finally, INVOKE ExitProcess, 0 calls a system function to terminate the program. The 0 indicates that the program ended successfully.
🔑 Key Points to Understand
Registers and memory serve different purposes. Registers like EAX are very fast and are used for temporary storage during calculations, while memory variables such as firstval or sum are slower but used to store data more permanently during program execution.
Hexadecimal values are commonly used in assembly because they map neatly to binary. Each hex digit represents 4 bits, so an 8-digit hex number corresponds to 32 bits, which is exactly the size of a DWORD.
In x86 assembly, you cannot directly perform operations between two memory locations. This is why one value must first be moved into a register before performing arithmetic operations.
Once the computation is complete, the result is typically moved back into memory so it can be used later or preserved after the register is reused.
🔑 Optional Visual Memory Diagram
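Conceptually, the data flows like this:

```text
firstval  (20002000h) ──► EAX
secondval (11111111h) ──► EAX  (add)
thirdval  (22222222h) ──► EAX  (add)
EAX ──► sum
```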
After execution: sum = 20002000h + 11111111h + 22222222h = 53335333h
✅ Summary:
Define variables in .data using DWORD for 32-bit integers.
Use a register (EAX) to perform arithmetic.
Use mov and add instructions to manipulate and sum values.
Store the result back to memory.
This is the memory + register way to add numbers in assembly.