emu8086/MASM Setup and Syntax Basics
Get your assembly toolkit running — write, assemble, and debug 8086 programs on a modern PC
What is it?
emu8086 is a free integrated development environment that emulates the 8086 microprocessor, letting you write, assemble, and debug assembly programs on a modern PC. MASM (Microsoft Macro Assembler) syntax is the Intel-style assembly language it uses, featuring simplified segment directives (.MODEL, .STACK, .DATA, .CODE) that organize programs into logical sections. Together, they provide a complete hands-on platform for learning 8086 assembly.
Real-world relevance
Many university microprocessor courses use emu8086, TASM (Turbo Assembler), or DOSBox with MASM for lab work. The same Intel-style syntax carries over to inline assembly in C/C++ for performance-critical code, boot loaders, and driver ISRs. Understanding the toolchain also helps when debugging embedded systems and reverse engineering.
Key points
- What is emu8086? — emu8086 is a free 8086 microprocessor emulator and assembler. It provides a complete simulated 8086 environment: CPU registers, memory, I/O ports, a screen, and a keyboard — all on your modern Windows PC. It assembles, runs, and debugs 8086 code in one integrated tool.
- MASM Syntax Overview — MASM (Microsoft Macro Assembler) uses Intel syntax, in which the destination operand comes first: MOV AX, BX copies BX into AX. It supports simplified segment directives (.MODEL, .DATA, .CODE), a rich macro language, and typed data definitions. emu8086 accepts largely MASM-compatible syntax.
- .MODEL Directive — .MODEL specifies the memory model. SMALL means one code segment and one data segment (max 64KB each) — sufficient for most 8086 learning programs. Other models: TINY (COM, one segment for everything), MEDIUM, COMPACT, LARGE.
- .STACK Directive — .STACK allocates space for the stack segment. The parameter specifies the size in bytes; 100h (256 bytes) is typical for small programs. The linker records the initial SS:SP in the EXE header, and DOS loads them at startup so that SP points to the top of this area.
- .DATA Directive — .DATA begins the data segment where you define variables using DB, DW, DD. The assembler groups all .DATA definitions into one segment. You must load DS with the data segment address at runtime (MOV AX, @DATA / MOV DS, AX).
- .CODE Directive and Entry Point — .CODE begins the code segment containing executable instructions. The label or PROC named in the END directive is the entry point: END MAIN records MAIN as the start address, so execution begins there when the program is loaded.
- @DATA Symbol — @DATA is an assembler symbol that represents the segment address of the .DATA segment. The 8086 cannot load a segment register with an immediate value, so MOV DS, @DATA is not allowed; you use AX as an intermediary: MOV AX, @DATA / MOV DS, AX.
- Assembling and Linking (EXE) — The assembly process has two phases: (1) Assembling: MASM converts .ASM source to .OBJ object file. (2) Linking: LINK combines .OBJ files into an .EXE executable. In emu8086, both happen with one click of the 'compile' button.
- COM vs EXE Format — COM files are simpler: one segment for everything, max 64KB, starts at offset 100h (after PSP). EXE files support multiple segments, larger programs, and have a header with relocation info. For learning, COM is easier; for real programs, EXE is more flexible.
- emu8086 User Interface — The emu8086 UI shows: source editor (write code), register panel (AX, BX, CX, DX, SI, DI, BP, SP, IP, flags), memory viewer (hex dump), stack viewer, variables list, and an emulated screen. The toolbar has: compile, run, single-step, and stop.
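To make the COM vs EXE contrast concrete, here is a minimal COM-style sketch. It assumes an assembler that accepts .MODEL TINY (emu8086's own COM template usually begins with just ORG 100h); the label START and the string msg are illustrative names, not part of the lesson. Because a COM program lives in one segment, DOS starts it with CS, DS, ES, and SS all pointing at that segment, so no @DATA setup is needed.

```asm
; Minimal COM-style program (sketch; START and msg are illustrative names)
.MODEL TINY
.CODE
ORG 100h                    ; COM code begins at offset 100h, right after the 256-byte PSP
START:
    MOV AH, 09h             ; DOS: print '$'-terminated string at DS:DX
    MOV DX, OFFSET msg      ; DS already equals CS in a COM program
    INT 21h

    MOV AX, 4C00h           ; DOS: terminate program with return code 0
    INT 21h

; Data sits after the exit call so the CPU never executes it as code
msg DB 'Hello from a COM file', 0Dh, 0Ah, '$'
END START
```

Compared with the EXE template, there is no .STACK or .DATA section here: code, data, and stack share the single 64KB segment.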
Code example
; Complete emu8086 EXE template
; This is the standard skeleton for every program
.MODEL SMALL
.STACK 100h
.DATA
    greeting DB 'Welcome to emu8086!', 0Dh, 0Ah, '$'
    name_buf DB 20 DUP('$')
    prompt   DB 'Enter your name: $'
.CODE
MAIN PROC
    ; Step 1: Initialize data segment
    MOV AX, @DATA
    MOV DS, AX
    ; Step 2: Print the greeting
    MOV AH, 09h
    LEA DX, greeting
    INT 21h
    ; Step 3: Print prompt
    MOV AH, 09h
    LEA DX, prompt
    INT 21h
    ; Step 4: Read a string (simplified — read chars)
    LEA SI, name_buf
    MOV CX, 19          ; max 19 characters
read_loop:
    MOV AH, 01h         ; DOS: read char with echo
    INT 21h
    CMP AL, 0Dh         ; Enter key?
    JE done_read
    MOV [SI], AL
    INC SI
    LOOP read_loop
done_read:
    ; Step 5: New line
    MOV AH, 02h
    MOV DL, 0Dh
    INT 21h
    MOV DL, 0Ah
    INT 21h
    ; Step 6: Print back the name
    MOV AH, 09h
    LEA DX, name_buf
    INT 21h
    ; Step 7: Exit to DOS
    MOV AH, 4Ch
    INT 21h
MAIN ENDP
END MAIN
Line-by-line walkthrough
- 1. .MODEL SMALL / .STACK 100h — tell the assembler we want a small-model EXE with a 256-byte stack
- 2. .DATA section — define three strings: a greeting, an input buffer (20 bytes filled with '$' as terminator), and a prompt
- 3. MOV AX, @DATA / MOV DS, AX — essential setup: when DOS starts an EXE, DS points to the program segment prefix (PSP), not to our data, so we manually point it to our data segment
- 4. MOV AH, 09h / LEA DX, greeting / INT 21h — DOS function 09h prints a '$'-terminated string at DS:DX
- 5. The read_loop uses DOS function 01h to read one character at a time, storing each in the buffer and advancing SI
- 6. CMP AL, 0Dh / JE done_read — when the user presses Enter (carriage return = 0Dh), we stop reading
- 7. After reading, we print CR+LF (0Dh, 0Ah) to move to a new line on the screen
- 8. MOV AH, 09h / LEA DX, name_buf — print back the name. It works because we prefilled the buffer with '$' terminators
- 9. MOV AH, 4Ch / INT 21h — standard DOS exit. Function 4Ch terminates the program and returns control to the DOS prompt
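The character-by-character loop is described as simplified. As a point of comparison, DOS also offers buffered input through INT 21h function 0Ah, which reads a whole line into a buffer whose first byte holds the maximum length and whose second byte receives the number of characters typed. The sketch below is one possible alternative under those assumptions, not part of the lesson's template; in_buf is an illustrative name.

```asm
; Sketch: read a name with DOS function 0Ah (in_buf is an illustrative name)
.MODEL SMALL
.STACK 100h
.DATA
    prompt DB 'Enter your name: $'
    in_buf DB 20                ; byte 0: max characters to accept, counting the final CR
           DB 0                 ; byte 1: DOS stores the number of characters typed here
           DB 20 DUP('$')       ; bytes 2 onward: the characters, followed by CR
.CODE
MAIN PROC
    MOV AX, @DATA               ; point DS at the data segment
    MOV DS, AX

    MOV AH, 09h                 ; print the prompt
    LEA DX, prompt
    INT 21h

    MOV AH, 0Ah                 ; DOS: buffered keyboard input into DS:DX
    LEA DX, in_buf
    INT 21h

    ; Overwrite the trailing CR with '$' so function 09h can print the text
    LEA SI, in_buf
    MOV BL, [SI+1]              ; BL = number of characters typed
    XOR BH, BH                  ; BX = count as a 16-bit index
    MOV BYTE PTR [BX+SI+2], '$'

    MOV AH, 02h                 ; move to a new line: CR then LF
    MOV DL, 0Dh
    INT 21h
    MOV DL, 0Ah
    INT 21h

    MOV AH, 09h                 ; print the text, which starts at byte 2 of the buffer
    LEA DX, [SI+2]
    INT 21h

    MOV AH, 4Ch                 ; exit to DOS
    INT 21h
MAIN ENDP
END MAIN
```

With this approach DOS handles Backspace editing and enforces the length limit, at the cost of the two-byte header in front of the text.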
Spot the bug
.MODEL SMALL
.STACK 100h
.DATA
msg DB 'Test$'
.CODE
MAIN PROC
MOV AH, 09h
LEA DX, msg
INT 21h
MOV AH, 4Ch
INT 21h
MAIN ENDP
END MAIN
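One likely answer to this exercise: the listing never loads DS. When DOS starts an EXE, DS points at the PSP rather than the data segment, so function 09h reads from the wrong segment and typically prints unrelated bytes until it happens to find a '$'. A corrected sketch, reusing the same names:

```asm
.MODEL SMALL
.STACK 100h
.DATA
    msg DB 'Test$'
.CODE
MAIN PROC
    MOV AX, @DATA       ; the missing step: load the data segment address...
    MOV DS, AX          ; ...into DS, using AX as an intermediary

    MOV AH, 09h         ; DOS: print '$'-terminated string at DS:DX
    LEA DX, msg
    INT 21h

    MOV AH, 4Ch         ; exit to DOS
    INT 21h
MAIN ENDP
END MAIN
```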