Segmented Memory and Physical Address
How the 8086 combines segment and offset to address 1 MB of memory with 16-bit registers
What is it?
The 8086 uses segmented memory to bridge the gap between its 16-bit registers and its 20-bit address bus. Every memory access combines a 16-bit segment register value and a 16-bit offset using the formula: Physical Address = Segment x 16 + Offset. Four segment registers (CS, DS, SS, ES) each define a 64 KB window into the 1 MB address space. The CPU automatically selects the appropriate segment register based on the type of access (code fetch, data, stack, or string destination). Because a segment can start on any 16-byte (paragraph) boundary, the 64 KB windows overlap, and many different segment:offset pairs map to the same physical address.
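The formula can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of how the hardware is built; `physical_address` is a name chosen here for illustration, and the final mask assumes the 20 address lines of the original 8086.

```python
def physical_address(segment: int, offset: int) -> int:
    """Model the 8086 calculation: segment * 16 + offset, kept to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shifting left by 4 is the same as multiplying by 16.
    return ((segment << 4) + offset) & 0xFFFFF


# 1234h:0010h -> 12340h + 0010h
print(f"{physical_address(0x1234, 0x0010):05X}h")  # prints 12350h
```

Both inputs stay 16-bit values; only the sum needs the full 20 bits.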
Real-world relevance
Segmented memory defined the IBM PC era. Every DOS program lived within the 640 KB conventional memory limit imposed by the IBM PC's memory map. DOS memory managers (HIMEM.SYS, EMM386) existed specifically to reach memory beyond what real-mode segment:offset addressing can see. The A20 gate, a hardware hack tied to the address wraparound at 1 MB, persisted in PC hardware for decades. Even modern x86 CPUs still come out of reset in Real Mode with segmentation active, before firmware or the operating system switches to Protected Mode with paging.
Key points
- The Problem: 16-Bit Registers, 20-Bit Addresses — The 8086 has 16-bit registers, which can only represent values 0 to 65,535 — enough to address 64 KB. But the 8086 has a 20-bit address bus, addressing 1 MB (1,048,576 bytes). The gap between 16-bit registers and 20-bit addresses is bridged by segmentation: every memory access combines two 16-bit values to produce a 20-bit physical address.
- Physical Address Formula — Physical Address = Segment Register x 16 + Offset. Multiplying by 16 is the same as shifting left by 4 bits, which extends the 16-bit segment to 20 bits (the low 4 bits become zero). Adding the 16-bit offset fills in those low bits and can carry into the higher bits. The BIU's dedicated adder computes this for every memory access.
- Segment:Offset Notation — Addresses are written as Segment:Offset — e.g., 2000h:1234h means segment 2000h, offset 1234h. This notation clearly shows both components before they are combined. Many different segment:offset pairs can produce the same physical address, which is both a feature and a source of confusion.
- The Four Segments in Action — CS:IP fetches instructions (code segment). DS:offset accesses most data (data segment). SS:SP and SS:BP access the stack (stack segment). ES:DI is used by string destination operations (extra segment). Each segment register defines a 64 KB window, and the windows can overlap.
- Segment Overlap and Aliasing — Since segments start every 16 bytes (a paragraph boundary), adjacent segments overlap by 65,520 bytes. Segment 1000h covers 10000h-1FFFFh, segment 1001h covers 10010h-2000Fh — they share most of their range. This means the same physical byte can be accessed through many different segment:offset combinations.
- Default Segment Assignments — The 8086 automatically selects a segment register based on the type of memory access. Code fetch uses CS. Stack operations (PUSH, POP, and [BP]) use SS. String destinations use ES. Almost everything else uses DS. You can override the default with a segment prefix byte in the instruction.
- Paragraph Boundaries — A paragraph is 16 bytes. Segments always start on paragraph boundaries because the segment address is shifted left by 4 bits (multiplied by 16). This means the lowest possible segment start addresses are 00000h, 00010h, 00020h, etc. You cannot start a segment at an arbitrary byte address.
- The Wraparound at 1 MB — With segment FFFFh and offset FFFFh, the calculated address would be FFFF0h + FFFFh = 10FFEFh — a 21-bit value. The original 8086 only has 20 address pins (A0-A19), so bit 20 is lost and the address wraps around to 0FFEFh. This wraparound was later used (and abused) by DOS programs, leading to the famous A20 gate issue on later processors.
- Memory Map of a Typical 8086 System — In a standard PC (8088/8086-based), the 1 MB address space is divided: 00000h-9FFFFh (640 KB) is conventional RAM. A0000h-BFFFFh is video memory. C0000h-EFFFFh is ROM for expansion cards. F0000h-FFFFFh is the system BIOS ROM. The CPU starts executing at FFFF0h after reset.
- Why Segmentation Was Chosen — Intel chose segmentation as a practical compromise. True 20-bit registers would require a new instruction format. Virtual memory hardware was too complex and expensive for 1978. Segmentation reused the existing 16-bit register width and instruction encoding while extending addressability to 1 MB. It also naturally supported relocatable code — change the segment register, and the same code runs at a different physical address.
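The aliasing, overlap, and wraparound points above can be verified numerically. The following is a minimal sketch under the same assumptions as the formula (16-bit segment and offset, 20 address lines); the helper names `aliases` and `window` are made up for this illustration.

```python
ADDRESS_MASK = 0xFFFFF  # 20 address lines, A0-A19


def physical_address(segment, offset):
    """Segment * 16 + offset, truncated to 20 bits as on the original 8086."""
    return ((segment << 4) + offset) & ADDRESS_MASK


def aliases(physical):
    """All segment:offset pairs that reach `physical` without wrapping past 1 MB."""
    pairs = []
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


def window(segment):
    """First and last physical address covered by a segment (no wraparound)."""
    start = segment << 4
    return start, start + 0xFFFF


# Aliasing: many pairs name the same byte.
pairs = aliases(0x12350)
print(len(pairs))                    # prints 4096
print((0x1235, 0x0000) in pairs)     # prints True
print((0x1200, 0x0350) in pairs)     # prints True

# Overlap of adjacent segments 1000h and 1001h.
shared = window(0x1000)[1] - window(0x1001)[0] + 1
print(shared)                        # prints 65520

# Wraparound: FFFFh:FFFFh loses bit 20 on a 20-bit bus.
print(f"{physical_address(0xFFFF, 0xFFFF):05X}h")  # prints 0FFEFh
```

The count of 4096 holds for this address because every segment from 0236h to 1235h can reach it; addresses very close to 00000h have fewer non-wrapping aliases.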
Code example
; Demonstrate segmented memory addressing
.MODEL SMALL
.STACK 100h
.DATA
msg1 DB 'Segment Demo', 0Dh, 0Ah, '$'
val1 DW 0ABCDh
val2 DW 0
.CODE
MAIN PROC
; Setup data segment
MOV AX, @DATA
MOV DS, AX ; DS points to data segment
MOV ES, AX ; ES also points to data segment
; --- Default segment usage ---
MOV BX, OFFSET val1 ; BX = offset of val1
MOV AX, [BX] ; DS:BX -> reads val1 = ABCDh
; --- Segment override ---
MOV AX, ES:[BX] ; ES:BX -> same physical address here
; because DS = ES in this example
; --- Stack segment (SS) usage ---
MOV AX, 1234h ; the 8086 has no PUSH imm16 (added in the 80186)
PUSH AX ; SS:SP -> push 1234h onto the stack
MOV BP, SP
MOV CX, [BP] ; SS:BP -> reads 1234h from stack
POP AX ; restore stack
; --- Demonstrate address calculation ---
; If DS = 1234h and BX = 0010h
; Physical = 1234h x 10h + 0010h
; = 12340h + 0010h = 12350h
; --- Different segment:offset, same address ---
; DS=1235h, BX=0000h -> 12350h + 0000h = 12350h
; DS=1200h, BX=0350h -> 12000h + 0350h = 12350h
; All point to physical address 12350h
; Store AX in val2 (AX holds 1234h after the POP above)
MOV [val2], AX
MOV AH, 4Ch
INT 21h
MAIN ENDP
END MAIN
Line-by-line walkthrough
- 1. MOV AX, @DATA / MOV DS, AX — loads the data segment address into DS. The assembler replaces @DATA with the paragraph address where .DATA begins
- 2. MOV ES, AX — sets ES equal to DS. Now both segment registers point to the same segment. In real programs, ES often points to a different segment (e.g., video memory at B800h)
- 3. MOV BX, OFFSET val1 — loads the offset (not the value) of val1 into BX. OFFSET is an assembler operator that returns the address within the data segment
- 4. MOV AX, [BX] — reads a word from DS:BX. The BIU computes physical address = DS x 16 + BX, fetches the 16-bit value (ABCDh), and loads it into AX
- 5. MOV AX, ES:[BX] — same offset BX, but now using ES instead of DS. The ES: prefix overrides the default segment. Since DS = ES, this reads the same physical address
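- 6. Pushing 1234h: the BIU writes the value to SS:SP (stack segment). SP is decremented by 2 first, then the value is stored. The 8086 itself has no PUSH with an immediate operand (that form arrived with the 80186), so on a strict 8086 the value goes through a register first (MOV AX, 1234h / PUSH AX)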
- 7. MOV BP, SP / MOV CX, [BP] — BP accesses the stack. [BP] defaults to SS segment (unlike [BX] which defaults to DS). Reads back 1234h from the top of the stack
- 8. POP AX — pops the value from SS:SP into AX and increments SP by 2, restoring the stack to its previous state
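The stack steps in the walkthrough (decrement SP, store at SS:SP, read back, increment SP) can be mimicked with a toy model. This is a sketch for illustration only, not an emulator: the class name `Stack8086` and the SS value 2000h are assumptions, while SP = 0100h mirrors the `.STACK 100h` directive in the listing.

```python
class Stack8086:
    """Toy model of PUSH/POP against a 1 MB real-mode memory image."""

    def __init__(self, ss, sp):
        self.memory = bytearray(1 << 20)  # 1 MB, all zeros
        self.ss = ss
        self.sp = sp

    def _address(self, offset):
        # Same segment * 16 + offset rule as any other memory access.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF                    # SP drops first
        self.memory[self._address(self.sp)] = value & 0xFF  # low byte first
        self.memory[self._address(self.sp + 1)] = (value >> 8) & 0xFF

    def peek(self):
        # Comparable to MOV BP, SP followed by MOV CX, [BP].
        low = self.memory[self._address(self.sp)]
        high = self.memory[self._address(self.sp + 1)]
        return (high << 8) | low

    def pop(self):
        value = self.peek()
        self.sp = (self.sp + 2) & 0xFFFF                    # SP rises after the read
        return value


stack = Stack8086(ss=0x2000, sp=0x0100)  # SS value is hypothetical
stack.push(0x1234)
print(f"SP={stack.sp:04X}h top={stack.peek():04X}h")  # prints SP=00FEh top=1234h
popped = stack.pop()
print(f"SP={stack.sp:04X}h popped={popped:04X}h")     # prints SP=0100h popped=1234h
```

After the pop, SP is back at its starting value, which is what "restoring the stack" means in step 8.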
Spot the bug
MOV AX, 1234h
MOV DS, AX ; DS = 1234h
MOV SI, 5678h ; offset = 5678h
; Programmer expects physical address: 12345678h
MOV AL, [SI] ; read from 'address 12345678h'
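The comments in this snippet treat the segment and offset as if they were glued together into one 32-bit number. Applying the physical address formula from earlier in the section gives a different result: 1234h x 16 + 5678h = 12340h + 5678h = 179B8h. A short check of both interpretations, with variable names chosen here for illustration:

```python
segment, offset = 0x1234, 0x5678

# What the comment assumes: segment and offset concatenated.
concatenated = (segment << 16) | offset

# What the 8086 computes: segment shifted left by 4 bits, plus the offset.
actual = ((segment << 4) + offset) & 0xFFFFF

print(f"{concatenated:08X}h")  # prints 12345678h
print(f"{actual:05X}h")        # prints 179B8h
```

The concatenated value could not be a valid 8086 address in any case, since it needs far more than 20 bits.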