String Instructions and REP Prefix
Process arrays and strings at hardware speed with the 8086's built-in bulk operations
What is it?
String instructions are a set of 8086 operations (MOVS, CMPS, SCAS, LODS, STOS) designed for efficient bulk processing of byte or word arrays. They automatically update SI and/or DI pointers based on the Direction Flag. Combined with REP/REPE/REPNE prefixes, they execute repeated operations entirely in hardware, providing high-speed memory copy, fill, compare, and search without explicit loop code.
Real-world relevance
String instructions are a natural fit for the inner loops of system software. DOS-era code typically uses REP MOVSB to copy buffers, REP STOSB to zero memory, and REPNE SCASB to find string terminators, and BIOS-style routines can scroll the text video buffer with a block move. Whenever you need the equivalent of memcpy, memset, memcmp, or memchr in assembly (or, with an added terminator check, strcmp and strchr), string instructions are the tool.
Key points
- String Instruction Basics — The 8086 has five core string operations: MOVS (move), CMPS (compare), SCAS (scan/search), LODS (load), STOS (store). Each has byte (B suffix) and word (W suffix) variants. They all auto-increment or auto-decrement SI and/or DI after each operation.
- Direction Flag (DF) — DF controls whether SI and DI auto-increment (DF=0, forward) or auto-decrement (DF=1, backward). CLD clears DF for forward processing. STD sets DF for backward processing. Always set DF explicitly before string operations.
- MOVSB / MOVSW — Block Copy — MOVSB copies one byte from DS:SI to ES:DI, then adjusts SI and DI by 1. MOVSW copies a word (2 bytes) and adjusts by 2. Combined with REP, this copies entire memory blocks in a single instruction.
- STOSB / STOSW — Fill Memory — STOSB stores AL into ES:DI, then adjusts DI. STOSW stores AX. With REP, this fills a block of memory with a value — equivalent to memset() in C.
- LODSB / LODSW — Stream Load — LODSB loads the byte at DS:SI into AL and adjusts SI. LODSW loads a word into AX. Typically used WITHOUT REP inside a manual loop, so you can process each loaded element before moving to the next.
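- CMPSB / CMPSW — Block Compare — CMPSB compares the byte at DS:SI with the byte at ES:DI (like CMP, it computes source minus destination and only sets the flags), then adjusts both SI and DI. With REPE, it continues while the bytes are equal and stops at the first mismatch or when CX reaches 0, leaving SI and DI one element past the last pair compared. Because the length comes from CX rather than a terminator, the behavior is closest to memcmp; a strcmp-style routine must also check for the null terminator.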
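- SCASB / SCASW — Search — SCASB compares AL with the byte at ES:DI (SCASW compares AX with a word), updates flags, then adjusts DI. With REPNE, it scans until a match is found or CX reaches 0, giving memchr- or strchr-like behavior. CX limits the search range, and after a match DI points one element past the matching one.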
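- REP Prefix — REP repeats the string instruction CX times, decrementing CX after each iteration and stopping when CX reaches 0. If CX is already 0, the instruction does not execute at all. REP is used with MOVS and STOS. The CPU handles the repetition itself, with no per-iteration instruction fetch or branch, which makes it faster than an equivalent software loop on the 8086.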
- REPE/REPZ and REPNE/REPNZ — REPE (repeat while equal) continues while CX != 0 AND ZF = 1. REPNE (repeat while not equal) continues while CX != 0 AND ZF = 0. Used with CMPS and SCAS to add a condition beyond just the counter.
- Segment Overrides with Strings — By default, the source uses DS:SI and the destination uses ES:DI. You typically need to set up ES to match DS (or to another segment). The source can use a segment override prefix, but the destination is always ES:DI.
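The key points above can be sketched in one small program. This is a minimal illustration, not a definitive implementation: the labels buf1, buf2, total, next_byte, and done are hypothetical, and it assumes the same MASM-style small-model DOS setup used elsewhere on this page. It fills two buffers with REP STOSB and REP STOSW, compares them with REPE CMPSB, and then sums one buffer with LODSB inside a manual loop.

```asm
; Sketch: fill, compare, and stream-load
; (buf1, buf2, and total are hypothetical names)
.MODEL SMALL
.STACK 100h
.DATA
buf1  DB 8 DUP(?)
buf2  DB 8 DUP(?)
total DW 0
.CODE
MAIN PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX          ; destinations always use ES:DI

    CLD                 ; DF = 0: SI and DI increment

    ; Fill buf1 with 8 copies of 05h (memset-like)
    LEA DI, buf1
    MOV AL, 05h
    MOV CX, 8           ; byte count
    REP STOSB

    ; Fill buf2 with the same pattern, as 4 words of 0505h
    LEA DI, buf2
    MOV AX, 0505h
    MOV CX, 4           ; word count, not byte count
    REP STOSW

    ; Compare the two buffers (memcmp-like)
    LEA SI, buf1
    LEA DI, buf2
    MOV CX, 8
    REPE CMPSB          ; stops at first mismatch or when CX = 0
    JNE done            ; ZF = 0: buffers differ, skip the sum

    ; Sum the bytes of buf1 with LODSB in a manual loop
    LEA SI, buf1
    MOV CX, 8
    XOR BX, BX          ; BX = running sum
    XOR AH, AH          ; keep AH = 0 so AX equals AL
next_byte:
    LODSB               ; AL = byte at DS:SI, then SI = SI + 1
    ADD BX, AX
    LOOP next_byte
    MOV [total], BX     ; 8 * 5 = 40 (28h)

done:
    MOV AH, 4Ch
    INT 21h
MAIN ENDP
END MAIN
```

REP STOSW takes a word count (4 here), while the byte forms take a byte count (8). Mixing the two up is a common cause of buffer overruns.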
Code example
; Program: Copy a string then find a character in it
.MODEL SMALL
.STACK 100h
.DATA
src  DB 'Hello, 8086 World!', 0
dest DB 20 DUP(0)
pos  DW ?
.CODE
MAIN PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX          ; ES = DS (same segment)

    ; --- Part 1: Copy string ---
    LEA SI, src
    LEA DI, dest
    MOV CX, 19          ; 18 chars + null
    CLD                 ; forward direction
    REP MOVSB           ; copy src -> dest

    ; --- Part 2: Find '8' in dest ---
    LEA DI, dest
    MOV CX, 19
    MOV AL, '8'         ; search for '8'
    CLD
    REPNE SCASB         ; scan until match
    JNE not_found
    ; DI points one past '8', so subtract to get position
    LEA AX, dest
    SUB DI, AX
    DEC DI              ; DI = zero-based position
    MOV [pos], DI       ; pos = 7

not_found:
    MOV AH, 4Ch
    INT 21h
MAIN ENDP
END MAIN
Line-by-line walkthrough
- 1. MOV ES, AX — set ES equal to DS because string destinations always use ES:DI
- 2. LEA SI, src / LEA DI, dest — load effective addresses of source and destination strings into the index registers
- 3. MOV CX, 19 — set the repeat counter to 19 (18 characters plus the null terminator)
- 4. CLD — clear Direction Flag so string operations move forward (incrementing SI and DI)
- 5. REP MOVSB — copies 19 bytes from DS:SI to ES:DI. Each iteration: copy byte, increment SI, increment DI, decrement CX
- 6. LEA DI, dest — reset DI to the start of dest for the search operation
- 7. MOV AL, '8' — load the character we want to find into AL (SCASB always compares with AL)
- 8. REPNE SCASB — scan through dest comparing each byte with AL. Stops when a match is found (ZF=1) or CX reaches 0
- 9. JNE not_found — if ZF=0 after REPNE SCASB, the character was not found in the buffer
- 10. SUB DI, AX / DEC DI — calculate the zero-based position. DI points one past the match, so we subtract the base address and then subtract 1
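The same "one past the match" arithmetic gives a strlen-style length when the byte being searched for is the null terminator. The sketch below is illustrative only: msg, len, and no_term are hypothetical names, and the 0FFFFh bound simply means "search up to the rest of the segment".

```asm
; Sketch: length of a null-terminated string with REPNE SCASB
; (msg and len are hypothetical names)
.MODEL SMALL
.STACK 100h
.DATA
msg DB 'Hello', 0
len DW ?
.CODE
MAIN PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX          ; SCASB reads from ES:DI

    LEA DI, msg
    MOV CX, 0FFFFh      ; upper bound on the search
    XOR AL, AL          ; look for the 0 terminator
    CLD                 ; scan forward
    REPNE SCASB         ; DI ends one past the terminator
    JNE no_term         ; ZF = 0: no terminator within the bound

    MOV AX, DI
    SUB AX, OFFSET msg  ; bytes scanned, including the terminator
    DEC AX              ; exclude the terminator
    MOV [len], AX       ; len = 5

no_term:
    MOV AH, 4Ch
    INT 21h
MAIN ENDP
END MAIN
```

Subtracting the start offset from DI avoids having to reconstruct the count from what is left in CX, at the cost of one extra register.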
Spot the bug
LEA SI, src
LEA DI, dst
MOV CX, 50
REP MOVSB
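One plausible reading of this exercise, based on the Direction Flag key point: the fragment never sets DF, so if earlier code left DF = 1 the copy runs backward from SI and DI and writes outside the intended buffers. It also relies on ES already addressing the destination. A minimal sketch of a corrected version follows, assuming src and dst are 50-byte buffers in the same data segment (both definitions are hypothetical, added only to make the example complete).

```asm
; Sketch: the fragment above with DF and ES set explicitly
.MODEL SMALL
.STACK 100h
.DATA
src DB 50 DUP('A')      ; hypothetical 50-byte source
dst DB 50 DUP(?)        ; hypothetical 50-byte destination
.CODE
MAIN PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX          ; dst is addressed through ES:DI

    LEA SI, src
    LEA DI, dst
    MOV CX, 50
    CLD                 ; the missing step: force forward direction
    REP MOVSB           ; copy 50 bytes src -> dst

    MOV AH, 4Ch
    INT 21h
MAIN ENDP
END MAIN
```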