The String Instructions
Chapter Six
6.1 Chapter Overview
A string is a collection of objects stored in contiguous memory locations. Strings are usually arrays of
bytes, words, or (on 80386 and later processors) double words. The 80x86 microprocessor family supports
several instructions specifically designed to cope with strings. This chapter explores some of the uses of
these string instructions.
The 80x86 CPUs can process three types of strings: byte strings, word strings, and double word strings.
They can move strings, compare strings, search for a specific value within a string, initialize a string to a
fixed value, and do other primitive operations on strings. The 80x86's string instructions are also useful for
manipulating arrays, tables, and records. You can easily assign or compare such data structures using the
string instructions. Using string instructions may speed up your array manipulation code considerably.
6.2 The 80x86 String Instructions
All members of the 80x86 family support five different string instructions: MOVSx, CMPSx, SCASx,
LODSx, and STOSx (see footnote 1). (x = B, W, or D for byte, word, or double word, respectively. This text
will generally drop the x suffix when talking about these string instructions in a general sense.) They are the
string primitives since you can build most other string operations from these five instructions. How you use
these five instructions is the topic of the next several sections.
For MOVS:
movsb();
movsw();
movsd();
For CMPS:
cmpsb();
cmpsw();
cmpsd();
For SCAS:
scasb();
scasw();
scasd();
For STOS:
stosb();
stosw();
stosd();
For LODS:
lodsb();
lodsw();
lodsd();
1. The 80x86 processors support two additional string instructions, INS and OUTS, which input strings of
data from an input port or output strings of data to an output port. We will not consider these instructions
since they are privileged instructions and you cannot execute them in a standard 32-bit OS application.
Beta Draft - Do not distribute
© 2001, By Randall Hyde
6.2.1 How the String Instructions Operate
The string instructions operate on blocks (contiguous linear arrays) of memory. For example, the MOVS
instruction moves a sequence of bytes from one memory location to another. The CMPS instruction compares
two blocks of memory. The SCAS instruction scans a block of memory for a particular value. These
string instructions often require three operands: a destination block address, a source block address, and
(optionally) an element count. For example, when using the MOVS instruction to copy a string, you need a
source address, a destination address, and a count (the number of string elements to move).
Unlike other instructions which operate on memory, the string instructions don't have any explicit operands.
The operands for the string instructions include
• the ESI (source index) register,
• the EDI (destination index) register,
• the ECX (count) register,
• the AL/AX/EAX register, and
• the direction flag in the FLAGS register.
For example, one variant of the MOVS (move string) instruction copies a string from the source address
specified by ESI to the destination address specified by EDI, of length ECX. Likewise, the CMPS instruction
compares the string pointed at by ESI, of length ECX, to the string pointed at by EDI.
Not all instructions have source and destination operands (only MOVS and CMPS support them). For
example, the SCAS instruction (scan a string) compares the value in the accumulator (AL, AX, or EAX) to
values in memory.
6.2.2 The REP/REPE/REPZ and REPNZ/REPNE Prefixes
The string instructions, by themselves, do not operate on strings of data. The MOVS instruction, for
example, will move a single byte, word, or double word. When executed by itself, the MOVS instruction
ignores the value in the ECX register. The repeat prefixes tell the 80x86 to do a multi-byte string operation.
The syntax for the repeat prefix is:
For MOVS:
rep.movsb();
rep.movsw();
rep.movsd();
For CMPS:
repe.cmpsb(); // Note: repz is a synonym for repe.
repe.cmpsw();
repe.cmpsd();
repne.cmpsb(); // Note: repnz is a synonym for repne.
repne.cmpsw();
repne.cmpsd();
For SCAS:
repe.scasb(); // Note: repz is a synonym for repe.
repe.scasw();
repe.scasd();
repne.scasb(); // Note: repnz is a synonym for repne.
repne.scasw();
repne.scasd();
For STOS:
rep.stosb();
rep.stosw();
rep.stosd();
You don’t normally use the repeat prefixes with the LODS instruction.
When you specify a repeat prefix before a string instruction, the string instruction repeats ECX times (see footnote 2).
Without the repeat prefix, the instruction operates only on a single byte, word, or double word.
You can use repeat prefixes to process entire strings with a single instruction. You can use the string
instructions, without the repeat prefix, as string primitive operations to synthesize more powerful string
operations.
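The difference between a bare string instruction and a prefixed one can be sketched in C. This is only a model of the behavior described above, not HLA and not how the CPU is implemented: the StringRegs structure and the function names are hypothetical, and the direction flag is assumed to be clear. The last function models the exception in footnote 2, where a compare stops early on the first mismatch.

```c
#include <stdint.h>

/* Hypothetical model of the registers the string instructions use implicitly. */
typedef struct {
    const uint8_t *esi;  /* source index */
    uint8_t       *edi;  /* destination index */
    uint32_t       ecx;  /* element count */
} StringRegs;

/* One MOVSB with the direction flag clear: copy one byte and advance
   both pointers. ECX is neither read nor changed. */
static void movsb_step(StringRegs *r)
{
    *r->edi = *r->esi;
    r->esi += 1;
    r->edi += 1;
}

/* REP MOVSB: repeat the single step until ECX reaches zero. */
static void rep_movsb(StringRegs *r)
{
    while (r->ecx != 0) {
        movsb_step(r);
        r->ecx -= 1;
    }
}

/* REPE CMPSB: compare byte pairs while they are equal, at most ECX times.
   zf is the zero flag on entry; the return value is the zero flag on exit
   (1 if the last pair compared equal). */
static int repe_cmpsb(StringRegs *r, int zf)
{
    while (r->ecx != 0) {
        zf = (*r->esi == *r->edi);
        r->esi += 1;
        r->edi += 1;
        r->ecx -= 1;
        if (!zf) {
            break;  /* first mismatch ends the repetition early */
        }
    }
    return zf;
}
```

In this model, calling movsb_step leaves ECX untouched, while rep_movsb always finishes with ECX equal to zero. repe_cmpsb can finish with a nonzero ECX when the two blocks differ.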
6.2.3 The Direction Flag
Besides the ESI, EDI, ECX, and AL/AX/EAX registers, one other register controls the 80x86's string
instructions: the flags register. Specifically, the direction flag in the flags register controls how the CPU
processes strings.
If the direction flag is clear, the CPU increments ESI and EDI after operating upon each string element.
For example, if the direction flag is clear, then executing MOVS will move the byte, word, or double word at
ESI to EDI and will increment ESI and EDI by one, two, or four. When specifying the REP prefix before this
instruction, the CPU increments ESI and EDI for each element in the string. At completion, the ESI and EDI
registers will be pointing at the first item beyond the strings.
If the direction flag is set, then the 80x86 decrements ESI and EDI after processing each string element.
After a repeated string operation, the ESI and EDI registers will be pointing at the first byte, word, or double
word before the strings if the direction flag was set.
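The pointer movement in both directions can be modeled in C. This is a sketch under stated assumptions rather than real CPU behavior: ESI and EDI are represented as byte offsets into two buffers, and the names element_step and rep_movs_dir are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Signed step applied to ESI and EDI after each element.
   df is the direction flag (0 = clear, 1 = set); size is 1, 2, or 4 bytes. */
static ptrdiff_t element_step(int df, size_t size)
{
    return df ? -(ptrdiff_t)size : (ptrdiff_t)size;
}

/* Model of REP MOVSx using byte offsets into two buffers.
   On return, *esi and *edi hold the offsets the registers are left at. */
static void rep_movs_dir(uint8_t *dst_base, const uint8_t *src_base,
                         ptrdiff_t *esi, ptrdiff_t *edi,
                         uint32_t ecx, int df, size_t size)
{
    ptrdiff_t step = element_step(df, size);
    while (ecx != 0) {
        memcpy(dst_base + *edi, src_base + *esi, size);  /* move one element */
        *esi += step;
        *edi += step;
        ecx -= 1;
    }
}
```

With the flag clear and both offsets starting at 0, an 8-byte copy leaves the offsets at 8, one past the end. With the flag set and both offsets starting at 7, the same copy leaves them at -1, one before the start.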
The direction flag may be set or cleared using the CLD (clear direction flag) and STD (set direction flag)
instructions. When using these instructions inside a procedure, keep in mind that they modify the machine
state. Therefore, you may need to save the direction flag during the execution of that procedure. The
following example exhibits the kinds of problems you might encounter:
procedure Str2; nodisplay;
begin Str2;
std();
<Do some string operations>
.
.
.
end Str2;
.
.
.
cld();
<do some operations>
Str2();
<do some string operations requiring D=0>
2. Except for the CMPS and SCAS instructions, which repeat at most the number of times specified in the ECX register.
This code will not work properly. The calling code assumes that the direction flag is clear after Str2
returns. However, this isn't true. Therefore, the string operations executed after the call to Str2 will not
function properly.
There are a couple of ways to handle this problem. The first, and probably the most obvious, is always to
insert the CLD or STD instructions immediately before executing a sequence of one or more string
instructions. The other alternative is to save and restore the direction flag using the PUSHFD and POPFD
instructions. Using these two techniques, the code above would look like this:
Always issuing CLD or STD before a string instruction:
procedure Str2; nodisplay;
begin Str2;
std();
<Do some string operations>
.
.
.
end Str2;
.
.
.
cld();
<do some operations>
Str2();
cld();
<do some string operations requiring D=0>
Saving and restoring the flags register:
procedure Str2; nodisplay;
begin Str2;
pushfd();
std();
<Do some string operations>
.
.
.
popfd();
end Str2;
.
.
.
cld();
<do some operations>
Str2();
<do some string operations requiring D=0>
If you use the PUSHFD and POPFD instructions to save and restore the flags register, keep in mind that
you’re saving and restoring all the flags. Therefore, such subroutines cannot return any information in the
flags. For example, you will not be able to return an error condition in the carry flag if you use PUSHFD and
POPFD.
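Why the carry flag cannot carry a result out of such a subroutine can be seen in a small C model. This is only an illustration: the flags register is modeled as a plain integer, str2_model is a hypothetical stand-in for a procedure bracketed by PUSHFD and POPFD, and only the carry flag (bit 0) and direction flag (bit 10) are represented.

```c
#include <stdint.h>

/* Bit positions of the two flags this model cares about. */
enum { FLAG_CF = 1u << 0, FLAG_DF = 1u << 10 };

/* Model of a subroutine that brackets its body with PUSHFD/POPFD.
   Takes the caller's flags and returns the flags the caller sees afterwards. */
static uint32_t str2_model(uint32_t eflags)
{
    uint32_t saved = eflags;  /* pushfd: save the caller's flags */
    eflags |= FLAG_DF;        /* std: work backwards through the strings */
    eflags |= FLAG_CF;        /* body tries to report an error in carry */
    eflags = saved;           /* popfd: restores the saved flags, carry included */
    return eflags;
}
```

The caller gets its direction flag back unchanged, but the carry flag set inside the body is lost along with it.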
6.2.4 The MOVS Instruction
The MOVS instruction uses the following syntax:
movsb()
movsw()
movsd()
rep.movsb()
rep.movsw()
rep.movsd()
The MOVSB (move string, bytes) instruction fetches the byte at address ESI, stores it at address EDI
and then increments or decrements the ESI and EDI registers by one. If the REP prefix is present, the CPU
checks ECX to see if it contains zero. If not, then it moves the byte from ESI to EDI and decrements the
ECX register. This process repeats until ECX becomes zero.
The MOVSW (move string, words) instruction fetches the word at address ESI, stores it at address EDI
and then increments or decrements ESI and EDI by two. If there is a REP prefix, then the CPU repeats this
procedure as many times as specified in ECX.
The MOVSD instruction operates in a similar fashion on double words, incrementing or decrementing
ESI and EDI by four for each data movement.
When you use the REP prefix, the MOVSB instruction moves the number of bytes you specify in the
ECX register. The following code segment copies 384 bytes from CharArray1 to CharArray2:
CharArray1: byte[ 384 ];
CharArray2: byte[ 384 ];
.
.
.
cld();
lea( esi, CharArray1 );
lea( edi, CharArray2 );
mov( 384, ecx );
rep.movsb();
If you substitute MOVSW for MOVSB, then the code above will move 384 words (768 bytes) rather
than 384 bytes:
WordArray1: word[ 384 ];
WordArray2: word[ 384 ];
.
.
.
cld();
lea( esi, WordArray1 );
lea( edi, WordArray2 );
mov( 384, ecx );
rep.movsw();
Remember, the ECX register contains the element count, not the byte count. When using the MOVSW
instruction, the CPU moves the number of words specified in the ECX register. Similarly, MOVSD moves
the number of double words you specify in the ECX register, not the number of bytes.
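The element-count arithmetic can be checked with a short C model. This is a sketch, not HLA: bytes_moved and rep_movsw_model are hypothetical names, and the direction flag is assumed to be clear.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes transferred by REP MOVSx: ECX counts elements, not bytes. */
static size_t bytes_moved(uint32_t ecx, size_t element_size)
{
    return (size_t)ecx * element_size;
}

/* Model of REP MOVSW with the direction flag clear:
   copies ecx 16-bit words from esi to edi. */
static void rep_movsw_model(uint16_t *edi, const uint16_t *esi, uint32_t ecx)
{
    while (ecx != 0) {
        *edi++ = *esi++;  /* move one word, advance both pointers by two bytes */
        ecx -= 1;
    }
}
```

With ECX set to 384, bytes_moved gives 384 bytes for MOVSB, 768 for MOVSW, and 1536 for MOVSD.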
If you've set the direction flag before executing a MOVSB/MOVSW/MOVSD instruction, the CPU
decrements the ESI and EDI registers after moving each string element. This means that the ESI and EDI
registers must point at the end of their respective strings before issuing a MOVSB, MOVSW, or MOVSD
instruction. For example,
CharArray1: byte[ 384 ];
CharArray2: byte[ 384 ];
.
.
.
std(); // set the direction flag so ESI and EDI decrement
lea( esi, CharArray1[383] );
lea( edi, CharArray2[383] );