Using the Inline Assembler in C/C++ (reposted from MSDN)

2006-08-12 14:06

Inline Assembler

The compiler includes a powerful inline assembler. With it, assembly language instructions can be used directly in C and C++ source programs without requiring a separate assembler program. Assembly language enables optimizing critical functions, interfacing to the BIOS, operating system, and special hardware, and accessing capabilities of the processor that are not available from C++.
It supports both 16-bit and 32-bit code generation in all memory models.

What's in This Chapter

Basic features of the inline assembler.

The ASM statement and the ASM block.

Using ASM registers.

Calling C and C++ from assembly language.

Registers and opcodes.

Advantages of writing inline assembly language functions

Use assembly language functions to:

Create a subroutine that executes as quickly as possible

Provide an interface to functions compiled with another compiler

Access capabilities of the CPU and the native instruction set that are not available from C++

Interface to specialized hardware

Write code where specific instruction selection and ordering are critical

The asm Statement

The asm statement invokes the assembler. Use this statement wherever a C or C++ statement is legal. You can use asm in any of three ways.
The first example shows asm followed simply by an assembly instruction:

asm mov AH,2
asm mov DL,7
asm int 21H

The second example shows asm followed by a set of assembly instructions enclosed by braces. An empty set of braces may follow the directive.
asm {
    mov AH,2
    mov DL,7
    int 21H
}

Because the asm statement is a statement separator, assembly instructions can appear on the same line:
asm mov AH,2 asm mov DL,7 asm int 21H

The three previous examples generate identical code. But enclosing assembly language in braces, as in the second example, sets the assembly language off from the surrounding C++ code and avoids repeating the asm statement.
No assembler instruction can continue onto a second line. Use the form of the last example primarily for writing macros, which must be one line long after expansion.

Note

The Digital Mars C++ asm statement emulates the Borland asm statement. The _asm and __asm statements emulate the Microsoft _asm and __asm statements.

The ASM Block

A series of assembler instructions enclosed by braces following the asm keyword is called an "ASM block." Unlike C++ blocks, ASM blocks do not affect the scope of variables.

Restrictions on using C and C++ in an ASM block

An ASM block can use the following C and C++ language elements:

Symbols, including labels, variables, and function names.

Constants, including symbolic constants and enum members.

Macros and preprocessor directives.

Comments delimited by /* */ or introduced by //. In Microsoft-compatible mode, a semicolon (;) can also introduce a comment.

Type names (where a MASM type would be legal).

Type names, including declarations of structs and pointers.

C-style type casts.

Note

Microsoft and Borland inline assemblers do not support type casts.
Inline assembly instructions within C or C++ statements can refer to C or C++ variables by name.

MASM-Style hexadecimal constants

Support for MASM-style hexadecimal constants provides easy conversion of MASM-style source code. The constants take the form:
digit {hex_digit} ('H'| 'h')

You cannot use hexadecimal constants if the -A (for ANSI compatibility) option is used.
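The grammar above can be checked with a few lines of C. This is a minimal sketch, not the compiler's actual lexer: the helper name parse_masm_hex is hypothetical, and the sketch does no overflow checking. It shows why a constant such as 0FFh needs its leading 0, because the form must start with a decimal digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of any compiler): returns 1 and stores the
   value in *out when s has the form  digit {hex_digit} ('H' | 'h'),
   otherwise returns 0. Overflow is not checked in this sketch. */
int parse_masm_hex(const char *s, unsigned long *out)
{
    size_t len, i;
    unsigned long value = 0;

    assert(s != NULL && out != NULL);
    len = strlen(s);

    /* At least one digit plus the suffix, and the first character must be
       a decimal digit so the constant cannot be mistaken for a symbol. */
    if (len < 2 || !isdigit((unsigned char)s[0]))
        return 0;
    if (s[len - 1] != 'H' && s[len - 1] != 'h')
        return 0;

    /* Accumulate every character before the suffix as a hex digit. */
    for (i = 0; i + 1 < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isxdigit(c))
            return 0;
        if (isdigit(c))
            value = value * 16 + (unsigned long)(c - '0');
        else
            value = value * 16 + (unsigned long)(tolower(c) - 'a' + 10);
    }
    *out = value;
    return 1;
}
```

Under these assumptions, "21H" and "0FFh" are accepted, while "FFh" (no leading digit) and "21" (no suffix) are rejected.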

C and C++ operators in an ASM block

An ASM block cannot use operators specific to C and C++, such as the left-shift (<<) operator.
You can use operators common to C, C++, and MASM within an ASM block, but they are interpreted as assembly language operators. For example, C and C++ interpret brackets ([]) as enclosing array subscripts and scale them to the size of an array element. Within an asm statement, however, brackets are interpreted as the MASM index operator, which adds an unscaled byte offset to any adjacent operand.
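The difference between a scaled C subscript and an unscaled byte offset can be illustrated in portable C. This is only a sketch of the addressing arithmetic, not inline assembly, and the function names read_scaled and read_unscaled are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* C subscript: index counts elements, so the compiler multiplies it by
   sizeof(int) before adding it to the base address. */
int read_scaled(const int *base, size_t index)
{
    assert(base != NULL);
    return base[index];
}

/* Unscaled offset, in the manner of the MASM index operator: byte_offset
   is added to the base address as a plain byte count. */
int read_unscaled(const int *base, size_t byte_offset)
{
    int value;

    assert(base != NULL);
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}
```

To reach the same element, the byte offset has to be the C index multiplied by sizeof(int): read_scaled(a, 2) and read_unscaled(a, 2 * sizeof(int)) read the same int.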

In Microsoft-compatible mode, the semicolon delimits comments, as in MASM.

Assembly language in an asm statement

In common with other assemblers, the inline assembler accepts any instruction that is legal in MASM. The following are some of the assembly language features of the asm statement:

Expressions. Inline assembly code can use any MASM expression. A MASM expression is any combination of operands and operators that evaluates to a single value or address.

Data directives and operators. An asm statement can define data objects in the code segment with the MASM directives DB, DW, and DQ. The MASM directives DT, DF, STRUC, and RECORD, and the operators DUP, THIS, WIDTH, and MASK are not accepted.

Macros. An asm statement can use C preprocessor macros even though the inline assembler is not a macro assembler and does not support MASM macro directives.

Other restrictions on C and C++ symbols

An asm statement can reference any C or C++ variable name, function, or label in scope, provided those names are not symbolic constants. However, you cannot call a C++ member function from within an asm statement.
Prototype the functions referenced in an asm statement before using them in programs. This lets the compiler distinguish them from names and labels.

Each assembly language instruction can contain a single C or C++ symbol.

C or C++ symbols within an ASM block must not have the same spelling as an asm reserved word.

The inline assembler allows structure or union tags in asm statements, but only as qualifiers of references to members of the structure or union.

Accessing C or C++ data in an asm statement

In general, instructions in an asm statement can reference any symbol in scope where the statement appears. The following statement loads the AX register with the value of var, a C variable in scope:
asm mov AX,var

An asm statement can reference a uniquely named member of a class, structure, or union without specifying the variable name or type before the period operator. But if the member name is not unique, you must specify a variable or type name before the period operator. If two structure types have a member name in common, as in this example:
struct first_type
{
char *mold;
int common_name;
};

struct second_type
{
char *mildew;
long common_name;
int unique_name;
};

then qualify the reference to common_name with the tag name:
asm mov [bx]first_type.common_name,10

You need not qualify a reference to a unique member name. In the following example, unique_name can be referenced anonymously because only one of the two structure types declares it.
asm mov [bx].unique_name,10

This statement generates the same instruction whether or not a qualifying name or type is present. For more information, see the section "Making anonymous references to structure members" later in this chapter.

Functions in inline assembly language

Because ASM blocks do not require separate source file assembly steps, writing a function using ASM blocks is easier than using a separate assembler. In addition, the compiler generates function prolog and epilog code.
The expon2 function is an example of a function written in inline assembly language:

int expon2(int num, int power)
{
    asm
    {   mov AX,num      // get first argument
        mov CX,power    // get second argument
        shl AX,CL       // AX = AX * (2 to the power of CL)
    }
}

An inline function refers to its arguments by name and may appear in the same source file as the callers of the function.
Refer to Using Assembly Language Functions for a description of the register stacks used by inline assembly instructions.
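For comparison, the computation that expon2 performs can be written in portable C. This is a sketch only: the name expon2_c is hypothetical, and it assumes the result fits in an int, whereas the assembly version leaves whatever the 16-bit shift produces in AX.

```c
#include <assert.h>
#include <limits.h>

/* Portable counterpart of expon2 (the name expon2_c is hypothetical).
   Returns num * 2^power for results that fit in an int. */
int expon2_c(int num, int power)
{
    /* Shifting by a negative count or by the full width is undefined in C. */
    assert(power >= 0 && power < (int)(sizeof(int) * CHAR_BIT));

    /* Shift as unsigned to avoid undefined behavior on signed overflow. */
    return (int)((unsigned int)num << power);
}
```

For example, expon2_c(3, 4) multiplies 3 by 2 to the power of 4.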

Making anonymous references to structure members

You can make anonymous references to members of a given structure, as in the following:
struct x {
int i;
int j;
int k;
} foo;

You can refer to the members i, j, and k anonymously by byte offset, for example, with the assembly instructions:
asm
{   mov BX,4
    mov AX,foo[BX]  ; offset 4 is member j of foo when int is 4 bytes (member k when int is 2 bytes)
}
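Because an anonymous reference is a raw byte offset, which member it reaches depends on the size of int in the memory model in use. The sketch below uses offsetof from standard C to compute member offsets instead of hard-coding them; the helper read_int_at is hypothetical and is plain C, not inline assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct x {
    int i;
    int j;
    int k;
};

/* Hypothetical helper: reads the int stored byte_offset bytes into *s,
   mirroring an anonymous reference such as foo[BX]. */
int read_int_at(const struct x *s, size_t byte_offset)
{
    int value;

    assert(s != NULL && byte_offset + sizeof value <= sizeof *s);
    memcpy(&value, (const unsigned char *)s + byte_offset, sizeof value);
    return value;
}
```

Calling read_int_at(&foo, offsetof(struct x, j)) reaches member j regardless of the size of int, which a literal offset such as 4 does not guarantee.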

Using register variables

Digital Mars C++ supports register variables. Register variables are useful with inline assembly. If asm statements place results in registers, you can use register variables to access those values.
For more information see "Using Register Variables" in Chapter 5, "Using Assembly Language Functions."

Using the __LOCAL_SIZE symbol

When using the inline assembler, the special symbol __LOCAL_SIZE expands to the number of bytes used by all local symbols. __LOCAL_SIZE is useful in combination with __declspec(naked), as __LOCAL_SIZE is the amount of space to reserve on the stack.
For example:

__declspec(naked) int test()
{
    int x, y, z;

    _asm
    {   push BP
        mov BP,SP
        sub SP,__LOCAL_SIZE
        mov BX,__LOCAL_SIZE[BP]
        mov BX,__LOCAL_SIZE+2[BP]
        mov AX,__LOCAL_SIZE
        mov AX,__LOCAL_SIZE+2
    }
    _AX = x + y + z;
    _asm
    {
        mov SP,BP
        pop BP
        ret
    }
}

Using ASM Registers

The asm statement alters the registers outside the programmer's explicit assembly language instructions. Registers contain whatever values the normal control flow leaves in them at the point of the asm statement.

For 16-bit memory models

You do not need to preserve the following registers when writing inline assembly language: AX, BX, CX, DX, SI, DI, ES, and flags (other than DF).
C and C++ do not expect these registers to be maintained between statements, but they do preserve the following registers: CS, DS, SS, SP and BP.

Note

The compiler does not use registers to hold register variables for functions containing inline assembly code.

For 32-bit memory models

Functions can change the values in the EAX, ECX, and EDX registers.
Functions must preserve the values in the EBX, ESI, EDI, EBP, ESP, SS, CS, DS registers (plus ES and GS for the NT memory model).

Always set the direction flag to forward.

To maximize speed on 32-bit buses, make sure data aligns along 32-bit boundaries.

Function return values

For 16-bit memory models. If the return value for a function is short (a char, int, or near pointer) store it in the AX register, as in the previous example, expon2. If the return value is long, store the high word in the DX register and the low word in AX. To return a longer value, store the value in memory and return a pointer to the value.

For 32-bit models. Return near pointers, ints, unsigned ints, chars, shorts, longs, and unsigned longs in EAX. 32-bit models return far pointers in EDX, EAX, where EDX contains the segment and EAX contains the offset.

When C linkage is in effect. Floats are returned in EAX and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX the least significant.

When C++ linkage is in effect. The compiler creates a temporary copy on the stack and returns a pointer to it.

Interfacing to a member function

The easiest way to interface an assembly language routine to a class member function is to provide a C wrapper function that can be called from the assembly language routine.
An alternative is to write the member function in C++ and compile it with normal out-of-line member functions.

Calling C Functions from an ASM Block

C functions, including C library functions, can be called from within the asm block, as in the following example:
#include <stdio.h>
char format[] = "%s %s %s\n";
char alas[] = "Alas,";
char poor[] = "poor";
char Yorick[] = "Yorick!";

void main(void)
{
    asm
    {   mov  AX, offset Yorick   // push the arguments from right to left
        push AX
        mov  AX, offset poor
        push AX
        mov  AX, offset alas
        push AX
        mov  AX, offset format
        push AX
        call printf
        add  SP, 8               // caller removes the four 2-byte arguments (assumes 16-bit near pointers)
    }
}

Because function arguments are passed on the stack, simply push the needed arguments from right to left before calling the function. To print the message, the example pushes pointers to the three strings, then a pointer to the format string, and then calls printf.
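The push sequence corresponds to an ordinary C call in which the arguments are written left to right. The sketch below shows that call in portable C. It uses snprintf rather than printf so the result can be inspected, and the helper name format_message is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

static const char format[] = "%s %s %s\n";
static const char alas[]   = "Alas,";
static const char poor[]   = "poor";
static const char Yorick[] = "Yorick!";

/* Hypothetical helper: formats the message into buf and returns the
   character count reported by snprintf. */
int format_message(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    /* Arguments appear left to right here; the asm block pushes them in
       the opposite order: Yorick, poor, alas, format. */
    return snprintf(buf, size, format, alas, poor, Yorick);
}
```

The last argument in the C call is the first one pushed, so the format string ends up on top of the stack, where printf expects it.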

Calling C++ functions

An ASM block can call only global C++ functions that are not overloaded, because the types of the arguments are unknown. The compiler issues an error if an ASM block calls an overloaded global C++ function or a C++ member function.
You can also call a function declared with extern "C" linkage from an asm statement within a C++ program, because all the standard header files declare the library functions to have extern "C" linkage.

Defining ASM blocks as C macros

A C macro is a convenient way to insert assembly language into source code. But because a macro expands into a single logical line, take care when writing one.
If the macro expands into multiple instructions, enclose the instructions in an ASM block, and precede each instruction with the asm keyword. Also separate comments from code with /* */ delimiters rather than //. Unless you take these precautions, the compiler can be confused by C or C++ statements to the left or right of the assembly code, or can interpret instructions as comments when the macro becomes a single line. Without the closing brace, the compiler cannot tell where the assembly language ends.

Warning: Do not use double-slash (//) characters within a macro. The compiler terminates the macro when it sees a double-slash.

An ASM block written as a macro can accept arguments but, unlike a C macro, it cannot return values. But some MASM macros can be written as macros for C. The following MASM macro sets a video page to the value specified in the argument page:

findpage MACRO page
    mov AH,5
    mov AL,page
    int 10h
ENDM

The following C macro does the same thing:
#define findpage(page) asm \
{ \
    asm mov AH,5 \
    asm mov AL,page \
    asm int 10h \
}
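The single-logical-line rule applies to any multi-line C macro, not only to ones that contain assembly. The sketch below shows the same precautions in portable C: backslash line continuations, /* */ comments, and a body wrapped so that it behaves as one statement. The names SWAP_INT and sort_pair are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Each line ends in a backslash, so the preprocessor sees one logical line.
   A // comment here would swallow the rest of the macro after splicing. */
#define SWAP_INT(a, b)                                         \
    do {                /* block comments survive expansion */ \
        int swap_tmp_ = (a);                                   \
        (a) = (b);                                             \
        (b) = swap_tmp_;                                       \
    } while (0)

/* Puts the smaller value in *lo and the larger in *hi. */
void sort_pair(int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* expands to a single statement */
}
```

The do { ... } while (0) wrapper plays the role that the braces of the ASM block play in findpage: it marks where the expanded code begins and ends.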

Registers

The following registers are supported. Register names are in upper or lower case.

AL, AH, AX, EAX
BL, BH, BX, EBX
CL, CH, CX, ECX
DL, DH, DX, EDX
BP, EBP
SP, ESP
DI, EDI
SI, ESI
ES, CS, SS, DS, GS, FS
CR0, CR2, CR3, CR4
DR0, DR1, DR2, DR3, DR6, DR7
TR3, TR4, TR5, TR6, TR7
ST
ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7)
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7
XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7

Opcodes

The following instructions are supported. Opcode names are in upper or lower case.
aaa aad aam aas adc
add addpd addps addsd addss
and andnpd andnps andpd andps
arpl bound bsf bsr bswap
bt btc btr bts call
cbw cdq clc cld clflush
cli clts cmc cmova cmovae
cmovb cmovbe cmovc cmove cmovg
cmovge cmovl cmovle cmovna cmovnae
cmovnb cmovnbe cmovnc cmovne cmovng
cmovnge cmovnl cmovnle cmovno cmovnp
cmovns cmovnz cmovo cmovp cmovpe
cmovpo cmovs cmovz cmp cmppd
cmpps cmps cmpsb cmpsd cmpss
cmpsw cmpxch8b cmpxchg comisd comiss
cpuid cvtdq2pd cvtdq2ps cvtpd2dq cvtpd2pi
cvtpd2ps cvtpi2pd cvtpi2ps cvtps2dq cvtps2pd
cvtps2pi cvtsd2si cvtsd2ss cvtsi2sd cvtsi2ss
cvtss2sd cvtss2si cvttpd2dq cvttpd2pi cvttps2dq
cvttps2pi cvttsd2si cvttss2si cwd cwde
da daa das db dd
de dec df di div
divpd divps divsd divss dl
dq ds dt dw emms
enter f2xm1 fabs fadd faddp
fbld fbstp fchs fclex fcmovb
fcmovbe fcmove fcmovnb fcmovnbe fcmovne
fcmovnu fcmovu fcom fcomi fcomip
fcomp fcompp fcos fdecstp fdisi
fdiv fdivp fdivr fdivrp feni
ffree fiadd ficom ficomp fidiv
fidivr fild fimul fincstp finit
fist fistp fisub fisubr fld
fld1 fldcw fldenv fldl2e fldl2t
fldlg2 fldln2 fldpi fldz fmul
fmulp fnclex fndisi fneni fninit
fnop fnsave fnstcw fnstenv fnstsw
fpatan fprem fprem1 fptan frndint
frstor fsave fscale fsetpm fsin
fsincos fsqrt fst fstcw fstenv
fstp fstsw fsub fsubp fsubr
fsubrp ftst fucom fucomi fucomip
fucomp fucompp fwait fxam fxch
fxrstor fxsave fxtract fyl2x fyl2xp1
hlt idiv imul in inc
ins insb insd insw int
into invd invlpg iret iretd
ja jae jb jbe jc
jcxz je jecxz jg jge
jl jle jmp jna jnae
jnb jnbe jnc jne jng
jnge jnl jnle jno jnp
jns jnz jo jp jpe
jpo js jz lahf lar
ldmxcsr lds lea leave les
lfence lfs lgdt lgs lidt
lldt lmsw lock lods lodsb
lodsd lodsw loop loope loopne
loopnz loopz lsl lss ltr
maskmovdqu maskmovq maxpd maxps maxsd
maxss mfence minpd minps minsd
minss mov movapd movaps movd
movdq2q movdqa movdqu movhlps movhpd
movhps movlhps movlpd movlps movmskpd
movmskps movntdq movnti movntpd movntps
movntq movq movq2dq movs movsb
movsd movss movsw movsx movupd
movups movzx mul mulpd mulps
mulsd mulss neg nop not
or orpd orps out outs
outsb outsd outsw packssdw packsswb
packuswb paddb paddd paddq paddsb
paddsw paddusb paddusw paddw pand
pandn pavgb pavgw pcmpeqb pcmpeqd
pcmpeqw pcmpgtb pcmpgtd pcmpgtw pextrw
pinsrw pmaddwd pmaxsw pmaxub pminsw
pminub pmovmskb pmulhuw pmulhw pmullw
pmuludq pop popa popad popf
popfd por prefetchnta prefetcht0 prefetcht1
prefetcht2 psadbw pshufd pshufhw pshuflw
pshufw pslld pslldq psllq psllw
psrad psraw psrld psrldq psrlq
psrlw psubb psubd psubq psubsb
psubsw psubusb psubusw psubw punpckhbw
punpckhdq punpckhqdq punpckhwd punpcklbw punpckldq
punpcklqdq punpcklwd push pusha pushad
pushf pushfd pxor rcl rcpps
rcpss rcr rdmsr rdpmc rdtsc
rep repe repne repnz repz
ret retf rol ror rsm
rsqrtps rsqrtss sahf sal sar
sbb scas scasb scasd scasw
seta setae setb setbe setc
sete setg setge setl setle
setna setnae setnb setnbe setnc
setne setng setnge setnl setnle
setno setnp setns setnz seto
setp setpe setpo sets setz
sfence sgdt shl shld shr
shrd shufpd shufps sidt sldt
smsw sqrtpd sqrtps sqrtsd sqrtss
stc std sti stmxcsr stos
stosb stosd stosw str sub
subpd subps subsd subss sysenter
sysexit test ucomisd ucomiss ud2
unpckhpd unpckhps unpcklpd unpcklps verr
verw wait wbinvd wrmsr xadd
xchg xlat xlatb xor xorpd
xorps

Pentium 4 (Prescott) Opcodes Supported

addsubpd addsubps fisttp haddpd haddps
hsubpd hsubps lddqu monitor movddup
movshdup movsldup mwait

AMD Opcodes

pavgusb pf2id pfacc pfadd pfcmpeq
pfcmpge pfcmpgt pfmax pfmin pfmul
pfnacc pfpnacc pfrcp pfrcpit1 pfrcpit2
pfrsqit1 pfrsqrt pfsub pfsubr pi2fd
pmulhrw pswapd