Generating
x86 code
The x86 instruction set has evolved organically
over the decades, so it is quite complex. One goal of this document
is to explain a small of instructions that we need in this course.
Another goal is to help you make the transition from generating SaM
code to generating x86 code.
Resources:
- This is an excellent high-level introduction
to the x86 ISA: http://www.cs.virginia.edu/~evans/cs216/guides/x86.html
- SASM is a cross-platform IDE for developing x86 assembly
programs. Like SaM, it has an intuitive GUI and to the
extent that writing x86 assembler code can be fun, SASM makes it
fun. The Windows download is completely self-contained and I was
able to get it running in a couple of minutes. On Linux or Mac,
installation can be more complex. See the SASM website for more
information: https://dman95.github.io/SASM/english.html
- You can of course read Intel's ISA manuals (insert smiley face
here): http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
What are the main differences between the x86 and SaM ISAs?
- x86 has data registers. In this document, we use
registers eax
and ebx for
storing data values. Register esp is the stack pointer and register ebp is the frame
base register. All these are 32-bit registers. The
picture below (from (1)) is useful;
ignore the sub-fields of some of the registers such as eax.
- The x86 stack grows towards smaller addresses.
In particular, the stack pointer esp will decrease when a value is
pushed on the stack.
-
x86 is a 2-address ISA. Arithmetic
instructions like add
or sub
require two operands, and the result overwrites the first
operand. For example, the instruction add ebx, eax
adds the contents of registers ebx and eax, and stores the result in register ebx.
-
For some instructions, one of operands
can be in memory. We will use the register-indirect
and register-indirect-with-offset addressing
modes. For example add eax, [ebp] reads the contents of
memory at the address pointed to by ebp, and adds the value to
register eax.
This is different from add eax, ebp, which adds the contents
of ebp to
eax. The
instruction add
eax, [ebp+8] performs a similar operation but with
the contents of memory at address (ebp+8).
-
The second operand can be a constant; for
example, add esp,
4 adds 4 to the stack pointer esp.
- The x86 ISA supports operations on data values of
different lengths, including bytes (8 bits), words (16
bits), and double words (32 bits). To avoid complexity, we
restrict ourselves to 32-bit integers and addresses.
- x86 has 32-bit addresses but each address refers to a
byte in the address space. Even if we restrict
ourselves to 32-bit integers and addresses, we must be aware
that addresses refer to bytes and not to words or double
words.
- For example, to get rid of a 32-bit integer from the top
of stack, you must write add esp, 4 and not add esp, 1
- Some instructions need to be told explicitly the length of
the operands. For example, mov [esp], 2 is ambiguous: do you want
to update a byte, word or double word on top of stack? If
you use SASM, you have to write mov dword [esp], 2 to
specify that the intended length of the operation is a
double word. On the other hand, mov eax, 2 works fine because
the assembler knows that eax is a 32-bit register so it deduces
that the intended length is a double word. SASM does
not care if you specify dword where it is not needed, so you
can write mov
dword eax, 2 if you want.
- x86 has condition code registers. When an arithmetic
operation is performed, bits in a special register called the
condition code register (CCR) are set depending on the result
of the operation. Conditional jumps refer to these bits to
determine which way to jump. For example, one bit in the CCR
is called the zero (Z) bit, and it is set if the result of an
instruction such as addition or subtraction is zero;
similarly, the negative (N) bit is set if the result is
negative. Conditional jump instructions refer to these bits:
for example, jz foo
is jump-if-zero, and takes the branch if the Z is set. Some
instructions do not change the CCR. See (1)
for a more detailed discussion.
x86 assemblers
One advantage of SaM
is that there is no SaM but SaM and Pingali is its prophet. For
x86 assembly code on the other hand, there are a bewildering
number of different assemblers, and you will see acronyms like
NASM (network assembler), MASM (Microsoft assembler), GAS (GNU
assembler), AT&T syntax, Intel syntax etc. Each one is
different and assembly programs produced for one assembler will
usually not work with other assemblers. This means that x86 code
you get from the Internet may not work with your assembler.
Another issue is system calls: assembly programs that make Windows
systems calls for, say, printing values will not work on Linux
because system calls are different on the two operating systems. A
final issue is linking with routines in libraries like libc: if
you want to link to these routines, you must use the standard
protocol for calling these routines.
In this course, we will use
the NASM
syntax, which is supported by the SASM IDE. It is simpler than the
others. The documentation for SASM says that it supports MASM and
other formats but you should not use these since it complicates
grading. One advantage of SASM is that it has its own routines
(macros) for I/O, and these are translated by the SASM assembler
into the appropriate system calls for whatever platform you are
generating code for. This means you do not need to worry about
system calls at least for I/O, and you get a level of portability
that is convenient.
Here is a simple SASM
assembly file. The program is in the .text section. The entry point into
your code must be labeled CMAIN, and it must be declared to be a global. The code
calls a SASM print routine to print the 32-bit (4 byte) integer
666, and then returns.
%include "io.inc"
section .text
global CMAIN
CMAIN:
push ebp ; set up
frame base register
mov ebp, esp
PRINT_DEC 4, 666
pop ebp ; restore frame base register and return
ret
More complicated programs will have a .global section
where global variables like strings are allocated. See the file
for factorial in SASM.
Generating x86 code from Bali
One way to generate x86 code
from Bali programs is to generate SaM code and then translate each
SaM instruction into small sequences of x86 instructions. For
example, the SaM instruction ADD can be implemented by the
following x86 sequence:
pop ebx
pop eax
add eax, ebx
push eax
Similarly, the LINK instruction in SaM can be
implemented by the sequence
push ebp
mov ebp, esp
This is essentially what
binary translators and just-in-time (JIT) compilers do. I haven't
worked out the details so I don't know if there are any hidden
gotchas with this approach. Since the stacks in SaM and x86 grow
in opposite directions, there may be some subtle issues with stack
manipulation.
A second way to generate x86 code is to
retarget your Bali compiler so that it produces x86 code
directly. Here are the key points to keep in mind.
- Stack frames: we will use a stack frame that mimics the
standard stack frame used on x86 for C programs. The picture
below shows the stack frame organization.
-
The only difference from the standard C
stack frame is that the parameters are pushed in
left-to-right order, as in SaM, rather than the
standard right-to-left order (see the picture in
(1)).
The right-to-left order is a weird legacy thing that we can
ignore.
- There is no return value slot in the frame because register
eax is used
to return a value from a method invocation.
-
Notice that the order of the saved FBR (ebp) and return
address is the reverse of what it is in SaM. In particular,
this means that saving the FBR and updating the FBR to point
to the new frame, which was achieved by the LINK
instruction in SaM, must be done in the callee, not the
caller, using the instruction sequence discussed above.
- x86 has a CALL
instruction, which is like JSR in SaM. It pushes the return address
and jumps to the specified address. There is a matching RET instruction
which pops an address from the stack and jumps to it.
Use them.
- Since method invocations return values in eax, it may be
convenient to require all expressions to return their value in
eax rather
than push the value on the stack as we did in SaM. This means
that the Bali expression "2" would be compiled to the
instruction mov
eax, 2. The expression "x" would be compiled into
something like mov
eax, [ebp+8], depending on the offset of x from the
frame base register. The expression "e1-e2" would be
compiled to the following code:
code for e1 ; value left in
eax
push eax ; save value on stack since e2 will overwrite this
register
code for e2
pop ebx; ebx now has the value of e1
sub ebx, eax
mov eax, ebx
- For commutative operations like +, you can avoid the last mov by generating
the code add eax, ebx
for example. This approach has the virtue of
simplicity, but you may be able to get more compact code by
exploiting some of the memory addressing modes of
x86. Whether that code runs faster is another story. Feel free
to experiment.
- As mentioned above, conditional jumps use the flags in the
condition code register. The CMP instruction is useful for setting
these flags; for example, CMP eax, ebx subtracts the value of
ebx from the
value in eax,
and sets the flags based on the result. It does not change the
value in eax.
Of course arithmetic operations like ADD and SUB also set
these flags.