Suppose we write a C program as two files p1.c and p2.c. We can then compile this code on an IA32 machine using a Unix command line:
unix> gcc -O1 -o p p1.c p2.c
The command gcc indicates the gcc C compiler. Since this is the default compiler on Linux, we could also invoke it as simply cc. The command-line option -O1 instructs the compiler to apply level-one optimizations. In general, increasing the level of optimization makes the final program run faster, but at a risk of increased compilation time and difficulties running debugging tools on the code. As we will also see, invoking higher levels of optimization can generate code that is so heavily transformed that the relationship between the generated machine code and the original source code is difficult to understand. We will therefore use level-one optimization as a learning tool and then see what happens as we increase the level of optimization. In practice, level-two optimization (specified with the option -O2) is considered a better choice in terms of the resulting program performance.
The gcc command actually invokes a sequence of programs to turn the source code into executable code. First, the C preprocessor expands the source code to include any files specified with #include commands and to expand any macros specified with #define declarations. Second, the compiler generates assembly-code versions of the two source files, named p1.s and p2.s. Next, the assembler converts the assembly code into binary object-code files p1.o and p2.o. Object code is one form of machine code: it contains binary representations of all of the instructions, but the addresses of global values are not yet filled in. Finally, the linker merges these two object-code files along with code implementing library functions (e.g., printf) and generates the final executable code file p. Executable code is the second form of machine code we will consider: it is the exact form of code that is executed by the processor.
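As a small illustration of the preprocessing step, consider the sketch below. The macro SQUARE and the function area are hypothetical names chosen for this example; they are not part of p1.c or p2.c. Running gcc with the -E option on such a file stops after preprocessing and prints the expanded source, in which the macro use has been replaced by its definition.

```c
/* Hypothetical example file; SQUARE and area are illustrative names. */
#define SQUARE(x) ((x) * (x))   /* macro expanded by the preprocessor */

/* After preprocessing, the return statement reads:
 *     return ((side) * (side));
 * The compiler proper never sees the name SQUARE. */
int area(int side) {
    return SQUARE(side);
}
```

A call such as area(3) evaluates ((3) * (3)); the parentheses in the macro definition keep the expansion correct when the argument is itself an expression.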
Machine-Level Code
As described in Section 1.9.2, computer systems employ several different forms of abstraction, hiding details of an implementation through the use of a simpler, abstract model. Two of these are especially important for machine-level programming. First, the format and behavior of a machine-level program are defined by the instruction set architecture, or “ISA,” which defines the processor state, the format of the instructions, and the effect each of these instructions will have on the state. Most ISAs, including IA32 and x86-64, describe the behavior of a program as if each instruction is executed in sequence, with one instruction completing before the next one begins. The processor hardware is far more elaborate, executing many instructions concurrently, but it employs safeguards to ensure that the overall behavior matches the sequential operation dictated by the ISA. Second, the memory addresses used by a machine-level program are virtual addresses, providing a memory model that appears to be a very large byte array. The actual implementation of the memory system involves a combination of multiple hardware memories and operating system software, as described in Chapter 9.
The compiler does most of the work in the overall compilation sequence, transforming programs expressed in the relatively abstract execution model provided by C into the very elementary instructions that the processor executes. The assembly-code representation is very close to machine code. Its main feature is that it is in a more readable textual format, as compared to the binary format of machine code. Being able to understand assembly code and how it relates to the original C code is a key step in understanding how computers execute programs.
IA32 machine code differs greatly from the original C code. Parts of the processor state are visible that normally are hidden from the C programmer:
- The program counter (commonly referred to as the “PC,” and called %eip in IA32) indicates the address in memory of the next instruction to be executed.
- The integer register file contains eight named locations storing 32-bit values. These registers can hold addresses (corresponding to C pointers) or integer data. Some registers are used to keep track of critical parts of the program state, while others are used to hold temporary data, such as the local variables of a procedure, and the value to be returned by a function.
- The condition code registers hold status information about the most recently executed arithmetic or logical instruction. These are used to implement conditional changes in the control or data flow, such as are required to implement if and while statements.
- A set of floating-point registers store floating-point data.
Whereas C provides a model in which objects of different data types can be declared and allocated in memory, machine code views the memory as simply a large, byte-addressable array. Aggregate data types in C such as arrays and structures are represented in machine code as contiguous collections of bytes. Even for scalar data types, assembly code makes no distinctions between signed and unsigned integers, between different types of pointers, or even between pointers and integers.
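The point about signed and unsigned integers can be checked from C. The following is a minimal sketch, assuming a two's-complement machine such as IA32 and inputs whose signed sum does not overflow. The function names are hypothetical. It adds the same two bit patterns once as int and once as unsigned, then compares the raw bytes of the two results.

```c
#include <string.h>

/* Hypothetical helpers: on IA32 a compiler would typically use the same
 * add instruction for both of these functions. */
int add_signed(int x, int y) { return x + y; }
unsigned add_unsigned(unsigned x, unsigned y) { return x + y; }

/* Returns 1 if the signed and unsigned sums have identical bytes.
 * Assumes two's complement and no signed overflow in x + y. */
int same_bits(int x, int y) {
    int s = add_signed(x, y);
    unsigned u = add_unsigned((unsigned) x, (unsigned) y);
    return memcmp(&s, &u, sizeof s) == 0;
}
```

Under these assumptions same_bits returns 1 even for negative arguments, which is consistent with the machine-level view that only the bit pattern matters, not the declared C type.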
The program memory contains the executable machine code for the program, some information required by the operating system, a run-time stack for managing procedure calls and returns, and blocks of memory allocated by the user (for example, by using the malloc library function). As mentioned earlier, the program memory is addressed using virtual addresses. At any given time, only limited subranges of virtual addresses are considered valid. For example, although the 32-bit addresses of IA32 potentially span a 4-gigabyte range of address values, a typical program will only have access to a few megabytes. The operating system manages this virtual address space, translating virtual addresses into the physical addresses of values in the actual processor memory.
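To make the different parts of program memory concrete, the sketch below records the virtual addresses of a global variable, a local variable on the run-time stack, and a block obtained from malloc. The struct and function names are hypothetical. The actual numeric values, and their relative order, depend on the compiler and operating system, so the code makes no assumption about them.

```c
#include <stdint.h>
#include <stdlib.h>

int global_counter = 0;            /* lives in the program's global data */

/* Hypothetical record of three sampled virtual addresses. */
struct region_sample {
    uintptr_t global_addr;         /* address of a global variable */
    uintptr_t stack_addr;          /* address of a local (stack) variable */
    uintptr_t heap_addr;           /* address of a malloc'ed block */
};

/* Fills *out with sample addresses. Returns 0 on success, -1 if malloc fails. */
int sample_regions(struct region_sample *out) {
    int local = 0;                          /* allocated on the run-time stack */
    int *heap = malloc(sizeof *heap);       /* allocated by the user */
    if (heap == NULL)
        return -1;
    out->global_addr = (uintptr_t) &global_counter;
    out->stack_addr = (uintptr_t) &local;
    out->heap_addr = (uintptr_t) heap;
    free(heap);
    return 0;
}
```

Printing the three fields in hexadecimal on a given system typically shows them falling in widely separated subranges of the address space.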
A single machine instruction performs only a very elementary operation. For example, it might add two numbers stored in registers, transfer data between memory and a register, or conditionally branch to a new instruction address. The compiler must generate sequences of such instructions to implement program constructs such as arithmetic expression evaluation, loops, or procedure calls and returns.
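One way to picture how a loop is built from elementary steps is to rewrite it in C using only tests and jumps. The sketch below is illustrative: the function names are hypothetical, and the code a compiler actually generates for a loop may be organized differently. The second function mirrors the compare, conditional branch, add, and unconditional jump pattern that machine code uses.

```c
/* Sum of 1..n written with an ordinary C loop. */
int sum_to_loop(int n) {
    int total = 0;
    int i;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* The same computation using only conditional and unconditional jumps,
 * roughly one C statement per elementary machine operation. */
int sum_to_goto(int n) {
    int total = 0;
    int i = 1;
test:
    if (i > n)          /* compare, then conditionally branch */
        goto done;
    total += i;         /* add */
    i++;                /* increment */
    goto test;          /* unconditional jump back to the test */
done:
    return total;
}
```

Both functions compute the same result for any n whose sum fits in an int; the goto version simply makes the control flow explicit.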
Code Examples
Suppose we write a C code file code.c containing the following procedure definition:
int accum = 0;

int sum(int x, int y)
{
    int t = x + y;
    accum += t;
    return t;
}
To see the assembly code generated by the C compiler, we can use the “-S” option on the command line:
unix> gcc -O1 -S code.c
This will cause gcc to run the compiler, generating an assembly file code.s, and go no further. (Normally it would then invoke the assembler to generate an object-code file.)
The assembly-code file contains various declarations including the set of lines:
sum:
    pushl %ebp
    movl %esp, %ebp
    movl 12(%ebp), %eax
    addl 8(%ebp), %eax
    addl %eax, accum
    popl %ebp
    ret
Each indented line in the above code corresponds to a single machine instruction. For example, the pushl instruction indicates that the contents of register %ebp should be pushed onto the program stack. All information about local variable names or data types has been stripped away. We still see a reference to the global variable accum, since the compiler has not yet determined where in memory this variable will be stored.
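The correspondence between the C statements and the instructions can be annotated as follows. This is one reading of the listing, assuming the usual IA32 convention in which the first argument x is at offset 8 from %ebp, the second argument y is at offset 12, and the return value is left in %eax.

```c
int accum = 0;

int sum(int x, int y)
{
    /* pushl %ebp ; movl %esp, %ebp    set up the stack frame          */
    int t = x + y;  /* movl 12(%ebp), %eax   load y into %eax          */
                    /* addl 8(%ebp), %eax    add x, so %eax holds t    */
    accum += t;     /* addl %eax, accum      add t to the global accum */
    return t;       /* popl %ebp ; ret       t is returned in %eax     */
}
```

Note that the local variable t never appears by name in the assembly code; it exists only as the value held in register %eax.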
If we use the ‘-c’ command-line option, gcc will both compile and assemble the code:
unix> gcc -O1 -c code.c
This will generate an object-code file code.o that is in binary format and hence cannot be viewed directly. Embedded within the 800 bytes of the file code.o is a 17-byte sequence having the following hexadecimal representation:
55 89 e5 8b 45 0c 03 45 08 01 05 00 00 00 00 5d c3
This is the object code corresponding to the assembly instructions listed above. A key lesson to learn from this is that the program actually executed by the machine is simply a sequence of bytes encoding a series of instructions. The machine has very little information about the source code from which these instructions were generated.
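From a C program's point of view, such a byte sequence is just an array of unsigned char. The sketch below stores the 17 bytes quoted above and formats them as a line of hexadecimal text. The array name sum_code and the helper format_bytes are hypothetical names introduced for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* The 17 bytes quoted in the text as the encoding of sum. */
static const unsigned char sum_code[17] = {
    0x55, 0x89, 0xe5, 0x8b, 0x45, 0x0c, 0x03, 0x45, 0x08,
    0x01, 0x05, 0x00, 0x00, 0x00, 0x00, 0x5d, 0xc3
};

/* Hypothetical helper: writes len bytes as space-separated two-digit hex
 * into out, which must hold at least 3 * len + 1 characters.
 * Returns the length of the resulting string. */
size_t format_bytes(const unsigned char *bytes, size_t len, char *out) {
    size_t i, pos = 0;
    out[0] = '\0';                       /* empty string when len is 0 */
    for (i = 0; i < len; i++)
        pos += (size_t) sprintf(out + pos, i ? " %02x" : "%02x", bytes[i]);
    return pos;
}
```

Calling format_bytes(sum_code, 17, buf) with a sufficiently large buf reproduces the hexadecimal line shown above.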
To inspect the contents of machine-code files, a class of programs known as disassemblers can be invaluable. These programs generate a format similar to assembly code from the machine code. With Linux systems, the program objdump (for “object dump”) can serve this role given the ‘-d’ command-line flag:
unix> objdump -d code.o
The result is as follows (we have added annotations describing the listing and its columns):
Disassembly of function sum in binary file code.o
00000000 <sum>:
Offset  Bytes               Equivalent assembly language
   0:   55                  push   %ebp
   1:   89 e5               mov    %esp,%ebp
   3:   8b 45 0c            mov    0xc(%ebp),%eax
   6:   03 45 08            add    0x8(%ebp),%eax
   9:   01 05 00 00 00 00   add    %eax,0x0
   f:   5d                  pop    %ebp
  10:   c3                  ret
On the left, we see the 17 hexadecimal byte values listed in the byte sequence earlier, partitioned into groups of 1 to 6 bytes each. Each of these groups is a single instruction, with the assembly-language equivalent shown on the right.
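The offsets in the disassembly follow directly from the instruction lengths: each instruction starts where the previous one ends. The sketch below recomputes them from the lengths read off the listing. The names insn_len and insn_offsets are hypothetical, introduced for this illustration.

```c
/* Instruction lengths in bytes, read off the disassembly of sum. */
static const int insn_len[7] = { 1, 2, 3, 3, 6, 1, 1 };

/* Fills offsets[i] with the starting offset of instruction i and
 * returns the total number of bytes covered by all count instructions. */
int insn_offsets(const int *len, int count, int *offsets) {
    int i, pos = 0;
    for (i = 0; i < count; i++) {
        offsets[i] = pos;   /* instruction i begins at the running total */
        pos += len[i];      /* advance past its bytes */
    }
    return pos;
}
```

Applied to insn_len, this yields the offsets 0, 1, 3, 6, 9, 0xf, and 0x10 and a total of 17 bytes, matching the listing.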
Several features about machine code and its disassembled representation are worth noting: