The C compiler (gcc) of the
GNU project
is in widespread use and
can be ported to different processors by writing a machine description [Sta93]. This compiler has been used as basis for a C compiler for
Fast.
However, it did not suffice to write a machine description,
some source files of the compiler had to be modified as well.
gcc uses RTL
as intermediate language,
in which are available three pointers to manipulate a stack:
the stack-pointer is used to push arguments for the next function call
on the stack,
the argument-pointer is used to access the parameters referred to in
the current function invocation,
and
the frame-pointer is used to address local variables.
To accommodate the runtime environment,
we also need means to
store return addresses
and to save register variables which must be kept alive across function calls.
This is usually done in a function call/return
and a function prologue/epilogue instruction sequence, respectively.
Before we explain
how this runtime environment is implemented on Fast,
we have to deal with
some problems with local variables (esp. arrays)
held on a stack.
The stacks of Fast are not those found in conventional runtime systems,
but resemble a conventional register file.
A stack element can not be accessed with a variable index;
the index must be specified at compile time
.
This is analogous to register indices which must be fixed at compile time.
Therefore, we can not perform a computed access to an array element (e.g., a[i]),
if the array is mapped onto a stack.
A related problem is caused by the address operator of C (e.g., &),
since the address of a stack element cannot be computed either.
In Fast these problems are even more complicated
by accesses to an element in the unsafe region.
These problems also prevent an efficient implementation of languages
on Fast which allow for nested function definitions,
e.g.; Pascal [JW78].
Implementations of the runtime environment for these languages
have their activation records chained up by static links
to keep track of nesting levels.
Due to the problems listed above,
such a runtime environment can not be efficiently implemented
on the stack system of Fast.
To avoid these problems for the implementation of C,
an conventional stack is placed in main memory for local variables.
The optimizer of gcc takes care of mapping scalar variables
into general purpose registers.
To make efficient use of all stacks,
a similar parameter passing mechanism as that of
-Red
has
been implemented for C as well.
The four stacks of Fast will be used as follows:
A function call is implemented as follows: The parameters for the function are pushed onto stack A. Before the function is called, a stack switch is performed, and then the function can access its parameters on stack I (which is the previous stack A). If a tail call is performed, the stack switch will be omitted and the parameters are placed directly on stack I.
Another problem we have to deal with is that gcc assumes a register-oriented processor, i.e., it will make use of (general purpose) registers wherever possible. The optimizer tries to avoid stack since they are considered memory accesses. If a stack element is used more than once, it will be copied to a register prior to the first use. If the value must be preserved across function calls, this register is saved in a function prologue on stack H, and restored in a function epilogue. However, this is not intended for the stacks of Fast. The generated code should work directly with the values on the stack instead of copying them to registers. This is accomplished by a trick. In the machine description, another 32 registers are introduced, which are reserved for parameter passing. With these registers, gcc can do all standard optimizations which are not available to stack elements. In the code generation stage, these registers are translated into stack accesses. Doing this optimization for the stack switching version is simple, since there is a one-to-one mapping between register numbers and stack indices. If only one stack is used for parameter passing, the indices change during computation, thus making it difficult to keep track of them. Moreover, this trick might even fail if there are too many parameters. Since the new parameters are pushed on the same stack, the actual arguments can be moved into the swap-area due to a stack overflow. In this case, the compilation will be aborted with an error message. Specifying the type qualifier register for some of the parameters allows for a compilation of the program. We will take a small example to explain the parameter passing mechanism based on this implementation.
void f(int i, int j)
{ g(j, i); ... }
void g(int n, int m)
{ ... }
Function f() is called with two arguments, which are expected on stack I (see figure 1 (a)). These arguments i and j are read from stack I and pushed onto stack A in reverse order (see figure 1 (b)).
Figure 1: Implementation of a Function Call
Thereafter, stacks A and I are exchanged and g() is called, which establishes the situation given in figure 1 (c). The function g() can access its arguments on stack I. Right before returning from g(), the situation for f() will be re-established by switching the stacks again, thus reaching the situation as in figure 1 (b). At the end, function f() removes its arguments from stack I, switches stacks, and returns to its caller. for Fast can be controlled by many switches, which select the parameter passing model, e.g.; -mswitch: use version which switches stacks. For the performance evaluations discussed in the next section, this parameter passing model has been selected since it offers on average best performance for all programs we have tested so far [Hut94].
Since Fast does not directly support bytes or shorts but only words, the C scalar types (char, short, int, long) map all to the same internal size: 32 Bits. This causes some problems with programs which make assumptions about the internal size of the data types. However, this implementation is in compliance with ANSI C [PB92].