Introduction



next up previous
Next: The Architecture of Up: Comparing the Performance of Previous: Comparing the Performance of

Introduction

The reduction system -Red [GKK92] developed at the University of Kiel truly performs high-level program transformations governed by the rewrite rules of a full-fledged applied -calculus. Programs of the untyped high-level reduction language KiR may be reduced either partially or to normal forms. Conceptually, program execution takes place entirely within the space of programs. Intermediate programs may be made visible to the user in KiR notation. -Red comprises an abstract stack machine ASM for compiled graph reduction and a compilation scheme for the transformation of KiR programs into abstract machine code. At runtime, ASM performs supercombinator reductions under an applicative order regime, switching to interpretation iff potential situations of name clashes have to be resolved by full -reductions.

The implementation of conventional languages and compiled graph reduction poses similar needs on processor architectures [HE+86]. However, there are some aspects that must be taken into consideration for the efficient implementation of functional languages [Kie85], some of which are also important for the implementation of other languages:

The most important issue is the architectural support for highly efficient subroutine calls and parameter passing mechanisms. In contrast to conventional implementations which use a single runtime stack for recursive function invocations, -Red features four stacks to accommodate the runtime environment. One stack each is to

Multiple stacks allow for a simpler compilation scheme than just a single stack and also facilitate re-compilation of intermediate states of program execution into high-level KiR programs, as required by the reduction concept of -Red. In contrast to this, compiling to conventional register machines alienates the internal states and at least complicates re-compilation. Moreover, multiple stacks increase the opportunities for some optimizations which are not possible with a single stack. An important feature of -Red is that the argument stack and the workspace stack are interchanged between function calls. The workspace frame of a calling function (or at least its upper part) can thus directly be taken as the argument frame of the called function. This enables the compiler to generate code which releases as early as possible and right from the top of the argument frame entries that are no longer needed, which might save considerable stack space. Upon termination of a function call, the stack constellation of the calling function can be re-installed by again switching argument and workspace stacks, and by moving the result of the called function to the new working stack.

ASM can not be compiled to conventional computers with reasonable efficiency, since modern RISC processors do not provide the facilities to directly support its stack concept. We have therefore developed a tailor-made RISC processor architecture with a unique stack system. It comprises four processor-internal stacks which can be readily extended into main memory. Two of the stacks can be flipped by inverting an internal flag. The stacks are the main part of the architecture. They combine the advantages of pure stacks with the flexibility of variable sized register windows. Other support for -Red is available for runtime type-checking by means of tags which is to validate operand types simultaneously with the execution of primitive functions.

Fast is a general purpose 40-bit RISC processor with a Harvard architecture. A versatile but small instruction set with register-to-register, load/store, subroutine call and conditional branch instructions supports efficient implementations of functional languages. However, implementations of conventional languages may benefit from this stack architecture as well. To demonstrate this, we compare on this architecture the performance of programs written in our functional language KiR and in C. C has been chosen for two reasons: it is in widespread use and hence allows for a comparison of the performance of Fast with other RISC processorsgif. Moreover, it does not have nested function definitions, and in this respect resembles the concept of supercombinators insofar, as the parameter passing mechanisms are very similar: both require only a single argument frame to accommodate the environment of a function call.

In the following, we will give an overview over the main features of the processor architecture, then we will explain the C compiler, and last we will compare the performance of a representative sample of programs written in KiR and C. The compilation of KiR to ASM code is specified in [GKK92] and will not be explained here.



next up previous
Next: The Architecture of Up: Comparing the Performance of Previous: Comparing the Performance of



ca@informatik.uni-kiel.de