Byte Code Compiler

Next: Packages Up: Additions Since First Release Previous: Additions Since First Release

Byte Code Compiler

A byte code compiler translates a complex high-level language like Lisp into a very simple language that can be interpreted by a very fast byte code interpreter, or virtual machine. The internal representation of this simple language is a string of bytes, hence the name byte code. The compilation process eliminates a number of costly operations the interpreter has to perform, including variable lookup and setting up exits for nonlocal transfers of control. It also performs some inlining and makes the language tail-recursive. This means that tail calls are compiled as jumps and therefore iterations can be implemented using recursion.

As a simple example, consider a function to add up the elements of a list.

(defun f (x)
  (let ((s 0))
    (dolist (y x s)
      (setf s (+ s y)))))

Applying the interpreted function to a list of 1000 numbers produces

> (def x (iseq 1 1000))
X
> (time (dotimes (i 10) (f x)))
The evaluation took 2.10 seconds

With compilation we get

> (compile 'f)
F
> (time (dotimes (i 10) (f x)))
The evaluation took 0.20 seconds

This is an improvement of a factor of 10 in speed. Optimized native C code using a native C array of integers can be another 100 times faster, so there is room for further improvement, but the byte code compilation has made a substantial difference. There are examples where byte code compilation improves performance even more, but typically improvements are a bit more modest, on the order of a factor of two to five.

In any high-level language for scientific and statistical computing there will always be a need to implement some code in a lower level language like C to obtain acceptable performance. But developing code in the higher language itself is usually much faster and the resulting code is typically easier to understand and maintain. It is therefore advantageous to move the point where performance demands use of a low level language as far down as possible. Byte code compilation helps significantly in this respect.

It is possible to translate the byte codes produced by the compiler into C code, which can then be compiled with a native C compiler and linked into the system dynamically or statically. The naive approach does indeed produce additional speedup of about a factor of two by eliminating the byte code interpretation component, but there is a significant cost: the resulting object code is huge. As a result this option has not been pursued. However a more sophisticated approach based on a higher level intermediate representation appears to be promising and will be considered as part of the development of a new virtual machine discussed in Section 4.1.

Next: Packages Up: Additions Since First Release Previous: Additions Since First Release

Luke Tierney
5/27/1998