[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: CLisp design and implementation details?



Dwight Hughes <dwighth@intellinet.com> writes:
> Is there any documentation available on the internal design of CLisp?
> I am not referring to info like the implementation notes that are part
> of the binary release - what I want is a general overview of the
> design and why it was done that way (type of gc, memory organization,
> stacks, tagging, 32bit versus 64bit mode operation, ...).

There is no such closed documentation. You will, however, find some
information about the techical details in the source comments (partly
in German, partly in English). There is a comment about the memory
management at spvw.d:550. For other kinds of details, please ask me
by email.

Another, more political, answer to your question "why was it done that way"
is contained in the public announcement I made on comp.lang.lisp more
than 4 years ago. I've attached this announcement, for your amusement.

                Bruno


--------------------------------------------------------------------------
From: haible@ma2s2.uucp (Bruno Haible)
Newsgroups: comp.lang.lisp
Subject: ANNOUNCE: CLISP Common Lisp
Date: 10 Mar 1993 09:06:13 GMT
Organization: University of Karlsruhe, Germany
Lines: 104
Sender: <haible@ma2s2.mathematik.uni-karlsruhe.de>
Message-ID: <1nkb25$ies@nz12.rz.uni-karlsruhe.de>
NNTP-Posting-Host: ma2s2.mathematik.uni-karlsruhe.de
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Summary: Free, portable Common Lisp available
Keywords: Common Lisp, CLISP, implementation, computer algebra

We announce the public availablity of the Common Lisp implementation CLISP.

CLISP is mostly CLtL1 compliant. CLOS programming is supported through the
use of PCL.


Availability:

CLISP is freeware and can be distributed under the terms of the GNU GPL.
The newest versions will always be available via anonymous ftp from
ma2s2.mathematik.uni-karlsruhe.de [129.13.115.2], directory /pub/lisp/clisp/.


Design:

CLISP was designed along the following main goals:

  1* full implementation of CLtL1,

  2* requires very few memory,

  3* a portable implementation,

  4* suitable for computations occurring in mathematics,

  5* reasonable speed (under the constraints 1* to 4*).

Ad 3*: To be portable, it was written in C, more precisely in a language that
is preprocessed to either K&R C or ANSI C. Portability among different Unix
platforms is achieved through the use of GNU Autoconf.

Ad 2* and 1*: CLISP requires only 1.5 MB RAM on most machines, only 1 MB
on Atari ST and DOS. This was possible because we compile Lisp source to a
very space efficient byte code. 

Ad 5*: Compiled programs run about 5 times faster than interpreted programs.

Ad 1*: We felt that supporting parts of Common Lisp with high efficiency
(e.g. CAR and CDR) and others with low performance (e.g. multiple values,
multi-dimensional arrays, catch & throw or subtypep) is bad practice.
We therefore tried to optimize _everything_ to a reasonable amount. 542 of
the 594 library functions have been written in C.

Ad 4*: CLISP has not only the mandatory bignums, but also long floats that
deserve that name: the floating point precision is virtually unlimited. For
efficiency, the kernel of the bignum/longfloat arithmetic subroutines has
been written in assembly language for several processors.

Ad 3*: CLISP does not depend on a floating point coprocessor. To avoid
problems with possibly buggy C floating point libraries it uses its own soft
float library.


Distinguished features:

* The debugger allows the user
  - to return prescribed values from and
  - to restart the execution of
  any activation record corresponding to the evaluation of an interpreted
  form or the application of an interpreted function.

* The CASE macro is compiled to a hash table lookup.

* Compiled code can be mixed freely with interpreted forms. Use
  (LOCALLY (DECLARE (COMPILE)) ... code to be compiled ...)

* The floating point printing routine is free of rounding errors, yet still
  fast because the bulk of the conversion to decimal notation is done using
  a low-level bignum routine.


The bytecode compiler:

The bytecode defines a stack-based virtual machine. The "top of stack" is
not on the stack, however, as it may consist of multiple values: it is split
into a value count and a first value (which reside in some global processor
registers when possible). The remaining values are stored in a global array.

To reduce the relative cost of bytecode interpretation and to meet goal 2*,
the bytecode instruction set has been optimized to reduce the number of
instructions needed for a particular program. There are complex instructions
for accessing closed over variables, for dynamic binding, creating closures,
multiple value handling, catch & throw, unwind-protect.

No frame pointer is needed. Local variables are accessed directly off the
stack.

The compiler is split into two passes. The first pass does macro expansion,
argument count checking, maintains scoping information, and produces a tree
with one node for every form. The second pass flattens the tree and does
peephole optimizations.
This produces good code without compilation speed being an issue.
The drawback of this approach is that some optimizations come too late for
other optimizations to occur. Actually, some optimizations (dead code
elimination for example) need to be done twice: once in the first pass
(when compiling IF and COND) and once in the second pass (when analyzing JMP
instructions).


Thanks for your attention.

Bruno Haible <haible@ma2s2.mathematik.uni-karlsruhe.de>
Michael Stoll <michael@rhein.iam.uni-bonn.de>

--------------------------------------------------------------------------