Compiler section (4.2)
This section was written by Sandra and reviewed by RPG. Any other comments
before I begin working on it?
kathy
[rpg: Comments look like this.]
Introduction
============
The compiler is a utility that translates programs into an
implementation-dependent form that can be represented and/or executed
more efficiently. The nature of the processing performed during
compilation is discussed in the "Compilation Semantics" section below.
This is followed by a discussion of the behavior of COMPILE-FILE and
the interface between COMPILE-FILE and LOAD.
[rpg:
The compiler is a utility that may translate programs into an
implementation-dependent form that might be represented or executed
more efficiently. The nature of the processing performed during
compilation is discussed in the "Compilation Semantics" section below.
This is followed by a discussion of the behavior of COMPILE-FILE and
the interface between COMPILE-FILE and LOAD.
]
% References:
% CLtL page 143 (next to last paragraph)
% CLtL page 321 (second paragraph)
[rpg: CLtL page 438]
The functions COMPILE and COMPILE-FILE are used to explicitly force
[rpg: yuck] compilation to take place. It is permissible for
conforming implementations to also perform implicit compilation during
ordinary evaluation. While the evaluator is typically implemented as
an interpreter that traverses the given form recursively, performing
each step of the computation as it goes, a permissible alternate
approach is for the evaluator first to completely compile the form
into machine-executable code and then invoke the resulting code.
Various mixed strategies are also possible. All of these approaches
should produce the same results when executing a correct program, but
may produce different results for incorrect programs.
[rpg: This should say that execution of programs can be accomplished
by a variety of means ranging from direct interpretation of the list
structure representing a program through compilation to machine code,
and that the designer of an interpreter (evaluator?) can select any
of these strategies, and the designer of the compiler should select
any strategy that generally results in code that is no slower or no
bigger and which satisfies the constraints just below.]
% This paragraph should really conclude with a stronger statement that
% conforming programs must be structured so they will work if implicit
% compilation does take place, but CLtL doesn't come right out and say
% that, and we have never voted on any issue to say that either.
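As a rough illustration of the difference between implicit and explicit compilation, the status of a function can be inspected before and after a call to COMPILE. This is only a sketch: ADD-ONE and the two variables are hypothetical names, the first result is implementation-dependent, and the second relies on the later ANSI requirement that COMPILE produce a compiled function.

```lisp
;; ADD-ONE, *COMPILED-BEFORE* and *COMPILED-AFTER* are hypothetical names
;; used only for this illustration.
(defun add-one (x)
  (+ x 1))

;; Implementation-dependent: true where the evaluator compiles implicitly,
;; false where DEFUN leaves an interpreted definition in place.
(defvar *compiled-before* (compiled-function-p #'add-one))

;; Explicitly request compilation of the global definition.
(compile 'add-one)

;; After COMPILE the definition is expected to be a compiled function.
(defvar *compiled-after* (compiled-function-p #'add-one))
```

A correct program should behave the same way whichever value *COMPILED-BEFORE* happens to have.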
Compilation Semantics
=====================
% References:
% Issue COMPILE-ENVIRONMENT-CONSISTENCY [pending]
% Issue COMPILED-FUNCTION-REQUIREMENTS [pending]
% The material in this section will have to be updated to reflect further
% changes to these issues.
Conceptually, compilation can be viewed as a process which traverses a
program, performs certain kinds of syntactic and semantic analysis
using information (such as proclamations and macro definitions)
present in the compile time environment, and produces a modified
program. As a minimum, the compiler must perform the following
actions:
- All macro calls appearing lexically within the code being compiled
must be expanded at compile time and will not be expanded again at
run time. The process of compilation effectively turns MACROLET
and SYMBOL-MACROLET constructs into PROGNs, with all calls to the local
macros in the body replaced by their expansions.
The compiler must treat any form that is a list beginning with a
symbol that does not name a macro or special form as a function call.
(This implies that SETF methods must also be available at compile time.)
- The compiler must capture declarations to determine whether
variable bindings and references appearing lexically within the
code being compiled are to be treated as lexical or special. The
compiler must treat any binding of a variable that has not been
declared or proclaimed to be SPECIAL as a lexical binding.
- The compiler must process EVAL-WHEN forms that appear lexically within
the program being compiled. Effectively, the compiler must replace
the EVAL-WHEN form with either a PROGN containing the body forms, or
a constant NIL.
- The compiler must process LOAD-TIME-VALUE forms that appear lexically
within the program being compiled. In the case of COMPILE, evaluation
of the LOAD-TIME-VALUE form happens at compile time and the resulting
value is treated as a literal constant at run time. In the case of
COMPILE-FILE, the compiler must arrange for evaluation of the form
to take place at load time.
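The first requirement in the list above (macro calls are expanded at compile time and not again at run time) can be observed with a macro that counts its own expansions. This is a minimal sketch, not part of the proposal; every name in it is hypothetical.

```lisp
;; All names here are hypothetical and exist only for this illustration.
(defvar *expansion-count* 0
  "Number of times the COUNTED-CAR macro function has been called.")

(defmacro counted-car (form)
  ;; Side effect at expansion time, so expansions can be counted.
  (incf *expansion-count*)
  `(car ,form))

(defun first-element (list)
  (counted-car list))

;; After explicit compilation every macro call in the body has been expanded.
(compile 'first-element)

(defun expansions-during-calls ()
  "Call FIRST-ELEMENT twice and return how many expansions those calls caused."
  (let ((before *expansion-count*))
    (first-element '(a b))
    (first-element '(c d))
    (- *expansion-count* before)))
```

How many times the macro is expanded before or during compilation is not specified, so only the difference across the run-time calls is meaningful here.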
In addition, the compiler is permitted to incorporate the following
kinds of information into the code it produces, if the information is
present in the compile time environment and is referenced within the
code being compiled. Except where some other behavior is explicitly
stated, when the compile time and run time definitions are different,
it is unspecified which will prevail within the compiled code. It is
also permissible for implementations to signal an error at run time to
complain about the discrepancy. [rpg: Diction.] In all cases, the
absence of the information at compile time is not an error, [rpg:
terminology] but its presence may enable the compiler to generate more
efficient code.
[rpg: There is a complicated issue: Can the compiler assume that the
resulting code in a compile-file situation will be run in the same
Lisp? The same implementation? The same computer? The same type of
computer?
I suggest that we say that the semantics we discuss presumes that the
compiler can assume, when doing compile-file, that the resulting
code will be loaded into a fresh copy of the same Lisp.]
- The compiler may assume that functions that are defined and
declared or proclaimed INLINE in the compile time environment will
retain the same definitions at run time.
- The compiler may assume that, within a named function, a
recursive call to a function of the same name refers to the
same function, unless that function has been declared NOTINLINE.
[rpg: the interpreter can assume the same thing, right? That is, a
valid Common Lisp can be one in which all code is compile-filed by a
separate program and loaded and executed in the apparent Common Lisp
image.]
- COMPILE-FILE may assume that, in the absence of NOTINLINE
declarations, a call within the file being compiled to a named
function which is defined in that file refers to that function.
(This permits "block compilation" of files.) The behavior of
the program is unspecified if functions are redefined individually
at run time.
[rpg: the interpreter can assume the same thing, right?]
- The compiler may assume that the argument syntax and number of return
values for all built-in Common Lisp functions will not change. In
addition, the compiler may treat all built-in Common Lisp functions
as if they had been proclaimed INLINE.
[rpg: the interpreter can assume the same thing, right? This follows from
LISP-SYMBOL-REDEFINITION.]
- The compiler may assume that the argument syntax and number of return
values for all functions with FTYPE information available at
compile time will remain the same at run time.
% Reference: CLtL page 69
- The compiler may assume that symbolic constants that have been
defined with DEFCONSTANT in the compile time environment will retain
the same value at run time as at compile time. The compiler may replace
references to the name of the constant with the value of the constant,
provided that such "copies" are EQL to the object that is the
actual value of the constant.
% The following paragraph from issue COMPILE-ENVIRONMENT-CONSISTENCY
% seems likely to change:
- The compiler can assume that type definitions made with DEFTYPE
or DEFSTRUCT in the compile time environment will retain the same
definition in the run time environment. It may also assume that
a class defined by DEFCLASS in the compile time environment will
be defined in the run time environment in such a way as to have
the same superclasses and metaclass. [rpg: compatible metaclass?]
[rpg: This is a little curious. Is this talking only about this sort of case:
(defclass c ...)
(compile-file <something using c>)
or is it trying to cover the case of
(compile-file <...(defclass c ...) ... something using c>)
]
This implies that subtype/supertype relationships of type specifiers
will not change between compile time and run time. (Note that it is
not an error [rpg: terminology?] for an unknown type to appear in a
declaration at compile time, although it is reasonable for the compiler
to emit a warning in such a case.)
% Ref: CLtL page 153
- The compiler may assume that if type declarations are present
in the compile time environment, the corresponding variables and
functions present in the run time environment will actually be of
those types; otherwise, the run time behavior of the program is
undefined.
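A program that intends to redefine a function at run time can opt out of the inlining and self-call assumptions listed above with a NOTINLINE proclamation. The following is a sketch under that assumption; SCALE, SCALE-TWICE and the two variables are hypothetical names.

```lisp
;; SCALE and SCALE-TWICE are hypothetical names used for illustration.
;; The NOTINLINE proclamation tells the compiler not to bake the current
;; definition of SCALE into its callers.
(declaim (notinline scale))

(defun scale (x)
  (* 2 x))

(defun scale-twice (x)
  (scale (scale x)))

(compile 'scale-twice)

(defvar *before-redefinition* (scale-twice 1)) ; => 4

;; Redefine SCALE at run time; the compiled caller must see the new definition.
(defun scale (x)
  (* 3 x))

(defvar *after-redefinition* (scale-twice 1))  ; => 9
```

Without the proclamation, and in the COMPILE-FILE case where both functions are defined in the same file, the second result would be unspecified under the rules above.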
The compiler *must not* make any additional assumptions about
consistency between the compile time and run time environments. In
particular:
- The compiler may not assume that functions that are defined
in the compile time environment will retain either the
same definition or the same signature at run time, except in the
situations explicitly listed above.
- The compiler may not signal an error if it sees a call to a
function that is not defined at compile time, since that function
may be provided at run time.
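The second restriction can be illustrated by compiling a caller before its callee exists. This is only a sketch with hypothetical names; an implementation may well print a warning about the undefined function, but compilation is expected to complete.

```lisp
;; REPORT-TOTAL and COMPUTE-TOTAL are hypothetical names for illustration.
(defun report-total (items)
  ;; COMPUTE-TOTAL is not defined yet when this function is compiled.
  (compute-total items))

;; The compiler may warn here, but it may not signal an error merely
;; because COMPUTE-TOTAL is undefined at compile time.
(compile 'report-total)

;; The missing function is provided later, before the first call.
(defun compute-total (items)
  (reduce #'+ items))

(defvar *total* (report-total '(1 2 3))) ; => 6
```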
File Compilation
================
The function COMPILE-FILE performs compilation processing (described
in the previous section) on forms appearing in a file, producing an
output file which may then be loaded with LOAD.
Normally, the top-level forms appearing in a file compiled with
COMPILE-FILE are executed only when the resulting compiled file is
loaded, and not when the file is compiled. However, it often happens
that some forms in the file must be evaluated at compile time in order
for the remainder of the file to be read and compiled correctly; for
example, forms that change the values of *PACKAGE* or *READTABLE* and
macro definitions. In such cases, the distinction between processing
that is performed at compile time and processing that is performed at
load time becomes important.
The special form EVAL-WHEN can be used to give explicit control over
the time at which evaluation of a top-level form takes place, allowing
forms to be executed at compile time, load time, or both. The
behavior of this construct may be more precisely understood in terms
of a model of how COMPILE-FILE processes forms in a file to be
compiled.
Successive forms are read from the file by the file compiler [rpg:
COMPILE-FILE] using READ. These top-level forms are normally processed
in what we call `not-compile-time' mode; in this mode, the file
compiler arranges for forms to be evaluated only at load time and not
at compile time. There is one other mode, called `compile-time-too'
mode, in which forms are evaluated both at compile and load times.
[rpg: what is the file compiler? The thing that compile-file causes to
run?
Also, isn't this requirement that COMPILE-FILE use READ new? I don't
see why it's required. I suggest removing it.]
Processing of top-level forms in the file compiler works as follows:
* If the form is a macro call, it is expanded and the result is
processed as a top-level form in the same processing mode
(compile-time-too or not-compile-time).
* If the form is a PROGN form, each of its body forms is
sequentially processed as top-level forms in the same processing
mode.
* If the form is a LOCALLY, MACROLET, or SYMBOL-MACROLET,
the file compiler makes the appropriate bindings and recursively
processes the body forms as an implicit top-level PROGN with those
bindings in effect, in the same processing mode. (Note that this
implies that the lexical environment in which top-level forms are
processed is not necessarily the null lexical environment.)
* If the form is an EVAL-WHEN form, it is handled according to
the following table:
  :COMPILE-   :LOAD-      :EXECUTE   compile-time-too   Action
  TOPLEVEL    TOPLEVEL

  Yes         Yes         --         --                 Process body in compile-time-too mode
  No          Yes         Yes        Yes                Process body in compile-time-too mode
  No          Yes         Yes        No                 Process body in not-compile-time mode
  No          Yes         No         --                 Process body in not-compile-time mode
  Yes         No          --         --                 Evaluate body
  No          No          Yes        Yes                Evaluate body
  No          No          Yes        No                 do nothing
  No          No          No         --                 do nothing
"Process body" means to process the body (using the procedure
outlined in this subsection) as an implicit top-level PROGN.
"Evaluate body" means to evaluate the body forms as an implicit
PROGN in the dynamic execution context of the compiler and in the
lexical environment in which the EVAL-WHEN appears.
* Otherwise, the form is a top-level form that is not one of the
special cases. If in compile-time-too mode, the compiler first
evaluates the form and then performs normal compiler processing
on it. If in not-compile-time mode, only normal compiler
processing is performed. Any subforms are treated as non-top-level
forms.
Note that top-level forms are processed in the order in which they
textually appear in the file, and that each top-level form read by the
compiler is processed before the next is read. However, the order of
processing (including, in particular, macro expansion) of subforms
that are not top-level forms is unspecified.
EVAL-WHEN forms cause compile time evaluation only at top-level. In
non-top-level locations, both the :COMPILE-TOPLEVEL and :LOAD-TOPLEVEL
situations are ignored and only the :EXECUTE situation is considered.
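Three rows of the EVAL-WHEN table can be exercised by writing a small file, compiling it, and loading the result while recording when each body runs. This is a minimal sketch, not a test suite: *EVENTS*, RUN-EVAL-WHEN-DEMO and the file name are hypothetical, the current directory is assumed to be writable, and *PACKAGE* is assumed not to change between writing and compiling the file.

```lisp
;; *EVENTS*, RUN-EVAL-WHEN-DEMO and the file name are hypothetical.
(defvar *events* '()
  "Log of the situations in which an EVAL-WHEN body was actually run.")

(defun run-eval-when-demo (&optional (source "eval-when-demo.lisp"))
  "Return two lists: *EVENTS* after COMPILE-FILE, and *EVENTS* after LOAD."
  (setf *events* '())
  ;; Write three top-level EVAL-WHEN forms, one per situation.
  (with-open-file (out source :direction :output :if-exists :supersede)
    (dolist (form '((eval-when (:compile-toplevel) (push :compile-time *events*))
                    (eval-when (:load-toplevel) (push :load-time *events*))
                    (eval-when (:execute) (push :execute *events*))))
      (print form out)))
  (let* ((compiled (compile-file source))
         (after-compile (copy-list *events*)))
    (load compiled)
    (let ((after-load (copy-list *events*)))
      ;; Remove the scratch files again.
      (delete-file compiled)
      (delete-file source)
      (values after-compile after-load))))
```

Under the table above, only the :COMPILE-TOPLEVEL body runs during COMPILE-FILE, only the :LOAD-TOPLEVEL body runs during LOAD, and the :EXECUTE-only body is never run, because top-level processing starts in not-compile-time mode.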
The following macros make definitions that are typically used during
compilation and are defined to make those definitions available at
both compile time and run time when calls to those macros appear in a
file being compiled. As with EVAL-WHEN, these compile time
side-effects happen only when the defining macros appear at top-level.
% The specific details of the compile time side effects should go under
% the description of the macro in chapters 6 & 7.
DEFTYPE
DEFMACRO
DEFINE-MODIFY-MACRO
DEFVAR
DEFPARAMETER
DEFCONSTANT
DEFSETF
DEFINE-SETF-METHOD
DEFSTRUCT
DEFINE-CONDITION
DEFPACKAGE
IN-PACKAGE
% These depend on the outcome of issue CLOS-MACRO-COMPILATION
DEFCLASS
DEFGENERIC
DEFMETHOD
DEFINE-METHOD-COMBINATION
% This depends on the outcome of issue PROCLAIM-ETC-IN-COMPILE-FILE
DEFPROCLAIM
The compile time behavior of these macros can be understood as if
their expansions effectively include (EVAL-WHEN (:COMPILE-TOPLEVEL)
...) forms. It is not required that the compile time definition be
made in the same manner as if the defining macro had been evaluated
directly. In particular, the information stored by the defining
macros at compile time may or may not be available to the evaluator
(either during or after compilation), or during subsequent calls to
COMPILE or COMPILE-FILE. If the definition must be visible during
compile time evaluation, it should be placed within an explicit
(EVAL-WHEN (:COMPILE-TOPLEVEL) ...) to ensure that it will be fully
defined at compile time.
Wrong:  (defmacro foo (x) `(car ,x))
        (eval-when (:execute :compile-toplevel :load-toplevel)
          (print (foo '(a b c))))

Right:  (eval-when (:execute :compile-toplevel :load-toplevel)
          (defmacro foo (x) `(car ,x))
          (print (foo '(a b c))))
Compiler/Loader Interface
=========================
% Reference: Issue QUOTE-SEMANTICS
The functions EVAL and COMPILE always ensure that constants referenced
within the resulting interpreted or compiled code objects are EQL to
the corresponding objects in the source code. COMPILE-FILE, on the
other hand, must produce an output file which contains instructions
[rpg: to] tell the loader how to reconstruct the objects appearing in
the source code when the compiled file is loaded.
[rpg: I prefer this, because the objects may not be *re*constructed since
they might not have been constructed in the first place. Also, ``instructions''
might never appear, only some collaboration need be implied:
COMPILE-FILE, on the other hand, must produce an output file which
when loaded with LOAD constructs the objects defined by the source
code.]
The EQL relationship is not well-defined in this case, since the
compiled file may be loaded into a different Lisp image than the one
that it was compiled in. This section defines a notion of "similarity
as constants" which relates objects in the compile time
environment to the corresponding objects in the load time environment.
The constraints on constants described in this subsection apply only
to COMPILE-FILE; implementations are not permitted to copy or coalesce
constants appearing in code processed by EVAL or COMPILE.
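The guarantee for COMPILE can be checked directly by embedding an existing object as a quoted constant in a lambda expression. This is a sketch only; *ORIGINAL-LIST* and MAKE-CONSTANT-RETURNER are hypothetical names.

```lisp
;; *ORIGINAL-LIST* and MAKE-CONSTANT-RETURNER are hypothetical names.
(defvar *original-list* (list 1 2 3)
  "An object that will be embedded as a quoted constant.")

(defun make-constant-returner (object)
  "Compile a function of no arguments whose body is the quoted OBJECT."
  (compile nil `(lambda () ',object)))

;; Because COMPILE may neither copy nor coalesce constants, the compiled
;; function returns the very same object that appeared in the source form.
(defvar *same-object-p*
  (eq (funcall (make-constant-returner *original-list*))
      *original-list*))
```

The same experiment cannot be phrased for COMPILE-FILE, since the object in the loaded file is in general a different object that is only similar as a constant to the original.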
Terminology
-----------
% Reference: Issue CONSTANT-COMPILABLE-TYPES
The following terminology is used in this section.
The term "constant" refers to a quoted or self-evaluating constant
or an object that is a substructure of such a constant, not a named
(DEFCONSTANT) constant. [rpg: ``self-evaluating means....'']
The term "source code" is used to refer to the objects constructed
when COMPILE-FILE calls READ, and additional objects constructed by
macroexpansion during COMPILE-FILE.
[rpg: I think the source code is whatever the representation is in
whatever a file is. I think this use of READ as a semantic crutch is
unnecessary.]
The term "compiled code" is used to refer to objects constructed by
LOAD.
[rpg: so a floating-point number constructed by LOAD is ``compiled
code''?]
The term "coalesce" is defined as follows. Suppose A and B are two
objects used as quoted constants in the source code, and that A' and
B' are the corresponding objects in the compiled code. If A' and B'
are EQL but A and B were not EQL, then we say that A and B have been
coalesced by the compiler.
[rpg: here is a first pass at changing this wording to avoid READ:
The term "coalesce" is defined as follows. Suppose A and B are two
objects defined as quoted constants in the source code, and that A'
and B' are the corresponding objects in the compiled code. If A' and
B' are EQL but A and B were not defined to be EQL, then we say that A
and B have been coalesced by the compiler.]
What may appear as a constant
-----------------------------
An object may be used as a quoted constant processed by COMPILE-FILE
if the compiler can guarantee that the resulting constant established
by loading the compiled file is "similar as a constant" to the
original.
The notion of "similarity as a constant" is not well-defined on all
data types. Objects of these types may not portably appear as
constants in code processed with COMPILE-FILE. Conforming
implementations are required to handle such objects either by having
the compiler and/or loader reconstruct an equivalent copy of the
object in some implementation-specific manner; or by having the
compiler signal an error.
For some aggregate data types, being similar as constants is defined
recursively. We say that an object of these types has certain "basic
attributes", and to be similar as a constant to another object, the
values of the corresponding attributes of the two objects must also be
similar as constants.
This kind of definition has problems with any circular or "infinitely
recursive" object such as a list that is an element of itself. We use
the idea of depth-limited comparison, and say that two objects are
similar as constants if they are similar at all finite levels. This
idea is implicit in the definitions below, and applies in all the
places where attributes of two objects are required to be similar as
constants.
[rpg: Hm, this comment can be got around.]
% Reference: issue CONSTANT-CIRCULAR-COMPILATION
Such circular objects may legitimately appear as constants to be
compiled. More generally, if two constants appearing in the source code
for a single file processed with COMPILE-FILE are EQL, the corresponding
constants in the compiled code must also be EQL.
% Reference: issue CONSTANT-COLLAPSING
However, the converse of this relationship need not be true; if two
objects are EQL in the compiled code, that does not always imply that
the corresponding objects in the source code were EQL. This is
because COMPILE-FILE is permitted to coalesce constants appearing in
the source code if and only if they are similar as constants, except if
the objects involved are of type SYMBOL, PACKAGE, STRUCTURE, or
STANDARD-OBJECT. Objects of these types are never coalesced.
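The two rules (EQL constants stay EQL, similar constants may or may not be coalesced) can be contrasted in a small file-compilation experiment. This is only a sketch: the function names and the file name are hypothetical, the current directory is assumed to be writable, and *PACKAGE* is assumed to be the same when this code is read and when the scratch file is compiled.

```lisp
;; RUN-COALESCING-DEMO, SHARED-CONSTANT-P, SIMILAR-CONSTANTS-EQ-P and the
;; file name are hypothetical names used only for this illustration.
(defun run-coalescing-demo (&optional (source "coalesce-demo.lisp"))
  "Compile and load a scratch file, then return the results of its two functions."
  (with-open-file (out source :direction :output :if-exists :supersede)
    ;; One object referenced twice through the #1= reader syntax.
    (write-line "(defun shared-constant-p () (eq '#1=(1 2) '#1#))" out)
    ;; Two distinct but similar objects.
    (write-line "(defun similar-constants-eq-p () (eq '(1 2) '(1 2)))" out))
  (let ((compiled (compile-file source)))
    (load compiled)
    (delete-file compiled)
    (delete-file source)
    (values (funcall 'shared-constant-p)
            (funcall 'similar-constants-eq-p))))
```

The first value is expected to be true, since the two references were EQL in the source code. The second value is unspecified: it is true exactly when the implementation chose to coalesce the two similar lists.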
Similarity as constants
-----------------------
Two objects are defined to be "similar as a constant" if and only if
they are both of one of the [rpg: same type from the list of] types
listed below and satisfy the additional requirements listed for that
type.
Number
Two numbers are similar as constants if they are of the same type
and represent the same mathematical value.
Character
Two characters are similar as constants if they both represent
the same character.
% Note that this definition has to depend on the results of the
% Character Set proposals. The intent is that this be compatible with
% how EQL is defined on characters.
Symbol
% Issue COMPILE-FILE-SYMBOL-HANDLING defines how the file compiler
% and loader handle interned symbols.
An uninterned symbol in the source code is similar as a constant
to an uninterned symbol in the compiled code if their print names
are similar as constants.
Package
A package in the source code is similar as a constant to a package in
the compiled code if their names are similar as constants. Note that
the loader finds the corresponding package object as if by calling
FIND-PACKAGE with the package name as an argument. An error is
signalled if no package of that name exists at load time.
Random-state
Let us say that two random-states are functionally equivalent if
applying RANDOM to them repeatedly always produces the same
pseudo-random numbers in the same order.
Two random-states are similar as constants if and only if copies of
them made via MAKE-RANDOM-STATE are functionally equivalent.
Note that a constant random-state object cannot be used as the "state"
argument to the function RANDOM (because RANDOM side-effects this
data structure).
Cons
Two conses are similar as constants if the values of their respective
CAR and CDR attributes are similar as constants.
Array
Two arrays are similar as constants if the corresponding values of each
of the following attributes are similar as constants:
For 1-dimensional arrays:
LENGTH, ARRAY-ELEMENT-TYPE, and ELT for all valid indices.
For arrays of other dimensions:
ARRAY-DIMENSIONS, ARRAY-ELEMENT-TYPE, AREF for all valid indices.
In addition, if the array in the source code is a SIMPLE-ARRAY, then
the corresponding array in the compiled code must also be a
SIMPLE-ARRAY. If the array in the source code is displaced, has a
fill pointer, or is adjustable, the corresponding array in the
compiled code is permitted to lack any or all of these qualities.
[rpg: hm]
Hash Table
Two hash tables are similar as constants if they meet the following
three requirements:
(1) They both have the same test (e.g., they are both EQL hash tables).
(2) There is a unique one-to-one correspondence between the keys of
the two tables, such that the corresponding keys are similar as
constants.
(3) For all keys, the values associated with two corresponding keys
are similar as constants.
If there is more than one possible one-to-one correspondence between
the keys of the two tables, the results are unspecified. A conforming
program cannot use such a table as a constant.
[rpg: So, compilers can only be heuristic in such cases, no?]
Pathname
Two pathnames are similar as constants if all corresponding pathname
components are similar as constants.
Stream, Readtable, Method
Objects of these types are not supported in compiled constants.
Function
% Issue CONSTANT-FUNCTION-COMPILATION specifies how the compiler and
% loader handle constant functions.
Structure, Standard-object
% Reference: issue LOAD-OBJECTS
Objects of type structure and standard-object may appear in compiled
constants if there is an appropriate MAKE-LOAD-FORM method defined
for that type.
COMPILE-FILE calls MAKE-LOAD-FORM on any object that is referenced as
a constant or as a self-evaluating form, if the object's metaclass is
STANDARD-CLASS, STRUCTURE-CLASS, any user-defined metaclass (not a
subclass of BUILT-IN-CLASS), or any of a possibly-empty
implementation-defined list of other metaclasses. COMPILE-FILE will
only call MAKE-LOAD-FORM once for any given object (compared with EQ)
within a single file.
Condition
% This somehow got overlooked. Are they handled under LOAD-OBJECTS?
[rpg: Yes, since they are instances of classes.]
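The recursive rules for numbers, characters, uninterned symbols, conses, and arrays can be approximated by a predicate along the following lines. This is a non-authoritative sketch, not a definition: SIMILAR-AS-CONSTANTS-P is a hypothetical name, EQL is assumed to be an adequate stand-in for "same type and same mathematical value", interned symbols are simply compared with EQ, the remaining types are treated as never similar, and the function does not terminate on circular objects.

```lisp
;; SIMILAR-AS-CONSTANTS-P is a hypothetical name. The function only
;; approximates the rules in this section and loops on circular objects.
(defun similar-as-constants-p (a b)
  (cond ((numberp a)
         ;; Assumption: EQL captures "same type and same value".
         (eql a b))
        ((characterp a)
         (and (characterp b) (char= a b)))
        ((symbolp a)
         (if (symbol-package a)
             ;; Simplification: interned symbols are compared with EQ.
             (eq a b)
             (and (symbolp b)
                  (null (symbol-package b))
                  (string= (symbol-name a) (symbol-name b)))))
        ((consp a)
         (and (consp b)
              (similar-as-constants-p (car a) (car b))
              (similar-as-constants-p (cdr a) (cdr b))))
        ((arrayp a)
         (and (arrayp b)
              (= (array-rank a) (array-rank b))
              (equal (array-element-type a) (array-element-type b))
              (if (vectorp a)
                  ;; One-dimensional case: LENGTH and the elements.
                  (and (= (length a) (length b))
                       (every #'similar-as-constants-p a b))
                  ;; Other ranks: ARRAY-DIMENSIONS and the elements.
                  (and (equal (array-dimensions a) (array-dimensions b))
                       (dotimes (i (array-total-size a) t)
                         (unless (similar-as-constants-p
                                  (row-major-aref a i)
                                  (row-major-aref b i))
                           (return nil)))))))
        ;; Packages, random-states, hash tables, pathnames and instances
        ;; are outside the scope of this sketch.
        (t nil)))
```

For example, a list read from a file and a freshly consed copy of it are similar under this predicate even though they are not EQL, while 1 and 1.0 are not similar because they differ in type.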
Compile Time Error Handling
===========================
% Reference: Issue COMPILER-DIAGNOSTICS
% The STYLE-WARNING condition needs to be integrated into the section
% describing the hierarchy of condition types.
Errors and warnings may be issued within COMPILE or COMPILE-FILE.
This includes both arbitrary errors which may occur due to
compile-time processing of (EVAL-WHEN (:COMPILE-TOPLEVEL) ...) forms
or macro expansion, and conditions signalled by the compiler itself.
Conditions of type ERROR may be signalled by the compiler in
situations where the compilation cannot proceed without
intervention.
Examples:
file open errors
syntax errors
Conditions of type WARNING may be signalled by the compiler in
situations where the standard explicitly states that a warning must,
should, or may be signalled, and in situations where the compiler can
determine that executing the code would either have undefined
consequences or cause an error to be signalled at run time.
[rpg: But this is not to be construed as an escape clause that allows
an implementation to not warn when it is required. This attempts to
only talk about how the warning is issued, right?]
Examples:
violation of type declarations
SETQ'ing or rebinding a constant defined with DEFCONSTANT
calls to built-in Lisp functions with wrong number of arguments
or malformed keyword argument lists
referencing a variable declared IGNORE
unrecognized declaration specifiers
The compiler is permitted to issue warnings about matters of
programming style as conditions of type STYLE-WARNING. Although
STYLE-WARNINGs -may- be signalled in these situations, no
implementation is -required- to do so. However, if an
implementation does choose to signal a condition, that condition
will be of type STYLE-WARNING and will be signalled by a call to
the function WARN.
Examples:
redefinition of function with different argument list
calls to function with wrong number of arguments
unreferenced local variables not declared IGNORE
declaration specifiers described in CLtL but ignored by
the compiler
Both COMPILE and COMPILE-FILE are allowed to establish a default
condition handler. If such a condition handler is established,
however, it must first resignal the condition to give any
user-established handlers a chance to handle it. If all user error
handlers decline, the default handler may handle the condition in an
implementation-specific way; for example, it might turn errors into
warnings.
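Because any default handler must first give user-established handlers a chance, a program can intercept compiler warnings by wrapping the call to COMPILE in HANDLER-BIND. The following is a minimal sketch; COMPILE-COLLECTING-WARNINGS is a hypothetical name, and which warnings (if any) a given implementation signals for a given form is not specified here.

```lisp
;; COMPILE-COLLECTING-WARNINGS is a hypothetical helper for illustration.
(defun compile-collecting-warnings (lambda-expression)
  "Compile LAMBDA-EXPRESSION. Return the compiled function and a list of
the WARNING conditions (including STYLE-WARNINGs) signalled meanwhile."
  (let ((warnings '()))
    (handler-bind ((warning (lambda (condition)
                              ;; Record the condition, then keep it from
                              ;; reaching the implementation's own handler.
                              (push condition warnings)
                              (muffle-warning condition))))
      (let ((compiled (compile nil lambda-expression)))
        (values compiled (nreverse warnings))))))
```

A caller might use it as (compile-collecting-warnings '(lambda (x) (1+ x))) and then decide, from the second value, whether to report or ignore what the compiler noticed.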
% Reference: issue WITH-COMPILATION-UNIT
In some implementations, some kinds of warnings may be deferred until
"the end of compilation"; see WITH-COMPILATION-UNIT.