Constructors



This is a proposal to add constructors back into CLOS, now that the basic
object creation protocol has been agreed upon.

Note that this proposal has been greatly simplified compared to what was being
discussed last spring.  In response to several cogent objections that were
raised, a number of problematic features have been removed and constructors
have been made easier to understand and to use.  While these proposed
constructors are somewhat different from the constructors whose usefulness has
been verified by their use in New Flavors, a survey of 242 constructors used
in a variety of Flavors-based programs showed that these CLOS constructors
should meet the needs of those programs, which I am assuming to be in some
sense typical.  The only real change to those programs would be the
requirement that any slot to be filled by a constructor must have an initarg,
and I don't think that presents any difficulty.

Survey of 242 constructors for 237 different flavors:
83 with MAKE-INSTANCE-style arguments.
Of the remaining 159 constructors:
 - 140 with required arguments only, 8 also with optional or rest,
   11 with keyword arguments.  0 with &aux parameters.
 - 15 take no arguments, 102 only fill slots, 3 also specify area,
   39 take initargs.
142 fill slots, of which 0 use the initarg rather than the slot name,
and 85 fill slots that don't have initargs.  103 fill inherited slots.


WHY HAVE CONSTRUCTORS?

(1) A cleaner interface for the caller

It's often appropriate to have a more abstract interface than the one provided
by MAKE-INSTANCE.  Providing a constructor as the documented inter-module
interface for making a particular kind of object encourages users of the
interface to think in more abstract, conceptual terms.  Using a constructor
also allows more aspects of the implementation to be changed without changing
the interface: the class name and initarg names could be changed, or the data
representation could be changed to a DEFSTRUCT representation or a
standard-type, without changing the interface.  The constructor could even be
replaced by an interface function that does some complex computations to
decide what type of object to create, or to decide whether to return an
existing object or create a new one.

These needs could be met simply by defining constructor functions with DEFUN
and advertising them.  Some reasons to have a :CONSTRUCTOR option in
DEFCLASS are
 - to make it easier and more convenient for users to create constructors
 - to be culturally consistent with DEFSTRUCT
 - to provide a convenient abbreviation for something you could do yourself
   in a more long-winded way, just like :ACCESSOR
Other reasons appear below.
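
As a minimal sketch of the DEFUN approach mentioned above, the following
defines a constructor by hand.  The class SHIP, its slots, and the name
MAKE-SHIP are hypothetical and exist only for illustration.  Under this
proposal the same function would presumably be requested with
(:CONSTRUCTOR MAKE-SHIP (NAME X Y)) inside the DEFCLASS.

```lisp
;; Hypothetical class, used only for illustration.
(defclass ship ()
  ((name :initarg :name :reader ship-name)
   (x    :initarg :x    :reader ship-x)
   (y    :initarg :y    :reader ship-y)))

;; Hand-written constructor.  Callers pass positional arguments and never
;; mention the class name or the initarg names, so either can change
;; without changing the interface.
(defun make-ship (name x y)
  (make-instance 'ship :name name :x x :y y))
```

A caller would then write (MAKE-SHIP "Enterprise" 0 0) rather than spelling
out the MAKE-INSTANCE call.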

(2) Coordination with class redefinition

As the bridge between an external interface and the internal structure of a
class, a constructor function contains certain information about the class,
such as its initargs and their default values.  When a class is
redefined, any constructors for the class and its subclasses should be updated
if necessary to keep them consistent with the class.  Making this updating
automatic is a convenience for programmers, so they don't have to remember to
do it by hand.  One way to make the updating automatic would be to add a new
feature to the programming environment by which a linkage could be established
between a function and a class so that redefining the class makes some edits
to the source of the function and then recompiles it.  A much simpler way is
to make the constructor syntactically part of the DEFCLASS, so it naturally
gets updated at the same time as the rest of the DEFCLASS.  This is another
way in which constructors are analogous to accessors.

(3) More efficient than MAKE-INSTANCE

MAKE-INSTANCE must operate by interpreting data structures that describe the
class, while constructors can be compiled, since they know the exact class
that they are constructing, and since they can be automatically recompiled if
the class or any of its superclasses changes.  Initialization methods can be
completely inlined into a constructor, where MAKE-INSTANCE has to go through a
generic function dispatch.  Constructors can take positional arguments, which
are more efficient in most implementations, while MAKE-INSTANCE requires
keyword arguments.

Gregor has argued that calls to MAKE-INSTANCE with a constant first argument
can be equally optimized, since the exact class being constructed is known.
While this is true in theory, it seems that either a complicated mechanism
would be needed to make sure that the function was recompiled when the class
was redefined in a way that invalidated the inline code, or else there would
have to be user-visible declarations to control the tradeoff between
performance and robustness in the face of class redefinition.  Alternatively,
calls to MAKE-INSTANCE with a constant first argument could be turned into
calls to a constructor function that was created behind the scenes.  Then if
the class was redefined, only the constructor function would have to be
recompiled.  In either case, if we are going to have this type of mechanism, I
would much rather make it explicitly visible as a :CONSTRUCTOR option than
have it operating behind the scenes in some vaguely defined way.


WHY GET RID OF CONSTRUCTORS?

(1) Simplicity

If we have both constructors and MAKE-INSTANCE, then we have two ways to
do the same thing.

  [But CLOS very often provides both a primitive mechanism and a convenient
  abbreviation for a common case of using that mechanism.]

The rules for mapping constructor parameter names into slots and initargs are
complicated and confusing.

  [Very true.  In this proposal they have been enormously simplified.]

(2) Avoid hiding mechanisms

Constructors contain a hidden performance optimization, in that there is
more inlining in their bodies than can be achieved through documented
mechanisms elsewhere.

  [I argued above that this is preferable to the inherent complexity of
  making that mechanism generally available.  Of course there is nothing
  to stop us from documenting it if that's what we really want.
  Also, exactly the same thing could be said about :ACCESSOR, at least
  in the Symbolics implementation.]

(3) Not more efficient than MAKE-INSTANCE

(see discussion above)


SYNTAX

The DEFCLASS option (:CONSTRUCTOR -symbol-) creates a function named -symbol-
that takes the initargs of this class as keyword arguments and returns an
instance of this class.  Thus a call to -symbol- looks just like a call to
MAKE-INSTANCE with the first argument omitted.
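
A rough hand-written equivalent of this one-argument form, assuming a
hypothetical class POINT and constructor name MAKE-POINT, might look like the
sketch below.  It does not reproduce any of the optimizations discussed
elsewhere in this proposal.

```lisp
;; Hypothetical class, used only for illustration.
(defclass point ()
  ((x :initarg :x :initform 0 :reader point-x)
   (y :initarg :y :initform 0 :reader point-y)))

;; Roughly what (:constructor make-point) is described as creating:
;; the same keyword arguments as MAKE-INSTANCE, minus the class argument.
(defun make-point (&rest initargs)
  (apply #'make-instance 'point initargs))
```

With this definition, (MAKE-POINT :X 3) behaves like
(MAKE-INSTANCE 'POINT :X 3).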

The DEFCLASS option (:CONSTRUCTOR -symbol- -constructor-lambda-list-) creates
a function named -symbol- whose lambda-list is -constructor-lambda-list-.  The
function returns an instance of this class, initialized according to the
parameters in -constructor-lambda-list-.  Each parameter supplies the value of
one initarg, determined by the following rules:
 - If a parameter variable name is EQ to an initarg name, the parameter 
   supplies the value of that initarg.
 - If a parameter variable name is not EQ to any initarg name, but the symbol
   in the keyword package with the same name as the parameter variable
   name is EQ to an initarg name, the parameter supplies the value of that
   initarg.
 - If neither rule succeeds, signal an error.
The second rule exists because initarg names are often keyword symbols, which
are not valid as variable names.
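
The two lookup rules and the error case can be sketched as a small function.
The name PARAMETER-INITARG and its argument conventions are hypothetical; the
sketch assumes the class's initarg names are already available as a list.

```lisp
(defun parameter-initarg (parameter initarg-names)
  "Return the initarg that PARAMETER supplies, given the list INITARG-NAMES."
  (multiple-value-bind (keyword found)
      ;; Look up the keyword with the same name without creating it.
      (find-symbol (symbol-name parameter) :keyword)
    (cond
      ;; Rule 1: the parameter name itself is an initarg name.
      ((member parameter initarg-names :test #'eq) parameter)
      ;; Rule 2: the keyword with the same name is an initarg name.
      ((and found (member keyword initarg-names :test #'eq)) keyword)
      ;; Neither rule succeeds.
      (t (error "Parameter ~S does not correspond to any initarg in ~S."
                parameter initarg-names)))))
```

For example, (PARAMETER-INITARG 'WIDTH '(:WIDTH :HEIGHT)) selects :WIDTH by
the second rule, while a parameter named DEPTH signals an error for the same
list.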

-constructor-lambda-list- allows all of the standard lambda-list features that
DEFUN allows.  The only difference is that if no initform is specified for an
&optional, &key, or &aux parameter, instead of just defaulting to NIL, the
parameter defaults in a special way, as in DEFSTRUCT constructors.  An
&optional or &key parameter with no initform defaults to the value of the
corresponding initarg's default-initarg form.  If there is no such form, the
parameter defaults to NIL and, when unsupplied, the initarg is not passed to
initialization methods.

An &aux parameter with no initform defaults to NIL but always behaves as if
unsupplied: the corresponding initarg is not passed to initialization methods
and if there is a default-initarg form, it is never evaluated.  (This feature
comes from DEFSTRUCT.  Since &aux was never used in constructors in the
Flavors programs I surveyed, I wouldn't propose it if DEFSTRUCT hadn't already
introduced it into Common Lisp.)
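
One way to get the &optional defaulting behavior by hand is to omit the
initarg whenever the argument is unsupplied, so that MAKE-INSTANCE applies
any default-initarg form itself.  The class ACCOUNT and the constructor
MAKE-ACCOUNT below are hypothetical, and the sketch assumes the
:DEFAULT-INITARGS class option as the source of default-initarg forms.

```lisp
;; Hypothetical class with a default-initarg form for :BALANCE.
(defclass account ()
  ((owner   :initarg :owner   :reader account-owner)
   (balance :initarg :balance :reader account-balance))
  (:default-initargs :balance 100))

;; Hand-written stand-in for the proposed
;; (:constructor make-account (owner &optional balance)).
(defun make-account (owner &optional (balance nil balance-supplied-p))
  (if balance-supplied-p
      (make-instance 'account :owner owner :balance balance)
      ;; Unsupplied: leave the initarg out, so the default-initarg form
      ;; (if any) is used and nothing extra reaches initialization methods.
      (make-instance 'account :owner owner)))
```

Here (MAKE-ACCOUNT "pat") picks up the default balance, and
(MAKE-ACCOUNT "pat" 5) overrides it.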

The :CONSTRUCTOR option can appear more than once in a DEFCLASS form.


PROCEDURAL DEFINITION OF CONSTRUCTORS

After receiving and defaulting its arguments, a constructor forms an initarg
list from its parameters and calls MAKE-INSTANCE with the appropriate class
object as the first argument and the initarg list as the remaining arguments.
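
The unoptimized procedure can be sketched as a macro that expands into a
DEFUN.  The macro name DEFINE-SIMPLE-CONSTRUCTOR is hypothetical, and the
sketch handles only required parameters whose initargs are keywords of the
same name; it ignores lambda-list keywords and the defaulting rules.

```lisp
(defmacro define-simple-constructor (name class-name (&rest parameters))
  "Define NAME as a function of the required PARAMETERS that builds an
initarg list from them and calls MAKE-INSTANCE on the class object."
  `(defun ,name ,parameters
     (make-instance (find-class ',class-name)
                    ;; Pair each parameter with the keyword of the same name.
                    ,@(loop for parameter in parameters
                            collect (intern (symbol-name parameter) :keyword)
                            collect parameter))))

;; Hypothetical class, used only for illustration.
(defclass segment ()
  ((start :initarg :start :reader segment-start)
   (end   :initarg :end   :reader segment-end)))

;; Defines MAKE-SEGMENT as a function of two positional arguments.
(define-simple-constructor make-segment segment (start end))
```

The expansion is an ordinary function whose body is
(MAKE-INSTANCE (FIND-CLASS 'SEGMENT) :START START :END END).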

The actual code compiled for a constructor can be optimized in
implementation-dependent ways, as long as it has the same effect as above.
For example, instead of calling MAKE-INSTANCE, the constructor can inline the
body of MAKE-INSTANCE and the bodies of some or all of the methods that
MAKE-INSTANCE calls.  This implies that the initarg list might not be fully
materialized, parameter values might really be stored directly into slots, and
keyword argument processing might be completely eliminated.  Any optimization
of this type is valid as long as the same effect as calling MAKE-INSTANCE is
achieved and the compiled code is updated when the class or a superclass is
redefined or a relevant method is added or removed.


WHEN ARE CONSTRUCTORS DEFINED AND REDEFINED?

A constructor for a class C cannot be accurately defined until C and each of
its superclasses is defined and all applicable INITIALIZE-INSTANCE
and ALLOCATE-INSTANCE methods have been defined.  Until then, the set
of initargs and default-initarg forms is not known.

The macro-expansion of DEFCLASS includes a DEFUN for each constructor
based on the information available at the time DEFCLASS is expanded.  If any
superclass is not yet defined, this constructor is a dummy that simply signals
an error.  At any later time when C is redefined, a superclass of C is defined
or redefined, or a relevant method is added or removed, CLOS considers each
constructor for C and if necessary recompiles it.  If a class is redefined and
a :CONSTRUCTOR is removed, FMAKUNBOUND is applied to the former constructor's
name.  Thus a constructor is always up to date with the latest information
about its class.
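
The dummy constructor and the removal step might be sketched as follows.
The helper names INSTALL-DUMMY-CONSTRUCTOR and REMOVE-CONSTRUCTOR are
hypothetical; only FMAKUNBOUND comes from the description above.

```lisp
(defun install-dummy-constructor (name class-name)
  "Install under NAME a placeholder that signals an error when called."
  (setf (fdefinition name)
        (lambda (&rest arguments)
          (declare (ignore arguments))
          (error "Constructor ~S is not usable yet: class ~S or one of ~
                  its superclasses is not fully defined."
                 name class-name)))
  name)

(defun remove-constructor (name)
  "Undo a constructor definition whose :CONSTRUCTOR option was removed."
  (fmakunbound name))
```

A real implementation would replace the placeholder with the compiled
constructor once the missing superclasses are defined.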

Note that a user might extract the constructor function from the function
definition of its name, redefine the class (thus redefining the constructor),
and then call the out-of-date constructor function.  Implementations should be
robust in the face of this, either signalling an error when an out-of-date
constructor function is called, automatically calling the latest version of
the constructor function, or making an instance of the class as it used to be
defined and then updating it as if the class had been redefined after the
instance was made.  This follows from the principle that optimization of a
constructor function should not affect its semantics.
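
The situation described here can be reproduced in a few lines.  This sketch
only shows how a stale function object arises; it does not implement any of
the three robust behaviors.  The class GADGET and the variable
*OLD-MAKE-GADGET* are hypothetical, and the SETF of FDEFINITION stands in for
the regeneration that a class redefinition would trigger.

```lisp
;; Hypothetical class and constructor, used only for illustration.
(defclass gadget ()
  ((size :initarg :size :reader gadget-size)))

(defun make-gadget (size)
  (make-instance 'gadget :size size))

;; Extract the function object itself, not the name.
(defparameter *old-make-gadget* #'make-gadget)

;; Stand-in for regenerating the constructor.  The new version doubles
;; SIZE only so that the two versions can be told apart.
(setf (fdefinition 'make-gadget)
      (lambda (size) (make-instance 'gadget :size (* 2 size))))
```

After this, (FUNCALL 'MAKE-GADGET 1) goes through the name and reaches the
new definition, while (FUNCALL *OLD-MAKE-GADGET* 1) still runs the old one.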

This is better than the way Flavors does it, which involves only creating
constructors in COMPILE-FLAVOR-METHODS.


PRIMITIVES FOR MAKING CONSTRUCTORS

At the meta-object level, there will be primitive functions for turning a
constructor lambda-list into a function by filling in parameter defaults and
computing the function body, verifying that a constructor is still valid, and
establishing and removing the linkage between a class and a constructor.  A
default method for some function that gets called when things are redefined
will call these functions, to keep the constructors up to date.

Very roughly, these will be:

MAKE-CONSTRUCTOR class name &optional constructor-option-second-argument
 -> lambda-expression, form   (two values)

To install the constructor function, evaluate
(PROGN (SETF (SYMBOL-FUNCTION 'name) #'lambda-expression)
       form
       (LINK-CONSTRUCTOR 'class 'name T)
       'name)

VERIFY-CONSTRUCTOR class name &optional constructor-option-second-argument
 -> Boolean

LINK-CONSTRUCTOR class name on-or-off

We could also expose the next level down, which defines how class and
constructor-option-second-argument are turned into information that is
compared by VERIFY-CONSTRUCTOR to see whether the constructor needs to be
regenerated, and defines how the form returned by MAKE-CONSTRUCTOR as its
second value records this information.  (This form is a bit of a crock.  In
the Symbolics system, it isn't necessary, because all information is in the
lambda-expression and in the compiled-function object compiled from it.
However, I'm assuming that some implementations cannot do this, and therefore
provision is needed to associate this information with the name of the
constructor rather than with the actual function object.  If we can get rid of
this, great.  Of course LINK-CONSTRUCTOR still needs the name so class
redefinition knows where to store the updated constructor.)