Re: Names to Objects and Compiler-environment


The fundamental problem I think we are trying to address with this
proposal is the transfer of information about class definitions
and potentially method definitions from form to form during a
file compilation. Such a transfer will necessarily be implementation dependent,
but we would like to design an interface to the metaclass protocol which 
is portable. The information on these definitions should
either not be a "real" definition or it should be a separate and shadowing
definition from the definition in the compiler's run time environment (if any).
The alternatives are to either replace the definition in the compiler's
run time environment, or to not propagate any information on class
and method definitions being compiled between forms. If we choose the
former solution (replacing the definition in the compiler's run
time environment) we face a bootstrapping problem, since a method
or class necessary for the compilation of a file may get redefined 
while the file is being compiled, breaking the compiler (admittedly, this
may not occur very often, but, if it does, it could be very disabling).
The latter solution (not propagating any information) would place
an unreasonable burden on programmers using CLOS, since they would be
required to place class definitions and definitions of methods working
on those classes into separate files. Usual practice in object-oriented
programming is to group class definitions and method definitions into
the same file, for easy reference.


A solution has been proposed in which certain metaclass protocol
functions (generic and otherwise) take parameters which are "environments".
The exact nature of these environment parameters is unspecified, but
in order for them to be accessible to the top level macros DEFCLASS
and DEFMETHOD without adding a lot of additional machinery to Common Lisp,
the most logical choice, as Moon has pointed out, is the macro &ENVIRONMENT
parameter.
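
A minimal sketch of a macro handing its &ENVIRONMENT parameter on to a
name to class lookup might look as follows. It assumes a lookup with the
shape that FIND-CLASS has in later standard Common Lisp (a name, an error
flag, and an optional environment) standing in for CLASS-NAMED; the macro
name SKETCH-CLASS-KNOWN-P is hypothetical and not part of any proposal.

```lisp
;; Hypothetical macro: asks, at macroexpansion time, whether NAME denotes
;; a class visible in the environment the macro is being expanded in.
(defmacro sketch-class-known-p (name &environment env)
  ;; ENV is the opaque object the implementation hands to the macro.
  ;; FIND-CLASS (a stand-in for CLASS-NAMED) accepts it as its third
  ;; argument; the NIL second argument suppresses the error on a miss.
  (if (find-class name nil env) t nil))
```

What the environment object contains remains up to the implementation;
the macro only passes it along, which is the portable part of the
interface.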

The metaclass functions which Patrick has identified as being involved
in name to object mapping are:

	CLASS-NAMED (aka SYMBOL-CLASS) - maps a symbol to a class 
           object having that name.
	SYMBOL-FUNCTION - maps a symbol naming a (potentially generic)
	   function to the (potentially generic) function object.
	GET-METHOD - ? according to pg. 2-39 of 87-002, this
	   takes a generic function object, list of method qualifiers,
	   and a list of parameter specializers (which, presumably,
	   are also objects) and produces the method.

Sonya has added:

	GET-SETF-GENERIC-FUNCTION - maps a name for a generic
	  function into the generic function for doing the SETF.

I would argue that, of these, CLASS-NAMED and SYMBOL-FUNCTION are
at the right primitive level to discuss. My reasoning is as follows.
As the spec for GET-METHOD indicates, it is not doing a name to
object mapping but rather an object to object mapping. Thus 
the more primitive operations CLASS-NAMED and SYMBOL-FUNCTION can
be used to find the objects, CLASS-NAMED to find the specializer
list, and SYMBOL-FUNCTION to find the generic function. As far
as GET-SETF-GENERIC-FUNCTION goes, it is doing a name to object
mapping, but the mapping is slightly bogus, since the name for a
SETF generic function is created, and the operation could just 
as well be done by passing the generic function object for which the
SETF generic function was desired. The generic function object
might have to keep around information about its SETF, however.
Alternatively, the algorithm for generating the SETF name could 
be published (a user can find it simply enough anyway by 
macroexpanding a SETF form) and we are back in the case where 
SYMBOL-FUNCTION is the correct primitive for finding the 
generic function.
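
The composition described above can be sketched as follows, assuming the
names later standard Common Lisp settled on: FIND-CLASS in the role of
CLASS-NAMED and FIND-METHOD in the role of GET-METHOD. The helper
SKETCH-GET-METHOD-BY-NAME and the class and function it is applied to
are hypothetical.

```lisp
;; Illustrative definitions to look things up in.
(defclass sketch-shape () ())
(defgeneric sketch-area (shape))
(defmethod sketch-area ((shape sketch-shape)) 0)

;; Name to object mapping is done only by the two primitives;
;; FIND-METHOD itself maps objects to an object.
(defun sketch-get-method-by-name (function-name qualifiers class-names)
  (let ((generic-function (symbol-function function-name)) ; name -> gf
        (specializers (mapcar #'find-class class-names)))  ; names -> classes
    (find-method generic-function qualifiers specializers)))
```

For example, (sketch-get-method-by-name 'sketch-area '() '(sketch-shape))
returns the method object defined above.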

Naturally, along with the functions for doing the name to object
mapping, the functions for doing a SETF will require an environment
argument as well. These would be the SETF functions for CLASS-NAMED
and SYMBOL-FUNCTION.

Additional CLOS functions which Patrick has identified as possibly
requiring an environment argument are: ADD-METHOD, REMOVE-METHOD,
and ENSURE-GENERIC-FUNCTION. While these do not explicitly
do name to object mapping, I believe the logic here is the following:

	ADD-METHOD, REMOVE-METHOD - addition and removal of a method from
	   a generic function is dependent on the environment, since a
	   different method definition may be desired on a generic function
	   in the compile time environment from what is available in the
	   compiler's run time environment (the "outside" or "top" environment).
	FIND-APPLICABLE-METHODS - the compilation of a CALL-NEXT-METHOD
	   form will require access to methods as they are "defined" or,
	   at the very least, to the definitions compiled during a file
	   compilation, so the "current" definition is used for arranging
	   the method call, rather than the definition in the compiler's
	   run time environment.
	ENSURE-GENERIC-FUNCTION - this does an implicit name to generic
	   function mapping, setting up any existing function (generic
	   also?) as a default method. Since it will probably use
	   SYMBOL-FUNCTION to retrieve the function object bound to
	   the symbol's function cell, an environment parameter might
	   be needed to indicate which particular generic function
	   is required.

Of these, only ENSURE-GENERIC-FUNCTION takes a function name as
an argument; the others all take generic function and other objects.
Hence, sensitivity to the processing environment need only be included
in ENSURE-GENERIC-FUNCTION, since only it will have to internally
resolve a name to object mapping.
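
The division of labor can be sketched as follows, using the operators as
they exist in later standard Common Lisp (an assumption relative to the
draft under discussion). The generic function SKETCH-SIZE and the helper
SKETCH-TOGGLE-STRING-METHOD are hypothetical.

```lisp
(defgeneric sketch-size (thing))
(defmethod sketch-size ((thing string)) (length thing))

(defun sketch-toggle-string-method ()
  ;; Only this first step resolves a name; everything after it works
  ;; on the objects it produced.
  (let* ((gf (ensure-generic-function 'sketch-size))
         (specializers (list (find-class 'string)))
         (method (find-method gf '() specializers)))
    (remove-method gf method)
    ;; With the NIL error flag, FIND-METHOD returns NIL once the
    ;; method has been removed.
    (let ((gone (null (find-method gf '() specializers nil))))
      (add-method gf method)
      (list gone (sketch-size "abc")))))
```

Calling (sketch-toggle-string-method) returns (T 3): the method was
gone after REMOVE-METHOD and usable again after ADD-METHOD, and neither
call needed to know which name the generic function was found under.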

One group of functions Patrick missed in his list is the metaclass
functions on pg. 3-25 of the metaobject protocol specification.
They are all defined to take a name for the appropriate metaclass. 
With the exception of DEFINE-METACLASS (which can be a macro anyway, 
and thus use its &ENVIRONMENT parameter), the others could as well 
be defined to operate on a class object which was a metaclass, 
rather than directly on a metaclass name. 


I believe that an excellent case can be made for an environment argument
to CLASS-NAMED, and, correspondingly, that class definitions need to
be made both in the compile time environment (but *not* in the 
compiler's run time environment) and at load time, as usual. 
The arguments presented in the first section indicated why some way
of maintaining information on classes being defined needs to be propagated
between forms during a file compilation, independently of any definitions
in the compiler's run time environment. 
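
One way to picture a shadowing compile time definition is the sketch
below. It is only an illustration under stated assumptions: real
environments are opaque and implementation dependent, whereas here an
environment is a hypothetical structure SKETCH-ENV holding a table and a
parent, and SKETCH-CLASS-NAMED is a hypothetical stand-in for an
environment-sensitive CLASS-NAMED. Keywords stand in for class objects.

```lisp
;; Hypothetical environment: a table of name -> class plus a parent
;; environment that is consulted when the name is not found locally.
(defstruct (sketch-env (:constructor make-sketch-env (&optional parent)))
  parent
  (classes (make-hash-table :test #'eq)))

;; Stands in for the compiler's run time ("top") environment.
(defvar *sketch-top-env* (make-sketch-env))

(defun sketch-class-named (name &optional (env *sketch-top-env*))
  ;; Walk outward from ENV; the innermost definition shadows the rest.
  (loop for e = env then (sketch-env-parent e)
        while e
        do (multiple-value-bind (class foundp)
               (gethash name (sketch-env-classes e))
             (when foundp (return class)))))

(defun (setf sketch-class-named) (class name
                                  &optional (env *sketch-top-env*))
  ;; Definitions go only into ENV, never into its parent.
  (setf (gethash name (sketch-env-classes env)) class))
```

A definition stored in a compile time environment created with
(make-sketch-env *sketch-top-env*) is then visible to later forms that
are handed that environment, while lookups without it continue to see
the compiler's run time definition unchanged.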

An alternative for doing the definition "for real" is to maintain 
information about definitions being compiled, then have the relevant
metaclass protocol functions distinguish whether the information about
a particular definition comes from the "for real" definition or from the
partial definition. I do not like this solution because it introduces
an additional element of complexity into the metaclass protocol which
somehow seems unnecessary, and sets up a sharper distinction between
compiling a definition and evaluating it than simply switching
environments. CommonObjects did things this way, and it slowed down
compilation and made for some nasty case analysis. For example, handling 
the distinction between the following two cases would be nontrivial 
(in each case, the class FOO is also defined in the compiler's run time
environment):

Case 1:

	(defclass foo () () )
	(setf *global-var* (make-instance 'foo))

Case 2: 

	(defclass foo () () )
	(eval-when (compile)
	  (setf *global-var* (make-instance 'foo)))

Though it could be disputed, I think the intent of Case 1 is to have
MAKE-INSTANCE use the FOO defined immediately above it, and that
the compiler, running in *not-compile-time-mode* (CLtL 69), should
defer instance creation and execution of the SETF until load time,
while, in the second case, instance creation and SETF should get done
at compile time using the definition in the compiler's run time
environment (*compile-time-too* mode) rather than the immediately 
preceding definition (except for KCL, which runs in *compile-time-too*
mode at the top level, but it is definitely in the minority). 
If a compile time environment is used, then the EVAL-WHEN (COMPILE) 
can simply be viewed as "popping" back to the compiler's run time 
environment within the dynamic scope of the form, and returning to
the compile-time environment when the form ends.

The required behavior from DEFCLASS would be that the establishment
of a name to class object mapping is made via the &ENVIRONMENT
parameter, at compile time, and in the top level environment, at
load time. This suggests some way of obtaining the top level 
environment for inserting the class name to object mapping.
Following Patrick's suggestion, a function GET-CURRENT-ENVIRONMENT
could be used. Another possibility is a special variable, *ENVIRONMENT*, 
which would be bound to the current environment, similarly to how *PACKAGE* 
is bound to the current package. I'd be interested in hearing if
this would have problems, as Moon's comments about GET-CURRENT-ENVIRONMENT
seem to indicate:

>If GET-CURRENT-ENVIRONMENT takes no arguments, then what you have is
>some form of dynamic scoping, rather than lexical scoping, and you can
>get scoping problems.  Symbolics' implementation, and I believe TI's as
>well, currently works this way, using the special variable
>SYS:UNDO-DECLARATIONS-FLAG to inform macro expanders on behalf of which
>environment they are working.  The genesis of this is historical and
>predates lexical scoping.  This causes a number of subtle problems.
>CLOS should not make this mistake.

though I'm not quite sure what sorts of arguments GET-CURRENT-ENVIRONMENT
should have or how this relates to dynamic scoping. The idea with 
*ENVIRONMENT* is that it would be bound to the current macroexpansion 
environment, which may or may not be EQL to the &ENVIRONMENT parameter 
of a macro (*MACROEXPAND-HOOK* could be used to modify whether this is 
true or not) but they would, in any event, be the same "kind" of environment. 
Exactly how the name to object binding is inserted into the environment 
would, of course, be implementation dependent (but this could be 
hidden within CLASS-NAMED).
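
The special variable approach, and the "popping" behavior described
earlier for EVAL-WHEN (COMPILE), can be sketched as follows. Everything
here is hypothetical: *SKETCH-ENVIRONMENT* plays the role of the
suggested *ENVIRONMENT*, environments are modeled as plain hash tables,
and the two macros only imitate what a compiler would do around a file
compilation and around an EVAL-WHEN form.

```lisp
;; Stands in for the compiler's run time (top level) environment.
(defvar *sketch-top-level* (make-hash-table :test #'eq))

;; Hypothetical *ENVIRONMENT*: rebound dynamically, much as *PACKAGE* is.
(defvar *sketch-environment* *sketch-top-level*)

(defun sketch-lookup (name)
  ;; Sees whatever environment is dynamically current.
  (values (gethash name *sketch-environment*)))

(defmacro sketch-with-compile-time-environment (&body body)
  ;; Imitates entering a file compilation with a fresh environment.
  `(let ((*sketch-environment* (make-hash-table :test #'eq)))
     ,@body))

(defmacro sketch-in-top-level-environment (&body body)
  ;; Imitates EVAL-WHEN (COMPILE) "popping" back to the top level.
  `(let ((*sketch-environment* *sketch-top-level*))
     ,@body))

(defun sketch-environment-demo ()
  (setf (gethash 'foo *sketch-top-level*) :run-time-foo)
  (sketch-with-compile-time-environment
    (setf (gethash 'foo *sketch-environment*) :compile-time-foo)
    (list (sketch-lookup 'foo)
          (sketch-in-top-level-environment (sketch-lookup 'foo)))))
```

Calling (sketch-environment-demo) returns
(:COMPILE-TIME-FOO :RUN-TIME-FOO). Note that the answer depends on which
binding is dynamically in effect when SKETCH-LOOKUP runs, not on where
the call appears in the text, which is the dynamic scoping behavior the
quoted comments warn about.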

In addition, DEFCLASS would naturally have to use definitions within
the &ENVIRONMENT parameter for things like determining inheritance 
information necessary at compile time. What kinds of information
would be necessary? For the moment, let's ignore optimization information,
since things get a bit more complicated when it is taken into account.
Given this, we can rule out slot layout and number information, since
WITH-SLOTS :USE-ACCESSORS NIL (the only place it would potentially be needed)
should go through SLOT-VALUE. Possibly the slot :INITFORM (and any 
additional initialization information) would need to be compiled, but
they would not have to be accessed by anyone else. The only really
important piece of information needed would be the SETF generic
function names for inherited slots, since these would be required for
expanding SETF forms at compile time. Most other aspects of inheritance
(modulo optimizations) could be handled at load time or run time.

In order to make things more convenient for the user, we may want
to define an interface function called CLASS-NAMED, which takes the
class out of the current environment, and a metaclass function,
called SYMBOL-CLASS, which requires an environment argument. 
Corresponding SETFs would also be required. But this seems as
if it should be the only modification needed for dealing with
the name to class mapping.

As a side note, I did an experimental implementation of something
similar using the CommonObjects on CommonLoops implementation this spring.
The part modifying CLASS-NAMED to be sensitive to the compilation
environment worked very well, which leads me to believe that implementation
should be possible.


The other part of the initial proposal involved shadowing generic
functions and methods in the compile time environment by making
the name to function mapping dependent on the environment.
The effect would be to require SYMBOL-FUNCTION to have an environment 
parameter, since SYMBOL-FUNCTION and its SETF are the means whereby a
name to (possibly generic) function mapping is established.

Note that any attempt to make the name to function mapping dependent
on the environment will inevitably have some serious repercussions
for Common Lisp. In particular, the design of Common Lisp assumes
functions are named by symbols in a global name space, partitioned
through packages. These symbols have a globally accessible function
cell, which SYMBOL-FUNCTION, FBOUNDP, FMAKUNBOUND, and other accessor
functions access. Thus function names are kind of like special variables
except they can't be dynamically bound, or, more precisely, like global 
variables in other languages (Pascal, for example), where dynamic
binding is not available. The name to function mappings established
by FLET and LABELS are not available via SYMBOL-FUNCTION.
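
The last point is easy to check. In the small sketch below (the function
names are hypothetical), the FLET binding shadows the global definition
for ordinary calls in its scope, but SYMBOL-FUNCTION still reads the
global function cell.

```lisp
(defun sketch-greet () :global)

(defun sketch-flet-demo ()
  (flet ((sketch-greet () :local))
    (list (sketch-greet)                              ; lexical binding
          (funcall (symbol-function 'sketch-greet))))) ; global cell
```

Calling (sketch-flet-demo) returns (:LOCAL :GLOBAL).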

Referring back to the initial motivation for including environment
sensitivity, namely information propagated from form to form, there
is only one case where one method might need to know something about
another's definition during compilation: CALL-NEXT-METHOD. However,
ignoring optimizations for the moment, the characterization of
CALL-NEXT-METHOD as lexical in scope and dynamic in extent suggests
lookup of the next method could be done, at the latest, at run time
exactly as method dispatch is done. Knowledge about the classes
of the caller's parameters at compile time could be used to limit 
the run time method search. Various further optimizations are possible,
but the most obvious require only the ability to do method lookup
and linking at load time.
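
As an illustration of run time lookup, in the sketch below the method
that calls CALL-NEXT-METHOD is defined before the method it will end up
calling exists, so nothing about the next method can have been known
when the caller was processed. The class and function names are
hypothetical, and the behavior shown is that of later standard Common
Lisp.

```lisp
(defclass sketch-base () ())
(defclass sketch-derived (sketch-base) ())

(defgeneric sketch-report (object))

;; The caller comes first; its next method is not yet defined.
(defmethod sketch-report ((object sketch-derived))
  (list :derived (call-next-method)))

;; The next method is added afterwards and is found at call time.
(defmethod sketch-report ((object sketch-base))
  :base)
```

Evaluating (sketch-report (make-instance 'sketch-derived)) returns
(:DERIVED :BASE).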


Unfortunately, some object oriented languages resolve method inheritance
fully at compile time. CommonObjects is an example. CommonObjects
has a form similar to CALL-NEXT-METHOD (called CALL-METHOD) which
allows the programmer to specify a particular method on the direct
super (or on itself), and a function call to the method (via a
special method symbol) is compiled in at compile time. Thus the
CALL-METHOD macro must have access to the symbol at compile time,
and the fasl loader must maintain a compile time to load time 
mapping of the symbol. This is all implementation dependent, of
course, and this particular feature has given us much trouble
in developing the Portable CommonObjects implementation. The
point is, however, that compiling a file of CommonObjects methods
requires information on the methods previously compiled to be
propagated between forms.

As mentioned in the previous section, addition of an environment
parameter to SYMBOL-FUNCTION (and its SETF) would involve a
major change to the semantics of function symbols in Common Lisp.
Since the function cell is a globally accessible place, would
redefining a function in a particular environment still cause
the global definition to change? If so, then bootstrapping problems
could easily occur, since a definition which has just been compiled
should not be used in the compilation process. If not, then the
nature of the function cell as a globally accessible place is
compromised, since function invocations in the compiler's run
time environment will get one definition, while another definition
will be operative in the compilation environment.
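
The first horn of this dilemma follows from how the function cell
behaves today, which the sketch below shows with hypothetical function
names. A store through SETF of SYMBOL-FUNCTION is seen by every caller
that goes through the cell, so a compiler whose own helper was stored
over in this way would start running the new definition immediately.

```lisp
;; NOTINLINE makes sure calls go through the global function cell.
(declaim (notinline sketch-helper))

(defun sketch-helper () :old-definition)

;; Stands in for a part of the compiler that relies on the helper.
(defun sketch-caller () (sketch-helper))

(defun sketch-redefine-helper ()
  (let ((before (sketch-caller)))
    ;; One global store changes what every caller sees.
    (setf (symbol-function 'sketch-helper)
          (lambda () :new-definition))
    (list before (sketch-caller))))
```

Calling (sketch-redefine-helper) returns
(:OLD-DEFINITION :NEW-DEFINITION).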