
issue COMPILE-FILE-SYMBOL-HANDLING, version 1



This is a new issue, split off from CONSTANT-COMPILABLE-TYPES and the
cl-cleanup issue IN-PACKAGE-FUNCTIONALITY.  The writeup is rather
long, with three proposals and an analysis section at the end that
compares them.

Forum:		Compiler
Issue:		COMPILE-FILE-SYMBOL-HANDLING
References:	CLtL p. 182
		Issue IN-PACKAGE-FUNCTIONALITY
		Issue CONSTANT-COMPILABLE-TYPES
		Issue DEFPACKAGE (passed)
Category:	CHANGE/CLARIFICATION 
Edit History:   V1, 01 Feb 1989, Sandra Loosemore
Status:		**DRAFT**


Problem Description:

It is not clear how COMPILE-FILE is supposed to specify to LOAD how
symbols in the compiled file should be interned.  In particular, what
happens if the value of *PACKAGE* is different at load-time than it
was at compile-time, or if any of the packages referenced in the file
are defined differently?

There are three proposals:  CURRENT-PACKAGE, HOME-PACKAGE, and
REQUIRE-CONSISTENCY.


Proposal COMPILE-FILE-SYMBOL-HANDLING:CURRENT-PACKAGE:

  When a compiled file is loaded, the interned symbols it references
  are found by the following procedure.  The rules are applied in the
  order listed and only the first applicable rule has any effect.

  (1) Any symbol accessible at compile time in the package that is the
      value of *PACKAGE* is found by calling INTERN at load time with one
      argument, the name of the symbol.

  (2) A keyword symbol is found by finding or creating a keyword symbol
      with the same name.
    
  (3) A symbol that at compile time is an external symbol of its home
      package is found at load time by finding the package with the same
      name as the compile-time home package, and then finding an exported
      symbol of that package with the same name as the compile-time symbol.
      If no such package exists, no such symbol exists, or the symbol is not
      exported, an error is signalled.

  (4) Any other symbol is found by calling INTERN at load time with two
      arguments, the name of the symbol and the package with the same name
      as the compile-time symbol's home package.  If no such package exists,
      an error is signalled.

  The goal of this procedure is for each symbol reference to be
  resolved to the same symbol when a compiled file is loaded as when
  the source file is loaded directly with LOAD.  It is possible to
  create package structures that make that impossible; for example, it
  is possible for a symbol to be inaccessible from its own home
  package.  A conforming program cannot depend on any symbol
  resolution behavior that is not provided by the above four rules.
  
  If any top-level form in a compiled file changes the value of
  *PACKAGE*, other than a SELECT-PACKAGE appearing as the first
  top-level form in the file, the results are unspecified.
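
  The four rules above can be sketched in Common Lisp roughly as
  follows.  This is only an illustration under stated assumptions, not
  a description of any real loader: the SYMBOL-REF structure and the
  functions RECORD-SYMBOL-REF and RESOLVE-CURRENT-PACKAGE are
  hypothetical names, and a real compiled-file format would record
  this information in its own way.  Only interned symbols are handled.

```lisp
;;; Hypothetical sketch of proposal CURRENT-PACKAGE.  None of these
;;; names come from the proposal itself.

(defstruct symbol-ref
  name            ; the symbol's name, a string
  home            ; name of the compile-time home package, a string
  accessible-p    ; accessible in *PACKAGE* at compile time?
  external-p)     ; external in its home package at compile time?

(defun record-symbol-ref (symbol)
  "Record, at compile time, what the loader needs to know about SYMBOL.
Assumes SYMBOL is interned, i.e. has a home package."
  (let ((name (symbol-name symbol))
        (home (symbol-package symbol)))
    (make-symbol-ref
     :name name
     :home (package-name home)
     :accessible-p (multiple-value-bind (found status)
                       (find-symbol name *package*)
                     (and status (eq found symbol)))
     :external-p (multiple-value-bind (found status)
                     (find-symbol name home)
                   (and (eq found symbol) (eq status :external))))))

(defun resolve-current-package (ref)
  "Find the load-time symbol for REF.  The first applicable rule wins."
  (let ((name (symbol-ref-name ref))
        (home (symbol-ref-home ref)))
    (cond
      ;; Rule (1): INTERN with one argument, in the load-time *PACKAGE*.
      ((symbol-ref-accessible-p ref)
       (values (intern name)))
      ;; Rule (2): find or create a keyword with the same name.
      ((string= home "KEYWORD")
       (values (intern name "KEYWORD")))
      ;; Rule (3): must still be an exported symbol of the same-named package.
      ((symbol-ref-external-p ref)
       (let ((package (or (find-package home)
                          (error "No package named ~A." home))))
         (multiple-value-bind (symbol status) (find-symbol name package)
           (unless (eq status :external)
             (error "~A is not an external symbol of ~A." name home))
           symbol)))
      ;; Rule (4): INTERN with two arguments, in the same-named home package.
      (t
       (values (intern name (or (find-package home)
                                (error "No package named ~A." home))))))))
```

  Note that under rule (1) the result depends on the load-time value
  of *PACKAGE*, which is why the sketch never consults the recorded
  home package in that branch.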


  Rationale:

    Proposal CURRENT-PACKAGE makes COMPILE-FILE/LOAD follow the same
    rules as PRINT/READ.  For any symbol not written with a package
    prefix in the source file (which should be the great majority of
    them), CURRENT-PACKAGE will make loading the compiled file get the
    same symbols as loading the source file.

    The reason for the rule about changing the value of *PACKAGE* is that
    many loaders cache the interning of symbols; if the same symbol
    appears multiple times in the source file, its name may be looked
    up only once at load time.  Since not all loaders are required to
    work this way, changing *PACKAGE* in mid-file is not allowed,
    because the effect on later occurrences of a symbol would be
    implementation-dependent.


Proposal COMPILE-FILE-SYMBOL-HANDLING:HOME-PACKAGE:

  When a compiled file is loaded, the interned symbols it references are
  found by calling INTERN at load time with two arguments, the name of
  the symbol and the package with the same name as the compile-time
  symbol's home package.  If no such package exists, an error is
  signalled. 
  
  The goal of this procedure is for each symbol reference to be resolved
  to the same symbol when a compiled file is loaded as when the source
  file was processed by COMPILE-FILE.  A conforming program cannot
  depend on any symbol resolution behavior that is not provided by the
  above rule.

  If any top-level form in the file changes the value of *PACKAGE*
  when the source file is loaded interpretively but does not make the
  same change during compile-time processing by COMPILE-FILE, the
  results are unspecified.
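
  A minimal sketch of the single rule above, assuming the compiled
  file records each symbol's name and the name of its compile-time
  home package as strings.  The function name RESOLVE-HOME-PACKAGE is
  hypothetical and is not part of the proposal.

```lisp
;;; Hypothetical sketch of proposal HOME-PACKAGE.

(defun resolve-home-package (symbol-name home-package-name)
  "Find or create the load-time symbol named SYMBOL-NAME in the package
named HOME-PACKAGE-NAME.  The load-time value of *PACKAGE* is never
consulted."
  (let ((package (find-package home-package-name)))
    (unless package
      (error "No package named ~A." home-package-name))
    ;; INTERN with two arguments; only the symbol itself is returned.
    (values (intern symbol-name package))))
```

  Because the package is always named explicitly, the result is the
  same whatever *PACKAGE* happens to be when the file is loaded.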
 
  Rationale:

    The behavior specified in this proposal is simple and easy to 
    understand (there is only one rule to remember instead of four).  
    It does not require any restrictions on where top-level
    SELECT-PACKAGE forms may appear in the file.  It allows a compiled
    file that does not include an explicit SELECT-PACKAGE to be loaded 
    successfully no matter what the load-time value of *PACKAGE* is,
    as long as the compile-time value of *PACKAGE* was the "right" 
    package.


Proposal COMPILE-FILE-SYMBOL-HANDLING:REQUIRE-CONSISTENCY:

  In order to guarantee that compiled files can be loaded correctly,
  users must ensure that the packages referenced in the file are defined
  consistently at compile and load time.  Conforming Common Lisp programs
  must satisfy the following requirements:
  
  (1) The value of *PACKAGE* when the contents of the file are compiled 
      by COMPILE-FILE must be the same as the value of *PACKAGE* when
      the file is loaded.  In particular:

      (a) If any top level form in a compiled file changes the value
          of *PACKAGE*, other than a SELECT-PACKAGE appearing as the first 
          top-level form in the file, the results are unspecified.

      (b) If the first top-level form in the file is not a call to
          SELECT-PACKAGE, then the value of *PACKAGE* at the time LOAD is
          called must be a package with the same name as the package that
          was the value of *PACKAGE* at the time COMPILE-FILE was called.

  (2) For all symbols that were accessible in *PACKAGE* at compile
      time but whose home package was another package, at load time there
      must be a symbol with the same name that is accessible in both the
      load-time *PACKAGE* and in the package with the same name as the
      compile-time home package.
  
  (3) For all symbols in the compiled file that were external symbols in
      their home package at compile time, there must be a symbol with the
      same name that is an external symbol in the package with the same name
      at load time.
        
  If any of these conditions does not hold, the package in which the
  affected symbols are interned by LOAD is unspecified.  Implementations
  are permitted to signal an error or otherwise define this behavior.
  
  Otherwise, when a compiled file is loaded, the interned symbols it
  references are found by calling INTERN at load time with two
  arguments, the name of the symbol and the package with the same name
  as the compile-time symbol's home package.  If no such package exists,
  an error is signalled.
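
  As an illustration, requirements (2) and (3) above could be tested
  for a single symbol at load time roughly as follows.  This is a
  sketch only; the function names are hypothetical, and the proposal
  does not require implementations to perform any such check.

```lisp
;;; Hypothetical load-time checks for proposal REQUIRE-CONSISTENCY.

(defun requirement-2-holds-p (symbol-name home-package-name)
  "True if one symbol named SYMBOL-NAME is accessible both in the
load-time *PACKAGE* and in the package named HOME-PACKAGE-NAME."
  (let ((home (find-package home-package-name)))
    (when home
      (multiple-value-bind (in-current current-status)
          (find-symbol symbol-name *package*)
        (multiple-value-bind (in-home home-status)
            (find-symbol symbol-name home)
          ;; Both lookups must succeed and yield the identical symbol.
          (and current-status
               home-status
               (eq in-current in-home)))))))

(defun requirement-3-holds-p (symbol-name home-package-name)
  "True if the package named HOME-PACKAGE-NAME exists at load time and
has an external symbol named SYMBOL-NAME."
  (let ((home (find-package home-package-name)))
    (and home
         (eq (nth-value 1 (find-symbol symbol-name home)) :external))))
```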

  Rationale:

    Any program that behaves differently under the other two proposals
    is already nonportable.  This proposal is merely an explicit 
    statement of the status quo, namely that users cannot depend on
    any particular behavior if the package environment at load time is
    inconsistent with what existed at compile time.


Current Practice:

  PSL/PCLS implements something very similar to proposal HOME-PACKAGE,
  as does A-Lisp.  Utah Common Lisp implements something like proposal
  CURRENT-PACKAGE, but the chief compiler hacker says he thinks that
  proposal HOME-PACKAGE actually makes more sense, and agrees that any
  program that behaves differently under the two proposals is broken.
  
  KCL implements something like HOME-PACKAGE (symbols in the compiled
  file are explicitly qualified with the name of their home package),
  except that it differentiates between internal and external symbols.
  
  Lucid Lisp appears to implement something like proposal CURRENT-PACKAGE.
  
  Symbolics Genera implements CURRENT-PACKAGE.  Symbolics Cloe probably
  does also.
  
  
Cost to implementors:

  Proposals HOME-PACKAGE and CURRENT-PACKAGE would be incompatible
  changes for implementations that currently do things the other way.
  It would probably be easier to convert to HOME-PACKAGE than to
  CURRENT-PACKAGE, since HOME-PACKAGE is less complicated.
  
  Proposal REQUIRE-CONSISTENCY is intended to be compatible with either
  of the other two proposals, but it may not be entirely compatible with
  the details of current implementations.


Cost to users:

  Proposal HOME-PACKAGE places the fewest restrictions on user programs.
  
  Proposal CURRENT-PACKAGE places a restriction on where and how the value
  of *PACKAGE* may be changed within the file.  
  
  Proposal REQUIRE-CONSISTENCY places even more restrictions on user
  programs.
  
  Most of these restrictions are probably already necessary in portable
  programs.  However, some nonportable programs that depend on the "other"
  model may be broken by proposals HOME-PACKAGE or CURRENT-PACKAGE.
  
  For a discussion of how these proposals treat nonportable or erroneous
  programs, see the "Analysis" section below.
  
  
Benefits:

  COMPILE-FILE's treatment of symbols is made explicit in the standard.
  
  
Analysis:

  Proposals CURRENT-PACKAGE and HOME-PACKAGE present two different
  models of how this problem might be solved.  Essentially, proposal
  CURRENT-PACKAGE uses the same rules as PRINT/READ in deciding when to
  qualify symbols with a package name and where to find unqualified
  symbols.  Proposal HOME-PACKAGE requires -all- symbols written to the
  compiled file to be qualified with an explicit package, and the loader
  simply interns the symbols in that package.
  
  These two proposals differ in the following situations.  Proposal
  REQUIRE-CONSISTENCY, in effect, says that valid programs do not cause
  any of these situations to occur, and the behavior in such cases is
  unspecified (allowing both models to be used as valid implementation
  techniques).
  
  (1) The situation where the file does not contain a SELECT-PACKAGE
      and where the compile-time value of *PACKAGE* is a package with a
      different name than the load-time value of *PACKAGE*.
      
      Proposal CURRENT-PACKAGE would intern symbols that were accessible
      in *PACKAGE* at compile time in *PACKAGE* at load time.
      
      Proposal HOME-PACKAGE would intern symbols that were accessible in
      *PACKAGE* at compile time in the package with the same name as
      their compile-time home package.
      
      In general, programs must be compiled in the "right" package, so
      that the compiler can find and apply the correct macro expansions,
      type definitions, and so on; see issue COMPILE-ENVIRONMENT-CONSISTENCY.
      As a result of macroexpansion or other transformations applied by
      the compiler, the compiled file may contain symbol references that
      were not present in the source file.  Proposal CURRENT-PACKAGE may
      cause problems because these references may be resolved to be
      symbols other than the ones that were intended.  Since proposal
      HOME-PACKAGE remembers the home package of all symbols, it is much
      more likely to find the correct symbols at load time.
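
      The difference in situation (1) can be made concrete with a
      small sketch.  It assumes a symbol whose home package was the
      compile-time *PACKAGE*; the function COMPARE-MODELS and the
      package names in the usage comment are hypothetical.

```lisp
;;; Hypothetical comparison of the two models for situation (1).

(defun compare-models (symbol-name compile-package-name load-package-name)
  "Return a list of two package names: where CURRENT-PACKAGE would
intern SYMBOL-NAME, and where HOME-PACKAGE would intern it, when the
file was compiled in one package and is loaded in another."
  (let ((home (or (find-package compile-package-name)
                  (error "No package named ~A." compile-package-name)))
        (*package* (or (find-package load-package-name)
                       (error "No package named ~A." load-package-name))))
    (list
     ;; CURRENT-PACKAGE, rule (1): INTERN in the load-time *PACKAGE*.
     (package-name (symbol-package (intern symbol-name)))
     ;; HOME-PACKAGE: INTERN in the package named like the home package.
     (package-name (symbol-package (intern symbol-name home))))))

;; Example, with two packages that do not use each other:
;;   (make-package "BUILD-PKG" :use '())
;;   (make-package "RUN-PKG" :use '())
;;   (intern "HELPER" "BUILD-PKG")
;;   (compare-models "HELPER" "BUILD-PKG" "RUN-PKG")
;;   => ("RUN-PKG" "BUILD-PKG")
```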
          
  (2) The situation where *PACKAGE* is altered by a top-level form
      other than a SELECT-PACKAGE appearing as the first top-level
      form in the file.
      
      Proposal CURRENT-PACKAGE says this is illegal.
      
      Proposal HOME-PACKAGE says this is OK, as long as *PACKAGE* is
      altered in the same way at compile time as when the file is loaded
      interpretively.  This is possible because the behavior this
      proposal specifies does not depend on what the value of *PACKAGE*
      is once symbols in the source file have been read by COMPILE-FILE.
      
      Some people argue that allowing *PACKAGE* to be switched in
      mid-file is a bad idea anyway; it is not really necessary and it
      implies a restriction on COMPILE-FILE to read forms from the file 
      one at a time, processing each form before the next call to READ.
      
      Others argue that restricting SELECT-PACKAGE to be the first
      top-level form is an artificial contrivance.  The compile-time
      behavior of SELECT-PACKAGE is well-defined no matter where it
      appears in the file.  There is also a problem defining what "the
      first top-level form" really means.  Finally, this model requires 
      all package definitions to be made externally to the file, which 
      may be inconvenient for smaller programs that now contain the 
      package definition and package contents all in one file.

  (3) The situation where there is a symbol accessible in the
      compile-time value of *PACKAGE* but with another home package, and
      where at load time there is not a symbol with the same name that
      is accessible in both packages.  This situation might occur, for
      example, if at compile time there is a symbol that is external in
      its home package and that package is used by *PACKAGE*, but where
      there is no such external symbol in that package at load time, or
      the load-time *PACKAGE* does not use the other package.
      
      Proposal CURRENT-PACKAGE would find or create a symbol accessible
      in *PACKAGE*.
      
      Proposal HOME-PACKAGE would find or create a symbol accessible in
      a package with the same name as the symbol's compile-time home
      package.
      
      Some people feel that the behavior of proposal CURRENT-PACKAGE is
      more intuitive in this situation, and that it is more forgiving of
      differences between the compile-time and load-time package
      structures.  Others feel that the behavior of HOME-PACKAGE is more
      intuitive, and that if there have been significant changes to the
      package structures, it is probably an indication that the file
      needs to be recompiled anyway, since the compiler might have
      picked up macro definitions and the like from the wrong package.
  
  (4) The situation where a symbol is external in its home package
      and where there is no such external symbol in that package at load
      time.
      
      Proposal CURRENT-PACKAGE would quietly intern the symbol in
      *PACKAGE* if the symbol were accessible in *PACKAGE* at compile
      time.  Otherwise, it would signal an error.
      
      Proposal HOME-PACKAGE would always just quietly intern the symbol
      as internal in its home package.
      
      Failing to complain when a symbol that is supposed to be
      external is not external can be seen as a violation of
      modularity.  However, it seems that this argument should apply
      equally to symbols whose home package is *PACKAGE* and to
      symbols whose home package is somewhere else.
          

Discussion:

  Loosemore is opposed to proposal CURRENT-PACKAGE, but would be
  less opposed to it if it contained an explicit statement that
  *PACKAGE* must be a package with the same name at load time as at
  compile time.
  
  Moon is opposed to proposal HOME-PACKAGE, but would be less
  opposed to it if it required an error to be signalled when a
  symbol that was external at compile time is not external at load
  time.
