[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue EQUALP-GENERIC



I think you did some good work here, but I don't think you solved
all the problems that Gregor and I discovered in our walk on the beach.
If I were you, I would give up and not try to solve the problem of
making EQUALP generic within the Common Lisp standard (I might still
try to solve it as a research effort).  Details below.  Sorry about
the length.

On the related question of "what did CLtL mean by the undefined
phrase `objects that have components'," I would prefer to abide by
the Kauai decision that this doesn't include structures and instances
of defclass-defined classes, rather than discuss the issue endlessly.
On the other hand, I wouldn't shout down anyone who took the trouble
to write up a cleanup issue in standard form.

    Date: Tue 28 Feb 89 13:48:20-PST
    From: Kim A. Barrett <IIM@ECLA.USC.EDU>

    In response to JonL's complaint that a proposal to make EQUALP a generic
    function did not seem to be forthcoming ...

    -----
    Forum:		Cleanup
    Issue:		EQUALP-GENERIC
    References:	EQUALP, p81
    Related issues: EQUAL-STRUCTURE
		    HASH-TABLE-STABILITY
		    HASH-TABLE-TESTS
    Category:	CHANGE ADDITION
    Edit history:	Version 1, 02-28-89, Kim A. Barrett

    Problem description:

     The recent acceptance of Issue EQUAL-STRUCTURE (as ammended) has generated
     some complaints about its specified behavior on objects with metaclass
     STRUCTURE-CLASS.  Some programs which depend on EQUALP comparing the
     components of structures are now broken.

     There are some who feel that EQUALP could be made much more useful than it
     currently is by making it user extensible, but there are potential
     difficulties due to the recent acceptance of Issue HASH-TABLE-TESTS, which
     added EQUALP as a valid hash-table test.

    Proposal (EQUALP-GENERIC:YES):

     Specify that EQUALP is a generic function, and define the new generic function
     EQUALP-HASH.  Specify that a necessary requirement on all methods for these
     functions is that (EQUALP x y) => (= (EQUALP-HASH x t) (EQUALP-HASH y t)).
     Add a discussion of the constraints on equivelence predicates and their
     corresponding hash functions, similar to that presented in Issue
     HASH-TABLE-STABILITY.

OK so far.

CLtL says that objects that have components are only EQUALP if the two objects
are of the same type.  Perhaps this vague phrase means that CLASS-OF returns
the same (EQ) class for both of them.  Who enforces this?  And who enforces
that numbers are an exception to this rule?  Does every method enforce this,
or is there some kind of shared code?

     EQUALP object1 object2					[Generic Function]

      The generic function EQUALP is a predicate which can be called to compare two
      objects for equivelence.  The precise definition of equivelence is specified
      by the methods defined on this function.  The result is either true or false,
      depending on whether the objects are equivelent.

     Method Signatures:

      EQUALP (object1 NUMBER) (object2 NUMBER)		[Primary Method]
       Defines equality as being =.

      EQUALP (object1 CHARACTER) (object2 CHARACTER)	[Primary Method]
       Defines equality as being CHAR-EQUAL.

      EQUALP (object1 CONS) (object2 CONS)			[Primary Method]
       Defines equality as having EQUALP CAR's and EQUALP CDR's.

      EQUALP (object1 PATHNAME) (object2 PATHNAME)		[Primary Method]
       Defines equality as being EQUAL.

      EQUALP (object1 VECTOR) (object2 VECTOR)		[Primary Method]
       Defines equality as having the same length (observing fill-pointers when
       present), and every pair of corresponding components are EQUALP. 

      EQUALP (object1 ARRAY) (object2 ARRAY)		[Primary Method]
       Defines equality as having the same rank and dimensions, and every pair of
       corresponding components are EQUALP.

      EQUALP (object1 HASH-TABLE) (object2 HASH-TABLE)	[Primary Method]
       Defines equality as satisfying the following conditions:
       1. Each hash-table uses the same test function.
       2. Each hash-table contains entries for the same set of keys, where the test
	  function is used to determine the equivelence of keys.
       3. For each key, the associated values from the two tables are EQUALP.

      EQUALP (object1 T) (object2 T)			[Primary Method]
       The default method defines equality as being EQ.

     Remarks:
      An error is signaled on an attempt to add a method on any built-in class.
      Adding a method on a class after ALLOCATE-INSTANCE has been called on the
      class has undefined consequences.  Implementations may be extended in this
      case.

      An implementation may specify additional methods on this function.

     EQUALP-HASH object &optional use-pointer depth		[Generic Function]

      The generic function EQUALP-HASH is a called to compute a hash code for the
      Object.  Use-Pointer is a flag.  If true, the memory location of the object
      may be used to generate the hash code.  The default is false.

I couldn't figure out how use-pointer is to be used.  Are we saying anything
about what value a hash table supplies for this argument?  Up at the top you
seemed to imply that hash tables supply T.  Then what use is the NIL value?

      Depth is used to control the depth of recursion, allowing cuttoff.  This
      prevents infinite recursion in the case of circular references.  Callers of
      this function should not explicitely specify a value for this argument.
      Methods on this function which need to make recursive calls should simply
      pass the Depth argument along unchanged.

The depth thing is a new feature since CLtL p.81 says EQUALP is permitted
to be non-terminating if given circular arguments.

Perhaps we should leave our the optional arguments and keep things simple.
But that's not my main point, which is still to come.

      This function returns two values.  The first value is the hash code, which is
      a non-negative fixnum.  The second value is a flag indicating whether the
      memory location of the object (or any component) was used in generating the
      hash code.  A caller who specifies NIL for the Use-Pointer argument may
      reliably depend on the second value being false.

The second value is not well enough specified.  The problem is that if
an equalp-hash method for an object with components works by calling
equalp-hash recursively on each component and combining the results,
it has to combine both values.  Since you said the first value is
a non-negative fixnum, we know it can be combined with LOGXOR or
something more intelligent.  You haven't said anything about how to
combine the second value.  You may be assuming that the second value
is T or NIL and the appropriate combining function is OR, but you
haven't said that.  In fact a simple binary flag isn't really good
enough, because many implementations have multiple levels of memory
(volatility, ephemeralness, generations) and efficiency dictates that
the second value indicate which level was depended upon.  Possibly
it would work to require the second value to be an integer and use
MAX as the combining function.  Or perhaps there should be a specific
combining function for this purpose with an implementation-dependent
behavior.  Even that isn't good enough, because a user who writes
an equalp-hash method that actually computes a hash code, without
recursing into components, has to know what to return as the second
value.  We could say that NIL or 0 or some named constant means
the first value does not depend on any memory locations.

      This function is used by the implementation to compute hash codes for EQUALP
      hash-tables.  By specifying a NIL value for the second argument, it may also
      be used to compute a code which is independent of the particular
      "incarnation" or "core image", just as for SXHASH.

Now wait a minute, I think being independent of the core image and being
independent of objects changing their address are two different issues.
There can be other things besides memory addresses that are dynamically
allocated on a per-core-image basis.  Character codes in systems with
user-defined character registries are a classic example.  You need to
clarify whether use-pointer NIL means the value cannot change within
the core image (e.g. when a relocating GC changes memory addresses)
or means the value cannot change within the implementation (e.g. when
you apply equalp-hash in a different core image to an object that is
considered to be equivalent).

     Method Signatures:

      EQUALP-HASH (object NUMBER) &optional use-pointer depth     [Primary Method]
      EQUALP-HASH (object CHARACTER) &optional use-pointer depth  [Primary Method]
      EQUALP-HASH (object CONS) &optional use-pointer depth	      [Primary Method]
      EQUALP-HASH (object PATHNAME) &optional use-pointer depth   [Primary Method]
      EQUALP-HASH (object VECTOR) &optional use-pointer depth     [Primary Method]
      EQUALP-HASH (object ARRAY) &optional use-pointer depth      [Primary Method]
      EQUALP-HASH (object HASH-TABLE) &optional use-pointer	depth [Primary Method]
      EQUALP-HASH (object T) &optional use-pointer depth	      [Primary Method]

       The manner in which the hash code is computed by the standard methods on
       this function is implementation dependent.

It's pretty unclear what the EQUALP-HASH method on T would do with a
second argument of NIL.  What else can it use but the pointer (and
perhaps the class)?  I don't see what it can do but return a constant
value for all objects.  Maybe it would be better off signalling an
error.  Or maybe this is another argument against the use-pointer
feature.