[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: REMF-DESTRUCTION-UNSPECIFIED (Version 2)
- To: Masinter.pa@Xerox.COM
- Subject: Re: REMF-DESTRUCTION-UNSPECIFIED (Version 2)
- From: David L. Andre <DLA@DIAMOND.S4CC.Symbolics.COM>
- Date: Mon, 2 Nov 87 19:43 EST
- Cc: CL-Cleanup@SAIL.Stanford.EDU, DLA@DIAMOND.S4CC.Symbolics.COM, KMP@DIAMOND.S4CC.Symbolics.COM
- In-reply-to: <871030-114930-5868@Xerox>, <871030-115402-5874@Xerox>, <871031-165117-6815@Xerox>
Date: 30 Oct 87 11:49 PST
I am in favor of making the behavior of CL as well specified as
possible. I also think that users should be discouraged from taking
advantage of subtle details. I don't think the two are incompatible; one
is a matter of portability and more complete programming language
semantics, and the other is a matter of good programming practice.
Here we are in agreement. I don't think anything you said is
incompatible with MAKE-EXPLICITLY-VAGUE; the key here is explicit. We
should make the language explicit so users know what to expect.
Certainly both alternatives are better than the present situation.
[You already said what I was going to say about compiler optimizers, so
I deleted it.]
The performance argument given is pretty weak. Perhaps the right
solution for DLA would be to have a SCL:NREVERSE which had the desired
(undefined) side-effect behavior? To be taken seriously, I'd like to
hear some actual statistics, e.g., is there any evidence that this would
make more than a 5% difference in total runtime in any implementation
for any benchmark?
You're making the mistake of assuming that all performance effects can
be measured by runtimes, and that you can use a threshold on whatever
that is to make a decision. Runtimes are important for benchmarks, but
working-set overhead, garbage-collector overhead, and secondary effects
due to noncollection of garbage usually dominate for large applications.
I hope I am not taking a leap of faith in assuming that Common Lisp is
intended to be used for real applications.
There are some particular elements of the MAKE-EXPLICITLY-VAGUE proposal
which I feel strongly about: REMF, NREVERSE, and DELETE. Of course,
arguments on one function extend naturally to the family of functions
with the same behavior.
REMF: We have measured substantial *space* improvements in large
systems by making REMF (and REMPROP) replace the pointer to the removed
property with NIL. Thus we make (REMF a b) roughly equivalent to
(PROGN (SETF (GETF a b) NIL) (REMF a b)). This yields substantially
better garbage-collector performance because the garbage collector can
immediately reclaim the property without reclaiming the list first.
Especially if the property list is connected with a symbol, it can be
quite a bit older. More reclaimed garbage means less total space used
by user structures, less frequent garbage collections, and more locality
REMF (weaker): In the Symbolics implementation, cdr-coded list blocks
are reclaimed or kept only in their entirety. For example,
(LAST (MAKE-LIST 1000)) takes up 1000 cells, even if you only have a
pointer to the last cell. For cdr-coded property lists, this means that
a the REMF'd portion of the property list may never get reclaimed. To
solve this, we further change REMF so it is roughly equivalent to
(PROGN (MULTIPLE-VALUE-BIND (IGNORE IGNORE TAIL)
(GET-PROPERTIES a (LIST b))
(SETF (CADR TAIL) NIL) ;This is what we did in the last paragraph.
(SETF (CDDR TAIL) NIL))) ;This allows the removed conses to be reclaimed.
(REMF a b))
NREVERSE: Fast, linear algorithms exist for reversing lists by
shuffling elements, by shuffling conses, or by a combination of those
two strategies. I believe individual implementations should be allowed
to choose the implementation which is faster for them.
NREVERSE (weaker): In implementations with cdr-coded lists, a
significant performance degradation would be introduced if NREVERSE were
forced to be implemented with RPLACD. The speed difference alone on a
cdr-coded list could make this argument, and additional GC overhead (to
reclaim the original conses) and working-set overhead (to accommodate the
resulting non-coded list) makes it worse. I could get exact numbers
for you if you want, but on the Symbolics implementation, the speedup is
at least 3:1.
DELETE: KMP mentioned this in his proposal because it is an example
where cdr-coded implementations can use %CHANGE-LIST-TO-CONS rather than
RPLACD. Once again, %CHANGE-LIST-TO-CONS is simply two memory
references, while a RPLACD of a cdr-coded list can take many times that
amount of time, escaping from microcode to run the macrocode escape
routine, updating the cons caches, allocating pages, and GC scavenging.
Whether this makes a 1% difference or a 1000% difference in your program
depends on how cdr-coded your list is, how paging-bound your application
is, how long your list is, and how many elements in the list will be
On the subject of compilers, it is practically impossible to implement
any of the features above based on compile-time analysis. When
compiling a function, you have to have complete lexical knowledge of the
creation, usage, and creation of pointers to a list before such an
optimization is useful. For the above functions, this would cause
the compiler optimizers to be fairly worthless.
Date: 31 Oct 87 16:50 PST
Frankly, I'm not particularly willing to adopt MAKE-EXPLICITLY-VAGUE on
the basis of mere speculation that this might allow some performance
improvements. The example of using %CHANGE-LIST-TO-CONS rather than
RPLACD is given, but no guidelines as to how important that could
possibly be. I have a hard time imagining that there is any program,
real or artificial, that the MAKE-EXPLICITLY-VAGUE will make more than
trivially faster. Here's my best shot at imagining one: suppose you
constructed a cdr-coded list, did DELETE on every other element using
RPLACD instead of %CHANGE-LIST-TO-CONS, and then CDR'd down it lots.
Would it run more than 5% more slowly? Wouldn't the next GC remove the
This is NOT mere speculation, and I resent your calling it that. I would
not be proposing this if it were frivolous.
In your example, DELETE would take approximately 4.5 times as long as it
otherwise would. This is without taking into account any GC or paging
effects. The amount of time it would take to subsequently walk the list
would be little affected. The next GC would indeed remove forwarding
pointers. However there is additional overhead because GC will be
forced to run sooner. See the appendix to this message.
At the risk of repeating myself, be careful to avoid confusing good
programming practice with good programming language design. Whether
users *should* write programs which depend upon exact specifications is
a style issue.
I'm confused about Pitman's assertion that a purist would prefer
MAKE-EXPLICITLY-VAGUE. Which kind of purity? I can't imagine any part of
the programming task that MAKE-EXPLICITLY-VAGUE would make more
convenient, easier, more facile, less error-prone.
I prefer MAKE-EXPLICITLY-VAGUE, although some of the vagueness KMP has
come up with I believe is unnecessary. I believe what he meant was that
a purist would argue that this forces the programmer to respect a
certain level of abstraction. (I only speculate because KMP has not
I don't think the Benefits given for MAKE-EXPLICITLY-DEFINED are what I
would write; I think the major benefit is in portability of debugged
I agree. However, as I pointed out to KMP privately, my experience has
been that programs have much more difficulty being ported due to their
doing things something which "is an error", and only one of the two
implementations actually signal an error. The only times I have seen
problems converting code based on these assumptions is when converting
old MACLISP programs which use the then-documented side-effects.
I don't think it is much of a benefit that users CAN write
programs which exploit them; rather, for those who must deal with
programs where the original programmers DID exploit them, the programs
will be more portable.
If I had to, I would rank order these in the order of importance for
specificity. For example, it seems more important to give NCONCs
behavior as specified than it does to give NREVERSE.
I agree, sort of. I can imagine an implementation of NCONC which used
%CHANGE-LIST-TO-CONS on the tail of the first list if it were cdr-coded,
then explicitly created a CONS for the missing element. This wouldn't
buy anything in our implementation, but it's conceivable it might in
There are many instances why substantial performance improvements have
been realized by techniques allowed under MAKE-EXPLICITLY-VAGUE. Many
of these improvements have been part of Symbolics Common Lisp for many
years, and I am unaware of any bug reports relating to them, which tends
to support the claim that users don't use undocumented side-effects.
On the other hand, oftentimes people with large applications are stumped
as to why their application, which benchmarked well during testing, has
gotten so slow on real problems. Or they wonder why they can't collect
any garbage on their machines even though they have removed all pointers
to their objects. Such things are quite difficult for the uninitiated
to track down. When I tell them, say, to do
(PROGN (SETF (GETF a b) NIL) (REMF a b)) instead of (REMF a b), it only
incites them to complain that Common Lisp is a language only
implementors can understand.
(DEFUN MASINTER-DELETE (ELEMENT LIST)
(LET ((TOP (LOCF LIST))) ;The (LOCF LIST) could be (CONS NIL LIST) for portability.
(DO ((L LIST (CDR L))
(PREVIOUS TOP L))
((NOT (EQL (CAR L) ELEMENT)))
(SETF (CDR PREVIOUS) (CDR L))
(SETQ L (CDR L))
(WHEN (ENDP L)
(RETURN-FROM MASINTER-DELETE (CDR TOP)))))))
Running this on a list, 1000 long, of alternating A and B. This timing
excludes most paging and GC effects, which would tend to dominate in
larger programs. Symbolics 3640, In-House development software.
MASINTER-DELETE A took 0.07725 seconds
MASINTER-DELETE B took 0.070475 seconds
DELETE A took 0.017822 seconds
DELETE B took 0.017825 seconds
MASINTER-DELETE A took 0.077003 seconds
MASINTER-DELETE B took 0.076675 seconds
DELETE A took 0.017718 seconds
DELETE B took 0.017838 seconds
MASINTER-DELETE A took 0.078764 seconds
MASINTER-DELETE B took 0.07683 seconds
DELETE A took 0.017821 seconds
DELETE B took 0.017826 seconds