Comments on X3J13 document 88-007, "The Loop Facility"



Here are my comments on the proposed standard for LOOP.  Please respond.

I have divided my comments into "substantive" comments and "editorial"
comments.  However, the "editorial" section should not be dismissed
as "merely editorial."  I have already filtered out all the "mere
editorial" comments, and what remains in this section are comments on
places where the document is inconsistent, manifestly incorrect, or
sufficiently inscrutable that it cannot reasonably be interpreted.  Thus
in all these places the document does not really specify how the software
behaves, and my "editorial" comments may really be "substantive", depending
on how the document was supposed to be interpreted.

My "substantive" comments, in contrast, refer to areas where the document
makes an unambiguous statement that I feel would be a serious mistake to
include in the standard.

Several of the examples are very poor, encouraging bad style and
obfuscating loop issues.  I started criticizing them individually, but
decided that that wasn't constructive without supplying suggested
replacement examples, and so far I haven't had time to concoct examples.
Hence, I have omitted all criticism of the examples from these remarks.
However, this is an issue that will use up a lot of time during the
writing of any real documentation for this iteration standard.

I did not read Chapter 3 at all, as I assume it would not be in the
standard.  I ought to have reviewed the Glossary, but I didn't.  I
didn't pay much attention to the data type stuff, as it is to be redone.
All page numbers refer to 88-007 as distributed with a date of May 15, 1988.


SUBSTANTIVE COMMENTS

2-13 This page is very difficult to understand, as it tries to discuss
simultaneously the two quite different constructs "for var = val" and
"for var = initial-val then val".  The latter construct has historically
been a major problem for users, because it is the only case where an
expression is not evaluated at the time one would expect.  I strongly
suggest eliminating it from the standard.  "for var first initial-val
then val" provides similar functionality and can easily be made equally
efficient in most cases (mistaken efficiency considerations, involving
preferring to set variables with LET over binding them with a LET and
setting them later with a SETQ, were the reason for adding the "= then"
construct in the original LOOP).  Implementations needing compatibility
with the "= then" construct can continue to provide it, of course.  
"for var = val" is then simply describable as "var is a locally bound
variable that gets set to val by the body on each iteration".  The times
when variables get set are then simply describable as "WITH variables
are set in the prologue, in the order they appear.  FOR/AS variables are
set in the body, in the order they appear."  This confusing business
with FOR/AS variables sometimes being set in the body and sometimes in
the prologue can be junked.
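
The simplified rule can be made concrete with a hand-written sketch of
what (loop with base = 10 for k in items for y = (+ base k) collect y)
would do under it.  This is only an illustration: the function name, the
TAIL and RESULT variables, and the shape of the expansion are
assumptions, not something 88-007 specifies.

```lisp
;; Hypothetical hand expansion of
;;   (loop with base = 10 for k in items for y = (+ base k) collect y)
(defun offsets (items)
  (let (base tail k y (result '()))
    ;; Prologue: WITH variables are set, in the order they appear.
    (setq base 10)
    (setq tail items)              ; internal state of the IN iteration
    (loop
      (when (endp tail)
        (return (nreverse result)))
      ;; Body: FOR/AS variables are set, in the order they appear.
      (setq k (pop tail))
      (setq y (+ base k))
      (push y result))))
```

With this model (offsets '(1 2 3)) yields (11 12 13), and no variable is
ever set at a time other than where its clause is written.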

2-45 (loop for e being each element of s ...) is a really ugly way to ask for
iteration through a sequence.  This is left over from Maclisp, where there
was not such a well-developed concept of a sequence data type as in Common
Lisp.  Six years ago I proposed the syntax (loop for e across s ...) for
this; why not adopt it?
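
A minimal sketch of how the shorter syntax reads, assuming a LOOP
implementation that accepts ACROSS for vectors (whether it should also
cover lists is not settled by this comment; the function name is
hypothetical):

```lisp
;; Sum the elements of a vector using the proposed ACROSS syntax.
(defun sum-elements (s)
  (loop for e across s
        sum e))
```

For example, (sum-elements #(1 2 3)) returns 6.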

2-46 I don't think the name ARRAY-ELEMENTS should be included in the
standard.  It perpetuates confusion between arrays and vectors.

2-50 The syntax USING (INDEX var) (SEQUENCE var) is wrong, because it
results in problems of syntactic ambiguity, especially if we ever make
DO optional.  The syntax should be USING (INDEX var SEQUENCE var).  The
operative rule here is that each LOOP preposition is followed by one and
only one Lisp expression, and USING should have the same syntax as a
preposition.

2-55 This business with DEFINE-LOOP-METHOD exporting things from the
LOOP package seems wrong to me.  The argument passed to the user's code
for the loop method should be the exact symbol that was specified in the
definition of the loop method.  Passing neither that symbol, nor the
symbol that was specified when the loop method was invoked, but instead
a third symbol, one in the LOOP package, seems pointless to me.
Furthermore I don't approve of behind-the-scenes package exporting; it
can lead to problems.

2-56 There are fundamental problems with DEFINE-LOOP-METHOD which I
think definitely must be fixed before standardizing.  Some of these
problems are inherited from the 10-year-old syntax of DEFINE-LOOP-PATH;
rather than simply renaming this macro, its syntax should have been
redesigned to be consistent with Common Lisp.  Other problems may be
simply editorial: the division of responsibility between LOOP and the
method-function is very poorly specified.  I'll enumerate the problems
and my proposed solutions; for brevity this is not a complete writeup.
If you want me to, I'll spend some more time and produce a complete
writeup.

  Instead of a separate function and method-specific-data to pass to it,
  define-loop-method should have a body, like define-setf-method.  Thus
  the syntax should be
   (define-loop-method name-or-names (iter-var options...)
                                     preposition-list
      &body body),
  and the preposition-list should be like a lambda-list: within
  body each preposition name is bound to the expression that followed it;
  &optional separates required prepositions from optional prepositions,
  and the latter can have default-value-forms and supplied-p variables.
  iter-var is a variable bound to the variable of iteration.
  options... are alternating keywords and arguments; the following
  keyword options are currently defined:
    :inclusive variable -- enables "inclusive" iteration and binds
                           the variable around the body to t if inclusive
                           iteration is used, nil otherwise.
    (one more, see below)

  The method body should return its values in the standard Common Lisp
  way, using multiple values, not as a list.

  Destructuring should be handled completely by LOOP; the iter-var argument
  may be a cons, and "variables" in the values returned by the method body
  may be conses, which LOOP will destructure.

  Data type processing should be handled completely by LOOP; the method
  body shouldn't even need to see it.  LOOP will attach any type declarations
  to the variable bindings where they are needed.

  Temporary variables created by the body should be created by calling
  LOOP:NAMED-VARIABLE, not by calling gensym.  This just needs to be stated
  explicitly.

  LOOP is responsible for reporting an error if USING specifies a name for
  a variable but the body does not call LOOP:NAMED-VARIABLE with that name.

  LOOP is responsible for reporting an error if there are missing prepositions,
  extra prepositions, or the same preposition used twice.

  There should be a way to define loop methods that can be invoked by
  (loop for var method-name ...) rather than or in addition to
  (loop for var being each method-name ...).  This can be done with
  an option :short-syntax preposition, where preposition is the name
  of a preposition that is assumed, thus (loop for var method-name ...)
  expands into (loop for var being each method-name preposition...).
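
  The preposition checking that LOOP would take over from the method
  body can be sketched roughly as follows.  CHECK-PREPOSITIONS and its
  argument conventions are hypothetical, invented for this illustration;
  nothing of the kind appears in 88-007.

```lisp
;; SUPPLIED is a plist (name expr name expr ...) as written by the user.
;; REQUIRED and OPTIONAL are lists of preposition names, as they would be
;; derived from a lambda-list-like preposition-list.
(defun check-prepositions (supplied required optional)
  (let ((seen '()))
    ;; Walk every other element, i.e. just the preposition names.
    (loop for name in supplied by #'cddr
          do (cond ((member name seen)
                    (error "Preposition ~S used twice." name))
                   ((not (or (member name required)
                             (member name optional)))
                    (error "Unexpected preposition ~S." name))
                   (t (push name seen))))
    (let ((missing (set-difference required seen)))
      (when missing
        (error "Missing required preposition(s): ~S." missing)))
    ;; Return the accepted names in the order they were written.
    (reverse seen)))
```

  With this in LOOP itself, a method body never has to validate its own
  prepositions; it only sees a well-formed set.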

2-65 I don't understand why GET-LOOP-METHOD is in the documentation.  I
don't see why anything other than the internals of LOOP would ever call this.

2-68 Are these backward compatibility constructs proposed for the standard?
I'd like to see them removed.

The order of evaluation of prepositional expressions in LOOP iteration
driving clauses needs to be defined, or explicitly undefined.  88-007
doesn't even define this for the built-in forms of iteration, let alone
the user-defined.  There are several possible approaches here.  The one
I prefer is to say that syntaxes 4 and 5 of FOR/AS are special cases
(these evaluate expressions in the body) not regarded as prepositions.
Those aside, all prepositional expressions are evaluated exactly once,
in the prologue, in the order that they are written, and the LOOP system
takes care of this (the individual LOOP methods don't have to worry about
it, the values passed to them are gensym variables where necessary to
assure order of evaluation, similar to DEFSETF).  Some might want
to have an escape in DEFINE-LOOP-METHOD that allows access to the
unevaluated prepositional phrase, however note that the only thing this
would be used for is the WITH-KEY preposition in the obsolete
HASH-ELEMENTS method; WITH-KEY is inconsistent with everything else,
I have always thought it was bogus (it should have been done with USING),
and I don't think we want to give users the tool to create more
inconsistencies like that one.
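
The "evaluated exactly once, in the prologue, in the order written"
approach can be sketched as below.  ONCE-ONLY-BINDINGS and DEMO-COUNT
are hypothetical names, and keywords stand in for preposition names
purely to keep the sketch independent of packages; this is an
illustration under those assumptions, not the mechanism of any actual
LOOP implementation.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun once-only-bindings (prepositions)
    "PREPOSITIONS is a plist (name expr name expr ...) in written order.
Returns a LET* binding list and a plist from each name to its temporary."
    (loop for (name expr) on prepositions by #'cddr
          for temp = (gensym (string name))
          collect (list temp expr) into bindings
          append (list name temp) into mapping
          finally (return (values bindings mapping)))))

(defmacro demo-count (&rest prepositions)
  "Hypothetical driver: requires :FROM, :TO and :BY, in any order."
  (multiple-value-bind (bindings mapping) (once-only-bindings prepositions)
    ;; The LET* is the prologue: each expression is evaluated exactly
    ;; once, in the order the user wrote it.
    `(let* ,bindings
       ;; The iteration itself only ever sees the temporaries.
       (do ((i ,(getf mapping :from) (+ i ,(getf mapping :by)))
            (acc '() (cons i acc)))
           ((> i ,(getf mapping :to)) (nreverse acc))))))
```

Writing (demo-count :to (f) :from (g) :by (h)) evaluates (f), then (g),
then (h), once each, before any stepping happens; the code that does the
stepping never has to think about evaluation order.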

I'd like to see a standardized way to define new value accumulation
clauses, in addition to this proposal's standardized ways (DEFLOOP and
DEFINE-LOOP-METHOD) to define new iteration driving clauses.  I suppose
that can always be added later.
EDITORIAL COMMENTS

1-3 "The expanded loop form is a lambda expression ... and a tagbody"
is not really true.  This is misleading and the document should be less
specific about what the macro expands into.  The binding can't be done
with a lambda-expression because of the mixture of parallel and sequential
binding.
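
A small sketch of the mixture in question, assuming WITH ... AND binds
in parallel and successive WITH clauses bind sequentially (the function
names are hypothetical):

```lisp
(defun with-loop (a)
  (loop with a = 1 and b = a      ; parallel: B sees the argument A
        with c = (+ a b)          ; sequential: C sees the new A and B
        return (list a b c)))

;; The same bindings written by hand need nested LETs; a single
;; lambda-expression binds all of its variables in parallel.
(defun with-by-hand (a)
  (let ((a 1) (b a))
    (let ((c (+ a b)))
      (list a b c))))
```

Both (with-loop 10) and (with-by-hand 10) return (1 10 11).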

1-6 It is useful to describe a category of "value returning clauses"
which includes all the value accumulation clauses plus always, never,
thereis, and return.

1-7 Assignment is a poor name for WITH, since it is a bind, not a SETQ.
Local variables would be a better description.

2-6 The last paragraph in the first section says FOR must precede NAMED,
but 2-42 says NAMED must precede everything else.  Which is correct?

2-7 The syntax of the AND conjunction of FOR clauses is defined only in
this example, which shows it as "AND FOR".  MIT LOOP allows FOR to be
either included or omitted here, and perhaps this proposal was intended
to be the same, but that is a mistake.  The FOR should be required to be
omitted, to avoid a syntactic ambiguity with a variable named FOR, and to
be consistent with WITH.  Thus the example should be
  (loop for x from 1 to 10
        and y first nil then x
        collect (list x y))

2-8 Syntax 1 for FOR is not described carefully enough.  Although each
of the clauses is optional, it is not valid to omit all three.  Also it
is permissible for the clauses to appear in any order, e.g. BY before
TO.  The description of how the direction (up or down) of stepping is
determined and when the loop terminates is hard to understand; it would
be better if it discussed one topic at a time.
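
A few sketches of the points above, assuming a LOOP implementation that
accepts the arithmetic prepositions in any order and defaults an omitted
FROM to 0 (the function names are hypothetical):

```lisp
;; BY written before TO.
(defun evens-to (limit)
  (loop for i from 0 by 2 to limit
        collect i))

;; The direction is fixed by DOWNFROM; TO supplies an inclusive limit.
(defun count-down (start)
  (loop for i downfrom start to 0
        collect i))

;; Only one of the three optional clauses is given.
(defun indices (n)
  (loop for i below n
        collect i))
```

Here (evens-to 6) gives (0 2 4 6), (count-down 3) gives (3 2 1 0), and
(indices 3) gives (0 1 2).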

2-14 This description is hard to understand.  It should simply say that
var is a local variable that is set within the body of the loop; on the
first iteration it is set to expr1, on all iterations after the first
it is set to expr2.

2-15 "the expression in the loop body is not evaluated" doesn't make
any sense.  I imagine this means "the loop body is not executed".

2-18 The first remark doesn't make sense unless "clause" is a typo
for "loop expression", i.e. an entire loop.  The second remark is
probably inaccurate; the fourth example on the next page makes it
clear that the FINALLY clause may or may not be evaluated depending
on which clause terminates the loop.
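
The dependence on the terminating clause can be sketched as follows,
assuming the usual behavior in which RETURN exits immediately while
WHILE and exhaustion of the list fall through to the epilogue (the
function names and the FINALLY-RAN flag are hypothetical):

```lisp
;; Terminated by RETURN: the FINALLY clause is skipped.
(defun first-negative (items)
  (let ((finally-ran nil))
    (values (loop for x in items
                  when (minusp x) return x
                  finally (setq finally-ran t))
            finally-ran)))

;; Terminated by WHILE: the FINALLY clause is executed.
(defun leading-non-negatives (items)
  (let ((finally-ran nil))
    (values (loop for x in items
                  while (>= x 0)
                  collect x
                  finally (setq finally-ran t))
            finally-ran)))
```

On the list (1 -2 3), FIRST-NEGATIVE returns -2 and NIL, while
LEADING-NON-NEGATIVES returns (1) and T.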

2-23,4 Wouldn't it be more consistent for COLLECT, etc. to allow a
type-spec just like COUNT, etc?  Of course it would only be useful if
Common Lisp defined type specifiers for subtypes of LIST, but the
LOOP syntax is easier to understand if there are fewer exceptions.

2-25,6,7 The description of the meaning of a type-spec without an INTO
preposition seems unreasonable; the type-spec should apply to the
anonymous variable that accumulates the count, not to <expr>.  <expr>'s
type is not useful since only its truth or falsity matters.  Having
the type-spec apply always to the accumulation variable, regardless of
whether it is named or anonymous, seems more consistent.

2-25 The default type for COUNT should not be FIXNUM, since in some
implementations FIXNUM might not include 0, and some implementations
(e.g. XCL) do not have a large enough range of fixnums to include all
reasonable counts.  The default should be an implementation-dependent
subtype of INTEGER, or the standard should not specify a default.

2-32 The comment about BLOCK and PROGN leaves the reader more confused
than enlightened.  I know what it's trying to say, namely that the
reason for existence of loop conditional clauses is solely to allow
conditionalization of loop value accumulation clauses, which have no
equivalent Lisp expressions, but that isn't what it actually says.
I'd sure like to see loop conditional clauses flushed, but in all these
years no one has been able to come up with a fully satisfactory
proposal, so it's probably difficult.

2-33 The second remark paragraph is quite unclear.  I wouldn't much
mind flushing the IT feature, but if it's going to be included it
needs to be unambiguously explained.  IT is neither a clause nor a
keyword.  The explanation in the original MIT documentation is much
clearer.

2-33 I don't see any discussion of the dangling ELSE issue.  To be
unambiguous, the specification cannot avoid this issue.
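
To show what has to be pinned down, here is a sketch assuming one
possible resolution: ELSE pairs with the nearest preceding conditional
that has no ELSE of its own, and an END keyword closes a conditional
explicitly.  Both rules are assumptions for the sake of the example, as
are the function names; 88-007 states neither.

```lisp
;; ELSE attaches to the inner WHEN.
(defun innermost-else (items)
  (loop for i in items
        when (evenp i)
          when (> i 2) collect i
          else collect :small))

;; END closes the inner WHEN, so ELSE attaches to the outer one.
(defun outer-else (items)
  (loop for i in items
        when (evenp i)
          when (> i 2) collect i end
        else collect :odd))
```

Under those assumptions (innermost-else '(1 2 3 4)) is (:small 4) and
(outer-else '(1 2 3 4)) is (:odd :odd 4).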

2-35 "Within and around these parts, you can bind...".  I can't figure
out what this means; certainly one cannot wrap LET forms around the
prologue, epilogue, or body.  I suggest removing this sentence since I
can't think of anything it could be trying to say.

2-35,7 Page 2-35 says FINALLY clauses go at the end of the epilogue
but page 2-37 says they go at the beginning of the epilogue.  Similarly,
2-35 says INITIALLY clauses go at the beginning of the prologue and
2-37 is silent on the subject.  I believe 2-35 is wrong in both cases.
I think prologue forms should be executed in the order that their
defining clauses appear, and FINALLY clauses should be executed
before all epilogue forms created by value returning clauses.

2-40 The relationship between data type declarations and destructuring
is not clear.  I believe the rule can be summed up as "destructuring has
precedence, but atomic type specifiers distribute".  In more words, a
data type specifier for a destructuring pattern is a tree of type
specifiers with the same shape as the tree of variables, with these two
exceptions:  When aligning these trees, an atom in the type specifier
tree matching a cons in the variable tree declares the same type for
each variable.  A cons in the type specifier tree matching an atom
in the variable tree is a nonatomic type specifier.  The case that is
not made clear by your document is the meaning of
  (loop for (x y) the (vector fixnum) in l do ...); by my rules
this means that x is a vector and y is a fixnum, not that both x and y
are vectors of fixnums.
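
Both halves of the rule can be sketched as below.  The sketch is written
with the OF-TYPE keyword rather than 88-007's THE, an assumption made so
that it runs in current implementations; the function names are
hypothetical.

```lisp
;; A type-specifier tree with the same shape as the variable tree:
;; NAME is a SYMBOL and WEIGHT is a FIXNUM.
(defun weigh (pairs)
  (loop for (name weight) of-type (symbol fixnum) in pairs
        collect (list weight name)))

;; An atomic type specifier against a cons in the variable tree
;; distributes: X and Y are both FIXNUMs.
(defun sum-pairs (pairs)
  (loop for (x y) of-type fixnum in pairs
        sum (+ x y)))
```

For example, (weigh '((a 1) (b 2))) gives ((1 a) (2 b)) and
(sum-pairs '((1 2) (3 4))) gives 10.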

2-44 The description of what prepositions are allowed where is quite
unclear.  The last paragraph on this page appears to say that from, downfrom,
upfrom, to, downto, upto, below, above, by are valid prepositions for
all forms of series iteration, but in fact they are only valid for
indexed series iteration [see next comment for terminology definition].
Also it isn't made clear that
  (loop for x being the elements of a downto 0 ...)
is valid, even though
  (loop for i downto 0 ...)
is not valid.
A much more precise description is needed here, and again on 2-63.

2-44,5 The use of the word sequence both to mean "the Common Lisp
SEQUENCE data type" and to mean "any ordered set of elements through
which you can iterate" is extremely confusing.  For example, it leads to
an out of place reference to DEFLOOP on page 2-45, and to confusion in
the third bullet on 2-56.  It would be better to follow the lead of OSS
and use "series" as the more general term.  It's a little more complex
in LOOP, though, as we have three concepts.  In order of increasing
generality:
  SEQUENCE -- the Common Lisp SEQUENCE data type, see ELEMENTS
  Indexed Series -- an ordered series of data elements that can be
                accessed by an integer index, see DEFLOOP
  Series -- an ordered series of data elements that can be accessed
                with some iteration method, not necessarily involving
                an integer index, see DEFINE-LOOP-METHOD
Calling all three of these "sequence" is really confusing.

2-44, 2-51 The syntax of series iteration shown is incorrect.  You show
IN/OF with no expression after it.

2-49 (make-package ("temp")) should be (make-package "temp").

2-51 The syntax for invoking loop methods with define-loop-method is
wrong.  It requires a minimum of two prepositions.  In fact giving no
prepositions at all is valid, if the loop method accepts it.
PRESENT-SYMBOLS is an example where giving no prepositions is valid.

2-53 Using define-loop-macro with a loop method name doesn't make sense,
since loop method names don't introduce clauses, but only appear inside
FOR/AS clauses.

I think the overall framework of iteration should be described early,
before launching into the individual clauses.  As is, the description
of FOR cannot really be understood without referring to material near
the very end of the document.  The LOOP specification needs to be very
clear about the order of evaluation of various parts of the loop, when
variables are bound, when variables are setq'ed, and when endtests are
executed.

The document does not specify any rule about duplicate variable bindings.
It should say "Binding the same variable twice in variable-binding clauses 
of a single LOOP expression is an error" or something stronger.  For
example, (loop with x = '(1 2 3) for y in l collect y into x finally
(return x)) should not be a valid loop expression, in my opinion.