[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: HASH-TABLE-PACKAGE-GENERATORS (version 2)



This is revised at long last.   Moon and I agree on the new syntax.

Note also that the hash-table return value sequence was changed to
accord with more peoples' expectations (i.e., <key> before <value>)

-- JonL --

!

Issue:         HASH-TABLE-PACKAGE-GENERATORS

References:    

Category:      ADDITION

Edit history:  Version 1, 23-May-88 JonL
	       Version 2,  6-Oct-88 JonL (convert to "with" scoping).


Problem description:

The Iteration subcommittee would like the several iteration proposals to be
writeable in portable Common Lisp code.  Unfortunately, the only complete
access to hash-tables and packages is through MAPHASH and DO-SYMBOLS (and
DO-EXTERNAL-SYMBOLS and DO-ALL-SYMBOLS); none of these existing primitives 
is satisfactory for building complex iteration clauses.


Proposal (HASH-TABLE-PACKAGE-GENERATORS:ADD-WITH-WRAPPER)

Add two new macros WITH-HASH-TABLE-ITERATOR and WITH-PACKAGE-ITERATOR 
to the language as follows:

    WITH-HASH-TABLE-ITERATOR ((<next-fn> <hash-table>) &body body)      [Macro]
    
    Within the lexical scope of 'body', the name <next-fn> is defined
    via MACROLET such that each invocation (<next-fn>) will return
    the successive items from the hash-table which is the value of
    the expression <hash-table>.  Iterating over such a form will thus 
    make the contents of the hash-table available one at a time.  An 
    invocation (<next-fn>) returns three values as follows:
      ;;    1. a boolean to indicate no more entries (T says "there are more")
      ;;    2. the key item (of a <key, value> pair)
      ;;    3. the value item (of a <key, value> pair)
      ;; If there are no more entries in the hash-table, then only one 
      ;;  value is returned, namely NIL.


    WITH-PACKAGE-ITERATOR ((<next-fn> <package> <type>) &body body)     [Macro]

    Within the lexical scope of 'body', the name <next-fn> is defined
    via MACROLET such that each invocation (<next-fn>) will return
    the successive items from the package which is the value of the
    expression <package>.  Iterating over such a form will thus make the
    contents of the package available one at a time.  The <type> argument
    may assume any of the following values, indicating  the sort of 
    symbols wanted:
            DO-SYMBOLS			;any available symbol
            DO-EXTERNAL-SYMBOLS		;just the :external symbols
            DO-INTERNAL-SYMBOLS		;just the :internal symbols
            DO-PRESENT-SYMBOLS		;both :internal and :external symbols
            DO-ALL-SYMBOLS		;all symbols; 
   When <type> evaluates to DO-ALL-SYMBOLS, the <package> argument is 
   ignored.   An invocation (<next-fn>) returns three values as follows:
      ;;    1. a boolean to indicate no more entries (T says "there are more")
      ;;    2. a symbol (available in the indicated package)
      ;;    3. the availability type for that symbol, (i.e. one of
      ;;       :INTERNAL, :EXTERNAL, or :INHERITED).
      ;; If there are no more symbols available from the package, then only 
      ;;  one value is returned, namely NIL.


There is no guarantee that any state implicit in the invocation of the 
form (<next-fn>) will survive outside the scope of the WITH-... form.


Test-case:

The following function should return T on any hash-table, and signal
an error if the usage of 'with-hash-table-iterator' doesn't agree
with the corresponding usage of 'maphash'.

(defun test-ht-generator-test (hash-table)
  (let ((all-entries '())
	(generated-entries '())
	(unique (list nil)))
    (maphash #'(lambda (key value) (push (list key value) all-entries))
	     hash-table)
    (with-hash-table-iterator (generator-fn hash-table)
      (loop 	
	;;Note -- this is the "trivial" LOOP of CLtL p121
	(multiple-value-bind (more? key value) (generator-fn)
	  (unless more? (return))
	  (unless (eql value (gethash key hash-table unique))
	    (error "Key ~S not found for value ~S" key value))
	  (push (list key value) generated-entries))))
    (unless (= (length all-entries)
	       (length generated-entries)
	       (length (union all-entries generated-entries :test #'equal)))
      (error "Generated entries and Maphash entries don't correspond"))
    t))


The following function should return T on any package, and signal
an error if the usage of 'with-package-iterator' doesn't agree
with the corresponding usage of 'do-symbols'.

(defun test-pk-generator-test (package)
  (unless (packagep package)
    (setq package (find-package package)))
  (let ((all-entries '())
	(generated-entries '()))
    (do-symbols (x package) 
      (multiple-value-bind (symbol accessibility) 
		(find-symbol (symbol-name x) package)
	(push (list symbol accessibility) all-entries)))
    (with-package-iterator (generator-fn package 'do-symbols)
      (loop 	
	;;Note -- this is the "trivial" LOOP of CLtL p121
	(multiple-value-bind (more? symbol accessibility) (generator-fn)
	  (unless more? (return))
	  (let ((l (multiple-value-list (find-symbol (symbol-name symbol) 
						     package))))
	    (unless (equal l (list symbol accessibility))
	      (error "Symbol ~S not found as ~S in package ~A [~S]"
		     symbol accessibility (package-name package) l))
	    (push l generated-entries)))))
    (unless (and (subsetp all-entries generated-entries :test #'equal)
		 (subsetp generated-entries all-entries :test #'equal))
     (error "Generated entries and Do-Symbols entries don't correspond"))
    t))

       

Rationale:

The particular way in which hash-tables and packages are represented
need not be standardized, or even exposed to the user.  Yet a simpler 
handle on them is needed for the various iteration paradigms to be written 
in portable code.  In fact, after these iterator macros are put into an 
implementation, then MAPHASH and DO-<mumble>-SYMBOLS are trivial usages 
of them; but no _efficient_ use of the current primitives will provide 
the effect of the new macros, namely a form that _returns_ the elements
of a table "one by one".


Current Practice:

Nobody does it this way, but both Symbolics and Lucid are not far off.

Cost to Implementors:

Moderate.  Possibly a couple day's to a week's work for an implementation 
that has to start completely afresh.

Cost to Users:

None.

Benefits:

Will provide a more basic primitive for iterating over hash-tables and 
packages; will permit new iteration paradigms to be written in portable code.

Aesthetics:

All other things being equal, it is better to have more general primitives
than less general ones.  


Discussion:

One must be careful not to assume that the invocation (<next-fn>) is a 
"generator" function call -- since <next-fn> is MACROLET'd in an 
implementation dependent way, it could even turn into a special form like
    (if something
        (values nil)
        (yet-another-function-call))

The Iteration Subcommittee supports this proposal (or, "used to" -- 
JonL 6-Oct-88).

The scoping called for herein may not be quite so useful to the "generators"
style proposals; in particular they offer an interface wherein one may 
create a "generator" function of indefinite extent that returns, one-by-one,
the elements of the table.  The constrained scoping implicit in these
WITH-... macros is not so much for any kind of optimization, but rather
for coordination of such hash-table "locking" as may occur in multi-
processing implementations like Symbolics.  Nevertheless, Dick Waters 
thinks these macros should be put in anyway, since it clearly is a 
requirement for a portable LOOP, and can be use in a limited context 
(i.e., not "indefinite scope") for portable versions of ITERATE and OSS.