[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)



Ok, I've been through and I think successfully merged all the pending
discussion, most of which seemed to center around issues of :UP.

-----
Issue:          PATHNAME-SUBDIRECTORY-LIST
References:     Pathnames (pp410-413), MAKE-PATHNAME (p416),
		PATHNAME-DIRECTORY (p417)
Category:       CHANGE
Edit history:   18-Jun-87, Version 1 by Ghenis.pasa@Xerox.COM
	        05-Jul-88, Version 2 by Pitman (major revision)
		28-Dec-88, Version 3 by Pitman (merge discussion)
Status:	        For Internal Discussion
Related-Issues: PATHNAME-COMPONENT-CASE

Problem Description:

  It is impossible to write portable code that can produce a pathname
  in a subdirectory of a hierarchical file system. This defeats much of
  the purpose of having an abstraction like pathname.

  According to CLtL, only a string is a portable filler of the directory
  slot, but in order to denote a subdirectory, the use of separators (such
  as dots, slashes, or backslashes) would be necessary. The very fact that
  such syntax varies from host to host means that although the
  representation might be "portable", the code using that representation 
  is not portable.

  This problem is even worse for programs running on machines on a network
  that can retrieve files from multiple hosts, each using a different OS
  and thus a different subdirectory delimiter.

  Related problems:

  - In some implementations "FOO.BAR" might denote the "BAR" subdirectory
    of "FOO" while in other implementations because "." is not the
    separator. To be safe, portable programs must avoid all potential
    separators.

  - Even in implementations where "." is the separator, "FOO.BAR" may be
    recognized by some to mean the "BAR" subdirectory of "FOO" and by others
    to mean `a seven letter directory with "." being a superquoted part of
    its name'.

  - In fact, CLtL does not even say for toplevel directories whether the
    directory delimiters are a part. eg, is "foo" or "/foo" the directory
    filler for a unix pathname "/foo/bar.lisp". Similarly, is "[FOO]" or
    "FOO" the directory filler for a VMS pathname "[FOO]ME.LSP"?

Proposal (PATHNAME-SUBDIRECTORY-LIST:NEW-REPRESENTATION)

  Allow a list to be a filler of a pathname. The car of the list may be either
  of the symbols :ABSOLUTE or :RELATIVE.

  If the car of the list is :RELATIVE, the rest of the list is the
  implementation-dependent result of PARSE-NAMESTRING for file systems which
  have relative pathnames. Unless some other proposal is submitted to clarify
  the behavior of relative pathnames in merging, etc. that behavior is left
  undefined.

  If the car of the list is :ABSOLUTE, the rest of the list is a list of 
  strings each naming a single level of directory structure. The strings
  should contain only the directory names themselves -- no separator
  characters.

  The spec (:ABSOLUTE) represents the root directory.

  Clarify that if a string is used as a filler of a directory field in a
  pathname, it should be the unadorned name of a toplevel directory.
  Specifying a string, str, is equivalent to specifying the list
  (:ABSOLUTE str).

  In place of a string, at any point in the list, keyword symbols may occur
  to deal with special file notations. The following symbols have standard
  meanings; they may not be meaningful for all operating systems, and are
  intended for use only on those operating systems where they have meaning:

   :WILD           - Wildcard match of one level of directory structure.
   :WILD-INFERIORS - Wildcard match of any number of directory levels.
   :UP             - Go upward in directory structure (semantic).
   :BACK 	   - Go upward in directory structure (syntactic).

  The difference between up and back is that if there is a directory
    (:ABSOLUTE "X" "Y" "Z")
  linked to 
    (:ABSOLUTE "A" "B" "C")
  and there also exist directories
    (:ABSOLUTE "A" "B" "Q")
    (:ABSOLUTE "X" "Y" "Q")
  then
    (:ABSOLUTE "X" "Y" "Z" :OUT "Q")
  designates
    (:ABSOLUTE "A" "B" "Q")
  while
    (:ABSOLUTE "X" "Y" "Z" :UP "Q")
  designates
    (:ABSOLUTE "X" "Y" "Q")

Test Case:

  (PATHNAME-DIRECTORY (PARSE-NAMESTRING "[FOO.BAR]BAZ.LSP")) ;on VMS
  => (:ABSOLUTE "FOO" "BAR")

  (PATHNAME-DIRECTORY (PARSE-NAMESTRING "/foo/bar/baz.lisp")) ;on Unix
  => (:ABSOLUTE "foo" "bar")
  or (:ABSOLUTE "FOO" "BAR")
  If PATHNAME-COMPONENT-CASE:CANONICALIZE passes, only the 2nd return value.

  (PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>**>bar>baz.lisp")) ;on LispM
  => (:ABSOLUTE "FOO" :WILD-INFERIORS "BAR")

  (PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>*>bar>baz.lisp")) ;on LispM
  => (:ABSOLUTE "FOO" :WILD "BAR")

Rationale:

  This would allow programs to usefully deal with hierarchical file systems,
  which are by far the most common file system type.

Current Practice:

  Symbolics Genera implements something very similar to this. The main
  differences are:
   - In Genera, there is no :ABSOLUTE keyword at the head of the list.
     This has been shown to cause some problems in dealing with root
     directories. Genera represents the root directory by a keyword
     symbol (rather than a list) because the list representation 
     was not adequately general.
   - Genera represents Unix ".." as :UP, but deals with :UP 
     syntactically, not semantically.

Cost to Implementors:

  In principle, nothing about the implementation needs to change except
  the treatment of the directory field by MAKE-PATHNAME and
  PATHNAME-DIRECTORY. The internal representation can otherwise be left
  as-is if necessary.

  For implementations that choose to rationalize this representation
  throughout their internals and any other implementation-specific
  accessors, the cost will be necessarily higher.

Cost to Users:

  None. This change is upward compatible.

Cost of Non-Adoption:

  Serious portability problems would continue to occur. Programmers would be
  driven to the use of implementation-specific facilities because the need
  for this is frequently impossible to ignore.

Benefits:

  The serious costs of non-adoption would be avoided.

Aesthetics:

  This representation of hierarchical pathnames is easy to use and quite
  general. Users will probably see this as an improvement in the aesthetics.

Discussion:

  This issue was raised a while back but no one was fond of the particular
  proposal that was submitted. This is an attempt to revive the issue.

  The original proposal, to add a :SUBDIRECTORIES field to a pathname, was
  discarded because it imposed an unnatural distinction between a toplevel
  directory and its subdirectories. Pitman's guess is the the idea was to
  try to make it a compatible change, but since most programmers will 
  probably want to change from implementation-specific primitives to portable
  ones anyway, that's probably not such a big deal. Also, there might have
  been some programs which thought the change was compatible and ended up
  ignoring important information (the :SUBDIRECTORIES field). Pitman thought
  it would be better if people just accepted the cost of an incompatible
  change in order to get something really pretty as a result.

  This issue used to address the issue of relative pathnames (pathnames
  relative to some default which is separately maintained). Pitman removed
  this issue for now in order to simplify things. He feels the issue should
  be resubmitted under separate cover so that it can be discussed separately.