[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
CL-Cleanup-mailer
- To: alarson@src.honeywell.com (Aaron Larson)
- Subject: CL-Cleanup-mailer
- From: masinter.pa@Xerox.COM
- Date: 15 Mar 89 07:14 PST
- Cc: cl-cleanup@sail.stanford.edu
Unfortunately, there is a "nest" of cleanup items on pathnames
that have been postponed. Here's PATHNAME-SUBDIRECTORY-LIST.
----- Begin Forwarded Messages -----
Date: Wed, 28 Dec 88 14:32 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)
To: CL-Cleanup@SAIL.STANFORD.EDU
cc: KMP@STONY-BROOK.SCRC.Symbolics.COM
Ok, I've been through and I think successfully merged all the pending
discussion, most of which seemed to center around issues of :UP.
-----
Issue: PATHNAME-SUBDIRECTORY-LIST
References: Pathnames (pp410-413), MAKE-PATHNAME (p416),
PATHNAME-DIRECTORY (p417)
Category: CHANGE
Edit history: 18-Jun-87, Version 1 by Ghenis.pasa@Xerox.COM
05-Jul-88, Version 2 by Pitman (major revision)
28-Dec-88, Version 3 by Pitman (merge discussion)
Status: For Internal Discussion
Related-Issues: PATHNAME-COMPONENT-CASE
Problem Description:
It is impossible to write portable code that can produce a pathname
in a subdirectory of a hierarchical file system. This defeats much of
the purpose of having an abstraction like pathname.
According to CLtL, only a string is a portable filler of the directory
slot, but in order to denote a subdirectory, the use of separators (such
as dots, slashes, or backslashes) would be necessary. The very fact that
such syntax varies from host to host means that although the
representation might be "portable", the code using that representation
is not portable.
This problem is even worse for programs running on machines on a network
that can retrieve files from multiple hosts, each using a different OS
and thus a different subdirectory delimiter.
Related problems:
- In some implementations "FOO.BAR" might denote the "BAR" subdirectory
of "FOO" while in other implementations because "." is not the
separator. To be safe, portable programs must avoid all potential
separators.
- Even in implementations where "." is the separator, "FOO.BAR" may be
recognized by some to mean the "BAR" subdirectory of "FOO" and by others
to mean `a seven letter directory with "." being a superquoted part of
its name'.
- In fact, CLtL does not even say for toplevel directories whether the
directory delimiters are a part. eg, is "foo" or "/foo" the directory
filler for a unix pathname "/foo/bar.lisp". Similarly, is "[FOO]" or
"FOO" the directory filler for a VMS pathname "[FOO]ME.LSP"?
Proposal (PATHNAME-SUBDIRECTORY-LIST:NEW-REPRESENTATION)
Allow a list to be a filler of a pathname. The car of the list may be either
of the symbols :ABSOLUTE or :RELATIVE.
If the car of the list is :RELATIVE, the rest of the list is the
implementation-dependent result of PARSE-NAMESTRING for file systems which
have relative pathnames. Unless some other proposal is submitted to clarify
the behavior of relative pathnames in merging, etc. that behavior is left
undefined.
If the car of the list is :ABSOLUTE, the rest of the list is a list of
strings each naming a single level of directory structure. The strings
should contain only the directory names themselves -- no separator
characters.
The spec (:ABSOLUTE) represents the root directory.
Clarify that if a string is used as a filler of a directory field in a
pathname, it should be the unadorned name of a toplevel directory.
Specifying a string, str, is equivalent to specifying the list
(:ABSOLUTE str).
In place of a string, at any point in the list, keyword symbols may occur
to deal with special file notations. The following symbols have standard
meanings; they may not be meaningful for all operating systems, and are
intended for use only on those operating systems where they have meaning:
:WILD - Wildcard match of one level of directory structure.
:WILD-INFERIORS - Wildcard match of any number of directory levels.
:UP - Go upward in directory structure (semantic).
:BACK - Go upward in directory structure (syntactic).
The difference between up and back is that if there is a directory
(:ABSOLUTE "X" "Y" "Z")
linked to
(:ABSOLUTE "A" "B" "C")
and there also exist directories
(:ABSOLUTE "A" "B" "Q")
(:ABSOLUTE "X" "Y" "Q")
then
(:ABSOLUTE "X" "Y" "Z" :OUT "Q")
designates
(:ABSOLUTE "A" "B" "Q")
while
(:ABSOLUTE "X" "Y" "Z" :UP "Q")
designates
(:ABSOLUTE "X" "Y" "Q")
Test Case:
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "[FOO.BAR]BAZ.LSP")) ;on VMS
=> (:ABSOLUTE "FOO" "BAR")
(PATHNAME-DIRECTORY (PARSE-NAMESTRING "/foo/bar/baz.lisp")) ;on Unix
=> (:ABSOLUTE "foo" "bar")
or (:ABSOLUTE "FOO" "BAR")
If PATHNAME-COMPONENT-CASE:CANONICALIZE passes, only the 2nd return value.
(PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>**>bar>baz.lisp")) ;on LispM
=> (:ABSOLUTE "FOO" :WILD-INFERIORS "BAR")
(PATHNAME-DIRECTORY (PARSE-NAMESTRING ">foo>*>bar>baz.lisp")) ;on LispM
=> (:ABSOLUTE "FOO" :WILD "BAR")
Rationale:
This would allow programs to usefully deal with hierarchical file systems,
which are by far the most common file system type.
Current Practice:
Symbolics Genera implements something very similar to this. The main
differences are:
- In Genera, there is no :ABSOLUTE keyword at the head of the list.
This has been shown to cause some problems in dealing with root
directories. Genera represents the root directory by a keyword
symbol (rather than a list) because the list representation
was not adequately general.
- Genera represents Unix ".." as :UP, but deals with :UP
syntactically, not semantically.
Cost to Implementors:
In principle, nothing about the implementation needs to change except
the treatment of the directory field by MAKE-PATHNAME and
PATHNAME-DIRECTORY. The internal representation can otherwise be left
as-is if necessary.
For implementations that choose to rationalize this representation
throughout their internals and any other implementation-specific
accessors, the cost will be necessarily higher.
Cost to Users:
None. This change is upward compatible.
Cost of Non-Adoption:
Serious portability problems would continue to occur. Programmers would be
driven to the use of implementation-specific facilities because the need
for this is frequently impossible to ignore.
Benefits:
The serious costs of non-adoption would be avoided.
Aesthetics:
This representation of hierarchical pathnames is easy to use and quite
general. Users will probably see this as an improvement in the aesthetics.
Discussion:
This issue was raised a while back but no one was fond of the particular
proposal that was submitted. This is an attempt to revive the issue.
The original proposal, to add a :SUBDIRECTORIES field to a pathname, was
discarded because it imposed an unnatural distinction between a toplevel
directory and its subdirectories. Pitman's guess is the the idea was to
try to make it a compatible change, but since most programmers will
probably want to change from implementation-specific primitives to portable
ones anyway, that's probably not such a big deal. Also, there might have
been some programs which thought the change was compatible and ended up
ignoring important information (the :SUBDIRECTORIES field). Pitman thought
it would be better if people just accepted the cost of an incompatible
change in order to get something really pretty as a result.
This issue used to address the issue of relative pathnames (pathnames
relative to some default which is separately maintained). Pitman removed
this issue for now in order to simplify things. He feels the issue should
be resubmitted under separate cover so that it can be discussed separately.
----- Next Message -----
Date: Thu, 29 Dec 88 13:29 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)
To: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
cc: CL-Cleanup@SAIL.STANFORD.EDU
In-Reply-To: <881228143206.5.KMP@BOBOLINK.SCRC.Symbolics.COM>
I approve PATHNAME-SUBDIRECTORY-LIST:NEW-REPRESENTATION but note the
following typos and proposed simplifications. Also, don't you need
functions like Symbolics' DIRECTORY-PATHNAME-AS-FILE and
PATHNAME-AS-DIRECTORY? The conversion between the name of a directory
and the directory component of a file inferior to that directory is
system-dependent, for example TOPS-20 appends a type field and Unix does
not. Also in some systems the root directory has a name and in others
it doesn't. Of course these functions signal an error in
non-hierarchical file systems. Should there be a separate proposal for
these?
Typos (and some discussion):
Problem Description:
- In some implementations "FOO.BAR" might denote the "BAR" subdirectory
of "FOO" while in other implementations because "." is not the
separator.
Some words must be missing after "while".
Proposal (PATHNAME-SUBDIRECTORY-LIST:NEW-REPRESENTATION)
:UP - Go upward in directory structure (semantic).
:BACK - Go upward in directory structure (syntactic).
The difference between up and back is that if there is a directory
(:ABSOLUTE "X" "Y" "Z")
linked to
(:ABSOLUTE "A" "B" "C")
and there also exist directories
(:ABSOLUTE "A" "B" "Q")
(:ABSOLUTE "X" "Y" "Q")
then
(:ABSOLUTE "X" "Y" "Z" :OUT "Q")
designates
(:ABSOLUTE "A" "B" "Q")
while
(:ABSOLUTE "X" "Y" "Z" :UP "Q")
designates
(:ABSOLUTE "X" "Y" "Q")
Is :OUT a typo for :BACK? Also I don't understand what your proposed
semantic/syntactic distinction is. I almost thought I did until I read
the above text carefully and saw that the syntactic one chases links
to the truename and the semantic one does not, which seems backwards.
I also don't think :UP and :BACK are meaningful anywhere except immediately
after :RELATIVE, although I have to concede that Unix disagrees with me
and therefore Symbolics Genera's Unix pathname support also disagrees
with me. But if they were only allowed immediately after :RELATIVE I
don't think you would need two of them. I also don't think that
MERGE-PATHNAMES should ever look at what files/directories actually
exist in the file system, which makes me opposed to the existence
of the one that you have called syntactic. Is this really something
we need, or will TRUENAME do the job?
I think we should only have :UP and not :BACK.
Current Practice:
- Genera represents Unix ".." as :UP, but deals with :UP
syntactically, not semantically.
After you straighten out the definition of syntactic and semantic, check
whether this statement is true. (I'm always irked by proposals where
the current practice section is wrong, because someone wrote an original
proposal where the current practice section was right, then the group
changed the proposal all around but didn't update the current practice
section).
Discussion:
This issue used to address the issue of relative pathnames (pathnames
relative to some default which is separately maintained). Pitman removed
this issue for now in order to simplify things. He feels the issue should
be resubmitted under separate cover so that it can be discussed separately.
It seems to me that this fully addresses relative pathnames already. The
discussion of :UP and :BACK (if freed of typos) seems specific enough that
even though this proposal doesn't explicitly propose what MERGE-PATHNAMES
does with relative pathnames, I think there is only one thing that it could
do that would be consistent. A pathname with a :RELATIVE directory is
not very different from a pathname with a NIL directory; in either case
you have to merge with a default to find the real directory to use; thus
I don't think there are any other new issues with relative pathnames.
----- Next Message -----
Date: Thu, 29 Dec 88 14:54 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)
To: Moon@STONY-BROOK.SCRC.Symbolics.COM
cc: KMP@STONY-BROOK.SCRC.Symbolics.COM, CL-Cleanup@SAIL.STANFORD.EDU
In-Reply-To: <19881229182903.0.MOON@EUPHRATES.SCRC.Symbolics.COM>
Date: Thu, 29 Dec 88 13:29 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
... don't you need functions like Symbolics' DIRECTORY-PATHNAME-AS-FILE and
PATHNAME-AS-DIRECTORY? The conversion between the name of a directory
and the directory component of a file inferior to that directory is
system-dependent, for example TOPS-20 appends a type field and Unix does
not. Also in some systems the root directory has a name and in others
it doesn't. Of course these functions signal an error in
non-hierarchical file systems. Should there be a separate proposal for
these? ...
I guess it's enough related to this topic that it could piggy back.
Certainly these functions made no sense prior to this proposal and
now they are suddenly important, so I'll see about putting them in on
next pass.
... Is :OUT a typo for :BACK? ...
Yeah. Sloppy editing. Sorry.
Also I don't understand what your proposed semantic/syntactic
distinction ...
Semantic means you have to probe the file system. Syntactic means
looking at the file names themselves is enough. Maybe I screwed up
the presentation. I'll double-check.
Genera is syntactic in that it doesn't probe when processing :UP.
That means that MERGE-PATHNAMES on
(:ABSOLUTE "X" "Y" "Z")
and
(:RELATIVE :UP "Q")
returns
(:ABSOLUTE "X" "Y" "Q")
rather than
(:ABSOLUTE "X" "Y" "Z" :UP "Q")
If you were going to contract out the :UP, you'd need to probe the
file system to make sure
(:ABSOLUTE "A" "B" "Q")
wasn't more correct than
(:ABSOLUTE "X" "Y" "Q")
so I think Genera has a bug.
I think this because if I do:
cd /m/n/o
cd ../q
on Unix I get one place, but in Genera if I visit a file in the
editor named
/m/n/o/x.lisp
and then I type c-X c-F and when prompted for a filename I type
../q/x.lisp
I end up with
/m/n/o/q/x.lisp
rather than the x.lisp in the directory that Unix itself would have
plopped me in if I'd used the cd commands above.
If you disagree, please reply to me privately so we can avoid
further confusion, quickly iron it out offline, and then report a
joint answer (and rationale) to everyone else once we've achieved
consensus.
I also don't think :UP and :BACK are meaningful anywhere except immediately
after :RELATIVE, although I have to concede that Unix disagrees with me
and therefore Symbolics Genera's Unix pathname support also disagrees
with me.
When I previously raised this topic, there were a pile of messages on
exactly this subject. My putting these into the proposal was an attempt
to address those topics. Even if I took these out, the current proposal
offers the flexibility that implementations could offer them without being
in violation of the spec. On the other hand, if people -are- going to offer
them, we might as well agree on common names if it's possible to do so.
But if they were only allowed immediately after :RELATIVE I
don't think you would need two of them.
I claim my example above refutes this.
I also don't think that
MERGE-PATHNAMES should ever look at what files/directories actually
exist in the file system,
If you permit :UP in absolute pathnames, you don't have to look in the
file system. You just get some funny names. Most file systems that
have this feature permit such funny names, though. eg, /foo/../bar/x
is valid in Unix, I understand. If memory serves me, Multics permits
>foo>bar>baz<x>y in Multics, no? Our (Symbolics) pathname system doesn't
permit embedded "<", but the error message suggests that this is only
because there's no obvious interpretation. We could define it to mean
:BACK rather than :UP, so that syntactic merges could be done and
unique pathnames would always be generated.
which makes me opposed to the existence
of the one that you have called syntactic.
In any case, I'm sympathetic to your desire to not look in the file
system. I just think that if a file system designer has made a
decision that forces you to look in the file system to get the right
answer, I don't know what we the CL designers can do to alter that.
The same issue comes up for logical devices and as Sandra has reminded
us, we've not done a particularly good job of papering over that real
externally-induced issue either.
Is this really something we need, or will TRUENAME do the job?
TRUENAME and OPEN would, presumably, not return pathnames with :UP
references (except maybe in pathological situations which they tell
me you can make in Unix if you try hard enough where the only way to
get to a directory is to go down first and then dig upward.)
I think we should only have :UP and not :BACK.
Nothing forces a file system to have both. The question is whether
there are some file systems that have one set of semantics and others
which have the other. If so, then two tokens are needed. In the discussion
leading up to this, people asserted (and I took them at their word) that
there were such competing semantics.
Current Practice:
- Genera represents Unix ".." as :UP, but deals with :UP
syntactically, not semantically.
After you straighten out the definition of syntactic and semantic, check
whether this statement is true. (I'm always irked by proposals where
the current practice section is wrong, because someone wrote an original
proposal where the current practice section was right, then the group
changed the proposal all around but didn't update the current practice
section).
I'll double-check when I've made the next version.
Discussion:
This issue used to address the issue of relative pathnames (pathnames
relative to some default which is separately maintained). Pitman removed
this issue for now in order to simplify things. He feels the issue should
be resubmitted under separate cover so that it can be discussed separately.
It seems to me that this fully addresses relative pathnames already. The
discussion of :UP and :BACK (if freed of typos) seems specific enough that
even though this proposal doesn't explicitly propose what MERGE-PATHNAMES
does with relative pathnames, I think there is only one thing that it could
do that would be consistent. A pathname with a :RELATIVE directory is
not very different from a pathname with a NIL directory; in either case
you have to merge with a default to find the real directory to use; thus
I don't think there are any other new issues with relative pathnames.
Well, there was the whole treatment of :UP. I hadn't really meant for this
proposal to specify that treatment so much as to identify a common representation
so we'd be a little less divergent in the areas we hadn't really specified.
Does anyone think I should add a disclaimer in the proposal body similar to the
one for :OLDEST, :INSTALLED, etc in pathname versions in CLtL that says that
although these are semi-standard names, there is no attached semantics?
----- Next Message -----
Date: Thu, 29 Dec 88 16:05 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)
To: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
cc: CL-Cleanup@SAIL.STANFORD.EDU
In-Reply-To: <881229145413.6.KMP@BOBOLINK.SCRC.Symbolics.COM>
Date: Thu, 29 Dec 88 14:54 EST
From: Kent M Pitman <KMP@STONY-BROOK.SCRC.Symbolics.COM>
Date: Thu, 29 Dec 88 13:29 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Also I don't understand what your proposed semantic/syntactic
distinction ...
Semantic means you have to probe the file system. Syntactic means
looking at the file names themselves is enough. Maybe I screwed up
the presentation. I'll double-check.
OK, I understand, and you did screw up the presentation.
But if they were only allowed immediately after :RELATIVE I
don't think you would need two of them.
I claim my example above refutes this.
Agreed.
I also don't think that
MERGE-PATHNAMES should ever look at what files/directories actually
exist in the file system,
If you permit :UP in absolute pathnames, you don't have to look in the
file system. You just get some funny names. Most file systems that
have this feature permit such funny names, though. eg, /foo/../bar/x
is valid in Unix, I understand. If memory serves me, Multics permits
>foo>bar>baz<x>y in Multics, no?
No. Although that's from memory: there are few Multices left, I don't
have an account of any of them, and my manuals are in the attic at home.
I think you got syntactic and semantic mixed up again. I think you're
saying that MERGE-PATHNAMES of a relative directory against an absolute
directory will remove :UP, since that's purely syntactic, but will leave
:BACK in the middle of the merged directory, since resolving :BACK
"semantically" would require accessing the file system, which
MERGE-PATHNAMES doesn't do. Okay.
These names are extremely confusing, obviously. Can anyone think
of better ones than :UP and :BACK?
Our (Symbolics) pathname system doesn't
permit embedded "<", but the error message suggests that this is only
because there's no obvious interpretation.
The error message I get doesn't suggest anything, it's just "Embedded <?".
We could define it to mean
:BACK rather than :UP, so that syntactic merges could be done and
unique pathnames would always be generated.
I don't understand this sentence. Say it again after we all agree on
which one is :UP and which one is :BACK.
which makes me opposed to the existence
of the one that you have called syntactic.
In any case, I'm sympathetic to your desire to not look in the file
system. I just think that if a file system designer has made a
decision that forces you to look in the file system to get the right
answer, I don't know what we the CL designers can do to alter that.
The same issue comes up for logical devices and as Sandra has reminded
us, we've not done a particularly good job of papering over that real
externally-induced issue either.
Is this really something we need, or will TRUENAME do the job?
TRUENAME and OPEN would, presumably, not return pathnames with :UP
references (except maybe in pathological situations which they tell
me you can make in Unix if you try hard enough where the only way to
get to a directory is to go down first and then dig upward.)
I think we should only have :UP and not :BACK.
Nothing forces a file system to have both. The question is whether
there are some file systems that have one set of semantics and others
which have the other. If so, then two tokens are needed. In the discussion
leading up to this, people asserted (and I took them at their word) that
there were such competing semantics.
I don't know enough about the weird file systems out there to dispute this.
Well, there was the whole treatment of :UP. I hadn't really meant for this
proposal to specify that treatment so much as to identify a common representation
so we'd be a little less divergent in the areas we hadn't really specified.
Does anyone think I should add a disclaimer in the proposal body similar to the
one for :OLDEST, :INSTALLED, etc in pathname versions in CLtL that says that
although these are semi-standard names, there is no attached semantics?
Yes, I think :UP and :OLDEST have the same status, but no I don't agree
that that status is that they have no semantics. I agree with what CLtL
page 412 actually says, which is that either an implementation doesn't
support these keywords or if it does support them they have prescribed
meanings.
----- Next Message -----
Date: Thu, 29 Dec 88 14:48:23 PST
From: Jim McDonald <jlm@lucid.com>
To: Moon@STONY-BROOK.SCRC.Symbolics.COM
Cc: KMP@STONY-BROOK.SCRC.Symbolics.COM, CL-Cleanup@SAIL.STANFORD.EDU
In-Reply-To: David A. Moon's message of Thu, 29 Dec 88 13:29 EST
Subject: Issue: PATHNAME-SUBDIRECTORY-LIST (Version 3)
>> Is this really something we need, or will TRUENAME do the job?
If you're trying to create a filename to be used for output, it might
not exist yet (hence TRUENAME would signal an error), but there might
be various funny links in its directory path you would like to
traverse. Presumably you could use PROBE-FILE on some part of the
name (perhaps recursively down through the super-directories), then
merge in the remaining part, but that seems enough error-prone to be
worth hiding.
BTW, I think the labelling of semantic/syntactix examples was reversed
in the proposal, independantly of :UP vs. :BACK.
jlm
----- End Forwarded Messages -----