Issue: PATHNAME-LOGICAL (version 2)

To: Richard Mlynarik <Mly@AI.AI.MIT.EDU>, Sandra J Loosemore <sandra%defun@cs.utah.edu>, Gail Zacharias <gz@spt.entity.com>, masinter.pa@Xerox.COM, David N Gray <Gray@DSG.csc.ti.com>
Subject: Issue: PATHNAME-LOGICAL (version 2)
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Date: Mon, 12 Jun 89 19:02 EDT
Cc: CL-Cleanup@sail.stanford.edu
In-reply-to: <19890526160304.1.MLY@ANNA-MAGDALENA-BACH.AI.MIT.EDU>, <8905251939.AA09185@defun.utah.edu>, <CMM.0.88.612204228.gz@spt.entity.com>, <890530-090736-7241@Xerox>, <2822163312-14997875@Kelvin>, <2822166522-15190737@Kelvin>
Line-fold: No
I'm having trouble finding the time to write up revised versions of
these issues, based on your comments, but here is an acknowledgement of
your comments (including the ones not copied in this reply!) and some
responses to some of the comments.

    Date: Thu, 25 May 89 12:38 EDT
    From: Richard Mlynarik <Mly@AI.AI.MIT.EDU>

    1 I believe that mixing in the issue of parsing out host components from
      pathname namestrings is a real mistake....
      To my thinking, a much better idea would be to define a new function, say,
      (LOGICAL-PATHNAME <host-name> <namestring>) which would return a logical
      pathname.

See comments below after GZ's message.  I don't think I can write up a
revised version of the proposal until this is resolved; so far I'm still
mulling it over in my own mind and don't have a firm opinion yet.

    2 My second problem is with the syntax of the pathnames.

I'll mull over these comments too, although I'd like to point out that
I don't think logical pathnames should be asked to represent all features
of all (or even most) real file systems.

    Date: Thu, 25 May 89 13:39:52 MDT
    From: sandra%defun@cs.utah.edu (Sandra J Loosemore)

    I don't understand why there's any need for magic numbers (12 and 6)
    and restrictions on what characters may appear in a logical pathname
    component.  The rationale says this is an arbitrary decision, but
    doesn't address the question of why restrictions are necessary at all. 

I think these length limits were a mistake, and I propose to remove them
in version 3 of the proposal.  It is always up to the person who writes
the translation rules for a particular logical pathname host to a
particular physical file system to make sure that the pathnames that are
going to be used translate to valid pathnames for the particular file
system.  The length limits were supposed to make that easier to do, but
they didn't.  I'll put in an example showing how to do this for your
favorite (!) file system, the Cray with 6 character names and no
directories, types, or versions.

    The proposal mentions that not all filesystems support versions and
    that versions in logical pathnames can't be used portably.  More
    generally, not all filesystems have notions of hosts, devices,
    directories, and file types either.  (In other words, about the only
    thing you can depend on a filename always having is a name.) Why treat
    versions as a special case, but ignore all the other problems?

Because the other fields can be addressed by translation, but it doesn't
really make sense to do that for versions.  Actually, this is not an
absolute; versions could be done by translation.  However, the typical
use of versions is such that on a file system without versions, people
would rather just store one version, and not specify translations that
will preserve the version information by encoding it in the name.  This
is different from the typical use of types or directories, where the
files with different values in those components are truly distinct and
everything would break if you only kept one of them.

At least, that's what I thought.  Perhaps I was wrong and we would
rather mandate that versions always work in logical pathnames,
translating into whatever is necessary to preserve the information.
Another possibility that might bear thinking about is to say that
logical pathnames never have versions, but I know that some development
systems make very effective use of file version numbers, so I'm reluctant
to just rule them out entirely.

    How are functions like OPEN supposed to map a logical pathname onto a real
    pathname?  (Does it do it in the same way as TRANSLATE-LOGICAL-PATHNAME or 
    can it use some other mapping?)

They don't have to actually call the function TRANSLATE-LOGICAL-PATHNAME, but
they have to produce the same result.  This should have been explicit in the
proposal.

    The dependence on issue PATHNAME-WILD (the functions PATHNAME-MATCH-P
    and TRANSLATE-PATHNAME that are referenced in the description of
    TRANSLATE-LOGICAL-PATHNAME) ought to be made more explicit.  What
    happens if PATHNAME-WILD fails?

I thought it said somewhere that PATHNAME-LOGICAL cannot pass if
PATHNAME-WILD fails.

    LOAD-LOGICAL-PATHNAME-TRANSLATIONS sounds suspiciously like REQUIRE --
    I'm sure that's a bad sign... 

It does, but I hope that being much more task-specific, it doesn't really
have the same problems.

    COMPILE-FILE-PATHNAME doesn't seem to have anything to do with the
    rest of this proposal. 

It's another part of what you need to actually use logical pathnames for
storing programs.  Suppose you want to call COMPILE-FILE only if the source
file is newer than the compiled file.  To do that, you have to have a way
to know the name of the compiled file without actually calling COMPILE-FILE.
I could have proposed that logical pathnames always use a specific naming
convention for compiled files, as I did for source files, but for compiled
files it seemed better to let the implementation control the name.  Do
you think a different approach should have been taken, or did you just
want to see this proposal split into parts?
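
As a minimal sketch of that use, assuming COMPILE-FILE-PATHNAME takes the
source pathname and returns the pathname COMPILE-FILE would write to (the
helper names SOURCE-NEWER-P and COMPILE-IF-NEEDED are made up for
illustration and are not part of the proposal):

```lisp
;; Hypothetical helper: true when the compiled file is missing or older
;; than the source.  Dates are universal times as returned by
;; FILE-WRITE-DATE; NIL means the date is unknown or the file is absent,
;; in which case we conservatively ask for a recompile.
(defun source-newer-p (source-date compiled-date)
  (or (null source-date)
      (null compiled-date)
      (> source-date compiled-date)))

;; Hypothetical helper: call COMPILE-FILE only when it is needed.
;; The name of the compiled file is obtained without compiling anything.
(defun compile-if-needed (source)
  (let* ((compiled (compile-file-pathname source))
         (compiled-date (and (probe-file compiled)
                             (file-write-date compiled))))
    (when (source-newer-p (file-write-date source) compiled-date)
      (compile-file source))))
```

The program never has to know the implementation's naming convention for
compiled files; it only compares the two write dates.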

    In general, I don't really see what this proposal buys the user that
    can't already be achieved using other mechanisms that are already part
    of the language.  For example, when I have some files that live in
    different places on different hosts, I usually put a pathname
    containing the appropriate pathname for that place in a variable and
    call MERGE-PATHNAMES to get the full pathnames of the individual
    files.  Logical pathnames don't eliminate the necessity of having
    literal, host-specific pathnames in a program; you still have to
    supply them to DEFINE-LOGICAL-PATHNAME-TRANSLATIONS.

Then you've missed the point, which should have been described more
clearly in the issue writeup.

The call to DEFINE-LOGICAL-PATHNAME-TRANSLATIONS isn't part of the
program, it's separate.  That separation, and a uniform convention
for how to do the separation, are the key aspects of logical pathnames.
I agree that Common Lisp is Turing-machine-equivalent with or without
logical pathnames, or, more seriously, that each user could implement
something like logical pathnames for himself.  The reason to standardize
it is so everyone will do it the same way and so individual users can spend
their time writing applications instead of doing this kind of system
programming.  Note that your way sounds simple, but it doesn't stay
simple when you get into more complicated situations such as
program-generated file names or porting a program developed on a system with
long file names onto a system with a very restrictive limit on the
length of file names.

    Date: Fri, 26 May 1989 12:43:48 EDT
    From: Gail Zacharias <gz@spt.entity.com>

    .... [discussion of various logical namestring syntax ideas]

I don't know what I think yet about whether logical pathnames should
have a namestring syntax that is specified to be distinguishable from
all physical pathname namestrings, or whether there should be a
LOGICAL-PATHNAME function distinct from the PATHNAME function.  I'm
suspicious of the latter idea because I know that in current practice it
is very common to have strings that are sometimes one kind of pathname
and sometimes the other kind; thus it seems desirable to have all the
operations, including parsing, be generic for both kinds of pathnames.
What if you have a string that is just "FOO", i.e. just a name
component?  However, I haven't had time to think out the full
implications of this issue yet.  The suggestion to have two distinct
spaces of namestrings was not something I had anticipated and the
implications require substantial thought.  Current practice in Symbolics,
Explorer, and Coral has one namespace for both logical and physical
namestrings.

    Date: 30 May 89 09:07 PDT
    From: masinter.pa@Xerox.COM

    I think this discussion is leading in a productive direction. Standardizing
    on a funny syntax for namestrings on the grounds that it is  "different
    enough" from the file systems we know about seems like we're going in the
    wrong direction; it presumes that we know about all possible file systems
    to which the Standard might need to be connected.

    If we want to do anything about logical pathnames at all, building Lisp
    constructors for them (either as a new function, MAKE-LOGICAL-PATHNAME, or
    possibly just a new keyword for MAKE-PATHNAME which can be used instead of
    host+device+directory) sounds less likely to lead us into trouble.
 
Maybe, or maybe more likely to lead us into trouble; see above.  I'm still
thinking this one over.

    Date: Tue, 6 Jun 89  17:15:12 CDT
    From: David N Gray <Gray@DSG.csc.ti.com>

    > Well, it seems like this particular choice conflicts with just about everybody
    > (except unix), so maybe it's worth considering alternatives...

    On the Explorer, we use the colon as the host delimiter for all
    pathnames, which includes support for files on Symbolics, VAX-VMS,
    MS-DOS, Multics, and Macintosh as well as Unix, the local Explorer
    files, and logical pathnames.  This has not been seen to be a problem.

Same in Genera.

    True, for MS-DOS and Macintosh pathnames, a host must always be supplied
    since the first colon is taken to be a host delimiter.  Even this hasn't
    been considered to be a problem, but that is probably just because we
    are in the habit of always specifying the host anyway because the
    pathname defaulting facilities in our environment are too unpredictable
    to be of much use.

    I can see, though, that if you are on a non-networked Macintosh, it
    could be annoying to have to specify the host even though there is only
    one host that you are using.

    I wonder, since neither MS-DOS nor Macintosh pathnames use the semicolon
    [the proposed logical pathname directory delimiter], if it would be
    reasonable in such an environment to consider the first colon to be a
    host delimiter only if the namestring contains a semicolon?  Since it
    would be unusual to want to use a namestring consisting of only a host
    name, that could be a useful way of avoiding the ambiguity.

At first blush I like this idea.  Maybe we can work out something along
these lines.  I'd be worried about making the rules too complicated to
understand, though.
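
A minimal sketch of the rule Gray describes, assuming the test is simply
whether a semicolon appears anywhere in the namestring (SPLIT-HOST is a
made-up name, not part of any proposal):

```lisp
;; Hypothetical helper: treat the first colon as a host delimiter only
;; when the namestring also contains a semicolon (the proposed logical
;; pathname directory delimiter).  Returns two values: the host string
;; (or NIL when no host was recognized) and the rest of the namestring.
(defun split-host (namestring)
  (let ((colon (position #\: namestring)))
    (if (and colon (find #\; namestring))
        (values (subseq namestring 0 colon)
                (subseq namestring (1+ colon)))
        (values nil namestring))))
```

Under this rule "SYS:SRC;FOO.LISP" splits into the host "SYS" and the
remainder "SRC;FOO.LISP", while a Macintosh-style string such as
"Disk:Folder:File" comes back whole with no host.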

    Date: Tue, 6 Jun 89  18:08:42 CDT
    From: David N Gray <Gray@DSG.csc.ti.com>

    I would prefer to use "#" as the version prefix.  More than once I've
    been annoyed when using Symbolics pathnames to not be able to specify a
    version without also specifying the type.

My preference is to minimize the use of special characters.  I'm not
sure the syntax matters very much so long as we all agree on it so our
programs can interchange logical pathnames, but I don't want the syntax
to appear too complicated and intimidating.  I don't think the ability
to specify a version by itself in a namestring is needed for the
intended uses of logical pathnames.

    The 12-character limit doesn't seem to have any clear significance.
    Since it is common for older file systems to have an 8-character limit
    for file names, maybe it would be more meaningful to say that names can
    be any length, but only the first 8 characters are guaranteed to be
    significant on all implementations.

The particular numbers came from consideration of the 14-character-max
version of Unix, I believe.

I think these length limits were a mistake, and I propose to remove them in
version 3 of the proposal.  It is always up to the person who writes the
translation rules for a particular logical pathname host to a particular
physical file system to make sure that the pathnames that are going to be
used translate to valid pathnames for the particular file system.

    >   There is no device, so the device component of a logical pathname is
    >   always :UNSPECIFIC.  No other component can be :UNSPECIFIC.

    I presume you mean that the standard doesn't specify any portable
    meaning for an :UNSPECIFIC component, and don't intend to rule out its
    use in generic pathnames as an extension?

What's a generic pathname?

I think I meant what I said, which is that no component of a logical pathname
other than the device can ever be :UNSPECIFIC.  If that's the wrong thing,
okay, but let's hear an argument why it's wrong.  Remember that logical 
pathnames don't have to be able to represent all features of all file systems,
they just have to do what's needed for portable naming of program and data
files within a program.

    >   DEFINE-LOGICAL-PATHNAME-TRANSLATIONS host translations &key   [Function]
    > 
    >     Define a logical pathname host named <host> (a string or a symbol which
    >     is coerced to a string).  <translations> is a list of translations.
    >     Each translation is a list of from-wildcard and to-wildcard.
    >     From-wildcard must be a logical pathname or a string coercible to a
    >     logical pathname. 

    Could we say
      ...  coercible to a logical pathname using "<host>:" as the
      default pathname.
    so that the host doesn't have to be explicitly supplied every time?

That is probably a good idea; let me think it over.

    >     ...  To-wildcard must be a physical pathname or a string
    >     coercible to a physical pathname.

    Is there any reason to not permit this to be another logical pathname?

Yes, it's too complicated to define what it means and not obviously useful
for anything (as far as I can see).

    >     There are no keyword arguments specified by this standard, but any
    >     implementation extensions are provided as keyword arguments or as
    >     translations with more than two elements.

    An extension I would like to have is the ability to specify what syntax
    will be used for parsing the name strings.  If I'm using logical
    pathnames for my own convenience, rather than portability, then I would
    like to be able to use whatever pathname syntax I like the most.

I don't understand; could you be more specific?

    > Current practice:

    The Explorer also has a comparable logical pathname facility, although
    the translation mechanism is unfortunately less general than proposed
    here.  The namestring syntax used is slightly different:

      host ":" [{directory "."}* directory ";"] [name] ["." type] ["#" version]

    The newest version is indicated by ">" instead of "newest".

I'd like to minimize the use of special characters.