
newline et al



    From: James R. Meehan <Meehan at YALE-RES>
    ... what's #\NEWLINE on a "carriage return + line feed" machine?

The general idea is that T's I/O system should try, within reason, to
provide an operating-system-independent file model, one in which a
file consists of a sequence of "lines" or "records", each of which is a
sequence of characters.  Doing a READC at the end of a record would
cause READC to return #\NEWLINE, which is not necessarily a character in
the actual machine's character set and is conceptually not a character
at all.  (Making it one does make it possible to put it in a string, a
somewhat dubious feature.)  This means folding return+linefeed into a
single newline token (character).
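
As a rough illustration of that folding, here is a minimal sketch in
Python rather than T.  The names NEWLINE and read_chars are hypothetical
and are not part of T; the sketch assumes a machine whose records end in
return+linefeed, and it passes a lone return or linefeed through
unchanged.

```python
# Hypothetical sketch, not T's actual I/O code.
NEWLINE = object()  # stand-in for #\NEWLINE; deliberately not a real character


def read_chars(raw):
    """Yield the characters of raw, folding each return+linefeed pair into NEWLINE."""
    i = 0
    while i < len(raw):
        # A return immediately followed by a linefeed marks the end of a record.
        if raw[i] == "\r" and raw[i + 1 : i + 2] == "\n":
            yield NEWLINE
            i += 2
        else:
            yield raw[i]
            i += 1


chars = list(read_chars("ab\r\ncd\r\n"))
```

The caller only ever sees one token per end of record, so nothing
downstream needs to know that two characters were stored in the file.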

Consider the following scenario:

;; READC from a file until EOF is found, and return a list of the
;; characters read.  Thanks to C. R. for inspiring this routine's name.
(DEFINE (SUCK STREAM L)
  (LET ((C (READC STREAM)))
    (COND ((EOF? C) (REVERSE L))
          (T (SUCK STREAM (CONS C L))))))

On ASCII machine A, someone opens a file FOO for input, opens a file BAR
for output, and does

(PRINT (SUCK FOO-STREAM '()) BAR-STREAM).

They use FTP to send the file BAR over to EBCDIC machine B, run T,
open BAR, and do

(READ BAR-STREAM)

to get the list of characters previously PRINTed.  Well, this should
yield the same list as if FOO were FTP'ed instead of BAR, and the call
to SUCK done on machine B.  Line separation may be implemented very
differently on the two machines, but programs should see the same
thing when they have that list of characters.

This of course implies certain well-formedness restrictions on files,
and assumes that FTP has a file model similar to T's (which I believe to
be the case), does character set and newline translation, etc.  There is
of course nothing special about FTP; I only mention it to try to get a
handle on the ineffable.  This presentation is pretty rough but it should
suggest the kind of invariant I'm after.
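
The invariant can be sketched roughly as follows, in Python rather than
T.  Everything here is an assumption made for illustration: the names
suck_crlf and suck_records, the string token standing in for #\NEWLINE,
and the two file layouts (a return+linefeed-terminated stream on machine
A, a record-oriented file with no terminator characters on machine B).
Character set translation is left out entirely.

```python
# Hypothetical sketch: two different on-disk line representations,
# one program-visible sequence of characters.
NEWLINE = "<newline>"  # stand-in token for #\NEWLINE


def suck_crlf(data):
    """Machine A (assumed): a stream in which every line ends in return+linefeed."""
    out = []
    # Well-formedness assumption: the last line is terminated too, so the
    # final piece produced by split() is empty and is dropped.
    for line in data.split("\r\n")[:-1]:
        out.extend(line)
        out.append(NEWLINE)
    return out


def suck_records(records):
    """Machine B (assumed): a list of records with no terminator characters."""
    out = []
    for record in records:
        out.extend(record)
        out.append(NEWLINE)
    return out


file_on_a = "(foo\r\n bar)\r\n"
file_on_b = ["(foo", " bar)"]
same = suck_crlf(file_on_a) == suck_records(file_on_b)
```

If the two readers agree for every well-formed file, a program that
works from the list of characters cannot tell which machine it is
running on.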

I don't mean to annoy you by saying things you already know; just felt
like writing all this down.

Maybe your question was simpler, and you were just wondering which of

(EQ? #\NEWLINE #\RETURN)    and
(EQ? #\NEWLINE #\LINEFEED)

was the case; this is left undefined, and in fact the very existence of
#\RETURN and #\LINEFEED is shadowy (you probably won't find them on the
370).

Well, actually I don't know whether that can be left undefined, because
CHAR->ASCII needs to be well-defined for all characters (I think).