# Re: character objects & representation issues

• To: ALAN at MIT-MC
• Subject: Re: character objects & representation issues
• From: Robert W. Kerns <RWK at MIT-MC>
• Date: Sat, 16 Aug 80 18:42:00 EDT
• Cc: LISP-FORUM at MIT-MC

    Date: 15 August 1980 16:13-EDT
From: Alan Bawden <ALAN at MIT-MC>
To:   RWK
cc:   LISP-FORUM
Re:   character objects & representation issues

Date: 14 August 1980 23:18-EDT
From: Robert W. Kerns <RWK at MIT-MC>

Transportability aside, there are good CODING STYLE arguments
for character objects.  It is damned useful to know by inspection
of the code that the 'fixnum' being handled is in reality A
REPRESENTATION OF A CHARACTER!

Indeed, one of the reasons for "#/" is to flag those fixnums that are
being used as representations of characters.  This is a "good CODING
STYLE argument" for "#/", not for character objects.
OK, so I was being a little fuzzy.  You KNOW this argues for CHAR-= and CHAR-<,
rather than for the desirability of characters being represented as
character objects.  It was just a lead-in to the actual argument for characters,
which you've kindly reproduced below.
Even better, if your program is writing a file to be read on
another machine, it is nice if PRINT knows they are characters!  A
character is not JUST a fixnum.  A fixnum can REPRESENT a
character, but it does not have the IDENTITY of a character.

I frequently use lists to REPRESENT things.  The lists usually don't
have the IDENTITY of the things they represent (I don't know exactly
what this means anyway), and they print as lists too, but this isn't
an inconvenience.  And there is this bonus that I can use these
functions named CAR and CDR and RPLACA to examine and modify them!
Amazing isn't it, without adding a single new data-type or function to
the language I can talk about new things!

Of course it is true that PRINTing something from MacLisp and then
READing it into some other machine (even a LispMachine) can produce
problems with your character set.  But this is an amazingly rare
occurrence, and involves other problems as well...  (are #\lambda,
#\backspace and #/H all going to be different (non-EQ) character
objects?  On the LispMachine they are all different, and the last one
isn't really a full-fledged character.)

The bit about identity was brought up by others who flamed that a character and
a fixnum were identical.  I assert that the CONCEPTS *ARE*  >>DIFFERENT<<  ...
(otherwise, we'd just have one word, right?).  If you don't understand what I
mean by "identity", it's because it's too obvious.

You may think that transporting code is an "amazingly rare occurrence".  I
somehow doubt I would be amazed.  You don't have Fateman hassling you for
magtapes.  You may thrill to the idea of using indistinguishable lists to
represent n different things, but I'd much prefer to be able to tell my objects
apart.  I like to talk about pieces of my objects by name, it's much more
friendly.  I use one function to modify them:  SETF.  I use lists to build
lists of things.  I think it's a nice, simple, consistent world view, no?

But anyway, for DLW's sake, I think it is clear that being character-set
independent and readable in your code, via #/x and things like CHAR-= is
far more important than having character objects.  And even if you don't like
them, it can't hurt you for those functions to exist for those who DO care about such
things.

I think DLW got the wrong impression about why I want character objects:  I
don't care much about runtime type-checking.  I DO care about being able to
tell what's going on while debugging.  I find having objects print out in a way
that relates to their meaning very helpful.  I also brought up the point that
they provide an additional type of transportability:  Files written by PRINT.
These are definitely two independent points, and I may have confused some by
bringing them up together.

I would like to extend DLW's point about character functions not requiring you
to rewrite your code to use character OBJECTS.  If you have code that uses TYI, =,
<, etc., you'll just deal with characters as fixnums.  Now suppose I write
(DEFSTRUCT CHARACTER (CODE)) (or whatever the syntax is), define functions
INCH to return them and OUCH to output them, and make things which take
fixnums for character purposes (like string-manipulators, TYO, etc.) also work
on characters.  In short, if I make things which take fixnums to represent
characters ALSO take my 'CHARACTER's, what breaks?  Nothing.  Now there are
those who would replace fixnums as characters completely, making TYI return a
character object.  (Actually, if we wanted to, we could make most reasonable
uses of character objects replace fixnums, such as vector/array references,
arithmetic (as long as you add fixnums to a single character, not two
characters, which is nonsense anyway)).  But I take a more middle ground (yes,
I know I'm supposed to be a radical but I'm getting old) and only suggest
AUGMENTING the language, not REDOING it.
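
The augmenting approach described above can be sketched roughly as follows.
This is an illustration only, written in Common Lisp rather than the MacLisp
under discussion.  The names CHAR-OBJ, CHAR-CODE-OF, CHAR-OBJ-=, CHAR-OBJ-<
and CHAR-OBJ-+ are hypothetical, as is the printed form; CHAR-OBJ stands in
for the CHARACTER structure because CHARACTER already names a built-in type
in Common Lisp.  INCH and OUCH are the names proposed in the message, but
their argument lists here are assumptions.

```lisp
;; Hypothetical sketch in Common Lisp, not 1980 MacLisp.
;; A character object is a structure wrapping a fixnum code.
(defstruct (char-obj (:constructor make-char-obj (code)))
  code)

;; Print character objects so they are recognizable while debugging.
;; The printed form is an assumption made for this sketch.
(defmethod print-object ((obj char-obj) stream)
  (format stream "#<CHAR-OBJ ~D>" (char-obj-code obj)))

(defun char-code-of (x)
  "Return the integer code of X, which may be a fixnum or a CHAR-OBJ."
  (etypecase x
    (integer x)
    (char-obj (char-obj-code x))))

;; Comparisons accept either representation, so existing code that
;; passes fixnums keeps working.
(defun char-obj-= (a b)
  (= (char-code-of a) (char-code-of b)))

(defun char-obj-< (a b)
  (< (char-code-of a) (char-code-of b)))

(defun char-obj-+ (c n)
  "Offset the character object C by the integer N.
Adding two character objects signals an error."
  (check-type c char-obj)
  (check-type n integer)
  (make-char-obj (+ (char-obj-code c) n)))

(defun inch (&optional (stream *standard-input*))
  "Read one character from STREAM and return it as a CHAR-OBJ."
  (make-char-obj (char-code (read-char stream))))

(defun ouch (x &optional (stream *standard-output*))
  "Write X, a fixnum or a CHAR-OBJ, to STREAM as a character.  Return X."
  (write-char (code-char (char-code-of x)) stream)
  x)
```

Code that keeps using fixnums is untouched, since every function above still
accepts them; only callers that ask for INCH ever see a character object.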

I would like to repeat a point repeated by GLS (I won't recurse further but GLS
originated it ultimately):  It is more important that there be a character
STANDARD than there be character OBJECTS.  They are independent issues which
happen to overlap in the problems they attack.  I argue for both of them.