
Re: character objects & representation issues



    Date: 15 August 1980 16:13-EDT
    From: Alan Bawden <ALAN at MIT-MC>
    To:   RWK
    cc:   LISP-FORUM
    Re:   character objects & representation issues

        Date: 14 August 1980 23:18-EDT
        From: Robert W. Kerns <RWK at MIT-MC>

            Transportability aside, there are good CODING STYLE arguments
        for character objects.  It is damned useful to know by inspection
        of the code that the 'fixnum' being handled is in reality A
        REPRESENTATION OF A CHARACTER!

    Indeed, one of the reasons for "#/" is to flag those fixnums that are
    being used as representations of characters.  This is a "good CODING
    STYLE argument" for "#/", not for character objects.

OK, so I was being a little fuzzy.  You KNOW this argues for CHAR-= and CHAR-<,
rather than for the desirability of characters being represented as a
character object.  It was just a lead-in to the actual argument for characters,
which you've kindly reproduced below.

            Even better, if your program is writing a file to be read on
        another machine, it is nice if PRINT knows they are characters!  A
        character is not JUST a fixnum.  A fixnum can REPRESENT a
        character, but it does not have the IDENTITY of a character.

    I frequently use lists to REPRESENT things.  The lists usually don't
    have the IDENTITY of the things they represent (I don't know exactly
    what this means anyway), and they print as lists too, but this isn't
    an inconvenience.  And there is this bonus that I can use these
    functions named CAR and CDR and RPLACA to examine and modify them!
    Amazing isn't it, without adding a single new data-type or function to
    the language I can talk about new things!

    Of course it is true that PRINTing something from MacLisp and then
    READing it into some other machine (even a LispMachine) can produce
    problems with your character set.  But this is an amazingly rare
    occurrence, and involves other problems as well...  (are #\lambda,
    #\backspace and #/H all going to be different (non-EQ) character
    objects?  On the LispMachine they are all different, and the last one
    isn't really a full-fledged character.)

The bit about identity was brought up by others who flamed that a character and
a fixnum were identical.  I assert that the CONCEPTS *ARE*  >>DIFFERENT<<  ...
(otherwise, we'd just have one word, right?).  If you don't understand what I
mean by "identity", it's because it's too obvious.

You may think that transporting code is an "amazingly rare occurrence".  I
somehow doubt I would be amazed.  You don't have Fateman hassling you for
magtapes.  You may thrill to the idea of using indistinguishable lists to
represent n different things, but I'd much prefer to be able to tell my objects
apart.  I like to talk about pieces of my objects by name, it's much more
friendly.  I use one function to modify them:  SETF.  I use lists to build
lists of things.  I think it's a nice, simple, consistent world view, no?

But anyway, for DLW's sake, I think it is clear that being character-set
independent and readable in your code, via #/x and things like CHAR-= is
far more important than having character objects.  And even if you don't like
your code to be readable (sorry), and your code isn't transportable anyway, it
can't hurt you for those functions to exist for those who DO care about such
things.
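
To make the readability point concrete, here is a minimal sketch, written in
Python rather than MacLisp so that it can be run as-is.  The names char_eq and
char_lt are hypothetical stand-ins for CHAR-= and CHAR-<, and ord("(") plays
the role that #/( would play in Lisp; none of these names are taken from an
existing implementation.

```python
def char_eq(a: int, b: int) -> bool:
    """Hypothetical stand-in for CHAR-=: do two codes denote the same character?"""
    return a == b


def char_lt(a: int, b: int) -> bool:
    """Hypothetical stand-in for CHAR-<: does character a sort before character b?"""
    return a < b


def is_open_paren_opaque(c: int) -> bool:
    # Works, but the reader must know that 40 happens to be "(" in ASCII.
    return c == 40


def is_open_paren_readable(c: int) -> bool:
    # Same test; both the literal and the comparison say "this is a character".
    return char_eq(c, ord("("))
```

The two predicates agree wherever "(" has code 40.  Only the second one says
what it means, and only the second would keep its meaning if the code for "("
were something else.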

I think DLW got the wrong impression about why I want character objects:  I
don't care much about runtime type-checking.  I DO care about being able to
tell what's going on while debugging.  I find having objects print out in a way
that relates to their meaning very helpful.  I also brought up the point that
they provide an additional type of transportability:  Files written by PRINT.
These are definitely two independent points, and I may have confused some by
bringing them up together.

DLW pointed out that character functions do not require you to rewrite your
code.  I would like to extend that point to character OBJECTS.  If you have
code that uses TYI, =, <, etc., you'll just deal with characters as fixnums.
Now suppose I write (DEFSTRUCT CHARACTER (CODE)) (or whatever the syntax is),
define functions INCH to return them and OUCH to output them, and make things
which take fixnums for character purposes (like string-manipulators, TYO,
etc.) also work on characters.  In short, if I make things which take fixnums
to represent characters ALSO take my 'CHARACTER's, what breaks?  Nothing.  Now
there are
those who would replace fixnums as characters completely, making TYI return a
character object.  (Actually, if we wanted to, we could make most reasonable
uses of character objects replace fixnums, such as vector/array references,
arithmetic (as long as you add fixnums to a single character, not two
characters, which is nonsense anyway)).  But I take a more middle ground (yes,
I know I'm supposed to be a radical but I'm getting old) and only suggest
AUGMENTING the language, not REDOING it.
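
The augmenting approach can be sketched roughly as follows.  This is Python,
not MacLisp, and only an illustration under stated assumptions: the Character
class stands in for (DEFSTRUCT CHARACTER (CODE)), the functions tyi, tyo, inch
and ouch only mimic the roles described above (they are not the real MacLisp
functions), char_code is a hypothetical helper, and the "#/x" printed form is
an assumed choice.

```python
from dataclasses import dataclass
from typing import TextIO, Union


@dataclass(frozen=True, repr=False)
class Character:
    """Stand-in for (DEFSTRUCT CHARACTER (CODE)): a code that knows it is a character."""
    code: int

    def __repr__(self) -> str:
        # Print in a way that relates to the meaning (assumed "#/x" notation).
        return f"#/{chr(self.code)}"

    def __add__(self, offset: int) -> "Character":
        # Character + fixnum makes sense; Character + Character does not.
        if not isinstance(offset, int):
            return NotImplemented
        return Character(self.code + offset)


CharLike = Union[int, Character]


def char_code(c: CharLike) -> int:
    """Hypothetical helper: accept either representation, return the fixnum code."""
    return c.code if isinstance(c, Character) else c


def tyi(stream: TextIO) -> int:
    """Old-style input: still returns a plain fixnum, so existing callers are unaffected."""
    ch = stream.read(1)
    if not ch:
        raise EOFError("end of stream")
    return ord(ch)


def inch(stream: TextIO) -> Character:
    """New-style input: returns a Character object."""
    return Character(tyi(stream))


def tyo(c: CharLike, stream: TextIO) -> None:
    """Output that used to take only fixnums now ALSO takes Character objects."""
    stream.write(chr(char_code(c)))


def ouch(c: Character, stream: TextIO) -> None:
    """New-style output for Character objects; shares tyo's implementation."""
    tyo(c, stream)
```

Code that only ever calls tyi and tyo with fixnums behaves exactly as before.
Code that wants character objects calls inch and ouch, and the two styles can
be mixed on the same stream.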

I would like to repeat a point repeated by GLS (I won't recurse further but GLS
originated it ultimately):  It is more important that there be a character
STANDARD than there be character OBJECTS.  They are independent issues which
happen to overlap in the problems they attack.  I argue for both of them.