Tagged Architectures, a question

To: Bil Lewis <lewis@rocky.stanford.edu>
Subject: Tagged Architectures, a question
From: "Robert W. Kerns" <RWK@scrc-yukon.arpa>
Date: Fri, 16 Oct 87 12:21 EDT
Cc: Slug@r20.utexas.edu
In-reply-to: <8710151816.AA17408@sally.utexas.edu>
Posted-date: Fri, 16 Oct 87 12:21 EDT
Resent-date: Fri 16 Oct 87 13:10:41-CDT
Resent-from: CMP.SLUG@r20.utexas.edu
Resent-message-id: <12342986650.39.CMP.SLUG@R20.UTEXAS.EDU>
Resent-to: SLUG:;

    Date: Thu, 15 Oct 87 11:10:28 PDT
    From: Bil Lewis <lewis@rocky.stanford.edu>
      Some of the special purpose personal machines also have hardware tags:
    the D-machines, the Lispms, etc.(?)  On the Lispms, however, it is the
    POINTERS that are tagged with the type of the object pointed to, not
    the actual data word.
Right, but see my remarks below about CDR codes.

      Taking the Symbolics I machine as an example (32 bit pointers with
    8 more bits for tag + GC)...
No, not for GC.  For CDR coding, to make lists more compact.  See below.

      Now I understand that when I call (CAR FOO), the hardware does a
    data type check (in parallel) of the pointer for FOO.  What happens,
    however, when I call (+ X Y), when X and/or Y are either bignums or
    large FP numbers?  Do I do the checking on the tag bits for the pointers
    in X & Y?  If so, then when I do the indirection to the actual data
    words, am I adding two 32 bits quantities whose tag bits are ignored, or
    am I adding two 40 bit quantities (mod GC)? 
If X or Y is a bignum or a large float, then its type code is
DTP-EXTENDED-NUMBER, and it points to a block of memory consisting
of a header word (which indicates just what kind of number; don't forget
COMPLEX and RATIO!), followed by the necessary data words.  (DTP is
short for "Data TyPe")

The header word is tagged with a DTP-HEADER-I type, and the data words
are DTP-FIX (i.e. fixnums).  These type codes are not ignored; they are
checked.  After all, the check is free, and it helps detect memory
clobbered by software bugs or hardware faults, which makes the low-level
system more reliable by catching such problems early.  They are also necessary
for the GC to know that these are numbers, not pointers to other objects
that need to be copied.
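
A minimal sketch of the layout just described, for illustration only.  It
assumes a 40-bit word split into a 2-bit CDR code, a 6-bit data type and
32 data bits.  Every numeric type code except DTP-FIX = 0 is a made-up
value, the helper names are hypothetical, and only non-negative bignums
are modelled.

```python
# Illustrative tagged word: 2-bit CDR code, 6-bit data type, 32-bit data.
# The field widths are an assumption, as are all type codes except DTP_FIX.
DTP_FIX = 0o00
DTP_HEADER_I = 0o01          # hypothetical value
DTP_EXTENDED_NUMBER = 0o02   # hypothetical value


def make_word(dtp, data, cdr_code=0):
    """Pack a CDR code, a data type and 32 bits of data into one word."""
    return (cdr_code << 38) | (dtp << 32) | (data & 0xFFFFFFFF)


def word_type(word):
    return (word >> 32) & 0x3F


def word_data(word):
    return word & 0xFFFFFFFF


def store_bignum(memory, value):
    """Append a header word plus fixnum-tagged data words; return a tagged pointer."""
    if value < 0:
        raise ValueError("this sketch only models non-negative values")
    address = len(memory)
    digits = []
    while True:
        digits.append(value & 0xFFFFFFFF)
        value >>= 32
        if value == 0:
            break
    memory.append(make_word(DTP_HEADER_I, len(digits)))
    memory.extend(make_word(DTP_FIX, d) for d in digits)
    return make_word(DTP_EXTENDED_NUMBER, address)


def load_bignum(memory, pointer):
    """Follow the pointer, checking the tag of every word that is read."""
    if word_type(pointer) != DTP_EXTENDED_NUMBER:
        raise TypeError("not an extended number")
    address = word_data(pointer)
    header = memory[address]
    if word_type(header) != DTP_HEADER_I:
        raise TypeError("clobbered header word")
    value = 0
    for i in range(word_data(header)):
        word = memory[address + 1 + i]
        if word_type(word) != DTP_FIX:
            raise TypeError("clobbered data word")
        value |= word_data(word) << (32 * i)
    return value
```

In this model a data word overwritten with a pointer-tagged value is
caught the next time the number is read, which is the early detection
the tag checks are meant to provide.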

Earlier MIT (CADR-based) architectures, such as the LM-2, the TI
Explorer, and the LMI Lambda, relax this restriction slightly by teaching
the GC more about what fields of individual structures are pointers and
what are not.  On our 3600 and later machines, you can always tell
whether a memory location is a pointer or not by looking at the type
code.  Even disk buffers and IO registers will have a type code of 00
(i.e. fixnum).  Clearly, we are helped in this by our larger word sizes;
the essential part is that the GC must be able to *parse* the memory
into pointers and non-pointers.
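
To make the "parse" idea concrete, here is a small sketch that assumes a
6-bit type field above 32 data bits.  All type code values other than
DTP-FIX = 0 are invented, as is the helper name pointer_addresses.  The
scan decides pointer versus non-pointer from the type code alone, with
no knowledge of which structure a word belongs to.

```python
DTP_FIX = 0o00                # from the text: type code 00 is a fixnum
DTP_HEADER_I = 0o01           # hypothetical value
DTP_EXTENDED_NUMBER = 0o02    # hypothetical value
DTP_LIST = 0o03               # hypothetical value

# Types whose data field is immediate data rather than an address
# (an assumption of this sketch).
NON_POINTER_TYPES = {DTP_FIX, DTP_HEADER_I}


def make_word(dtp, data):
    return (dtp << 32) | (data & 0xFFFFFFFF)


def word_type(word):
    return (word >> 32) & 0x3F


def pointer_addresses(memory):
    """Return the addresses of the cells a collector would have to trace."""
    return [
        address
        for address, word in enumerate(memory)
        if word_type(word) not in NON_POINTER_TYPES
    ]
```

A disk buffer full of arbitrary bit patterns is safe under this scheme
as long as every word in it carries the fixnum type code.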

So to summarize your addition example: you are manipulating 32-bit
quantities, type checked, under the control of either microcode or a
macrocode routine invoked by the + instruction, depending on just what
hardware you're talking about and what data-types are involved.  The
microcode always starts the operation, and if it's too complex for
microcode, it may call out ("trap out") to a macrocode routine.  Our
macrocode is fast enough relative to our microcode that this approach is
quite efficient.
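
The control flow can be sketched as follows.  This is a toy model, not
the actual microcode: which cases stay in microcode varies by machine,
so the split shown here (fixnum plus fixnum without overflow is the fast
path, everything else traps) is an assumption, as are the names and the
type code chosen for DTP-EXTENDED-NUMBER.

```python
DTP_FIX = 0
DTP_EXTENDED_NUMBER = 2       # hypothetical value
FIXNUM_MIN, FIXNUM_MAX = -(2**31), 2**31 - 1


def box(n):
    """Tag a Python int as a fixnum if it fits in 32 bits, else as an extended number."""
    if FIXNUM_MIN <= n <= FIXNUM_MAX:
        return (DTP_FIX, n)
    return (DTP_EXTENDED_NUMBER, n)


def macrocode_add(x, y):
    """Stand-in for the general routine that handles the complicated cases."""
    return box(x[1] + y[1])


def add_instruction(x, y, trace=None):
    """Start in 'microcode'; trap out to 'macrocode' when the case is too complex."""
    for dtp, _ in (x, y):
        if dtp not in (DTP_FIX, DTP_EXTENDED_NUMBER):
            raise TypeError("argument to + is not a number")
    if x[0] == DTP_FIX and y[0] == DTP_FIX:
        total = x[1] + y[1]
        if FIXNUM_MIN <= total <= FIXNUM_MAX:
            return (DTP_FIX, total)        # fast path, no trap
    if trace is not None:
        trace.append("trap")               # record that the slow path ran
    return macrocode_add(x, y)
```

The optional trace list exists only so that a caller can observe when
the slow path was taken.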

Other MIT-descended Lisp machines are similar, though not identical in
detail; I don't believe any of the CADR descendants ever trap
out to macrocode.

      In other words, is the data itself ever tagged on these machines?

The problem with this question is that what you call "the data" is entirely
a matter of viewpoint.  I would take the quantity consisting of the type code
plus either a pointer or immediate data, and consider that to be the object.  The
object may or may not have any storage associated with it, but you pass
*objects* around.  This contrasts with the view encouraged by other languages,
which distinguish between objects and pointers to objects.

    If not, are the tag bits even used?  

The term "tag bits" is usually used to refer to all the non-pointer fields,
both the data-type and the CDR-code.

    If so, what are they used for?

The CDR code, unlike the data-type, is effectively a tag on a memory cell.
It talks about how that memory-cell is related to its neighbors, structurally.
It says nothing about the object stored in the cell.  For example, CDR-NIL
means that this cell is the last of a list, while CDR-NORMAL says the next
cell holds the object which is the CDR of the list.  The final choice, CDR-NEXT,
indicates that the next cell *IS* the object which is the CDR of the list;
that is, the CDR of the list is the pointer consisting of DTP-LIST and the
address of that next cell.

CDR-codes are read at the same time as the object, but are ignored unless
you're doing something where they are relevant, such as taking the CDR.
When writing (e.g. with RPLACA), the old CDR code is stored back with the new
data.  Thus my assertion that the CDR code is a tag for the memory cell,
not for the data contained in it.
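
A small model of the three CDR codes may help.  It assumes each memory
cell is a (cdr-code, object) pair, that list objects are ("LIST",
address) tuples, and that NIL is Python's None.  These representations
and all function names are illustrative, not the machine's own.

```python
CDR_NEXT, CDR_NIL, CDR_NORMAL = "CDR-NEXT", "CDR-NIL", "CDR-NORMAL"


def car(memory, lst):
    return memory[lst[1]][1]


def cdr(memory, lst):
    address = lst[1]
    code, _ = memory[address]
    if code == CDR_NIL:
        return None                       # this cell is the last of the list
    if code == CDR_NEXT:
        return ("LIST", address + 1)      # the next cell IS the cdr
    return memory[address + 1][1]         # CDR-NORMAL: the next cell holds the cdr


def rplaca(memory, lst, new_object):
    """Replace the object in a cell; the cell's old CDR code is written back."""
    address = lst[1]
    code, _ = memory[address]
    memory[address] = (code, new_object)


def to_python_list(memory, lst):
    out = []
    while lst is not None:
        out.append(car(memory, lst))
        lst = cdr(memory, lst)
    return out


def example_memory():
    return [
        (CDR_NEXT, 1),                # cells 0-2: the compact list (1 2 3)
        (CDR_NEXT, 2),
        (CDR_NIL, 3),
        (CDR_NORMAL, 10),             # cells 3-4: an ordinary two-cell cons
        (CDR_NIL, ("LIST", 0)),       # this cell's own CDR code is never consulted here
    ]
```

In this model the compact list takes one cell per element, the two-cell
cons at address 3 conses 10 onto the front of it, and RPLACA on any cell
leaves the shape of the list untouched because the CDR code stays with
the cell.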

      And while we're at it, am I correct in calling the GC bits ``tag bits''?
    CDR-coding bits are also tag bits?

You're wrong in calling them GC bits.  They're not for the GC, they're for
CAR and CDR and LIST and CONS and friends.  Of course, the GC has to understand
them, just like it has to understand everything else.

(In fact, there is something in our 36xx series machines called "GC
tags", but they're not part of the memory word; they live in a small table
elsewhere.  They're really not relevant to what you're talking about; I
only mention them in case they may be a source of your confusion of terms.)
