
Re: concurrent GC



> One of the ways to implement a concurrent garbage collector is to protect
> pages that have not been scavenged, and have some other thread that fields
> page faults on those protected pages.  If this scavenger thread isn't in
> the process of fielding a page fault, it can scavenge and unprotect pages
> while waiting for the next fault.

That's the wrong way to do it on Mach.  You already have code that
does it with the external pager interface; why would you want to be
side-tracked by the people at Princeton who do not understand anything
but BSD?  If you do not trust me, ask Rick Rashid which way is
faster on Mach, he timed it.  Or look at /../testarossa/usr/af/tmp/appel
for the program Appel and Rick used and time it yourself.

And yes, that scheme only works for ONE user thread: what they
really mean by "scavenger thread" is the Unix signal handler.  Which
says a lot about their ignorance of threads...

Using two tasks is probably possible, but it sounds like a lot of
syscalls per page: vm_read() to get the page and vm_write() to put it
back, plus vm_protect() once on the address range and once on the page
--> 3+ syscalls per page, plus scaffolding code to stop/resume faulting
threads [not just ONE thread, ANY thread that might fault on ANY page],
which implies at least another syscall.
Sharing memory eliminates the vm_read() and vm_write(), but it makes
fork() a total nightmare.  And besides, it slows down page faulting
because of the extra sharing map needed.  Either way, vm_protect() is
slow, and fat chance anyone will speed it up.
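
To make the accounting concrete, here is a minimal sketch of that
per-page path, assuming Mach 2.5-style signatures [scavenge_page_copy
is a made-up GC hook, and the stop/resume of the faulting threads is
left out]:

    #include <mach.h>

    extern void scavenge_page_copy(pointer_t data, unsigned int count);

    /* The scavenger task's per-page path: three syscalls per page,
     * not counting the suspend/resume of whatever threads faulted. */
    void scavenge_one_page(task_t mutator, vm_address_t page)
    {
        pointer_t    data;
        unsigned int count;

        /* 1: pull the page out of the mutator task */
        vm_read(mutator, page, vm_page_size, &data, &count);

        scavenge_page_copy(data, count);    /* fix up our copy */

        /* 2: push the scavenged copy back */
        vm_write(mutator, page, data, count);

        /* 3: restore access so the faulting threads can continue */
        vm_protect(mutator, page, vm_page_size, FALSE,
                   VM_PROT_READ | VM_PROT_WRITE);

        vm_deallocate(task_self(), data, count);
    }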

The external pager code has none of these problems; what you have
will work just fine for any number of threads.  And it will be much,
MUCH faster still when we are done with the optimizations to the 3.0
kernel [RPD has already sped up certain things a lot, but there is
more to come].
The number of syscalls there is strictly one per address range to
prevent access, and one per page to restore access.  There is also the
possibility of buffering the "restore" path, e.g. sending back more
than one clean page at once, but that currently requires fixing a
kernel bug [Dave Black is currently working on it at OSF].
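
For contrast, a rough sketch of that protect/restore path through the
pager; the signatures are 2.5-style and from memory, and scavenge_page
is a made-up GC hook:

    #include <mach.h>

    extern void scavenge_page(vm_offset_t offset, vm_size_t length);

    /* At flip time: ONE lock_request revokes access to the whole
     * unscavenged range, whatever its size. */
    void gc_protect_range(memory_object_control_t control,
                          vm_offset_t offset, vm_size_t size)
    {
        memory_object_lock_request(control, offset, size,
                                   FALSE,        /* should_clean */
                                   FALSE,        /* should_flush */
                                   VM_PROT_ALL,  /* deny all access */
                                   PORT_NULL);
    }

    /* A mutator fault on a locked page shows up as a data_unlock
     * message to the pager: scavenge and unlock, one call per page. */
    kern_return_t memory_object_data_unlock(memory_object_t pager,
                                            memory_object_control_t control,
                                            vm_offset_t offset,
                                            vm_size_t length,
                                            vm_prot_t desired_access)
    {
        scavenge_page(offset, length);
        return memory_object_lock_request(control, offset, length,
                                          FALSE, FALSE,
                                          VM_PROT_NONE, /* allow all */
                                          PORT_NULL);
    }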

Moreover, we are planning to extend the external pager interface
so that the kernel and pager can interact better in handling main
memory: for instance, the pager could be asked "please release all
the garbage you have" before the kernel starts paging out.
This way lisp would adapt itself automagically to the available
amount of main memory.  And then more fancy stuff... but not if you
confine yourself to the Appel ghetto of memory protection schemes.

These extensions are being considered specifically for garbage
collecting languages, and you can influence the way we define
and implement them.  Why pass up such a nice opportunity?

> Is there anyway to control which thread gets interrupted when a signal is
> delivered?

Yes: override the default exception handler and receive the exception
messages yourself.  You can then forward all those you do not care
about and turn them into ordinary Unix signals if you want to.  I
believe Morrisett has worked out these things for SML, but you can
look in /afs/cs/project/mach/src/bin/gdb/mach_os.c to see how GDB does
it for Mach.  Look where it munges with the task's exception port
and catch_exception_raise().
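
The skeleton looks roughly like this, 2.5-style and with the
forwarding part elided [gc_handle_fault is a made-up hook]:

    #include <mach.h>
    #include <mach/exception.h>

    extern boolean_t exc_server(msg_header_t *in,
                                msg_header_t *out);   /* from libmach */
    extern void gc_handle_fault(port_t thread, int code, int subcode);

    static port_t exc_port, old_exc_port;

    /* exc_server() decodes exception messages and calls this. */
    kern_return_t catch_exception_raise(port_t exception_port,
                                        port_t thread, port_t task,
                                        int exception, int code,
                                        int subcode)
    {
        if (exception == EXC_BAD_ACCESS) {
            gc_handle_fault(thread, code, subcode);
            return KERN_SUCCESS;    /* resume the faulting thread */
        }
        /* real code would re-send the message to old_exc_port here */
        return KERN_FAILURE;
    }

    void exception_loop(void)
    {
        struct { msg_header_t hdr; char body[1024]; } req, rep;

        port_allocate(task_self(), &exc_port);
        task_get_exception_port(task_self(), &old_exc_port);
        task_set_exception_port(task_self(), exc_port);

        for (;;) {
            req.hdr.msg_local_port = exc_port;
            req.hdr.msg_size = sizeof req;
            msg_receive(&req.hdr, MSG_OPTION_NONE, 0);
            exc_server(&req.hdr, &rep.hdr);     /* MIG dispatch */
            msg_send(&rep.hdr, MSG_OPTION_NONE, 0);
        }
    }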

>   I assume that things like protection violations and bus errors
> are sent to the thread that caused them, but what about things like
> keyboard interrupts?  I wrote a simple test program, and it looked like it
> was random.

Per-task signals are sent to the current thread if possible, and to
the first thread in the task otherwise.  This is NOT a specification,
it is just the way it happens to be implemented.  There is a standards
committee at work to define how signals and threads are supposed to
get along; ask Mike Jones for the latest news.

But there are a kazillion ways in which Unix and multi-threading do
not get along; signals are but one of them.  That is why I would
strongly suggest you just forget about Unix if you want a parallel
lisp and get down to Mach basics.
OR, be happy with just faked parallelism.

I would be delighted to explain to the whole group how the logic
behind the external pager facility works, and/or discuss the tradeoffs
and why the Princeton people are wrong.  I strongly fear you might
make the wrong design decision only because you have had one way of
doing it explained to you and not the other.

sandro-
[BTW, the paging code you currently have is horribly wrong in using a
 vm_copy() where a simple bcopy() is all that is needed; see the
 sketch below.]
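
A minimal sketch of the fix, assuming both pages are mapped in the
current task [copy_page is a made-up name]:

    #include <mach.h>
    #include <strings.h>

    /* was: vm_copy(task_self(), (vm_address_t) src,
     *              vm_page_size, (vm_address_t) dst);
     * vm_copy() is a kernel round-trip; for a page that is already
     * mapped, a plain library copy does the job without a trap. */
    void copy_page(char *src, char *dst)
    {
        bcopy(src, dst, vm_page_size);
    }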