
Re: CMUCL's kernel requirements (long&educational)



> We only create code objects in virgin memory (i.e. has not been
> touched in any way after the last vm_allocate).  Therefore, if
> vm_allocate flushes the allocated range from the various caches, we
> don't need to do any additional flushing.  Does it do this?  (Seems to
> me like it would have to.)

This is a subtle but important point, which needs many secondary pieces
of info to be fully qualified.  The quick answer is "yes in Mach 2.5,
no on Mach 3.0 and all U*xes".

a) When you say "virgin memory" you are talking about VIRTUAL memory.
The PHYSICAL memory behind it had obviously been used for other purposes
before, possibly including a program's instructions, which means
that some of its contents might still be in the I-cache (because the cache
is physically addressed on mips, and thank God it is).

b) The only reason the OS zero-fills freshly allocated memory is
for security: if it didn't, a user would eventually get a page that
contains security-sensitive information.

c) U*x defines clearly what is text, what is data, and what is stack
in its VM system.  Unfortunately data-that-contains-text does not
fit in this simplistic scheme.  Ultrix does not I-flush
pages obtained via sbrk(2), nor does RISCos (MIPS' and SGI's U*x,
based on System V).

d) In Mach I decided that if a page had the VM_PROT_EXECUTE permission
it contained text, regardless of its location in the address space.
I also decided that people would like the OS to make certain guarantees
about such a page, such as the one you ask for: that it be I-cache
coherent with respect to paging operations.

e) In Mach 2.5 vm_allocate() creates pages that have the EXECUTE bit on,
and this is (in retrospect) an oversight on my part.  The cost on a pmax
of flushing the I-cache on a zero-fill page fault (e.g. when you first
touch a vm_allocated page) is very high, over 30% of the total cost.
And zero-filling is a major fraction of, for instance, compilation of a
C program.

f) There are very few programs that execute code they have created on
the fly, and for all of them the costs associated with code generation
go far beyond the cost of a single system call.

g) In Mach 3.0 the default VM protection does not include execute
permission, sparing me that 30% hit.  If a program does what lisp
does, it must vm_protect(2) the memory where it puts instructions
so that it has execute permission.

h) I will spare you a tirade on why vm_allocate(2) is not a primitive
and is implemented in terms of vm_map(2).

Moral:
	If you want your lisp to run on Mach 3.0, add a vm_protect()
	call after vm_allocate() calls.  I checked, and it currently
	dies with all the typical symptoms of cache incoherency on
	my 3max (which runs 3.0, obviously).
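
For illustration only, here is a minimal sketch of the same allocate-then-protect pattern. It is NOT the Mach code: it swaps in the POSIX calls mmap(2) and mprotect(2) in place of vm_allocate() and vm_protect(), so that it compiles on a current U*x.  The helper name install_code is hypothetical.  Since mprotect() is not assumed here to give the I-cache guarantee described in (d), the sketch also calls the GCC/Clang builtin __builtin___clear_cache explicitly.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not part of CMUCL or Mach): copy `len` bytes of
 * generated code into freshly allocated memory, then make that memory
 * executable.  Returns the new address, or NULL on failure. */
void *install_code(const void *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return NULL;

    /* Round the request up to a whole number of pages. */
    size_t psize = (size_t)page;
    size_t size = (len + psize - 1) / psize * psize;

    /* Step 1: allocate zero-filled memory, readable and writable but
     * not executable (the POSIX stand-in for vm_allocate()). */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* Step 2: store the instructions as ordinary data. */
    memcpy(mem, code, len);

    /* Step 3: add execute permission (the stand-in for vm_protect()). */
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return NULL;
    }

    /* Step 4: explicitly make the instruction cache coherent with the
     * bytes just written, rather than relying on the kernel to do it. */
    __builtin___clear_cache((char *)mem, (char *)mem + len);
    return mem;
}
```

A caller would pass the bytes emitted by its code generator and jump to the returned address.  The protection change is the step the Moral asks for; the explicit cache flush is an extra precaution of this sketch.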

sandro-