
A suggestion about working sets



    Date: Fri, 6 Nov 87 07:13 EST
    From: Jeffrey Mark Siskind <Qobi@ZERMATT.LCS.MIT.EDU>
		 While this is true, it is possible to make
    one assumption which is usually a good approximation
    of reality: Processes usually do not share data so
    an object belongs to the process that created it,
    at least for the purpose of determining the total
    storage in use by a process. 
There are *many*, *many* ways in which this assumption
is false.

Things which use data shared between processes:

read (symbols, pathnames)
Waving the mouse around.
Type definitions.
Flavors (the flavor structures, not the instances).
Documentation.
The generic filesystem.
The scheduler.


				 Thus tagging objects
    with the process of their creator would allow two
    tallies to be kept per process: one, the total
    amount of virtual memory in use by that process, and
    two, the total amount of real memory in use by that
    process. Actually, these tallies would have to be
    updated by storage allocation, GC, and paging.
Use make-area.  Realize that a single process doesn't
sit there doing just one thing all the time.  One minute
it may be running the compiler, another, it may be using
your favorite AI system, and another, it may be handling
mouse sensitivity.  Each activity involves a *different*
working-set.  Areas allow you to keep these distinct,
so when you start working on a new phase, you can get
more related objects loaded at a time.  In other words,
areas are *better* than a separate section of memory
for each process.
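
For instance, here is a minimal sketch of the idea (Zetalisp-flavored;
make-area's keywords and the bindability of default-cons-area are from
memory, so treat the exact spellings as approximate):

;; Sketch only: one area per activity, and a binding of
;; default-cons-area so a phase's consing stays on its own pages.
(defvar *ai-system-area*
  (make-area :name 'ai-system-area))

(defun run-ai-phase ()
  (let ((default-cons-area *ai-system-area*))
    ;; Everything consed in here lands in *ai-system-area*,
    ;; keeping this phase's working set on a compact set of pages.
    (do-the-ai-work)))                  ; hypothetical phase body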

    The tallies would allow two additional types of firewalling
    that most conventional systems *DO* provide which
    the Symbolics *DOES NOT* (despite the claims about greater
    firewalling provided by the 3600 architecture).
    First, it would allow me to set a limit on the amount
    of virtual storage that a process can use.  
You can set a limit on the size of an area.
						A process
    that exceeds this would signal an error (perhaps with
    a proceed option to increase that limit and continue)
    rather than allowing a run-away process to trash my
    machine with all of its valuable state in other
    processes. (It takes me over two hours to reboot my
    machine these days and load all of my environment.)

Use incremental disk save.
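
As for the size limit and the proceed option, here is a rough sketch of
how the two could be combined today (the :size keyword to make-area is
my guess at the spelling, the limit-checking function is hypothetical,
and cerror is just the standard proceedable error):

;; Sketch only: a capped area plus a proceedable "raise the limit" error.
(defvar *latex-area*
  (make-area :name 'latex-area :size 4000000))   ; cap in words, illustrative

(defvar *latex-words-limit* 2000000)             ; soft per-activity limit

(defun note-words-consed (words-in-use)
  ;; Offer the proceed option asked for above: raise the limit and go on.
  (when (> words-in-use *latex-words-limit*)
    (cerror "Raise the limit and continue"
            "This activity has consed ~D words, over its limit of ~D."
            words-in-use *latex-words-limit*)
    (setq *latex-words-limit* (* 2 *latex-words-limit*))))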

    Second, it would allow me to set working set guidelines
    for the pager and scheduler to increase performance of
    interactive tasks while still allowing background
    tasks to run.  
This is about the only thing in this entire message that
I can see that we don't already provide.  It might be
valuable to be able to tell the pager that pages in
certain areas are to be swapped out more slowly, because
the currently-interactive process uses them heavily.

		   Often, I run many tasks at once, such
    as reading my Babyl file in Zmail (my Babyl file
    is over 4 megabytes long and takes an hour to read),
Use KBIN files.  It will cut your working-set for Zmail
dramatically, and it loads much, much faster.  (Saving is
about 50% slower, but you are waiting for a save far less
often, and the cost is less than the time you save by not
having to reparse.)

    running LaTeX in one Lisp Listener, running some
    Lisp program in another, and trying to edit a file
    in Zmacs. Although the preemptive scheduler can (and
    presumably does) 
Yes, it does.
		     give a higher priority to the
    interactive task of editing, the thrashing caused by
    other processes paging and reducing the working set
    of the editor causes the editor to be painfully
    slow. If the working set of the higher priority tasks
    is made larger than lower priority ones, then they
    can page all they want, at a lower priority, without
    causing poor paging performance for my interactive
    task.



    Now I know that the overhead of storing a pointer to
    the creating process in each object would be large.
    But if we make another assumption that there are
    usually at most 15 (or 31) processes ever started
    in a lisp world from boot to reboot then only 4 or 5
    bits of tag need to be added to each object (or word).
On my machine at the moment there are 34 processes.
(I cannot recall when I last saw as few as 15; it may
have been before Symbolics was started.)  I looked at one
server over lunch hour (i.e., off peak, but not a light
load) that had 80 processes.  Another had 78.  You're
talking 6 or 7 or 8 bits; i.e., each process ends up
not much better off than a PDP-10.
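
The arithmetic, for the record (integer-length gives the number of
bits needed to hold the count):

(integer-length 15)   ; => 4  the 15-process assumption
(integer-length 31)   ; => 5  the 31-process assumption
(integer-length 34)   ; => 6  my machine right now
(integer-length 80)   ; => 7  the server over lunch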

Offhand, I would guess you've been using the VAX, where
they allocate whole address bits like this.  It's a very inefficient
way to use your address space, and unnecessary.  Lisp implementations
have been using better techniques for dividing up memory
since BIBOP Maclisp, which as I recall was done about 12
years ago.  Use areas.
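
For anyone who hasn't run into it, BIBOP ("big bag of pages") recovers
an object's type, or here its owner, from the page its address falls
on, via a table, so it costs zero bits per object.  A toy sketch (the
names and sizes are illustrative, not the real Maclisp code):

;; Toy BIBOP: owner is a function of the page number, not of tag bits.
(defconstant page-size 256)                       ; words per page, illustrative

(defvar *page-owner-table*
  (make-array 65536 :initial-element nil))        ; one entry per page

(defun object-owner (address)
  ;; No bits stored in the object itself; just index by page number.
  (aref *page-owner-table* (floor address page-size)))

(defun claim-pages (start-address n-pages owner)
  ;; The allocator records the owner when it carves out pages.
  (let ((first-page (floor start-address page-size)))
    (dotimes (i n-pages)
      (setf (aref *page-owner-table* (+ first-page i)) owner))))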

    When that runs out, the remaining code can be used as
    an overflow code indicating that the storage owner is
    anonymous. This method would require hardware not
    available in the 36xx architecture, 
Why?  It sounds like you think memory allocation or paging
are done in hardware or something.  They aren't.

					but perhaps
    something like the area mechanism can be adapted
    to automatically have each process cons in a different
    area. I don't know if the I-machine architecture
    has the capability to support the ideas discussed
    here but perhaps the next machine (the J-machine?)
    could.
Hint:  The 3600 was known internally as the "L-machine".
	    Jeff
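
P.S.  A sketch of that last idea, giving each new process its own cons
area by default (process-run-function is real; make-area and
default-cons-area are from memory, so the spellings are approximate):

;; Sketch only: each process conses into its own area unless it rebinds
;; default-cons-area itself.
(defun start-process-with-own-area (name top-level-function &rest args)
  (let ((area (make-area :name (intern (string-upcase name)))))
    (process-run-function name
      #'(lambda ()
          (let ((default-cons-area area))
            (apply top-level-function args))))))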