Request for help with paging.
Date: Tue, 14 Mar 89 20:08:41 EST
From: whuts!davel@att.att.com
We have written a system which appears to have a grave difficulty with
thrashing in the paging system. When freshly loaded, the system
appears to spend perhaps 10% of its process time in the paging system,
but after about four hours of continuous running, paging system time
increases dramatically to 80% or so. These results have been confirmed
with both the TIME macro and the Metering system. According to the
metering system, there is no small subset of functions responsible for
most of the paging.
It's usually the data structures that matter in paging, not the functions.
So far, we have found that our efforts to increase
locality of reference by using many areas have yielded minimal
improvement. We had expected the problem to be solved by the use of areas;
we have a physical memory of 6MW, each area is smaller than
4MW or so for the first five hours of the run, and references between
areas should be uncommon.
The point of using areas is not to minimize interarea references; if
that were the case you'd do best with one area. The point is to
optimize main memory usage by concentrating useful objects. That is,
objects that are likely to be used at the same time should be located
together. This minimizes the amount of main memory wasted storing
objects that aren't useful at the moment but happen to be on the same
page as useful objects.
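To make the effect concrete, here is a minimal sketch in portable Common Lisp (not Genera code). It assumes a simplified model in which objects are laid out consecutively with a fixed number per page; the function names and the :hot/:cold labels are hypothetical and exist only for illustration. It counts how many pages must be resident to hold all the "hot" objects under two layouts.

```lisp
;;; Portable Common Lisp sketch; nothing here is Genera-specific.
;;; Assumed model: objects are allocated consecutively, OBJECTS-PER-PAGE to a page.

(defun pages-touched (layout predicate objects-per-page)
  "Count the distinct pages holding at least one object that satisfies PREDICATE.
LAYOUT is a list of objects in allocation order."
  (let ((pages '()))
    (loop for object in layout
          for index from 0
          when (funcall predicate object)
            do (pushnew (floor index objects-per-page) pages))
    (length pages)))

(defun interleaved-layout (n-hot n-cold-per-hot)
  "One hot object followed by N-COLD-PER-HOT cold ones, repeated N-HOT times."
  (loop repeat n-hot
        append (cons :hot (make-list n-cold-per-hot :initial-element :cold))))

(defun segregated-layout (n-hot n-cold-per-hot)
  "All hot objects first (one area), then all cold objects (another area)."
  (append (make-list n-hot :initial-element :hot)
          (make-list (* n-hot n-cold-per-hot) :initial-element :cold)))

(defun hot-p (object)
  (eq object :hot))
```

With 64 hot objects, 7 cold objects per hot one, and 8 objects per page, (pages-touched (interleaved-layout 64 7) #'hot-p 8) returns 64, while (pages-touched (segregated-layout 64 7) #'hot-p 8) returns 8. The same working set needs one eighth of the main memory once the hot objects share pages.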
As an example, consider the way Genera stores compiled code and
related objects. Some of the objects are used at compile time (macro
definitions, etc), some at run time (most compiled functions), and
some only when debugging (debug-info). Genera separates these using
areas (compiled-function-area and debug-info-area). The compiled
function area is actually partitioned internally by Optimize World,
which is how the distinction between compiled functions is made.
Using areas is a delicate art, and it's very easy to end up making
things worse. You should get DLA to send you a copy of his thesis,
which I believe was excerpted in an early issue of Lisp Pointers.
For reference, we are using a 3675 with a swap
space of 150MW, but using a smaller swap space (75MW) seems to have no
effect on performance.
1. What is the paging scheme used by the 3600-series (e.g.,
least-recently-used), anyway? Software support doesn't know and has had
some difficulties getting a hold of a developer who does.
Genera uses a fairly standard LRU approximation for page replacement:
a clock algorithm supported by hardware-maintained reference tags.
3600 and Ivory machines make approximately the same page replacement
decisions, but the Ivory implementation uses fewer CPU cycles to do so.
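For readers unfamiliar with the technique, the following is a minimal sketch of the textbook clock (second-chance) algorithm in portable Common Lisp. It is not the Genera implementation: the structure and function names are hypothetical, the reference bits are an ordinary vector standing in for the hardware-maintained tags, and pages are assumed to be non-NIL symbols or integers.

```lisp
;;; Textbook clock page replacement; an illustrative sketch only.

(defstruct clock-state
  pages        ; vector: page resident in each frame, or NIL if the frame is free
  referenced   ; vector: reference bit for each frame
  (hand 0))    ; index of the frame the clock hand points at

(defun new-clock (n-frames)
  (make-clock-state :pages (make-array n-frames :initial-element nil)
                    :referenced (make-array n-frames :initial-element nil)))

(defun touch-page (state page)
  "Reference PAGE. Return :HIT if it was resident; otherwise load it and
return the evicted page, or NIL if a free frame was used."
  (let* ((pages (clock-state-pages state))
         (referenced (clock-state-referenced state))
         (frame (position page pages)))
    (cond (frame
           ;; Resident: only the reference bit changes.
           (setf (aref referenced frame) t)
           :hit)
          (t
           ;; Fault: advance the hand, clearing reference bits, until it
           ;; reaches a frame whose bit is already clear.
           (loop while (aref referenced (clock-state-hand state))
                 do (setf (aref referenced (clock-state-hand state)) nil)
                    (setf (clock-state-hand state)
                          (mod (1+ (clock-state-hand state)) (length pages))))
           (let* ((victim-frame (clock-state-hand state))
                  (victim (aref pages victim-frame)))
             (setf (aref pages victim-frame) page
                   (aref referenced victim-frame) t
                   (clock-state-hand state)
                   (mod (1+ victim-frame) (length pages)))
             victim)))))
```

With three frames and the reference string A B C D B E, the fault on D evicts A, and the fault on E evicts C rather than B, because B was referenced after the hand last cleared its bit. Strict LRU would make the same choices here; the two can differ on other reference strings, which is why clock is called an approximation.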
2. Has anyone had similar difficulties with paging? How did you solve
them?
Anyone who's used a virtual memory system has had problems with
paging at some point. They're usually solvable with patient analysis
followed up by careful design and implementation, but it's often
cheaper just to buy more memory boards. I believe there are several
people in the Symbolics consulting group with experience at
optimizing paging performance; you might consider their services.
3. We are particularly alarmed by the fact that increasing the
number of areas (and, we believe, increasing locality of reference)
has had so little impact. Does anyone know of a way to find out
how much physical memory is devoted to each area? Is there any way
to measure locality of reference directly, such as counting the
number of pointer references across area boundaries?
One thing you might try is page tracing. Read
sys:l-sys;page-trace.lisp for instructions. Trace a portion of your
system when it's thrashing, and then spend a day poring through the
trace. I'm sure you'll learn some interesting things.
Here's a function to describe main memory usage by area:
;;; -*- Mode:Lisp; Syntax: Zetalisp; Package: SYSTEM-INTERNALS; Lowercase:T; Base:8; -*-
(defun describe-main-memory (&key (depth 10.))
  (labels ((percentage (fraction total)
             (if (zerop total) 0.0 (* 100.0 (// fraction (float total))))))
    (let ((array (make-array (n-areas) :initial-value 0))
          (list nil)
          (invalid 0)
          (valid 0))
      (without-interrupts
        (loop for mmpt-index below *mmpt-size* do
          (if (= (mmpt-invalid-vpn mmpt-index) 0)
              (incf (aref array (%area-number (dpb (mmpt-vpn mmpt-index) %%vma-page-num 0))))
              (incf invalid))))
      (loop for area from 0 below (si:n-areas) do
        (push (cons area (aref array area)) list))
      (format t "~&Main memory breakdown:")
      (format t "~& ~:D total pages, ~:D valid pages (~1$%)~%"
              *mmpt-size* (setq valid (- *mmpt-size* invalid))
              (percentage valid *mmpt-size*))
      (formatting-table ()
        (with-character-face (:italic)
          (formatting-column-headings () "Area" "Pages" "Fraction"))
        (loop repeat depth
              for (area . frames) in (sort list (lambda (a b) (> (cdr a) (cdr b))))
              do
          (formatting-row ()
            (formatting-cell () (format t "~A" (area-name area)))
            (formatting-cell () (format t "~D" frames))
            (formatting-cell () (format t "~1$%" (percentage frames valid)))))))))
David Loewenstern
<davel@whuts.att.com; backbone!{moss || ihnp4}!whuts!davel>
AT&T Bell Laboratories
14B-253
Whippany, NJ 07981
201-386-6516