
Unable to deliver letter



Unable to deliver letter to the following recipient:
  SLUG@R20.UTexas.EDU:
    After trying without success for 3 days to contact the relevant
    domain servers needed to resolve the domain name "R20.UTexas.EDU",
    the Mailer must presume that there probably is no such domain.

----- Text of letter follows -----
Received: from EUPHRATES.SCRC.Symbolics.COM by STONY-BROOK.SCRC.Symbolics.COM via CHAOS with CHAOS-MAIL id 361457; Fri 11-Mar-88 18:22:24 EST
Date: Fri, 11 Mar 88 18:22 EST
From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>
Subject: A question about tweaking the 3600 Lisp Compiler
To: Jeffrey Mark Siskind <Qobi@ZERMATT.LCS.MIT.EDU>
cc: Customer-Reports@STONY-BROOK.SCRC.Symbolics.COM, SLUG@R20.UTexas.EDU,
    Discuss-Compiler@STONY-BROOK.SCRC.Symbolics.COM
In-Reply-To: <880308012605.1.QOBI@WHIRLWIND.LCS.MIT.EDU>
Message-ID: <19880311232215.1.MOON@EUPHRATES.SCRC.Symbolics.COM>

    Date: Tue, 8 Mar 88 01:26 EST
    From: Jeffrey Mark Siskind <Qobi@ZERMATT.LCS.MIT.EDU>

	Date: Mon, 7 Mar 88 11:22 EST
	From: David A. Moon <Moon@STONY-BROOK.SCRC.Symbolics.COM>

	    Date: Fri, 4 Mar 88 11:00 PST
	    From: Siskind.pa@Xerox.COM

	    I am doing some experiments where I compile SAT problems
	    (Boolean satisfiability) and CSP problems (Constraint Satisfaction
	    problems) directly into very tight Lisp code. The code is very simple.
	    It is not very Lisp-like and resembles assembler code. It is all one
	    large PROG with lots of IF-EQ-GOTOs and some SETQs to local variables.
	    I can generate such code from the SAT/CSP problems quite quickly. The good
	    thing about such code is that once it is compiled it runs exceedingly
	    quickly (about 10,000 times faster than other more sophisticated SAT/CSP
	    solvers which I have tried). The only problem is that the 3600 Lisp compiler
	    takes forever to compile the large procedures which my SAT/CSP-Lisp translator
	    generates, because of all of the optimization which is being done.

	Are you sure it's the optimization, or is that just an assumption?

    I'm pretty sure. In the example shown below, 58% is spent in COMPILER:OPTIMIZE-FORM.

The fact that 58% of the time was spent in a function with that name, and in functions
that it calls, does not prove that optimization is where the time is being spent.
For example, macro expansion takes place inside of that function.  So do a bunch
of other things that aren't optimization.

									      All of
	    that optimization is useless for this application as the code which I
	    generate has a trivial one-to-one correspondence to the most optimal
	    3600 machine instruction sequences which the compiler ends up producing.
	    I tried setting both COMPILER:*PHASE-1-OPTIMIZE-P* and COMPILER:*RUN-PHASE-3*
	    to NIL. That helped a bit but not enough.

    Just for your info, it was Eric Weaver who suggested that I turn off those two switches.

	    Question 1: How do I turn off ALL optimizations?

	I would first use metering tools, even crude ones like pressing c-m-Suspend
	a statistically significant number of times while compiling, to find out
	where the time is being spent.  Only then should you worry about turning
	things off.

    Here are the results of using my profiling tools. (Do you remember the help you gave
    me several months ago? Well, this tool which I wrote is the result of that help. I find
    it really easy to use, and it provides simple, uncluttered, and easy-to-interpret
    information. I'll send you a copy if you want. I would submit it to SLUG but I don't know how.)

    Anyway, first I ran the compiler with COMPILER:*PHASE-1-OPTIMIZE-P* and COMPILER:*RUN-PHASE-3*
    both set to T. It took 1715 samples (at 10 samples per second) to compile my one function.
    Here are the percentages of time spent inside each function. (The times are cumulative:
    if A calls B, then B's time is also charged against A.)

The use of cumulative percentages on a program as highly recursive as the compiler
means that there is essentially no information here on where the time is actually
being spent.
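
A made-up illustration of the problem (the three functions are hypothetical):

  (defun a () (b))                         ; charged ~100%
  (defun b () (c))                         ; charged ~100%
  (defun c () (dotimes (i 1000000)))       ; charged ~100%, does all the work

With cumulative accounting, A, B, and C are each charged for the same
samples, so the three percentages overlap; they cannot be added or
subtracted to find out which function's own body the time is in.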

    A total of 1715 samples were taken.

    The following system functions were profiled:
     92%  COMPILER:COMPILE-TO-CORE
     92%  (:INTERNAL COMPILER:COMPILE-TO-CORE 1 (:DUMP-LAMBDA-EXPRESSION))
     92%  COMPILE
     92%  COMPILER:COMPILE-DEFINITION-1
     92%  COMPILER:COMPILE-LAMBDA-EXP
     69%  COMPILER:PHASE-1-BLOCK
     69%  COMPILER:RUN-PHASE-1-FUNCTION
     69%  (:PROPERTY PROG COMPILER:PHASE-1)
     69%  COMPILER:PHASE-1-EXTERNAL-FUNCTION
     69%  COMPILER:PHASE-1-FUNCTION-DEFINITION
     69%  COMPILER:PHASE-1-PROG
     69%  COMPILER:PHASE-1
     61%  COMPILER:TRANSFORM
     58%  COMPILER:OPTIMIZE-FORM
     55%  (:PROPERTY IF COMPILER:PHASE-1)
     37%  COMPILER:TRANSFORM-PREDICATE
     21%  COMPILER:ASSEMBLE-FUNCTION
     21%  COMPILER:RUN-PHASE-4
     17%  COMPILER:FIXUP-BRANCHES
     15%  COMPILER:PASS-OVER-BRANCHES
     11%  SYS:%ALLOCATE-BLOCK
     10%  SI:%GC-SCAVENGE
      6%  SI:%ALLOCATE-LIST-BLOCK-ESCAPE
      6%  COMPILER:PHASE-1-PROGN

    I then ran the compiler again with both switches set to NIL. Here are the results.
    (For your info, the function being compiled is a CSP (Constraint Satisfaction problem)
    encoding of the 8-Queens problem.)

The only information I got from the second profile is that failing to exclude
the GC is making your results unreproducible.  Thus I can't compare the two
profiles to see what the effect of setting those switches was.

If you have access at MIT to Genera 7.2, you might try the metering tools
there.  I'm told they're really good, and not as prone to the artifacts that
are showing up in your metering here.  That's all I know; I have not actually
used the 7.2 metering tools myself, as I have not done any metering since
they became available to me.

    A total of 974 samples were taken.

    The following system functions were profiled:
     74%  COMPILER:COMPILE-TO-CORE
     74%  (:INTERNAL COMPILER:COMPILE-TO-CORE 1 (:DUMP-LAMBDA-EXPRESSION))
     74%  COMPILE
     72%  COMPILER:COMPILE-DEFINITION-1
     72%  COMPILER:COMPILE-LAMBDA-EXP
     43%  SYS:%ALLOCATE-BLOCK
     41%  SI:%GC-SCAVENGE
     38%  COMPILER:ASSEMBLE-FUNCTION
     38%  COMPILER:RUN-PHASE-4
     34%  COMPILER:PHASE-1-BLOCK
     34%  COMPILER:RUN-PHASE-1-FUNCTION
     34%  (:PROPERTY PROG COMPILER:PHASE-1)
     34%  COMPILER:PHASE-1-EXTERNAL-FUNCTION
     34%  COMPILER:PHASE-1-FUNCTION-DEFINITION
     34%  COMPILER:PHASE-1-PROG
     34%  COMPILER:PHASE-1
     26%  (:PROPERTY IF COMPILER:PHASE-1)
     23%  SI:%ALLOCATE-STRUCTURE-BLOCK-ESCAPE
     23%  SI:SIMPLE-MAKE-ARRAY-NSS-AREA
     23%  COMPILER:FIXUP-BRANCHES
     20%  SI:%ALLOCATE-LIST-BLOCK-ESCAPE
     19%  COMPILER:PASS-OVER-BRANCHES
     13%  COMPILER:ASSEMBLE-INTERVAL
     11%  APPEND
     11%  REDUCE
     11%  SI:%MAKE-LIST
      9%  COMPILER:PHASE-1-FUNCTION-CALL
      9%  (:PROPERTY COMPILER:JUMP COMPILER:ASSEMBLE)
      8%  COMPILER:TRANSFORM
      8%  COMPILER:NOTE-BRANCH
      6%  COMPILED-CSP-SOLVER-INTERNAL

	You can start by eschewing macros like IF and PUSH, as well as optimizable
	functions like LIST, if you're really aware of a one-to-one correspondence
	between your code and machine instructions.

    Are you trying to tell me that COND is more primitive than IF? 

Yes, but as it turns out, some time during the last three years while I wasn't
looking this was changed.  IF is now the primitive, and what I said about it
being a macro was wrong.  My comments still apply to PUSH, though.
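
For instance, the expansion is easy to inspect (the expansion shown in the
comment is illustrative, not guaranteed output):

  (macroexpand '(push x stack))
  ;; expands into something along the lines of
  ;; (SETQ STACK (CONS X STACK))

Writing the SETQ/CONS form yourself gives the code walker one less macro
to expand.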

								   Anyway, macro
    expansion is trivial for my examples and doesn't take much time. I will however
    follow your suggestion and not use macros. BTW, what are the non-optimizable
    or pre-optimized versions of LIST?

Look at the COMPILER:OPTIMIZED-INTO property of LIST.
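
A sketch of how to look at it, assuming the property is stored on the
symbol's property list and readable with GET:

  (get 'list 'compiler:optimized-into)
  ;; The value should name the lower-level functions that calls to LIST
  ;; are rewritten into; calling those directly sidesteps that rewrite.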

	    Question 2: I know the machine code sequences which I want to generate.
	    There are only two general templates of length 6 and 2 instructions
	    respectively. Can I use an assembler directly, and if so, how do I do that?

	It's a little known fact, but there is a crude assembler buried somewhere
	in the system.  I'd rather not tell you how to use it, for three reasons.
	One is that it won't actually work for you; without some modifications, it
	won't assemble code that you can call as an ordinary function.  Second is
	that if you adopt that approach, your code won't port to future machine
	offerings.  Third, because of the way that assembler is implemented, it
	may well be slower than the compiler.  We can keep this idea in reserve
	in case better ideas don't pan out, but currently I doubt that the assembler
	will be the answer to your problems.

    I don't need to call the assembler generated function as an ordinary function.

How else do you propose to call it?

    The whole purpose of the experiment is to figure out how large the constant factor
    is between CSP solvers which compile into machine code and CSP solvers which
    function more like interpreters. The claim which I am trying to verify is that
    variable reordering schemes, which lend themselves better to interpreted methods,
    do not buy back the inefficiency lost from interpretation instead of compilation.

This is beyond me, sorry.  I'm not familiar with these issues.

	    Question 3: Is there an indirect GOTO instruction? If there is, how do I
	    access it directly from Lisp?

	In 7.2 the CASE construct compiles into an indexed branch when the keys
	are integers in a small range.  I don't believe this instruction exists
	in 7.1.  In all releases, the undocumented SYS:%JUMP subprimitive
	transfers control to a PC.  I do not recommend using it as it's very
	easy to get into trouble with it.  I'm not aware of any special form for
	converting a GO-tag into a PC value.
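
For the 7.2 CASE path, this is a sketch of the kind of form I would expect
to qualify (small, dense integer keys; the function and its keywords are
made up for illustration):

  (defun dispatch (i)
    (case i
      (0 :state-0)        ; small consecutive integer keys
      (1 :state-1)        ; should compile into an indexed branch in 7.2
      (2 :state-2)
      (otherwise :fail)))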

    The SYS:%JUMP subprimitive is exactly what I need. I realize that I will get into
    trouble with this. I am not producing production code, however; I am doing an
    experiment which is not very Lisp-like in nature, but since I do all of my work
    on a 3600 (for obvious reasons) I need to be able to do this one assembler-level
    experiment to verify a conjecture.

    I will need to be able to convert a GO-tag into a PC value for this to work
    at all.

I don't know how to do this.

					  I also need to be able to compare two
	    PCs as integers and determine which is larger before I branch indirectly
	    to one of them. How can I do that?

	I don't understand why you would want to do this nor how you would get
	your hands on PC values.  However, given that you have two PC values,
	the subprimitive SYS:%POINTER-LESSP will compare them, as documented.

    If you want to know the real nitty-gritty reasons, here they are. One of the
    CSP techniques which I am trying to code up is called Selective Backtracking
    (no, it is not full Dependency-Directed Backtracking, only one aspect of it).
    This technique computes nogoods which are sets of mutually inconsistent
    assumptions. It turns out that assumptions can be most efficiently represented
    as PC values for the code which made the assumption. One thing that Selective
    Backtracking must do is take a nogood and find the most recent assumption
    in it. If the order of assumptions is fixed at compile time (no variable reordering)
    then I can find the latest assumption by finding the largest PC-value.
    All of this can be compiled into very tight machine code on almost any architecture.
    My conjecture is that the efficiency of such tight code far outweighs the benefits
    of variable reordering which makes such tight code optimization tricks impossible.
    In the general case, I will need to create arbitrary data structures representing
    hierarchical nogoods over assumptions, which in this case will be PC-values, and
    then do a form of resolution on these hierarchical nogoods. The resolution operation
    will need to be able to compare pointers for = and <.
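
In outline, the latest-assumption operation you describe would look
something like this (a sketch only: it assumes you already have the PC
values in hand, and uses only the documented SYS:%POINTER-LESSP
subprimitive):

  (defun latest-assumption (nogood)
    ;; NOGOOD is a list of PC values; with assumption order fixed at
    ;; compile time, the most recent assumption is the largest PC.
    (let ((latest (first nogood)))
      (dolist (pc (rest nogood) latest)
        (when (sys:%pointer-lessp latest pc)
          (setq latest pc)))))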

	I hope you find these suggestions helpful.  If you have more questions,
	feel free to ask me.  I may take a few days to respond.  Be sure to copy
	Customer-Reports and Discuss-Compiler on all mail.

    I really do appreciate these suggestions.

	    Below is a sample piece of code which my SAT translator generates:

	    (LAMBDA (ALL?)
	      (PROG (RESULTS P0001 P0002 P0003)
		    (SETQ RESULTS 'NIL)
		    (SETQ P0001 NIL)
		    (GO N0004)
	      F0005 (IF P0001 (RETURN RESULTS))
		    (SETQ P0001 T)
	      N0004 (IF (NOT P0001) (GO S0006))
		    (GO F0005)
	      S0006 (SETQ P0002 NIL)
		    (GO N0007)
	      F0008 (IF P0002 (GO F0005))
		    (SETQ P0002 T)
	      N0007 (IF P0001 (GO S0009))
		    (IF (NOT P0002) (GO S0009))
		    (GO F0008)
	      S0009 (SETQ P0003 NIL)
		    (GO N0010)
	      F0011 (IF P0003 (GO F0008))
		    (SETQ P0003 T)
	      N0010 (IF P0002 (GO S0012))
		    (IF (NOT P0003) (GO S0012))
		    (GO F0011)
	      S0012 (PUSH (LIST (CONS 'A (IF P0001 :TRUE :FALSE))
				(CONS 'B (IF P0002 :TRUE :FALSE))
				(CONS 'C (IF P0003 :TRUE :FALSE)))
			  RESULTS)
		    (IF ALL? (GO F0011) (RETURN RESULTS))))