Re: why is tree-shaking hard?



I don't know what others have done, but our tentative tree-shaking effort
(no longer supported) completely ignored the possibility of functions becoming
accessible via INTERN, READ, etc.  Even so, tree-shaking was only marginally
effective: the resulting applications were still several megabytes.

The problem was that a lot of CL was implicitly used by the "run-time system".
In particular, unless you totally gave up the possibility of debugging, any
error could land you in the debugger, which used READ, PRINT, FORMAT, the
sequence functions, and any other hairy CL feature you care to think of.
Also, tree shaking can never eliminate support for things like bignums.

What our "tree-shaker" did was simply destroy the package system and then do a
GC.  Any symbols remaining were put back in the package they were originally
in, allowing debugging.  It was kind of amusing to note that you could still
evaluate just about any CL expression that randomly came to mind without
calling any of the functions in the two-thirds of the image that was missing.
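The trick amounts to using the garbage collector itself as the reachability
analysis: unintern everything, let the GC reclaim whatever nothing still
references, then re-intern the survivors.  Here is a minimal sketch of that
reachability step in Python, with a made-up call graph standing in for the
image (all names are hypothetical, not CMU CL's):

```python
# Tree shaking as garbage collection: entry points are the GC roots,
# and any function the roots cannot reach is dropped from the image.

CALL_GRAPH = {
    "main":      ["format", "parse"],
    "parse":     ["read-char"],
    "format":    [],
    "read-char": [],
    "compile":   ["eval", "format"],   # nothing reaches these from main,
    "eval":      ["read", "format"],   # so the "GC" reclaims them
    "read":      ["read-char"],
}

def shake(roots):
    """Mark phase: return the set of functions reachable from the roots."""
    live = set()
    stack = list(roots)
    while stack:
        fn = stack.pop()
        if fn not in live:
            live.add(fn)
            stack.extend(CALL_GRAPH.get(fn, []))
    return live

live = shake(["main"])
dead = set(CALL_GRAPH) - live   # what the sweep would discard
```

The interesting part of the real hack is that no explicit call graph is
needed: uninterning the symbols removes the package system's references, so
the ordinary GC's notion of reachability does the marking for free.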

BTW, when you look at the distributed CMU CL image, a whopping fraction of it
is the compiler and PCL.  By simply not loading some packages and
byte-compiling others, we get our "runtime" image, which is 40% the size of
the standard image.  This has all of CLtL1 except general EVAL and COMPILE,
and isn't that much bigger than what you get from non-aggressive tree-shaking.

You can definitely get better control over size by not loading the stuff you
don't need.  One way this has been done in Lisp is "autoloading", but that
isn't a complete solution: it still requires as much or more disk space, and
it may result in worse sharing of the image in multi-user environments.
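For illustration, autoloading amounts to installing a cheap stub that loads
the real definition on first call and then replaces itself.  A minimal Python
sketch, with hypothetical names standing in for a definition that would really
live in a separate file on disk:

```python
# Autoloading: the image ships with a tiny stub; the real code is loaded
# lazily, the first time anyone calls it.

REGISTRY = {}   # stands in for the function namespace

def real_bignum_multiply(a, b):
    # stands in for a definition loaded from a separate library file
    return a * b

def autoload(name, loader):
    """Install a stub that loads the real definition on first use."""
    def stub(*args, **kwargs):
        fn = loader()            # "load the library" on first call
        REGISTRY[name] = fn      # replace the stub with the real thing
        return fn(*args, **kwargs)
    REGISTRY[name] = stub

autoload("bignum-multiply", lambda: real_bignum_multiply)
result = REGISTRY["bignum-multiply"](10**20, 3)   # triggers the load
```

This shows why autoloading doesn't save disk space: every autoloadable
definition still has to exist somewhere, ready to be loaded.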

It's hard to beat the idea of libraries which are explicitly loaded when
needed (though tree shaking may still be useful if only a fraction of a
library is used).

  Rob