Re: Extending the address space of MIT Cscheme (long reply)
I suppose I was hoping that the CSchemers felt sufficiently guilty that
they would agree with my remark by their silence - I see this is not the
case!
> * When you refer to CScheme's "coding style" are you referring to
> the portion written in C, the portion written in Scheme, or both?
Primarily the portion written in C. I suppose I could admire the Scheme
portion if I knew why it existed and what the various pieces are supposed
to do (not being a brilliant MIT student, I can't just look at a half-page
long S-expression and comprehend it instantly).
> * How about some specific examples of design/engineering decisions
> that you consider flaws, and why?
Gazillions of builtin types. There are twice as many primitive types of
objects as in any other high-level language that I know of (3- and
4-element hunks? give me a break!). Many could have been built as
nonprimitives.
See T for good efforts in that direction.
Attempts to optimize the C code implementing a virtual machine. If you use
a virtual machine, you've already lost speedwise; doing complicated C hacks
isn't going to recover much for you. (Presumably that's the reason for
hundreds of C macros that could have been function calls.)
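
To make the macro point concrete, here is a minimal sketch in C. It is
not taken from the CScheme sources; the Cell type and the names
SECOND_MACRO, second_fn, and counted are invented for illustration. It
shows the same accessor written as a macro and as a function, and why
the macro form is harder to reason about: its argument is evaluated
once per textual occurrence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cons cell; not CScheme's actual object layout. */
typedef struct Cell {
    long car;
    struct Cell *cdr;
} Cell;

/* Macro form: the argument text is pasted in wherever (p) appears,
   so it is evaluated once per occurrence that actually executes. */
#define SECOND_MACRO(p) ((p)->cdr ? (p)->cdr->car : -1L)

/* Function form: the argument is evaluated exactly once. */
static long second_fn(const Cell *p)
{
    return p->cdr ? p->cdr->car : -1L;
}

/* Helper that counts its own calls, to expose repeated evaluation. */
static int calls = 0;
static Cell *counted(Cell *p)
{
    calls++;
    return p;
}

static void self_check(void)
{
    Cell b = {2, NULL};
    Cell a = {1, &b};

    calls = 0;
    assert(SECOND_MACRO(counted(&a)) == 2);
    assert(calls == 2);   /* macro evaluated its argument twice */

    calls = 0;
    assert(second_fn(counted(&a)) == 2);
    assert(calls == 1);   /* function evaluated it once */
}
```

Whether the function form costs anything measurable depends on the
compiler and the machine; the sketch only shows the difference in
behavior, not in speed.
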
Writing nonprimitives in C. My eye happened to fall on list_to_string,
which is about three times longer and more complicated in C than in Scheme.
What earthly reason could there be for this? I suppose it could be worse;
KCL is an example. On the other hand, even KCL doesn't put a Fast Fourier
Transform and a regular expression matcher in its C "microcode"...
References to apparent GC in all kinds of strange places. When I followed
them, the trail disappeared in a maze of macros.
References to the compiler and Edwin, thus violating every principle of
abstraction known to exist.
Strange code concerning "MIT ASCII" (that's what it said) vs regular
ASCII characters. Given that CScheme is "portable", why does this get
included in everybody's copy?
There are probably others; I haven't looked at every single one of the
120+ files and 50K+ lines of the C code...
> * Have you seen a lisp of comparable (or greater) functionality
> that is significantly better in either respect? Please name it,
> and explain why.
In the absence of a manual describing the "functionality" of CScheme,
it's impossible to compare it on that basis. I hope there's lots of
functionality, since CScheme is larger than commercial Common Lisps (which
is amusing considering how Schemers abuse Common Lisp for its "bloat").
Spice Lisp/CMU Common Lisp has its flaws, but it's better overall (for
one thing, it's smaller!). T/Orbit has better structure and style, but
its documentation is too scanty to recommend. I would say that PSL/PCLS
is cleaner, but I am biased!
> * When you refer to "experiences in trying to understand CScheme",
> how much of that can be attributed to lack of documentation? Have
> you ever tried to understand another program of comparable
> complexity?
Lack of documentation is an unforgivable omission, and is by far my
biggest gripe about CScheme. It shouldn't take two hours to figure
out what kind of garbage collection is being done, or to figure out
what the "danger bit" does. I've worked with many programs on that
scale during my 12 years in computing, and to be fair, most large
programs are difficult to understand, even with documentation (TeX
for instance). This just means that *more* documentation is required,
not less! To put it another way, lesser minds can't be amazed by your
cleverness if they don't even know what's going on...
> For example, while I know a great deal about compilers, I would
> expect to spend a great deal of time and effort trying to
> understand a good one, such as GNU CC. I'd probably get pretty
> pissed off at the crummy way that certain things were designed.
> But I wouldn't blame this on RMS being a poor programmer. The
> real problem is that such a program is so complex that even the
> best programmer can't make it perfect.
A bogus argument - I'm not asking for perfection, but a minimal standard
of quality. There should *at least* be a one-line justification for the
existence of functions and macros.
BTW, I don't want to claim that anyone is a "poor programmer"! A phrase
comes to mind (don't remember from where), that there are only a few
Gandhi-like programmers who are never tempted to write bad code...
>A general statement like yours is easy to make and can convey false
>impressions. I'd rather not have my reputation (and that of my
>colleagues) tarnished by vague accusations about the quality of our
>work. On the other hand, I welcome specific, well-reasoned criticism.
I'm sorry to say it, but your reputation has already been tarnished by
the code you've allowed to travel throughout the world. Hopefully you
prefer to have someone say it to your face (so to speak :-) ) than
behind your back.
>* By and large, I'm very proud of the part of CScheme that is
>implemented in Scheme. I believe that it captures a great deal of
>modularity and abstraction, contains many fine symmetries, and in
>general is very robust. It also pushes some of our language
>technology to the limit: in particular it suffers from the lack of a
>good module description facility.
So why didn't you explain all these "abstractions" and "fine symmetries"?
>* The part written in C is a very different story. I think a lot of
>this can be attributed to C itself. Some of it can be attributed to
>the fact that most of us in the Scheme group at MIT hate C, and
>therefore don't want to do the work that it would take to make it a
>beautiful C program. In fact, I'm not sure it would be possible to
>make it beautiful, although clearly it could be much much better.
I find it odd that haters of C would use macros to the extent that CScheme
does. I find it odd that haters of C would write so much of it.
I find it odd that people who are aware of C's problems wouldn't try
to alleviate them by commenting on what the code is up to.
>One thing that everyone here has agreed upon: we would all like to
>rewrite that C program in Scheme or a nice Scheme-like language if
>only the compiler technology were good enough to let us keep the
>portability and performance. Well, I think that will be true in
>another year or two, and then maybe it will happen.
This sounds like a veiled criticism of T and Orbit, since they are portable
and fast. KCL compiles to C code, so it maintains portability at that
level. (Someone could do a great public service by reimplementing KCL
in a less idiosyncratic way - the basic idea is quite sound.)
>* We have virtually no documentation. This is obviously a terrible
>thing, and we are in fact generating some. But the bottom line for
>this is simply lack of time, plus the fact that none of us has much
>text writing experience.
I hear that excuse from froshes, and don't accept it from them either.
Documentation after the fact is inherently inferior, and leaves out
important details that the programmers have forgotten about.
>It is not widely known that CScheme, along with the Liar compiler, the
>Edwin text editor, and a previous 68000-based implementation, have
>largely been created by three people over about 5 years. Two of us
>had significant other commitments during that time which reduced the
>amount of effort that could be devoted to the project. Yet we
>generated over a quarter million lines of hairy code in about 10 to 12
>person-years. It is only now that our project has expanded to the
>point where we feel that time is available for documentation.
Perhaps if you had thought more carefully about what you were doing,
it wouldn't have been necessary to write so much code. And surely five
years is enough time to think about writing more documentation!
Still, I do appreciate where you're coming from - it's something that
professional SEs have to contend with all the time (my own views have
no doubt been colored by one of my first jobs, which was to document
40,000 lines of Fortran written by several other people).
>* Various representation decisions that we have made have been
>criticised at various times. In particular: high tags vs. low tags;
>SCode vs. byte code; and more recently, register-based vs. stack-based
>calling convention. I can produce reasonable arguments for all of
>these decisions. One thing that I have noticed is that the
>"community" seems to have preconceptions about some of these
>alternatives being better than others. In some cases we've tested
>them and found that the "common knowledge" is misleading: often the
>performance difference between two of the alternatives is very slight.
I would like to see the data - there are a huge number of undocumented
claims like this. It is true that if you use a virtual machine, then
a lot of representation decisions are unimportant. In fact, my almost
finished thesis is all about the analysis of representation decisions.
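
For readers unfamiliar with the "high tags vs. low tags" question
quoted above, here is a minimal sketch in C of the two encodings. The
word size and tag widths (a 32-bit word, an 8-bit high tag, a 2-bit low
tag) and all of the names are assumptions chosen for illustration, not
CScheme's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed widths, for illustration only. */
#define WORD_BITS      32u
#define HIGH_TAG_BITS  8u
#define HIGH_DATUM_BITS (WORD_BITS - HIGH_TAG_BITS)
#define HIGH_DATUM_MASK ((UINT32_C(1) << HIGH_DATUM_BITS) - 1u)
#define LOW_TAG_BITS   2u
#define LOW_TAG_MASK   ((UINT32_C(1) << LOW_TAG_BITS) - 1u)

/* High tags: type code in the top bits, datum in the remaining low
   bits. A datum wider than HIGH_DATUM_BITS is truncated. */
static uint32_t high_make(uint32_t tag, uint32_t datum)
{
    return ((tag & 0xFFu) << HIGH_DATUM_BITS) | (datum & HIGH_DATUM_MASK);
}
static uint32_t high_tag(uint32_t w)   { return w >> HIGH_DATUM_BITS; }
static uint32_t high_datum(uint32_t w) { return w & HIGH_DATUM_MASK; }

/* Low tags: type code in the bottom bits, datum shifted above it. */
static uint32_t low_make(uint32_t tag, uint32_t datum)
{
    return (datum << LOW_TAG_BITS) | (tag & LOW_TAG_MASK);
}
static uint32_t low_tag(uint32_t w)   { return w & LOW_TAG_MASK; }
static uint32_t low_datum(uint32_t w) { return w >> LOW_TAG_BITS; }

static void self_check(void)
{
    /* Both encodings round-trip a small datum. */
    assert(high_tag(high_make(5u, 1234u)) == 5u);
    assert(high_datum(high_make(5u, 1234u)) == 1234u);
    assert(low_tag(low_make(3u, 1234u)) == 3u);
    assert(low_datum(low_make(3u, 1234u)) == 1234u);

    /* With an 8-bit high tag only 24 datum bits survive. */
    assert(high_datum(high_make(1u, UINT32_C(0x1FFFFFF)))
           == UINT32_C(0xFFFFFF));
}
```

Under these assumed widths the high-tag word leaves 24 bits for the
datum and the low-tag word leaves 30, which is the kind of trade-off
between number of types and addressable range that the representation
argument is about.
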
>* In your message you essentially are criticising us for not being
>able to change a fundamental representation decision. I don't believe
>this is a valid criticism. I claim that NO implementation in
>existence could easily make a large change at that level. Small
>changes, on the other hand, are a different matter.
You are almost right. Our new Common Lisp implementation can change
from tags to BBOP to separate spaces, just by changing opencodings.
We can also vary the function protocol in interesting ways. This all
happens in a native code compiler. Admittedly, this wasn't easy, and
it is not yet complete. No publications yet (i.e. Lisp conf paper
rejected), but my thesis will be available shortly, and we do have
some initial reports.
To summarize: CScheme is not incredibly bad, but it is disappointing,
especially given the interest of the Scheme community in simplicity,
modularity, and abstraction. The most serious consequence is that people
interested in learning about Scheme implementation will look at CScheme,
and most likely conclude that people at MIT don't really believe all those
ideas being promulgated in SICP...
stan