
Genera 7.0, Spelling of

I also wish there were a better dictionary.  I spent a lot of time
researching the question of where to get serious dictionaries.  As you
might guess, this data is owned and copyrighted, and so you have your
choice of using a dictionary known to be in the public domain, or
licensing dictionary data from Merriam-Webster, or Houghton-Mifflin
(American Heritage Dictionary), or someone like that.  I queried the
latter two and found that they don't sell word lists; they only sell
entire software packages, written in C, along with data in cleverly
encoded formats that this C software understands.  Trying to
deal with this was simply too much effort at the time, compared to our
other development priorities.  So we used the only public domain
dictionary we could get, took out as many mistakes as we could find, and
used that.

Even given a better dictionary of English, some of the things you'd like
to see really belong in separate special-interest dictionaries.
Initials like IEEE and SRI are not well-known to everyone.  Names like
Bentley and Steele are not exactly household words.  I'm not sure who
you mean by Ackerman.  (There's a famous function called the Ackermann
function.  Who's to say whether "Ackerman Function" is a misspelling or
not?  If Ackerman were in the dictionary, it would not be flagged.)  The
point is that some of these things are only "correct" for some people.
I believe that the person who cleaned up our dictionary (a technical
editor) removed various compounds like cleanup and coffeepot, which I
believe are not really words as far as most dictionaries are concerned.
The words Dutch and Roman are in the proper name category along with
Geneva and London.  I think coercible might be missing because it's one
of those words that's rarely used in the world outside computers. 
Certainly the word painless ought to be in there.
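
To make the point about special-interest dictionaries concrete, here is
a minimal sketch in Python.  It is not how the Genera spelling checker
works; the word lists and the name flagged_words are hypothetical,
made up for illustration.  It only shows that a word is flagged when it
appears in none of the dictionaries the user has chosen to consult.

```python
# Hypothetical word lists for illustration only; the real dictionary
# contents are not given in this message.
GENERAL_WORDS = {"the", "function", "is", "famous", "painless"}
COMPUTING_WORDS = {"ackermann", "ieee", "coercible"}


def flagged_words(text, *dictionaries):
    """Return the words in text that appear in none of the dictionaries."""
    known = set().union(*dictionaries)
    # Crude tokenization: split on whitespace, trim common punctuation.
    words = [w.strip('.,;:!?()"').lower() for w in text.split()]
    return [w for w in words if w and w not in known]


# With only the general list, "Ackermann" is flagged.
print(flagged_words("The Ackermann function is famous.", GENERAL_WORDS))
# With the special-interest list added, it is accepted, while the
# variant "Ackerman" is still flagged because neither list contains it.
print(flagged_words("The Ackerman function is famous.",
                    GENERAL_WORDS, COMPUTING_WORDS))
```

If "ackerman" were added to either list, the second call would flag
nothing, which is exactly the trade-off described above.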

The basic explanation for the deficiencies in the spelling checker is
that we had very few resources allocated to it.  It's not
considered as important as many other software projects.  So, while we
do understand all the things that could be done to make it better, we
just haven't had time to do it all.

Instead, I've been waiting for the C compiler to become operational, and
then I am hoping that we'll be able to go back to Houghton-Mifflin and
buy a spelling checker.  This software has all kinds of other good
things as well, such as hyphenation, and (I think) a thesaurus.  It's
sublicensed by something like 20 computer companies, including
Interleaf.  When it comes to providing the best system to our users,
it's often better for us to buy instead of build.  Instead of spending
our time replicating things that other people have already done, we
want to work on things that provide unique value.  That's the long-term
plan for the spelling checker, although I couldn't say when it'll happen.