
fast reading



In article <ab2779b102021004658a@[18.30.0.164]> psz@mit.edu (Peter Szolovits) writes:

   I need to read in some fairly large tables of data, in tab-delimited
   format ...   Fast!

Three ideas come to mind. From slowest to fastest:

1) Use (set-macro-character #\tab #'(lambda (&rest ignore) :tab)) so that
each tab reads as the keyword :tab, and use the usual read, or for
something faster: stream-read.

2) Use the function (stream-reader stream), which returns a fast
function which can be used like read-byte, and do character oriented
parsing of the file yourself. Stream-reader is documented in the mcl
manual.

3) Use the mac-file-io example to read the whole file into the mac
heap and use your own parsing function using (%get-byte macptr offset)
to access each character as a number (or use %code-char to change it
into a character).
   You would get the best code by using (declare (optimize (speed 3)
(safety 0))), declaring your loop variable as a fixnum, and using
something like (the fixnum (%get-byte (the macptr pointer) (the fixnum
offset))) as the byte accessor.
   Using a C-like %inc-ptr instead of the offset to move the pointer
would be less efficient (because of the way pointers are implemented in
mcl), as would using %code-char: you can write the parsing routines
just as well using bytes, and you avoid the conversion overhead.
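
As a rough illustration of the byte-oriented parsing described in 3),
here is a portable sketch rather than MCL-specific code. It assumes the
table holds only non-negative decimal integers. A byte vector stands in
for the buffer on the mac heap, and aref stands in for %get-byte; the
names parse-integer-table and string-to-bytes are made up for this
example. Note that (safety 0) means the argument really must be a simple
vector of bytes.

```lisp
(defun parse-integer-table (buffer)
  "Parse BUFFER, a byte vector of tab-delimited non-negative decimal
integers with one row per line, into a list of rows (lists of integers).
An empty field is read as 0."
  (declare (optimize (speed 3) (safety 0))
           (type (simple-array (unsigned-byte 8) (*)) buffer))
  (let ((rows '()) (row '()) (value 0) (in-field nil))
    (declare (fixnum value))
    (dotimes (offset (length buffer))
      (declare (fixnum offset))
      ;; In MCL this AREF would be the %get-byte call from the text.
      (let ((byte (aref buffer offset)))
        (declare (fixnum byte))
        (cond ((<= 48 byte 57)                ; ASCII digit 0-9
               (setq value (the fixnum (+ (the fixnum (* value 10))
                                          (- byte 48)))
                     in-field t))
              ((= byte 9)                     ; tab ends the field
               (push value row)
               (setq value 0 in-field nil))
              ((or (= byte 10) (= byte 13))   ; linefeed or return ends the row
               (when (or in-field row)
                 (push value row)
                 (push (nreverse row) rows))
               (setq row '() value 0 in-field nil)))))
    ;; Flush the last row when the data does not end with a line terminator.
    (when (or in-field row)
      (push value row)
      (push (nreverse row) rows))
    (nreverse rows)))

(defun string-to-bytes (string)
  "Helper for trying the parser out: convert an ASCII STRING to a byte vector."
  (map '(simple-array (unsigned-byte 8) (*)) #'char-code string))
```

For example, (parse-integer-table (string-to-bytes text)) on a text of
two lines with three tab-separated numbers each gives a list of two
three-element lists. Other bytes are simply skipped, so a real parser
would need to decide what to do with signs, decimal points and text
fields.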

If you are not constantly getting new data then you could use a really
slow routine to read the data (overnight) into an array and then
create a fasl file out of that, which will read in pretty fast. The
trick there is to compile a file which has one line in it:

(setq the-answer #.the-array)

Compiling this file will create a fasl file with your data. After
loading the fasl file, the-answer will hold your data. I'm
assuming that your data consists only of simple objects such as
strings and numbers; the compiler can't write out some kinds of
objects.
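
A minimal sketch of the whole round trip, in portable Common Lisp. The
names *table*, *the-answer* and dump-table-to-fasl are made up for this
example, and the small vector stands in for whatever your slow reader
produced. I have added a quote in front of the #. form, which an array
does not need but a list would.

```lisp
;; Stand-in for the data that the slow reading routine built.
(defvar *table* (vector "alpha" 1 2.5 "beta"))

;; Receives the data when the fasl file is loaded.
(defvar *the-answer* nil)

(defun dump-table-to-fasl (source-path)
  "Write a one-form source file at SOURCE-PATH, compile it, and return
the pathname of the resulting fasl file."
  (with-open-file (out source-path :direction :output :if-exists :supersede)
    ;; #. makes the reader evaluate *table* while the file is being
    ;; compiled, so the array itself becomes a literal in the fasl.
    (write-line "(setq cl-user::*the-answer* '#.cl-user::*table*)" out))
  (values (compile-file source-path)))
```

After that, (load (dump-table-to-fasl "table-data.lisp")) in the session
that built the table, or just loading the fasl file in a later session,
leaves the data in *the-answer*.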

Hope this helps, 

alan