Re: kcl speed problems



I would suggest looking at the file fft-mod.cl below to see how fft.cl
was improved to run 10 times faster.  Also make sure that
compiler::*cc* does not contain -msoft-float, so that the
floating-point operations will be inlined by the C compiler.
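
A quick way to check the flag from the Lisp prompt might look like the
sketch below.  It assumes that compiler::*cc* holds the C compiler
command line as a string; if it is something else in your build,
simply print it and look.

```lisp
;; Sketch only: assumes compiler::*cc* is a string containing the
;; C compiler command line used by the KCL/AKCL compiler.
(if (search "-msoft-float" compiler::*cc*)
    (format t "Remove -msoft-float from: ~a~%" compiler::*cc*)
    (format t "No -msoft-float found in: ~a~%" compiler::*cc*))
```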

If properly declared, floating-point performance can be excellent.
In AKCL on an IBM RS/6000 I raised a 50x50 matrix to the 100th power
using stupid multiplication, in 15 seconds.  That is about 6.5 Mflops.

The arrays should be of element-type long-float and must be declared
as such.  Variables holding entries must also be declared.
When you look at the disassembled code, you should not see function
calls like number_plus, but rather inline C code.
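
As a rough illustration of the declarations described above, here is a
minimal sketch of a naive matrix multiply.  The function name matmul
and its argument names are made up for this example and are not taken
from fft-mod.cl or the benchmark files.

```lisp
;; Hypothetical example: C := A * B for N x N long-float matrices.
;; Every array, index, and temporary is declared so that the compiler
;; has a chance to open-code the arithmetic.
(defun matmul (a b c n)
  (declare (type (array long-float (* *)) a b c)
           (fixnum n))
  (dotimes (i n)
    (declare (fixnum i))
    (dotimes (j n)
      (declare (fixnum j))
      (let ((sum 0.0l0))
        ;; The variable holding entries is declared as well.
        (declare (long-float sum))
        (dotimes (k n)
          (declare (fixnum k))
          (setq sum (+ sum (* (aref a i k) (aref b k j)))))
        (setf (aref c i j) sum))))
  c)

;; Arrays are created with the matching element type.
(defun make-matrix (n)
  (make-array (list n n)
              :element-type 'long-float
              :initial-element 0.0l0))
```

After compiling, (disassemble 'matmul) is one way to inspect the
result and look for calls such as number_plus in the inner loop.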

Bill

PS: The files are available in gabriel.tar.Z on rascal and
are part of the standard Gabriel benchmarks.