Re: kcl speed problems
I would suggest looking at the file fft-mod.cl below to see how fft.cl
was improved to run 10 times faster. Also make sure that
compiler::*cc* does not contain -msoft-float, so that the
floating-point operations are inlined by the C compiler.
If everything is properly declared, floating-point performance can be excellent.
In AKCL on an IBM RS/6000 I raised a 50x50 matrix to the 100th power
using stupid multiplication in 15 seconds. That is about 6.5 Mflops.
The arrays should be of element-type long-float and must be declared
as such. Variables holding entries must also be declared.
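
To illustrate the declarations described above, here is a minimal sketch of
a naive matrix multiply. The names matmul, a, b, c, n and sum are made up for
this example and are not taken from fft.cl or fft-mod.cl; treat it as one
plausible way to write the declarations, not as the benchmark code.

```lisp
;; Illustrative sketch only: MATMUL and its argument names are hypothetical.
(defun matmul (a b c n)
  ;; Store the N x N product A*B in C using the naive triple loop.
  ;; The arrays are declared with element-type long-float.
  (declare (type (array long-float (* *)) a b c)
           (fixnum n))
  (dotimes (i n c)
    (declare (fixnum i))
    (dotimes (j n)
      (declare (fixnum j))
      ;; The variable holding entries is declared long-float as well.
      (let ((sum 0.0l0))
        (declare (long-float sum))
        (dotimes (k n)
          (declare (fixnum k))
          (setq sum (+ sum (* (aref a i k) (aref b k j)))))
        (setf (aref c i j) sum)))))
```

The arrays passed in would need to be created with the matching element
type, for example (make-array '(50 50) :element-type 'long-float
:initial-element 0.0l0).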
When you look at the disassembled code, you should not see function calls
like number_plus, but rather inline C code.
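
A quick way to check this is to compile a small declared function and
disassemble it. The function add-floats below is a made-up example, and the
exact output will differ between Lisp implementations; the point is only to
show where to look for calls such as number_plus.

```lisp
;; Illustrative sketch only: ADD-FLOATS is a hypothetical test function.
(defun add-floats (x y)
  ;; Both arguments are declared, so the addition is a candidate for inlining.
  (declare (long-float x y))
  (+ x y))

;; Compile the function, then print the code the compiler generated for it.
(compile 'add-floats)
(disassemble 'add-floats)
```

If the output still shows a call to a generic arithmetic routine, a
declaration is probably missing somewhere.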
PS: the files are available in gabriel.tar.Z on rascal, and
are part of the standard Gabriel benchmarks.