This could have some interesting hardware implications as well - it suggests that a dedicated silicon instruction set could accelerate any mathematical algorithm, provided it can be mapped to this primitive. It also suggests a compiler/translation layer should be possible, as well as some novel visualization methods for functions.
This paper seems to suggest that a chip with 10 pipeline stages of EML units could evaluate any elementary function (table 4) in a single pass.
I'm curious how this would compare to the dedicated SSE or XMX instructions currently inside most processors' instruction sets.
Lastly, you could also create a 5- or 6-depth EML tree in hardware (FPGA, most likely) and use it in lieu of the Rust implementation to discover weight-optimal EML formulas for input functions much more quickly. Those formulas could then feed into a "compiler" that would let them run on a similar-scale interpreter on the same silicon.
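Just to make the search idea concrete, here's a toy software version of that brute-force loop. Everything in it is a placeholder of my own - the `Expr` type, the add/mul primitive set over `{x, 1.0, 0.5}`, and the `exp(x)` target are stand-ins for whatever the paper's actual EML primitive and weight space look like - but it shows the shape of "enumerate all depth-bounded trees, score each against a target function, keep the best":

```rust
// Toy sketch: enumerate all expression trees up to a fixed depth and keep
// the one that best fits a target function on a grid. NOTE: the primitive
// set here (add/mul over {x, 1, 0.5}) is a stand-in -- the paper's actual
// EML primitive would replace it, and a real search would also optimize
// the weights rather than fixing the leaf constants.

#[derive(Clone, Debug)]
enum Expr {
    X,
    Const(f64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

impl Expr {
    fn eval(&self, x: f64) -> f64 {
        match self {
            Expr::X => x,
            Expr::Const(c) => *c,
            Expr::Add(a, b) => a.eval(x) + b.eval(x),
            Expr::Mul(a, b) => a.eval(x) * b.eval(x),
        }
    }
}

// All trees of depth <= d over the toy primitive set.
fn enumerate(d: u32) -> Vec<Expr> {
    if d == 0 {
        return vec![Expr::X, Expr::Const(1.0), Expr::Const(0.5)];
    }
    let sub = enumerate(d - 1);
    let mut out = sub.clone();
    for a in &sub {
        for b in &sub {
            out.push(Expr::Add(Box::new(a.clone()), Box::new(b.clone())));
            out.push(Expr::Mul(Box::new(a.clone()), Box::new(b.clone())));
        }
    }
    out
}

// Best depth-<=d tree for exp(x) on [0, 1], scored by max abs error on a grid.
fn search(d: u32) -> (f64, Expr) {
    let grid: Vec<f64> = (0..=20).map(|i| i as f64 / 20.0).collect();
    let mut best: Option<(f64, Expr)> = None;
    for e in enumerate(d) {
        let err = grid
            .iter()
            .map(|&x| (e.eval(x) - x.exp()).abs())
            .fold(0.0_f64, f64::max);
        if best.as_ref().map_or(true, |(b, _)| err < *b) {
            best = Some((err, e));
        }
    }
    best.unwrap()
}

fn main() {
    let (err, e) = search(2);
    println!("best depth-2 tree: {:?}, max abs error = {:.4}", e, err);
}
```

Even this toy version makes the motivation for hardware obvious: the candidate count explodes combinatorially with depth (each level roughly squares it), so a software loop chokes well before depth 5 or 6, while an FPGA could score huge batches of candidate trees in parallel.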
In simple terms: you can imagine an EML co-processor sitting alongside a CPU's standard math coprocessor(s). XMX, SSE, and AMX would do the multiplication/tile math they're optimized for, then hand exp, sin, and log calls to the EML coprocessor, which would reconfigure its EML trees internally to process them at single-cycle speed instead of relaying them back to the main CPU to do that math in generalized instructions - likely something that takes many cycles.
I'm not too familiar with the hardware world, but does EML look like the kind of computation that's hardware-friendly? Would love for someone with more expertise to chime in here.