Does anyone know of resources on the algorithms used in hardware implementations of math functions — that is, the algorithms inside CPUs and GPUs? How do designers trade off transistor count, power consumption, and cycle count, and which algorithms make those tradeoffs possible?
CORDIC is one classic way to do it — it computes trigonometric (and related) functions using only shifts, adds, and a small lookup table, which is exactly why it's attractive when multipliers are expensive in silicon.
https://en.wikipedia.org/wiki/CORDIC
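To make the idea concrete, here is a minimal software sketch of the rotation-mode CORDIC from that article, computing sine and cosine. It's an illustration of the principle, not how any particular CPU implements it; real hardware uses fixed-point datapaths, but the structure — a table of arctan(2^-i) values and per-iteration shift-and-add rotations — is the same.

```python
import math

def cordic_sincos(theta, n=32):
    """Approximate (sin(theta), cos(theta)) via rotation-mode CORDIC.

    Valid for |theta| less than the sum of the table angles (~1.743 rad);
    hardware implementations range-reduce the argument first.
    """
    # Precomputed table of micro-rotation angles arctan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(n)]
    # CORDIC gain compensation: each micro-rotation scales the vector
    # by sqrt(1 + 2^-2i); starting at x = K cancels the total gain.
    k = 1.0
    for i in range(n):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, theta
    for i in range(n):
        d = 1.0 if z >= 0 else -1.0   # rotate toward the residual angle
        # In hardware, the multiplications by 2^-i are just bit shifts.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x  # (sin, cos)
```

Each iteration adds roughly one bit of precision, so the cycle count scales with the target precision — a direct example of the area/latency tradeoff the question asks about: iterating one shift-add stage is small and slow, while unrolling or pipelining the stages costs area to gain throughput.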