I agree, this is the correct way to see it IMO. Instead of designing better optimizers, we designed easier parameterizations to optimize. The surprising part is that these parameterizations exist in the first place.
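A minimal sketch of that point, in plain NumPy with a toy ill-conditioned quadratic of my own choosing: the optimizer is plain gradient descent in both runs; only the parameterization changes, and that alone decides whether it crawls or converges.

    import numpy as np

    A = np.diag([1.0, 100.0])  # ill-conditioned loss: f(w) = 0.5 * w^T A w

    def gd(grad, x0, lr, steps=200):
        x = x0.copy()
        for _ in range(steps):
            x = x - lr * grad(x)
        return x

    # Direct parameterization: the step size is capped by the largest
    # curvature (100), so the flat direction barely moves.
    w = gd(lambda w: A @ w, np.array([1.0, 1.0]), lr=0.019)

    # Reparameterize w = P u with P = diag(1, 0.1): the effective Hessian
    # P^T A P is the identity, and the same optimizer converges in a few steps.
    P = np.diag([1.0, 0.1])
    u = gd(lambda u: P.T @ A @ (P @ u), np.array([1.0, 10.0]), lr=0.9)

    print(np.linalg.norm(w), np.linalg.norm(P @ u))  # reparameterized is ~0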
Gradient descent is mathematically the most efficient optimization strategy (save for some special classes of functions) in high dimensions. This goes so far that people nowadays even believe it has to be used in the human brain [1], if only because every other method of updating the brain would be far too energy-inefficient. From that perspective, finding the right parameterization was all we ever needed to achieve AI.
[1] https://physoc.onlinelibrary.wiley.com/doi/full/10.1113/JP28...