Hacker News

mathisfun123 today at 3:26 AM

> why do neural networks work better than other models

The only people for whom this is an open question are the academics - everyone else understands it's entirely because of the bagillions of parameters.


Replies

hodgehog11 today at 3:49 AM

No it isn't, and it's frustrating when the "common wisdom" tries to boil it down to this. If this were true, then models with "infinitely many" parameters would be amazing. Why not just train a gigantic two-layer network? There is a huge amount of work going into engineering training procedures that actually work well.

The actual reason lies in the complex biases that arise from the interaction between network architectures and optimizers, and that persist in the regime where data scales proportionally with model size. The multiscale nature of the data induces neural scaling laws that enable better performance than any other class of models can hope to achieve.
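For what a neural scaling law looks like concretely: test loss falling as a power law in parameter count, L(N) = a·N^(-α) + c. Here is a toy sketch with made-up constants (a, α, c are illustrative, not measured from any real model) showing how the exponent can be read off a log-log plot:

```python
import numpy as np

# Hypothetical scaling law: L(N) = a * N**(-alpha) + c.
# All constants below are made up for illustration.
a, alpha, c = 5.0, 0.3, 0.1
params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])  # model sizes N
loss = a * params ** (-alpha) + c              # test loss at each size

# After subtracting the irreducible term c, the points are linear in
# log-log space; the slope of that line recovers -alpha.
slope, intercept = np.polyfit(np.log(params), np.log(loss - c), 1)
print(-slope)
```

The empirical claim behind scaling laws is that real loss curves for neural nets follow this power-law shape over many orders of magnitude of N, which is not something other model classes have been observed to do.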

tacet today at 3:38 AM

There's also the massive amount of human work put into them that wasn't done before.

Data labeling is a pretty big industry in some countries, and I'd guess dropping 200 kilodollars on labeling is beyond the reach of most academics, even if they didn't care about the ethics of it.

geokon today at 8:15 AM

Normally, more parameters lead to overfitting (like fitting a high-degree polynomial to a handful of points), but neural nets are for some reason not as susceptible to that and scale well with more parameters.

That's been my understanding of the crux of the mystery.

Would love to be corrected by someone more knowledgeable, though.
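The classical picture described above can be reproduced in a few lines. This is a minimal sketch (the function, noise level, and degrees are arbitrary choices): a polynomial with as many coefficients as data points drives training error to essentially zero by interpolating the noise, which is exactly the regime where classical statistics predicts poor generalization:

```python
import numpy as np

rng = np.random.default_rng(0)

# A few noisy samples of a smooth target function.
x_train = np.linspace(-1, 1, 10)
y_train = np.sin(np.pi * x_train) + rng.normal(0, 0.1, x_train.size)
x_test = np.linspace(-0.9, 0.9, 50)
y_test = np.sin(np.pi * x_test)

errs = {}
for degree in (3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errs[degree] = (train_mse, test_mse)
    print(degree, train_mse, test_mse)
```

The degree-9 fit (10 coefficients, 10 points) interpolates the training data almost exactly, noise included. The surprise with neural nets, sometimes called "double descent", is that pushing well past this interpolation point can make test performance improve again rather than blow up.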
