Hacker News

matricks · yesterday at 6:09 PM

> can’t really move beyond their training data

I don’t even think humans can “move beyond” their sensory data. They generalize using it, which is amazing, but they are still limited by it.* So why is this a reasonable standard for non-biological intelligence?

We have compelling evidence that both can learn in unsupervised settings. (I grant that one has to wrap a transformer model in a training harness, but how can anyone sincerely consider this a disqualifier while admitting that an infant cannot raise itself from birth?)

I’m happy to discuss nuance like differences in architecture (carbon versus silicon, neurons versus ANNs, etc.), but the human tendency to move the goalposts is not something to be proud of. We really need to stop doing this.

* Jeff Hawkins describes the brain as relentlessly searching for invariants in its sensory data: it finds patterns and generalizes from them.


Replies

A_D_E_P_T · yesterday at 7:49 PM

Human sensory data doesn't correspond -- not neatly, and probably not at all -- to LLM training data.

Human sensory data combines to give you a spatiotemporal sense, which is the overarching sense of being a bounded entity in time and space. From one's perceptions, one can then generalize and make predictions, etc. The stronger one's capacity for cognition, the more accurate and broader these generalizations and predictions become. Every invention, including or perhaps especially the invention of mathematics, is rooted in this.

LLMs have no apparent spatiotemporal sense, are not physically bounded, and don't know how to model the physical world. They're trained on static communications — and, of course, they can model those: they can predict things like word sequences, and they can produce output that mirrors previously communicated ideas. But there's a huge fact staring us right in the face: they're clearly not capable of producing anything genuinely new of any significance.

This is why the path to AGI probably runs through world models.