> how would you empirically disprove that it doesn't have understanding?
The complete failure of Claude to play Pokemon, something a small child can do with zero prior instruction. The "how many r's are in strawberry" question. The "should I drive or walk to the car wash" question. The fact that right now, today, all models are very frequently turning out code that uses APIs that don't exist, invokes syntax that doesn't exist, or contains basic logic failures.
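And to be clear about the strawberry case: the check itself is mechanically trivial, which is what makes the failure so telling. A one-line illustration in Python:

```python
# Counting characters is a trivial, mechanical operation; the models
# were failing at something a one-line program gets right.
print("strawberry".count("r"))  # 3
```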
The cold hard reality is that LLMs have been constantly showing us they don't understand a thing since... forever. Anyone who thinks they do have understanding hasn't been paying attention.
> i can prove that it does have understanding because it behaves exactly like a human with understanding does.
First, no, it doesn't. See my previous examples, none of which would have posed a challenge for any human with a pulse (or a pulse plus basic programming knowledge, in the case of the coding examples). But even if it were true, it would prove nothing. There's a reason that in math class, teachers make kids show their work: it's actually fairly common to arrive at a correct result by incorrect means.
> The complete failure of Claude to play Pokemon, something a small child can do with zero prior instruction
That's cherry-picking: Gemini and GPT have beaten it. Claude just doesn't have a good vision setup.
> The "how many r's are in strawberry" question
Models have been getting this right since 2024.
> The "should I drive or walk to the car wash" question
The SOTA models get it right with reasoning enabled.
> fact that right now, today all models are very frequently turning out code that uses APIs that don't exist, syntax that doesn't exist, or basic logic failures.
Not when you use a harness. Even humans don't write code that works on the first attempt.
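By "harness" I just mean something that runs the generated code and feeds compile/runtime errors back for another attempt. A minimal sketch of the idea in Python, where `generate_code` is a hypothetical stand-in for whatever model call you're using:

```python
import subprocess
import sys
import tempfile
from typing import Optional

def generate_code(prompt: str, feedback: Optional[str] = None) -> str:
    """Stand-in for the model call. A real harness would send the prompt
    (plus any error output from the last attempt) to an LLM and return
    the code it produces."""
    return 'print("hello from generated code")'

def run_with_harness(prompt: str, max_attempts: int = 3) -> bool:
    feedback = None
    for attempt in range(1, max_attempts + 1):
        code = generate_code(prompt, feedback)
        # Write the candidate code to a temp file and actually execute it.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return True
        # Hallucinated APIs, nonexistent syntax, and crashes all surface
        # here as tracebacks; feed them back for the next attempt.
        feedback = result.stderr
    return False

if __name__ == "__main__":
    print(run_with_harness("write a script that prints a greeting"))
```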