p-e-w • today at 5:51 AM • 11 replies

> due to fundamental limitations

People keep throwing this phrase around in relation to LLMs, when not a single “fundamental limitation” has been rigorously demonstrated to exist, and many tasks that were claimed to be impossible for LLMs two years ago supposedly due to “fundamental limitations” (e.g. character counting or phonetics) are non-issues for them today even without tools.

Replies

aDyslecticCrow • today at 1:30 PM

> character counting

The models now waste a vast number of neurons memorising the character counts of the entire English language so that people can ask how many r's are in strawberry and tick a box in a benchmark.

The architecture cannot efficiently or consistently represent counting letters in words. We should never have force-trained them to do it.

This goes for other, more important "skills" that are unsuited to transformer models.

Most models can now do decent arithmetic. But if you knew how that ability is encoded in their neurons, you would never trust any arithmetic they output, even if they seem to "know" it (unless they called a calculator MCP to get it).

There are fundamental limitations, but we're currently brute forcing ourselves through problems we could trivially solve with a different tool.
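
As a rough illustration of the "different tool" point, the strawberry question is a one-line job for ordinary code. This is only a sketch: `count_letter` is a hypothetical helper name, not part of any model's actual tooling.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # 3
```

A model that can call something like this does not need to memorise anything about spelling.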

coldtea • today at 7:52 AM

>People keep throwing this phrase around in relation to LLMs, when not a single “fundamental limitation” has been rigorously demonstrated to exist

Some limitations have not been rigorously demonstrated to be fundamental, but they have been continuously present since the earliest LLMs. Shouldn't the burden of proof be on those who say it can be done?

And some limitations are fundamental, and have been rigorously demonstrated, e.g.:

https://arxiv.org/abs/2401.11817?utm_source=chatgpt.com

dijit • today at 6:06 AM

Character counting remains a huge issue without tools.

Are you using only frontier models that are gated behind OpenAI/Anthropic/Google APIs? Those use tools to help them out behind the scenes. It remains no less impressive, but I think we should be clear about that.

girvo • today at 9:32 AM

The literal best public models still fail to count characters consistently in practice, so I’m not sure what you mean. It’s literally a problem we’re still trying to solve at work.

3form • today at 9:20 AM

Is character counting actually not an issue anymore? Do you know somewhere where I can read more about this?

mrob • today at 9:48 AM

Character counting errors are a side effect of tokenization, which is a performance optimization. If we scaled the hardware big enough we could train on raw bytes and avoid it.
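
A minimal sketch of why tokenization hides letters, assuming a made-up three-entry vocabulary (`TOY_VOCAB` and `toy_tokenize` are hypothetical and do not match any real tokenizer's vocabulary or algorithm). The model would see only the token ids, while a byte-level model would see every character.

```python
# Hypothetical subword vocabulary, invented for illustration only;
# real tokenizers learn tens of thousands of entries from data.
TOY_VOCAB = {"str": 0, "aw": 1, "berry": 2}


def toy_tokenize(word: str) -> list[int]:
    """Greedy longest-match split of `word` into toy vocabulary ids."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry covers {word[i:]!r}")
    return ids


token_ids = toy_tokenize("strawberry")
byte_ids = list("strawberry".encode("utf-8"))

print(token_ids)                 # [0, 1, 2]
print(byte_ids.count(ord("r")))  # 3
```

Nothing in `[0, 1, 2]` says how many r's each piece contains, so that has to be learned per token. In the byte sequence the count can be read off directly, at the cost of a sequence that is several times longer.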

3form • today at 10:15 AM

Your comment, after removing the particulars, has the shape of:

People have an <opinion> which hasn't been rigorously proven, while <not rigorously proven counteropinion>.

As such, I am not sure what you're trying to achieve here.

raincole • today at 9:46 AM

Drawing five-fingered humans was a fundamental limitation... until it wasn't.

danpalmer • today at 6:58 AM

This is kind of my point: we need to get better at describing the limitations and studying them. It seems extremely clear that there are limitations, and not just temporary ones, but structural limitations that existed at the beginning and continue to persist.

Marazan • today at 8:24 AM

If you remove the auxiliary tools and just leave the core LLM then strawberry still has an undefined number of `r`s in it.

rimliu • today at 6:31 AM

of course, if you choose to ignore all the limitations they indeed have no limitations.
