The amount of things LLMs can do is insane.
It's interesting to me how much effort the AI companies (and bloggers) put into claiming they can do things they can't, when there's almost an unlimited list of things they actually can do.
Only because they have compressed and encoded the entire sum of human knowledge at their disposal. There are models for everything in there, but they can only do what has been done before.
What's more amazing to me is that the average human, able to hold only a relatively small body of knowledge in their mind, can generate things that are completely novel.
This reminds me of "Devin". You know, the first "AI software engineer", which had the hype of the day but turned into a huge flop.
They had ridiculous demos of Devin e.g. working as a freelancer and supposedly earning money from it.
The hype has gotta keep going or the money will dry up. And hype can be quantified by velocity and acceleration, rather than distance. They need to keep the innovation accelerating, or the money stops. This is of course completely unreasonable, but also why the odd claims keep happening.
And many of them so unexpected, given the unusual nature of their intelligence emerging from language prediction. They excel wherever you need to digest or produce massive amounts of text. They can synthesize some pretty impressive solutions from pre-existing stuff. Hell, I use it like a thesaurus to suss out words or phrases that are new or on the tip of my tongue. They have a great hold on the general corpus of information, much better than any search engine (even before the internet was cluttered with their output). It's much easier to find concrete words for what you're looking for through an indirect search via an LLM. The fact that, say, a 32GB model seemingly holds approximate knowledge of everything implies some unexplored relationship between intelligence and compression.
What can't they do? Pretty much anything reliably or unsupervised. But then again, who can?
They also tend to fail creatively, given that they synthesize existing ideas. And with things involving physical intuition. And with tasks involving meta-knowledge of their own tokens (like asking them how long a given word is). And they tend to yap too much for my liking (perhaps this could be fixed with an additional thinking stage to increase terseness before reporting to the user).
I've been pushing Opus pretty hard on my personal projects. While repeatability is very hard to achieve, I'm seeing glimpses of Opus being well beyond human capabilities.
I'm increasingly convinced that the core mechanism of AGI is already here. We just need to figure out how to tie it together.
Because most of these things are not multi-trillion-dollar ideas. "We found a way to make illustrators, copyeditors, and paralegals, and several dozen other professions, somewhat obsolete" in no way justifies the valuations of OpenAI or Nvidia.