Hacker News

mirekrusin yesterday at 10:18 PM

I think you're somehow right and wrong at the same time.

All those "it's like ..." analogies are faulty: "post-it notes" are not 3k pages of text that can be recalled instantly in one go, copied in a fraction of a second to branch off, quickly rewritten, arranged into a hierarchy describing a virtually infinite amount of information (beyond the 3k-page limit), or generated on the fly in minutes on any topic, pulling in all the information available from a computer, etc.

Poor man's RL on test-time context (skills and friends) is something that shouldn't be dismissed. We're at 1M tokens and growing, and progressive disclosure (nothing fancy, just a bunch of markdown files in directories) means you can already stuff more information than a human can remember during a whole lifetime into always-on agents/swarms.
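A minimal sketch of what "a bunch of markdown files in directories" could look like, assuming progressive disclosure means keeping only a cheap index (path plus title) in context and reading a full file on demand. The function names `first_heading`, `build_index`, and `load_note` are hypothetical, not from any particular agent framework.

```python
from pathlib import Path


def first_heading(path: Path) -> str:
    """Return the first markdown heading in the file, or the file stem if none."""
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.startswith("#"):
                return line.lstrip("#").strip()
    return path.stem


def build_index(root: Path) -> list[tuple[str, str]]:
    """Cheap summary kept in context: one (relative path, title) pair per note."""
    return sorted(
        (p.relative_to(root).as_posix(), first_heading(p))
        for p in root.rglob("*.md")
    )


def load_note(root: Path, rel_path: str) -> str:
    """Full text of one note, read only when the agent decides it needs it."""
    return (root / rel_path).read_text(encoding="utf-8")
```

The point of the split is that the index stays small no matter how much text sits in the directories, so the total amount of stored notes is not bounded by the context window.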

Currently the latest models use more compute on RL than on pre-training, and this upward trend continues (from orders of magnitude smaller than pre-training to larger than pre-training). In that sense some form of continuous RL is already happening; it's just quantized to new model releases, not real time.

With LoRA and friends it's also already possible to do continuous training that directly affects weights, it's just that the economics of it are not that great. You get a much better value/cost ratio with the above instead.
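For readers who haven't met LoRA: a rough pure-Python sketch of the idea, assuming the standard formulation where a frozen weight matrix W gets a trainable low-rank update scaled by alpha/rank. The helper names and the example sizes are illustrative assumptions, and this says nothing about the serving and training costs the comment is actually about.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Trainable parameters when fine-tuning the whole d_out x d_in matrix."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def matvec(m: list[list[float]], v: list[float]) -> list[float]:
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha: float = 1.0) -> list[float]:
    """Compute W x + (alpha / rank) * B (A x); W stays frozen, only A and B train."""
    rank = len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


# Example sizes (assumed): one 4096 x 4096 layer with a rank-8 adapter.
print(full_param_count(4096, 4096), lora_param_count(4096, 4096, 8))
```

With B initialized to zeros the adapter is a no-op, which is why training can start from the unmodified base model.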

For some definitions of AGI it has already happened, i.e. "somebody's computer-use-based work", even though "it can't actually flip burgers, can it?" is true, just not relevant.

ps. I should also mention that I don't believe in "programmers losing jobs". On the contrary, we will have to ramp up large numbers of people on computational thinking, and those who are already versed in it will keep reaping the benefits. Regardless of whether anybody agrees that AGI is already here, it arrives through computational doors speaking a computational language first, and imho this property is here to stay, as it's an expression of rationality, etc.


Replies

0xbadcafebee yesterday at 11:06 PM

> you can already stuff more information than a human can remember during a whole lifetime

The human eye processes between 100GB and 800GB of data per day. We then continuously learn and adapt from this firehose of information, using short-term and long-term memory, which is continuously retrained and weighted. This isn't "book knowledge", but the same capability is needed to continuously learn and reason on a human-equivalent level. You'd need a supercomputer to attempt it, for a single human's learning and reasoning.
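A back-of-the-envelope sketch of the gap being argued here, taking the figures in this thread at face value (100GB to 800GB per day, a 1M-token context) without verifying them. The 80-year lifetime and the 4 bytes per token are assumptions added only to make the arithmetic concrete.

```python
GB = 10**9


def lifetime_visual_bytes(gb_per_day: int, years: int = 80) -> int:
    """Total bytes at a constant daily rate (the 80-year lifetime is an assumption)."""
    return gb_per_day * GB * 365 * years


def context_bytes(tokens: int, bytes_per_token: int = 4) -> int:
    """Rough size of a text context (4 bytes per token is an assumption)."""
    return tokens * bytes_per_token


low = lifetime_visual_bytes(100)
high = lifetime_visual_bytes(800)
ctx = context_bytes(1_000_000)

print(f"lifetime visual input: {low / 10**15:.2f} PB to {high / 10**15:.2f} PB")
print(f"1M-token context: {ctx / 10**6:.0f} MB")
print(f"ratio at the low end: {low // ctx:,}x")
```

Under these assumptions the lifetime figure lands in the petabyte range while the context is a few megabytes, though raw sensory bytes and curated text are not directly comparable units of "information".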

RL is used for SOTA models, but it's a constant game of catch-up with limited data and processing. It's like self-driving cars. How many millions of miles have they already captured? Yet they still fail at some basic driving tasks. It's because the cars can't learn or form long-term memories, much less process and act on the vast amount of data a human can in real time. Same for LLMs. Training and tweaking get you pretty far, but not to the point of matching humans.

> With LoRA and friends it's also already possible to do continuous training that directly affects weights, it's just that the economics of it are not that great

And that means we're stuck with non-AGI. Which is fine! We could've had flying cars decades ago, but that was hard, expensive and unnecessary, so we didn't do that. There's not enough money in the global economy to "spend" our way to AGI in a short timeframe, even if we wanted to spend it all, even if we could build all the datacenters quickly enough, which we can't (despite being a huge nation, there are many limitations).

> For some definitions of AGI

Moving the goalposts is dangerous. A lot of scary real-world stuff hinges on the idea of AGI being here or not. People will keep getting more and more freaked out and acting out if we're not clear on what is really happening. We don't have AGI. We have useful LLMs and VLMs.
