Hacker News

WD-42 yesterday at 11:06 PM

What I really don't understand is where the next generation of training material will come from. If websites stop being published and/or crawled, how will the machine continue to be fed?


Replies

azlev yesterday at 11:16 PM

Current executives think it's a problem for the future executives.

dyauspitr today at 1:11 AM

Probably real life. At some point, these LLMs are going to be good enough to just train themselves off of cameras and audio recordings of people out in the real world. They’re going to have robots everywhere constantly listening to what people are saying.

Alternatively, they’re probably betting on being able to reach AGI with everything we already have, at which point further training doesn’t matter.

bediger4000 yesterday at 11:09 PM

Google is either ignoring that or crossing its fingers and hoping that one LLM can produce data to train another one.

wyre yesterday at 11:42 PM

They have enough internet slop. The training material they care about comes from experts, not randos online. This is why Mercor and Scale are billion-dollar companies.