Hacker News

bradfa yesterday at 6:59 PM | 4 replies

The key seems to be that you take the transcript of a model working within a problem domain that it’s not yet good at, or where the context doesn’t match its original training, and then you continually retrain it based on its efforts and guidance from a human or other expert. You end up with a specialty model in a given domain that keeps getting better at that domain, just like a human.

The hard part is likely when someone proves that some “fact” which the model knows, and has had reinforced by this training, is no longer true. The model will take time to “come around” to this new situation. But this isn’t unlike the general populace. At scale humans accept new things slowly.


Replies

bryanrasmussen yesterday at 7:27 PM

> But this isn’t unlike the general populace. At scale humans accept new things slowly.

Right, the model works like humans at scale, not like a human who reads the actual paper disproving the fact they thought was correct and is able to adapt. True, not every human manages to do that (science advancing one death at a time), but some can.

But since the model is a statistical one, it works like humans at scale.

hyperpape yesterday at 11:06 PM

> At scale humans accept new things slowly.

I think this is true, but there are big differences. Motivated humans with a reasonable background learn lots of things quickly, even though we also swim in an ocean of half-truths or outdated facts.

We also are resistant to certain controversial ideas.

But neither of those things is really that analogous to the limitations on what models can currently learn without a new training run.

emporas yesterday at 7:50 PM

Context learning means learning facts or rules without pre-training. They are two distinct phases.

An interesting question is: if pre-trained specialized models are available for the thousand or ten thousand most common tasks humans do every day, of what use would a general model be?

4b11b4 yesterday at 9:27 PM

Yes, that's precisely the problem: you want continuous learning, but you also want continuous pruning.