What's up with the buzzword bragging?
You don't know buzzwords A, B, and C? Heh, you must be incompetent and know nothing.
The buzzwords mean nothing, really. The math is the same for a stupid or a smart model, because the model is trying to mimic properties of the training dataset.
You could hand me the ultimate model architecture, one that beats every model in existence, and I could still find a way to make it perform worse than what's available today. But you're not even doing that; you're just drumming up old news.
If someone "threatened" me with tech advancements, I would be more worried about things like an imminent massive drop in token costs for bigger context windows, or other game changers like continual learning, where the model internalizes your code base into its weights rather than just keeping it in its context.
It’s not buzzword bragging; those terms are the prerequisites for a coherent conversation. If someone doesn’t know what the Chinchilla scaling laws are, a discussion about “I think things are saturated” isn’t grounded in anything. It’s like sitting around debating quantum mechanics without knowing the math: it’s just meaningless. If these sound like buzzwords, the implication is not “you’re an idiot”, it’s “you are not yet informed on the key basics of the discussion”, and that’s something you can fix with curiosity and a couple of prompts to ChatGPT to speed up the learning curve. It’s not like any of this stuff is gatekept.
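For anyone who hasn't run into them: the Chinchilla scaling laws (Hoffmann et al., 2022) model pretraining loss as a function of parameter count N and training tokens D. A minimal sketch, using the fitted constants reported in that paper (treat the exact values and the 20-tokens-per-parameter rule of thumb as approximations, not gospel):

```python
# Sketch of the Chinchilla scaling law fit (Hoffmann et al., 2022).
# Constants below are the fitted values reported in the paper.
E = 1.69            # irreducible ("entropy of text") loss term
A, B = 406.4, 410.7 # fitted coefficients for the parameter and data terms
alpha, beta = 0.34, 0.28  # fitted exponents

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / n_params**alpha + B / n_tokens**beta

def compute_optimal_tokens(n_params: float) -> float:
    """Rule of thumb from the paper: train on roughly 20 tokens per parameter."""
    return 20 * n_params

# e.g. a 70B-parameter model is compute-optimal at roughly 1.4T tokens,
# and loss improves (slowly) as you scale either N or D.
print(compute_optimal_tokens(70e9))
print(chinchilla_loss(70e9, 1.4e12))
```

The point of knowing this isn't trivia: it's that "models are saturating" claims only mean something relative to where you sit on these curves.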
> You can give me the ultimate model architecture that will beat every model in existence and I can still figure out a way to make it perform worse than what's available today, but you're not even doing that, you're just drumming up some old news.
Sorry, I don’t understand what you’re saying here: what is the old news? Yes, you can break new models, but what’s the point you’re trying to make?
> If someone "threatened" me with tech advancements I would be more worried about things like an imminent massive drop in token costs for bigger context windows or other game changers like continual learning where the model internalizes your code base into its weights rather than just keeping it in its context.
I also don’t really know what point you’re trying to make here. Token cost drops seem like a good thing, and so do bigger context windows. Are we saying the same thing?