Hacker News

otabdeveloper4 yesterday at 11:12 AM

> BERT isn’t a SLM

Huh? BERT is literally a language model that's small and uses attention.

And we had good language models before BERT too.

They were a royal bitch to train properly, though. Nowadays you can get the same with just 30 minutes of prompt engineering.


Replies

mootothemax yesterday at 11:28 AM

> > BERT isn’t a SLM
>
> Huh? BERT is literally a language model that's small and uses attention.

Astute readers will note what’s been missed here.

Fascinating, really. Your confidently stated yet factually void comments I’d have previously put down to one of the classic programmer mindsets. Nowadays though - where do I see that kind of thing most often? Curious.
