logoalt Hacker News

mootothemaxyesterday at 11:08 AM1 replyview on HN

> We had good small language models for decades. (E.g. BERT)

BERT isn’t a SLM, and the original was released in 2018.

The whole new era kicked off with Attention Is All You Need; we haven’t reached even a single decade of work on it.


Replies

otabdeveloper4yesterday at 11:12 AM

> BERT isn’t a SLM

Huh? BERT is literally a language model that's small and uses attention.

And we had good language models before BERT too.

They were a royal bitch to train properly, though. Nowadays you can get the same with just 30 minutes of prompt engineering.

show 1 reply