If you are going to go to the bother of fine tuning for trivial problems like subject classification...

nl • today at 1:56 AM • 3 replies

If you are going to go to the bother of fine-tuning for trivial problems like subject classification, then I think you'll find that scikit-learn with an SGDClassifier on 2-grams will probably do just as well, and the trained classifier will be under 1MB.

You can train it in under a minute, and it will work perfectly well on embedded devices.
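
For anyone who hasn't tried this, here is a rough sketch of what I mean. The toy texts and labels are made up for illustration, and I'm assuming word-level unigrams plus 2-grams with TF-IDF weighting (`ngram_range=(1, 2)`); pure 2-grams or character n-grams are just as easy to swap in. Treat it as a starting point, not a tuned setup.

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy training data, invented purely for illustration.
texts = [
    "the team won the match in extra time",
    "the striker scored a late goal",
    "the coach praised the goalkeeper after the game",
    "the new gpu drivers were released today",
    "the compiler update fixes a memory bug",
    "the laptop ships with a faster processor",
]
labels = ["sports", "sports", "sports", "tech", "tech", "tech"]

# Assumption: word unigrams + 2-grams with TF-IDF weights feeding a linear SGD model.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    SGDClassifier(random_state=0),
)
model.fit(texts, labels)

preds = model.predict(["the striker scored again", "new gpu drivers released"])

# Serialized size of the whole pipeline (vectorizer + classifier), in bytes.
size_bytes = len(pickle.dumps(model))
print(list(preds), size_bytes)
```

On a real dataset the vocabulary dominates the model size, so capping `max_features` on the vectorizer (or using `HashingVectorizer`) is the usual way to keep it small.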

Small LLMs are good choices for text classification in two cases:

- If you need to provide in-context examples and classify based on them.

- Your classification goes beyond simple subject-type classifiers. For example, multiple-choice question answering is classification where a small LLM will work but traditional ML methods won't.
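
To make the multiple-choice case concrete, here is a minimal sketch of framing it as classification with in-context examples. `build_prompt` and `parse_label` are hypothetical helper names, the prompt layout is my own assumption, and the actual model call is left out because it depends on whatever small LLM you run.

```python
LETTERS = "ABCDEFGH"


def format_item(question, choices, answer=""):
    """Render one question with lettered choices; answer is blank for the query."""
    lines = [f"Question: {question}"]
    lines += [f"{LETTERS[i]}. {choice}" for i, choice in enumerate(choices)]
    lines.append(f"Answer: {answer}".rstrip())
    return "\n".join(lines)


def build_prompt(examples, question, choices):
    """examples is a list of (question, choices, answer_letter) tuples."""
    blocks = [format_item(q, ch, ans) for q, ch, ans in examples]
    blocks.append(format_item(question, choices))
    return "\n\n".join(blocks)


def parse_label(completion, n_choices):
    """Map the model's raw completion to a choice index, or None if unusable."""
    text = completion.strip().upper()
    if text and text[0] in LETTERS[:n_choices]:
        return LETTERS.index(text[0])
    return None


# Made-up in-context example and query, for illustration only.
examples = [("Which is a fruit?", ["Carrot", "Apple", "Potato"], "B")]
prompt = build_prompt(examples, "Which is a metal?", ["Iron", "Oak", "Wool"])
print(prompt)
```

The classes here are the answer letters, so the LLM is still doing classification; the label set just changes with every question, which is the part a fixed n-gram classifier can't handle.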

Replies

djsjajah • today at 2:46 AM

Not with 800 examples. If you are going to consider an ngram model, I think you are better off getting a frontier llm to write you an absurd regex.


zubiaur • today at 1:53 PM

A small transformer like BERT or variants is a better fit. It only takes a few examples, which can be generated synthetically using an LLM.

Trains quickly and classifies speedily on modern hardware.

Had a lot of fun doing stuff like this years ago, before LLMs were a thing.

brokensegue • today at 4:30 AM

there are models between 2-grams and 600m param models that would be good options. i don't expect a 2-gram to do very well here. also i'm not sure why this model isn't a fine choice if it solves their problem

