Does anyone know why they are using language models instead of a more purpose-built statistical model? My intuition is that a language model would either overfit, or its training data would contain so much noise unrelated to the application that it would significantly drive up costs.
This might be some journalistic confusion. If you go to the CERN documentation at https://twiki.cern.ch/twiki/bin/view/CMSPublic/AXOL1TL2025, it states:
> The AXOL1TL V5 architecture comprises a VICReg-trained feature extractor stacked on top of a VAE.
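For anyone curious what that combination actually means, here's a rough PyTorch sketch of the idea: a small feature extractor trained with a VICReg-style self-supervised loss, feeding a tiny VAE whose reconstruction and KL terms give an anomaly score. Every layer size, name, and the scoring choice here are my own guesses for illustration; the real AXOL1TL model is a quantized design running in Level-1 trigger hardware, so this is only the conceptual shape, not their implementation.

```python
# Illustrative sketch only: a VICReg-trained feature extractor feeding a VAE
# used for anomaly scoring. Dimensions and architecture are made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureExtractor(nn.Module):
    """Maps raw trigger-level inputs to a compact embedding."""
    def __init__(self, in_dim=57, emb_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

def vicreg_loss(z_a, z_b, inv_w=25.0, var_w=25.0, cov_w=1.0):
    """VICReg objective: invariance (MSE between two views) + a variance
    hinge that keeps each embedding dimension spread out + a covariance
    penalty that decorrelates dimensions."""
    n, d = z_a.shape
    inv = F.mse_loss(z_a, z_b)

    def var_term(z):
        std = torch.sqrt(z.var(dim=0) + 1e-4)
        return torch.mean(F.relu(1.0 - std))

    def cov_term(z):
        z = z - z.mean(dim=0)
        cov = (z.T @ z) / (n - 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return off_diag.pow(2).sum() / d

    return (inv_w * inv
            + var_w * (var_term(z_a) + var_term(z_b))
            + cov_w * (cov_term(z_a) + cov_term(z_b)))

class TinyVAE(nn.Module):
    """Small VAE over the (frozen) embeddings."""
    def __init__(self, emb_dim=16, latent_dim=4):
        super().__init__()
        self.enc = nn.Linear(emb_dim, 2 * latent_dim)  # -> mu, logvar
        self.dec = nn.Linear(latent_dim, emb_dim)

    def forward(self, z):
        mu, logvar = self.enc(z).chunk(2, dim=-1)
        eps = torch.randn_like(mu)
        latent = mu + eps * torch.exp(0.5 * logvar)
        return self.dec(latent), mu, logvar

def anomaly_score(vae, z):
    """One plausible score: reconstruction error plus KL term; events with
    unusually high scores get flagged as anomalous."""
    recon, mu, logvar = vae(z)
    recon_err = F.mse_loss(recon, z, reduction="none").sum(dim=-1)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    return recon_err + kl
```

In practice you'd pretrain the extractor with `vicreg_loss` on two "views" of each event (however those are defined for collision data), freeze it, train the VAE on its embeddings over ordinary collision data, and then cut on `anomaly_score` to decide which events to keep.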
… they’re not? Who said they are? The article even explicitly says they’re not?
It's not an LLM; it's a purpose-built model. https://arxiv.org/html/2411.19506v1
5 years ago we would've called it a Machine Learning algorithm. 5 years before that, a Big Data algorithm.