What sense of rigour is going to be in a field (LLM usage as a user) where models, context sizes, to...

darkwater • today at 3:33 PM • 1 reply

What sense of rigour can there be in a field (LLM usage as a user) where models, context sizes, tooling and, broadly, the "rules" (scare quotes) change every few weeks? There is literally no chance to take a scientific approach to anything; the churn is too high. There are papers about model XYZ v 12345 from a few months ago that are already outdated, because model ABC on version 54321 addresses half of the issues shown in the paper and adds 3 new problems.

Replies

skybrian • today at 4:58 PM

With benchmarks, you can re-run them after a change. A measurement in a paper will go out of date quickly unless turned into a benchmark.
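To make the re-run idea concrete, here is a minimal sketch of what "turning a measurement into a benchmark" could look like: a fixed set of cases plus a scoring function that can be pointed at any model version. Everything in it is hypothetical and only for illustration: the `CASES` list, the `run_benchmark` helper, and the `model_v1` / `model_v2` stand-ins, which are plain lookup functions rather than calls to a real model.

```python
from typing import Callable

# Hypothetical fixed benchmark: (prompt, expected answer) pairs.
# The cases stay the same across runs; only the model under test changes.
CASES = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("reverse 'abc'", "cba"),
]


def run_benchmark(model: Callable[[str], str]) -> float:
    """Return the fraction of cases the model answers exactly."""
    passed = sum(
        1 for prompt, expected in CASES if model(prompt).strip() == expected
    )
    return passed / len(CASES)


# Stand-ins for two model versions (assumptions, not real models).
# A real harness would call an actual model here instead.
def model_v1(prompt: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "")


def model_v2(prompt: str) -> str:
    answers = {
        "2 + 2": "4",
        "capital of France": "Paris",
        "reverse 'abc'": "cba",
    }
    return answers.get(prompt, "")


if __name__ == "__main__":
    # Re-running the same cases after a model change gives a fresh number,
    # instead of relying on a figure measured once and written down.
    for name, model in [("v1", model_v1), ("v2", model_v2)]:
        print(f"{name}: {run_benchmark(model):.2f}")
```

The point of the design is that the cases and the scoring are frozen while the model is a parameter, so a new model version only costs another run rather than a new study.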
