Hacker News

tarruda · yesterday at 7:26 PM

> so llama.cpp just doesn't handle it correctly.

It is a bug in the model weights and reproducible in their official chat UI. More details here: https://github.com/ggml-org/llama.cpp/pull/19283#issuecommen...


Replies

sosodev · yesterday at 7:33 PM

I see. The looping appears to be a bug in the model weights, but there are also bugs in how llama.cpp detects various outputs, as identified in the PR I linked.