logoalt Hacker News

rdslwyesterday at 7:11 PM1 replyview on HN

interesting, I just tried this very model, unsloth, Q8, so in theory more capable than Simon's Q4, and get those three "pelicans". definitely NOT opus quality. lmstudio, via Simon's llm, but not apple/mlx. Of course the same short prompt.

Simon, any ideas?

https://ibb.co/gFvwzf7M

https://ibb.co/dYHRC3y

https://ibb.co/FLc6kggm (tried here temperature 0.7 instead of pure defaults)


Replies

strobetoday at 1:04 AM

try Unsloth recommended settings

    Thinking mode for general tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

    Thinking mode for precise coding tasks (e.g. WebDev): temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0

    Instruct (or non-thinking) mode for general tasks: temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0

    Instruct (or non-thinking) mode for reasoning tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
(Please note that the support for sampling parameters varies according to inference frameworks.)