Hacker News

__mharrison__ yesterday at 10:02 PM

Curious if this would help larger local models? Qwen 3.6 varieties or deepseek4?


Replies

zambelli yesterday at 10:05 PM

Yes it does! I haven't published those evals yet, but I'm actually running 24-35B class models on a custom coding harness built on forge (even 120B class recently).

I just need more GPU wall clock time to get more evals done. ETA is...a few weeks? Got distracted by the coding harness.

But the results are the same. Reforged models do better than bare, even at those sizes. As for published results, I ran forge on Anthropic models and reforged does better than bare for them as well :)
