Hacker News

bwhitty today at 6:37 PM

As another poster above linked, averaging the weights of fine-tuned models has been shown to be effective since 2022 (the "Model soups" paper): https://arxiv.org/abs/2203.05482


Replies

nightpool today at 9:27 PM

It works because Nex N2 is also a derivative of the original base Qwen model. If they were two completely unrelated models, it wouldn't work.
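
To make the point about shared ancestry concrete, here is a minimal sketch of uniform weight averaging in plain Python. It is an illustration, not the merge procedure actually used for any of the models mentioned: the `average_weights` helper, the `Weights` alias, and the parameter names are all hypothetical, and the dictionaries of flat lists are a toy stand-in for real checkpoints. The name and size check is only a necessary condition. It cannot verify that the models come from the same base, which is what the comment above says makes the averaged result meaningful.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniformly average parameters across models with identical layouts."""
    if not models:
        raise ValueError("need at least one model")

    reference = models[0]
    for other in models[1:]:
        # Models with different parameter names cannot be averaged
        # element-wise at all.
        if other.keys() != reference.keys():
            raise ValueError("parameter names differ between models")
        for name in reference:
            if len(other[name]) != len(reference[name]):
                raise ValueError(f"size mismatch for parameter {name!r}")

    merged: Weights = {}
    for name in reference:
        # Group the i-th value of this parameter from every model.
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(col) / len(models) for col in columns]
    return merged


# Two hypothetical fine-tunes assumed to share one base model.
finetune_a = {"layer0.weight": [1.0, 2.0], "layer0.bias": [0.0]}
finetune_b = {"layer0.weight": [3.0, 4.0], "layer0.bias": [1.0]}

merged = average_weights([finetune_a, finetune_b])
print(merged)
```

Two unrelated models of the same architecture would also pass the layout check, but their parameters are not aligned with each other, so averaging them would not be expected to give a usable model.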