The most salient thing about these models is that they're non-reasoning models. This makes them very token-efficient and particularly well suited for local inference, where decoding is usually slower than on datacenter GPUs.
Link to HF collection: https://huggingface.co/collections/ibm-granite/granite-41-la...