>> The only benchmark it does well at compared to other models is non-hallucination and instruction following.
I think instruction following is going to be the most useful thing these models do. Add a voice interface and access to a bunch of simple, straightforward devices or APIs and you have a mildly useful assistant. If that can be done in 8B parameters, it will soon run on edge devices. That's solid usefulness.
Anything that beats Alexa-level intelligence on an edge device is what I'd call useful as well, and that shouldn't be a hard bar to clear.
It's mind-boggling how bad current voice assistants sometimes are when you ask them fairly easy questions.