Hacker News

cjbgkagh · today at 3:12 PM · 1 reply

I've been playing with this for the last few days. The model is fast and pretty smart, and I'm hitting the same tool-use issues, so this blog post is unusually pertinent. Model speed isn't a problem on my dual 4090s; productivity is mainly limited by intelligence (high, but still not high enough for some tasks) and by the model getting stuck in loops.

What I would like is for it to be able to detect when these things happen and to "Phone a Friend": escalate to a smarter model and ask for advice.
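A minimal sketch of what that "Phone a Friend" pattern might look like: detect that the agent keeps issuing the same tool call and, past a threshold, hand the situation to a stronger model. All names here (`agent_step`, `escalate`, the tool names) are illustrative, not any particular framework's API.

```python
# Hypothetical loop-detection-and-escalation sketch; names are illustrative.
from collections import deque


class LoopDetector:
    """Flags a loop when the same (tool, args) call repeats N times in a row."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)

    def record(self, tool_name, args):
        # Normalize args so dict ordering doesn't hide identical calls.
        self.recent.append((tool_name, tuple(sorted(args.items()))))
        return (len(self.recent) == self.threshold
                and len(set(self.recent)) == 1)


def agent_step(detector, tool_name, args, escalate):
    """Run one agent step; on a detected loop, ask the smarter model for advice."""
    if detector.record(tool_name, args):
        # 'escalate' would wrap a call to a hosted frontier model.
        return escalate(tool_name, args)
    return None  # keep going with the local model
```

The same idea generalizes to fuzzier loop signals (repeated identical diffs, repeated failing test output) by changing what `record` hashes.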

I'm definitely moving into agent-orchestration territory, where I'll have a number of agents constantly running and working on things so that I am not the bottleneck. I'll use a mix of on-prem models and AI providers.

My role now is less coder and more designer / manager / architect, since agents readily go off on tangents and make messes that they're not smart enough to get out of.


Replies

adrian_b · today at 5:36 PM

Google replaced chat_template.jinja and tokenizer_config.json in gemma-4-31B-it a few days ago, which is supposed to have fixed some problems related to tool invocation.

So if you have not updated your model, you should do so.
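If the weights came from the Hugging Face Hub, one way to pick up just those two files is a targeted re-download (the repo id below is copied from the comment above and may not match the actual Hub path; adjust `--local-dir` to wherever your copy lives):

```shell
# Re-fetch only the updated chat template and tokenizer config
# (repo id and local path are assumptions, not verified).
huggingface-cli download google/gemma-4-31B-it \
    chat_template.jinja tokenizer_config.json \
    --local-dir ./gemma-4-31B-it
```

Local runners that cached the old template (llama.cpp wrappers, Ollama modelfiles, etc.) may also need their own template refreshed, since many bake it in at import time.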