I tried it with OpenCode and it is borderline incapable of using tool calls, so that might be why it is doing so bad on your test.
I just did the same. Absolutely awful. I assume OpenCode's heavy context is a problem, and it's probably better to use Liquid's own OpenCode alternative for this.
I just did the same. Absolutely awful. I assume OpenCode's heavy context is a problem, and it's probably better to use Liquid's own OpenCode alternative for this.