logoalt Hacker News

bluegattyyesterday at 11:10 PM1 replyview on HN

Yup, you got it. It's a weird situation for sure.

You know what's also weird: Gem3 'Pro' is pretty dumb.

OAI has 'thinking levels' which work pretty well, it's nice to have the 'super duper' button - but also - they have the 'Pro' product which is another model altogether and thinks for 20 min. It's different than 'Research'.

OAI Pro (+ maybe Spark) is the only reason I have OAI sub. Neither Anthropic nor Google seem to want to try to compete.

I feel for the head of Google AI, they're probably pulled in major different directions all the time ...


Replies

visargayesterday at 11:23 PM

If you want that level of research I suggest you ask the model to draft a markdown plan with "[ ]" gates for todo items, and plan it in as many steps as needed. Then ask another LLM to review the plan, judge it. In the end use the plan as the execution state tracker, the model solves one by one the checkboxes.

Using this method I could recreate "deep research" mode on a private collection of documents in a few minutes. A markdown file can be like a script or playbook, just use checkboxes for progress. This works for models that have file storage and edit tools, which is most, starting with any coding agent.

show 1 reply