Hacker News

sincerely, today at 6:39 AM

>I think I'm at 1 success out of 15 attempts for Gemini to explain how to do something in Google Sheets/Docs, though, so I'm not hopeful that anyone can actually implement this.

People love to talk about this (knowledge extraction from documents, summarizing) as one of the helpful features of AI, but I'm really not convinced. The last generation of models seems to have 70-90% accuracy on tasks like this, which is way below what I'd consider a reliable tool.

e.g. https://www.frontiersin.org/journals/digital-health/articles...

https://pmc.ncbi.nlm.nih.gov/articles/PMC11197181/

I don't know if there are any benchmarks for this sort of task. Maybe the newer models are improved, but I also doubt that people are using GPT5.5 pro ultrathink for these tasks anyway.