LLMs, even the best ones, are still hit or miss wrt quality. Constantly improving, though.
I see Opus 4.x get confused about how much weight to give the different parts of a paper more often than I see it hallucinate flat-out incorrect stuff. But these things still happen.
Surely, but isn't it a considerable concern? Deflecting constructive feedback is probably not the best encouragement for others on a Show HN.