My recent frustration with Claude has been that it feels like I'm waiting on responses more. I don't have historical latency data to compare against, but I feel like it has been getting slower. I may be wrong, and maybe it's just spending more time thinking than it used to. My guess is Anthropic is having capacity issues. I hope I'm wrong because I don't want to switch.
There was a really good point in this podcast episode about the speed of LLMs. They are so slow that all of the progress messages and token streaming are necessary just to make the wait tolerable. But those only mask the core problem, which is that the technology is so darn slow.
https://podcasts.apple.com/us/podcast/this-episode-is-a-cogn...
As someone who both uses and builds this technology, I think this is a core UX issue we’re going to be improving for a while. At times it really feels like a "choose 2+" of: slow, bad, and expensive.