I believe this is a CPU/GPU vs ASIC comparison, rather than CPU vs GPU. They have always(ish) coexisted, optimized for different things: ASICs have cost/speed/power advantages, but designing one is much harder than writing software, and you can't reprogram them afterward.
Generally, you use an ASIC to perform one specific task. In this case, I think the takeaway is that the LLM functionality here is performance-sensitive, and has enough utility as-is to justify an ASIC.
The middle ground here would be an FPGA, but I believe you would need a very expensive one to fit an LLM on it.
It reminds me of the switch from GPUs to ASICs in bitcoin mining. I've been expecting this to happen.