About a year behind, TBQH. Newer Mixture-of-Experts models are comparable to a slightly older Claude Sonnet, if you don't mind the (lack of) speed. Some benchmarks even put them competitive with the frontier models right now for certain tasks.
I'm not sure how much I trust those benchmarks; I have a feeling everyone is optimizing for them in some way. Still, if you're willing to accept the latency, they're definitely usable.
Of course everyone has realized this, so the hardware you need to run them is on the expensive side right this minute.
CPU manufacturers are working on improvements so that you can more practically run models on regular CPU+RAM (it's already possible with llama.cpp, just even slower).