I'm still not entirely clear on the problem <-> capability matching. E.g., it seems like Kimi K2.6 with good context could already solve a huge chunk of problems. What share of prompts actually require frontier models?