I feel like the permanent fix is for the AI labs to figure out better attention methods that increase context length without extra inference cost, and to offer much deeper discounts (like -99%) to people who save system prompts to their accounts so that they stay cached permanently.
This way you build all your MCP tool definitions into the system prompt, save the prompt with the AI provider, then reuse it on every request without overpaying in API costs.
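To put rough numbers on that, here is a back-of-the-envelope sketch. Everything in it is made up for illustration: the 50k-token tool prompt, the 1k-token user message, the $3 per million input tokens, and the `request_cost` helper itself are assumptions, not any provider's actual pricing or API.

```python
def request_cost(prompt_tokens, user_tokens, price_per_mtok, cached_discount):
    """Input cost in dollars for one request.

    The saved system prompt is billed at (1 - cached_discount) of the
    normal rate; the user's own tokens are billed at the full rate.
    """
    prompt_cost = prompt_tokens * price_per_mtok * (1 - cached_discount) / 1_000_000
    user_cost = user_tokens * price_per_mtok / 1_000_000
    return prompt_cost + user_cost


# Hypothetical numbers: 50k tokens of tool definitions in the system prompt,
# 1k tokens of user input, $3 per million input tokens.
no_cache = request_cost(50_000, 1_000, 3.0, cached_discount=0.0)
cached_99 = request_cost(50_000, 1_000, 3.0, cached_discount=0.99)

print(f"no discount:  ${no_cache:.4f} per request")
print(f"-99% cached:  ${cached_99:.4f} per request")
```

Under these assumed numbers the per-request input cost drops from about $0.153 to about $0.0045, roughly 34x cheaper, and what remains is mostly the user's own tokens rather than the tool definitions.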
The current "tools-on-demand" workarounds should be great for infrequently used tools, but the future will probably bring agents with dozens of tools that all need to be in context so the agent can flexibly use many of them in the same context window. So we just need to make context windows longer and make this capability cheaper to use.