The hard part with these unified wrappers is always the leaky bits: prompt caching, tool-use schemas, structured output, and reasoning tokens differ enough across providers that a common interface tends to quietly drop whatever's provider-specific. How does RubyLLM handle that: does it offer escape hatches down to the raw provider, or does it normalize to a shared subset? Prompt caching is the one I care about most, since OpenAI and Anthropic model it quite differently and it's where most of my cost savings come from.
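
For concreteness, here's roughly the difference I mean, as a sketch in plain Ruby with no RubyLLM calls (I don't know what its API for this looks like). The helper names and model strings are made up for illustration; the field names are from the two providers' public HTTP APIs as I understand them. Anthropic wants an explicit `cache_control` marker on a content block, while OpenAI caches long prompt prefixes automatically and only reports hits in the usage object.

```ruby
# Hypothetical helpers: they only build request bodies, nothing is sent.

# Anthropic Messages API: caching is opt-in, marked per content block.
def anthropic_payload(system_prompt, user_text)
  {
    model: "claude-model-placeholder", # placeholder, not a real model id
    max_tokens: 256,
    system: [
      # Everything up to and including this block becomes a cacheable prefix.
      { type: "text", text: system_prompt, cache_control: { type: "ephemeral" } }
    ],
    messages: [{ role: "user", content: user_text }]
  }
end

# OpenAI Chat Completions: no marker to set; long shared prefixes are
# cached automatically, so the request body looks the same either way.
def openai_payload(system_prompt, user_text)
  {
    model: "openai-model-placeholder", # placeholder, not a real model id
    messages: [
      { role: "system", content: system_prompt },
      { role: "user", content: user_text }
    ]
  }
end

# Cache hits are also reported under different keys in the response usage.
def cached_tokens(provider, usage)
  case provider
  when :anthropic then usage.fetch("cache_read_input_tokens", 0)
  when :openai    then usage.dig("prompt_tokens_details", "cached_tokens") || 0
  else raise ArgumentError, "unknown provider: #{provider}"
  end
end
```

A wrapper that only exposes something like `openai_payload`'s shape has nowhere to put the Anthropic marker, which is exactly the kind of silent drop I'm worried about.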