+1 on this pipeline! You can use a very small model to produce both an immediate response and a structured output that pipes into a tool call (which may itself be a call to a "more intelligent" model) or kicks off skill execution. Running this asynchronously, so the fast response (TTS) goes to the user while the tool call runs at the same time, is awesome.
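For anyone who wants to see the shape of it, here's a rough sketch in Python with `asyncio`. Everything in it is a made-up stand-in: `small_model`, `speak`, `run_tool`, and the `reply` / `tool_call` fields are hypothetical names, not any real library's API. The only point is that one small-model call yields both a quick reply and a structured tool call, and the two are then run concurrently.

```python
import asyncio


async def small_model(user_text: str) -> dict:
    """Hypothetical stand-in for a small, fast model.

    Returns a short spoken reply plus a structured tool call.
    A real system would call an actual model here.
    """
    return {
        "reply": "Sure, let me check that for you.",
        "tool_call": {"name": "big_model", "arguments": {"prompt": user_text}},
    }


async def speak(text: str, log: list) -> None:
    """Hypothetical TTS step: here it only records what would be spoken."""
    log.append(("tts", text))


async def run_tool(call: dict, log: list) -> str:
    """Hypothetical tool step, e.g. a call to a larger model or a skill."""
    await asyncio.sleep(0.01)  # simulate slower work
    result = f"[{call['name']}] answer for: {call['arguments']['prompt']}"
    log.append(("tool", result))
    return result


async def handle_turn(user_text: str):
    log = []
    out = await small_model(user_text)
    # Start the quick reply and the tool call together instead of in sequence.
    _, result = await asyncio.gather(
        speak(out["reply"], log),
        run_tool(out["tool_call"], log),
    )
    return log, result


if __name__ == "__main__":
    events, answer = asyncio.run(handle_turn("What's the weather in Oslo?"))
    print(events)
    print(answer)
```

In this toy version the user hears the reply before the slower tool result comes back, which is the whole appeal: perceived latency is set by the small model, not by the tool.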