Isn't this an attack on transcribers, not on "Voice AI systems"? ASR transcribers predate LLMs and all the AI hype.
If you are transcribing audio from unknown sources and feeding the output to agents that can perform authorized actions on your behalf, you are kind of screwed anyway. I guess it would be dangerous if you tricked authorized users into playing the sounds in the background while they were transcribing something.