> AFAIK, there is no "specification" or "protocol"
The protocol is English. You want your claw to check a Hacker News comment and let you know when it gets a reply? You tell it "Check every 5 minutes if this comment has a reply." The claw turns that into an English message that it saves and re-sends to the agent every 5 minutes, and each send results in a browser tool invocation.
The claws live in a post-API world, where the API is English, which gets turned into bash invocations, browser tool calls, and the like.
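
To make the recurring-check idea concrete, here's a rough sketch of what I mean, not how any particular claw actually implements it. The names `ScheduledPrompt`, `Scheduler`, and `fake_agent` are all made up for illustration, and `fake_agent` stands in for the model, which in a real setup would read the English and decide on its own which tool to call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScheduledPrompt:
    """A saved English message plus how often to re-send it (hypothetical type)."""
    message: str
    interval_seconds: float
    next_run: float


class Scheduler:
    """Re-sends saved English messages to an agent on a fixed interval."""

    def __init__(self, agent: Callable[[str], str]) -> None:
        self.agent = agent
        self.jobs: List[ScheduledPrompt] = []

    def add(self, message: str, interval_seconds: float, now: float) -> None:
        # The "job definition" is just the English text; the first run is due immediately.
        self.jobs.append(ScheduledPrompt(message, interval_seconds, next_run=now))

    def tick(self, now: float) -> List[str]:
        """Send every due message to the agent and return the agent's replies."""
        replies = []
        for job in self.jobs:
            if now >= job.next_run:
                replies.append(self.agent(job.message))
                job.next_run = now + job.interval_seconds
        return replies


def fake_agent(message: str) -> str:
    # Assumption: a real agent would turn this English into a browser tool call.
    return "[browser tool] " + message


scheduler = Scheduler(fake_agent)
scheduler.add(
    "Check whether this comment has a reply and tell me if it does.",
    interval_seconds=300,  # "every 5 minutes"
    now=0,
)
```

Calling `scheduler.tick(now)` from a timer loop is all the "protocol" there is: nothing structured crosses the boundary except the English sentence, and the tool choice happens on the agent's side.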