A big limitation for skills (or agents using browsers) is that the LLM is working against raw HTML/DOM/pixels. The new WebMCP API addresses this: apps register schema-validated tools via navigator.modelContext, so the agent works with structured JSON instead of scraping the page and can be way more reliable.
WebMCP is currently being incubated in W3C [1], so if it lands as a proper browser standard, this becomes an endpoint every website can expose.
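To make the idea concrete, here's a rough sketch of what registration could look like. WebMCP is still an incubating proposal, so the exact surface (navigator.modelContext, the registerTool name, the tool/result shapes) is an assumption that may change before anything ships; the point is just that the agent gets a name, description, and JSON Schema rather than the DOM.

```javascript
// Sketch only: the WebMCP API surface is still in flux, so treat
// navigator.modelContext.registerTool and the tool shape as illustrative.
const todos = [];

// Pure handler, kept separate so the app's own UI can call it too.
function addTodo(args) {
  todos.push({ text: args.text, done: false });
  return { count: todos.length };
}

// Registration: the agent sees structured metadata, not pixels.
if (typeof navigator !== "undefined" && navigator.modelContext) {
  navigator.modelContext.registerTool({
    name: "add-todo",
    description: "Add a todo item to the list",
    inputSchema: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
    async execute(args) {
      // Args arrive already validated against inputSchema.
      return { content: [{ type: "text", text: JSON.stringify(addTodo(args)) }] };
    },
  });
}
```

Because the schema travels with the tool, the agent can fill in arguments directly instead of guessing at form fields.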
I think browser agents/skills + WebMCP might actually be the killer app for local-first apps [2]. Remote APIs need hand-crafted endpoints for every possible agent action. A local DB exposed via WebMCP gives the agent generic operations (query, insert, upsert, delete) that it can freely compose across multiple reads and writes, at zero latency and offline-capable. The agent operates directly on a data model rather than orchestrating UI interactions, which is what makes complex tasks actually reliable.
For example, the user can ask "Archive all emails I haven't opened in 30 days except from these 3 senders" and the agent runs the NoSQL query and updates locally.
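A toy version of that composition, using an in-memory array as a stand-in for the local DB (a real local-first app would back the same query/update surface with IndexedDB or a CRDT store; all names here are hypothetical):

```javascript
// Generic ops the tool surface would expose (stand-ins, in-memory only).
function query(rows, predicate) {
  return rows.filter(predicate);
}

function update(rows, predicate, patch) {
  let changed = 0;
  for (const row of rows) {
    if (predicate(row)) {
      Object.assign(row, patch);
      changed++;
    }
  }
  return changed; // number of rows modified
}

// "Archive all emails I haven't opened in 30 days except from these senders":
// the agent composes the generic ops instead of clicking through the UI.
const DAY = 24 * 60 * 60 * 1000;
function archiveStale(emails, { days, keepSenders, now = Date.now() }) {
  const cutoff = now - days * DAY;
  return update(
    emails,
    (e) => !e.archived && e.lastOpened < cutoff && !keepSenders.includes(e.sender),
    { archived: true }
  );
}
```

The same two primitives cover a huge range of requests, which is exactly what a hand-crafted remote endpoint can't do without anticipating each one.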
What's the difference between this and complying with the OpenAPI specification and providing an endpoint?