okwasniewski • yesterday at 4:31 PM • 2 replies

We use agents to navigate the app, making real-time decisions based on its state. I'd compare it to a manual QA engineer rather than to static e2e tests. We spent a lot of time on the harness to make sure the results are reliable. Because the agent reacts to what is actually on screen, you can assert on dynamic content such as AI-generated text. We also support validation of email flows, since the agent can read its own email.

Replies

Laurel1234 • today at 12:51 PM

> We use agents to navigate the app, making real-time decisions based on its state.

This still leads me back to my original question, though: how? If you're not using locators, are you just passing page contents to the LLM? Or using a multimodal model and, say, taking screenshots? My experience with that has been pretty poor, worse than proper e2e scripts, and fairly expensive to boot.

Sorry for the insistence haha, just interested because it could be pretty groundbreaking if done well.

jaggederest • yesterday at 4:39 PM

Fable (rip) is absurdly good at this, so it's a great time to build a product around it. You definitely need the harness, but it feels like it just turned the corner and can now do really in-depth, edge-case work.

Do you handle heterogeneous environments and network connectivity simulation as well? I am working on a mobile app, and occasionally having users just lose a request or two can put the state machine into unusual modes.
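
To make the lost-request failure mode concrete, here is a minimal sketch of the kind of check I have in mind. It is not how any particular tool does it: `FlakyNetwork`, `SaveFlow`, `SyncState`, and `count_stuck` are all hypothetical names invented for illustration, and the "network" is just a seeded random number generator that silently drops a fraction of requests.

```python
import random
from enum import Enum, auto


class SyncState(Enum):
    IDLE = auto()
    SAVING = auto()
    SAVED = auto()
    STUCK = auto()  # a request was lost and nothing recovered it


class FlakyNetwork:
    """Hypothetical transport that silently drops a fraction of requests."""

    def __init__(self, drop_rate: float, seed: int = 0) -> None:
        self.drop_rate = drop_rate
        self._rng = random.Random(seed)  # seeded so runs are repeatable

    def send(self, payload: str) -> bool:
        """Return True if the request was delivered, False if it was lost."""
        return self._rng.random() >= self.drop_rate


class SaveFlow:
    """Toy client state machine: IDLE -> SAVING -> SAVED, or STUCK on loss."""

    def __init__(self, network: FlakyNetwork, max_retries: int = 0) -> None:
        self.network = network
        self.max_retries = max_retries
        self.state = SyncState.IDLE

    def save(self, payload: str) -> SyncState:
        self.state = SyncState.SAVING
        # One initial attempt plus up to max_retries further attempts.
        for _ in range(1 + self.max_retries):
            if self.network.send(payload):
                self.state = SyncState.SAVED
                return self.state
        self.state = SyncState.STUCK
        return self.state


def count_stuck(drop_rate: float, max_retries: int,
                runs: int = 1000, seed: int = 0) -> int:
    """Run the flow many times and count how often it ends up STUCK."""
    network = FlakyNetwork(drop_rate, seed)
    stuck = 0
    for i in range(runs):
        flow = SaveFlow(network, max_retries)
        if flow.save(f"item-{i}") is SyncState.STUCK:
            stuck += 1
    return stuck
```

With `drop_rate=0.0` nothing ever gets stuck, and with `drop_rate=1.0` every run does, whatever the retry budget. The interesting cases are in between, where comparing `count_stuck(0.2, 0)` against `count_stuck(0.2, 3)` shows how much a retry policy helps. A real mobile state machine has far more states than this, which is exactly why I'm asking whether the agent can exercise them.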
