Hacker News

pranshuchittora, yesterday at 10:42 PM

Hey, I just gave it a try and ran a quick test on booking.com. It took ~3 mins for a basic test. Do you cache the test steps so that subsequent runs are faster and don't call LLMs again?

Also, your current pricing is $300 for 1K tests, which works out to $0.30 per test. We tried out Playwright MCP and it easily consumes 1M+ tokens for a test with ~20 steps (including image input). So with this pricing, are you guys default alive?
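To make the arithmetic behind that question concrete, here is a rough sketch using only the two figures quoted above ($300 per 1K tests, and 1M tokens per test as a lower bound). The function names are made up for illustration, and no real model pricing is assumed; the result is simply the blended token price at which revenue per test would equal token spend.

```python
# Figures quoted in the comment; everything else is illustrative.
PRICE_PER_1K_TESTS_USD = 300.0
TOKENS_PER_TEST = 1_000_000  # "1M+ tokens" for a ~20-step test, taken as a lower bound


def price_per_test(price_per_1k_usd: float) -> float:
    """Revenue per single test, given the price for 1,000 tests."""
    return price_per_1k_usd / 1000


def break_even_price_per_million_tokens(revenue_per_test_usd: float,
                                        tokens_per_test: int) -> float:
    """Blended token price (USD per 1M tokens) at which token spend equals revenue."""
    return revenue_per_test_usd / (tokens_per_test / 1_000_000)


revenue = price_per_test(PRICE_PER_1K_TESTS_USD)
break_even = break_even_price_per_million_tokens(revenue, TOKENS_PER_TEST)
print(f"revenue per test: ${revenue:.2f}")
print(f"break-even token price: ${break_even:.2f} per 1M tokens")
```

If a run really used 1M tokens with no caching, any blended token price above the break-even figure would mean the test costs more to run than it brings in, which is the concern behind the "default alive" question.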

Also, is there a benchmark you ran to demonstrate the efficacy of your testing agent? At the current stage it's a "trust me bro" kind of thing.


Replies

okwasniewski, yesterday at 11:10 PM

We've been doing quite a lot of context engineering and optimization to keep the cost down. Subsequent runs are faster because we cache the agent's trajectory (not the whole test run yet, since we want to keep the agent in the loop, acting more like a manual QA engineer than a test script).
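For readers wondering what caching a trajectory while keeping the agent in the loop could look like, here is a minimal sketch. It is not the product's actual implementation: `TrajectoryCache`, `run_test`, `plan_with_llm`, and `execute` are hypothetical names, and the assumption is that each test-plan step maps to one action that can be replayed and checked for success.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TrajectoryCache:
    """Maps a test-plan step to the action the agent chose on an earlier run."""
    actions: Dict[str, str] = field(default_factory=dict)


def run_test(steps: List[str],
             plan_with_llm: Callable[[str], str],
             execute: Callable[[str], bool],
             cache: TrajectoryCache) -> int:
    """Run every step, replaying cached actions when they still work.

    Returns the number of (hypothetical) LLM planning calls that were needed.
    """
    llm_calls = 0
    for step in steps:
        cached = cache.actions.get(step)
        if cached is not None and execute(cached):
            continue  # cached action replayed successfully, no LLM call
        # Cache miss, or the cached action no longer works (e.g. the UI changed):
        # fall back to the agent so it can re-plan this step.
        action = plan_with_llm(step)
        llm_calls += 1
        if not execute(action):
            raise RuntimeError(f"step failed even after re-planning: {step!r}")
        cache.actions[step] = action
    return llm_calls
```

In this sketch the first run pays for one planning call per step, later runs pay nothing while the page is unchanged, and a step whose cached action stops working goes back to the agent instead of failing outright, which is the difference from a fixed test script.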

We currently do not have any benchmarks; much of the experience depends on the test plan. We've mostly been focusing on the customer experience rather than on benchmarking.