Launch HN: TesterArmy (YC P26) – Agents that test web and mobile apps

123 points • by okwasniewski • yesterday at 2:49 PM • 62 comments

Hey HN - we’re Oskar, Szymon, and Piotr, and we’re building TesterArmy (https://tester.army). TesterArmy is an agentic testing platform that runs end-to-end checks before deployment and in production. Instead of wasting hours on manual testing or maintaining static scripts, we let you specify your tests in natural language and handle everything in between. We've built the platform fully around agents. Our agent will reliably execute the tests, but your coding agent can manage everything in our platform, from defining tests in natural language to running them on your behalf.

Check out our demo video: https://www.youtube.com/watch?v=291IkUbPrlk.

We started TesterArmy because testing is still far too painful. AI coding tools have made it dramatically faster to write and ship code, but testing is still a bottleneck. Traditional E2E tests are slow to set up and expensive to maintain. Managing auth and test users is painful. Setting up staging environments is painful. Running tests reliably is painful.

We think most teams do not actually want to spend their time writing selectors or maintaining test infrastructure. They just want confidence that their core flows work. With TesterArmy, an engineer can sign up, give an agent our CLI, and let it handle creating tests and running them on schedule or on GitHub.

When something breaks, TesterArmy alerts your team through Slack or Discord.

Over the past few months, we scaled from 0 to 30+ teams using our product every day. We caught bugs in critical flows, including onboarding, checkout, and AI chat. We've got many of our customers migrating from already established competitors to us because of the quality and reliability of our agents.

Here are a few of the recent bugs that our agent found (there were quite a lot of them!):

1) A timezone bug that affected the booking flow in one of our clients' apps. The dashboard was very complex, so the bug was hard for a human to catch.
2) A regression in agent orchestration that caused a sandboxed environment to get stuck on loading. Thanks to TesterArmy, the team was able to resolve it before it hit production.
3) An incorrectly calculated order amount in a complex dashboard flow with checkout. Thanks to TesterArmy, the team was able to resolve it before it affected revenue.
4) A regression in an AI chat flow that would have prevented a user from retrieving their data due to broken tool calling.

And many more, mostly related to some incorrect API calls, 404s, unhandled errors, etc.

If this sounds useful, we would love your feedback at https://tester.army. We have a bunch of free test runs for you to try. And don’t worry, we won’t make you do sales calls, and we don’t have long onboarding or annoying setup. Our goal is an it-just-works experience.

If you're looking for an end-to-end testing solution, we'd love to hear your feedback!

Comments

poisonborz • yesterday at 4:39 PM

E2E tests are now quick to write due to LLMs, and are then deterministic AND cheap to run. How would this compare to the token costs of running an agent the whole time for each test? How do you make sure results stay stable despite the nondeterministic nature of agents? Do customers still need to create test cases, or is there any way to import them from a test case management system, from which they could have already generated E2E tests locally?

dbbk • yesterday at 3:19 PM

"Traditional E2E tests are slow to set up and expensive to maintain." I don't really understand this. If I'm already using Opus to write the code, surely it would know best what E2E tests to write to be able to verify its own output? This seems like an unnecessary external step.

fny • today at 3:52 PM

How do you differ from other companies doing automated testing?

Rainforest QA, for example, has been in this space for a while and also happens to be a YC company.

Eridrus • today at 12:23 AM

Has anyone tried to build their own version of this?

It's cool, but I'm not super excited about using some 3rd party SaaS as a critical part of my testing.

negamax • today at 6:06 AM

Was writing E2E tests ever a problem that needs automation? Also, E2E tests need to be updated every time a new feature is added. TesterArmy sounds great, but the config overhead and potential security leaks make it a no-go.

radku • today at 1:19 PM

Congrats on the launch!

Will this solution work with services protected by Cloudflare Turnstile or CAPTCHAs? Does this involve a human in the loop?

msencenb • yesterday at 3:12 PM

Have you been able to nail down a loop where your tool can take an open PR, guess the code path, and do some testing?

We use Cypress heavily for our core flows, which has a similar AI prompt thing, but it’s not quite ad hoc enough for smaller fixes, which is where the bottleneck still comes in for us.

pranshuchittora • yesterday at 11:05 PM

Some digging:

FAST_MODEL = "google/gemini-3-flash" (fast mode primary)
DEEP_MODEL = "openai/gpt-5.4" (deep mode primary)
VISION_CLICK_MODEL = "openai/gpt-5.4" (the visual grounder)

fast: gemini-3-flash, falls back to gpt-5.4, 15-min run timeout, max 2 visual calls/step.
deep: gpt-5.4, 15-min timeout, max 3 visual calls/step.

Why such a hard timeout, and why not latest models?

RayFitzgerald • yesterday at 4:38 PM

Love your approach to product. It feels like TesterArmy will become the "Vercel for testing". Refreshing stuff!

_pdp_ • yesterday at 6:29 PM

Great presentation

On a slight tangent, since we are all here...

Does anyone still believe there is a long-term future in traditional UI/UX?

It feels like a lot of attention is still going into landing pages, dashboards, and CRUD apps, while overlooking a bigger shift where fewer people will actually need to interact with those interfaces directly when the same tools can perform the underlying tasks automatically, without much UI at all.

So the bigger question is does UI/UX evolve into something else, or does a large part of it simply disappear?

I might be a bit too early. Recently I started a project and decided to skip all of that and focus on making it friendlier to AI agents, and frankly so far it has been great, both in terms of user experience and in what it delivers.

j0sip • yesterday at 5:36 PM

I wonder how it compares to mobileboost.io, which has been used by some companies like Duolingo?

yohguy • yesterday at 3:06 PM

Does it work on native mobile applications or Expo apps that have native modules?

Pricing question: the usage on the plans seems low. In the demo you said that you have 25 tests per PR, which would mean you get only 10 PRs per month on the hobby plan?

tcoff91 • yesterday at 4:54 PM

I'm curious how your mobile testing compares to https://revyl.com

I've been experimenting with Revyl and it's really nice. I think this agent-driven testing is the future.

Laurel1234 • yesterday at 4:21 PM

Seems interesting, but I wonder about this

> Traditional E2E tests are slow to set up and expensive to maintain.

Isn't this just using agents to create e2e tests or is there some better new approach I'm missing?

pensono • yesterday at 8:30 PM

Love using TesterArmy to validate PRs against my preview environment. Skips the manual check much of the time and helps me ship more confidently.

antifarben • yesterday at 6:25 PM

What are people using to test mobile apps on self-hosted infrastructure nowadays? Is there a solution that's not super heavy and/or slow?

Lionga • yesterday at 6:34 PM

The most flaky tests possible as a service. Everyone knows that no tests are better than unreliable tests.

pranshuchittora • yesterday at 10:42 PM

Hey, I just gave it a try and ran a quick test on booking.com. It took ~3 mins for a basic test. Do you cache the test steps so that future runs are faster and they don't call LLMs for the subsequent runs?

Also, your current pricing is $300 for 1K tests, which means $0.30 for each test. We tried out Playwright MCP and it easily consumes 1M+ tokens for a test with ~20 steps (including image input). So with this pricing, are you guys default alive?

Also, is there a benchmark that you ran to prove the efficacy of your testing agent? At the current stage it is a trust-me-bro kind of thing.
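The pricing arithmetic in this comment can be sketched roughly as follows. The $300 plan price, the 1K tests, and the ~1M tokens per test are the commenter's figures; the function names and the 2M-token scenario are hypothetical and only for illustration, and no real model pricing is assumed.

```python
def price_per_test(plan_price_usd: float, tests_included: int) -> float:
    """Revenue per test run for a plan with a fixed number of included tests."""
    return plan_price_usd / tests_included


def break_even_price_per_million(revenue_per_test_usd: float, tokens_per_test: int) -> float:
    """Blended token price (USD per 1M tokens) at which LLM cost equals revenue per test."""
    return revenue_per_test_usd / (tokens_per_test / 1_000_000)


# Figures quoted in the comment: $300 for 1K tests, ~1M tokens per 20-step test.
per_test = price_per_test(300, 1000)
break_even = break_even_price_per_million(per_test, 1_000_000)

print(f"Revenue per test: ${per_test:.2f}")
print(f"Break-even blended token price: ${break_even:.2f} per 1M tokens")

# Hypothetical heavier test: doubling token usage halves the break-even price.
print(f"At 2M tokens/test: ${break_even_price_per_million(per_test, 2_000_000):.2f} per 1M tokens")
```

Under these assumptions, a test that really consumes 1M tokens is only covered by the per-test price if the blended token cost stays below the break-even figure, which is why caching test steps between runs matters for the question being asked.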

peterspath • yesterday at 7:26 PM

Does it support testing on all Apple platforms (macOS, iOS, iPadOS, watchOS, tvOS, and visionOS)?

mogili • today at 2:53 AM

This is a solved problem, there are many that do this. Can't believe YC would fund this in 2026.

anaschouhan475 • today at 5:13 AM

System requirements please

rpunkfu • yesterday at 4:04 PM

Congratulations on the launch! I’ve been tracking your progress since you were accepted into the spring batch.

Always happy to see cool products from Poland! :)

iknownthing • yesterday at 3:51 PM

.army?

zuzululu • yesterday at 5:34 PM

Not sure the pain points you mentioned resonate. With LLMs it's very easy to do E2E testing. Also, I feel uneasy about outsourcing this part with all the security issues these days.
