Yeah, we've thought about this approach a lot. The problem is that if your final program is a brittle script, you'll need a way to fix it often, which means you're still depending on recurring LLM/agent use. So we think it's better to make the program itself resilient to change, rather than having you or your LLM assistant constantly ensure the program is still working.
Are you sure? Couldn't you just go back to the LLM when the script breaks? Pages change, but not that often in general.
It seems like a hybrid approach would scale better and be significantly cheaper.
I wonder if a nice middle ground would be:
- recording the Playwright script behind the scenes and storing it
- trying that as a "happy path" first attempt to see if it passes
- if it doesn't pass, rebuilding it with the AI and vision models
Best of both worlds. The Playwright script is more of a cache than a test.
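The cache-with-fallback idea above could be sketched roughly like this. Note this is just an illustration of the control flow, not a real API: `run_script`, `rebuild_with_llm`, and the `cache` dict are hypothetical stand-ins for the recorded Playwright replay, the expensive AI/vision rebuild, and wherever you persist recorded scripts.

```python
def run_task(task, cache, run_script, rebuild_with_llm):
    """Try the cached 'happy path' script first; fall back to an LLM rebuild.

    task             -- identifier for the automation task (hypothetical)
    cache            -- dict mapping task -> previously recorded script
    run_script       -- replays a recorded script; raises if the page changed
    rebuild_with_llm -- expensive path: regenerate the script with AI/vision
    """
    script = cache.get(task)
    if script is not None:
        try:
            # Happy path: replay the recording, no LLM involved.
            return run_script(script)
        except Exception:
            # Page changed and the cached script went stale; fall through.
            pass
    # Cache miss or stale script: pay the LLM cost once, then refresh the cache.
    script = rebuild_with_llm(task)
    cache[task] = script
    return run_script(script)
```

The nice property is that the LLM cost is only paid when the page actually changes, which (as noted above) is relatively rare, so most runs hit the cheap replay path.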