>Testing workloads that take hours to run still take hours to run with either a human or LLM testing them out (aka that is still the bottleneck)
Absolutely. Tight feedback loops are essential to coding agents and you can’t run pipelines locally.
This is where I think we need better tooling around tiered validation - there's probably quite a bit you can run locally if we had the right separation; splitting the cheap validation from the expensive has compounding benefits for LLMs.
This is where I think we need better tooling around tiered validation - there's probably quite a bit you can run locally if we had the right separation; splitting the cheap validation from the expensive has compounding benefits for LLMs.