The magic is testing. Having tests that run locally, at high throughput, and with a large number of test cases is what now unlocks more speed.
The test cases themselves become the focus - the LLM usually can't get them right.
> The magic is testing.
No it is not.
There is no amount of testing that can fix a flawed design.
The word "Testing" is a very loaded term. Few non-professionals, and not even many professionals, fully understand what is meant by it.
Consider the following: Unit, Integration, System, UAT, Smoke, Sanity, Regression, API Testing, Performance, Load, Stress, Soak, Scalability, Reliability, Recovery, Volume Testing, White Box Testing, Mutation Testing, SAST, Code Coverage, Control Flow, Penetration Testing, Vulnerability Scanning, DAST, Compliance (GDPR/HIPAA), Usability, Accessibility (a11y), Localization (L10n), Internationalization (i18n), A/B Testing, Chaos Engineering, Fault Injection, Disaster Recovery, Negative Testing, Fuzzing, Monkey Testing, Ad-hoc, Guerrilla Testing, Error Guessing, Snapshot Testing, Pixel-Perfect Testing, Compatibility Testing, Canary Testing, Installation Testing, Alpha/Beta Testing...
...and I'm certain I've missed dozens of other test approaches.
How does that test suite get built and validated? A comprehensive, high-quality test suite is usually much larger than the codebase it tests. For example, the SQLite test suite is 590x [1] the size of the library itself.
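To make the "validated" part concrete: mutation testing (one of the items in the list above) checks a test suite by seeding a small bug and seeing whether any test notices. Here is a minimal hand-rolled sketch in Python. The names `is_adult`, `is_adult_mutant`, `weak_suite`, and `strong_suite` are made up for illustration, and real mutation tools generate the mutants automatically rather than by hand.

```python
def is_adult(age):
    """Original code under test (hypothetical example)."""
    return age >= 18


def is_adult_mutant(age):
    """Hand-written mutant: '>=' changed to '>', a classic boundary bug."""
    return age > 18


# Each suite is a list of (input, expected_output) pairs.
weak_suite = [(30, True), (5, False)]        # never touches the boundary
strong_suite = weak_suite + [(18, True)]     # adds the boundary case


def passes(fn, suite):
    """True if fn returns the expected value for every case in the suite."""
    return all(fn(age) == expected for age, expected in suite)


def kills(suite, original, mutant):
    """A suite 'kills' a mutant if the original passes and the mutant fails."""
    return passes(original, suite) and not passes(mutant, suite)


if __name__ == "__main__":
    print("weak suite kills mutant:  ", kills(weak_suite, is_adult, is_adult_mutant))
    print("strong suite kills mutant:", kills(strong_suite, is_adult, is_adult_mutant))
```

Both suites are green against the original code, but only the one with the boundary case notices the seeded bug. A passing suite by itself says little about how good the suite is.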
[1] https://sqlite.org/testing.html