Same but for multi-threaded Postgres[0]. 96% of the pg regression tests pass after 1 month and 823K LOC. 8 Codex accounts at $200/mo is what I could use up with no Mythos.
I've seen the benefits of Rust for this too. I'm also making the bet that my pg experience will help me make good design choices around many of the things people have been having trouble with in pg for a long time[1]. Excited to see AI make it more feasible to improve complex pieces of software than has historically been practical.
[0] https://github.com/malisper/pgrust [1] https://malisper.me/the-four-horsemen-behind-thousands-of-po...
$1600/mo. There is now a token-rich class.
96% tests passing sounds impressive, but I remember that C compiler that had similar (or better) stats yet was still hilariously broken because the test suite didn't cover many "obvious" things that a human wouldn't get wrong even without the tests.
wow!
curious about your workflow for running all these accounts. different harnesses in parallel? manually switching in Codex? 5.5pro only?
what works for you?
Very cool! If you have extra tokens lying around, ask the agent to try to break things and open GitHub issues. This is what I do for tsz, and beyond the conformance tests I can see it finding very good bugs.