logoalt Hacker News

mohsen1yesterday at 7:46 PM4 repliesview on HN

Yes indeed. More than 1 million lines of code (including tests) is jumping lots of hoops but with LLMs it's not as painful so you can just ask it to do the hard things.

Example of a Claude Code session after 2 hours of "Crunching" that came out without results https://github.com/mohsen1/tsz/pull/4868 (Edit I force pushed to PR to solve the problem, you can see the initial refuse message in the initial version of PR description)

Funny thing is, the last percent of the test have been so hard to work on that Opus 4.7 routinely bails and says "it's too involved or complicated" so I had to add prompts specifically asking it not to bail.


Replies

baqyesterday at 7:52 PM

You should try GPT, I’d be really interested to hear if it works better. (Exclusively using GPT for systems work at $DAYJOB, but compare with opus every couple weeks and GPT consistently gives me better results)

show 2 replies
odie5533yesterday at 8:21 PM

Do you have any write ups on your workflow with Claude and github dev?

mebcittoyesterday at 7:57 PM

That might be opus 4.7 behaviour because I also get that all the time in the past few weeks. Also complex code base, but likely an order of magnitude simpler than yours.