This reminds me of when I tried to let Claude port an Android libgdx-based game to a WASM-based libgdx version, so I can play the game in the browser.
No matter how much I tried to force it to stick to a mostly line-by-line port, it kept trying to "improve" the code. At some point it had to undo everything as it introduced a number of bugs. I asked it: "What should I add to your prompt so you won't do this again?" and it gave me this:
### CRITICAL LESSON: Don't "Improve" During Porting
- **BIGGEST MISTAKE: Reorganizing working code**
- **What I did wrong:** Tried to "simplify" by splitting `createStartButton()` into separate creation and layout methods
- **Why it failed:** Introduced THREE bugs:
1. Layout overlap (getY() vs getY() - getHeight())
2. Children not sized (Group.setSize() doesn't affect children)
3. Origins not updated (scaling animations broken)
- **The fix:** Deleted my "improvements" and copied the original Android pattern faithfully
- **Root cause:** Arrogance - assuming I could improve production-tested code without understanding all the constraints
- **Solution:** **FOLLOW THE PORTING PRINCIPLES ABOVE** - copy first, don't reorganize
- **Time wasted:** ~1 hour debugging self-inflicted bugs that wouldn't exist if I'd just copied the original
- **Key insight:** The original Android code is correct and battle-tested. Your "improvements" are bugs waiting to happen.
I like the self-reflection of Claude, unfortunately even adding this to CLAUDE.md didn't fix it and it kept taking wrong turns so I had to abandon the effort.I recently did a c++ to rust port with Gemini and it was basically a straight line port like I wanted. Nearly 10k lines of code too. It needed to change a bit of structure to get it compiling, but that's only because rust found bugs at compile time. I attribute this success to the fact my team writes c++ stylistically close to what is idiomatic rust, and that generally the languages are quite similar. I will likely do another pass in the future to turn the callback driven async into async await syntax, but off the bat it largely avoided doing so when it would change code structure.
For anything large like this, I think it's critical that you port over the tests first, and then essentially force it to get the tests passing without mutating the tests. This works nicely for stuff that's very purely functional, a lot harder with a GUI app though.
Worth pointing out that your IDE/plugin usually adds a whole bunch of prompts before yours - let alone the prompts that the model hosting provider prepends as well.
This might be what is encouraging the agent to do best practices like improvements. Looking at mine:
>You are a highly sophisticated automated coding agent with expert-level knowledge across many different programming languages and frameworks and software engineering tasks - this encompasses debugging issues, implementing new features, restructuring code, and providing code explanations, among other engineering activities.
I could imagine that an LLM could well interpret that to mean improve things as it goes. Models (like humans) don't respond well to things in the negative (don't think about pink monkeys - Now we're both thinking about them).
Humans act the same way.
For all the (unfortunately necessary) conversations that have occurred over the years of the form, "JavaScript is not Java—they're two different languages," people sometimes go too far and tack on some remark like, "They're not even close to being alike." The reality, though, is that many times you can take some in-house package (though not the Enterprise-hardened™ ones with six different overloads for every constructor, and four for every method, and that buy hard into Java (or .NET) platform peculiarities—just the ones where someone wrote just enough code to make the thing work in that late-90's OOP style associated with Java), and more or less do a line-by-line port until you end up with a native JS version of the same program, which with a little more work will be able to run in browser/Node/GraalJS/GJS/QuickJS/etc. Generally, you can get halfway there by just erasing the types and changing the class/method declarations to conform to the different syntax.
Even so, there's something that happens in folks' brains that causes them to become deranged and stray far off-course. They never just take their program, where they've already decomposed the solution to a given problem into parts (that have already been written!), and then just write it out again—same components, same identifier names, same class structure. There's evidently some compulsion where, because they sense the absence of guardrails from the original language, they just go absolutely wild, turning out code that no one would or should want to read—especially not other programmers hailing from the same milieu who explicitly, avowedly, and loudly state their distaste for "JS" (whereby they mean "the kind of code that's pervasive on GitHub and NPM" and is so hated exactly because it's written in the style their coworker, who has otherwise outwardly appeared to be sane up to this point, just dropped on the team).
One thing that might be effective at limited-interaction recovery-from-ignoring-CLAUDE.md is the code-review plugin [1], which spawns agents who check that the changes conform to rules specified in CLAUDE.md.
[1] https://github.com/anthropics/claude-code/blob/main/plugins/...
It's not context-free (haha) but a trick you can try is to include negative examples into the prompt. It used to be an awful trick originally because of Waluigi Effect but then became a good trick, and lately with Opus 4.5 I haven't needed to do it that much. But it did work once. e.g. like take the original code and supply the correct answer and the wrong answers in the prompt as examples in Claude.MD and then redo.
If it works, do share.
Was this Claude Code? If you tried it with one file at a time in the chat UI I think you would get a straight-line port, no?
Edit: It could be because Rust works a little differently from other languages, a 1:1 port is not always possible or idiomatic. I haven't done much with Rust but whenever I try porting something to Rust with LLMs, it imports like 20 cargo crates first (even when there were no dependencies in the original language).
Also Rust for gamedev was a painful experience for me, because rust hates globals (and has nanny totalitarianism so there's no way to tell it "actually I am an adult, let me do the thing"), so you have to do weird workarounds for it. GPT started telling me some insane things like, oh it's simple you just need this rube goldberg of macro crates. I thought it was tripping balls until I joined a Rust discord and got the same advice. I just switched back to TS and redid the whole thing on the last day of the jam.
Sonnet 4.5 had this problem. Opus 4.5 is much better at focusing on the task instead of getting sidetracked.
libGDX, now that's a name I haven't heard in a while.
It doesn't seem very bound by CLAUDE.md
Tangential but doesn't libgdx have native web support?
I wish there was a feature to say "you must re-read X" after each compaction.
That’s a terrible prompt, more focused on flagellating itself for getting things wrong than actually documenting and instructing what’s needed in future sessions. Not surprising it doesn’t help.
Well its close to AGI, can you really expect AGI to follow simple instructions from dumbos like you when it can do the work of god?
Claude doesn't know why it acted the way it acted, it is only predicting why it acted. I see people falling for this trap all the time