maybe there should be an LLM trained on a corpus of code deletions and cleanups.

bryanrasmussen • yesterday at 3:00 AM

Replies

I'm guessing there's a very strong prior toward "just keep generating more tokens," as opposed to deleting code, that needs to be overcome. Maybe this is done already, but since every git project comes with its own history, you could take a notable open-source project (like LLVM) and then do RL training against each individual patch committed.


ashdksnndck • yesterday at 9:10 AM
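
To make the "use the git history" idea a bit more concrete, here is a rough sketch of just the data-selection step: scanning a repo's log for commits that remove more lines than they add. It is not RL training, only one guess at how cleanup-heavy patches could be picked out first. The names `CommitStat`, `parse_numstat`, `net_deletion_commits`, and `read_repo_log` are made up for illustration; the only real tool assumed is `git log --numstat`.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CommitStat:
    """Line counts for one commit (hypothetical helper type)."""
    sha: str
    added: int = 0
    deleted: int = 0

    @property
    def net(self) -> int:
        # Negative means the commit removed more lines than it added.
        return self.added - self.deleted


def parse_numstat(log_text: str) -> list[CommitStat]:
    """Parse output of: git log --numstat --pretty=format:'commit %H'."""
    stats: list[CommitStat] = []
    for line in log_text.splitlines():
        if line.startswith("commit "):
            stats.append(CommitStat(sha=line.split()[1]))
            continue
        parts = line.split("\t")
        if len(parts) != 3 or not stats:
            continue  # blank lines or anything that is not a numstat row
        added, deleted, _path = parts
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files
        stats[-1].added += int(added)
        stats[-1].deleted += int(deleted)
    return stats


def net_deletion_commits(stats: list[CommitStat]) -> list[CommitStat]:
    """Keep only commits whose net line change is negative."""
    return [s for s in stats if s.net < 0]


def read_repo_log(repo_path: str) -> str:
    """Run git in repo_path and return the raw numstat log."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat",
         "--pretty=format:commit %H"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Something like `net_deletion_commits(parse_numstat(read_repo_log("path/to/repo")))` would give a candidate list. Net line count is a crude proxy, though: a commit that deletes a vendored directory would score very high without being the kind of cleanup you'd want a model to learn from.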

I think this is in the training data since they use commit data from repos, but I imagine code deletions are rarer than they should be in the real data as well.

