Hacker News

lukev last Sunday at 4:19 PM

Did you read the article? There's a whole section on "this is already happening."


Replies

mzelling last Sunday at 5:50 PM

Yes, I did see that section. We've known for a while that reward hacking, train/test data contamination, etc. must be taken seriously. Researchers are actively guarding against these problems. This paper explores what happens when researchers flip their stance and actively try to reward hack — how far can they push it? The answer is "very far."