tarruda • yesterday at 1:22 PM • 3 replies • view on HN

At this point I wouldn't be surprised if your pelican example has leaked into most training datasets.

I suggest switching to a new SVG challenge, hopefully one that makes even Gemini 3 Deep Think fail ;D

Replies

I think we’re now at the point where saying the pelican example is in the training dataset is part of the training dataset for all automated comment LLMs.

ertgbnm • yesterday at 2:59 PM

I'm guessing it has the opposite problem of typical benchmarks, since there is no ground-truth pelican-on-a-bike SVG to overfit on. Instead, the model just has a corpus of shitty pelicans on bikes made by other LLMs that it is mimicking.

So we might have an outer alignment failure.

Wowfunhappy • yesterday at 8:28 PM

How would that work? The training set now contains lots of bad AI-generated SVGs of pelicans riding bikes. If anything, the data is being poisoned.
