Hacker News

kovek today at 2:46 AM

I thought it had been known for a few years now that if you train models NOT to do certain things, they start behaving in weird ways…


Replies

srdjanr today at 12:51 PM

It seems like they run a classifier model before routing to Fable (or falling back to Opus), so it should be fine.