Paracompact • today at 5:54 AM • 2 replies

Independent of whether it has any meaning (because the entire paper might be a bit iffy), I find it curious that Instructors 3 and 8 have the lowest harmfulness rates, quite a bit lower than even the LLMs, but not the highest preference rates. Harmfulness anticorrelates with preference, but not perfectly. Some amount of charisma appears to be a factor even in selections by professionals?

Replies

godelski • today at 3:49 PM

Yeah, it's difficult to interpret.

One possible interpretation: the statements were very bland. These would be very low harm but also not very informative.

RataNova • today at 9:55 AM

This is exactly why I'd be cautious about interpreting the preference metric too strongly.
