logoalt Hacker News

Paracompacttoday at 5:54 AM2 repliesview on HN

Independent of whether it has any meaning (because the entire paper might be a bit iffy), I find it curious that Instructors 3 and 8 have the lowest harmfulness rates, quite a bit lower than even the LLMs, but not the highest preference rates. Harmfulness anticorrelates with preference, but not perfectly. Some amount of charisma appears to be a factor even in selections by professionals?


Replies

godelskitoday at 3:49 PM

Yeah it's difficult to interpret.

One possible interpretation, the statements were very bland. These would be very low harm but also not very informative

RataNovatoday at 9:55 AM

This is exactly why I'd be cautious about interpreting the preference metric too strongly