jampekka • yesterday at 8:28 AM

The methods could be better described in the paper, but my understanding is that they did 10 runs for each question for each prompt and took the average of those, so the compared values are not binary. You could do a sign test, but you'd lose power and answer a slightly different question.
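To make the comparison concrete, here is a rough sketch of both options on made-up data. Everything in it is an assumption for illustration: the question count, the success probabilities, and the helper names are hypothetical and not taken from the paper. It averages 10 binary runs per question for each prompt, then computes a paired t statistic on the averages and an exact two-sided sign test on the direction of the differences.

```python
import math
import random
import statistics


def per_question_means(runs):
    """Collapse per-question lists of 0/1 outcomes into mean accuracies."""
    return [sum(q) / len(q) for q in runs]


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for the paired differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def sign_test_p(a, b):
    """Exact two-sided sign test; ties are dropped, magnitudes are ignored."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(d > 0 for d in diffs)
    # Probability of a split at least this lopsided under p = 0.5.
    tail = sum(math.comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical data: 50 questions, 10 runs each, prompt B slightly better.
random.seed(0)
n_questions, n_runs = 50, 10
base = [random.uniform(0.3, 0.9) for _ in range(n_questions)]
runs_a = [[int(random.random() < p) for _ in range(n_runs)] for p in base]
runs_b = [[int(random.random() < min(p + 0.05, 1.0)) for _ in range(n_runs)]
          for p in base]

mean_a = per_question_means(runs_a)
mean_b = per_question_means(runs_b)

t, df = paired_t_statistic(mean_b, mean_a)
print(f"paired t = {t:.3f} on {df} degrees of freedom")
print(f"sign test p = {sign_test_p(mean_b, mean_a):.3f}")
```

The sign test only sees whether each question improved or got worse, which is why it loses power relative to the t-test and addresses the median direction of change rather than the mean difference.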

Replies

freehorse • yesterday at 8:52 AM

You can fit a generalised linear mixed-effects model with a binomial outcome (i.e. a binomial test with an added random-effects structure). But unless you want to introduce a richer random-effects structure with more variables, that is overkill and overcomplicates things, and the result should be the same as with t-tests.
