GPT-4o is interesting to learn about, but it’d be great to rerun this with the frontier models of May/June 2026 and see whether these effects are gone, different, or the same.
Which model you use is a huge wildcard for results like this.