Hacker News

andai, yesterday at 10:29 PM

That might actually boost performance since attention pays attention to stuff that stands out. If I make a typo, the models often hyperfixate on it.