
MarkusQ, yesterday at 5:35 PM

Computational power. Without self-attention, you have a sloppy implementation of a pushdown automaton (PDA), something like an old HP calculator. With it, you have an even sloppier implementation of a Turing machine.

So (modulo a _lot_ of details) it increases the power from that of a "calculator" to that of a "computer".
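
To make the "calculator" end of that analogy concrete, here is a minimal sketch of an RPN evaluator in the spirit of an old HP calculator. The function name `rpn_eval` and the token format are made up for illustration, and this only loosely resembles a PDA. It says nothing about how a transformer works internally. The point is that the only memory is a single stack, touched only at the top:

```python
def rpn_eval(tokens):
    """Evaluate a postfix (RPN) expression using a single stack as the only memory."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            # Operators can only see the top two stack entries.
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            # Anything else is treated as a number and pushed.
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


result = rpn_eval("3 4 + 2 *".split())  # (3 + 4) * 2
```

A machine like this can only ever look at the top of its stack. Being able to read and write anywhere in memory, as a Turing machine's tape allows, is what the "computer" end of the analogy adds.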