Haven't read the paper yet, but it's interesting how seemingly simple many breakthroughs in ML are. Even transformers are like that. Maybe it's hindsight bias.
I suppose we just don't have a deeper underlying theory to lean on and help us 'design' anything.
A lot of discoveries are like that. In fact, simplicity is often the hallmark of correctness, and complexity is often a sign that our understanding is incomplete and we’re still stumbling towards the right model. Not always, but often. It’s been a good rule of thumb in my programming career.