I think the attention mechanism is so simple but so revolutionary that people forget it.
Like the best leaps in thinking, once it is made, is is immediately obvious and intuitive.
Almost everything in ML is like that. It seems so obvious in hindsight. It's maybe what I love most.
Residual connections are so simple, so obvious and so vital. Yet nobody came up with them until 2015?
Almost everything in ML is like that. It seems so obvious in hindsight. It's maybe what I love most.
Residual connections are so simple, so obvious and so vital. Yet nobody came up with them until 2015?