GTP • yesterday at 10:43 PM

So, if I understand correctly, this is about finding an optimal (or at least a better) GPT architecture?

Anyway, "1980 experiments, 6 improvements" makes me wonder whether this is any better than random search or some simple heuristic.
