"using periodic features with dominant periods at T=2, 5, 10" seems inconsistent with "platonic representation" and more consistent with "specific patterns noticed in commonly-used human symbolic representations of numbers."
Edit: to be clear, I think these patterns are real and meaningful, but only loosely connected to a platonic representation of the number concept.
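To make the base-10 point concrete, here is a rough sketch (my own illustration, not anything taken from the models being discussed). It assumes the "periodic features" are plain cosine/sine pairs at each period; `periodic_features` and `same_features` are hypothetical helper names. Since 2 and 5 both divide 10, features built this way can only see the last decimal digit of an integer:

```python
import math

PERIODS = (2, 5, 10)  # the dominant periods quoted above


def periodic_features(n, periods=PERIODS):
    """Return a cos/sin pair per period for integer n (assumed feature form)."""
    feats = []
    for t in periods:
        angle = 2 * math.pi * n / t
        feats.extend((math.cos(angle), math.sin(angle)))
    return feats


def same_features(a, b, tol=1e-9):
    """True if a and b are indistinguishable under these features."""
    fa, fb = periodic_features(a), periodic_features(b)
    return all(abs(x - y) < tol for x, y in zip(fa, fb))


# Numbers sharing a last decimal digit collapse to the same point.
print(same_features(7, 17))    # same last digit
print(same_features(7, 1237))  # same last digit
print(same_features(7, 8))     # different last digit
```

In other words, under this assumed form the features encode "last digit in base 10", which is a property of the notation rather than of the number itself.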
Regardless of whether the convergence is superficial, I'm especially interested in what this could mean for future compression of weights. Quantization of models is currently very dumb (per my limited understanding). Could exploitable patterns make it smarter?
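For anyone wondering what "dumb" means here, this is a minimal sketch of the simplest baseline I know of: symmetric round-to-nearest quantization to int8 with a single scale for the whole tensor. The function names are hypothetical, and real schemes (per-channel or group-wise scales, calibration data) are more elaborate than this. The point is only that nothing in it looks at structure in the weights:

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round each weight independently; no pattern in the weights is used.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [v * scale for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight is off by at most about half a quantization step.
print(q, [round(r, 4) for r in restored])
```

A "smarter" scheme in the sense of the question would replace the independent rounding step with something that exploits known regularities, but whether the patterns discussed here are usable that way is an open question to me.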
Is it an actual counterargument?
The "platonic representation" argument is "different models converge on similar representations because they are exposed to the same reality", and "how humans represent things" is a significant part of reality they're exposed to.