That's one theory. Another one I can think of is that sharp edges are scary, and most distress calls are high pitched.
Also, the link between high frequencies and sharp edges leads to a contradiction: babies are rounder than adults and produce higher-pitched sounds, and this holds almost universally across species.
There are other tentative explanations, such as how the vocal tract acts when producing these sounds, with "bouba" sounds being the result of smoother movement more reminiscent of a round shape.
"kiki" is not just higher pitched, it is also "shaped" differently if you look at the sound envelope, with, as expected, sharper transitions.
So to me, the mystery is still there. It is the kind of thing that sounds obvious, in the same way that kiki sounds obviously sharper than bouba, but is not.
> Also, the link between high frequencies and sharp edges leads to a contradiction: babies are rounder than adults and produce higher-pitched sounds, and this holds almost universally across species.
It's more about harmonic content than the fundamental pitch. A waveform with sharp transitions has more harmonics than one with rounded transitions, regardless of the fundamental. Compare the harmonic content of a pure sine wave (just the fundamental) with that of an ideal square wave, which has an infinite series of odd harmonics.
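If you want to see the sine vs. square comparison in numbers, here is a rough sketch in plain Python. It samples one period of each wave and runs a naive DFT over the first seven harmonics. The sample count `N` and the helper name `harmonic_magnitudes` are my own choices for illustration, nothing standard.

```python
import cmath
import math

def harmonic_magnitudes(samples, max_harmonic):
    """Naive DFT: amplitude of harmonics 1..max_harmonic for one period of samples."""
    n = len(samples)
    mags = []
    for k in range(1, max_harmonic + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) * 2 / n)  # scale so a unit sine gives 1.0
    return mags

N = 256  # samples per period (arbitrary choice for this sketch)
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
square = [1.0 if i < N // 2 else -1.0 for i in range(N)]

sine_h = harmonic_magnitudes(sine, 7)
square_h = harmonic_magnitudes(square, 7)

for k, (s, q) in enumerate(zip(sine_h, square_h), start=1):
    print(f"harmonic {k}: sine={s:.4f}  square={q:.4f}")
```

The sine should show energy only at harmonic 1, while the square keeps showing energy at harmonics 3, 5 and 7, falling off roughly as 1/k. Both have the same fundamental; only the sharpness of the transitions differs.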
Babies are also smaller, which means higher fundamental pitch.
> "kiki" is not just higher pitched, it is also "shaped" differently if you look at the sound envelope, with, as expected, sharper transitions.
Exactly!
EDIT: I think this is interesting: it applies to images as well, not just sound. You can "low pass filter" a photograph and it'll reduce some of the detail, smoothing out transitions (typically used for noise reduction). Detail is high frequency information (or high frequency noise, depending on whether you want it or not).
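A toy version of the image case, assuming a single row of grayscale pixels and a simple moving average (box blur) as the low pass filter. Real image tools typically use 2D kernels such as a Gaussian; `box_blur` and `max_step` are just names made up for this sketch.

```python
def box_blur(row, radius=1):
    """Moving-average low pass filter; the window shrinks at the borders."""
    n = len(row)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = row[lo:hi]
        out.append(sum(window) / len(window))
    return out

def max_step(row):
    """Largest jump between neighbouring pixels, a crude measure of edge sharpness."""
    return max(abs(b - a) for a, b in zip(row, row[1:]))

edge = [0, 0, 0, 0, 255, 255, 255, 255]  # a hard black-to-white edge
blurred = box_blur(edge, radius=1)

print(blurred)                             # [0.0, 0.0, 0.0, 85.0, 170.0, 255.0, 255.0, 255.0]
print(max_step(edge), max_step(blurred))   # 255 85.0
```

The hard edge turns into a ramp: the single 255-level jump is spread over three smaller steps, which is the "smoothing out transitions" effect described above.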