I don't think it's abstract at all. Rub something sharp (anything from a stick to a phonograph needle) on an object and you'll directly transcribe its spatial frequency spectrum into an audio frequency spectrum.
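To put rough numbers on that: a point dragged at a steady speed across a periodic texture produces a tone at speed times spatial frequency. Here is a minimal sketch of that arithmetic. The function name and the example values are mine, purely illustrative and not measurements of any real object.

```python
def audio_frequency_hz(cycles_per_metre: float, speed_m_per_s: float) -> float:
    """Temporal frequency heard when a stylus crosses a periodic texture.

    cycles_per_metre: spatial frequency of the surface (ridges per metre).
    speed_m_per_s:    speed of the stylus relative to the surface.
    """
    if cycles_per_metre < 0 or speed_m_per_s < 0:
        raise ValueError("spatial frequency and speed must be non-negative")
    return cycles_per_metre * speed_m_per_s


# Hypothetical example: ridges spaced 1 mm apart (1000 cycles per metre),
# stick dragged at 0.25 m/s.
tone = audio_frequency_hz(cycles_per_metre=1000.0, speed_m_per_s=0.25)
print(f"{tone:.0f} Hz")
```

Under those assumed values the result lands well inside the audible range. Coarser textures or slower strokes scale the tone down proportionally.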
"Spatial frequency spectrum" typically refers to visual elements of an object, and has nothing in particular to do with its structure. Entirely smooth surfaces banded in different colors have a "spatial frequency". Extremely irregular surfaces have no effective spatial frequency. Objects on the same scale as, say, a human head would have to be "rubbed" at ridiculously high rates (and repeatedly) to even get into a "frequency" range that might include pressure variations that would be considered a "wave".
I think you're imagining an entirely too limited set of objects.
Do you think it's obvious that a chick would understand that connection?