Could transparency help?
Ordinarily, a 3D scene rendered in 2D only lets you see along each ray from your eye up to the first surface that ray hits, and those first hits define the 2D projection you see.
But you can make the surfaces transparent so the ray continues, and each additional surface adds a bit to the final pixel. This can look like a mess if you stand still, but if you wiggle your viewpoint left and right (or in any other direction), your brain suddenly manages to process it into the full 3D structure.
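For what it's worth, here is a minimal Python sketch of the "each surface adds a bit" idea, assuming plain front-to-back alpha compositing along a single ray. The name composite_ray and the (intensity, opacity) pairs are made up for illustration and are not taken from any particular renderer.

```python
def composite_ray(surfaces, background=0.0):
    """Blend the surfaces a single ray passes through.

    surfaces: list of (intensity, opacity) pairs, ordered nearest-first.
    Opacity 1.0 is a normal opaque surface; lower values let the ray continue.
    """
    pixel = 0.0
    transmittance = 1.0  # fraction of light still getting through so far
    for intensity, opacity in surfaces:
        # Each surface contributes in proportion to how much light reaches it.
        pixel += transmittance * opacity * intensity
        transmittance *= 1.0 - opacity
    # Whatever light is left over comes from the background.
    return pixel + transmittance * background


# Opaque front surface: the surface behind it contributes nothing.
opaque = composite_ray([(1.0, 1.0), (0.5, 1.0)])

# Two half-transparent surfaces: both show up in the final pixel.
see_through = composite_ray([(1.0, 0.5), (1.0, 0.5)])
```

With opacity 1.0 on the first surface this reduces to the ordinary "first hit wins" rendering; lowering the opacity lets the surfaces behind leak into the pixel.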
Can something like this be done in 4D?
Something like "wiggle stereoscopy"[1], but for 3D scenes instead of 2D images. Wiggle tesseroscopy?
[1] https://en.wikipedia.org/wiki/Wiggle_stereoscopy