I haven't tried that specific case but - are you sure? It does get a lot of stuff right from context. I think it would probably depend how much of the frame, the poster took up.
Maybe, but what is wrong with wanting real depth instead of "made up depth"? One extra photo mostly solves that.
More reference images from different angles is always going to give more accurate information in 3D. From a single 2D image there is a lot of ambiguity in the context. Several different shapes in 3D can be represented in identical ways in 2D. Additional context like lighting shadows etc helps. But more real signal from more images will always be better