What is the actual objective of this? Is it solving an existing problem, or is it a solution to a problem that is still to be determined? It seems like a lot of energy to replicate a LIDAR mapping system. It's not like you can expect accurate dimensions from this approximate guesswork, and that's before counting the expected hallucinations, which add to the inaccuracy.
Video cameras are much cheaper and easier to use than LIDAR. Anyone can just pull out their phone, take a video, and send it to this algorithm to get a reasonable point cloud of the environment. Sure, if you want an exact model of an environment and you have the time and money, LIDAR would give better results, but this is about doing more with less.
N00b question from me, perhaps, but how easy is it to mount and run LIDAR on aerial drones?
3D reconstruction of old spaces that no longer exist seems like a clear use case to me. There are loads of old videos of people driving down a street in the 80s, or of neighborhoods in cities that have since been replaced.
I can imagine future iterations of this that bring in other stills of the same space from the same period to augment the dataset. Then perhaps another pass could fill in gaps with likely missing content, based on probability or on data from, say, the same street 10 years later.
It won't be 100% real, but I think it'd be very cool to have a Google Street View-style experience of areas from before Google Street View existed.