Extracting a surface mesh is possible, but the result is going to be really ugly (like the high-poly meshes from generative AI that are useless to artists)!
Mesh processing is a very difficult research domain in computer graphics that has been iterated on for several decades, and we still don't have a good automated solution for retopology (partly because the problem is hard to define mathematically, but also because it's not a problem you can just solve with AI by throwing data and compute at it).
With things like the latest DLSS (extremely high-quality run-time ... reinterpretation), I wonder how precise the mesh etc. has to be now.
1. extract even a super approximate mesh (meaning, like, square edges with some visual details) from gen AI or a scan as a starting point,
2. move things around and define volumes for gameplay needs,
3. name things ("this is a Victorian house in a surprisingly good condition compared to the neighborhood it's in") and have human-guided gen AI polish things a bit more from the labels, within the bounds of the gameplay-required volumes,
4. let run-time DLSS fix the lighting etc. from the rough geometry.
I used voxelization of the splats in the past, so I appreciate the notes on the difficulty, but this is sort of what PlayCanvas is doing here: taking the splats, making voxels, meshing.
It's a novel approach, and it worked well in BIM a few years ago, though not for anything real-time.
https://github.com/ziplab/VolSplat
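
For anyone wondering what "taking the splats, making voxels, meshing" looks like in the crudest possible form, here is a minimal sketch. It is not PlayCanvas's or VolSplat's actual method: it assumes each splat is reduced to its center point (ignoring opacity, scale and rotation), and the names `voxelize` and `exposed_faces` are made up for illustration. It produces exactly the kind of blocky, square-edged surface described upthread, one quad per voxel side that borders empty space.

```python
import math

Point = tuple[float, float, float]
Voxel = tuple[int, int, int]

# Six axis-aligned neighbour offsets; each doubles as a face normal.
NEIGHBOURS: list[Voxel] = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def voxelize(centers: list[Point], voxel_size: float) -> set[Voxel]:
    """Snap splat centers (assumed input) to integer grid cells."""
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in centers
    }


def exposed_faces(occupied: set[Voxel]) -> list[tuple[Voxel, Voxel]]:
    """Return (voxel, normal) pairs for every face bordering an empty cell.

    Faces shared by two occupied voxels are interior and are skipped,
    so only the outer shell of the occupied region is emitted.
    """
    faces = []
    for v in occupied:
        for d in NEIGHBOURS:
            neighbour = (v[0] + d[0], v[1] + d[1], v[2] + d[2])
            if neighbour not in occupied:
                faces.append((v, d))
    return faces


if __name__ == "__main__":
    # Three hypothetical splat centers; the first two share a voxel.
    centers = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.1, 0.1)]
    grid = voxelize(centers, voxel_size=1.0)
    shell = exposed_faces(grid)
    print(len(grid), "voxels,", len(shell), "exposed quad faces")
```

A real pipeline would presumably weight occupancy by splat opacity and extent and run something smoother (marching cubes or similar) over the grid, and the result would still need the retopology pass discussed above before an artist would want to touch it.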