logoalt Hacker News

satvikpendemyesterday at 3:06 PM1 replyview on HN

Curious, why wouldn't the future be a full world model like Google's Genie? It just renders every pixel so someone could still make their vision come to life via a prompt too.


Replies

Lercyesterday at 8:50 PM

It could be done that way but you are spending parameters managing the fact that the output changes completely with a change in view position or orientation. A observer independent model only has to manage changes of things that are actually changing in the world.

Since you can view Gaussian splats from any POV you end up generating an output that is closer to the representation of the world instead of a projection that a single observer sees.