I’m guessing Google's model has extensive Minecraft sandbox mode YouTube vids in its training data, which would explain exactly this perspective.