Hacker News

nightpool · yesterday at 8:00 PM · 3 replies

Interesting, but it feels like it's going to cope very poorly with actually safety-critical situations. A world model trained on successful driving data feels like it's going to "launder" a lot of implicit assumptions that would cause a car to crash in real life (e.g. there are probably no examples in the training data where the car is behind a stopped car, the driver pulls into another lane without checking the blind spot, and another car coming up from behind hits them). These kinds of subtle biases are going to make AI-simulated world models a poor fit for training safety systems where failure cannot be represented in the training data, since they basically give models free rein to do anything that the world model training never captured.


Replies

420official · yesterday at 8:41 PM

You're forgetting that they are also training the world model on real data from the 100+ million miles they've driven on real roads with riders.

> there's probably no examples in the training data where the car is behind a stopped car, and the driver pulls over to another lane and another car comes from behind and crashes into the driver because it didn't check its blindspot

This specific scenario is in the examples: https://videos.ctfassets.net/7ijaobx36mtm/3wK6IWWc8UmhFNUSyy...

It doesn't show the failure mode; it demonstrates successful crash avoidance.

MillionOClock · yesterday at 9:05 PM

While there is most likely going to be some bias in the training of these kinds of models, we can also hope that transfer learning from other, non-driving videos will at least help generate something close to the very real but unusual situations you mention. One could imagine an LLM serving as a kind of fuzzer that creates a large variety of prompts for the world model, which, as the article shows, seems quite capable of generating fictional scenarios when asked to (a rough sketch of that loop is below).
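Very roughly, the fuzzing idea could look something like this (everything here is hypothetical: the scenario attributes are made up, and a real pipeline would presumably use an LLM to paraphrase and expand the prompts rather than simple templating):

```python
import itertools
import random

# Hypothetical seed attributes for rare driving scenarios; an LLM-based
# fuzzer would generate richer natural-language variations of these.
ACTORS = ["stopped car", "cyclist", "jaywalking pedestrian", "merging truck"]
EVENTS = ["pulls into the adjacent lane", "brakes hard", "runs a red light"]
CONDITIONS = ["at night in heavy rain", "with an occluded blind spot", "in dense fog"]

def fuzz_prompts(n: int, seed: int = 0) -> list[str]:
    """Sample n scenario prompts from the combinatorial attribute space."""
    rng = random.Random(seed)
    space = list(itertools.product(ACTORS, EVENTS, CONDITIONS))
    picks = rng.sample(space, k=min(n, len(space)))
    return [f"The ego vehicle is behind a {a}; another road user {e} {c}."
            for a, e, c in picks]

if __name__ == "__main__":
    for prompt in fuzz_prompts(5):
        # Each prompt would then be fed to the world model to roll out a
        # synthetic scenario for training or evaluation (hypothetical step).
        print(prompt)
```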

As always, though, the devil is in the details: is an LLM-based generation pipeline good enough? What even is the definition of "good enough"? Even with good prompts, will the world model output something close enough to reality to serve as a good virtual driving environment for further training / testing of autonomous cars? Or do the limitations you mention mean subtle but dangerous imprecisions will still slip through and leave the data distribution too poor for this to be a truly viable approach?

My personal feeling is that we will land somewhere in between: I think approaches like this one will be very useful, but I also don't think the current state of AI models means we can get something 100% reliable out of this.

The question is: is 100% reliability a realistic goal? Human drivers are definitely not 100% reliable. If we come up with a solution 10x more reliable than the best human drivers, and it also comes with some hard proof that it cannot exhibit certain classes of catastrophic failure modes (probably via verified-code approaches that, for instance, guarantee that even if the NN output is invalid the car never tries to move outside a verifiably safe envelope), then I feel like the public and regulators would be much more inclined to authorize full autonomy.
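To make the "verifiably safe envelope" idea concrete, here is a minimal sketch of a runtime guard in that spirit; the bounds and the stopping-distance check are purely illustrative assumptions, not anything from the article:

```python
from dataclasses import dataclass

@dataclass
class Action:
    accel_mps2: float     # requested acceleration (m/s^2)
    steer_rad: float      # requested steering angle (rad)

# Conservative control bounds (illustrative values only).
MAX_ACCEL = 2.0
MAX_DECEL = -6.0
MAX_STEER = 0.4

def within_envelope(a: Action, speed_mps: float, gap_m: float) -> bool:
    """Cheap, checkable predicate: bounds on controls plus a crude
    stopping-distance test against the gap to the vehicle ahead."""
    if not (MAX_DECEL <= a.accel_mps2 <= MAX_ACCEL):
        return False
    if abs(a.steer_rad) > MAX_STEER:
        return False
    stopping_distance = speed_mps ** 2 / (2 * abs(MAX_DECEL))
    return gap_m > stopping_distance

def guard(nn_action: Action, speed_mps: float, gap_m: float) -> Action:
    """Pass the NN's action through only if it stays inside the envelope;
    otherwise fall back to a fixed safe maneuver (brake, wheel straight)."""
    if within_envelope(nn_action, speed_mps, gap_m):
        return nn_action
    return Action(accel_mps2=MAX_DECEL, steer_rad=0.0)
```

The point is that the guard is simple enough to verify independently of the NN, so even a badly wrong network output can't push the car outside the envelope.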