AndrewKemendo, yesterday at 11:14 PM

This looks like a really promising approach.

In particular, the forward rollout module is very important. It aligns what is effectively your world model with what it expects from the world, and keeping those two in sync is, I think, what gives this the power it needs to generate the state-action pairs for continuous semi-supervised training.