Hacker News

cyanydeez yesterday at 10:42 AM

the problem is the null answer will stop the "markov" chain.

so, that's all.


Replies

BDPW yesterday at 11:16 AM

You don't have to literally send a null token. Train it to generate text that summarizes the evidence that is there but also conveys the uncertainty of the final answer to a prompt.

make3 yesterday at 1:08 PM

Transformers are not Markovian. Their whole point is arguably to be the reverse of Markovian: to efficiently make each new token a function of all previous tokens.
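
To make that concrete, here is a rough toy sketch, not how any real model is implemented. It compares a first-order Markov lookup with a single causal attention step on two sequences that differ only in their first token. Everything in it is an assumption made up for illustration: the names `markov_next_scores` and `attention_last_output`, the two-token vocabulary, the one-hot embeddings, and the choice to use the raw embeddings as queries, keys and values with no learned projections.

```python
import math

def markov_next_scores(tokens, table):
    """First-order Markov: next-token scores depend only on the last token."""
    return table[tokens[-1]]

def attention_last_output(embeddings):
    """Single-head causal self-attention output at the last position.

    Toy assumption: queries, keys and values are the raw embeddings,
    with no learned projection matrices.
    """
    query = embeddings[-1]
    dim = len(query)
    # Scaled dot-product score against every position, including the first.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
              for key in embeddings]
    # Numerically stable softmax over all positions.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, embeddings))
            for i in range(dim)]

# Hypothetical two-token vocabulary and made-up numbers.
table = {"a": [0.1, 0.9], "b": [0.7, 0.3]}
embed = {"a": [1.0, 0.0], "b": [0.0, 1.0]}

# Two sequences that differ only in their first token.
seq_1 = ["a", "b", "a"]
seq_2 = ["b", "b", "a"]

markov_same = markov_next_scores(seq_1, table) == markov_next_scores(seq_2, table)
out_1 = attention_last_output([embed[t] for t in seq_1])
out_2 = attention_last_output([embed[t] for t in seq_2])
attention_same = out_1 == out_2

print("Markov scores identical:", markov_same)
print("Attention outputs identical:", attention_same)
```

The Markov lookup cannot tell the two sequences apart because it only sees the final token, while the attention output at the last position shifts when the first token changes.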