logoalt Hacker News

Eextra953yesterday at 10:15 PM1 replyview on HN

Am I understanding correctly that an implication of this is reduced context? since they are streaming by splitting the input into streams the total context is now split amongst those streams and a particular streams context will be shorted to to context/ streams?


Replies

danlentontoday at 12:35 AM

I think the main benefit is improved speed and parallelism. Very similar to https://thinkingmachines.ai/blog/interaction-models/