Hacker News

sharms, last Thursday at 10:59 PM

This is because the "thinking" you see is a summary written by a highly quantized model, not by the actual model, in order to mask these tokens.