This is because the "thinking" you see is a summary by a highly quantized model - not the ...

sharms • last Thursday at 10:59 PM • 0 replies • view on HN

This is because the "thinking" you see is a summary by a highly quantized model - not the actual model, to mask these tokens

alt Hacker News