Tostino • today at 1:51 AM

You have a benchmark for output token reduction, but no before/after comparison on a standard LLM benchmark to see whether the instructions hurt intelligence.

Telling the model to only do post-hoc reasoning is an interesting choice, and may not play well with all models.
