Hacker News

cupofjoakim · yesterday at 2:43 PM · 19 replies

> Opus 4.7 uses an updated tokenizer that improves how the model processes text. The tradeoff is that the same input can map to more tokens—roughly 1.0–1.35× depending on the content type.

caveman[0] is becoming more relevant by the day. I already enjoy reading its output more than vanilla so suits me well.

[0] https://github.com/JuliusBrussee/caveman/tree/main


Replies

Tiberium · yesterday at 2:47 PM

I hope people realize that tools like caveman are mostly joke/prank projects. Almost the entirety of the context is spent on file reads (for input) and reasoning (for output), so you will barely save even 1% with such a tool. You might actually confuse the model, or make it reason for more tokens, because it has to formulate its response in a way that satisfies the requirements.
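A back-of-envelope check of that claim (all numbers hypothetical): if the prompt is a few hundred tokens inside a context dominated by file reads and reasoning, compressing only the prompt barely moves the total.

```python
# Back-of-envelope: how much does compressing the user prompt save
# when most of the context is file reads and reasoning?
# All figures below are hypothetical.

def prompt_savings(prompt_tokens, other_context_tokens, compression_ratio):
    """Fraction of total context saved by compressing only the prompt."""
    total = prompt_tokens + other_context_tokens
    saved = prompt_tokens * (1 - compression_ratio)
    return saved / total

# A 500-token prompt in a 100k-token agent session,
# compressed to 60% of its original size:
frac = prompt_savings(500, 99_500, 0.60)
print(f"{frac:.2%}")  # 0.20% -- well under 1%
```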

gghootch · yesterday at 3:55 PM

Caveman is fun, but the real tool you want for reducing token usage is headroom:

https://github.com/gglucass/headroom-desktop (mac app)

https://github.com/chopratejas/headroom (cli)

computomatic · yesterday at 2:56 PM

I was doing some experiments with removing the 100–1000 most common English words from my prompts. My hypothesis was that common words are effectively noise to agents. Based on the first few trials I attempted, there was no discernible difference in output. Would love to compare results with caveman.

Caveat: I didn’t do enough testing to find the edge cases (eg, negation).
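The experiment can be sketched in a few lines. The stopword set here is a tiny illustrative subset, not a real top-100 frequency list, and it deliberately leaves out words like "not" — the negation edge case mentioned above.

```python
# Sketch of the experiment: drop very common English words from a prompt.
# STOPWORDS is a small illustrative subset, not an actual frequency list.
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "that", "it",
             "and", "for", "on", "with", "as", "this", "be", "was"}

def strip_common_words(prompt: str) -> str:
    words = re.findall(r"\S+", prompt)
    # Compare case-insensitively, ignoring trailing punctuation.
    kept = [w for w in words if w.lower().strip(".,!?") not in STOPWORDS]
    return " ".join(kept)

print(strip_common_words("Refactor the parser so that it is easier to test"))
# -> Refactor parser so easier test
```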

alach11 · yesterday at 7:29 PM

On my private internal oil and gas benchmark, I found a counterintuitive result: Opus 4.7 scores 80%, outperforming Opus 4.6 (64%) and GPT-5.4 (76%), yet it's also the cheapest of the three models, by a factor of two.

This is mainly driven by reduced reasoning token usage. It goes to show that "sticker price" per token is no longer adequate for comparing model cost.
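To make the arithmetic concrete, a sketch with invented prices and token counts (not the actual benchmark figures): effective cost is per-token price times tokens actually consumed, so a model with a higher sticker price can still be cheaper per run if it reasons tersely.

```python
# Effective run cost = per-token price x tokens actually consumed.
# Prices and token counts below are invented for illustration only.

def run_cost(in_tokens, out_tokens, in_price_per_m, out_price_per_m):
    """Dollar cost of one run given per-million-token prices."""
    return in_tokens / 1e6 * in_price_per_m + out_tokens / 1e6 * out_price_per_m

# Model A: higher sticker price, but terse reasoning.
a = run_cost(in_tokens=50_000, out_tokens=10_000,
             in_price_per_m=15, out_price_per_m=75)
# Model B: cheaper per token, but burns 5x the reasoning tokens.
b = run_cost(in_tokens=50_000, out_tokens=50_000,
             in_price_per_m=10, out_price_per_m=40)

print(f"A: ${a:.2f}, B: ${b:.2f}")  # A: $1.50, B: $2.50
```

Despite the lower sticker price, model B costs more per run — which is the point about "sticker price" being inadequate.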

TIPSIO · yesterday at 3:05 PM

Oh wow, I love this idea, even if the savings are relatively insignificant.

I'm finding my prompt-writing style is naturally getting lazier, shorter, and more caveman-like too. If I'm honest, it has made writing emails harder.

While messing around, I prototyped a version of this for HTML to save tokens. It worked surprisingly well, but it was only an experiment. Something like:

> <h1 class="bg-red-500 text-green-300"><span>Hello</span></h1>

AI compressed to:

> h1 c bgrd5 tg3 sp hello sp h1

Or something like that.
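The compression above can be sketched as a lossy dictionary substitution. Every abbreviation in the table is invented for illustration; a real setup would need the model to be told (or to infer) the mapping.

```python
# Lossy HTML "compression" via dictionary substitution, roughly as described.
# The abbreviation table is invented for illustration.
import re

ABBREV = {
    "<h1 ": "h1 ",
    "</h1>": " h1",
    "<span>": " sp ",
    "</span>": " sp ",
    'class="': "c ",
    '"': " ",
    "bg-red-500": "bgrd5",
    "text-green-300": "tg3",
}

def squash_html(html: str) -> str:
    for full, short in ABBREV.items():
        html = html.replace(full, short)
    # Drop leftover angle brackets, collapse whitespace, lowercase.
    html = re.sub(r"[<>]", " ", html)
    return " ".join(html.split()).lower()

print(squash_html('<h1 class="bg-red-500 text-green-300"><span>Hello</span></h1>'))
# -> h1 c bgrd5 tg3 sp hello sp h1
```

This is one-way, of course — the point is only to cut tokens on input, not to round-trip the markup.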

fzaninotto · yesterday at 5:24 PM

To reduce token count on command outputs, you can also use RTK [0]:

[0]: https://github.com/rtk-ai/rtk

stacktraceyo · yesterday at 11:48 PM

What about something like

https://github.com/rtk-ai/rtk

willsmith72 · today at 12:50 AM

That's such a poor way to communicate a number. I take it they mean an increase of up to 35%?

motoboi · yesterday at 4:21 PM

Caveman hurts model performance. If you need a dumber model with less token output, just use sonnet-4-6 or another non-reasoning model.

chrisweekly · yesterday at 3:50 PM

I really enjoy the party game "Neanderthal Poetry", in which you can only speak using monosyllabic words. I bet you would too.

nickspag · yesterday at 4:36 PM

I find grep and common CLI command spam to be the primary issue. I enjoy Rust Token Killer (https://github.com/rtk-ai/rtk), and agents know how to work around it when it truncates too hard.

JustFinishedBSG · yesterday at 5:25 PM

Interesting, it doesn't seem intuitive at all to me.

My (wrong?) understanding was that there was a positive correlation between how "good" a tokenizer is in terms of compression and the downstream model performance. Guess not.

user34283 · yesterday at 3:32 PM

I used Opus 4.7 for about 15 minutes on the auto effort setting.

It nicely implemented two smallish features, and already consumed 100% of my session limit on the $20 plan.

See you again in five hours.

4b11b4 · today at 3:53 AM

but what about DDD

p_stuart82 · yesterday at 5:04 PM

caveman stops being a style tool and starts being self-defense. Once prompts come in up to 1.35x fatter, they've basically moved visibility and control entirely into their black box.

hayd · yesterday at 3:33 PM

me feel that it needs some tweaking - it's a little annoyingly cute (and could be even terser).

ctoth · yesterday at 5:01 PM

1.35 times! For Input! For what kinds of tokens precisely? Programming? Unicode? If they seriously increased token usage by 35% for typical tasks this is gonna be rough.

OtomotO · yesterday at 2:55 PM

Another supply chain attack waiting to happen?

Have you tried just adding an instruction to be terse?

Don't get me wrong, I've tried out caveman as well, but these days I wonder whether anything this popular will get hijacked.
