Hacker News

Universal Claude.md – cut Claude output tokens

441 points by killme2008 today at 1:23 AM | 158 comments

Comments

TacticalCoder today at 2:58 AM

> Uses em dashes (--), smart quotes, Unicode characters that break parsers

Re: the Unicode chars, which are a major PITA when they're used where they shouldn't be: there's a problem with the Claude Code CLI. There's a mismatch between what the model (say Sonnet) thinks it's outputting (which it actually is) and what the user sees at the terminal.

I'm pretty sure it's due to the Rube-Goldberg heavy machinery that they decided to use, where they first render the response in a headless browser, then in real-time convert it back to text mode.

I don't know if there's a setting to stop that insane behavior from kicking in: it's nonsensical that what the user gets to see is not what the model actually output, while at the same time the model "thinks" the user is getting the proper output.

If you ask it to append all its messages (to the user) to a file, you can see, say, perfectly fine ASCII tables neatly indented in all their ASCII glory and then... a fucked-up Unicode monstrosity in the Claude Code CLI terminal, due to whatever mad conversion happened automatically. Worse, the model has zero idea these automated conversions are happening.
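For anyone who wants to check this on their own setup, here's a minimal sketch of the comparison, assuming you've already had the model append its messages to a plain-text file. The function name `find_non_ascii` and the sample strings are made up for illustration. It only reports which non-ASCII characters are present in a piece of text; it says nothing about where in the pipeline they were introduced.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line_no, column, char, unicode_name) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Fall back to "UNKNOWN" for code points that have no Unicode name.
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical samples: a plain ASCII table, then box-drawing and "smart" punctuation.
ascii_table = "+----+\n| ok |\n+----+"
converted = "\u250c\u2500\u2500\u2510 \u201cquoted\u201d \u2014 dash"

print(find_non_ascii(ascii_table))  # empty list: nothing outside ASCII
for hit in find_non_ascii(converted):
    print(hit)
```

Run it once on the contents of the file the model wrote and once on text copied from the terminal; if only the second one reports hits, the conversion happened after the model produced its output.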

I don't know if there are options for that but they sure as heck ain't intuitive to find.

And it's really problematic when you need to dig into an issue and actually discuss with "the thing".

Anyway, time for a rant... I'm paying my subscription but overall working with these tools feels like driving at 200 mph on the highway and bumping into the guardrails left and right every second to then, eventually, crash the car into the building where you're supposed to go.

It "works", for some definition of "working".

The number of errors these things confidently make is through the roof. And people believe that having them figure out the errors themselves for trivial stuff is somehow a sane way to operate.

They're basically saying: "Oh no it's not a problem that it's telling me this error message is because of a dependency mismatch between two libraries while it's actually a logic error, because in the end after x passes where it's going to say it's actually because of that other thing --oh wait no because of that fourth thing-- it'll actually figure out the error and correct it".

"Because it's agentic", so it's oh-so-intelligent.

When it's actually trying the most completely dumbfucktarded things in the most crazy way possible to solve issues.

I won't get started on me pasting a test case showing that the code it wrote is failing for it to answer me: "Oh but that's a behavioral problem, not a logic problem". That thing is distorting words to try to not lose face. It's wild.

I may cancel my subscription and wait two or three more releases for these models and the tooling around them to get better before jumping back in.

Btw if they're so good, why are the tools so sucky: how come they haven't yet written amazing tooling to deal with all their idiosyncrasies?

We're literally talking about TFA which wrote "Unicode characters that break parsers" (and I've noticed the exact same when trying to debug agentic thinking loops).

That's the level of mediocrity of output from these tools (or proprietary wrappers around these tools we don't control) that we're at atm.

I know, I know: "I'm doing it wrong because I'm not a prompt engineer" and "I'm not agentic enough" and "I don't have enough skills to write skills". But you're only fooling yourself.

aiedwardyi today at 11:45 AM

the token split is wild - 93% input vs 4% output. makes sense to optimize output but forcing short responses can hurt coherence in longer agentic sessions
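To put a number on that: a rough sketch of the arithmetic, taking the 4% output share quoted above at face value. The function name and the 50% cut are made-up illustration values, and this counts tokens only, not cost (output tokens are typically priced higher than input tokens, so the cost picture can differ).

```python
def total_token_reduction(output_share, output_cut):
    """Fraction of all tokens saved when output tokens shrink by output_cut."""
    return output_share * output_cut


# Output is 4% of all tokens; suppose a prompt file halves the output length.
print(f"{total_token_reduction(0.04, 0.50):.1%}")  # prints 2.0%
```

So even a drastic cut to output length moves the total token count by only a couple of percent when input dominates this heavily.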

keyle today at 2:17 AM

Amusing how this industry went from tweaking code for the best results, to tweaking code generators for the best results.

There don't seem to be any adults left in the room.
