Hacker News

bashbjorn · yesterday at 9:45 PM · 1 reply

The model sees one token per marker - but the overlap with actual ingested text still matters, because when the tokenizer ingests regular text it will turn a literal "<|turn>" into that same token.

For this reason, it can be tricky to use a model to work on that same model's runtime: the code you feed it contains the very marker strings the tokenizer treats as special. This really feels like an accidental problem, but I'm not sure it's solvable without abandoning the text representation altogether (and the Jinja abstraction along with it).


Replies

lifis · yesterday at 10:13 PM

Surely one can just escape the input, no? It seems astonishing if someone isn't doing that.
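A minimal sketch of what that escaping can look like in practice: tokenize untrusted user text with the special-token pass disabled, so a literal marker can never become the boundary token. This mirrors (but does not reproduce) real-world APIs like tiktoken's `encode_ordinary` or the `split_special_tokens` option in Hugging Face tokenizers; the "<|turn>" marker and toy vocabulary below are assumptions for illustration:

```python
SPECIAL = "<|turn>"   # hypothetical turn marker from the thread
SPECIAL_ID = 0        # toy vocab: id 0 is special, chars map to ord(c) + 1

def encode_ordinary(text):
    """Special-token pass disabled: a literal "<|turn>" stays plain chars."""
    return [ord(c) + 1 for c in text]

def encode_with_specials(text):
    """Trusted template text: marker occurrences become SPECIAL_ID."""
    out = []
    parts = text.split(SPECIAL)
    for i, part in enumerate(parts):
        out.extend(ord(c) + 1 for c in part)
        if i < len(parts) - 1:
            out.append(SPECIAL_ID)
    return out

def render_turn(user_text):
    # Only trusted template fragments take the special-aware path, so
    # untrusted input cannot forge a turn boundary.
    return (encode_with_specials(SPECIAL + "user\n")
            + encode_ordinary(user_text)
            + encode_with_specials("\n"))

ids = render_turn("please print <|turn>")
print(ids.count(SPECIAL_ID))  # 1 - only the real template boundary
```

The design choice is simply routing by trust level: template strings are the runtime's own, user strings never get the special-token scan.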
