Hacker News

thomascountz, today at 6:40 AM

Yes, you can use constrained decoding, such as logit masking, which sets the logits of all invalid tokens in the vocabulary to -inf so that they are effectively removed from selection. I believe llama.cpp exposes this by accepting a formal grammar (its GBNF format).
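
For anyone who wants to see the masking step concretely, here is a minimal sketch in plain Python. It is not how llama.cpp implements it; the function names (`mask_logits`, `softmax`, `sample_constrained`), the toy vocabulary, and the logit values are all made up for illustration, and it assumes something else (a grammar, a schema) has already decided which token ids are valid at the current step.

```python
import math
import random

def mask_logits(logits, allowed_ids):
    """Return a copy of logits where every token not in allowed_ids is -inf."""
    allowed = set(allowed_ids)
    if not allowed:
        raise ValueError("at least one token must be allowed")
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax; exp(-inf) is 0.0, so masked tokens get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_constrained(logits, allowed_ids, rng=random):
    """Sample a token id from the distribution restricted to allowed_ids."""
    probs = softmax(mask_logits(logits, allowed_ids))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical toy example: the model prefers "hello", but only "{" and "42" are valid here.
vocab = ["{", "}", "hello", "42"]
logits = [0.1, 0.2, 5.0, 1.0]
allowed_ids = [0, 3]

token_id = sample_constrained(logits, allowed_ids)
print(vocab[token_id])  # always "{" or "42", never "hello"
```

The surviving tokens keep their relative probabilities, since softmax simply renormalizes over whatever is left unmasked. A grammar-based decoder would recompute `allowed_ids` at every step from the grammar state.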