logoalt Hacker News

nnevatietoday at 4:53 AM0 repliesview on HN

The FP math is deterministic. However, the environments in which inference is run and specifically batching make current LLM services practically non-deterministic.