Hacker News

anthk · yesterday at 11:24 PM · 2 replies

Hey 'software engineer', how much of the output of an LLM is actually reproducible, compared to the output of a calculator or any programming language given the same input in different sessions?


Replies

onion2k · today at 7:30 AM

Not really related to this 'discussion', but this is an interesting problem in the AI space. It's essentially a well-understood problem from unreliable distributed systems: if you have a series of steps that might not respond with the same answer every time (because one occasionally fails), how do you get to a useful and reliable outcome? I've been experimenting with running a prompt multiple times and having an agent diff the outputs to find parts that some runs missed, or having it vote on which run produced the best response, with a modicum of success. If you're concerned about having yet another layer of AI in there, an alternative is getting the agents to return structured output that you can run through a deterministic function.
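A minimal sketch of that last idea: run the same prompt N times, keep only the runs that return valid structured output, and pick the winner deterministically by majority vote. The `runs` list below is stubbed with canned responses; in practice each entry would come from a real LLM call, which is not shown here.

```python
import json
from collections import Counter

def majority_vote(responses):
    """Pick the most common structured answer across multiple runs.

    Each response is expected to be a JSON string like '{"answer": ...}'.
    Runs that fail to parse are dropped, mirroring the idea that an
    unreliable step may not respond usefully every time.
    """
    answers = []
    for raw in responses:
        try:
            # Canonicalize so semantically equal JSON compares equal.
            answers.append(json.dumps(json.loads(raw), sort_keys=True))
        except (json.JSONDecodeError, TypeError):
            continue  # treat an unparseable run as a failed node
    if not answers:
        raise ValueError("no run produced valid structured output")
    winner, count = Counter(answers).most_common(1)[0]
    return json.loads(winner), count

# Stubbed runs: in practice these would be N calls with the same prompt.
runs = [
    '{"answer": 42}',
    '{"answer": 42}',
    'not even JSON',      # a failed run
    '{"answer": 41}',
]
best, votes = majority_vote(runs)
print(best, votes)  # {'answer': 42} 2
```

The deterministic reducer doesn't have to be a vote; once the outputs are structured, any pure function (diff, intersection, schema validation) works without adding another layer of AI.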

Non-determinism is a problem you can mitigate to some extent with a bit of effort, and mitigating it matters if your AI is running without a human-in-the-loop step. If you're there prompting it yourself, though, it doesn't actually matter: if you don't get a good result, just try again.

show 1 reply
weird-eye-issue · yesterday at 11:41 PM

Why are you so concerned about the LLM producing the exact same code across different sessions? Seems like a really weird thing to focus on. Why aren't you focused on things like security, maintainability, UI/UX, performance?