
Most LLMs are deterministic, but the tooling around them samples randomly from the model's output distribution to let users explore the nearby space of responses without having to come up with infinitely nuanced prompts. You can turn this off.
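A rough sketch of what that sampling knob does (the logits are made up; numpy is just for convenience): at temperature > 0 the sampler draws from the softmax of the scaled logits, and at temperature 0 it always takes the argmax, which is deterministic as long as the logits themselves are.

    import numpy as np

    def sample_next_token(logits, temperature=1.0, rng=None):
        # temperature == 0: greedy decoding, always take the argmax (deterministic).
        # temperature > 0: scale the logits and sample from the softmax distribution.
        logits = np.asarray(logits, dtype=np.float64)
        if temperature == 0.0:
            return int(np.argmax(logits))
        rng = rng or np.random.default_rng()
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

    logits = [2.0, 1.8, 0.5, -1.0, -3.0]   # made-up logits over a 5-token vocabulary
    print([sample_next_token(logits, 0.8) for _ in range(5)])  # varies from run to run
    print([sample_next_token(logits, 0.0) for _ in range(5)])  # always token 0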

However, OpenAI's GPT-4 is not deterministic even with random sampling turned off. The most likely explanation I've seen is that they only activate some parts of the model for each input, and those parts are load-balanced, so sometimes a different part of the model ends up responding. https://news.ycombinator.com/item?id=37006224




This non-deterministic sampling isn't only there so users can explore the space of responses. Without it, the LLM itself is prone to generating overly repetitive text.
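A toy illustration of that failure mode (the two-word "model" below is invented, not a real LLM): once a deterministic rule over a bounded context revisits a context it has already seen, it is stuck repeating the same continuation forever, whereas sampling can break the cycle.

    # Deterministic next-word rule over the last two words.
    next_word = {
        ("the", "cat"): "sat",
        ("cat", "sat"): "on",
        ("sat", "on"): "the",
        ("on", "the"): "cat",   # re-enters the ("the", "cat") context
    }

    words = ["the", "cat"]
    for _ in range(12):
        words.append(next_word[(words[-2], words[-1])])
    print(" ".join(words))   # "the cat sat on the cat sat on the cat sat on the cat"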


> they only activate some parts of the model for each input

Perhaps you see seemingly random results because OpenAI is A/B testing multiple model versions, or different combinations of hyperparameters, so that they can train GPT-5.


Nah; the article mentioned above (from a few days ago here on HN) shows how GPT-4 is non-deterministic because the sparse mixture-of-experts technique it uses is sensitive to where inputs land in a batch.
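A toy sketch of that batch effect (the routing rule, scores, and capacity limit here are invented for illustration, not GPT-4's actual router): with top-1 routing and a per-expert capacity cap, the expert a token lands on can change depending on which other tokens happen to share its batch.

    import numpy as np

    def route_batch(token_scores, capacity=1):
        # Toy top-1 MoE routing with a per-expert capacity limit.
        # Tokens are routed in batch order; when an expert is already
        # full, the token falls through to its next-best expert.
        n_experts = token_scores.shape[1]
        load = [0] * n_experts
        assignment = []
        for scores in token_scores:
            for expert in np.argsort(scores)[::-1]:
                if load[expert] < capacity:
                    load[expert] += 1
                    assignment.append(int(expert))
                    break
        return assignment

    my_token = [0.9, 0.1]                         # our token prefers expert 0

    batch_a = np.array([[0.2, 0.8], my_token])    # neighbour prefers expert 1
    batch_b = np.array([[0.95, 0.05], my_token])  # neighbour also prefers expert 0

    print(route_batch(batch_a))   # [1, 0]: our token gets its preferred expert 0
    print(route_batch(batch_b))   # [0, 1]: expert 0 is full, our token is bumped to expert 1

Since a different expert means slightly different arithmetic, the logits can shift, and even greedy decoding can then pick a different token.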


> You can turn this off

Not entirely. Even with temperature = 0, GPT-4 is non-deterministic.
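For reference, temperature is just a request parameter; here's a minimal sketch with the OpenAI Python client (openai>=1.0 interface assumed; model name and prompt are illustrative). Repeating the exact same request like this has still been observed to return different completions from GPT-4.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Name one prime number."}],
        temperature=0,  # greedy-style decoding; not a determinism guarantee for GPT-4
    )
    print(resp.choices[0].message.content)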


> GPT-4 is non-deterministic.

For the curious reader: https://news.ycombinator.com/item?id=37006224

It appears that it could "easily" be made deterministic.


That article went past my level of expertise, which suggests that "easily" is, as you imply, a matter of perspective. It's possible the current behavior is the result of tradeoffs made for performance or cost, and modifications to make the model deterministic might require unacceptable tradeoffs.



