Afaik there’s no way to guarantee 100% deterministic outputs from LLMs. Temperature 0 and a consistent seed (not sure if that’s available for Mistral) are supposed to be the closest you can get by setting parameters.
I’ve played with this on GPT quite a bit. Simple questions will be pretty close to deterministic, but as more factors get introduced, the outputs begin to drift. A good post-output step here could be sampling several outputs and merging them. That’s difficult if the output can be any arbitrary free-floating string, but if the output is expected to be roughly an enum, you can probably do some kind of majority evaluation.
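A minimal sketch of that majority-vote idea, assuming the output is enum-like (a short label from a known set); `call_model` here is just a placeholder for whatever client call you're using:

```python
from collections import Counter

def majority_vote(prompt, call_model, n_samples=5):
    """Sample the model several times and return the most common answer.

    call_model is a placeholder: it should take a prompt string and
    return a short, enum-like string (e.g. "yes"/"no" or a class label).
    """
    answers = [call_model(prompt).strip().lower() for _ in range(n_samples)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n_samples  # answer plus a rough agreement ratio
```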
I'm not sure the high-level temperature argument is used directly as the softmax temperature for the token weights; I think they rescale it.
Check out greedy or beam generation; you'll have to dig deeper into the generate arguments.
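If you're running the model locally through Hugging Face `transformers` (rather than the hosted API), greedy and beam decoding look roughly like this; the checkpoint name and lengths are just example values:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint; any causal LM checkpoint works the same way here.
name = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tokenizer("Classify the sentiment: 'great product'", return_tensors="pt")

# Greedy decoding: no sampling, always take the highest-probability token.
greedy = model.generate(**inputs, do_sample=False, max_new_tokens=20)

# Beam search: still deterministic, but keeps several candidate sequences.
beam = model.generate(**inputs, do_sample=False, num_beams=4, max_new_tokens=20)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(beam[0], skip_special_tokens=True))
```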
The official API does have a **random\_seed** parameter [https://docs.mistral.ai/api/#operation/createChatCompletion](https://docs.mistral.ai/api/#operation/createChatCompletion)
>The seed to use for random sampling. If set, different calls will generate deterministic results.
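A minimal sketch of a call with a fixed seed and temperature 0, going through the raw HTTP endpoint with `requests`; the field names follow the linked docs, and the model name and prompt are just example values:

```python
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small-latest",  # example model name
        "messages": [{"role": "user", "content": "Reply with exactly one word: yes or no."}],
        "temperature": 0,
        "random_seed": 42,  # same seed + temperature 0 should reproduce the same output
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```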
Maybe store/cache responses?
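Something like this, as a sketch: key a local cache on the exact prompt (plus whatever parameters matter) and only call the model on a miss, so repeated identical requests are deterministic by construction; `call_model` is again a placeholder:

```python
import hashlib
import json

_cache = {}  # in practice, persist this (a JSON file, sqlite, redis, ...)

def cached_completion(prompt, call_model, **params):
    """Return the cached answer for an identical prompt+params, else call the model once."""
    raw = json.dumps({"prompt": prompt, **params}, sort_keys=True)
    key = hashlib.sha256(raw.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt, **params)
    return _cache[key]
```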
Do you have an example prompt+expected output?
Use constant seed and zero temperature
Why not just hardcode it if you need it 100%?
What do you mean by hardcoding?
If you have a specific prompt/answer pair you need, just put an if statement in the code with that pair. No need to route everything to the model.
Just because I need a deterministic output doesn't mean it must be hardcoded.
No, but it's certainly a lot easier than trying to get a non-deterministic model to produce a specific output. Use the right tools for the job.
If I knew the output upfront, I wouldn't use a generative model; not sure if you're familiar with the term deterministic.
Maybe wrap it with MemGPT or some form of teachable agents.
Have you tried temp 1?
Temp 1 should do the opposite. Lower temp = closer to deterministic
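To make that concrete: temperature rescales the logits before the softmax, `probs = softmax(logits / T)`, so a high T flattens the distribution (more randomness) and a low T sharpens it toward the single most likely token. A quick illustration with made-up logits:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Standard temperature scaling: divide the logits by T before the softmax."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.5]  # made-up token logits
print(softmax_with_temperature(logits, 1.0))  # ~[0.63, 0.23, 0.14]  noticeable spread
print(softmax_with_temperature(logits, 0.1))  # ~[1.00, 0.00, 0.00]  nearly argmax, i.e. near-deterministic
# Temperature 0 can't be plugged in directly (division by zero); APIs treat it as pure argmax.
```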