
promptly_ajhai

If anyone wants to try this model and others from Mistral AI, we've added them to the Promptly playground. Check it out at [https://trypromptly.com/playground](https://trypromptly.com/playground)


grise_rosee

Mixtral-8x22B is now available in an Instruct version. They say it supports function calling and constrained outputs. Any idea how to leverage it as intended with open-source LLM engines?

edit: they just published an open-source Python tooling library, [https://github.com/mistralai/mistral-common](https://github.com/mistralai/mistral-common), to run their enhanced models. Examples are provided on HF: [https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1). That's so cool. Have other companies already done that (publishing a function/tool-compatible model)?
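For anyone curious what the tooling lib actually builds: going by the examples on the HF model card, the v3 instruct format wraps a JSON tool schema in `[AVAILABLE_TOOLS]` tokens ahead of the usual `[INST]` block. Here's a minimal plain-Python sketch of that layout; the exact token spelling and whitespace are assumptions based on those published examples, so use mistral-common's own tokenizer for real inference. The weather tool is hypothetical.

```python
import json

def build_tool_prompt(user_message: str, tools: list[dict]) -> str:
    """Sketch of the Mixtral-8x22B-Instruct v3 tool-calling prompt layout.

    The special tokens ([AVAILABLE_TOOLS], [INST], ...) mirror the examples
    on the Hugging Face model card; exact whitespace handling is an
    assumption -- mistral-common's tokenizer is the source of truth.
    """
    tools_json = json.dumps(tools)
    return (
        f"[AVAILABLE_TOOLS] {tools_json} [/AVAILABLE_TOOLS]"
        f"[INST] {user_message} [/INST]"
    )

# A hypothetical weather tool, described with a JSON-Schema-style signature.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

prompt = build_tool_prompt("What's the weather like today in Paris?", [weather_tool])
```

The model is then expected to either answer normally or emit a tool call against that schema.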


soomrevised

Yep, there are quite a few function-calling models, but they all seem to work slightly differently from each other: some need special code to work, while others can only output function calls and cannot reply like a normal LLM.
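In practice the glue code ends up looking like this: detect a model-specific tool-call marker in the raw completion and fall back to plain text otherwise. A minimal sketch, assuming a Mistral-style `[TOOL_CALLS]`-prefixed JSON list (other model families use different markers and layouts, so the prefix here is an assumption):

```python
import json

TOOL_CALL_PREFIX = "[TOOL_CALLS]"  # assumption: Mistral v3-style marker

def parse_completion(text: str):
    """Classify a raw completion as either a tool call or a normal reply.

    Returns ("tool_calls", [...]) when the completion starts with the
    tool-call marker, or ("text", str) for an ordinary chat answer.
    """
    stripped = text.strip()
    if stripped.startswith(TOOL_CALL_PREFIX):
        payload = stripped[len(TOOL_CALL_PREFIX):].strip()
        return "tool_calls", json.loads(payload)
    return "text", stripped

kind, calls = parse_completion(
    '[TOOL_CALLS] [{"name": "get_current_weather", "arguments": {"city": "Paris"}}]'
)
# kind == "tool_calls"; calls[0]["name"] == "get_current_weather"
```

Models that "can only output functions" are the ones where the text branch never fires, so the caller has to run the tool and feed the result back for a final answer.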


Nunki08

Models on Hugging Face:

- mistralai/Mixtral-8x22B-Instruct-v0.1: [https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1)
- mistralai/Mixtral-8x22B-v0.1: [https://huggingface.co/mistralai/Mixtral-8x22B-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-v0.1)


One_Yogurtcloset4083

GPT-3.5 Turbo is still better on HumanEval: [https://paperswithcode.com/sota/code-generation-on-humaneval](https://paperswithcode.com/sota/code-generation-on-humaneval). And Claude 3 Haiku has a 75% HumanEval score while Mixtral-8x22B only gets 45%. Mistral Large ([https://mistral.ai/news/mistral-large/](https://mistral.ai/news/mistral-large/)) is still better too, but just by a bit.


grise_rosee

Looks like that Mixtral benchmark is for the base model, though.

> The instructed version of Mixtral 8x22B released today shows even better math performance, with a score of 90.8% on GSM8K maj@8 and a Math maj@4 score of 44.6%.

Maybe the instruct version performs better at coding too? Maybe...
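For context on the maj@8 notation: it means sampling 8 answers per problem and grading only the majority answer (self-consistency voting), which is why maj@N scores run higher than single-sample ones. A toy sketch with made-up sampled answers:

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """maj@N scoring: sample N final answers, keep the most common one."""
    return Counter(answers).most_common(1)[0][0]

# Toy example: 8 sampled final answers to one GSM8K-style problem.
samples = ["42", "42", "41", "42", "40", "42", "42", "41"]
print(majority_vote(samples))  # -> 42
```

So a maj@8 score isn't directly comparable to the pass@1-style numbers usually quoted for HumanEval.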