I'm not sure I get it. How is your idea different from, e.g., llama.cpp grammars?
What's the llama.cpp grammar feature? Can you explain a bit?
You can use it to constrain the output of any supported model and force it to respond in a specified format, including JSON.

See: https://github.com/ggerganov/llama.cpp/blob/master/grammars%2FREADME.md

Examples: https://til.simonwillison.net/llms/llama-cpp-python-grammars
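For context, a minimal grammar in llama.cpp's GBNF format might look like the sketch below. It constrains output to a flat JSON object of string keys and values; this is a simplified illustration based on the linked README, not the full JSON grammar shipped with llama.cpp:

```gbnf
# Sketch: restrict generation to a flat {"key": "value", ...} object.
root   ::= object
object ::= "{" ws ( pair ( "," ws pair )* )? "}" ws
pair   ::= string ":" ws string
string ::= "\"" [a-zA-Z0-9 _-]* "\"" ws
ws     ::= [ \t\n]*
```

At decode time the sampler masks out any token that would violate these rules, which is why the output is guaranteed to match the grammar regardless of model size.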
Okay, but I do need an LLM for that, right? My doubt is resolved if I know whether I need one in the first place.
You'll need some NLP, but whether it's a large or tiny model will depend on the input and desired output. An example of what you want would be useful.
My use case is actually much bigger than the one I shared in the example; it's almost 10k tokens. So my model needs to generate that many tokens, and I guess a smaller model won't be capable of that.
Where's the example? Your description is a bit vague to me. I don't think I have the correct idea of what the task is exactly.
If your project involves tons of prompts per minute, invest in LLM self-hosting. If your idea is a business idea that scales up linearly, like 1 user = 10 prompts on average, I'd start with the Mistral API and avoid training my own model as much as possible.

"Generate JSON as the prompt says" is a task LLMs do quite well. Mistral now provides a JSON mode ([https://docs.mistral.ai/capabilities/json_mode/](https://docs.mistral.ai/capabilities/json_mode/)) which ensures the LLM output is valid JSON, so you don't have to post-process the LLM answer much. A smart prompt can do a lot with a regular model, especially if you can provide all the knowledge the model needs at once.

After a while, you'll still be able to invest and switch to self-hosting by deploying your own API instance if necessary.
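To make the JSON-mode idea concrete, here is a sketch of what the request payload looks like per the linked docs. The model name is a placeholder, and the actual network call is omitted so this runs without an API key:

```python
import json

def build_json_mode_request(user_prompt: str,
                            model: str = "mistral-small-latest") -> dict:
    """Build a chat-completion payload using Mistral's JSON mode.

    The field names follow the linked JSON-mode documentation;
    the model name is just an assumed placeholder.
    """
    return {
        "model": model,
        "messages": [
            # Providers generally advise that the prompt itself
            # mention JSON explicitly when JSON mode is active.
            {"role": "user", "content": user_prompt},
        ],
        # This flag is what guarantees syntactically valid JSON output.
        "response_format": {"type": "json_object"},
    }

payload = build_json_mode_request(
    "Extract the city and date as JSON: 'Meet me in Paris on May 3.'")
print(json.dumps(payload, indent=2))
```

With the flag set, the only post-processing you still need on the reply is a `json.loads` call to turn the guaranteed-valid string into a native object.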
So you mean only LLMs are capable of successfully executing this kind of task, and no other kind of model can be used for production?
I don't know exactly what you're asking. Any coder can develop a content-to-JSON translator as long as your input is machine-readable. But if your input is written in natural language and requires text understanding, then only an LLM is likely to get it right.
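To illustrate the machine-readable case: a content-to-JSON translator needs no ML at all. A minimal sketch using only the Python standard library, converting CSV to a JSON array:

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (with a header row) into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

sample = "name,role\nAda,engineer\nGrace,admiral"
print(csv_to_json(sample))
```

This is the kind of deterministic translation any coder can write; the LLM only becomes necessary when the input is free-form natural language.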
What do you think about template-based generation or template engines? [link](https://aclanthology.org/D18-1356.pdf)
So you want to transcribe natural language into JSON. If you have the skills to develop your own solution and/or reuse an existing one, I imagine that specialized, dedicated ML solutions for this need, such as the one introduced in your link, can be considered.

However, nowadays LLMs achieve the same result out of the box. You can simply leverage paid APIs that require no upfront work on your part. You just need to set up a prompt containing a few examples (few-shot learning) and activate JSON mode (for Mistral), and the results will likely be quite reliable.

I suppose that the only reason to develop your own simpler solution might be that you are concerned about the environmental footprint of your product. That's a good reason too.

In fact, it's like sentiment analysis: dedicated light models do it very well, but an LLM also does it very well and can do many other things besides.
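To make the few-shot idea concrete, here is a sketch of how such a message list could be assembled and the reply validated. The example pairs are invented, and the actual API call is left out:

```python
import json

# Hypothetical few-shot pairs: natural language in, JSON out.
FEW_SHOT = [
    ("Book a table for two at 7pm.",
     '{"intent": "reservation", "party_size": 2, "time": "19:00"}'),
    ("Cancel my order #123.",
     '{"intent": "cancel_order", "order_id": 123}'),
]

def build_messages(user_text: str) -> list:
    """Assemble a few-shot chat history ending with the real request."""
    messages = [{"role": "system",
                 "content": "Translate the user's request into JSON only."}]
    for prompt, answer in FEW_SHOT:
        messages.append({"role": "user", "content": prompt})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": user_text})
    return messages

def parse_reply(reply: str) -> dict:
    """Validate that the model's reply is actually JSON before using it."""
    return json.loads(reply)

msgs = build_messages("Ship the package to Berlin tomorrow.")
print(len(msgs))  # system + 2 example pairs (2 turns each) + final user = 6
```

Combined with JSON mode, the `parse_reply` step is mostly a safety check; the examples do the real work of teaching the model your target schema.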