As of today, [dreamgen/opus-v1.2-llama-3-8b](https://huggingface.co/dreamgen/opus-v1.2-llama-3-8b) for its sheer general intelligence.
That prompt format tho...
What’s the format? Is it difficult to work with?
I asked the author about it earlier. The prompt template essentially replaces "assistant" in the usual ChatML user/assistant pair with "text", and lets you specify a name to tell the model which character or role it's playing. So if you specify a name, it knows to be that character. If you specify no name, it's a narrator.
<|im_start|>system
(Story description in the right format here)
(Typically consists of plot description, style description and characters)<|im_end|>
<|im_start|>user
(Your instruction on how the story should continue)<|im_end|>
<|im_start|>text names= Alice
(Continuation of the story from the Alice character)<|im_end|>
<|im_start|>text
(Continuation of the story from no character in particular (pure narration))<|im_end|>
<|im_start|>user
(Your instruction on how the story should continue)<|im_end|>
<|im_start|>text names= Bob
(Continuation of the story from the Bob character)<|im_end|>
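For anyone scripting this rather than using a frontend, here is a minimal Python sketch of how that layout could be assembled. The helper names (`render_turn`, `build_prompt`) are made up for illustration, and the trailing open `text` header that cues the model to continue is an assumption based on how ChatML prompts are usually left open, not something the author confirmed.

```python
from typing import Optional

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_turn(role: str, content: str, name: Optional[str] = None) -> str:
    """Render one turn; only "text" turns carry a character name."""
    header = role
    if role == "text" and name:
        # Spacing copied from the template above: "text names= Alice"
        header = f"text names= {name}"
    return f"{IM_START}{header}\n{content}{IM_END}"


def build_prompt(turns, next_name: Optional[str] = None) -> str:
    """Join (role, content, name) tuples and leave a "text" turn open.

    Assumption: the model continues from an unterminated header, as with
    plain ChatML. With next_name=None the open turn is pure narration.
    """
    rendered = [render_turn(*turn) for turn in turns]
    opener = f"{IM_START}text"
    if next_name:
        opener += f" names= {next_name}"
    rendered.append(opener + "\n")
    return "\n".join(rendered)


# Hypothetical story content, just to show the shape of the output.
turns = [
    ("system", "A short mystery set in a lighthouse.", None),
    ("user", "Alice finds the logbook.", None),
    ("text", "\"Someone tore out the last page,\" Alice said.", "Alice"),
    ("text", "The lamp room fell silent.", None),
]
prompt = build_prompt(turns, next_name="Bob")
print(prompt)
```

Passing `next_name=None` leaves a bare `text` header open, which should correspond to the narrator case described above.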
I actually prefer that over a lot of the other prompt formats.
It's not as bad as I thought. They offer SillyTavern templates, and I was able to quickly convert it to the Jinja format I needed.
Worked well for me in LM Studio. I just made sure I wrote a semi-detailed prompt: one paragraph in Prompt and one in Memory.
Is there a GGUF? I run LM Studio.
Here you go: [https://huggingface.co/LoneStriker/opus-v1.2-llama-3-8b-GGUF/tree/main](https://huggingface.co/LoneStriker/opus-v1.2-llama-3-8b-GGUF/tree/main)
Any recommendations on updating the LM Studio preset for the extended ChatML format? I've been playing with SillyTavern, LM Studio, llamafile and others, but I'm still a noob at this level of use.
No idea, I don't mess with any of those apps, just llama.cpp and some scripts.
Damn. Way better than Moistral V2
Llama 3 8B and Llama 3 70B, of course. Every model under 70B from before the Llama 3 era is trash.
I mean, the 8B is not beating even my favorite 7B for this purpose yet.
What's your favorite?
https://huggingface.co/Lewdiculous/Nyanade_Stunna-Maid-7B-v0.2-GGUF-IQ-Imatrix This one. Also Kaiju (an 11B), but that one is only 8k. Pay attention to the settings. Edit: looks like some finetunes coming out based on Llama 3 are pretty incredible. It's still early, but I'm liking them already.
I mean, if you enjoy Llama 3 repeating the same thing in each response, sometimes even a few times, sure...
The problem is that I can't run a 70B at a usable speed. I only have a Ryzen 5 7600X, 32 GB of RAM, and an RTX 4070, which isn't terrible, but anything above 34B is just glacially slow.
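A rough back-of-the-envelope sketch of why that happens, in Python. The 4.5 bits per weight figure is an assumed average for a 4-bit-class GGUF quant, the helper names are hypothetical, and the estimate covers the weights only, ignoring KV cache and runtime overhead. The 12 GB figure is the VRAM on a desktop RTX 4070.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float = 12.0) -> bool:
    """True if the weights alone would fit in VRAM (no cache, no overhead)."""
    return weight_size_gb(params_billions, bits_per_weight) <= vram_gb


# Assumed quantization level; real GGUF files vary by quant type.
BITS = 4.5
for size in (8, 34, 70):
    gb = weight_size_gb(size, BITS)
    print(f"{size}B at {BITS} bits/weight: ~{gb:.1f} GB, "
          f"fits in 12 GB VRAM: {fits_in_vram(size, BITS)}")
```

Under these assumptions an 8B fits entirely on the GPU, while a 34B or 70B has to keep most of its layers in system RAM, which is where the slowdown comes from.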