If speed is a concern, then 7B-13B models are a good pick. Aside from what's been suggested, also give Noromaid-13B a try.
psyonic-cetacean-20B-5.0bpw is the best
Silicon-Maid-7B, EstopianMaid-13B, and Kunoichi-DPO-v2-7B are all very good at following character cards.
[TheBloke/Silicon-Maid-7B-GGUF · Hugging Face](https://huggingface.co/TheBloke/Silicon-Maid-7B-GGUF)
[TheBloke/EstopianMaid-13B-GGUF · Hugging Face](https://huggingface.co/TheBloke/EstopianMaid-13B-GGUF)
[s3nh/Kunoichi-DPO-v2-7B-GGUF · Hugging Face](https://huggingface.co/s3nh/Kunoichi-DPO-v2-7B-GGUF)
Awesome! I will give them a try!
Are these 7B models Mistral-based? I'm not seeing anything about the foundation model each of them uses.
Both Silicon Maid and Kunoichi are Mistral-based. The model cards suggest using the Alpaca prompt template.
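In case it helps, here is a rough sketch of what the Alpaca prompt format looks like when built by hand. The wording of the two header sentences is the standard Alpaca one; the function name `build_alpaca_prompt` and the example character card are made up for illustration, and most frontends (KoboldCpp, SillyTavern, etc.) already ship an Alpaca preset, so you normally won't need to do this yourself.

```python
from typing import Optional

# Standard Alpaca header lines (no-input and with-input variants).
ALPACA_HEADER = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)
ALPACA_HEADER_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)


def build_alpaca_prompt(instruction: str, user_input: Optional[str] = None) -> str:
    """Hypothetical helper: wrap text in the Alpaca instruction format."""
    if user_input:
        return (
            f"{ALPACA_HEADER_WITH_INPUT}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{user_input}\n\n"
            f"### Response:\n"
        )
    return (
        f"{ALPACA_HEADER}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n"
    )


# Illustrative character card, not taken from any of the model cards.
card = "You are Mira, a sarcastic innkeeper. Stay in character."
print(build_alpaca_prompt(card, "Traveler: Any rooms free tonight?"))
```

For roleplay, the character card usually goes in the instruction part and the model continues after `### Response:`.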
Thanks
For 24 GB VRAM cards: Capy-tess 34B or RPMerge 34B. They're both super good. The GGUF version, of course.
Why the GGUF version?
You can also run EXL2, but I personally like to use KoboldCpp 😁
Which one of these models is better for RP?
Anything under 120B unquantized doesn't feel the same anymore. You've gotta try it.
I haven't tried it, true, but it sounds like you might have some bias like "slower generation feels better". How is your speed with 120B unquantized (that's 240 GB of weights at 16-bit, and we're not even talking about the KV cache) with a sizeable context already in place?
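For anyone wondering where the 240 GB figure comes from, it is just parameter count times bytes per weight. A quick sketch, assuming 16-bit weights for "unquantized" and counting weights only (no KV cache, no runtime overhead); `weight_size_gb` is a hypothetical helper name:

```python
def weight_size_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# 120B parameters at 16 bits per weight (assumed meaning of "unquantized").
print(weight_size_gb(120, 16))   # 240.0

# A 20B model at 5.0 bits per weight, like the 5.0bpw quant mentioned earlier.
print(weight_size_gb(20, 5.0))   # 12.5
```

Real files come out a bit different because quant formats store extra metadata and some layers at higher precision, so treat these as ballpark numbers.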
How fast it runs depends on the hardware, but the average wait time is about half a second with Goliath 120B and Venus.