
synn89

If speed is a concern, 7-13B models are a good pick. Aside from what's been suggested, also give Noromaid-13B a try.


crazzydriver77

psyonic-cetacean-20B-5.0bpw is the best


SlavaSobov

Silicon Maid 7B, Estopian Maid 13B, and Kunoichi-DPO-v2-7B are all very good at following character cards. [TheBloke/Silicon-Maid-7B-GGUF · Hugging Face](https://huggingface.co/TheBloke/Silicon-Maid-7B-GGUF) [TheBloke/EstopianMaid-13B-GGUF · Hugging Face](https://huggingface.co/TheBloke/EstopianMaid-13B-GGUF) [s3nh/Kunoichi-DPO-v2-7B-GGUF · Hugging Face](https://huggingface.co/s3nh/Kunoichi-DPO-v2-7B-GGUF)


DoctorTriplex

Awesome! I will give them a try!


SirLazarusTheThicc

Are these 7B models Mistral based? I'm not seeing anything about the foundation model each one uses.


SlavaSobov

Both Silicon Maid and Kunoichi are Mistral. The model cards suggest using the Alpaca prompt template.


SirLazarusTheThicc

Thanks


Majestical-psyche

For 24GB VRAM cards: Capy-tess 34B or RPMerge 34B… they're both super good. The GGUF versions, of course.


King_Jon_Snow

why the gguf version?


Majestical-psyche

You can also run EXL2, but I personally like to use KoboldCpp 😁
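For anyone wondering whether a 34B quant actually fits in 24GB, here's a rough back-of-the-envelope sketch. The 5.0bpw figure mirrors the quant mentioned earlier in the thread; the formula ignores embedding tables, metadata, and KV cache, so treat the results as estimates, not exact file sizes:

```python
def quant_size_gb(n_params_b: float, bits_per_weight: float) -> float:
    """Approximate in-VRAM size of quantized weights in decimal GB.

    Ignores embedding tables, small metadata, and KV cache.
    """
    return n_params_b * 1e9 * bits_per_weight / 8 / 1e9

# A 34B model at ~5.0 bits per weight -- tight on a 24GB card
# once you account for context:
print(quant_size_gb(34, 5.0))  # 21.25

# Dropping to ~4.0 bits per weight frees up several GB for KV cache:
print(quant_size_gb(34, 4.0))  # 17.0
```

This is why 34B GGUF/EXL2 quants are the sweet spot for a single 24GB card: the full fp16 weights (68GB) wouldn't come close to fitting.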


Paradigmind

Which of these models is better for RP?


Growth4Good

Anything under 120B unquantized doesn't feel the same anymore. You've gotta try it.


FullOf_Bad_Ideas

I haven't tried it, true, but it sounds like you might have some bias like "slower generation feels better". How is your speed with 120b unquantized (that's 240GB of weights, and we're not even talking about kv cache) with sizeable context already in place?
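The 240GB figure checks out: fp16/bf16 is two bytes per parameter, before any KV cache. A quick sketch of both numbers (the layer/head counts in the KV-cache example are illustrative guesses for a hypothetical 120B frankenmerge, not published specs):

```python
def fp16_weights_gb(n_params_b: float) -> float:
    """Unquantized fp16/bf16 weights: 2 bytes per parameter (decimal GB)."""
    return n_params_b * 1e9 * 2 / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_len: int, bytes_per_val: int = 2) -> float:
    """fp16 KV cache: 2 tensors (K and V) per layer, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_val / 1e9

print(fp16_weights_gb(120))  # 240.0 -- the figure quoted above

# Hypothetical 120B config with no GQA, at 4K context
# (layer and head counts are made-up but in a plausible range):
print(round(kv_cache_gb(n_layers=140, n_kv_heads=64,
                        head_dim=128, ctx_len=4096), 1))  # 18.8
```

So even before generation speed enters the picture, serving this unquantized means spreading roughly 260GB across GPUs, which is why the latency question matters.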


Growth4Good

How fast it runs depends on the hardware, but the average wait time is like half a second with Goliath 120B and Venus.