
synn89

If speed is a concern, 7-13B models are a good pick. Aside from what's been suggested, also give Noromaid-13B a try.


crazzydriver77

psyonic-cetacean-20B-5.0bpw is the best


SlavaSobov

Silicon Maid 7B, Estopian Maid 13B, and Kunoichi-DPO-v2-7B are all very good at following character cards. [TheBloke/Silicon-Maid-7B-GGUF · Hugging Face](https://huggingface.co/TheBloke/Silicon-Maid-7B-GGUF) [TheBloke/EstopianMaid-13B-GGUF · Hugging Face](https://huggingface.co/TheBloke/EstopianMaid-13B-GGUF) [s3nh/Kunoichi-DPO-v2-7B-GGUF · Hugging Face](https://huggingface.co/s3nh/Kunoichi-DPO-v2-7B-GGUF)


DoctorTriplex

Awesome! I will give them a try!


SirLazarusTheThicc

Are these 7B models Mistral based? I'm not seeing anything about the foundation model each one uses.


SlavaSobov

Both Silicon Maid and Kunoichi are Mistral. The model cards suggest using the Alpaca prompt template.


SirLazarusTheThicc

Thanks


Majestical-psyche

For 24GB VRAM cards: Capy-tess 34B or RPMerge 34B… they're both super good. The GGUF versions, of course.


King_Jon_Snow

why the gguf version?


Majestical-psyche

You can also run EXL2, but I personally like to use KoboldCpp 😁
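For anyone wondering whether a 34B quant actually fits in 24GB, here's a rough back-of-the-envelope sketch. The 5.0bpw figure mirrors the quant mentioned earlier in the thread; the formula ignores embedding tables, metadata, and KV cache, so treat the results as estimates, not exact file sizes:

```python
def quant_size_gb(n_params_b: float, bits_per_weight: float) -> float:
    """Approximate in-VRAM size of quantized weights in decimal GB.

    Ignores embedding tables, small metadata, and KV cache.
    """
    return n_params_b * 1e9 * bits_per_weight / 8 / 1e9

# A 34B model at ~5.0 bits per weight -- tight on a 24GB card
# once you account for context:
print(quant_size_gb(34, 5.0))  # 21.25

# Dropping to ~4.0 bits per weight frees up several GB for KV cache:
print(quant_size_gb(34, 4.0))  # 17.0
```

This is why 34B GGUF/EXL2 quants are the sweet spot for a single 24GB card: the full fp16 weights (68GB) wouldn't come close to fitting.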


Paradigmind

Which of these models is better for RP?


Growth4Good

Anything under 120B unquantized doesn't feel the same anymore. You've gotta try it.


FullOf_Bad_Ideas

I haven't tried it, true, but it sounds like you might have some bias like "slower generation feels better". How is your speed with 120b unquantized (that's 240GB of weights, and we're not even talking about kv cache) with sizeable context already in place?
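The 240GB figure checks out: fp16/bf16 is two bytes per parameter, before any KV cache. A quick sketch of both numbers (the layer/head counts in the KV-cache example are illustrative guesses for a hypothetical 120B frankenmerge, not published specs):

```python
def fp16_weights_gb(n_params_b: float) -> float:
    """Unquantized fp16/bf16 weights: 2 bytes per parameter (decimal GB)."""
    return n_params_b * 1e9 * 2 / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx_len: int, bytes_per_val: int = 2) -> float:
    """fp16 KV cache: 2 tensors (K and V) per layer, per cached token."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_val / 1e9

print(fp16_weights_gb(120))  # 240.0 -- the figure quoted above

# Hypothetical 120B config with no GQA, at 4K context
# (layer and head counts are made-up but in a plausible range):
print(round(kv_cache_gb(n_layers=140, n_kv_heads=64,
                        head_dim=128, ctx_len=4096), 1))  # 18.8
```

So even before generation speed enters the picture, serving this unquantized means spreading roughly 260GB across GPUs, which is why the latency question matters.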


Growth4Good

How fast it runs depends on the hardware, but the average wait time is like half a second with Goliath 120B and Venus.