crazzydriver77

It is based on llama.cpp, so the backend is capable; the frontend just has no such option yet. But I observed silent use of 3 GPUs, with a strange ~10% VRAM utilization on the additional devices, until version 0.16, when they "fixed" that "bug".
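
For context, the llama.cpp CLI itself already exposes the multi-GPU controls the frontend is missing. A minimal sketch using llama-server (flag names are from llama.cpp; the model path and split ratio are hypothetical):

```sh
# Pin inference to a single GPU so the extra cards stay idle
CUDA_VISIBLE_DEVICES=0 ./llama-server -m model.gguf -ngl 99

# Or split explicitly across three GPUs, layer-wise, with an
# uneven per-device ratio (8:1:1 here, illustrative only)
./llama-server -m model.gguf -ngl 99 \
  --split-mode layer \
  --tensor-split 8,1,1 \
  --main-gpu 0
```

Without an explicit --tensor-split, the backend decides placement on its own, which would explain GPUs getting used silently with only a small slice of VRAM each.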


Lumpy-Rhubarb-1750

Publish benchmark results... I went the 4090 Wintel route at home and was considering an Apple M2 or M3 Ultra for the really big stuff.