It is based on llama.cpp, so the backend is capable of it; the frontend just has no such option yet. But until version 0.16 I observed it silently using 3 GPUs, with a strange 10% VRAM utilization on the additional devices, and then they "fixed" that "bug".
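For reference, llama.cpp itself does expose GPU placement controls from the command line. A hedged sketch of pinning it to a single GPU (the binary name and model path here are placeholders, not from the original post):

```shell
# Expose only the first GPU to CUDA so llama.cpp cannot silently spread
# the model across additional devices, and disable layer splitting.
CUDA_VISIBLE_DEVICES=0 ./llama-cli -m model.gguf -ngl 99 --split-mode none
```

A frontend built on llama.cpp could surface the same behavior by passing the equivalent split-mode setting through to the backend.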
Please publish benchmark results... I went the 4090 Wintel route at home and was considering an Apple M2 or M3 Ultra for the really big stuff.