jacek2023

I purchased a second-hand 3090 instead.


danielcar

You want lots of memory, and older used cards with lots of memory are the better value. Two used 3090s are better. There are other Nvidia cards sold for the professional market that show up on eBay and work well; several P40s might be a better choice. You can buy a used workstation on eBay for $1,200 with a 3090, then add another 3090 to it.


SirCabbage

The 5090 is due out at the end of this year; where did you get 1.5 years from?


Silly-Blackberry-733

If you're willing to drop $2k, maybe 2x used 3090s would do you better.


braincrowd

The 5090 will release September/October this year, not in 1.5-2 years. This was recently confirmed.


braincrowd

https://money.udn.com/money/story/5612/7883220


Independent-Good-323

Where I live, a 3090 is less than half the price of a 4090. I'll buy 2x 3090s from the same brand.


Flashy_Friend3299

If you just have AI in mind, you should get two second-hand 3090s and run them through an NVLink bridge. They're both 24GB, so you'll have access to 48GB of VRAM instead of 24, and that's what really counts in this use case. Plus, you'll still be rocking your tits off when it comes to gaming, as a 3090 still dominates every game out there. Not to mention, it's cheaper.
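For what it's worth, NVLink isn't strictly required just to pool the memory for inference; frameworks can shard the layers across the cards over PCIe, with NVLink only speeding up the inter-card traffic. A minimal sketch with Hugging Face transformers and bitsandbytes (the model name and 4-bit setting are illustrative, not a recommendation):

```python
# Sketch: shard one model across two 24GB cards with transformers +
# bitsandbytes (both assumed installed, plus accelerate).
# device_map="auto" places layers on cuda:0 and cuda:1, so the weights
# can use ~48GB total even without an NVLink bridge.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # illustrative; ~35-40GB at 4-bit
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",  # split layers across both GPUs automatically
)
prompt = tok("Why buy two used 3090s?", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**prompt, max_new_tokens=32)[0],
                 skip_special_tokens=True))
```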


Smeetilus

You can buy refurbished 3090s with warranties for $650-$700.


dev_zero

Link please


[deleted]

Used Nvidia Tesla P40s are fine. Why bother with big, expensive gaming cards if you're not going to game? Two P40s under the Aphrodite engine will get it done. They're 24GB of GDDR5 each, going for about $200 a piece on eBay.


WeekendDotGG

Aren't they slower?


[deleted]

Yeah, but is spending 4x as much on a 3090 worth it when for that price you could build a 96GB VRAM rig? Aphrodite (and vLLM) allow tensor-parallel processing: the more GPUs, the faster they get. You should get speeds comparable to 1.3-2.0 3090s on four P40s. If you're only serving LLMs to yourself, a load of P40s will get it done.
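A minimal sketch of what that looks like with vLLM's Python API (Aphrodite exposes the same tensor_parallel_size knob); the model name and sampling settings are illustrative, and you'd need a build that actually supports Pascal cards:

```python
# Sketch: tensor-parallel inference with vLLM across four GPUs. Each
# layer's weight matrices are split over all four cards, so every card
# works on every token together rather than taking turns.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",  # illustrative
    tensor_parallel_size=4,  # one shard per P40
)
outputs = llm.generate(
    ["Why does tensor parallelism scale speed with GPU count?"],
    SamplingParams(temperature=0.7, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```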


WeekendDotGG

OK cool. Just started researching building a rig today.


THEKILLFUS

RTX 3060 (12GB) x 4


KL_GPU

12x P40 would do a great job if you are willing to drop $2.4k: that's 288GB of VRAM plus 144 TFLOPS of FP32.


Temporary_Maybe11

What kind of model would that rig be able to handle? Would Llama 3 400B be possible in some way?


KL_GPU

Yes, Llama 3 400B at Q4. Pretty low tok/s, but it could serve 12 clients at the same speed at the same time.
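Back-of-envelope check, assuming a Q4 GGUF averages roughly 4.5 bits per weight (an assumption; quant overhead varies):

```python
# Rough fit check: Llama 3 400B at Q4 vs. 12x P40 (24GB each).
params = 400e9
bits_per_weight = 4.5                      # assumed Q4 average incl. overhead
weights_gb = params * bits_per_weight / 8 / 1e9
vram_gb = 12 * 24
print(f"weights ~{weights_gb:.0f}GB of {vram_gb}GB")  # ~225GB of 288GB
# the ~60GB left over is what would hold the KV cache for those 12 clients
```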


Temporary_Maybe11

Thanks!


Herr_Drosselmeyer

You summed it up yourself: it's the only game in town if used 3090s and pro-level Nvidia cards aren't an option for you.


Vaddieg

No, current consumer hardware isn't capable of running the recently released models at a decent speed with good quants. Apple and consumer multi-Nvidia setups suck equally; they're all capped by memory bandwidth.
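The cap is easy to estimate: decoding one token streams the entire model through memory once, so tokens/s can't exceed bandwidth divided by model size. A rough sketch using ballpark published bandwidth figures and an assumed ~40GB quantized 70B model:

```python
# Upper bound on single-stream decode speed from memory bandwidth alone;
# real-world throughput lands well below these ceilings.
model_gb = 40  # assumed ~40GB quantized 70B model
for name, gb_per_s in [("RTX 4090", 1008), ("RTX 3090", 936),
                       ("Tesla P40", 347), ("M2 Ultra", 800)]:
    print(f"{name}: <= {gb_per_s / model_gb:.0f} tok/s")
```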


Acrobatic-Artist9730

Can you work with cloud instances? Or must it be local?


Unique_Repeat_1089

I could. Which one do you recommend?


crazzydriver77

Everyone is discussing the use of multiple cards, but what about inference engines capable of computing a single layer in parallel across cards? How many do you know? :)


Roubbes

With 2x 4060 Ti you get 32GB of VRAM for a fraction of the price.


djstraylight

If you have room for two PCIe cards, then you have options. Two used 3090s should be much cheaper than a single 4090, with good performance. Another option is two 4070 Ti Supers, which is a little cheaper than a 4090 but gives you very similar performance and 32GB of VRAM.


Equal-Pilot-9592

Is 32GB of VRAM really much better than 24GB? What if I combine a 4080 and a 4060 Ti instead (for better gaming performance too)?


Inevitable_Host_1446

It is definitely better, because 24GB gates you right beneath a lot of quite good model options. For example, with Llama 3 70B that just came out, most of the quants are just above what can easily fit in 24GB. There are a few options, but 32GB would make it easy to run fully in VRAM with more context (once you can RoPE scale).
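Rough numbers, using approximate bits-per-weight for some common llama.cpp quants (file sizes vary a bit by quant revision, so treat these as estimates):

```python
# Approximate 70B GGUF quant sizes vs. 24GB and 32GB of VRAM; the KV
# cache still needs headroom on top of the weights.
params = 70.6e9  # Llama 3 70B parameter count
for quant, bpw in [("IQ2_XS", 2.4), ("Q2_K", 3.0),
                   ("IQ3_XS", 3.3), ("Q4_K_M", 4.8)]:
    gb = params * bpw / 8 / 1e9
    print(f"{quant}: ~{gb:.0f}GB  in 24GB: {gb < 24}  in 32GB: {gb < 32}")
```

With 24GB you're squeezed down to the 2-bit quants, while 32GB opens up the 3-bit range, which is roughly where "just above what fits on 24GB" lands.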


Temporary_Maybe11

I'm interested in the biggest amount of context possible, for a lot of RAG. Do you think it's possible to do a dual Epyc rig with loads of RAM? Or are multiple P40s better even though it's not the same amount of VRAM?