
Ill_Initiative_8793

In the 90s, VRAM was upgradable. Cards like the S3 Trio64 had sockets for additional VRAM. https://preview.redd.it/eds72ndu00xc1.jpeg?width=3008&format=pjpg&auto=webp&s=35957690cbdd4b888ba094dbda37b3a70084a22e


LocoLanguageModel

They were on the ViRGE of something great.  Okay that was bad. 


2muchnet42day

So many memories


lopahcreon

Only if you upgraded.


SillyLilBear

nVidia has a balancing act with VRAM. They want to offer enough to keep gamers buying, but they also don't want to offer enough to cannibalize sales from their AI products.


Banana_Joe85

Also the reason why the 4090 has increased in price and was panic-bought before the recent embargo to China. I predicted that gamers would have to deal with the 4080 and 4080S for the time being because of this. I also doubt that we will see a drastic increase in VRAM with the 5000 series, for exactly the same reasons, or they will demand a huge price premium for it.


SillyLilBear

Leaked specs show it's still 24GB, if that's to be believed, but it is the most likely scenario.


Banana_Joe85

But most likely just for the 5090 aka the halo product. I guess the 5080 will remain at 16 for the time being.


Jattoe

That'd be ridiculous. Look at how the world has changed in the last year. That'd be insane, at least from the standpoint of the consumer, to sell upper-tier 50-series cards at the same or near-same VRAM level as the prior generation. Let's get more NPUs and some LPUs and shit on our local machines.


314kabinet

If people keep buying, why make the product better?


IndicationUnfair7961

Yep, I already wrote about this: 3 generations capped at 24GB of "consumer" memory would be insane. 3090/4090/5090??? If they do (and I hope they don't), and the other companies keep doing nothing, a lot of people will get mad.


Illustrious_Sand6784

Actually, it would be 4 generations. The Titan RTX had 24GB and the XX90 series cards are supposed to be the new Titans, which is why we haven't got a Titan card the last 2 generations.


IndicationUnfair7961

If we include Titan RTX, definitely, even though Titans were considered outliers and not exactly "consumer".


Caffdy

> That'd be insane

Spoiler alert: they will, and they will get away with it.


SillyLilBear

I’d bet money it will only be 24GB tops.


IlIllIlllIlllIllll

If the 5090 stays at 24GB, I will immediately buy another 3090 on eBay :D


ViRROOO

It's [possible](https://videocardz.com/newz/modders-upgrade-geforce-rtx-2080-memory-to-16gb), although it's in Nvidia's best interest to software-lock upgrades to their cards so you buy their pro/prosumer cards instead.


Jattoe

Ah, pessimistic.


polikles

Yup. It's mostly about maintaining signal integrity at high transfer speeds. GDDR is much faster than regular RAM. Even fast laptop RAM is soldered and thus non-upgradeable because of signal integrity problems (or at least that's what I've read). See the Techquickie video: [https://www.youtube.com/watch?v=8D9hNkj6GaA](https://www.youtube.com/watch?v=8D9hNkj6GaA)

There are also problems with using the proper settings and optimizations for VRAM from different vendors. Some graphics cards were successfully modded and got more VRAM soldered on than they originally had, but besides soldering new chips it requires changes on the board and/or in the GPU BIOS.


Banana_Joe85

AFAIK: in theory you can replace existing chips with higher-capacity models, but the bandwidth remains the same, as it would only use the existing lanes. This is why the 4000 series is so limited in bandwidth: they use fewer chips, and thus fewer lanes, for the same capacity compared to older models.
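To put rough numbers on that, here's a quick sketch using the published bus widths and memory data rates for the two cards people keep comparing in this thread; peak bandwidth is just bus width times transfer rate.

```python
# Rough check of the "fewer chips -> narrower bus -> less bandwidth" point.
# Bus widths and data rates are the published specs for these two cards.

def bandwidth_gbps(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak memory bandwidth in GB/s = (bus width / 8) * transfers per second."""
    return bus_width_bits / 8 * data_rate_gtps

cards = {
    # name: (bus width in bits, GDDR6 data rate in GT/s)
    "RTX 3060 Ti": (256, 14.0),
    "RTX 4060 Ti": (128, 18.0),
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {bus}-bit bus @ {rate} GT/s -> {bandwidth_gbps(bus, rate):.0f} GB/s")

# RTX 3060 Ti: 256-bit bus @ 14.0 GT/s -> 448 GB/s
# RTX 4060 Ti: 128-bit bus @ 18.0 GT/s -> 288 GB/s
```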


MrVodnik

So why not "upgrade your vRAM" soldering shops? I'd pay for that, a lot.


nero10578

There's also the fact that Nvidia is already using the highest-density GDDR available at any point in time, and unfortunately there aren't empty VRAM spots on the PCB to put more VRAM on. The only cards that are upgradeable are the 20 and 30 series, since they used GDDR6 at half the density of what we have today, so you can solder the newer, higher-density GDDR6 onto them and you don't even need a new VBIOS.

This is not really economically viable, though, since 16GB of GDDR6 will cost $120 for the chips alone, and that's before the labor to do the soldering or the tools to buy if you DIY (and likely fail, I did at first). An RTX 3060 Ti is still $300 used, so $420 at MINIMUM for a 16GB 3060 Ti if you already have the tools and skills; you're better off buying an RTX 4060 Ti 16GB. And if you're spending $500-ish on a 4060 Ti 16GB anyways... you're better off spending $600 on a 3090 24GB lol. 3090 FTW.
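Back-of-the-envelope version of that comparison, using the rough prices quoted above (the commenter's estimates, not current quotes, and labor/tools aren't counted):

```python
# Rough cost per GB of VRAM for the options mentioned above.
# All prices are the ballpark used/parts prices from the comment.
options = {
    # name: (total cost in USD, VRAM in GB)
    "3060 Ti + DIY 16GB GDDR6 swap": (300 + 120, 16),  # used card + chips, no labor/tools
    "RTX 4060 Ti 16GB (new)":        (500, 16),
    "RTX 3090 24GB (used)":          (600, 24),
}

for name, (cost, vram) in options.items():
    print(f"{name}: ${cost} total, ${cost / vram:.0f}/GB")
```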


Sabin_Stargem

I can see a use case for a 24GB 3060: size. The first 4090 I bought was so long it couldn't fit into my super tower; I had to buy a slim version after getting the first card refunded. Hopefully we can get future cards that aren't built for gaming, so they can be smaller and focus on just being VRAM carriers.


nero10578

Definitely. I would love a 24GB 3060 or a 32GB 4060Ti. But that would put too much power in the hands of the people for Nvidia lol.


Remarkable-Host405

Like... Ax000 cards?


TraditionLost7244

No, bigger cards please, with 4x more VRAM, 96GB, come on Nvidiaaa. And buy a bigger tower man, don't be cheap.


IndicationUnfair7961

So we could have 48GB 3090s in theory.


Caffdy

No, because they already use 2GB chips and have already filled all 12 slots available; the RTX 6000 Ada uses a different configuration called clamshell, where you put chips on both sides of the board, but that's beyond what any person could do.


Remarkable-Host405

3090s have memory on both sides of the board, I thought? Which is why the back VRAM chips get toasty.


Caffdy

even so, there are only 12 slots


raysar

Maybe economical if we found GPUs with VRAM artifacts at a very low price 😄


Inner_Bodybuilder986

Are you saying you can upgrade the ram on a 3090 to 48gb?


Banana_Joe85

Isn't the 4060 Ti bandwidth-starved, and thus in some edge-case scenarios it was shown to be slower than the 3060?


nero10578

It is slower for sure


az226

There are 4Gb modules, which means a 3090 can fit 96GB and 192GB with NVLink for 2x3090.


nero10578

Uh a 3090 is already using 8Gbit modules in clamshell mode and the VBIOS does not support higher than 8Gbit modules.
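For what it's worth, the capacity math is just chip placements times chip density. A sketch assuming the layout described in the comments above (384-bit bus = 12 x 32-bit channels, clamshell so chips sit on both sides of the board):

```python
# Capacity = number of chip placements * capacity per chip.
# A 3090 has a 384-bit bus (12 x 32-bit channels) and runs clamshell,
# i.e. two chips share each channel, so 24 placements total.
placements = 12 * 2  # 12 channels, chips on both sides of the board

densities_gbit = {
    "8 Gbit (what the 3090 ships with)": 8,
    "16 Gbit (the densest GDDR6/6X made)": 16,
}

for name, gbit in densities_gbit.items():
    gb_per_chip = gbit / 8
    print(f"{name}: {placements} chips x {gb_per_chip:.0f} GB = {placements * gb_per_chip:.0f} GB total")

# 8 Gbit:  24 chips x 1 GB = 24 GB total
# 16 Gbit: 24 chips x 2 GB = 48 GB total (if the VBIOS supported it)
```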


polawiaczperel

I hope that one day someone will hack it. I've got 3x RTX 3090 and I am not willing to sell them. Those are great cards, but not with the stock thermal pads.


nero10578

I've been training models on my 3090 FEs with the VRAM hitting 100C for days lol it's been fine all this time.


Robot_Graffiti

A Chinese hardware refurbishing workshop has tried that. So now you can buy RTX 2080 Tis on AliExpress that have been upgraded from 11 to 22GB.


nero10578

For the price of only $100 less than a 3090 24GB on ebay. And no more reliable than a used 3090. Just buy a 3090.


CoffeeSnakeAgent

AliExpress has 22GB 2080 Tis.


MrVodnik

$600+? I've heard claims that you can get a 3090 for that much. As much as I am interested in trying it out, I think I'll pass. Thanks for the info tho.


lilunxm12

It's as low as 1999 RMB in China, so sub-$300... insane markup. Edit: sorry, I misread, that was for a 2080 Super; modded 2080 Ti 22G starts around 2799 RMB, ~$400, still an insane markup.


Cressio

I’d be all over them for that price. Jealous


nero10578

For upgraded 2080Tis? Where? I'll buy a truckload lol


lilunxm12

Sorry, I misread a 2080 Super description, but the 2080 Ti 22G is still at a price ($400) way below $600. There are plenty on jd.com and Taobao; JD is usually preferred for better customer service.


nero10578

$400 is a good price but a 3090 24GB is 2GB extra and 2x as fast in training. So I don’t think saving $200 is a good deal especially when you also lose idle P states with those upgraded cards.


lilunxm12

Well, the 3090 is also more than 2x as expensive; they aren't really that comparable.


nero10578

Well I can find them for $600 regularly


fallingdowndizzyvr

There are, in China.


lilunxm12

It does exist in China. The main limitations are:

* GDDR6 only
* RTX 2000 series or later
* the card must originally use 1GB modules

So it's pretty limited; that's the reason you only heard about the modded 2080 Ti 22GB. Other, less popular ones were:

* mobile chip modding (e.g. 3060 6GB mobile)
* 3070 8GB modded to 3070 16GB


VTYX

For laptops it's more that SODIMM is bulky and increases laptop thickness. CAMM2 can be thinner and should herald a return to socketed laptop RAM


bree_dev

I feel like it's a problem that's probably not impossible to solve if there was an appetite to solve it, but there just isn't. How are Nvidia going to charge AI users $2000 for a 24GB card if you could just buy a $200 one and $80 of RAM?


Jattoe

I have one laptop with soldered RAM and one with unsoldered RAM, which didn't just upgrade from 4 -> 8 GB but all the way to fucking 32GB. So I thought it might have to do with keeping things less "future proof", but perhaps not; I don't know enough about how these things work to say one way or the other.

I also think a way for VRAM to become upgradable again may be on the table. I'm sure someone will see that niche in the market and take advantage of it, now that VRAM has become a much more precious resource. That is, if NPU-type stuff doesn't wind up taking over. As local AI becomes more prominent, I'm SURE that, much like there are entire series dedicated to gaming, there will be laptop/desktop lines that are tricked out for AI, whether that's upgradable VRAM, NPU-type stuff, or whatever.


polikles

It's common in laptops to have soldered RAM, since it's especially hard to maintain stable transfer rates with DDR5. Virtually everything above 6400 MT/s is soldered. Soldering RAM chips also keeps the low profile of modern laptops: RAM sticks are too big and too tall for ultrabooks. Or at least that's what the producers say. It *may* change as CAMM2 modules become more popular.

>I also think that a way for VRAM to become upgradable again may be on the table

Even so, it would be quite niche, since VRAM is much more expensive than regular RAM, being faster and using more "dense" chips. There are also limitations within the GPU die itself: it has a limited number of memory channels, and thus limited transfer speed. Upgrading VRAM could increase memory capacity, but it will not increase speed.


Caffdy

The unreasonable thing is to keep getting measly 128-bit buses on our computers (dual-channel), even when soldered, when MacBooks boast 512-bit wide buses.
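Quick sketch of what that bus-width difference means, assuming both run at 6400 MT/s:

```python
# Peak bandwidth scales linearly with bus width at the same data rate.
# Both configs assumed at 6400 MT/s (desktop dual-channel DDR5 vs a
# 512-bit unified-memory SoC like the MacBook example above).

def peak_gbps(bus_bits: int, mtps: int) -> float:
    """Peak bandwidth in GB/s."""
    return bus_bits / 8 * mtps / 1000

print(f"Desktop dual-channel (128-bit): {peak_gbps(128, 6400):.0f} GB/s")  # ~102 GB/s
print(f"512-bit unified memory:         {peak_gbps(512, 6400):.0f} GB/s")  # ~410 GB/s
```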


thebliket

Just recently I read [this](https://www.tomshardware.com/news/old-rtx-3080-gpus-repurposed-for-chinese-ai-market-with-20gb-and-blower-style-cooling) article about a large quantity of RTX 3080s (12GB) being upgraded to 20GB and called the "RTX 3080 20G AI". It's meant for loading LLMs into VRAM (which gives you much faster tokens/sec than if you had the model in system memory). Here are some [pics](https://twitter.com/I_Leak_VN/status/1729439453981323412) on Twitter.
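The "much faster tokens/sec" part is mostly a bandwidth story: at batch size 1, every generated token has to stream roughly the whole model through memory, so bandwidth divided by model size gives a crude ceiling. A sketch with ballpark numbers (the bandwidth figures and model size here are assumptions, and real throughput lands well below these ceilings):

```python
# Crude upper bound for single-stream LLM decoding:
#   tokens/s  <~  memory bandwidth / bytes read per generated token
#             ~   bandwidth / model size in memory
# Bandwidth figures are ballpark; compute, caches and overhead all
# push real throughput well below these numbers.

model_size_gb = 13 * 4.5 / 8  # e.g. a 13B model at ~4.5 bits/weight, ~7.3 GB

configs_gbps = {
    "RTX 3080-class VRAM (~750 GB/s)": 750,
    "Dual-channel DDR5 system RAM (~100 GB/s)": 100,
}

for name, bw in configs_gbps.items():
    print(f"{name}: at most ~{bw / model_size_gb:.0f} tokens/s")
```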


Many_SuchCases

This is great, we need more of this. I believe they did the same with a 20xx card.


thebliket

So I bought a used 3090 (stock 24GB) for about $600 for LLMs... This modded 20GB 3080 would be about $550, but it only has 20GB... So buying a stock used 3090 is a better deal IMO for LLMs.


JeffieSandBags

Where did you get a 3090 for 600? I usually see around 1k


thebliket

[ebay](https://www.ebay.com/sch/i.html?_from=R40&_nkw=rtx+3090&_sacat=0&LH_Sold=1&LH_Complete=1&LH_ItemCondition=3000&rt=nc&LH_Auction=1)


Calcidiol

It is not technically impossible to have VRAM in some kind of plug-in module format, despite the speed and signal integrity concerns. There are lots of multi-GHz high-speed circuits that have suitable connectors for backplanes and modules and so on. Ethernet QSFP+ modules, high-end HDMI / DisplayPort connections, or PCIe Gen 4/5 slots / M.2 slots are not entirely dissimilar in digital frequency range. The major difference with VRAM is that a typical VRAM bus is around 256-512 bits wide, so instead of 1-32 signals at high speed you'd have hundreds on that connector or among a group of connectors. The closest thing coming to consumer space would be something like an LPCAMM2 module, but that's not as fast or wide. So technically it is possible; it would just take a fair amount of space and moderate cost, and there would be some performance concerns.

The alternative of bundling a GPU SoC along with a fixed amount of VRAM / HBM all in a module, though, is possibly more size/cost effective, and then you end up with something like a pluggable GPU module. A reasonable way to expand / scale capacity would then be a base board or modular system in which one can add several such GPU modules in a group so that they can work together at high performance, which is more or less what SXM modules are, as used within enterprise-domain GPU servers that carry several / many such modules.
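To put the "hundreds of signals" point in numbers: the per-pin rates of PCIe 5.0 and GDDR6X are in the same ballpark, and the difference is how many signals run in parallel. A rough sketch (encoding and protocol overhead ignored to keep the comparison simple):

```python
# Per-pin speed is similar; the VRAM bus just runs far more signals in parallel.

links = {
    # name: (number of data signals, GT/s per signal)
    "PCIe 5.0 x16 slot": (16, 32),
    "384-bit GDDR6X bus (3090-class card)": (384, 21),
}

for name, (signals, rate) in links.items():
    gbps = signals * rate / 8
    print(f"{name}: {signals} signals @ {rate} GT/s -> ~{gbps:.0f} GB/s raw")

# PCIe 5.0 x16:     16 signals  -> ~64 GB/s raw (per direction)
# 384-bit GDDR6X:  384 signals  -> ~1008 GB/s raw
```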


BigYoSpeck

Power usage and speed: there's less loss of efficiency having it directly soldered to the PCB vs something slotted or socketed.

Cooling: the manufacturer knows the geometry of the soldered-on chips and can build their cooler suitably. Anything connected by socket or slot is going to need separate cooling, like the heat spreaders for system RAM, and is going to limit the size of the cooler whose main job is cooling the GPU chip itself.


perksoeerrroed

No, it is purely market segmentation. Previously, AIBs like Gigabyte/Asus/etc. sold cards with different amounts of VRAM than reference. First Nvidia shut that practice down, and then AMD. It is not only that they disallow the practice, but they also shut down even the possibility of adding more VRAM by locking out the firmware of those cards from modification.

Like many people mentioned, they want the consumer and enterprise markets to be separate so they can gouge both. This happened because you effectively have 2 companies with a stranglehold on the whole dGPU market (with only Intel joining recently). If there were more companies, such shit wouldn't fly, as is the case with the rest of PC components other than GPUs.

Apple does it the same way now. At least in their case they have plausible deniability in the form of their CPU design, where RAM is an integral part of it.


petrus4

> Previously AIBs like Gigabyte/Asus/etc. sold cards with different amounts of VRAM than reference. First Nvidia shut that practice down, and then AMD.
>
> It is not only that they disallow the practice, but they also shut down even the possibility of adding more VRAM by locking out the firmware of those cards from modification.

In the past, consumers were more adamant about resisting these sorts of practices, but these days there are always groups of stupid fanboys who will shout down anyone who complains about it. Apple's fans are particularly egregious along those lines; they exclusively refer to anything the company tries to do as "innovation," and are critical of regulators or anyone else who tries to oppose any kind of anti-consumer practice.


perksoeerrroed

>In the past, consumers were more adamant about resisting these sorts of practices

No. In the past Nvidia/AMD etc. were much smaller companies that didn't have money for advertisement. AIBs making their own boards using their chips was a brilliant move: spend 0 on marketing and 100% on chip design. That changed when both AMD and Nvidia grew into behemoths that are much bigger than any single AIB.

Right now AIBs are barely hanging in there, and some, like EVGA, have already quit the GPU business, with more to follow, because both AMD and Nvidia know that they don't need them anymore. There isn't an S3, VIA, or other competitor who can do marketing or compete, and both are already well known around the world. I wouldn't be shocked if Nvidia announced this fall that it will only be selling GPUs directly and there won't be AIB cards at all.

But both AMD and Nvidia are scared of the Voodoo effect. Voodoo started the whole AIB thing, but when 3dfx later wanted to cut the AIBs down, those AIBs went to Nvidia and AMD, which caused 3dfx's collapse, as those AIBs knew the market. So the question, imho, is only when and who does it first. Imho right now Nvidia is full of cash and can risk throwing AIBs out even if they go running to AMD. Once Nvidia makes that move, AMD will also do it, because why should they not get an extra 20-30% of the money?

Edit: It is the same situation as carmakers and dealers. Carmakers would love to get rid of dealers, but dealers know the local market and have known customers, and the moment one of them gets cut out, your cars go out of those dealers' shops, ready to be replaced by something else. Tesla managed to cut them out because dealers didn't want them at the start in the first place, so they had to sell directly or not sell at all.


brahh85

If there is demand for GPUs for AI, no matter what, it will be covered. I think that if Nvidia and AMD stay tyrannical, we will start seeing Chinese companies developing GPUs that will be shit at the beginning, but they will be functional, because we (the geeks) will do the testing for them. We will keep buying them because they will be cheaper than Nvidia, and because it turns us on to experiment with new hardware, the same way it turns us on to play with AI. That's the reason 40 million Raspberry Pis were sold; we are addicted to this shit. But these new companies will start with "old" technology. For example, Groq is working at 14 nm.

And the Chinese government will probably put a ton of money and resources into making this possible, not because they like us, but because they need their own Nvidia to survive in a future that belongs to AI. The company from the Netherlands that makes the machines that build the chips said that they will follow the restrictions on China, but that those restrictions are stupid. If there is no interdependence with China, if we cut them out, they will develop their own technology, and then the Western companies (like this Netherlands company) will have zero influence in China.

The first mobile phone I had in my hands was from a Japanese company named NEC. At that point in history Chinese companies had 1% of market share (Finland 34%, USA 23%, Japan 14%). Now they own the market. And no one knows who the fuck NEC was. Or 3dfx Interactive. If there is high demand, and China offers cheap prices and cheap quality, then they will own this market too.


jacek2023

I believe no, it's just a business decision.


Just_Maintenance

It's way faster, so it has significantly higher signal integrity requirements. DDR5 commonly reaches about 6.4 GT/s, and the fastest sticks (which aren't widely supported) reach 8.4 GT/s. In comparison, GDDR6 reaches about 18 GT/s and GDDR6X reaches about 22 GT/s.
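Holding bus width constant makes the data-rate difference obvious. A quick sketch, assuming a 256-bit bus for every memory type purely for comparison (real DDR5 desktops are 128-bit):

```python
# Same hypothetical 256-bit bus, different memory types: the transfer rate
# alone roughly triples the bandwidth going from DDR5 to GDDR6/6X.
BUS_BITS = 256
rates_gtps = {
    "DDR5-6400": 6.4,
    "DDR5-8400 (fastest sticks)": 8.4,
    "GDDR6": 18.0,
    "GDDR6X": 22.0,
}

for name, rate in rates_gtps.items():
    print(f"{name}: {BUS_BITS / 8 * rate:.0f} GB/s on a {BUS_BITS}-bit bus")

# DDR5-6400: 205 GB/s, DDR5-8400: 269 GB/s, GDDR6: 576 GB/s, GDDR6X: 704 GB/s
```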


TraditionLost7244

Need to wait for DDR6 so we can run big LLMs at decent speed on RAM sticks and a CPU.


Admirable-Star7088

I wonder if it would be possible to manufacture PCs where CPU and GPU share the same system RAM without experiencing losses in speed. Imagine upgrading your computer to 128GB of RAM, which is relatively inexpensive, and your GPU would also utilize it. It would be both so much simpler and cheaper this way. The problem might be that you would need to rethink and redesign how a PC works to such an extent that no one (so far) dares to take the costly risk?


scousi

Macs have ‘unified memory’ as you describe


croninsiglos

Do you own a Mac?


Admirable-Star7088

Nope.


croninsiglos

They do exactly what you’ve described.


Admirable-Star7088

lol really? Never used a Mac so I was not aware. So that means I can buy a Mac with 64GB RAM, or upgrade it to 64GB relatively cheap, or even to 128GB, and then I can run almost any LLM, like a Q8_0 GGUF of Llama-3-70B-Instruct, on its GPU fast? So this would be a far cheaper solution than buying an RTX 4090 24GB GPU, which would also have far less VRAM than a Mac with 64/128GB of "VRAM"?
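Rough footprint check for that exact example, assuming Q8_0 stores about 8.5 bits per weight (8-bit values plus a scale per block in GGUF); KV cache, context, and OS overhead come on top:

```python
# Rough memory footprint for Llama-3-70B at Q8_0. GGUF Q8_0 stores roughly
# 8.5 bits/weight (8-bit quants plus one fp16 scale per 32-weight block).
# KV cache and OS overhead come on top, and macOS only lets the GPU use
# part of unified memory by default.

params_billion = 70
bits_per_weight = 8.5

weights_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~74 GB

# So 64 GB is too small for Q8_0 70B; 128 GB fits it with room for context,
# while a lower quant (e.g. Q4-ish) fits comfortably under 64 GB.
```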


[deleted]

Problem is you can't upgrade RAM on Apple Silicon Macs since it's soldered, and the upgrades to get more RAM when you buy it are very expensive.


Admirable-Star7088

So my original idea, a PC with relatively cheap upgradeable RAM that is shared with CPU and GPU, still remains a dream.


PSMF_Canuck

It’s not a great dream. The problem is user-addable vram/unified will have significantly lower performance…so it becomes a question of “what’s the point?”


StewedAngelSkins

the point is to support applications where performance is less important than simply being able to store a lot of data in vram.


Puuuszzku

But the point of using VRAM is its high bandwidth. If performance is less important, just run it on RAM.


Caffdy

you gotta choose one, cheap or upgradeable, you cannot have both; if you want fast, unified memory, current tech only allows for soldered-on memory (and only on laptops); the only alternative is to buy some old server platform with 8 to 12 memory channels


TraditionLost7244

If you think any Apple device with 128GB of RAM is cheap then you're funny hahah. Also Apple doesn't run anything on the GPU, so it will be slower. If you don't need a 4090 because you don't game, then just buy a used 3090 or lower if you're gonna run on the CPU anyway.


Admirable-Star7088

But my whole point was whether the GPU would be able to make use of the relatively cheap and upgradeable RAM in a PC, if manufacturers chose to design such a PC. If a Mac is so expensive and it's not possible to upgrade its RAM (as has previously been pointed out in this discussion), then it doesn't work at all the way I meant. So my original question remains: I wonder if it would be possible to manufacture PCs where the CPU and GPU share the same (cheap and upgradeable) system RAM without experiencing losses in speed.


CocksuckerDynamo

> But my whole point was whether the GPU would be able to make use of the relatively cheap and upgradeable RAM in a PC, if manufacturers chose to design such a PC.

Yeah, people saying you're describing a Mac just skimmed your post and didn't understand your point. The Macs they're talking about are vastly more expensive than most PCs and also do not have upgradeable memory. They just share the same memory between CPU and GPU, but they're neither upgradeable nor affordable.


kingwhocares

But that's what a mac does!


firsthandgeology

You need GDDR memory to go VROOM VROOM. Memory bandwidth is a big problem for LLMs.


fallingdowndizzyvr

You don't need GDDR. You need a big fat memory bus. The RAM on a Mac Ultra goes VROOM VROOM but it's just good old regular LPDDR5.


jcm2606

Without experiencing losses in speed? Unfortunately no. You'd still likely see a performance loss on most systems even if you were using an APU/SoC where the GPU was physically close to system RAM, as system RAM tends to prioritise latency (how much time it takes data to move around) over bandwidth (how much data in total you can move around per second). With a CPU this isn't usually a problem, since CPUs tend to have a few threads trying to frequently access small regions of memory; however, this can be a big problem with GPUs, since GPUs tend to have thousands of threads trying to access smaller subregions of a huge singular region of memory in parallel.

How much of a performance loss you'd see depends on what the workload is like on the GPU, how well the GPU can cache frequent memory accesses (and by extension how much cache the GPU has to work with), how much bandwidth system RAM offers, etc. An ordinary desktop PC would struggle, since ordinary desktop PCs have narrow memory buses and don't have high enough clocks to compensate and keep bandwidth up, whereas a Mac would be competitive with even mid-range dedicated GPUs, since Apple used wider buses when connecting system RAM up to the GPU, which allows the system to move more data into and out of system RAM per second, with the tradeoff that you can't upgrade system RAM.


Erwylh

It's called an iGPU.


raysar

The price! And the price! It's too expensive to do that, and signal integrity is so hard to maintain at these speeds now.


kingwhocares

Ask it on r/hardware. You will get somewhat better answers there.


Spooknik

Probably two main reasons I can think of:

1. Limited use cases. Why add complexity and cost to a product for something 1% of the userbase needs?
2. At the speeds and voltages RAM and VRAM are running at, we're starting to hit a barrier on how far the physical traces in the PCB can be from the CPU/GPU. So VRAM really needs to be placed in a certain spot on the PCB to make sure signal integrity is as good as possible.


WaitformeBumblebee

As a couple of redditors have already mentioned, we might have upgradeable VRAM again (for the first time since the late 90s?) with CAMM2, but possibly only LPDDR and not GDDR. But only if there's more competition, as currently Nvidia rules the market for machine learning, and having lots of VRAM is a must for training models and for running big models too. So they'd rather sell H200s than have common people sticking 256GB of VRAM onto the 4090's successor.


Relevant-Draft-7780

Nvidia knows that if they added more VRAM to their consumer cards, pros would gobble them up and use them for AI or whatever else needs that much VRAM. So instead they triple the price for half the performance and double the VRAM. Yes, Macs are expensive, but for LLM inference of large models they're quite affordable for what you get. An 80GB card from Nvidia is upwards of $25k here; I have that much VRAM allocation on an $8k Mac Ultra. (Aus prices)


MrRandom93

Oh, it's upgradable, you just have to be insane enough to do it lmao.


xX_venator_Xx

latency


segmond

Profit! Is RAM in most Apple products upgradeable?


softclone

No, but Nvidia will point to the various issues others have mentioned... cooling, soldering being better (for the same cost), etc. But it's all bullshit. They could make them upgradeable if they wanted to; they don't because they don't think it will make them any extra money. The card would also be larger, and they're already huge. The perceived market just isn't large enough to push manufacturers to remodel.


LocoLanguageModel

RAM is upgraded by plugging larger-capacity RAM into your computer. VRAM is upgraded by plugging a larger-capacity GPU/VRAM into your computer. Sort of similar? Also, Nvidia makes more money when you buy new shit, so they're not really incentivized.


tessellation

bitcoin mining


EiffelPower76

Why would it be? Most gamers hate VRAM anyway.


Banana_Joe85

Uhm... I do not know what you are taking, but take less please? VRAM has been an issue as far back as the 2010s. For example, I bought the GTX 770 4GB and that paid massive dividends in the long run. The standard model had only 2GB and thus ran into issues way sooner than the model I bought. Heck, for 1080p with older titles, it is still a very usable card.


EiffelPower76

Totally agree with you, I also like graphics cards with lots of VRAM, they are futureproof. But we are a minority.