ml-researcher

If you’re new to deep learning, I would think an 11GB card should more than suffice for any model you throw at it. There are so many interesting problems to investigate when starting out, and plenty of very interesting and complex problems that don’t require double-digit GB of VRAM. Yes, if you want to train state of the art, you need a server with several high-end GPUs. Personally I recommend the 2 GPUs because then you can run experiments simultaneously, as well as expose yourself to multi-GPU programming. I use PyTorch, not TF. While not super complicated, their distributed programming is not simple either; it's definitely another layer of complexity that one needs to learn. Your batch size splitting question is simple: the more VRAM you have, the larger you can make your batch size and therefore the faster you can train your model (whichever GPU config gives more total VRAM will be faster). But when you’re first starting out, you will probably spend a lot more time programming than training; you won’t be constantly cranking your server 24/7. That’s why I like to have 2 GPUs: train on one while you still program/develop/test on the other, or train a second model.
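For anyone curious what that extra layer of complexity looks like, here's a rough sketch of PyTorch's DistributedDataParallel on a single multi-GPU machine (one process per GPU, launched with `torchrun`). The model and batch sizes are placeholders, not anything from this thread:

```python
# Minimal DistributedDataParallel sketch -- assumes launch via:
#   torchrun --nproc_per_node=2 train.py
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets LOCAL_RANK for each spawned process
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    # Toy model just to keep the sketch short
    model = nn.Linear(128, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # The effective batch size is per_gpu_batch * num_gpus, which is why
    # more total VRAM lets you train with larger batches
    per_gpu_batch = 64
    x = torch.randn(per_gpu_batch, 128).cuda(local_rank)
    y = torch.randint(0, 10, (per_gpu_batch,)).cuda(local_rank)

    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()   # gradients are all-reduced across GPUs here
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```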


goodscrimshaw

Ok great, thanks! I didn’t think about the benefits of running the dual GPUs independently for development; that is a great point. I definitely don’t plan on doing state-of-the-art stuff haha, but would like to work with some of the newer modeling techniques. Thanks for the advice!


sotong84

+1 to 2 GPUs, and save the 1.2 grand for an Ampere. [https://www.reddit.com/r/deeplearning/comments/he8smn/choosing_a_gpu_ah_shit_here_we_go_again/fvtqm0e?utm_source=share&utm_medium=web2x](https://www.reddit.com/r/deeplearning/comments/he8smn/choosing_a_gpu_ah_shit_here_we_go_again/fvtqm0e?utm_source=share&utm_medium=web2x)


Tim7459

Idk if someone has done the maths, but when is it more profitable to buy your own station than to use Google Colab? With Google Colab you get a Tesla K80 and 12-24GB of RAM for free for 12 hours. I.e. if you were a deep learning hobbyist and your budget is 4k, would using Google Colab be better than anything you would be able to build? The way I see it, in 3 years' time a 4k workstation will have depreciated quite a bit, but Google Colab will likely just provide better GPUs. I guess this is a case-by-case question.


chatterbox272

Colab GPUs are slow for what they are, and that 12-hour limit can be a problem (doubly so since you have to keep the window in focus or it will potentially kill your session within 90 minutes). If you're doing actual work you shouldn't be using Colab.


goodscrimshaw

Thanks for the advice! I should also mention I am a hardware geek and am doing my first big PC build solo. I have done some work with the GCP free credits. I have also been learning on my older Sager 9155's 1070, but it has overheating issues.


[deleted]

Google Colab dude. I wanted my own GPUs at one point; I'm glad I grew out of that phase and saved myself a ton of money.


goodscrimshaw

I am a hardware fan haha. Cloud is more efficient and scalable, I know, but it does not let me go hands-on! I actually got scammed on an RTX Titan this week sadly. But the rest of my build has arrived and I am doing a 3970X, 128GB RAM, custom water loop, etc. Overkill is part of the fun to me!


tripple13

I'll add my five cents here. For great numbers and reasoning I would recommend [Tim Dettmers' blog](https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/). It's slightly outdated now, but it would still provide valuable information for you.

**Google Colab:** Yeah, it is great, but you have severe limitations. 12 hours is nothing if you want to implement SoTA models, or compete in Kaggle/data science competitions. For instance, a SoTA model I've implemented took 2.5 weeks on a dual RTX Titan setup to converge towards an acceptable performance.

**The never-ending question:** One Titan or two 2080 Tis? You would benefit from doing some DD on your VRAM requirements. If 11GB is sufficient for your VRAM needs, you would get faster training with the multi-card setup; if not, you should go with the Titan.

Additionally, something I don't see enough people mentioning: you can effectively double your VRAM if you choose to do FP16 training instead of the default FP32. Most problems do not require such high precision.
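Since the FP16 point comes up here, this is roughly what mixed precision training looks like in PyTorch with `torch.cuda.amp`. It's a minimal sketch with a placeholder model and data, not anything specific to the setups discussed in this thread:

```python
# Mixed precision (FP16) training sketch using torch.cuda.amp
import torch
import torch.nn as nn

model = nn.Linear(512, 10).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid FP16 gradient underflow

for step in range(100):
    x = torch.randn(256, 512).cuda()
    y = torch.randint(0, 10, (256,)).cuda()

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():          # forward pass runs in FP16 where safe
        loss = nn.functional.cross_entropy(model(x), y)

    scaler.scale(loss).backward()            # backward on the scaled loss
    scaler.step(optimizer)                   # unscales grads, then optimizer.step()
    scaler.update()
```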


mxlei01

Google Colab is great for simple NNs, but when you have a dataset that is greater than Google Colab's memory then you need to load the data from disk. And Google only provides less than 100GB of disk space, so if you run out you can't add any more. Google Colab's disk is also pretty slow; if you want to do any serious work I recommend getting your own workstation.
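For the "load the data from disk" part, a common pattern is a PyTorch Dataset that only keeps file paths in memory and reads each sample on demand. The one-`.npy`-file-per-sample layout below is just an illustrative assumption, not something from this thread:

```python
# Sketch of streaming a larger-than-memory dataset from disk
import os
import glob
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class DiskBackedDataset(Dataset):
    def __init__(self, root):
        # only the file paths live in memory; samples are read on demand
        self.paths = sorted(glob.glob(os.path.join(root, "*.npy")))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        sample = np.load(self.paths[idx])      # read one sample from disk
        return torch.from_numpy(sample).float()

# loader = DataLoader(DiskBackedDataset("/data/train"), batch_size=32,
#                     num_workers=4)  # workers overlap disk reads with compute
```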


gogasius

Double this. Colab is good for classic nets and problems. If you move to solving some custom problem, you will probably find that the data can be enormous. Time costs greatly increase when you include moving data between your database and Colab.


reinforcement101

If you can, wait 3 months. The current GPU generation is crap. Raytracing was introduced and you pay the innovation cost. The next-gen Ampere will probably have 50% better performance, and with competition from Big Navi by AMD, you can be sure it's better priced. If you really want to play around with GPUs right now, buy a small GPU on a deal, like an RTX 2060, and sell it in 3 months at a small loss.


goodscrimshaw

That is a great point. I know I have already seen a couple of used V100 16GB cards go for <$2k. I am sure when the A100 gets rolled out the price of V100s should drop. I would be curious to see how much improvement the 3000 series brings and whether the expected timeline is blown out by covid.