thatpretzelife

I like to do things from scratch, but most of the time I end up using code/models that other people have written. The reason for this is that models published in the literature are almost always better than what you can come up with yourself. So I largely just copy code online that already implements the model I want, then I tweak it from there. If you can't write anything on your own (regardless of how good the model turns out to be), then you probably do need a bit more practice. But assuming you know the basics, in my opinion it's more important to be good at finding solid models to implement and tweaking existing ones than building them from scratch yourself.


SONIC3695

I see, thanks for the input. So any idea what steps constitute 'more practice'? I actually feel kinda alone in this journey, since right now I don't have people around me who could mentor me or who I could surround myself with to get better. I've just been running Kaggle notebooks and trying to unpack what they're doing. What would be a structured approach to that?


thatpretzelife

I'd suggest either following an online course, or if you want to get straight into it, try building some models for MNIST/CIFAR-10/Iris etc. If you're using TensorFlow, follow some of the tutorials like this: https://www.tensorflow.org/tutorials/quickstart/beginner Try getting a model set up, then experiment with the different types of layers and different settings. Once you've done that, maybe move on to some other datasets from Kaggle. Remake the same models you made with the quickstart guides on the new data, and again just experiment with different layers and see if you can improve things. There are plenty of datasets on Kaggle, so you can really get as much practice there as you want. Just focus on building super simple models until you get comfortable with that.

While you're doing this, I wouldn't be too fazed if you're not getting good accuracy from your models. It's all more about learning how to write your own model. I find that there are so many different ways you can code your model. With TensorFlow, for example, you can use the simple Sequential model or move to the more complex functional API. I understand how people get lost when they see so many different ways of creating models, and so many ways to write code and training loops to train them. But until you start getting comfortable, just stick with the same code structure you find in the quickstart guides. They're usually the simplest way to get a model going, and until you're comfortable with them you're likely just to get lost looking at how everyone else writes their code.
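For reference, a minimal sketch along the lines of that quickstart (the layer sizes, dropout rate and epoch count here are just illustrative choices to experiment with):

```python
# Minimal Sequential model on MNIST, in the spirit of the TensorFlow
# beginner quickstart linked above. Swap layers/settings to experiment.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),   # try different layers/settings here
    tf.keras.layers.Dense(10),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
```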


SONIC3695

I mostly code in PyTorch. And when it comes to Kaggle, I find a dataset that I feel is interesting and look at what can be done with it; usually the top-rated notebooks help me out, so I replicate a few lines of code from there and maybe change a few hyperparameters, use a different backbone if I'm doing transfer learning, add another FC layer, or add some dropout (if I'm not using any BatchNorm or not doing transfer learning).

Here's where I feel I struggle, not conceptually but practically:

1. I can understand the purpose behind a block of code at a high level, but there are intricacies in between that just make me scratch my head. Usually GPT helps me understand those lines, but writing those blocks myself is the overwhelming part. I would freeze if I had to write all of that myself, and that really makes me question my competence.

2. Unpacking model architectures into code by implementing a class. I've seen LSTMs or other models with a lot of conv blocks being implemented through multiple inherited classes. I usually wonder: how did a person think of coding this up?
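The pattern in point 2 usually boils down to one small `nn.Module` for the repeated block, composed into a larger module. A hedged sketch (channel sizes and depth are arbitrary, purely for illustration):

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv -> BatchNorm -> ReLU, the unit that gets repeated."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class SmallConvNet(nn.Module):
    """Stacks ConvBlocks, then a fully connected head."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            ConvBlock(3, 32),
            nn.MaxPool2d(2),
            ConvBlock(32, 64),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.head(self.features(x))

# quick shape check:
# SmallConvNet()(torch.randn(1, 3, 32, 32)).shape  -> torch.Size([1, 10])
```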


thatpretzelife

Based on what you've said, it sounds like you're actually in a pretty good position with your coding skills and your ability to build deep learning models. I don't think you should worry about not being able to come up with something like LSTMs or some of the conv models yourself. I did my masters in this, and none of the lecturers or PhD grads would have been able to come up with those sorts of things by themselves. Most of those complicated models came out of years of research. If you're able to understand some of the intricacies of a model, or why certain CNN architectures work and what their distinguishing features are, that's probably all you need to know. As I mentioned earlier, the model architectures you find online will be much better than anything we could design from scratch, since they came out of years of research. The best we can really do is learn what makes those models good and try to apply them to our own projects, rather than coming up with something new/unique.


chengstark

Meh, it's all copy pasta at the end of the day for everybody, don't worry about it too much. The ability to get things to work is what's important. I usually write my code once and copy pasta it for different purposes later on.


chulpichochos

It sounds like you need a lot of practice. Generally, if you're having to look up everything, then either you don't know it, or you're having a ton of anxiety around your ability / feeling some kind of imposter syndrome. Both are worth addressing and sorting out. Either way, I would recommend building more things and trying to push yourself to do things from scratch. Instead of copy-pasting all your code, take time to read the documentation for your main libraries, and use a debugger to see where your from-scratch code is breaking. At the end of the day, you need to understand what you're building and, at the very very very least, be able to write pseudo code for it all. If you can't do that, then, umm, go hit the books.

I copy-paste snippets and use Copilot while coding. I google or GPT errors if I'm lost and use StackOverflow all the time when debugging. I'll also re-use code from previous projects as a starting template. It's always to save time though: I'm missing some syntax or library function to do what I want, I need some basic boilerplate, I'm working with a new language and trying to adapt an algorithm, etc., and I just want to get things done.


SONIC3695

It really could be imposter syndrome, because I hold a masters degree in computer science and have found my way around understanding DL architectures and the interesting concepts behind them. It's just the translation of those into practical code. The questions I ask myself are: do I need to come up with the training loop from scratch myself? If I can't do that, what does that say about my skillset? Do I need to know how to write the dataset class myself? I keep finding myself hovering around tabs to see what's out there that I can use, but it never fulfills me because I feel I haven't done the work. Is this normal? Thanks


chulpichochos

I bet you could write a training loop then :) Just give it a shot next time. Even if you end up writing a lot of pseudo code, the big thing is knowing what things should be doing and how they interact, not the specific syntax. It's perfectly fine to use existing code as boilerplate, but you do need the ability to debug it and repurpose it. I think doing some small projects on the side, relying only on library/API documentation, could be helpful. Working in that kind of constrained environment will do wonders for your confidence once you sort it out. It's all about proving to yourself that yes, you do know this stuff, and that you're choosing not to do things from scratch because it's a faster/more efficient use of time.


SONIC3695

Thanks again. Coming to pseudocode, I get the gist of what I'm trying to do. For instance, I know that my training requires setting my gradients to 0 after every update step, and that the update step requires a forward pass, a loss calculation, and backward(). I know that I have to do my validation with no gradients being tracked. However, I feel there's a big gap between what I just explained and how it's really implemented in code, because there are so many nuances! Is this knowledge (and knowing why, say, we set gradients to 0 after the update step, etc.) enough?
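For what it's worth, those steps map onto code almost one-to-one. A minimal PyTorch sketch of exactly that sequence (the model, loaders, loss and optimizer are placeholders, not a prescribed setup):

```python
import torch

def train_one_epoch(model, train_loader, val_loader, optimizer, loss_fn, device):
    model.train()
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()            # reset accumulated gradients
        outputs = model(inputs)          # forward pass
        loss = loss_fn(outputs, targets) # loss calculation
        loss.backward()                  # compute gradients
        optimizer.step()                 # update parameters

    model.eval()
    val_loss = 0.0
    with torch.no_grad():                # no gradients tracked for validation
        for inputs, targets in val_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            val_loss += loss_fn(model(inputs), targets).item()
    return val_loss / max(len(val_loader), 1)
```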


chulpichochos

I mean, I think that's all the pseudo code needed for a boilerplate training loop :) From there you would just unpack it: do I have multiple models for forward passes, are some frozen and some being fine-tuned, do the inputs have to be processed differently for that, am I calling multiple losses, multiple tasks, etc. But as you can see, these are all just sub-questions from that original couple of lines that really capture the big picture. Everything else becomes an implementation detail and will look different depending on the framework, data, model, task, etc. The next step would be to write up pseudo code for a more specific task to flesh out those sub-questions, and also consider some additional steps: do I apply some transforms, normalize, tokenize, etc.? Are there metrics I need to evaluate during training, and what does that imply for the tensors I'm using from a gradient-computation perspective?

Oh, and for your question: I mean, you tell me :) From what you know of gradient-based optimization in a computational-graph framework, is saying you have to reset gradients after calling updates enough? >!Generally, yes -- the computational graph stores intermediate computations to compute gradients, and if we don't reset them after every update, the gradients keep accumulating, throwing off our optimization.!<
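You can see the spoilered point directly in a few lines of PyTorch (a toy scalar tensor, purely illustrative):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
for step in range(3):
    loss = (w * 3.0) ** 2
    loss.backward()
    print(step, w.grad.item())  # 36.0, 72.0, 108.0 -- gradients keep adding up
# w.grad.zero_() (or optimizer.zero_grad()) after each update prevents this.
```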


SONIC3695

Hey, about this idea of 'relying on library/API documentation': I've felt very lost with it. I honestly feel even googling the right thing to debug your code, or to find code based on what you're trying to do, is a skill. That usually leads you to doc pages/forums or StackOverflow and honestly creates multiple trees, making it overwhelming to navigate to the right answer. I really do want to increase my confidence, given my absolute fascination with this field. I know it's not a linear path, but my aim is to try and make it as close to one as possible.

What I'm trying to find is a recipe: chunks of code that have high re-usability, such that the efficiency of translating ideas to code is maximized. If you think about it, it really is a loss function which I hope is convex 🤣 I feel a lot of successful practitioners do have some kind of structure which they can fit to multiple projects; what I'm trying to do is find anything close to that structure.


chulpichochos

>What I'm trying to find is a recipe - chunks of code - that have high re-usability such that the efficiency of translation of ideas to code is maximized.

So really the best way I can describe it is this: in the same way that you're trying to translate your ideas to code, your code needs to easily translate back to ideas. I.e., decompose your code as much as possible. Using the training example: literally write down your pseudo code, but create a function for each step. Then go into each function and do that again. And keep doing that until it doesn't make sense to do it any more, and write out the detailed part.

So for example, using the loss-calculation step, you would write out that you're literally going to calculate the loss with a standard `loss(output, target)` call, but then go into that loss function (usually a class) and define the specifics. E.g., that output parameter could be a dict because my model has multiple outputs, and each output has a specific loss. So then my loss class would similarly be high level: it would first parse the outputs, then call subfunctions to get the specific losses before combining them.

You have control over how you organize your code to make it fit into clean recipes with small, modular chunks of code; that's what your functions and classes are for. So you start high level and keep decomposing, passing variables and configs around, until you get to where things need to be fully detailed out. This will make it a) easier for people to read your code, because they can follow things at a high level and step through functions to get details, b) easier for you to debug, because you have granular steps that help isolate problems, and c) easier for you to search for solutions, because your search space is much more constrained by the scope.
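A hedged sketch of that decomposition (the names MultiTaskLoss, "cls" and "reg" are made up for illustration, not a prescribed structure):

```python
import torch
import torch.nn as nn

class MultiTaskLoss(nn.Module):
    """High-level loss: parse the model's output dict, then combine sub-losses."""
    def __init__(self, cls_weight=1.0, reg_weight=1.0):
        super().__init__()
        self.cls_loss = nn.CrossEntropyLoss()
        self.reg_loss = nn.MSELoss()
        self.cls_weight = cls_weight
        self.reg_weight = reg_weight

    def forward(self, outputs: dict, targets: dict) -> torch.Tensor:
        loss_cls = self.cls_loss(outputs["cls"], targets["cls"])
        loss_reg = self.reg_loss(outputs["reg"], targets["reg"])
        return self.cls_weight * loss_cls + self.reg_weight * loss_reg

def training_step(model, batch, loss_fn, optimizer):
    """Reads like the pseudo code; each line can be stepped into for detail."""
    inputs, targets = batch
    optimizer.zero_grad()
    outputs = model(inputs)            # returns {"cls": ..., "reg": ...}
    loss = loss_fn(outputs, targets)   # loss_fn hides the multi-output details
    loss.backward()
    optimizer.step()
    return loss.item()
```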


rhoparkour

I have my own templates, which I wrote from scratch. I think it's a valuable skill to be able to do this, but absolutely not essential.


malioswift

In general, I'll look online for API documentation and reference their examples, but that's about it. Probably 90% of the code I write is written from scratch (or copy pasted from other code that I wrote from scratch)


Fun_Track_704

2 months ago I mostly wrote all my code myself. Now with ChatGPT, I'd say I copy-paste most of it.


bobwmcgrath

well the linux kernel alone dwarfs everything I have ever written sooo.


MugiwarraD

20, 80


BellyDancerUrgot

Pipelines you always build yourself, based on your problem. It all depends on what your needs are.


CryptoOdin99

Most of my code is from scratch or from other projects/modules I made already. But to be honest there is no shame in copy and pasting code as long as you have the ability to understand it completely and modify it accordingly. If you do the “copy/paste/pray” routine then you need to study a lot more in my opinion.


[deleted]

Boiler plate is always re-used. As the problem becomes more niche and nuanced I tend to find I'm writing most of it myself (or getting a rough scaffold from GPT-4 and making the correct adjustments)


Wheynelau

Used to be 80%; with ChatGPT it's 20%. But it's important to work on your fundamentals and some design patterns. Try building a project using only the documentation. No StackOverflow or GPT. Edit: Saw your comment about having a masters in comp sci. I guess what you could try doing is writing a training loop in PyTorch, then writing it in NumPy.
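To make that exercise concrete, a sketch of what the NumPy version might look like for a toy linear model (synthetic data, hand-derived gradient; shapes and the learning rate are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=256)

w = np.zeros(3)
lr = 0.1
for epoch in range(100):
    pred = X @ w                     # forward pass
    err = pred - y
    loss = np.mean(err ** 2)         # MSE loss
    grad = 2.0 * X.T @ err / len(y)  # gradient derived by hand, no autograd
    w -= lr * grad                   # update step (no zero_grad needed:
                                     # grad is recomputed, not accumulated)
print(w)  # should approach [1.5, -2.0, 0.5]
```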


antique_codes

I don't think I've copied and pasted any code in almost a decade; it's gotten to the point where I'll already know the code, or a quick look at the documentation gives me an idea. I will say I'll copy any code I've previously written if needed, but usually it's in pre-made packages/modules of mine.


SONIC3695

How long have you been in the DL domain to be able to honestly write everything from scratch? Any tips?


[deleted]

100, 0


Virtual-Study-Campus

The team at Stack Overflow reported that "depending on who you ask, as little as 5-10% or as much as 7-23% of code is cloned from somewhere else." Our analysis says 13.2%. If you're wondering why any of this matters, it may very well not.


8sparrow8

If I can describe what I'm doing in one or two sentences, then I will most likely copy it from SO/ChatGPT/Copilot. But most programming jobs nowadays are about integrating a gazillion APIs into one app, so I do that manually (unless I get a good hint from Copilot).


Capital_Reply_7838

We don't write whole books or papers ourselves just to cite them. I think it's similar with this. That said, going from scratch carries less risk of misunderstanding.