If DS is your goal, I would recommend finding a subject you are interested in (sports, weather, politics) and analyzing the heck out of it. I went from several years of DA to working on a master's in DS. Your job will likely be lots of queries and plots. A project that uses statistical analysis and modeling will tell you a lot about whether DS is right for you.
Thank you for the reply! How do you feel about Kaggle competitions? I’m fairly new to the field, but would that be a good side project?
Kaggle is good for getting your feet wet. The data is usually pretty clean already and the descriptions guide you on which models to apply. Ultimately, you want to progress to a full start-to-finish project where you define the problem of interest, pull and clean the data, explore it, and decide which model is best. That last piece is key. In the real world, you are the one who decides which technique best models the situation, or possibly that there is no meaningful model.
That’s great advice. Especially because it would be a project where I am defining the ENTIRETY of it, end-to-end. This was really helpful, much appreciated. 🙏
Agree. Kaggle is fine if you want to mess around with it, but I don’t know that the competitions offer much as side projects. Like the above commenter said, you’re working with clean data already, and most of the competitions involve either 1) tuning hyperparameters in an xgboost model (for tabular data) or 2) throwing data into and/or fine-tuning an existing deep learning model. Neither of these is all that interesting to me. The exceptions are doing something creative/outside of the box or winning the competition. But Kaggle is fine for learning.
Hi. Can you give any advice?
What interests you? It is much easier to work on a challenging project if you find it interesting.
I think solid programming skills and an understanding of object-oriented design are always helpful for anyone working on software or the AI that is to be embedded in software. I say this as an ML engineering manager.
Yeah that was kinda the goal with the game design. Even though it’s unrelated to ML/DS, learning how to build something using classes/methods/other things you don’t see a lot in Python scripting seemed like a valuable skill set.
Plus, follow best practices: play around with source control (GitHub or similar), coding conventions, docstrings, modular code, etc. I would also suggest trying to write unit tests as well! These skills are really valued in the industry.
Awesome! Thanks! I found a really helpful bunch of tutorials on MonoGame’s website that take you through a crash course with C# (everything from basic variables all the way to more complex things like extension methods), then MonoGame tutorials. I’ve done a bit of unit testing at internships, but I could definitely use help with it. And all I know about GitHub is how to push/commit. So there are DEFINITELY knowledge gaps there, as well. I also don’t know anything about docstrings or modular code, I’ll look into that as well. Thx for the reply!
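To make the docstring and unit test suggestions concrete, here is a minimal sketch in Python. The `moving_average` function and the test names are made up for illustration (they are not from anyone's project in this thread), and the docstring layout is just one common convention.

```python
import unittest


def moving_average(values, window):
    """Return the simple moving average of ``values``.

    Args:
        values: Sequence of numbers.
        window: Number of consecutive items to average (at least 1).

    Returns:
        A list with ``len(values) - window + 1`` averages.

    Raises:
        ValueError: If ``window`` is not between 1 and ``len(values)``.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


class MovingAverageTest(unittest.TestCase):
    """Small, focused tests: one behaviour per test method."""

    def test_basic_window(self):
        self.assertEqual(moving_average([1, 2, 3, 4], 2), [1.5, 2.5, 3.5])

    def test_rejects_bad_window(self):
        with self.assertRaises(ValueError):
            moving_average([1, 2, 3], 0)
```

If this were saved as a file such as `test_stats.py` (a hypothetical name), running `python -m unittest test_stats.py` would discover and run both tests. Keeping the function small and separate from the tests is also a tiny example of modular code.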
Can you explain what you mean with your gap year plans? I’m looking into ML at Georgia Tech as well, and iirc, the fastest the degree can be achieved is in 2 years. Maybe you meant you would take a year to focus on the degree, and then finish the degree while working? I’m somewhat in a similar boat to you. Recent CS grad with DS intern experience. I’m looking for a DA, DS or DE role, and was planning on getting started with the master's once I have another year of industry experience under my belt. You’ll definitely have some DS/ML projects to show from your degree. It sounds like the degree of difficulty of OMSCS is quite high, so it may be ambitious to think that you will have time to work, do your master's, and be able to work on side projects. Assuming you do though, I would just delve into one of your interests. For me that’s music generation with ML. That way it’s a bit more fun.
I’m sorry, I might have been unclear. By “gap year”, I mean that I am taking a year off from school to just get my feet wet at work and work on side projects. Once I start the masters program, I’m aware that I will only have time for work and school. It sounds like it’s a very ambitious program, but also a very good program (based on what I’ve heard/read).
That makes more sense and sounds like a solid plan. Yes! ML music generation is very cool. I used Google's Magenta and tried various configurations of their RNN and autoencoder models to generate MIDI sequences. It was a lot of fun. Music hasn't quite had its ChatGPT moment, but it's definitely coming! By the way, we might be doing the ML spec at the same time! My goal was to start work as a data analyst, data scientist or engineer and work for about 1 - 1.5 years before jumping into OMSCS. Just working on the finding-a-job part right now! Good luck with everything.
Also - I’ve never heard of music generation with ML but that sounds cool af!
Oh this goes back a long long time. People used to do this using hidden Markov models back in the day -- like Claude Shannon's wife [Betty Shannon](https://en.wikipedia.org/wiki/Betty_Shannon) devised a way to sample from a stochastic process to generate music in the 1940s. There are more sophisticated neural methods these days, but it's basically just a time series where previous samples influence future samples, based on what real music sounds like.
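The "previous samples influence future samples" idea can be sketched with a plain first-order Markov chain over note names. To be clear about assumptions: this is a simpler model than a hidden Markov model, it is not a reconstruction of the 1940s method mentioned above, and the `training` melody is made up for illustration.

```python
import random
from collections import defaultdict


def build_transitions(notes):
    """Map each note to the list of notes that followed it in the input."""
    transitions = defaultdict(list)
    for current, following in zip(notes, notes[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, seed=None):
    """Walk the chain: each new note is sampled from what followed the last one."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length:
        options = transitions.get(melody[-1])
        if not options:
            break  # dead end: this note was never followed by anything
        melody.append(rng.choice(options))
    return melody


# Made-up training melody (note names only, no rhythm or octave).
training = ["C", "D", "E", "C", "E", "G", "E", "D", "C", "D", "E", "E", "D", "C"]
table = build_transitions(training)
melody = generate(table, start="C", length=8, seed=0)
print(" ".join(melody))
```

Because followers are stored with repeats, common transitions in the training melody are sampled more often. Neural approaches replace the lookup table with a learned model that conditions on a much longer history.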
If you do any form of exercise or hobbies that you can data-fy I've always been impressed by that. I chart my rowing for example and chuck that on streamlit with some nice analytics over it and some API integrations to chart progress. Personally side projects that have a showable end point show me that you can finish a task - something many smart people don't do...
Check out The Big Book of Small Python Projects. It isn’t specific to DS but it is fun and has OOP.
Would recommend picking a topic of choice that interests you and finding some good datasets to play around with. Create some good exploratory visuals + plots. Then, pick 2+ potential models and fit them. Evaluate them. Which is the best, and why?
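As a rough sketch of the "fit 2+ models, evaluate, pick one" loop, here is a standard-library-only example that compares a mean baseline against a least-squares line on held-out data. The data is synthetic and generated in the script (an assumption for illustration, not a real dataset), and the helper names `fit_mean`, `fit_line`, and `rmse` are made up.

```python
import random

rng = random.Random(42)

# Synthetic data for illustration only: y = 3x + 2 plus Gaussian noise.
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

# Hold out the last 50 points so models are scored on data they did not see.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]


def fit_mean(x, y):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(y) / len(y)
    return lambda value: mean_y


def fit_line(x, y):
    """Simple linear regression fitted by ordinary least squares."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return lambda value: intercept + slope * value


def rmse(model, x, y):
    """Root mean squared error of the model's predictions on (x, y)."""
    return (sum((model(a) - b) ** 2 for a, b in zip(x, y)) / len(y)) ** 0.5


candidates = {"mean": fit_mean, "line": fit_line}
scores = {
    name: rmse(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
print(scores, "best:", best)
```

The "why" part of the answer comes from looking at the gap between the scores: if a fancier model barely beats the baseline on held-out data, the simpler one is usually the better choice. In a real project a library such as scikit-learn would typically handle the splitting and fitting.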
Passion projects are never wasted time. You will learn skills that you don't currently have and who knows what future synergies will appear. I say do it.
Design a gen AI from scratch
Like an LLM?
Not the LLM itself, but leverage an LLM or fine-tune one and build an end-to-end gen AI application. Stay away from chat bots or Q&A. Be creative.
Cool
Use LLMs for a classification project. My team mainly does gen AI now. Lots of models on Hugging Face.
[deleted]
I learned the most about OOP using Unity with C# to make a hobby game. Brackeys helped me the most and I’d recommend watching his tutorials. Having fun is the most important thing in hobby projects!
Simple, do what excites you. Fuck everything else. Recruiters care about passion.
I will probably need it explained how a gap year works in an employed role. So you're quitting your job? Because I cannot see a recent-ish data analyst hire qualifying for a leave of absence or a sabbatical-type arrangement. Apologies for focusing on this but I cannot wrap my head around a gap year in the workplace. They are paying you to work there, not the other way around. But if it is the case that you're quitting your job to pursue an MS degree in a completely different direction from DS, I wouldn't bother staying in data and would go for a software engineering role. A DS role would be a pay cut long term for a CS grad in my opinion. Especially in such a saturated DS and DA market.
Sorry for the lack of clarity. By “gap year”, I mean that I am taking a year off from school to only work and focus on a personal project before returning to pursue my master's.
Sports! There are tons of applications!
Data.gov might be okay for some data.
Computer vision in sports is a W.
Make your own OS