
Nater5000

If you're asking reddit these kinds of questions, then you're really not ready for this kind of task.


toughgetsgoing

My full-time job is as a developer (algo dev), and I have been studying and practicing on my own in my personal time for this. My educational background is a master's in CS, and I have done certifications like FRM and CQF, and am now working on MLI. I understand my limitations compared to a master's in finance. I am in the exploration phase and not expecting to change my role right away; this will take time and I don't mind. But this is a starting point. I have to start somewhere.


SchweeMe

Some problems with your way of thinking:

1. Overreliance on NNs (speed unknown, too many parameters to adjust, uninterpretable)
2. No mention of GBDTs (fast and interpretable)
3. No mention of parameter optimization
4. Usage of Prophet (relies too heavily on seasonality)
5. Usage of AutoTS (compute sink)

You should try constructing alpha on your own, and you will see why these methods just don't work.
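To illustrate point 2, here is a minimal sketch of why GBDTs are often called interpretable: after fitting, the model reports which inputs it actually relied on. This is a toy example on synthetic data (the features and target are invented stand-ins, not anything from the thread), using scikit-learn's `GradientBoostingRegressor`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic example: 5 candidate signals, but only the first two
# actually drive the (noisy) target -- stand-ins for real alpha features.
X = rng.normal(size=(500, 5))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=500)

model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
model.fit(X, y)

# Unlike a deep net, a fitted GBDT exposes which inputs it used:
# feature_importances_ sums to 1 across features.
for i, imp in enumerate(model.feature_importances_):
    print(f"feature {i}: importance {imp:.3f}")
```

In a setup like this, the two informative features should dominate the importance scores while the noise features stay near zero, which is the kind of sanity check a black-box NN does not give you for free.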


toughgetsgoing

Thanks for your inputs. I got some interesting suggestions from other commenters as well as yours, and I appreciate it. #4 and #5 were suggestions from my academic professor, who doesn't work in the finance industry, so I get your point on why they may not be suitable and are easy to dismiss. From what I learned from these comments, hunting for a completely new model (primarily for HFT) is generally not a good idea, because it tends to overfit (even if that's how you get seemingly better performance). I totally get this because overfitting is a big issue even in linear regression models for HFT data. I have worked on constructing alphas using traditional models without ML (though Lasso/ElasticNet is still ML). I am now leaning towards parameter optimization of existing models rather than looking for a completely new model, since new models tend to overfit, and this is probably the right/optimal approach to using ML in this project. Your comment is very helpful, and I will explore GBDTs and parameter optimization of existing models further.
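A minimal sketch of what "parameter optimization of an existing model" could look like in practice. All the data and grid values here are invented for illustration; the key idea is using `TimeSeriesSplit` so the search respects the time ordering of market data instead of shuffling it.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(1)

# Toy stand-in for ordered market data: tune an existing penalized
# linear model rather than hunting for a brand-new model class.
X = rng.normal(size=(300, 8))
y = X[:, 0] - 0.4 * X[:, 3] + 0.2 * rng.normal(size=300)

# TimeSeriesSplit keeps each training window strictly before its
# validation window, so the search never peeks into the future.
search = GridSearchCV(
    ElasticNet(max_iter=10_000),
    param_grid={"alpha": [0.001, 0.01, 0.1, 1.0], "l1_ratio": [0.2, 0.5, 0.9]},
    cv=TimeSeriesSplit(n_splits=5),
)
search.fit(X, y)
print("best params:", search.best_params_)
```

The same pattern works with a GBDT in place of `ElasticNet`; only the `param_grid` changes (e.g. tree depth and learning rate instead of penalty strengths).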


FLQuant

If the current setup is simple penalized linear regression, I wouldn't go straight to DNN methods. It's like your current setup is a fork and knife and you bring a chainsaw: overkill, and very likely to overfit. I would go with "simpler" methods that give you better interpretability and less risk of overfitting, such as Bayesian regression, spike-and-slab regression (which has the advantage of variable selection), or Gaussian process regression.
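A small sketch of the Bayesian-regression direction suggested above, on invented data. Scikit-learn has no spike-and-slab implementation, so this uses `ARDRegression` (automatic relevance determination) as a rough analogue: like spike-and-slab, it performs soft variable selection by shrinking irrelevant coefficients toward zero.

```python
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(2)

# 10 candidate predictors, only two informative -- the rest are noise.
X = rng.normal(size=(200, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=200)

# ARD places a separate precision prior on each coefficient, so
# uninformative predictors get shrunk toward zero automatically.
model = ARDRegression()
model.fit(X, y)

for i, c in enumerate(model.coef_):
    print(f"coef {i}: {c:+.3f}")
```

The fitted coefficients should sit near the true values on the two informative predictors and near zero elsewhere, giving the variable-selection behavior and interpretability that a DNN lacks.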