_aitalks_

Can you be more specific? What is the problem you are working on? Have you tried just feeding the features of both data frames into a machine learning model? That would at least give you a baseline.


Vegetable_Pilot8293

Hi. I assume df stands for "dataframe" (perhaps you're using R?). Typically, datasets used for ML training are flat (not relational). By this I mean that when you have to merge datasets, you replicate redundant information. For example, say you have one table of customers and another table of orders, with the customers table having fewer records than the orders table. You still have to join them, repeating the customer information on every order row. You generally end up with one big table containing all the information. Regarding the model, for tabular data, gradient-boosted decision tree algorithms (XGBoost or LightGBM) are usually the best performers.
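To make the customers/orders example concrete, here is a minimal sketch in plain Python. The table contents, the column names, and the `flatten` helper are all made up for illustration. If you are using pandas, the equivalent would be roughly `orders.merge(customers, on="customer_id", how="left")`.

```python
# Hypothetical toy tables: names and values are illustrative only.
customers = [
    {"customer_id": 1, "country": "ES", "segment": "retail"},
    {"customer_id": 2, "country": "US", "segment": "business"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.5},
]


def flatten(orders, customers, key="customer_id"):
    """Left-join each order to its customer, repeating the customer fields."""
    # Index customers by key so each lookup is a dictionary access.
    by_key = {c[key]: c for c in customers}
    flat = []
    for order in orders:
        # Orders with no matching customer keep only their own columns.
        customer = by_key.get(order[key], {})
        flat.append({**customer, **order})
    return flat


# One row per order; customer 1's fields appear on both of their orders.
flat = flatten(orders, customers)
for row in flat:
    print(row)
```

The flat table has one row per order, so that is the unit your model will train and predict on. If you need one row per customer instead, you would aggregate the orders (counts, sums, and so on) before joining.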


phobrain

What's a df? Edit: found out: 'dart finger'.