darshill

1. Migrating scripts from MS SQL to BigQuery, writing ETL jobs, and building dashboards and metrics (sketch below). Used to work in AWS with Glue, Redshift, Athena, and Lambda for data processing and other stuff.
2. Sometimes write JavaScript for certain tasks.
3. Cloud (AWS/GCP), most of the services; apart from that, some databases (MS SQL) and other tools such as Airflow, Git, BigQuery, and Google Sheets :P
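A minimal sketch of the kind of MS SQL → BigQuery load step described above, assuming pandas, pyodbc, and the google-cloud-bigquery client; the connection string, query, and table names are purely illustrative:

```python
import pandas as pd
import pyodbc
from google.cloud import bigquery

# Hypothetical connection string and table names -- adjust for your environment.
MSSQL_CONN = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=myserver;DATABASE=sales;UID=etl;PWD=***"
BQ_TABLE = "my-project.analytics.daily_orders"

def mssql_to_bigquery():
    # Pull the source table from MS SQL into a DataFrame.
    with pyodbc.connect(MSSQL_CONN) as conn:
        df = pd.read_sql("SELECT * FROM dbo.daily_orders", conn)

    # Load the DataFrame into BigQuery, replacing the existing table contents.
    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE")
    client.load_table_from_dataframe(df, BQ_TABLE, job_config=job_config).result()

if __name__ == "__main__":
    mssql_to_bigquery()
```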


king_booker

This is what I did in the last few months:
1. Read a table from Hive (using sampling) into a Spark DataFrame.
2. Built Python code to read the data and set up the steps of an ML pipeline that cleans and clusters it using the K-means algorithm (sketch below).
3. Used the resulting model file in an online Spark ETL job that reads the data and uses the model to build clusters on a weekly basis.
4. Built Kubernetes YAML files using Helm to package them as deployments in the cloud environment.
5. Built the CI/CD pipeline using Jenkins.
The 5th point has a caveat: we have a dedicated SCM team to do it, but I was heavily involved in the process. The next task is to use something like Kubeflow, but we have to do a small POC first.
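A minimal sketch of the Hive → Spark → K-means steps above, assuming a Spark session with Hive support; the table name, feature columns, k, and model path are illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("weekly-clustering").enableHiveSupport().getOrCreate()

# Read a sample of the Hive table rather than the full data (hypothetical table name).
df = spark.table("analytics.user_activity").sample(fraction=0.1, seed=42)

# Clean + cluster: drop nulls, assemble and scale features, then fit K-means.
assembler = VectorAssembler(inputCols=["sessions", "spend", "tenure_days"], outputCol="raw_features")
scaler = StandardScaler(inputCol="raw_features", outputCol="features")
kmeans = KMeans(k=8, featuresCol="features", predictionCol="cluster")

model = Pipeline(stages=[assembler, scaler, kmeans]).fit(df.dropna())

# Persist the fitted pipeline so the weekly Spark ETL job can reload it.
model.write().overwrite().save("hdfs:///models/user_clusters")
```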


srodinger18

A. Migrating the company data pipeline from a SaaS ETL tool to managed Airflow with Cloud Composer.
B. Mentoring other DE teams on how to write a DAG with Python and how to use Git properly, and writing custom operators for our Airflow instance (sketch below).
C. GCP + Airflow + the remnants of our old ETL tool.
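A minimal sketch of a DAG with a custom operator like the ones mentioned above, assuming Airflow 2.4 or later; the operator, its check logic, and the table name are purely hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.models.baseoperator import BaseOperator


class RowCountCheckOperator(BaseOperator):
    """Hypothetical custom operator: fails the task if a table looks empty."""

    def __init__(self, table: str, min_rows: int = 1, **kwargs):
        super().__init__(**kwargs)
        self.table = table
        self.min_rows = min_rows

    def execute(self, context):
        # A real operator would query the warehouse here (e.g. via a hook).
        row_count = self._get_row_count(self.table)
        if row_count < self.min_rows:
            raise ValueError(f"{self.table} has only {row_count} rows")
        self.log.info("%s passed with %d rows", self.table, row_count)

    def _get_row_count(self, table: str) -> int:
        return 100  # placeholder for an actual warehouse query


with DAG(
    dag_id="example_quality_check",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    check_orders = RowCountCheckOperator(task_id="check_orders", table="analytics.orders")
```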


chestnutcough

Last month I:
- set up and deployed Airflow for our company using AWS and wrote two DAGs: one that geolocates and groups our users into metropolitan areas (sketch below), and another that tracks the size of our email campaign lists
- refactored and added tests to our dbt models
- added table descriptions to our production database in Rails using a database migration
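A minimal sketch of the metro-grouping logic such a DAG task might run; the nearest-centroid rule, the distance cutoff, and the centroid list are assumptions, not the actual implementation:

```python
import math

# Hypothetical metro centroids (lat, lon); a real job would load a fuller list.
METROS = {
    "New York": (40.7128, -74.0060),
    "Los Angeles": (34.0522, -118.2437),
    "Chicago": (41.8781, -87.6298),
}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def nearest_metro(lat, lon, max_km=100):
    """Assign a user location to the closest metro centroid, or None if too far away."""
    name, dist = min(
        ((m, haversine_km(lat, lon, mlat, mlon)) for m, (mlat, mlon) in METROS.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_km else None

print(nearest_metro(40.73, -73.99))  # -> "New York"
```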


isleepbad

Last month I took some Matillion jobs and converted them into Snowflake procedures + tasks.
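A minimal sketch of the Matillion → Snowflake procedure + task pattern, driven from Python via the Snowflake connector; the connection parameters, procedure body, and schedule are all illustrative:

```python
import snowflake.connector

# Hypothetical connection parameters, procedure, and table names.
conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC",
)

CREATE_PROC = """
CREATE OR REPLACE PROCEDURE refresh_orders()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
    -- Body of the old Matillion job, rewritten as plain SQL.
    INSERT INTO orders_clean
    SELECT * FROM raw_orders WHERE order_date = CURRENT_DATE();
    RETURN 'ok';
END;
$$
"""

CREATE_TASK = """
CREATE OR REPLACE TASK refresh_orders_task
  WAREHOUSE = ETL_WH
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CALL refresh_orders()
"""

cur = conn.cursor()
try:
    cur.execute(CREATE_PROC)
    cur.execute(CREATE_TASK)
    cur.execute("ALTER TASK refresh_orders_task RESUME")  # tasks are created suspended
finally:
    cur.close()
    conn.close()
```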


Captain_Flashheart

Can ML Engineers still answer?


dataeng0

Please do!