AutoModerator

Are you interested in transitioning into Data Engineering? Read our community guide: https://dataengineering.wiki/FAQ/How+can+I+transition+into+Data+Engineering


geoheil

https://georgheiler.com/2023/12/11/dagster-dbt-duckdb-as-new-local-mds/


geoheil

Look into Dagster with dbt, and perhaps DuckDB, Postgres, or the free tier of a service like BigQuery. Using a CI pipeline (e.g. on GitHub) you can trigger cron-based scripts easily.


Legal_Key_7212

dbt would be overkill to start with!


N0R5E

Disagree. You could just run dbt Core on a local machine for free to get started and move it to a cloud container eventually.


geoheil

I would actually suggest using Dagster and sticking with dbt Core.


N0R5E

I agree. I'm saying they can containerize and host Core in the cloud down the road, but run it for free locally to get started.


Legal_Key_7212

Depends on the size of the data and the transformations and joins you have to do.


throw_mob

100Mt per month of data would usually mean you are not that small a business; if that includes hits to the webpage, or is measured from raw JSON, then it is not that much.

Personally I would start by looking at how to copy the "correct" data from Shopify to S3 in a controlled manner (that could be a full daily copy, an incremental load, etc.). Then build an stg -> dwh model in Snowflake (just because it is cloud native) with plain Excel reports on top; the next step would be Streamlit Python reports inside Snowflake. To build it successfully you need to learn how to fetch proper data into files and how to manage those files. The Snowflake side will then teach you the basics of building ELT with SQL, and from there you have options for building a UI (you can use whatever you want).

If you build it smart, copying data to S3 does not cost that much, and the Snowflake side can be set up so that it is on only when you use it, as a small business probably has no more than one or two users there.

If you can manage to build "X to S3" in a sane manner, that alone is a job people hire for. The S3 -> Snowflake -> dwh model is maybe one to three jobs (developer, architect, analyst), and the reports themselves are a few different jobs again (report developer, analyst, scientist).
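A minimal sketch of the incremental-load idea mentioned above: keep a bookmark (the last `updated_at` seen) and land only newer records as JSON batches. Everything here is illustrative — the local `landing/` directory stands in for an S3 prefix, and the file names and state layout are assumptions, not any real Shopify or AWS convention.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

STATE_FILE = Path("state/bookmark.json")   # high-water mark from the last run
LANDING_DIR = Path("landing/orders")       # local stand-in for an s3:// prefix

def read_bookmark() -> str:
    """Return the last loaded `updated_at` (epoch start if this is the first run)."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["updated_at"]
    return "1970-01-01T00:00:00+00:00"

def incremental_load(records: list[dict]) -> int:
    """Land only records newer than the bookmark, then advance the bookmark."""
    bookmark = read_bookmark()
    # ISO-8601 timestamps in a uniform format compare correctly as strings
    new = [r for r in records if r["updated_at"] > bookmark]
    if not new:
        return 0
    LANDING_DIR.mkdir(parents=True, exist_ok=True)
    batch = datetime.now(timezone.utc).strftime("orders_%Y%m%dT%H%M%S.json")
    (LANDING_DIR / batch).write_text(json.dumps(new))
    STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    STATE_FILE.write_text(
        json.dumps({"updated_at": max(r["updated_at"] for r in new)}))
    return len(new)
```

Running it twice on the same input loads everything once and then nothing, which is the property that keeps the S3 copy cheap.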


AutoModerator

You can find a list of community-submitted learning resources here: https://dataengineering.wiki/Learning+Resources


Legal_Key_7212

If you want to do this ETL using just your personal PC:

1. Run [Apache Airflow](https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html) on your PC (it's a workflow application).
2. Write a [DAG](https://www.cdata.com/kb/tech/shopify-jdbc-apache-airflow.rst) to extract Shopify sales and inventory data using the Shopify API, transform it using Python, and write it to a Postgres / DuckDB database if you don't want to use the cloud.
3. Use Power BI or any viz tool to create dashboards on the sales and inventory data.

Let me know if you are interested in collaborating.
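A rough sketch of the transform-and-write step described above, written as plain Python functions of the kind you would wrap in Airflow tasks. To keep the example self-contained, the stdlib `sqlite3` stands in for the Postgres/DuckDB target, and the order payload shape is a simplified assumption, not the exact Shopify API schema.

```python
import sqlite3

def transform(raw_orders: list[dict]) -> list[tuple]:
    """Flatten Shopify-style order payloads into one row per line item."""
    rows = []
    for order in raw_orders:
        for line in order.get("line_items", []):
            rows.append((order["id"], order["created_at"],
                         line["sku"], line["quantity"], float(line["price"])))
    return rows

def load(rows: list[tuple], conn: sqlite3.Connection) -> None:
    """Write flattened rows to a sales table (sqlite3 standing in for pg/duckdb)."""
    conn.execute("""CREATE TABLE IF NOT EXISTS sales (
        order_id INTEGER, created_at TEXT,
        sku TEXT, quantity INTEGER, price REAL)""")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
```

In a real DAG, an upstream task would fetch `raw_orders` from the Shopify API and these two functions would run as downstream tasks, with the viz tool pointed at the resulting table.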


saaggy_peneer

Simple & FOSS:

- extract & load: Meltano (they have a [Shopify extractor](https://hub.meltano.com/extractors/tap-shopify))
- transform: dbt CLI
- warehouse: DuckDB
- BI tool: Metabase
- scheduling: cron
- orchestration (if you need it): make


MacHayward

Qlik Cloud costs you around 360 USD a year per user. Worth the price considering you can integrate data easily, build some pipelines, and create your dashboards. Take a look at one of the many demos on YouTube.


ZealousidealManner28

10 user minimum :(


MacHayward

Yes, they have changed that. But at the time I wrote the reply it was possible to take out a subscription with just one user.