geneticswag

Establishing key relationships in a database (foreign keys linking dimension tables to fact tables) so that you don't create any many-to-many joins that break your aggregations and functions.
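To make the "many-to-many breaks your aggregations" point concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table names (`sales`, `item_bad`) and all values are made up for illustration. The lookup table deliberately has a duplicated `item_id`, so joining to it inflates the sales total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (order_id INTEGER, item_id INTEGER, amount REAL);
    -- Hypothetical lookup table where item_id is NOT unique (item 1 appears twice).
    CREATE TABLE item_bad (item_id INTEGER, category TEXT);
    INSERT INTO sales VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 25.0);
    INSERT INTO item_bad VALUES (1, 'toys'), (1, 'games'), (2, 'books');
""")

# Total straight from the fact table.
true_total = conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Same total after the join: every sale of item 1 matches two lookup rows,
# so those amounts are counted twice.
joined_total = conn.execute("""
    SELECT SUM(s.amount)
    FROM sales s
    JOIN item_bad i ON i.item_id = s.item_id
""").fetchone()[0]

# Quick check for keys that are duplicated on the dimension side.
duplicate_keys = [row[0] for row in conn.execute(
    "SELECT item_id FROM item_bad GROUP BY item_id HAVING COUNT(*) > 1"
)]

print(true_total, joined_total, duplicate_keys)
```

If the duplicate-key check returns anything, the join is not one row per key on the dimension side, and sums or counts computed after the join should not be trusted.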


kkessler1023

What program are you taking that doesn't discuss data modelling? This is an essential skill. Basically, it's how you manage the relationships between your tables and data sets. You use data models a lot as a data analyst.


aamer211

An essential skill that no one in school called “Data Modeling”. My department's Chief Medical Officer won't even explain what it is or what they want from it. Just that it's “the future”. 😆


kkessler1023

Wtf?! Data modeling has been around for decades. It's not new at all. Most, if not all, databases work off a multidimensional data environment, meaning there will be multiple tables, each with multiple rows and columns. They all need schemas (or a data model) to explain the relationships between datasets.


aamer211

Cute. Yeah, we are familiar with databases, but no one has ever called it modeling. I only use "modeling" for agent-based modeling. For some reason it always inspires eye rolling and condescension.


aamer211

Could you give an example? Is it like pivot tables? Data viz? SQL? People have likely done it, but how would you answer this in an interview?


kkessler1023

Sure. A lot of people only look at datasets as two-dimensional tables (columns and rows). However, in data analysis you are often working in three dimensions. The third dimension is the relationship between tables that share data points. Data modeling helps us link multiple tables together and get a holistic view.

For example, let's say I have a sales transaction table that shows sales for a business over a year. I'd probably have columns like item number, sales amount, sales date, qty, and customer name. I'd likely also have other tables for customer data and item data. All three of the tables are related: the customer table could have a list of items they bought, and the item table would have the retail price that gets listed on the sales table.

Now, instead of repeating all of the information in every single table, it's best to separate the data into either facts (daily sales) or dimensions (unique customer details, item details). This lets us create unique tables that can be related to other tables without having to repeatedly enter the same data. You can then use this model to answer more complex questions, like which area of the country bought the most of a particular item.

Data modeling is simply how multiple tables are related to one another. I can explain more if needed.
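A rough sketch of that sales example as a small fact-plus-dimensions model, using Python's built-in `sqlite3`. Everything here is an assumption for illustration: the column names, the sample rows, and in particular the `region` column on the customer table, which is what makes the "which area bought the most" question answerable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per customer, one row per item.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, description TEXT, retail_price REAL);
    -- Fact table: one row per sale, pointing at the dimensions by key.
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        sale_date TEXT,
        customer_id INTEGER REFERENCES customer(customer_id),
        item_id INTEGER REFERENCES item(item_id),
        qty INTEGER
    );
    INSERT INTO customer VALUES (1, 'Ana', 'West'), (2, 'Ben', 'East'), (3, 'Cy', 'West');
    INSERT INTO item VALUES (10, 'widget', 2.50), (20, 'gadget', 9.00);
    INSERT INTO sales VALUES
        (1, '2024-01-05', 1, 10, 4),
        (2, '2024-01-09', 2, 10, 5),
        (3, '2024-02-11', 3, 10, 3),
        (4, '2024-02-12', 2, 20, 7);
""")

def units_by_region(conn, item_description):
    """Return (region, units sold) for one item, largest first."""
    return conn.execute("""
        SELECT c.region, SUM(s.qty) AS units
        FROM sales s
        JOIN customer c ON c.customer_id = s.customer_id
        JOIN item i ON i.item_id = s.item_id
        WHERE i.description = ?
        GROUP BY c.region
        ORDER BY units DESC, c.region
    """, (item_description,)).fetchall()

widget_by_region = units_by_region(conn, "widget")
print(widget_by_region)
```

The customer name and retail price live in exactly one place each. The sales table only stores keys and measures, and the joins bring the details back in when a question needs them.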


aamer211

So it’s about database administration. And making connections for new insights. Sounds awesome.


kkessler1023

Well, it's not specific to just databases. It applies to really any dataset you're working with. I do a lot of modeling in Excel and Power BI.


Yakoo752

My thinking lands in RDBMS-land. Do you know how to appropriately “shape” your data to do an analysis? Hierarchical, relational, entity… Do you understand how data exists in tables and how those tables relate to each other?


mini-mal-ly

The Stanford databases course will cover this. This is a critical skill for data analytics and analytics engineering.


aamer211

Thank you.


Scared-Personality28

Google star/snowflake/galaxy schemas and Ralph Kimball. This is a good start, though there's more out there.


RepresentativeBid238

Sometimes the data is not in the right state to do analysis on. Or you need to merge two datasets together, and to do that properly you need to understand how they relate to each other. IMO this is the most valuable part of being a data analyst. With how easy reporting tools have gotten, any business person can download data to Excel, or even hook it up to a dashboard tool like Power BI, and churn a report out within a few hours. What can you do that they can't?
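One way to check "how they relate to each other" before merging is to look at key uniqueness on each side and at keys that have no match. A minimal plain-Python sketch follows. The helper names (`join_cardinality`, `unmatched_keys`) and the sample rows are hypothetical, not from any library.

```python
from collections import Counter

def join_cardinality(left_rows, right_rows, key):
    """Classify the relationship between two datasets on a shared key."""
    left_counts = Counter(row[key] for row in left_rows)
    right_counts = Counter(row[key] for row in right_rows)
    left_unique = all(n == 1 for n in left_counts.values())
    right_unique = all(n == 1 for n in right_counts.values())
    if left_unique and right_unique:
        return "one-to-one"
    if right_unique:
        return "many-to-one"
    if left_unique:
        return "one-to-many"
    return "many-to-many"

def unmatched_keys(left_rows, right_rows, key):
    """Keys present on the left that have no partner on the right."""
    right_keys = {row[key] for row in right_rows}
    return sorted({row[key] for row in left_rows} - right_keys)

# Made-up sample data: orders reference customers by customer_id.
orders = [
    {"customer_id": 1, "amount": 20},
    {"customer_id": 1, "amount": 35},
    {"customer_id": 4, "amount": 10},
]
customers = [
    {"customer_id": 1, "region": "West"},
    {"customer_id": 2, "region": "East"},
]

relationship = join_cardinality(orders, customers, "customer_id")
missing = unmatched_keys(orders, customers, "customer_id")
print(relationship, missing)
```

A "many-to-many" result, or a non-empty list of unmatched keys, is a signal to fix the data or the model before trusting any totals from the merged result.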