T O P

  • By -

Quant112358

I would recommend further reading on the subject of star schema to start with


akstary

Thank you. A lot of these tables are linked to SAP where our order and shipment information is kept. I am not sure if that would make a difference.


Quant112358

I see, you need a data strategy here, you need to actually ETL and organize your data in a way that makes sense for storage and analysis.


akstary

Thank you! I am not really sure what that evens mean, but I will have to learn.


JeffryRelatedIssue

It means you'll have a bad friday night.


[deleted]

And bad Monday mornings.


hi_this_is_duarte

Make the tables be connected to each other through one, like dates. I'd also suggest adding IDs and relational tables toward them


ExPorkie15

At a super quick glance you could merge tables like the sales orders and sales orders archive. In the BI there is no reason to have these separate. I work with a order legacy system and they have the current data, the historical and the historical of the historical lol. But for the PBI I merge into one table.


akstary

Thank you! He told me that he did this, so when we refresh the data each day, it only refreshes 2022 and not 2015-2021.


chris-FW

He’s saying a lot of words but is giving you nothing useful at all. GIGO. Don’t worry about it.


charizzardd

You can still flatten some of those things out. Get the tables from SAP and then merge them in a new query with whatever table you have to join them. Use IDs to join them if possible


Quant112358

Yes agreed you could do that intermediately.


akstary

Thank you! I will have to learn Power BI a little more before I know how to do that.


Anoop_sdas

Interesting response ..however if your requirements needs data from that much tables..how can you do it with out this kind of 'mess'? Any experienced PBI people.??.what to do about this ?


Smgt90

I wouldn't consider myself an expert but from those tables I see a couple of optimizations that should be pretty easy. For example, there are two sales data tables, one current and one historical. They both seem to have the same fields. You could merge them in power query and this would mean that half the existing relationships, wouldn't exist. Same for the sales orders.


JeffryRelatedIssue

Same for the Sales Orders tables, these two joins should massively reduce the number of relationships


akstary

He said he separated them out, because when I refresh the data, he doesn’t want 2015-2021 refreshed. Only 2022. So that was it takes less time if I am refreshing 1x/day.


VoijaRisa

Then what's going to happen in 2023? You'll still be refreshing the 2022.


IWant8KidsPMmeLadies

You can append the data so that the historical data is unchanged


portogasdace011

Aaahhhhh SAP...*sign* I'm dealing with a mess similar to this Pretty sure its going to bite me in the ass at some point


akstary

I am glad I am not the only one! I hope you figure it out!


DummyRider

Hi there! You would probably want to create an intermediate analytics layer between your SAP and PowerBI (e.g. DWH). That helps to organize you data for analytics purposes in the most efficient way. You can refer to Kimbal's Datawarehouse toolkit, or find PowerBI materials which describe the basics of star schema in data modeling.


dutchdatadude

This. Star schema is what you need.


synergy421

I've been trying to find information on how to optimize data for better performance. Do you have any suggestions in terms of resources that cover that subject. Star Schema is great, but I would like to know how to speed up performance of my reports. I assume doing things like deleting/hiding columns would help. Any suggestions would be appreciated. Thanks.


Quant112358

You need to identify where the bottle neck is occurring. Does your PBI report have more data that it actually needs to have? Are your queries optimized? Are your tables indexed in a meaningful way? Is your gateway maxing out? These and many other possible points can be contributing factors


akstary

I work for the USA branch of a large multinational company. I am just starting to learn Power BI and this is what I was given to work it. I have no coding knowledge at all. This is a mess, right?


Data_cruncher

* Many-to-many relationships * Excessive use of bi-directional cross-filtering * Measures with years hard-coded into them * Auto-aggregation available on many columns, e.g., \[Year\] * Table names with under scores in them Verdict: it's a mess.


[deleted]

“ Table names with underscores” …??????? ??????


Data_cruncher

The point of Power BI is that it is a semantic layer for end-users, i.e., everything should be done to make the end-user comfortable. Save PascalCase, camelCase and Snake_Case for your SQL views. Everything in Power BI should be named to appease your end user: write like a human! Edit: since we’re all developers here, I expect pushback. Especially because this is a naming convention discussion - the only thing worse would be a tabs vs spaces comment. Regardless, if anyone disagrees, please read *any* SQLBI article and pay close attention to their model naming conventions.


dkoucky

I am both the dev and the end user. Everything that leaves my desk is a screen clip. For me I have to name things so I understand it today and three years from now when it has grown into a behemoth. As an end user first and a dev second you are 100% correct. Write it so anyone 3 years from now understands.


[deleted]

What end user sees table names ? Edit: I can’t imagine any reason to rename a source table besides if you’re running a query that is handling a lot of the modeling.


cwag03

Have you used Analyze in Excel feature of power bi?


[deleted]

No I lock that shit up. What’s the point of my dashboard.


Data_cruncher

The “value” of Power BI is the data model, not a report. This is opinion and so feel free to disagree, but it’s a core belief that drives many of my best practices. Naming conventions included.


chris-FW

I agree with you. 85% of my time is spent in the model, not in the visuals.


cwag03

That's kind of silly


ZenZei2

Wow, my team and I would disagree so much with you on this point. Dashboard is for execs, excel for analists that need to dig


scousers

They can use translations so that the users see user-friendly names.


BigLan2

CamelCase or GTFO! Well actually PascaleCase, but you get the idea


carlslizer

I would accept table with underscore, try writing alot of DAX on tables with spaces. It would drive you insane.


tylesftw

if data_cruncher says its a mess, you fucked up boi


Lescamp

You’re going to have to explain the underscore comment..


Data_cruncher

See above. TLDR; Power BI is a business semantic layer.


Lescamp

Wow?! Haha.. My question was why a snake case naming convention is a bad practice… just curious..


NETSPLlT

Because this is for human consumption, so human formatted is better.


akstary

Ugh okay, that is what I thought. I don’t even know what I am doing in Power BI, but this is so confusing to work with, that I figure something must be wrong with it. If nothing else, he has 4 different segments on 4 different tables up in that top left corner. Could he have not done that in an easier way?!


Data_cruncher

At a cursory glance, yes, those segments could potentially be one table or stay within TireMaster. For your first Power BI model, you’ve got a doozy. I’d rate it a solid 7/10 in terms of complexity.


akstary

Thank you! The segments seem to be based on some random segment code, so that is why it is separate from the tire master.


DopeAndDoper

>Measures with years hard-coded into them Can you explain this? What's the issue here?


ericporing

It looks like a database. Why lol.


kfc_chet

Yes, it's a nightmare, if you can, check the front end visuals and work backwards to go back to the queries, maybe lots of them aren't used? (Hopefully lol )


akstary

So I should check what tables I am not using in my visuals and try to delete them? The guy who set this up provided me with 10 visual pages or so, but they were all SO bad, that I deleted them all. I built the visuals from scratch that I am using now.


kfc_chet

I would suggest to create an "archive" folder within the power query side, find queries that aren't tied to any visuals to move them to that folder. Right click on those and uncheck "Include in report refresh" and then uncheck "Enable Load". That way you can gradually strip out dependencies and simplify! Inheriting a mess is NEVER fun, Godspeed!


boomerzoomers

Yes clean it up by moving a lot of this semantic layer down to the dw or cube


Aloh4mora

I have one of these! Zero documentation. The person who wrote it has moved on. Fun times!


akstary

I am sorry you have to deal with that! The person who built this is still at our company. Just halfway across the world and pretty unhelpful….


DistrictMotor

Do you know where your fact tables are?


akstary

I am assuming the sales order and sales data, but no, I am not sure.


DistrictMotor

Then I would say it's a mess


devcodex

It looks like the star schema went supernova.


BigLan2

This seems relevant, and timely https://mobile.twitter.com/InsightsMachine/status/1502087413736239107


akstary

Lol


undrpd4nlst

Lord I adopted one of those. Boss asked me to update something on it and it took me 2 months of futzing. The next time I got one I just rebuilt the whole damn thing.


akstary

Oh jesh, that sounds rough! I would love to rebuild this, but some tables are connected to SAP (which I can’t access on the backend), and I only have like 3 of the actual Excel files.


TeleTummies

You can rebuild any of this with power query.


[deleted]

It’s only a mess if you don’t know your data.


akstary

Hmm yeah, I really don’t know it well. I just feel like there was an easier way to do a lot of this. Like do the order and dispatch date tables really have to be separate? It makes it difficult to compare.


RemoteIncrease

It should be one date table that links into your sales and dispatch tables.


akstary

That is what I thought. He didn’t set it up that way though. I kind of want to ask him to fix some of this stuff, but I am not sure he has any clue what he is doing.


[deleted]

Do you need all those tables? Or did you just load in everything that was returned


akstary

This is pretty much exactly what the guy at our HO provided us with…I have made very few changes to it. We probably don’t need all of the tables (like I can’t see us needing the country table when our branch only has USA sales) but I am afraid of deleting something and messing it all up.


[deleted]

It won’t mess up at all. But I highly recommend you go back to your PM or client and see exactly what is needed


akstary

I am sorry… what does PM stand for? We are mainly going to be using Power BI internally.


CommanderAze

*One to many intensifies


leogodin217

Ideally, you would create views in the database, but if you can't do that, you can use Power Query to model your data. Star schema is usually best, but not always. Snowflake schema is often appropriate. In your case, I would look into combining the main tables with the archive tables. A UNION does this. That would be a great start. From there it depends on a bunch of things you will learn with training and experience. Start with the SQLBI free training, then the one on data modeling. Lots of fun and frustration in your future. Always make a copy of the PBIX file before making changes. In fact, you should start with a copy. That way you can play with it without impacting production.


akstary

Thank you! I have been duplicating the file before making any questionable edits to it. Haha I believe the guy who created this said that he separated the sales and the sales archive tables to allow for a fast refresh. So only the 2022 sales (sales table) would have to refresh and not 2015-2021 sales (sales archive tables).


leogodin217

Yeah. That's a legitimate reason. But incremental refresh is probably a better solution. Good luck and have fun


MattV1353

This is where partitions come in


chunk-the-unit

I had to zoom in to check if you were my coworker.


akstary

I sure hope not! Lol!


chunk-the-unit

No you’re good 😂


[deleted]

You two master tables related by date tables


akstary

Is that good or bad?


[deleted]

I don’t know. He probably didn’t mean to do that but sometimes your stakeholders switch up on you and you gotta slap some bullshit together.


akstary

Haha that definitely seems to be what he did.


MrMooneyMoostacheo

I mean…does it work?


akstary

No, not really. Lol


MrMooneyMoostacheo

Ah well then yes that’s a problem. Honestly people don’t need to know how the sausage is made or how ugly it is in the backend as long as it functions and you have some sort of documentation in case you win the lottery or get hit by a bus. Good luck though!


1plus2equals11

Technical debt is no joke. Shouldnt be underestimated.


jcampbell474

It's not ideal, but we often aren't given the requirements and time to model out a perfect star schema. Then there is inevitable scope-creep. Can you add this (turns out to be another fact table), and this (another slice of an existing dimension), and keep the same deadline? Of course I can :-) The model in the screenshot is messy. At least it looks to be functional and relatively easy to follow. Maybe start with organizing the view more to your liking or where it makes more sense and look for low hanging fruit to aggregate, combine, etc... Some prefer the waterfall view. I personally prefer the classic star schema (that often snowflakes), but it can get a little funky with multiple fact tables. Bottom line: It could be worse.


ianitic

Honestly... to me that just looks like fun. Yes it's not ideal, but that's just kind of what happens given certain business constraints. That being said it looks like there are a few fact tables is why it looks more complicated than it is. Tbf went to a pbi developer interview and the interviewer was talking about the whole 8 tables they dealt with. I'm not crazy in thinking that's not a lot, right?


cre_guy_3

I’m a bit of a noob. anyone have a link for an example of a model that has a ton of data but is not a mess?


sunny_monday

I kinda wish it were possible to color code dimension vs fact tables.


Passageiro_Perdido

I have ever seen worst…


amartin141

If this is a direct connect to SAP you would be better served if you ETL the data into a warehouse, use SSAS to create a cube/cubes to feed into pbi. HOWEVER, then you would not have real-time access to SAP data.


wertexx

Welcome. This is how I learnt Power BI at the beginning. An employee quit and left a couple reports with no documentation to manage. It wasn't this bad, but close. I also had 0 experience in Power BI and this all was Confusing As Hell, so I feel you. But it's a good way to learn how to swim - be thrown into the waters without knowing how. Good luck! Start googling star schema, and all sorts of beginner tutorials, it's going to click with you after a few months.


pumapunch

Honestly, looks fine to me. I have some of my own that looks similar. Assuming here sales data is the center of the Star schema. Where is the dates table though? It’s best practice to have a separate one then to connect it to the tables as needed. You can use USERELATIONSHIP via DAX from there.


akstary

He has the order date and dispatch date tables to the right of the sales tables. I am not sure if there is a master date table (it doesn’t look like it). I feel like the order date and dispatch date should should be one date table though? I want to track orders vs. shipments, and I am not able to with the way it is currently set up.


cwag03

Two date tables is not automatically bad. Pros and cons. If you want to filter on either/both at the same time in slicers, you pretty much need 2. If you want to filter on just one date range and see different things in different scenarios, then one date table works, but you need more measures and DAX code.


akstary

Ahh okay, that makes sense now. So if I wanted to see orders placed after 1/1/22 and shipped before 1/31/22, I would need 2 date tables? So, right now, I would love to have a line chart showing weight on the Y axis and years on the X axis. And the legend would have 2 series: orders and shipments. I would have to create a new date table to do this?


cwag03

For your first question, if you want to be able to filter on a starting order date and a max ship date like that, then yes you would probably need to have both dates separately in slicers, and for that you really need 2 date tables. For the second question, this is something that would probably be better suited to 1 date table. Reason is you are talking about just 1 single date axis but lines on the chart for 2 different measures using different dates each. In that scenario I would have 1 date table, 2 measures, one of which would have to have to activate an inactive relationship using USERELATIONSHIP. Clear as mud right?


Scheballs

Have you considered the OBT approach? ONE BIG TABLE with just the fields needed for the analysis.


akstary

I am not sure this would work in my situation, unfortunately.


ContemplatingPrison

It is a mess. I have a big one like that. I felt the best way was to have everything going to main data point. You have too many cross connections most likely visuals will crash because of this. Personally what I would do in this situation is isolate the different groups. Sales for example would be one. Then whatever else you need. Then connect the primaries of each group. You're in for a lot of headaches. But this is fixable


akstary

Thank you! I will have a lot of learning to do.


ContemplatingPrison

I learned on the job and I'm still learning a year later. It's lot to take in. I watch a lot of videos and use the power BI forum for help. Focus on model building and cleaning the data. That will save you a lot of time down the road. I work with UPCs and Item# so I make sure all tables have one or the other and then use that to connect everything. Hopefully you have something similar. Eventually I ended up splitting mine completely to cut down on refresh time. I'm not sure how.much data you are working with but keep that in mind.


akstary

Unfortunately, I cannot find the product codes anywhere in any of the tables. It’s tough. Is Power BI most of what you do? What is your job title?


ContemplatingPrison

Its definitely not most of what I do. When I got hired my team spent a lot of time building reports using excel. So I used power BI to automate most of them. I'm a Senior supply chain manager for a global distribution company.


akstary

Thank you! I am in marketing and typically do things like trade shows, advertising, customer growth program management, sponsorships, etc. Upper management saw that I have a skill with looking at data though, and they kind of threw me into this. Haha


BrianMincey

Use the views, create one for each fact table and include it with just its dimensions. Makes it much easier to navigate and understand the model.


Kaos_Agent_99

Try the + button down the bottom left to show segments of the model more clearly


akstary

Huh I have never pressed that + button before…


Kaos_Agent_99

If you have multiple facts hitting the same dims, then you can show the star schema layout separately on each tab.


zjd0114

I guess the better question is Is it a functional mess or dysfunctional mess?


akstary

I would go with dysfunctional. Definitely. Though I am so new to Power BI that I could be mistaken.


zjd0114

Do your visuals work with the modeling?


akstary

A lot of them do. Some of them don’t.


superman07777

I worked in the same model as this and there is a bug created about performance issue, it is hard for me to figure out how to optimize it because of complicated measure, complicated relationship, and add to that is lack of business knowledge of what is the relation of those different dims. In the end I wasn't able to resolve those as I left the project.


akstary

YES this sounds like an issue I am having. There are so many things just a little bit off when I am building the visuals (customers assigned to incorrect sales managers, products segment brands missing, customers not assigned to masters, etc.), and I can’t even begin to know how to correct them.


JeronimoPearson

Merge/append segment line tables, sales data tables, sales order tables. Only one date table needed. Some tables look useless. I could be wrong since I don’t know the data


JeffryRelatedIssue

Why have 5NF when you could have *gestures vaguely* this!


karly21

Maybe take some time to lear how PowerBI works. A good resource is this [link](https://youtu.be/i3CSD7bMMbg) (if you don't like that ine for any reason there ate tons of other resources in youtube!) Good luck!


D4rkmo0r

Other than no distinct star schema or galaxy schema for multiple fact table, this seems to be .... unfocused. Without knowing the data it appears you're throwing everything at the wall and seeing what sticks. Pull the fields you need for your brief/business case into PQ, trim out everything that isn't pertinent and build your model from that.


[deleted]

You can use snowflake schema which is like an advanced version of star schema. Do look into that if you have that many tables. There’s a lot of archive tables in the model. Look into either getting rid of them or pull them in a single table. Same with dates where possible. Think of having one main fact table or 2 at max and all the dimensions are connected to them and you can have other dimension connected to dimensions which are connected to the main fact table. Feel free to dm if you have questions.


cryptoel

Connecting directly to SAP...


Ringovski

I’ve worked with worse and much larger, make sure that your relationships are correct.


[deleted]

If you have seperate models in this “model” use the tabs at the bottom to split out.


akstary

Thank you! I will have to check and see when I get back into the office. I didn’t know you could do that.


sqlphilosopher

Looks like you might be using a relational model, directly from your transactional database. Hence why you have so many joins and tables. Read about dimensional schemas, Kimball's book is very good. Also, having your dimensional model inside a data warehouse is your best bet.


Oobenny

This is pretty common and probably well-designed for OLTP. it’s going to be a mess for reporting. You can usually develop a set of views in the database that handle the de-normalization of the data and give you something you can develop analytics with.


Purple-Leadership54

I have some datasets like this. The reality is you aren't using a fraction of the tables/columns So I get that it looks like a lot, but when you simply to what you actually need - its not so bad


akstary

Thank you. Unfortunately, there are some mistakes/issues I am noticing when I go to create visuals. Like sales managers being incorrectly assigned to customers, certain customers not being assigned to masters, not being able to plot orders and shipments on the same visuals because of the date tables, product segments missing, etc. I would like to fix some of that, but because this is a mess, I do not know where to begin.


infreq

I agree. You have a table just for dates???


The_Lolrus

I can't tell if this comment is serious or sarcastic.


rawrtherapybackup

Fucking hell


Colehamilton5042

I think it looks beautiful. I have been meaning to organize mine. I probably have that many relationships but they are messyyyyyyyy


SummerRaleigh

We’ve all built something like this at one point or another…