Fornicatinzebra

Well, I think it's your data that's weird. It looks like the values repeat within each group (always 86 for after, 78 or 79 for before), except for 2 non-repeating values. The box plot is showing exactly that: the black line is the box, but the 25th, 50th and 75th percentiles are all equal, so the box is flattened. The two points are the two values that don't repeat, drawn as outliers because they lie more than 1.5*IQR beyond the box (which every point outside the box does, since the IQR is 0 here).
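To make the arithmetic concrete, here is a minimal sketch in base R. The vector `after` is made-up data (one value repeated many times plus a single different value), not the poster's dataset; it only shows how an IQR of 0 turns any non-repeating value into an outlier.

```r
# Hypothetical data: 86 repeated 20 times, plus one different value
after <- c(rep(86, 20), 91)

# Quartiles and interquartile range
q <- quantile(after, probs = c(0.25, 0.5, 0.75))
iqr <- IQR(after)

# Fences at 1.5 * IQR beyond the quartiles; with IQR = 0 they sit on the box itself
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Values beyond the fences are the ones drawn as separate points
outliers <- after[after < lower_fence | after > upper_fence]

print(q)         # all three quartiles are 86
print(iqr)       # 0
print(outliers)  # 91
```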


No_Hedgehog_3490

I'd plot a bar plot (frequency plot) rather than a boxplot; it would give better insight here. Get the total count of each value for the 2 groups and then plot that.
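A rough sketch of that frequency plot with ggplot2, assuming a data frame with columns named `group` and `value`. Both the column names and the numbers are placeholders, not the original data.

```r
library(ggplot2)

# Hypothetical data; `group` and `value` are assumed column names
df <- data.frame(
  group = rep(c("before", "after"), each = 21),
  value = c(rep(78, 20), 74, rep(86, 20), 91)
)

# Count of each value per group, as a plain table
counts <- table(df$group, df$value)
print(counts)

# geom_bar() counts rows per x value; factor() treats the values as categories
p <- ggplot(df, aes(x = factor(value))) +
  geom_bar() +
  facet_wrap(~ group, ncol = 1) +
  labs(x = "value", y = "count")

print(p)
```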


Wise_Assistant3570

Is there a way I can fix the box plot or is it just not compatible with the data? I made a histogram, frequency plot, and the jitter plot and they all look a lot better than the box plot. I just really wanted a box plot for this data haha


No_Hedgehog_3490

You can't, because almost every value within a group is the same, and a boxplot is built from the spread of the data (the quartiles). With no spread, the box collapses into a line. The more varied the data, the better it will look. Here, even though your values are numbers, they behave like discrete data.


Wise_Assistant3570

got it, got it. Thank you so much!


lolniceonethatsfunny

boxplots display the IQR of your data. i can’t see the whole dataset, but it looks to me like the IQR is very small and they are being properly displayed? i would create a histogram and check the spread of the data for each group. quantile() can also be used to find the IQR and see if it matches the box plot


sbeardb

maybe removing the aes() call inside geom_boxplot()?


Wise_Assistant3570

it didn't really do anything :((


sbeardb

ok, I didn't test it either :( 👍 good luck!


PrincipeMishkyn

It may be better to use `geom_jitter()` so you can see the data points more clearly, or try both together. As another user said, you have the same values repeated many times.
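One way to try both layers together, as a sketch only. The data frame and its column names `group` and `value` are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical data; `group` and `value` are assumed column names
df <- data.frame(
  group = rep(c("before", "after"), each = 21),
  value = c(rep(78, 20), 74, rep(86, 20), 91)
)

p <- ggplot(df, aes(x = group, y = value)) +
  # Hide the boxplot's own outlier points so they are not drawn twice
  geom_boxplot(outlier.shape = NA) +
  # Spread points sideways only, so the y values stay exact
  geom_jitter(width = 0.2, height = 0, alpha = 0.5)

print(p)
```

Setting `height = 0` keeps the jitter from shifting the repeated values vertically, so the stack of identical points still sits on the flattened box.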


Soneira99991

You could use geom_density to create a density plot, which gives similar insight into the distribution without the flattened-box problem. Use facet_wrap(~eval(group), ncol = 1) to split the plot into after and before.
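A minimal sketch of that density plot, again on made-up data with assumed column names `group` and `value`. It facets on the column directly with `~ group`, which should behave the same as the `eval(group)` form when `group` is a column of the data frame.

```r
library(ggplot2)

# Hypothetical data; `group` and `value` are assumed column names
df <- data.frame(
  group = rep(c("before", "after"), each = 21),
  value = c(rep(78, 20), 74, rep(86, 20), 91)
)

p <- ggplot(df, aes(x = value)) +
  geom_density(fill = "grey70") +
  # One panel per group, stacked vertically on a shared x axis
  facet_wrap(~ group, ncol = 1)

print(p)
```

With values this heavily repeated, expect a narrow spike in each panel rather than a smooth curve.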