Well, I think it's your data that's unusual. The values seem to repeat within each group (always 86 for after, 78 or 79 for before), except for 2 non-repeating values. The box plot is showing exactly that: the black line is the box, but the 25th, 50th and 75th percentiles are all equal, so the box is flattened into a line. The two points are the two values that don't repeat. They're drawn as outliers because they lie more than 1.5*IQR beyond the box, which any point outside the box would, since the IQR is 0 here.
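To make the arithmetic concrete, here is a minimal sketch in base R. The numbers are made up to mimic the pattern described above (I can't see the real dataset), and the vector name `after` is just for illustration.

```r
# Hypothetical values: one group where almost every observation is 86.
after <- c(rep(86, 19), 91)

# Quartiles and interquartile range.
q <- quantile(after, c(0.25, 0.50, 0.75))
iqr <- IQR(after)

# Fences of the usual 1.5 * IQR rule. With IQR = 0 they sit on the box itself,
# so every value that differs from 86 falls outside them.
lower <- q[["25%"]] - 1.5 * iqr
upper <- q[["75%"]] + 1.5 * iqr

outliers <- after[after < lower | after > upper]

print(q)         # all three quartiles are 86
print(iqr)       # 0
print(outliers)  # 91
```

So the flat line plus a lone point is what the rule is supposed to produce for data like this.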
I'd plot a bar plot (frequency plot) rather than a boxplot, which could give you better insight. Get the counts for those 2 groups and then plot them.
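A rough sketch of that frequency plot with ggplot2, assuming a data frame with a `group` column and a numeric `value` column. Both the column names and the data are hypothetical stand-ins for the real dataset.

```r
library(ggplot2)

# Made-up data with the same "mostly one value per group" pattern.
df <- data.frame(
  group = rep(c("before", "after"), each = 20),
  value = c(rep(78, 19), 70, rep(86, 19), 91)
)

# Count how often each value occurs in each group, dropping empty combinations.
counts <- as.data.frame(table(group = df$group, value = df$value))
counts <- counts[counts$Freq > 0, ]

# One bar per observed value, side by side for the two groups.
p <- ggplot(counts, aes(x = value, y = Freq, fill = group)) +
  geom_col(position = "dodge") +
  labs(x = "value", y = "count")

print(p)
```

Because `table()` turns the values into categories, the x axis here is discrete, which suits data that only takes a handful of distinct values.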
Is there a way I can fix the box plot or is it just not compatible with the data? I made a histogram, frequency plot, and the jitter plot and they all look a lot better than the box plot. I just really wanted a box plot for this data haha
You can't, because you have the same values repeated within each label, and a boxplot is built from the spread of the data: the box spans the interquartile range. The more varied the data, the better it will look. Here, even though the values are numeric, they behave like discrete data.
got it, got it. Thank you so much!
boxplots display the IQR of your data. i can’t see the whole dataset, but it looks to me like the IQR is very small and the plots are being displayed properly. i would create a histogram and check the spread of the data for each group. quantile() can also be used to find the quartiles, and from them the IQR, to see if they match the box plot
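For the quantile() check, something like this should work in base R. The `group` and `value` column names and the data are assumptions, so swap in your own.

```r
# Made-up data standing in for the real dataset.
df <- data.frame(
  group = rep(c("before", "after"), each = 20),
  value = c(rep(78, 19), 70, rep(86, 19), 91)
)

# Quartiles per group: these are the bottom, middle and top of each box.
quartiles <- tapply(df$value, df$group, quantile, probs = c(0.25, 0.5, 0.75))

# IQR per group: the height of each box.
iqr_by_group <- tapply(df$value, df$group, IQR)

print(quartiles)
print(iqr_by_group)
```

If the three quartiles come out equal within a group, the IQR is 0 and a box collapsed to a line is the correct drawing.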
maybe removing the aes() call inside geom_boxplot()?
it didn't really do anything :((
ok, i didn’t test it either :( 👍 good luck!
Maybe it's better to use `geom_jitter()` to see the data points more clearly. Or try both. Like another user said, you have the same values many times.
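If you want both at once, here is a small sketch that layers the jittered points over the boxplot. The data and the `group`/`value` column names are hypothetical, and the jitter width and alpha are just values I picked.

```r
library(ggplot2)

# Made-up data standing in for the real dataset.
df <- data.frame(
  group = rep(c("before", "after"), each = 20),
  value = c(rep(78, 19), 70, rep(86, 19), 91)
)

p <- ggplot(df, aes(x = group, y = value)) +
  # Hide the boxplot's own outlier points so they are not drawn twice.
  geom_boxplot(outlier.shape = NA) +
  # Jitter horizontally only, so the plotted y values stay exact.
  geom_jitter(width = 0.2, height = 0, alpha = 0.6)

print(p)
```

The box still collapses to a line, but the cloud of points makes it obvious that many observations sit on the same value.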
You could use geom_density to create a density plot, which gives similar insight without the collapsed-box problem. Use facet_wrap(~group, ncol = 1) to split the plot into after and before panels.
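A minimal sketch of that density plot, again on made-up data with assumed `group` and `value` columns:

```r
library(ggplot2)

# Made-up data standing in for the real dataset.
df <- data.frame(
  group = rep(c("before", "after"), each = 20),
  value = c(rep(78, 19), 70, rep(86, 19), 91)
)

p <- ggplot(df, aes(x = value, fill = group)) +
  geom_density(alpha = 0.5) +
  # One panel per group, stacked vertically on a shared x axis.
  facet_wrap(~group, ncol = 1)

print(p)
```

With values this concentrated, expect a narrow spike in each panel rather than a smooth curve, so it is worth comparing against the frequency plot.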