How are density plots used to compare distributions?
Like histograms, density plots show the distribution of data. They are especially useful for comparing distributions, because several smooth curves can be overlaid on the same axes.
How to compare a sample with a distribution?
When we compare a sample with a theoretical distribution, we can use a Monte Carlo simulation to build the distribution of the test statistic under the null hypothesis. For instance, to test whether a set of p-values is uniformly distributed (i.e. a p-value uniformity test), we can repeatedly simulate uniform random variables, compute the KS test statistic for each simulated sample, and compare the statistic of the actual sample against that simulated distribution.
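The procedure can be sketched in base R as follows. This is only a minimal illustration: the sample size, the number of replicates, the helper ks_stat and the placeholder vector p_values are all assumptions made for the example, not values taken from the text.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

n_obs <- 200   # size of the observed sample (assumed)
n_sim <- 1000  # number of Monte Carlo replicates (assumed)

# Hypothetical observed p-values; replace with the real ones.
p_values <- runif(n_obs)

# KS statistic of a sample against the Uniform(0, 1) distribution.
ks_stat <- function(x) unname(ks.test(x, "punif")$statistic)

observed_stat <- ks_stat(p_values)

# Null distribution: KS statistics of simulated uniform samples.
null_stats <- replicate(n_sim, ks_stat(runif(n_obs)))

# Monte Carlo p-value: share of simulated statistics that are
# at least as large as the observed one.
mc_p_value <- mean(null_stats >= observed_stat)
mc_p_value
```

A small mc_p_value would suggest that the observed p-values are further from uniform than most simulated uniform samples of the same size.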
How is the KS test used to compare two distributions?
As a non-parametric test, the KS test makes no assumption about the shape of the underlying distributions, so it can compare any two continuous distributions, whether they are normal, uniform or something else. In practice, the KS test is useful because it is cheap to compute and can distinguish a sample either from another sample (the two-sample test) or from a theoretical distribution such as the normal or uniform (the one-sample test).
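Both uses are available through ks.test in base R. The sketch below uses simulated data; the sample sizes and the shift between sample_a and sample_b are assumptions chosen only to make the difference visible.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

sample_a <- rnorm(300, mean = 0, sd = 1)
sample_b <- rnorm(300, mean = 1, sd = 1)  # shifted on purpose

# Two-sample test: do sample_a and sample_b share one distribution?
two_sample <- ks.test(sample_a, sample_b)

# One-sample test: is sample_a compatible with a standard normal?
one_sample <- ks.test(sample_a, "pnorm", mean = 0, sd = 1)

two_sample$statistic  # largest gap between the two empirical CDFs
two_sample$p.value    # small here, because sample_b is shifted
one_sample$p.value
```

The statistic is the largest vertical distance between the two cumulative distribution functions being compared, which is why no distributional family has to be assumed.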
How to compare two distributions in real life?
The red line marks the KS test statistic of the actual sample, and the green line shows the test statistic for 1000 random normal variables. Placing the actual statistic (the red line) on the plot shows that it falls inside the simulated distribution, so the sample is consistent with the reference distribution.
How to make a multiple density plot in ggplot?
The result is a multiple density plot in ggplot2, filled with two colors that correspond to the two levels of the second, categorical variable. If the categorical variable had five levels, ggplot2 would draw a multiple density plot with five densities.
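A minimal sketch of such a plot is shown here. The original data set is not given, so the built-in ToothGrowth data is used as a stand-in: its supp column is a factor with two levels, which gives two filled densities.

```r
library(ggplot2)

# ToothGrowth ships with R; supp is a factor with two levels (OJ, VC).
# Mapping it to fill makes ggplot2 draw one density per level.
p_two <- ggplot(ToothGrowth, aes(x = len, fill = supp)) +
  geom_density(alpha = 0.5) +
  labs(x = "Tooth length", y = "Density", fill = "Supplement")

p_two
```

Mapping a factor with more levels to fill would produce one density per level without any other change to the code.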
How to make a density plot in R?
First install and load the package. In this example, I am using the iris data set and comparing the distribution of sepal length across species. After you load the data set, build the density plot. Chris also shared an R script with us for making a fancier density plot.
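The steps described for the iris example might look like the sketch below. It is not the script mentioned above, and it assumes that the package in question is ggplot2, which the text does not state explicitly.

```r
# install.packages("ggplot2")  # run once if the package is missing
library(ggplot2)

data(iris)  # built-in data set with one row per flower

# One density curve of sepal length per species.
p_iris <- ggplot(iris, aes(x = Sepal.Length, colour = Species)) +
  geom_density() +
  labs(x = "Sepal length (cm)", y = "Density")

p_iris
```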
What is a density chart with several groups?
A multi density chart is a density chart in which several groups are represented, which lets you compare their distributions. The issue with this kind of chart is that it clutters easily: groups overlap each other and the figure becomes unreadable. An easy workaround is to use transparency.
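The effect of transparency can be sketched with ggplot2 as follows. The diamonds data set (bundled with ggplot2) and the alpha value of 0.3 are assumptions picked for illustration; any grouped numeric data would do.

```r
library(ggplot2)

# Opaque fills: groups drawn later hide the ones drawn earlier.
p_opaque <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density()

# Semi-transparent fills: alpha = 0.3 keeps all five groups visible
# where they overlap.
p_clear <- ggplot(diamonds, aes(x = price, fill = cut)) +
  geom_density(alpha = 0.3)

p_opaque
p_clear
```

Lower alpha values make overlaps easier to read but wash out the colors, so the value usually needs some tuning to the number of groups.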