Let’s consider a small data set with 12 observations sorted from lowest to highest: (1,1,4),(4,5,8),(8,9,10),(10,12,13). I grouped the observations into four equal groups so that we can easily spot the quartiles. (I purposefully made the numbers at the border of each quartile equal so we don’t have to worry about calculating quartiles in a discontinuous distribution.) The first quartile is 4, the median is 8, and the third quartile is 10, so the interquartile range (IQR) is 10 - 4 = 6. The box in the box plot shows the median and the first and third quartiles. The upper whisker extends to the largest value that is no greater than the third quartile plus 1.5 times the interquartile range. In this case, the third quartile plus 1.5 times IQR is 10 + 1.5*6 = 19. The largest value that is no greater than 19 is 13, so the upper whisker will reach to 13. The lower whisker is defined analogously: it extends to the smallest value that is no less than the first quartile minus 1.5 times the IQR. Here that limit is 4 - 1.5*6 = -5, so the lower whisker will reach to 1.

Suppose that the last observation is 19 instead of 13, so the data looks like this: (1,1,4),(4,5,8),(8,9,10),(10,12,19). The quartiles are all the same, but the largest value no greater than 19 is 19. Thus the upper whisker will reach to 19.

Finally, suppose that the last observation is 20 instead of 19, so the data looks like this: (1,1,4),(4,5,8),(8,9,10),(10,12,20). The quartiles are all the same, but the largest value no greater than 19 is now 12. Thus, the upper whisker will reach only to 12. The plotting software will draw the value 20 as an individual point, because it lies above the third quartile plus 1.5 times the IQR. Because the point is outside of that range, it is often considered an outlier.
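The three cases can be checked with a few lines of Python. This is only a minimal sketch: the function name `whiskers` and the parameter `k` are my own, and it relies on `statistics.quantiles` from the standard library with its default settings. Quartile conventions differ between tools, but for this data set they all agree because the values at the quartile borders are equal.

```python
from statistics import quantiles


def whiskers(data, k=1.5):
    """Return (lower whisker, upper whisker, outliers) using the 1.5*IQR rule."""
    # Assumption: default quartile method of statistics.quantiles;
    # other software may use a slightly different convention.
    q1, _median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower_limit = q1 - k * iqr
    upper_limit = q3 + k * iqr
    # Whiskers reach to the most extreme observations inside the limits.
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    # Everything outside the limits is drawn as an individual point.
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return min(inside), max(inside), outliers


# The first 11 observations are the same in all three cases.
base = [1, 1, 4, 4, 5, 8, 8, 9, 10, 10, 12]
for last in (13, 19, 20):
    print(last, whiskers(base + [last]))
```

With 13 or 19 as the last observation, the upper whisker reaches that value and no points are flagged. With 20, the upper whisker stops at 12 and 20 is returned as the only outlier.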

In summary, if no individual data points are plotted, the whiskers indicate the data’s minimum and maximum. If individual data points are plotted, the whiskers indicate the lowest and largest values that lie inside the range from the first quartile minus 1.5 times the IQR to the third quartile plus 1.5 times the IQR.

The figure below summarizes all three cases.