
Hello.

My name's Miss Parnham.

In this lesson, we will learn how to plot a box plot and compare distributions.

A box plot is drawn from five key pieces of information: the minimum and maximum values, the lower and upper quartiles, and the median of a frequency distribution. All of this we can find from a cumulative frequency diagram.

So we're going to use this to draw a box plot.

We need a scale for our box plot.

This scale looks exactly the same as the horizontal axis on the cumulative frequency diagram.

We're also going to put the values into a table.

Let's start with the median.

You can see the diagram ends at 40, so the median will be the middle value, the 20th value.

Let's take a reading from the graph by first drawing a horizontal line, and then a vertical line down to the horizontal axis.

This gives us 4.2.

Above the scale at 4.2 we draw a vertical line.

Let's now repeat this process for the lower quartile.

This time we need to rule a line from 10 because it is a quarter of the way through the data.

And 10 is a quarter of 40.

We take a reading in exactly the same way, which gives 2.8, and put a vertical line above 2.8 on the scale.

Let's repeat for the upper quartile.

Three quarters of 40 is 30.

So we draw a horizontal line, and then taking a reading gives us 5.4, and we place a vertical line above 5.4.
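If you wanted to check where to read across from on the cumulative frequency axis, a minimal Python sketch might look like this. The function name `quartile_positions` is made up for illustration; it simply takes a quarter, a half and three quarters of the total frequency, as we did for 40.

```python
def quartile_positions(total_frequency):
    """Return the cumulative frequency values to read across from."""
    return {
        "lower quartile": total_frequency / 4,
        "median": total_frequency / 2,
        "upper quartile": 3 * total_frequency / 4,
    }


# The cumulative frequency diagram in this lesson ends at 40.
positions = quartile_positions(40)
for name, position in positions.items():
    print(name, position)
```

For a total of 40 this gives 10, 20 and 30, which are the three horizontal lines we drew.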

We then join the lower quartile and the upper quartile with a horizontal line top and bottom.

This forms a box.

This box shows us the interquartile range because the width of the box is the interquartile range.

Now let's add the minimum and maximum values.

We can see from the graph the minimum is zero.

So from the midpoint of the line for the lower quartile we extend a horizontal line to zero and draw a vertical line to represent the minimum value.

And then on the other side of the box, the maximum value will be 10.

So we extend a horizontal line from the midpoint of the line indicating the upper quartile to the 10, which is the maximum value.

Sometimes this is called a box and whisker plot.

And these two horizontal lines that go out to the minimum and the maximum values can be referred to as whiskers.

Remember, the interquartile range is the width of the box: the difference between the lower quartile and the upper quartile.
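The five values we read from the diagram can be kept together and the two widths worked out from them. This is only a sketch: the class name `FiveNumberSummary` and its method names are assumptions chosen for illustration, and the numbers are the readings from this lesson.

```python
from dataclasses import dataclass


@dataclass
class FiveNumberSummary:
    minimum: float
    lower_quartile: float
    median: float
    upper_quartile: float
    maximum: float

    def value_range(self) -> float:
        # Width of the whole diagram: maximum minus minimum.
        return self.maximum - self.minimum

    def interquartile_range(self) -> float:
        # Width of the box: upper quartile minus lower quartile.
        return self.upper_quartile - self.lower_quartile


# Readings taken from the cumulative frequency diagram in this lesson.
summary = FiveNumberSummary(0, 2.8, 4.2, 5.4, 10)

# Rounded to one decimal place, matching the precision of the readings.
print(round(summary.value_range(), 1))
print(round(summary.interquartile_range(), 1))
```

The rounding is there because decimal readings such as 5.4 and 2.8 are not stored exactly by the computer, so the raw subtraction can be out by a tiny amount.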

Here's a question for you to try.

Pause the video to complete the task and restart the video when you're finished.

Here are the answers.

Interpreting box plots is all about reading from a scale.

So we can also see that the range is 52 because it's the width of the whole diagram.

And the interquartile range is 21.

And that's the width of the box.

Here's another question for you to try.

Pause the video to complete the task and restart the video when you've finished.

Here are the answers.

Sometimes people draw box plots directly underneath the cumulative frequency diagram, and they use the horizontal scale as the scale for the box plot too.

The scale doesn't have to go underneath.

It can go above the box plot.

Box plots are ideal for comparing two or more distributions.

We can draw them quite closely together, and therefore it's easy to make comparisons.

Let's look at the information about test scores from class A and class B.

Starting with class A we have a median of 58 and the minimum and maximum are 12 and 89.

So the range is 77, because that's the difference between them.

Class A has a lower quartile of 33 and an upper quartile of 68.

So an interquartile range of 35.

Class B has a median of 55.

The minimum and maximum are 21 and 84 respectively.

So the range is 63 and the lower and upper quartiles are 45 and 72.

So the interquartile range is 27.
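The arithmetic for both classes can be laid out as a small table. Here is a rough Python sketch using the values read from the two box plots; the dictionary keys such as `lq` and `uq` are just shorthand chosen for this example.

```python
# Values read from the two box plots in the lesson.
classes = {
    "A": {"min": 12, "lq": 33, "median": 58, "uq": 68, "max": 89},
    "B": {"min": 21, "lq": 45, "median": 55, "uq": 72, "max": 84},
}

for stats in classes.values():
    # Range is the width of the whole diagram.
    stats["range"] = stats["max"] - stats["min"]
    # Interquartile range is the width of the box.
    stats["iqr"] = stats["uq"] - stats["lq"]

print(f"{'Class':<6}{'Median':>8}{'Range':>8}{'IQR':>6}")
for name, stats in classes.items():
    print(f"{name:<6}{stats['median']:>8}{stats['range']:>8}{stats['iqr']:>6}")
```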

When we're asked to compare distributions, we need to make one comment about an average and one comment about spread.

The only average we have here is the median.

We can see that class A has a greater median than class B.

So the average test score for class A was higher than that of class B.

With regard to spread, we can either make a comment about the range or the interquartile range.

So here the results for class A have a greater interquartile range, so the results are more spread out around the median than in class B.

So they are not as consistent.
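The two comments, one about average and one about spread, follow the same pattern every time, so they could be sketched as a small function. The name `compare_distributions` and the wording of the sentences are assumptions for illustration, not a fixed mark scheme.

```python
def compare_distributions(name_1, median_1, iqr_1, name_2, median_2, iqr_2):
    """Return one comment about average and one comment about spread."""
    comments = []

    # Comment about average, using the median.
    if median_1 > median_2:
        comments.append(f"{name_1} has a greater median than {name_2}, so its average is higher.")
    elif median_1 < median_2:
        comments.append(f"{name_2} has a greater median than {name_1}, so its average is higher.")
    else:
        comments.append(f"{name_1} and {name_2} have the same median.")

    # Comment about spread, using the interquartile range.
    if iqr_1 > iqr_2:
        comments.append(f"{name_1} has a greater interquartile range, so its results are less consistent.")
    elif iqr_1 < iqr_2:
        comments.append(f"{name_2} has a greater interquartile range, so its results are less consistent.")
    else:
        comments.append(f"{name_1} and {name_2} have the same interquartile range.")

    return comments


# Medians and interquartile ranges for class A and class B from the lesson.
comments = compare_distributions("Class A", 58, 35, "Class B", 55, 27)
for comment in comments:
    print(comment)
```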

Here's a question for you to try.

Pause the video to complete the task and restart the video when you're finished.

Here are the answers.

Putting the results in a table helps to ensure we compare the correct data.

When you're asked for two comparisons, always give one regarding average, which in this case is the median, and one about the spread of the data, which can be about the range or the interquartile range.

Here's another question for you to try.

Pause the video to complete the task and restart the video when you're finished.

Here are the answers.

You needed to work out the maximum in this problem by adding the range onto the minimum.

So instead of commenting on the interquartile range, you could have talked about the range being greater for the self-service checkout, again showing that its results are not as consistent as those from the staffed checkout.
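Working out a missing maximum is just the range formula rearranged: since the range is the maximum minus the minimum, the maximum is the minimum plus the range. A tiny sketch, where the function name and the numbers 3 and 12 are hypothetical and not taken from the question:

```python
def maximum_from_range(minimum, value_range):
    # range = maximum - minimum, so maximum = minimum + range
    return minimum + value_range


# Hypothetical values for illustration only: a minimum of 3 and a range of 12.
maximum = maximum_from_range(3, 12)
print(maximum)
```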

That's all for this lesson.

Thank you for watching.