In This Part: The Median
Another useful summary measure for a collection of data is the median. As you learned in Session 2, the median is the middle data value in an ordered list. Here’s one way to find the median of our ordered noodles.
First, place your 11 noodles in order from shortest to longest on a new piece of paper or cardboard. Your arrangement should look something like this:
Next, remove two noodles at a time, one from each end, and put them to the side:
Continue this process until only one noodle remains. This noodle is the median. Label it “Med”:
Notice that the median divides the set of 11 noodles into two groups of equal size — the five noodles shorter than the median and the five noodles longer than the median. Another way to say this is that there are just as many noodles before the median as there are after the median.
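If you'd like to see the pair-removal procedure written out as steps, here is a minimal Python sketch. It is only an illustration: the function name `median_by_removing_pairs` and the 11 noodle lengths (in millimeters) are made up for this example and are not the lengths of the noodles in the activity.

```python
def median_by_removing_pairs(lengths):
    """Find the median of an odd-sized collection by repeatedly
    removing the shortest and the longest remaining value."""
    if len(lengths) % 2 == 0:
        raise ValueError("this sketch handles only an odd number of values")
    remaining = sorted(lengths)        # order from shortest to longest
    while len(remaining) > 1:
        remaining = remaining[1:-1]    # remove one noodle from each end
    return remaining[0]                # the last noodle left is the median


# Hypothetical noodle lengths in millimeters (not the activity's actual data).
noodles = [52, 17, 88, 41, 63, 29, 75, 110, 34, 96, 58]

med = median_by_removing_pairs(noodles)
shorter = [x for x in noodles if x < med]
longer = [x for x in noodles if x > med]

print(med, len(shorter), len(longer))  # prints: 58 5 5
```

With these made-up lengths, the median noodle is 58 mm long, and five noodles fall on each side of it, just as in the hands-on version.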
Problem B1
If you could see only the median noodle, what would you know about the other noodles?
What would knowing the median tell you about each of the first five (the shortest five) noodles? What would it tell you about each of the last five (the longest five) noodles?
Problem B2
If you could see only the median noodle, describe some information you would not know about the other noodles.
In This Part: The Three-Noodle Summary
Now remove all the noodles except Min, Med, and Max.
We’ll call this display the “Three-Noodle Summary.”
Problem B3
If you could see Min, Med, and Max, what would you know about the other noodles? Be specific about how this compares to Problem A3 (where you only knew Min and Max) and Problem B1 (where you only knew Med).
Problem B4
Describe some information you still wouldn’t know about the other noodles from the Three-Noodle Summary.
In This Part: The Three-Number Summary
Now let’s convert the Three-Noodle Summary to the Three-Number Summary. If they’re not already there, place the three noodles — Min, Med, and Max — in order on the horizontal axis.
Next add a vertical number line, and mark the lengths of the three noodles. (Left)
Remove the noodles, and you’re left with the Three-Number Summary. (Right)
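The same conversion from noodles to numbers can be sketched in Python. This is one possible way to do it, not part of the activity itself: the helper name `three_number_summary` and the noodle lengths are assumptions made for illustration, and the median is computed with the standard library's `statistics.median`.

```python
import statistics


def three_number_summary(lengths):
    """Return (Min, Med, Max) for a non-empty collection of lengths."""
    return min(lengths), statistics.median(lengths), max(lengths)


# Hypothetical noodle lengths in millimeters (not the activity's actual data).
noodles = [52, 17, 88, 41, 63, 29, 75, 110, 34, 96, 58]

print(three_number_summary(noodles))  # prints: (17, 58, 110)
```

Only three numbers are kept; the other eight lengths are discarded, just as the other eight noodles were removed from the display.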
Problem B5
If we call the length of the fourth noodle N4, how does N4 compare to Min, Med, and Max? What wouldn’t you know about N4 if you only knew Min, Med, and Max?
In This Part: Even Data Sets
In the previous example, it wasn’t hard to find the median because there were 11 noodles — an odd number. For an odd number of noodles, the median is the noodle in the middle. But how do we find the median for an even number of noodles?
Add a 12th noodle, with a different length from the other 11 noodles, to the original collection. Arrange the noodles in order from shortest to longest.
Problem B6
Using the method of removing pairs of noodles (the longest and the shortest), try to determine the median noodle length. What happens?
This time, there won’t be one remaining noodle in the middle — there will be two! If you remove this middle pair, you’ll have no noodles left.
Therefore, you’ll need to draw a line midway between the two remaining noodles to play the role of the median. The length of this line should be halfway between the lengths of the two middle noodles:
Move the middle pair aside, and you can see your new median:
Notice that this median still divides the set of noodles into two groups of the same size — the six noodles shorter than the median and the six noodles longer than the median:
The major difference is that, this time, the median is not one of the original noodles; it was computed to divide the set into two equal parts.
Note: It is a common mistake to include this median in your data set when you’ve added it in this way. This median, however, is not part of your data set.
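Here is a short Python sketch of the same idea for an even number of noodles. The function name `median_of` and all of the lengths are hypothetical, chosen only to illustrate the halfway-point rule described above.

```python
def median_of(lengths):
    """Return the median of a non-empty collection of lengths.

    For an odd count, this is the middle value. For an even count,
    it is the value halfway between the two middle values.
    """
    if not lengths:
        raise ValueError("need at least one value")
    ordered = sorted(lengths)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical lengths in millimeters: 11 noodles plus a 12th of a new length.
noodles = [52, 17, 88, 41, 63, 29, 75, 110, 34, 96, 58]
twelve = noodles + [70]

med = median_of(twelve)
shorter = [x for x in twelve if x < med]
longer = [x for x in twelve if x > med]

print(med)                         # prints: 60.5
print(med in twelve)               # prints: False
print(len(shorter), len(longer))   # prints: 6 6
```

In this made-up set, the two middle noodles are 58 mm and 63 mm long, so the median is 60.5 mm. That value is not the length of any of the 12 noodles, and the list `twelve` is unchanged after the median is computed.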
Video Segment
In this video segment, participants discuss the process of finding the median of a data set with an even number of values (in this case n = 20). Watch this video segment to review the process you used in Problem B6 or if you would like further explanation.
Note: The data set used by the onscreen participants is different from the one provided above.
Problem B7
If you could see only the median of a set of 12 noodles, what would you know about the other noodles?
You can convert the Three-Noodle Summary for these 12 noodles to the Three-Number Summary in the same way you did it for the set of 11 noodles:
Add a vertical number line, and mark the lengths of the three noodles:
Remove the noodles, and you’re left with the Three-Number Summary:
In This Part: Review
As we have seen with the noodle examples, the median divides ordered numeric data into two groups, each with the same number of data values.
If you only know the Three-Number Summary (Min, Med, and Max) for a set of data, you can still glean quite a bit of information about the data. You know that all the data values are between Min and Max, and you know that Med divides the data into two groups of equal size. One group contains data values to the left of Med, and the other group contains data values to the right of Med. You also know that the group of values to the left of the median must be lower than (or equal to) the median in value, and that the group of values to the right of the median must be greater than (or equal to) the median in value.
Solutions
Problem B1
You would know that there must be exactly five noodles shorter than the median noodle and five noodles longer than the median noodle.
Problem B2
You would not know the actual values of any of the other noodles: The five shorter noodles could be extremely short, the five longer noodles could be many feet long, they could all be fairly close in size to the median, etc. You would also not know or be able to estimate the maximum or minimum length of the other noodles.
Problem B3
You would know that all of the noodles are between Min and Max, and you can divide the noodles into two equal groups: five that are shorter than Med (including Min) and five that are longer than Med (including Max). This information gives you two specific intervals that contain an equal number of noodles, and all of the noodles are contained in these intervals. This is different from Problem A3, where you knew nothing about the size of the noodles between Min and Max, and from Problem B1, where you knew nothing about the upper and lower boundaries of your data set.
Problem B4
You still wouldn’t know the lengths of the noodles in the two intervals between Min and Med, or between Med and Max. These noodles could be very close to Med, very close to the extreme values, evenly spread within the intervals, or something else entirely. There is no way to know without more information.
Problem B5
You would know that N4 must be at least as long as Min and no longer than Med, and therefore no longer than Max. This is true because N6 is the median, and N4 comes before N6 in the ordered list. You still wouldn’t know N4’s actual value or whether N4 was closer to Min or to Med. (A common mistake is to claim that N4 must be closer to Med than it is to Min. This is not necessarily true, since the values of N2 through N5 can be anywhere in the interval between Min and Med; for example, they could all be very close to Min.)
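To make the point about N4 concrete, here is a small Python sketch with two made-up sets of 11 lengths. Both sets, and the helper names `summary` and `fourth_noodle`, are assumptions for illustration only. The two sets share the same Three-Number Summary, yet N4 sits near Min in one and near Med in the other.

```python
import statistics


def summary(lengths):
    """Return (Min, Med, Max) for a non-empty collection of lengths."""
    return min(lengths), statistics.median(lengths), max(lengths)


def fourth_noodle(lengths):
    """Return N4, the fourth value in the ordered list."""
    return sorted(lengths)[3]


# Two hypothetical sets of 11 lengths in millimeters.
set_a = [17, 18, 19, 20, 21, 58, 70, 80, 90, 100, 110]   # N2..N5 close to Min
set_b = [17, 50, 54, 56, 57, 58, 70, 80, 90, 100, 110]   # N2..N5 close to Med

print(summary(set_a), fourth_noodle(set_a))  # prints: (17, 58, 110) 20
print(summary(set_b), fourth_noodle(set_b))  # prints: (17, 58, 110) 56
```

Since both sets produce the summary (17, 58, 110), the summary alone cannot tell you whether N4 is 20 or 56, or anything else between Min and Med.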