In This Part: From Ordered Lists and Line Plots
A common way to summarize data is to use numerical summaries, many of which are based on the ordered data. For example, the smallest and largest data values (the minimum and maximum) are the first and last values in the ordered data. If we know the first and last values in an ordered list, we know that all the data values lie between these two numbers, inclusive.
Another numerical summary that is based on ordered data is the median, which is the middle value in an ordered list. Let’s find and interpret the median, using our raisin data. See Note 8, below.
We’ll begin by examining the ordered list of Brand X raisin counts:
Problem D1
Which position corresponds to the median (the position in the middle of the list)? How many raisins are there in the box at this position?
Tip: One way to find the median is to continue to remove the highest and lowest values in the data set until only the median remains.
The median is the value in the exact center of a data set — in other words, there are as many values above it as there are below it. In this case, the median is in the ninth position, since there are eight values below it and eight above. Note that in any data set with 17 values, the ninth value in the ordered list will always be the median.
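The position argument above can be checked with a short sketch. The function name `median_position` is an illustrative assumption, not part of the course materials, and the sketch covers only data sets with an odd number of values, such as the 17 raisin boxes.

```python
def median_position(n):
    """Return the 1-based position of the median in an ordered list of n values.

    Hypothetical helper for illustration; handles only odd n.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch handles only odd, positive counts")
    return (n + 1) // 2


n = 17                      # number of raisin boxes in the text
position = median_position(n)
below = position - 1        # values before the median
above = n - position        # values after the median
print(position, below, above)  # 9 8 8
```

With 17 values the middle position is 9, leaving eight values on each side, which matches the count given above.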
Problem D2
Suppose that the data were ordered from highest to lowest, instead of from lowest to highest. How would you find the median then?
We can use the median along with the minimum and maximum to describe variation in data. The median divides the raisin-count data into two groups: the data values below the median and the data values above the median. Note that each group has eight of the 17 data values, which is approximately half the data. Consequently, approximately half of the raisin counts are in the interval 25 to 28 (from the minimum to the median), and approximately half of the raisin counts are in the interval 28 to 31 (from the median to the maximum).
We can also determine the median by looking at a line plot. For the line plot of the raisin counts, you can identify the 17 positions in the ordered data as follows:
Alternatively, you could number the 17 positions in this way:
These two line plots are identical, since there is no need to distinguish the order of raisin boxes that have the same number of raisins.
Again we see that the ordered list contains 17 data values, one for each of 17 positions. Position 9 contains the middle value because eight positions precede Position 9, and eight positions follow Position 9. Position 9 corresponds to a box that contains 28 raisins, so 28 is the median. This can be determined from either of the line plots above.
Problem D3
Find the median of this data set: 72, 68, 63, 70, 84, 75, 72, 70, 82.
Problem D4
Find the median of the data set for this line plot:
In This Part: From Cumulative Frequency Tables
You can also use a cumulative frequency table to find the median of the raisin count:
The first position contains raisin count 25, the third position contains count 26, and so on.
Problem D5
According to the cumulative frequency table, what count is in the second position?
Problem D6
The sixth position corresponds to a box containing 27 raisins. How many raisins are in the boxes in the fourth and fifth positions?
How many raisins are in the third position? Why can’t this be the same as the number of raisins in the fourth position?
Problem D7
How would you use the cumulative frequency table to find the median?
First find the position in which the median is located (see Problem D1). Then look for the corresponding number on the table.
Note 8
This session provides a quick look at the median, which will be explored in more detail in Session 4.
Three different representations are used in Part D. The median is first examined in the ordered list. It is then determined by looking at the ordering in the line plot:
Finally, cumulative frequencies are used to determine the median.
If you are working with actual raisins, find the median for your data using each of the three representations: ordered list, line plot, and cumulative frequencies.
Problem D1
The median is in the ninth position, with 28 raisins.
Problem D2
You would find the median in the same way, by finding the value that has an equal number of values above and below it. You could still do this by removing the highest and lowest values in the data set until only the median remains.
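The removal method described here can be sketched in code. This is a minimal illustration, assuming an odd number of values; the function name `median_by_trimming` is hypothetical, and the sample values are the data set from Problem D3.

```python
def median_by_trimming(values):
    """Remove the highest and lowest values repeatedly until one value remains.

    Hypothetical helper for illustration; works only for an odd number of values.
    """
    remaining = list(values)            # copy, so the caller's list is unchanged
    while len(remaining) > 2:
        remaining.remove(max(remaining))  # drop one copy of the highest value
        remaining.remove(min(remaining))  # drop one copy of the lowest value
    if len(remaining) != 1:
        raise ValueError("this sketch needs an odd number of values")
    return remaining[0]


data = [72, 68, 63, 70, 84, 75, 72, 70, 82]   # data set from Problem D3
lowest_first = sorted(data)
highest_first = sorted(data, reverse=True)

print(median_by_trimming(lowest_first))   # 72
print(median_by_trimming(highest_first))  # 72
```

Both orderings give the same result, because the method depends only on which values are highest and lowest, not on the direction in which the list is written.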
Problem D3
The ordered list is 63, 68, 70, 70, 72, 72, 75, 82, 84. The value in the center of this list is 72, which makes it the median.
Problem D4
The median is the ninth number in the ordered list, which, in this case, is 6.
Problem D5
The count in the second position is 26, since the position (2) is higher than the cumulative frequency for 25 (which is 1), but not higher than the cumulative frequency for 26 (which is 3).
Problem D6
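The boxes in the fourth and fifth positions also have 27 raisins each. The box in the third position has 26 raisins, because the cumulative frequency for 26 is 3. The fourth position is beyond that cumulative frequency, so it must hold the next count, 27.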
Problem D7
We know that the median is in the ninth position (of 17 total boxes). This position is higher than the cumulative frequency for 27 (which is 6) but not higher than the cumulative frequency for 28 (which is 11); therefore, the median is 28.
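The lookup used in Problems D5 through D7 can be sketched as follows. The table holds only the cumulative frequencies quoted in the text (for counts 25 through 28); the rows for higher counts are not given in the text, so they are left out rather than guessed. The function name `value_at_position` is an illustrative assumption.

```python
# (raisin count, cumulative frequency) pairs quoted in the text.
# Rows for counts above 28 are not stated in the text and are omitted here.
cumulative = [(25, 1), (26, 3), (27, 6), (28, 11)]


def value_at_position(table, position):
    """Return the first count whose cumulative frequency reaches the position."""
    for value, cum_freq in table:
        if position <= cum_freq:
            return value
    raise ValueError("position is beyond the rows in this table")


print(value_at_position(cumulative, 2))  # 26 (Problem D5)
print(value_at_position(cumulative, 4))  # 27 (Problem D6)
print(value_at_position(cumulative, 9))  # 28 (the median, Problem D7)
```

The search stops at the first row whose cumulative frequency is at least the position, which is the same comparison made in the written solutions.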