Learning Math: Data Analysis, Statistics, and Probability
Data Organization and Representation Part D: The Median (25 minutes)
In This Part: From Ordered Lists and Line Plots
A common way to summarize data is to use numerical summaries, many of which are based on the ordered data. For example, the smallest and largest data values (the minimum and maximum) are the first and last values in the ordered data. If we know the first and last values in an ordered list, we know that all the data values lie between these two numbers.
Another numerical summary that is based on ordered data is the median, which is the middle value in an ordered list. Let’s find and interpret the median, using our raisin data. See Note 8, below.
We’ll begin by examining the ordered list of Brand X raisin counts:
Which position corresponds to the median (the position in the middle of the list)? How many raisins are there in the box at this position?
Tip: One way to find the median is to continue to remove the highest and lowest values in the data set until only the median remains.
The median is the value in the exact center of a data set — in other words, there are as many values above it as there are below it. In this case, the median is in the ninth position, since there are eight values below it and eight above. Note that in any data set with 17 values, the ninth value in the ordered list will always be the median.
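The position rule described above can be sketched in Python. This is only a minimal illustration, assuming an odd number of values as in the raisin example; the function name `median_position` is our own label and does not come from the course materials.

```python
def median_position(n):
    """Return the 1-based position of the median in an ordered list of n values.

    Assumes n is odd, so that a single middle position exists.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch only handles a positive, odd number of values")
    return (n + 1) // 2


n = 17                        # number of raisin boxes
position = median_position(n)
below = position - 1          # positions 1 through 8
above = n - position          # positions 10 through 17
print(position, below, above) # 9 8 8
```

The check that `below` equals `above` mirrors the definition in the text: the median has as many values below it as above it.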
Suppose that the data were ordered from highest to lowest, instead of from lowest to highest. How would you find the median then?
We can use the median along with the minimum and maximum to describe variation in data. The median divides the raisin-count data into two groups: the data values below the median and the data values above the median. Note that each group has eight data values, which is approximately half the data. Consequently, approximately half of the raisin counts are in the interval 25 to 28 (from the minimum to the median), and approximately half of the raisin counts are in the interval 28 to 31 (from the median to the maximum).
We can also determine the median by looking at a line plot. For the line plot of the raisin counts, you can identify the 17 positions in the ordered data as follows:
Alternatively, you could number the 17 positions in this way:
These two line plots are identical, since there is no need to distinguish the order of raisin boxes that have the same number of raisins.
Again we see that the ordered lists contain 17 data values, and there are 17 positions in the ordered list. Position 9 contains the middle value because eight positions precede Position 9, and eight positions follow Position 9. Position 9 corresponds to a box that contains 28 raisins, so 28 is the median. This can be determined from either of the line plots above.
Find the median of this data set: 72, 68, 63, 70, 84, 75, 72, 70, 82.
Find the median of the data set for this line plot:
In This Part: From Cumulative Frequency Tables
You can also use a cumulative frequency table to find the median of the raisin counts:
The first position contains raisin count 25, the third position contains count 26, and so on.
According to the cumulative frequency table, what count is in the second position?
The sixth position corresponds to a box containing 27 raisins. How many raisins are in the boxes in the fourth and fifth positions?
How many raisins are in the third position? Why can’t this be the same as the number of raisins in the fourth position?
How would you use the cumulative frequency table to find the median?
First find the position in which the median is located (see Problem D1). Then look for the corresponding number on the table.
This session provides a quick look at the median, which will be explored in more detail in Session 4.
Three different representations are used in Part D. The median is first examined in the ordered list. It is then determined by looking at the ordering in the line plot. Finally, cumulative frequencies are used to determine the median.
If you are working with actual raisins, find the median for your data using each of the three representations: ordered list, line plot, and cumulative frequencies.
The median is in the ninth position, with 28 raisins.
You would find the median in the same way, by finding the value that has an equal number of values above and below it. You could still do this by removing the highest and lowest values in the data set until only the median remains.
The ordered list is 63, 68, 70, 70, 72, 72, 75, 82, 84. The value in the center of this list is 72, which makes it the median.
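The trimming tip and the highest-to-lowest question can both be checked with a short Python sketch on this nine-value data set. The function name `trim_to_median` is hypothetical, and the sketch is only meant for an odd number of values; with an even number, two values would remain, a case this part does not cover.

```python
def trim_to_median(values):
    """Repeatedly drop the lowest and highest value until one or two remain."""
    remaining = sorted(values)
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # remove one lowest and one highest value
    return remaining


data = [72, 68, 63, 70, 84, 75, 72, 70, 82]
print(sorted(data))           # [63, 68, 70, 70, 72, 72, 75, 82, 84]
print(trim_to_median(data))   # [72]

# Ordering from highest to lowest leaves the middle position unchanged.
descending = sorted(data, reverse=True)
print(descending[len(descending) // 2])   # 72
```

Because the middle position is the same whether you count from the lowest or the highest value, both orderings give a median of 72.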
The median is the ninth number in the ordered list, which, in this case, is 6.
The second value is 26, since its position (2) is higher than the cumulative frequency of 25 (1), but not higher than the cumulative frequency of 26 (3).
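The boxes in the fourth and fifth positions also have 27 raisins, since positions 4 through 6 are higher than the cumulative frequency of 26 (3), but not higher than the cumulative frequency of 27 (6).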
We know that the median is in the ninth position (of 17 total boxes), which falls between the cumulative frequencies of 27 (6) and 28 (11); therefore, the median is 28.
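The lookup used in these solutions can be sketched in Python. The sketch uses only the cumulative frequencies quoted in the solutions (1, 3, 6, and 11 for counts 25 through 28); rows for counts above 28 are not stated in this text and are left out, since they are not needed to reach the ninth position. The name `value_at_position` is our own.

```python
# (raisin count, cumulative frequency) pairs quoted in the solutions.
# Rows for counts above 28 are not given in this text, so they are omitted.
cumulative = [(25, 1), (26, 3), (27, 6), (28, 11)]


def value_at_position(table, position):
    """Return the first value whose cumulative frequency reaches the position."""
    for value, cum_freq in table:
        if position <= cum_freq:
            return value
    raise ValueError("position is beyond the rows listed in the table")


median_position = (17 + 1) // 2                         # ninth of 17 boxes
print(value_at_position(cumulative, 2))                 # 26
print(value_at_position(cumulative, 4))                 # 27
print(value_at_position(cumulative, median_position))   # 28
```

Scanning the table from the top and stopping at the first cumulative frequency that is at least as large as the position is the same comparison made in words above.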