Learning Math: Data Analysis, Statistics, and Probability
Bivariate Data and Analysis Part B: Contingency Tables (20 minutes)
In Part A, you examined bivariate data — data on two variables — graphed on a scatter plot. Another useful representation of bivariate data is a contingency table, which indicates how many data points are in each quadrant.
Let’s take another look at the scatter plot from Part A, with the quadrants indicated:
• Quadrant I has points that correspond to people with above-average arm spans and above-average heights.
• Quadrant II has points that correspond to people with below-average arm spans and above-average heights.
• Quadrant III has points that correspond to people with below-average arm spans and below-average heights.
• Quadrant IV has points that correspond to people with above-average arm spans and below-average heights.
The following diagram summarizes this information:
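If you count the number of points in each quadrant on the scatter plot, you get the following summary, which is called a contingency table:
| | Arm Span Below Average | Arm Span Above Average | Total |
| Height Above Average | 2 | 11 | 13 |
| Height Below Average | 10 | 1 | 11 |
| Total | 12 | 12 | 24 |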
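The counting step can be sketched in Python. This is a minimal illustration, not the course's procedure: the six (arm span, height) pairs are made-up values, the function name `quadrant_counts` is hypothetical, and treating a value exactly equal to the mean as "above average" is an assumption the text does not address.

```python
from statistics import mean

# Hypothetical (arm span, height) pairs in cm; NOT the course's data set.
people = [(156, 162), (160, 155), (170, 173), (177, 170), (188, 185), (173, 160)]

def quadrant_counts(pairs):
    """Count the points in each quadrant formed by the two mean lines."""
    x_bar = mean(x for x, _ in pairs)  # mean arm span
    y_bar = mean(y for _, y in pairs)  # mean height
    counts = {"I": 0, "II": 0, "III": 0, "IV": 0}
    for x, y in pairs:
        # Assumption: a value exactly equal to the mean counts as "above".
        if x >= x_bar and y >= y_bar:
            counts["I"] += 1    # above-average arm span, above-average height
        elif x < x_bar and y >= y_bar:
            counts["II"] += 1   # below-average arm span, above-average height
        elif x < x_bar and y < y_bar:
            counts["III"] += 1  # below-average arm span, below-average height
        else:
            counts["IV"] += 1   # above-average arm span, below-average height
    return counts

print(quadrant_counts(people))
```

The four counts returned are exactly the four cells of a contingency table: Quadrants II and I form the above-average height row, and Quadrants III and IV form the below-average height row.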
Problem B1
Use the counts in this contingency table to answer the following:
a. Do most people with below-average arm spans also have below-average heights?
b. Do most people with above-average arm spans also have above-average heights?
c. What do these answers suggest?
The column proportions and percentages are also useful in summarizing these data:
Note that there are 12 people with below-average arm spans. Most of them (10/12, or 83.3%) are also below average in height. Also, there are 12 people with above-average arm spans. Most of them (11/12, or 91.7%) are also above average in height.
Note that the proportions and percentages are counted for the groups of arm spans only. The proportion 2/12 in the upper left corner of the table means that two out of 12 people with below-average arm spans also have above-average heights.
It is important to note that the proportions across each row may not add up to 1. When we look at column proportions, we divide the values in the contingency table by the total number of values in the column, rather than in the row. In this example, there are 13 values in the first row, but there are 12 values in the column; therefore, we’re looking at proportions of 12 rather than 13.
Percentages are equivalent to proportions but can be more descriptive for interpreting some results.
Since 91.7% of the people with above-average arm spans are also above average in height, and 83.3% of the people with below-average arm spans are also below average in height, this indicates a strong positive association between arm span and height. Note that in this study, we’re using the word “strong” in a subjective way; we have not defined a specific cut-off point for a “strong” versus a “not strong” association.
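The column calculations described above can be sketched in Python. The counts come from the text (12 people in each arm-span column, 13 and 11 in the two height rows); the dictionary layout and the names `counts` and `column_proportions` are illustrative assumptions, not part of the course materials.

```python
# Counts taken from the text: rows are height, columns are arm span.
counts = {
    "above_height": {"below_arm": 2, "above_arm": 11},
    "below_height": {"below_arm": 10, "above_arm": 1},
}

def column_proportions(table):
    """Divide each cell by the total of its column (not its row)."""
    cols = list(next(iter(table.values())))
    totals = {c: sum(row[c] for row in table.values()) for c in cols}
    return {
        r: {c: row[c] / totals[c] for c in cols}
        for r, row in table.items()
    }

col_props = column_proportions(counts)
for r, row in col_props.items():
    for c, p in row.items():
        print(f"{r:>12} / {c:<9}: {p:.1%}")

# Each column sums to 1, but a row need not:
print(sum(col_props["above_height"].values()))  # 13/12, not 1
```

The printed percentages are 16.7% and 91.7% for the above-average height row, and 83.3% and 8.3% for the below-average height row. The final line shows why row sums are not meaningful here: the denominators are column totals.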
Problem B2
Use the counts in the contingency table to answer the following:
a. Do most people with below-average heights also have below-average arm spans?
b. Do most people with above-average heights also have above-average arm spans?
Problem B3
Perform the calculations to find the row proportions and row percentages for these data, and complete the tables below. Note that there are 13 people whose heights are above average and 11 whose heights are below average; these unequal row totals affect the proportions and percentages you calculate. Do you find a strong positive association between height and arm span?
Tip: The proportions in the “Above Average” row will be out of 13. Once you find the proportions, use them to find the percentages.
Solution: Problem B1
a. Yes. Of the 12 people with below-average arm spans, 10 have below-average heights.
b. Yes. Of the 12 people with above-average arm spans, 11 have above-average heights.
c. These answers suggest a positive association between arm span and height.
Solution: Problem B2
a. Yes. Of the 11 people with below-average heights, 10 have below-average arm spans.
b. Yes. Of the 13 people with above-average heights, 11 have above-average arm spans.
Solution: Problem B3
Here are the completed tables:
Since 90.9% of the people with below-average heights also have below-average arm spans, and 84.6% of the people with above-average heights also have above-average arm spans, this again indicates a strong positive association between height and arm span.
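The row calculations for Problem B3 can be checked with a short Python sketch. The counts are the ones given in the text; the dictionary layout and the function name `row_proportions` are illustrative assumptions rather than anything defined by the course.

```python
# Counts taken from the text: rows are height, columns are arm span.
counts = {
    "above_height": {"below_arm": 2, "above_arm": 11},   # row total 13
    "below_height": {"below_arm": 10, "above_arm": 1},   # row total 11
}

def row_proportions(table):
    """Divide each cell by the total of its row."""
    return {
        r: {c: n / sum(row.values()) for c, n in row.items()}
        for r, row in table.items()
    }

row_props = row_proportions(counts)
for r, row in row_props.items():
    total = sum(counts[r].values())
    for c, p in row.items():
        print(f"{r:>12} / {c:<9}: {counts[r][c]}/{total} = {p:.1%}")
```

This prints 2/13 = 15.4% and 11/13 = 84.6% for the above-average height row, and 10/11 = 90.9% and 1/11 = 9.1% for the below-average height row, matching the percentages quoted in the solution.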