Learning Math: Data Analysis, Statistics, and Probability
Probability Part D: Are You a Random Player? (20 minutes)
In This Part: Developing the Mathematical Probability
Let’s return to the statistics question we considered in Part A: Can a person develop skill at playing a game of chance like Push Penny?
Let’s say that a very determined person played 100 rounds of Push Penny with the following results (the number of hits out of four pushes per round):
Note: This is an ordered list (i.e., the first round was three hits out of four pushes, the second was two out of four, and so on).
Do these results suggest that our player has developed skill in playing the game? How might we analyze these data to answer this question? One approach would be to compare these 100 scores to the scores for a player who has no skill — in other words, a player who is just making “random” pushes.
In order to do this, you must have a description of a random player — specifically, you need to know the probability that a random push will hit a line.
What do you think the probability is that a random push will hit a line? Remember that each line is exactly two coin diameters apart. It may be helpful to experiment with a quarter on your Push Penny board and to examine this illustration. What percentage of the total area of the board is shaded?
In This Part: Experimental Probability
As seen in Problem D1, a random player has a 50% chance of hitting a line with a push. Consequently, this problem can use the binomial probability model; making four random pushes is equivalent to tossing a fair coin four times. See Note 10 below.
Here is the tree diagram for four pushes (H represents a hit, and M represents a miss):
Here is the probability table for four pushes:
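The tree diagram and probability table described here can be regenerated with a short script. The following is a minimal sketch, assuming (as stated above) that each push is an independent 50% hit; the function name `four_push_table` is illustrative and not part of the course materials.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def four_push_table(pushes=4):
    """Return {number of hits: probability} for a random player.

    Every branch of the tree (H = hit, M = miss) is equally likely,
    so each probability is (number of branches with that many hits)
    divided by (total number of branches).
    """
    outcomes = list(product("HM", repeat=pushes))  # the 16 leaves of the tree
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)
    return {hits: Fraction(counts[hits], total) for hits in range(pushes + 1)}


if __name__ == "__main__":
    # Print the fraction, decimal, and percentage for each score.
    for hits, prob in four_push_table().items():
        print(f"{hits} hits: {prob}  {float(prob):.4f}  {float(prob):.2%}")
```

Enumerating the leaves rather than calling a binomial formula keeps the sketch close to the tree diagram: the probability of each score is simply its share of the 16 equally likely paths.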
Note that we’ve added columns to indicate decimal values and percentages for the mathematical probabilities, which are our expectations for the experimental probability. How do the results from our player compare with the mathematical probabilities for our random player? Once again, here are the experimental results from 100 rounds of Push Penny:
To compare these results with the random player’s, you’ll need to count the frequency of scores 0, 1, 2, 3, and 4 and summarize the counts in a table. To make it easier to compare these frequencies with the random player’s probabilities, let’s calculate them as decimal proportions:
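The counting step can be sketched in a few lines of code. This is only an illustration: the function name `score_proportions` is made up, and the short `sample_scores` list is a hypothetical stand-in, not the player's actual 100 results.

```python
from collections import Counter


def score_proportions(scores, pushes=4):
    """Return {score: proportion of rounds with that score}."""
    counts = Counter(scores)  # frequency of each score
    rounds = len(scores)
    # Include scores that never occurred so the table always has 0..pushes.
    return {hits: counts[hits] / rounds for hits in range(pushes + 1)}


if __name__ == "__main__":
    # Hypothetical ordered list of round scores, for illustration only.
    sample_scores = [3, 2, 1, 2, 4, 0, 2, 3]
    for hits, proportion in score_proportions(sample_scores).items():
        print(f"{hits} hits: {proportion:.3f}")
```

Replacing `sample_scores` with the full list of 100 round scores would give the experimental proportions to set beside the random player's probabilities.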
Too bad for our competitor — there are only very slight differences between the proportions in the experimental data and the probabilities for a random player. Therefore, we do not have strong evidence that our competitor has developed any skill in playing the game. Indeed, our competitor’s skills do not appear to be demonstrably greater (or weaker) than the random player’s. (We’ll break the news gently.)
Here is the summary of scores of 100 rounds of another player’s attempt to master Push Penny. Do these scores suggest that this player has developed some serious Push Penny skill?
In Part D, we return to the statistics question in Part A, based on the game of Push Penny: “After several practices of Push Penny, have you developed skill in playing the game?”
One approach is to compare the data (the 100 scores of a player) to the expected scores of a “random” player (one with no particular skill, who is making “random” pushes). This strategy requires a model for a “random” player, which must be based on probabilities, because there is randomness in the outcomes of the games.
A game consists of four pushes. First, you consider the probability that a single random push will hit a line. Experiment with a quarter on the Push Penny board to investigate this. The key is to discover that the lines are uniformly spaced (the distance between lines is equal to two times the diameter of a quarter). By moving a quarter perpendicularly to the lines, you’ll discover that the coin is touching a line half of the time and not touching a line half of the time.
The leap from playing the game to describing the outcomes with a binomial model can be challenging. To further illustrate how the coin-tossing model describes a random player, the tree diagram is revisited. When the tree diagram for possible scores of this game is discovered to be the same as the tree diagram for the possible number of heads on four coin tosses, the equivalence of the models may then be clearer.
This analysis is based on investigating how the experimental results compare with the mathematical probabilities. As you’ll see, there are very slight differences between the player’s scores and the probabilities for a random player. Therefore, there does not appear to be strong evidence that our competitor is any better (or worse) than a random player.
This type of analysis is referred to as “goodness of fit” because we are asking how well the model fits the data. A more advanced analysis would require a Chi-Square test, which considers whether the observed differences between the experimental proportions and the theoretical probabilities can be explained by the random variation alone, or if the differences are due to other factors (skill, for instance).
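For readers curious about the more advanced analysis, here is a rough sketch of how a chi-square goodness-of-fit statistic could be computed by hand. It is not part of the course's method. The function name and the example counts are hypothetical, and the critical value 9.488 is the standard table value for a 5% significance level with 4 degrees of freedom (five score categories minus one).

```python
# Random-player probabilities for scores 0..4 (binomial, n = 4, p = 1/2).
RANDOM_PLAYER = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]

# Standard table value: chi-square, 4 degrees of freedom, 5% level.
CRITICAL_5_PERCENT_DF4 = 9.488


def chi_square_statistic(observed_counts, probabilities=RANDOM_PLAYER):
    """Sum of (observed - expected)^2 / expected over all score categories."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, p in zip(observed_counts, probabilities):
        expected = total * p  # count a random player would average
        statistic += (observed - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Hypothetical counts of scores 0, 1, 2, 3, 4 over 100 rounds.
    example_counts = [6, 25, 38, 25, 6]
    stat = chi_square_statistic(example_counts)
    verdict = "exceeds" if stat > CRITICAL_5_PERCENT_DF4 else "does not exceed"
    print(f"statistic = {stat:.4f}, which {verdict} {CRITICAL_5_PERCENT_DF4}")
```

A statistic below the critical value means the differences are of a size that random variation alone could plausibly produce; a larger value suggests something else, such as skill, may be involved.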
The probability is one-half. If you look at the picture of the Push Penny board, you’ll notice that the shaded strips are one diameter wide, and the unshaded strips are also one diameter wide:
This means that, if the game is random, it is just as likely for the coin to land on a shaded strip as on an unshaded strip. This makes the probability of hitting a line (and landing on a shaded strip) equal to one-half, or 50%.
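The same answer can be checked with a quick simulation. The sketch below assumes a simplified one-dimensional model of the board: the coin's center lands uniformly at random between two lines, and the coin touches a line when its center is within half a diameter of one. The function names are illustrative.

```python
import random


def push_hits_line(rng, diameter=1.0):
    """Simulate one random push; return True if the coin touches a line."""
    spacing = 2 * diameter  # lines are two coin diameters apart
    # Distance of the coin's center from the line behind it.
    center = rng.uniform(0, spacing)
    distance_to_nearest_line = min(center, spacing - center)
    return distance_to_nearest_line <= diameter / 2


def estimate_hit_probability(pushes=100_000, seed=0):
    """Estimate the hit probability as the fraction of simulated hits."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(push_hits_line(rng) for _ in range(pushes))
    return hits / pushes


if __name__ == "__main__":
    print(f"Estimated probability of a hit: {estimate_hit_probability():.3f}")
```

With many simulated pushes the estimate settles near 0.5, matching the area argument: the coin's center must fall in a band one diameter wide out of every two diameters of board.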
Let’s use a probability table to compare the experimental probability for this player to the probabilities for a random player:
This player seems to have improved. In particular, this player’s experimental probability of getting four hits in four tries is more than three times larger than the expected probability for a random player. This suggests that this player has developed skill in playing Push Penny.
Answers will vary. Good luck!