1. What Is Statistics?
Statistics is the art and science of gathering, organizing, analyzing and drawing conclusions from data. And without rudimentary knowledge of how it works, people can't make informed judgments and evaluations of a wide variety of things encountered in daily life.
2. Stemplots
As a first step in visualizing data, we use stemplots to understand measurements taken by the U.S. Army when they size up soldiers in order to design well-fitting gear and supplies for modern warfighters.
3. Histograms
Meteorologists use histograms to map when lightning strikes, and this visualization technique helps them understand the data in new ways.
4. Measures of Center
It's helpful to know the center of a distribution, as the clerical workers in Colorado Springs found out in the 1980s when they campaigned for comparable wages for comparable work. Mean and median are two different ways to describe the center.
5. Boxplots
Using the example of hot dog calorie counts, we use boxplots to visualize the five-number summary and make comparisons between different types of frankfurters.
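As a rough illustration of the five-number summary that a boxplot draws, here is a minimal Python sketch. The calorie values are hypothetical (they are not the unit's data), and the quartiles use the "median of each half" convention; other software may use slightly different quartile rules.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # observations below the median position
    upper = data[half + n % 2:]    # observations above the median position
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

# Hypothetical calorie counts for one type of frankfurter.
calories = [110, 130, 140, 150, 150, 160, 180, 190, 200]
summary = five_number_summary(calories)
print(summary)
```

Computing one summary per frankfurter type and lining them up side by side gives the same comparison the boxplots show graphically.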
6. Standard Deviation
How can we compare sales at two franchises in the Wahoo's restaurant chain? Standard deviation helps us quantify the variability in sales.
7. Normal Curves
A nature preserve that's tracked bird migrations through New England for decades records tons of bird-related data; everything from wingspan measurements to arrival dates provides examples of normal distributions.
8. Normal Calculations
Visit the Boston Beanstalks club for tall people. Height is normally distributed and we can use membership cutoffs and population data to calculate z-scores.
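A z-score calculation of this kind can be sketched in a few lines of Python. The mean, standard deviation, and cutoff below are hypothetical placeholders, not the club's actual figures.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd

# Hypothetical population values, in inches (assumptions for illustration only).
mean_height, sd_height = 64.0, 2.5
cutoff = 70.0

z = z_score(cutoff, mean_height, sd_height)
# Proportion of a normal population that lies above the cutoff.
share_taller = 1 - NormalDist().cdf(z)
print(round(z, 2), round(share_taller, 4))
```

The standard normal table lookup is replaced here by `NormalDist().cdf`, which is part of the Python standard library.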
9. Checking Assumption of Normality
Production at Pete and Gerry's Organic Eggs provides a number of distributions that look normal, but are they?
10. Scatterplots
Plotting annual numbers of Florida powerboat registrations and manatee killings suggests an uncomfortable relationship for the marine mammals.
11. Fitting Lines to Data
Winter snowpack in the Colorado Rockies can predict spring water supply. Plotting annual measurements in a scatterplot lets resource managers draw a regression line that helps them forecast water availability.
12. Correlation
Twin studies track how similar identical and fraternal twins are on various characteristics, even if they don't grow up together. Correlation lets researchers put a number on it.
13. Two-Way Tables
One city surveyed the happiness of its residents. Two-way tables help organize the data and tease out relationships between happiness levels and opinions about aspects of the city itself.
14. The Question of Causation
This historical story describes how researchers untangled the relationship between smoking and lung cancer.
15. Designing Experiments
We move beyond observational studies, like one of marine life in the remote Line Islands, to designing experiments that manipulate various subject groups, as in the case of a medical study about osteoarthritis treatments.
16. Census and Sampling
The U.S. counts every resident every ten years, or at least tries to. Statisticians use sampling from a population as an alternative to a complete count, as is done at a potato chip factory.
17. Samples and Surveys
A visit to the University of New Hampshire Survey Center illustrates how pollsters create accurate surveys. They can then use details from their sample to make inferences about a whole population.
18. Introduction to Probability
Probability is the mathematics of chance behavior, and it can help predict events such as the daily weather, or whether an asteroid will collide with Earth.
19. Probability Models
Casinos are as well versed in probability as statisticians, and probability models help them maintain the house advantage over gamblers.
20. Random Variables
The Challenger space shuttle disaster was blamed on faulty O-rings. How can probability calculations on random variables help predict the chances of this kind of failure?
21. Binomial Distributions
The inheritance of sickle cell disease provides an example of a binomial distribution in families where both parents are carriers of this genetic trait.
22. Sampling Distributions
Heights of third graders in one class. Quality scores for circuit boards at a factory. Taking multiple samples allows us to visualize the sampling distribution of the sample mean.
23. Control Charts
This quality control method helped Quest Diagnostics streamline and improve their system for processing and testing lab samples so they could meet their nightly deadlines.
24. Confidence Intervals
A battery manufacturer tests just a sample of its product to verify its claims about battery life. A margin of error and a confidence level help quantify its accuracy.
25. Tests of Significance
Is a newly discovered poem really written by William Shakespeare? Using statistical analysis of his known word use, researchers set up null and alternative hypotheses to investigate.
26. Small Sample Inference for One Mean
A brewer uses this technique to monitor quality differences in multiple batches of the same beer.
27. Comparing Two Means
Comparing the activity and calorie expenditure levels of Western office workers and African hunter gatherers adds some surprising new data to the science of obesity.
28. Inference for Proportions
Managers have no clue what conditions actually motivate their workers best, as shown by research conducted by Teresa Amabile, host of the original Against All Odds.
29. Inference for Two-Way Tables
Host Dr. Pardis Sabeti's own research examines possible genetic resistance to deadly Lassa fever in West Africa. Inference for two-way tables helps untangle potential relationships.
30. Inference for Regression
This historical story describes how statisticians built the case against DDT as the culprit behind plummeting peregrine falcon population numbers.
31. One-Way ANOVA
Does holding a heavier clipboard make you estimate that a jar of coins has more money in it than if you're holding a lighter clipboard? Psychologists use One-Way ANOVA to analyze the data from this experiment.
32. Summary
This review of the course through the preceding 31 video modules provides an overview of the practice of statistics and helps students appreciate how statistical methods can help them better understand their world.
33. What Is Statistics?
Using historical anecdotes and contemporary applications, this introduction to the series explores the vital links between statistics and our everyday world. The program also covers the evolution of the discipline.
34. Picturing Distributions
With this program, students will see how key characteristics of a histogram's distribution (shape, center, and spread) help professionals make decisions in such diverse fields as meteorology, television programming, health care, and air traffic control. Through a discussion of the advantages of back-to-back stem plots, this program also emphasizes the importance of seeking explanations for gaps and outliers in small data sets.
35. Describing Distributions
This program examines the difference between mean and median, explains the use of quartiles to describe a distribution, and looks to the use of boxplots and the five-number summary for comparing and describing data. An illustrative example shows how a city government used statistical methods to correct inequity between men's and women's salaries.
36. Normal Distributions
Students will advance from histograms through smooth curves to normal curves, and finally to a single normal curve for standardized measurement, as this program shows ways to describe the shape of a distribution using progressively simpler methods. In a lesson on creating a density curve, students also learn why, under steadily decreasing deviation, today's baseball players are less likely to achieve a .400 batting average.
37. Normal Calculations
With this program, students will discover how to convert values to the standard normal scale using the standard deviation; how to use a table of areas to compute relative frequencies; how to find any percentile; and how a computer creates a normal quantile plot to determine whether a distribution is normal. Vehicle emissions standards and medical studies of cholesterol provide real-life examples.
38. Time Series
Statistics can reveal patterns over time. Using the concept of seasonal variation, this program shows ways to present smooth data and recognize whether a particular pattern is meaningful. Stock market trends and sleep cycles are used to explore the topics of deriving a time series and using the 68-95-99.7 rule to determine the control limits.
39. Models for Growth
Topics of this program include linear growth, least squares, exponential growth, and straightening an exponential growth curve by taking logarithms. A study of growth problems in children serves to illustrate the use of the logarithm function to transform an exponential pattern into a line. The program also discusses growth in world oil production over time.
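The idea of straightening an exponential pattern can be sketched as follows. The series below is a hypothetical one that grows by 50% per period; it is an assumption for illustration, not data from the program.

```python
import math

# Hypothetical exponential series: y = 100 * 1.5**t (illustrative values only).
periods = [0, 1, 2, 3, 4]
values = [100 * 1.5 ** t for t in periods]

# Taking logarithms turns constant percentage growth into constant additive growth.
logs = [math.log(v) for v in values]
steps = [later - earlier for earlier, later in zip(logs, logs[1:])]

# Every step equals log(1.5), so log(y) plotted against t is a straight line.
print([round(s, 6) for s in steps])
```

Because the steps between successive logarithms are equal, a least-squares line fitted to `logs` recovers the growth rate as its slope.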
40. Describing Relationships
Segments describe how to use a scatterplot to display relationships between variables. Patterns in variables (positive, negative, and linear association) and the importance of outliers are discussed. The program also calculates the least squares regression line of metabolic rate y on lean body mass x for a group of subjects and examines the fit of the regression line by plotting residuals.
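A least-squares line and its residuals can be computed directly from the textbook formulas, as in the sketch below. The lean body mass and metabolic rate values are hypothetical; they are not the subjects' data from the program.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: lean body mass (kg) and metabolic rate (kcal/day).
lean_mass = [40, 45, 50, 55, 60]
metabolic_rate = [1100, 1250, 1300, 1450, 1500]

slope, intercept = least_squares(lean_mass, metabolic_rate)
# Residual = observed y minus the y predicted by the line.
residuals = [y - (intercept + slope * x)
             for x, y in zip(lean_mass, metabolic_rate)]
print(slope, intercept, residuals)
```

Plotting `residuals` against `lean_mass` is the fit check the program describes: a patternless scatter around zero suggests the line is an adequate summary.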
41. Correlation
With this program, students will learn to derive and interpret the correlation coefficient using the relationship between a baseball player's salary and his home run statistics. Then they will discover how to use the square of the correlation coefficient to measure the strength of a linear relationship between two variables, while the sign of the coefficient gives its direction. A study comparing identical twins raised together and apart illustrates the concept of correlation.
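The correlation coefficient r and its square can be sketched in plain Python. The paired values below are hypothetical stand-ins, not the salary and home run figures used in the program.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_flipped = correlation(xs, [-y for y in ys])  # same data with the direction reversed
print(round(r, 4), round(r ** 2, 4), round(r_flipped, 4))
```

Reversing the direction of the relationship changes the sign of r but leaves r squared unchanged, so the square describes strength only.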
42. Multidimensional Data Analysis
This program reviews the presentation of data analysis through an examination of computer graphics for statistical analysis at Bell Communications Research. Students will see how the computer can graph multivariate data and the various ways it can present them. The program concludes with an example of a study that analyzes data on many variables to get a picture of environmental stresses in the Chesapeake Bay.
43. The Question of Causation
Causation is only one of many possible explanations for an observed association. This program defines the concepts of common response and confounding, explains the use of two-way tables of percents to calculate marginal distribution, uses a segmented bar to show how to visually compare sets of conditional distributions, and presents a case of Simpson's Paradox. The relationship between smoking and lung cancer provides a clear example.
44. Experimental Design
Statistics can be used to evaluate anecdotal evidence. This program distinguishes between observational studies and experiments and reviews basic principles of design including comparison, randomization, and replication. Case material from the Physicians' Health Study on heart disease demonstrates the advantages of a double-blind experiment.
45. Blocking and Sampling
Students learn to draw sound conclusions about a population from a tiny sample. This program focuses on random sampling and the census as two ways to obtain reliable information about a population. It covers single- and multi-factor experiments and the kinds of questions each can answer, and explores randomized block design through agriculturalists' efforts to find a better strawberry.
46. Samples and Surveys
This program shows how to improve the accuracy of a survey by using stratified random sampling and how to avoid sampling errors such as bias. While surveys are becoming increasingly important tools in shaping public policy, a 1936 Gallup poll provides a striking illustration of the perils of undercoverage.
47. What Is Probability?
Students will learn the distinction between deterministic phenomena and random sampling. This program introduces the concepts of sample space, events, and outcomes, and demonstrates how to use them to create a probability model. A discussion of statistician Persi Diaconis's work with probability theory covers many of the central ideas about randomness and probability.
48. Random Variables
This program demonstrates how to determine the probability of any number of independent events, incorporating many of the same concepts used in previous programs. An interview with a statistician who helped to investigate the space shuttle accident shows how probability can be used to estimate the reliability of equipment.
49. Binomial Distributions
This program discusses binomial distribution and the criteria for it, and describes a simple way to calculate its mean and standard deviation. An additional feature describes the quincunx, a randomizing device at the Boston Museum of Science, and explains how it represents the binomial distribution.
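The simple formulas for the binomial mean and standard deviation are np and the square root of np(1 - p). The sketch below checks the mean against the full distribution; the values n = 12 and p = 0.5 are arbitrary choices for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial(n, p) count."""
    return n * p, math.sqrt(n * p * (1 - p))

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 12, 0.5  # illustrative values, e.g. 12 rows of pegs with a 50/50 bounce
mean, sd = binomial_mean_sd(n, p)

# Cross-check: the mean computed from the full distribution matches n * p.
mean_from_pmf = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(mean, round(sd, 4), mean_from_pmf)
```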
50. The Sample Mean and Control Charts
The successes of casino owners and the manufacturing industry are used to demonstrate the use of the central limit theorem. One example shows how control charts allow us to effectively monitor random variation in business and industry. Students will learn how to create x-bar charts and the definitions of control limits and out-of-control limits.
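Control limits for an x-bar chart are commonly set three standard deviations of the sample mean on either side of the process mean. The sketch below assumes that convention and uses a hypothetical process mean, standard deviation, and set of sample means.

```python
import math

def xbar_control_limits(mu, sigma, n):
    """Lower and upper 3-sigma control limits for means of samples of size n."""
    margin = 3 * sigma / math.sqrt(n)
    return mu - margin, mu + margin

# Hypothetical process: target mean 100, standard deviation 4, samples of size 4.
lower, upper = xbar_control_limits(mu=100.0, sigma=4.0, n=4)

# Hypothetical sample means collected over time.
sample_means = [101.2, 98.7, 106.5, 99.9]
out_of_control = [m for m in sample_means if not lower <= m <= upper]
print(lower, upper, out_of_control)
```

A point outside the limits is a signal worth investigating, since by the 68-95-99.7 rule about 99.7% of sample means from a stable process fall inside them.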
51. Confidence Intervals
This program lays out the parts of the confidence interval and gives an example of how it is used to measure the accuracy of long-term mean blood pressure. An example from politics and population surveys shows how margin of error and confidence levels are interpreted. The program also explains the use of a formula to convert the z* values into values on the sampling distribution curve. Finally, the concepts are applied to an issue of animal ethics.
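A z confidence interval for a mean has the form estimate plus or minus z* times the standard deviation of the sample mean. The sketch below assumes a known population standard deviation; the blood pressure numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    # z* leaves (1 - level) / 2 of the area in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical readings: mean 120 over 25 measurements, sigma assumed to be 10.
low, high = z_confidence_interval(x_bar=120.0, sigma=10.0, n=25)
print(round(low, 2), round(high, 2))
```

Raising the confidence level widens the interval, and taking more measurements narrows it, which is the trade-off between confidence and margin of error.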
52. Significance Tests
This program explains the basic reasoning behind tests of significance and the concept of null hypothesis. The program shows how a z-test is carried out when the hypothesis concerns the mean of a normal population with known standard deviation. These ideas are explored by determining whether a poem "fits Shakespeare as well as Shakespeare fits Shakespeare." Court battles over discrimination in hiring provide additional illustration.
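The z-test for a mean with known standard deviation can be sketched as below. The sample mean, hypothesized mean, standard deviation, and sample size are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n):
    """Return the z statistic and two-sided P-value for H0: mean = mu0."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical example: sample mean 52 from n = 25, testing mu0 = 50, sigma = 5.
z, p_value = z_test(x_bar=52.0, mu0=50.0, sigma=5.0, n=25)
print(round(z, 2), round(p_value, 4))
```

A small P-value means a sample mean this far from the hypothesized value would rarely occur if the null hypothesis were true.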
53. Inference for One Mean
In this program, students discover an improved technique for statistical problems that involve a population mean: the t statistic for use when σ is not known. Emphasis is on paired samples and the t test and confidence interval. The program covers the precautions associated with these robust t procedures, along with their distribution characteristics and broad applications.
54. Comparing Two Means
How to recognize a two-sample problem and how to distinguish such problems from one- and paired-sample situations are the subject of this program. A confidence interval is given for the difference between two means, using the two-sample t statistic with conservative degrees of freedom.
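The two-sample t procedure with conservative degrees of freedom can be sketched as follows. The two samples are hypothetical, and the critical value t* = 2.776 (95% confidence, 4 degrees of freedom) is taken from a standard t table rather than computed.

```python
import math
import statistics

def two_sample_t(a, b):
    """Return (difference in means, standard error, conservative df)."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    df = min(len(a), len(b)) - 1  # conservative choice: smaller sample size minus 1
    return diff, se, df

# Hypothetical measurements from two independent groups.
group_a = [10.0, 12.0, 14.0, 16.0, 18.0]
group_b = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0]

diff, se, df = two_sample_t(group_a, group_b)
t_statistic = diff / se

T_STAR_95_DF4 = 2.776  # table value for 95% confidence with df = 4
interval = (diff - T_STAR_95_DF4 * se, diff + T_STAR_95_DF4 * se)
print(round(t_statistic, 3), df, tuple(round(v, 3) for v in interval))
```

Using the smaller sample size minus one for the degrees of freedom gives an interval that is, if anything, wider than necessary, which is why it is called conservative.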
55. Inference for Proportions
This program marks a transition in the series: from a focus on inference about the mean of a population to exploring inferences about a different kind of parameter, the proportion or percent of a population that has a certain characteristic. Students will observe the use of confidence intervals and tests for comparing proportions applied in government estimates of unemployment rates.
56. Inference for Two-Way Tables
A two-way table of counts displays the relationship between two ways of classifying people or things. This program concerns inference about two-way tables, covering use of the chi-square test and null hypothesis in determining the relationship between two ways of classifying a case. The methods are used to investigate a possible relationship between a worker's gender and the type of job he or she holds.
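The chi-square statistic compares observed counts with the counts expected if the two classifications were unrelated. The table below is hypothetical; the sketch computes the statistic and degrees of freedom only, leaving the P-value to a chi-square table or software.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no relationship.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are two groups of workers, columns are two job types.
observed_counts = [[30, 20],
                   [20, 30]]
chi2, df = chi_square_statistic(observed_counts)
print(chi2, df)
```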
57. Inference for Relationships
With this program, students will understand inference for simple linear regression, emphasizing slope and prediction. This unit presents the two most important kinds of inference: inference about the slope of the population line and prediction of the response for a given x. Although the formulas are more complicated, the ideas are similar to t procedures for the mean μ of a population.
58. Case Study
This program presents a detailed case study of statistics at work. Operating in a real-world setting, the program traces the practice of statistics: planning the data collection, collecting and picturing the data, drawing inferences from the data, and deciding how confident we can be about our conclusions. Students will begin to see the full range and power of the concepts and techniques they have learned.