**In This Part: Asking Questions and Collecting Data**

Let’s start our exploration of statistics by focusing on the first two steps of the process: “Ask a question” and “Collect the appropriate data.” The other steps will be explored in later sessions. We’ll start with a simple statistical question.

**Problem B1**

Let’s say you’d like to find out the length of the room you’re in.

**1. Ask a Question**

How long is the room?

**2. Collect Data**

Measure the length of the room in inches, using two different measurement devices: (1) a one-foot ruler and (2) a yardstick.

Measure the room length five times with each device, and fill in the tables below. Record your measurements to the nearest inch.

- Are the five measurements you obtained with the ruler exactly the same? Can you explain why there may be differences?
- Are the five measurements you obtained with the yardstick exactly the same? Can you explain why there may be differences?
- Did you get similar answers using the different measuring tools? Why or why not? Did you get identical answers using the different measuring tools? Why or why not?
- Which measuring tool do you think gave you more accurate results, the ruler or the yardstick? Why?
- Do you think a tape measure would be more or less accurate than a ruler or a yardstick? Why? If you have a tape measure available, use it to measure the same room five times and see how the results compare with your previous measurements.

**Video Segment**

In this video segment, participants discuss the results of the Room-Measurement Activity. Professor Kader then introduces the concept of variation in data. Watch the segment after you have completed Problem B1 and compare the variation in your measurements with those of the onscreen participants.

Which method(s) used by the participants produced the most variation? Which method(s) produced the least variation? How do your results compare with those of the onscreen participants?

You can find this segment on the session video approximately 11 minutes and 46 seconds after the Annenberg Media logo.

Variation, or differences in measured data, occurs for a number of reasons. Examining variation is a crucial part of data analysis and interpretation. In fact, explaining the variation in your data is as important as measuring the data itself.
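
One simple way to put a number on the variation described above is the range of a set of repeated measurements: the largest value minus the smallest. The Python sketch below is only an illustration. The room lengths are made-up values, not data from this session, and the helper name `measurement_range` is hypothetical.

```python
def measurement_range(measurements):
    """Return the spread of a list of measurements (largest minus smallest)."""
    return max(measurements) - min(measurements)

# Hypothetical room lengths in inches (illustrative values only,
# not measurements from the session).
ruler = [242, 245, 240, 244, 243]
yardstick = [243, 244, 243, 242, 243]

print("Ruler range:", measurement_range(ruler))
print("Yardstick range:", measurement_range(yardstick))
```

A larger range means the repeated measurements disagree more with one another. The range is only one possible summary of variation, and it is sensitive to a single unusual measurement.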

**Problem B2**

Let’s study two more statistical questions. For example, suppose you were curious about the relative heights and arm spans of men and women.

**1. Ask a Question**

Are men typically taller than women?

Do men typically have longer arm spans than women?

**2. Collect Data**

Using a meter stick, measure the heights (without shoes) and arm spans (fingertip to fingertip) of three men and three women. Record your measurements to the nearest centimeter.

**a.** Did you get the same height for all six people? Did you get the same arm span for all six people? Why or why not?

**b.** If you measured all six heights and arm spans again, would the results be identical? Why or why not?

**Problem B3**

Let’s look at heights and arm spans again, this time measuring 24 people. Here are their data [heights (without shoes) and arm spans were measured to the nearest centimeter, using a meter stick]:

**a.** Examine the 24 measurements for height and arm span. You’ll notice that they are not all the same. What is the source of this variation? Can you explain why there are differences?

**b.** Suppose your goal was to prove that men are typically taller than women. Does this data prove that conclusion? Why or why not?

**Problem B4**

**1. Ask a Question**

How much does a penny weigh?

**2. Collect Data**

We used a metric scale to weigh 32 pennies to the nearest centigram (1/100 of a gram). Here are the resulting weights:

**a.** The 32 measurements are not all the same. What is the source of this variation?

**b.** What do you think would happen if you weighed the same penny 32 times? How would you expect that data to compare to the weights of the 32 different pennies?

**Problem B5**

Based on the data in Problem B4, how much would you expect the 33rd penny to weigh? Could you be sure of its weight before weighing it? If you can’t name an exact weight, could you be confident about a range of weights that it falls between? Why?

**In This Part: How Long Is a Minute?**

**Problem B6**

**1. Ask a Question**

How well can people judge the time it takes for a minute to pass?

**2. Collect Data**

Use the following Interactive Activity (Flash version discontinued) to collect data on this question yourself, and try this experiment with a few friends.

Use a stopwatch. Have two or more subjects engage in a conversation, and ask one of them to let you know when he or she thinks a minute has passed. Record how much time actually passed. Repeat this several times with the same or different subjects.

**Problem B7**

**1. Ask a Question**

How many raisins are in a half-ounce box of raisins?

**2. Collect Data**

We counted the number of raisins in 17 half-ounce boxes:

**a.** The 17 counts are not all the same. What do you think accounts for this variation?

**Problem B8**

**1. Ask a Question**

Should nuclear power be developed as an energy source?

**2. Collect Data**

Twenty-five people completed the following questionnaire:

The responses were as follows:

**a.** For Questions 1, 3, and 4, there are differences in the 25 responses to each question. What are the sources of this variation?

**b.** For Question 2, there was no variation. Why? Would you expect the same results from another sample of 25 people?

**c.** Take a closer look at this questionnaire. How are the questions posed, and how might that influence responses?

**In This Part: Variables**

To answer the previous questions, you collected and examined data. Data are defined in terms of variables, or characteristics that may be different from one observation to the next. When we measure these characteristics, we assign a value for each variable. This set of values for a given variable is known as data.

Let’s take a closer look at variables. In Problem B4, the variable is the weight of a penny, and the data are the measured weights of the 32 pennies.

**Problem B9**

Look back at Problems B3, B6, B7, and B8. What were the variables in each of these problems?

At least one of these questions has more than one variable. A variable is any characteristic that may change from one observation to the next.

Some questions, such as “What is your height?”, are answered with a number. Answers to questions like “What is your sex?” do not require a number.

We distinguish between variables that are measured in numbers and those that are not. This distinction becomes useful and important when we get to the analysis phase of statistical problem-solving. These two types of variables are called quantitative variables and qualitative variables.

**Quantitative Variables**

Quantitative variables represent numbers or quantities; in fact, they are sometimes referred to as numerical variables. A test score, the number of votes cast in an election, the measured amount of soda in a two-liter bottle — these are all examples of quantitative variables.

**Qualitative Variables**

Rather than numbers, qualitative variables represent categories, such as “excellent” or “female,” and they are sometimes referred to as categorical variables. The hometown of a college student, the favorite TV show of a politician, and the model of a car seen on a highway — these are all qualitative variables.

We refer to data in the same terms.

Data are called quantitative if they come from measurements of a quantitative variable.

Data are called qualitative if they come from measurements of a qualitative variable.
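
The distinction between the two kinds of variables can be sketched in Python. This is a rough illustration, not a definitive rule: the function name `variable_type` and the sample observations are hypothetical, and the check assumes that a variable is quantitative exactly when every recorded value is a number.

```python
from numbers import Number

def variable_type(values):
    """Label a variable 'quantitative' if every value is a number,
    otherwise 'qualitative' (a simplifying assumption)."""
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "qualitative"

# Hypothetical observations (illustrative values only).
heights_cm = [162, 175, 168]
sex = ["female", "male", "female"]

print(variable_type(heights_cm))
print(variable_type(sex))
```

The assumption breaks down when categories are coded as numbers. A ZIP code or a jersey number is stored as digits but names a category, so it is still a qualitative variable.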

Each activity presented here begins with a question and then considers appropriate data for answering that question. A set of measurements produced by a previous class is provided for each activity.

You may want to collect your own measurements for some of the activities, particularly if you are working with a group. This would provide group members with additional experience and understanding; however, it also requires more time.

**Problem B1**

Answers will vary. Here is one sample data set and the solutions to problems (a)-(e).

**a.** The five measurements obtained using the ruler are not all exactly the same. The most likely cause of these differences is the method of measurement, such as the way the ruler was laid out.

**Problem B2**

**a.** You should find that the heights and arm spans are different for the six people, since people are inherently different and come in all shapes and sizes.

**Problem B3**

**a.** The heights are not the same, nor are the arm spans. These measurements vary partly because of the differences between people. You may notice that we also have data on sex; these values vary depending on whether the person measured is male or female. The sex of an individual also has an effect on the variation in the list of heights and arm spans. Also, there is always a possibility of variation due to measurement errors. We cannot expect measurements of height or arm span with a meter stick to be exact every time. And still another type of measurement error may occur: A mistake might be made in recording the person’s sex.

**Problem B4**

**a.** Here are some possible sources of the variation in this data:

- Measurement errors may have occurred.
- Older pennies may be more worn than newer ones and therefore weigh less.
- Different ingredients may have been used for making pennies in different years.
- Pennies may have been made at different mints, using different equipment.
- A penny may have something attached to it, such as a piece of dirt or gum.

**b.** You should still expect to find some slight variation in the data as a result of measurement errors, but the values should be much closer than if you had measured 32 different pennies.

**Problem B5**

We might expect the 33rd penny to weigh roughly three grams. There is no way to be absolutely sure of its weight beforehand since there is variation in the data we were given for the first 32 pennies. Judging from the first 32 pennies, it is quite likely that the 33rd will be between 2.42 and 3.18 grams, and less likely that it will be between 3.00 and 3.10 grams.
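
The reasoning about wide and narrow ranges can be sketched by counting what share of a sample falls inside each interval. The weights below are made up for illustration and are not the 32 measured pennies. Only the interval endpoints (2.42, 3.18, 3.00, and 3.10 grams) are taken from the solution above, and the helper name `fraction_within` is hypothetical.

```python
def fraction_within(weights, low, high):
    """Return the share of weights that fall in the interval [low, high]."""
    inside = [w for w in weights if low <= w <= high]
    return len(inside) / len(weights)

# Hypothetical penny weights in grams (illustrative values only).
sample = [2.42, 2.50, 2.51, 3.05, 3.08, 3.11, 3.12, 3.18]

wide = fraction_within(sample, 2.42, 3.18)
narrow = fraction_within(sample, 3.00, 3.10)
print("Share in wide interval:", wide)
print("Share in narrow interval:", narrow)
```

In this made-up sample, every weight falls in the wide interval, while only a quarter fall in the narrow one. That mirrors the argument above: a wider interval captures more of the observed variation, so a new penny is more likely to land inside it.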

**Problem B6**

One possible source of the variation in this data is that some people are better at judging time than others; some may consistently overestimate or underestimate the minute. A second source, as always, is measurement error. Finally, people learn from experience, whether their own or someone else’s. After witnessing the measurement errors in the first estimate, they are likely to adjust their second estimates accordingly.

**Problem B7**

**a.** Here are some possible sources of the variation in this data:

- Measurement errors may have occurred.
- Raisins come in many sizes and boxes are filled by weight. It takes fewer large raisins and more smaller raisins to fill a half-ounce box.
- The machine that fills the boxes is not perfect; it may include too many or too few raisins.
- Each box probably doesn’t contain exactly one-half ounce of raisins when you account for clumping or air in the box.

**b.** Since all of the values were very close, and there are few enough possible values for the number of raisins, we should expect some of them to be exactly the same. This would also be true of a large class taking a test with 20 questions; some students in this class would get the same score, just by chance.
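
The claim that repeated values are to be expected can be checked with a short calculation. The sketch below makes a strong simplifying assumption that is not in the original: each box count is drawn independently, with every possible count equally likely. The function name `prob_all_distinct` and the choice of 25 possible counts are hypothetical.

```python
def prob_all_distinct(n_items, n_values):
    """Probability that n_items independent draws from n_values
    equally likely values are all different (assumed model)."""
    if n_items > n_values:
        # More items than possible values: a repeat is guaranteed.
        return 0.0
    prob = 1.0
    for i in range(n_items):
        prob *= (n_values - i) / n_values
    return prob

# 17 boxes, with a hypothetical 25 possible raisin counts.
p_repeat = 1 - prob_all_distinct(17, 25)
print("Chance of at least one repeated count:", round(p_repeat, 4))
```

Under this model, the fewer possible values there are relative to the number of boxes, the more certain a repeat becomes. Once there are more boxes than possible counts, a repeat is unavoidable.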

**Problem B8**

**a.** Here are some possible sources of the variation in this data:

- Differences of opinion
- Measurement errors
- Misread or misunderstood questions
- Untruthful responses

**b.** It may be that so many people would say “Yes” to this question that finding one of 25 people to say “No” is unlikely. Possibly a person may have misunderstood the question or may not have responded truthfully to what he or she perceived as a “loaded” question. Although the data suggest that another 25 people might respond in the same way, there is not enough data to prove this conclusively.

**c.** The wording on each of these questions makes one think about the negative aspects of nuclear power. A respondent might be influenced to answer “Yes” to the final question after reading the first three. This questionnaire may be considered to be biased against nuclear power.

**Problem B9**

Here are the variables for each problem:

- Problem B3: Sex, height, arm span
- Problem B6: Time in seconds (i.e., people’s estimates of 60 seconds)
- Problem B7: Number of raisins in a box
- Problem B8: Answers to Questions 1, 2, 3, and 4