In data analysis, we use graphs, tables, and numerical summaries to study the variation present in our data. Often, we want to extend our interpretation to a larger group beyond the particular group studied. Such generalizations are only valid, however, if the data we examine are representative of that larger group. If not, our interpretation may misrepresent the larger group!
The entire group that we want information about is called the population. We can gain information about this group by examining a portion of the population, called a sample.
To gain useful information, the sample must be representative of the population. A representative sample is one in which the relevant characteristics of the sample members are generally the same as the characteristics of the population.
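To make the idea of a representative sample concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the population of 100,000 voters, the 60% who favor a proposal, the sample size of 1,000, and the helper name `share` are all hypothetical. It compares a sample in which every voter has the same chance of being chosen with a "convenience" sample that simply takes the first records on the list.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: 100,000 voters, 60% of whom favor a proposal
# (1 = favors, 0 = does not). The list is sorted, so supporters come first.
population = [1] * 60_000 + [0] * 40_000

# Simple random sample: every voter has the same chance of being chosen.
random_sample = random.sample(population, 1_000)

# Convenience sample: the first 1,000 records on the list.
# Because the list is sorted, these are all supporters.
convenience_sample = population[:1_000]


def share(values):
    """Return the fraction of 1s in a list of 0/1 values."""
    return sum(values) / len(values)


print(f"population share:         {share(population):.3f}")
print(f"random sample share:      {share(random_sample):.3f}")
print(f"convenience sample share: {share(convenience_sample):.3f}")
```

The random sample's share should land close to the population's 60%, while the convenience sample reports 100% support because of how it was chosen. The second sample is just as large as the first, yet it is not representative.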
There are several good reasons that we use samples to study populations; chief among them are feasibility and cost. For instance, in a nationwide political survey of the population of all voters in the United States, it would be difficult, if not impossible, to poll every voter. It would also be quite expensive. Statistical theory shows that a survey of about 1,000 carefully selected voters is enough to represent the opinions of the millions of people in the population of voters, typically to within a few percentage points.
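One way to see why roughly 1,000 voters can be enough is the standard approximation for the margin of error of a sample proportion. The sketch below is an illustration under stated assumptions, not a description of how any particular poll is run: it assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion p = 0.5. The function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample of size n drawn from a much larger
    population. p = 0.5 is the worst case; z = 1.96 corresponds to a
    95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 1_000, 10_000):
    points = 100 * margin_of_error(n)
    print(f"n = {n:>6}: about ±{points:.1f} percentage points")
```

Under these assumptions, a sample of 1,000 gives a margin of error of roughly 3 percentage points. The formula does not involve the population size at all, which is why the same sample size works for a city or a whole country. Increasing the sample tenfold, to 10,000, only shrinks the margin to about 1 point, so the extra cost buys relatively little precision.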
Another reason to rely on samples arises when we want to inspect or test products, because some tests destroy the item being tested. For example, testing an air bag to see if it works properly means that we have to destroy it. We certainly can't test every air bag, but testing a carefully selected sample of air bags can tell us what we need to know about all the air bags in the population.