Questions about Survey Sampling
Why a Sample Survey?
What is the purpose of a sample survey? We use surveys any time we want to gather
information about any large group, when it would be too costly or
cumbersome to interview every individual. Sampling means that we
choose part of a population to represent the whole. (A population
is the collection of units from which we want information. In the
case of the Exit Poll, our population is voters in Utah who actually
vote.) It is important to try to select a sample that will be representative
of the population with regard to the information we are trying to
find. In a sample survey, we randomly select individuals to question,
and use their answers to make inferences about how the population
at large thinks or feels.
The results
of surveys can be used for diverse purposes: they help us understand
the demographics of an area, they help determine government policy,
they determine which television or radio programs are broadcast
(and when they are broadcast), and they help advertisers decide
where and when to place advertisements. These are only a few of
the many uses for surveys.
Some people
may wonder why we bother to conduct an exit poll. After all, the
election winners will be known when the votes are counted. So why
bother to estimate who will win? Actually, our exit poll collects
more information than just the election winners. We ask questions
about the current political climate, environmental issues, government
policies, etc., that can be very useful to better understand the
political process and why people vote the way they do.
What is a Statistic?
A statistic is a numerical summary of the characteristic we are
measuring in the sample. In our exit poll, one of the statistics
we measure is the proportion of people in our sample who voted for
a certain candidate. This statistic estimates the actual proportion
of voters in the population who voted for a certain candidate. The
actual proportion is called the parameter.
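As a minimal sketch of the idea, the sample proportion can be computed as below. The function name and the eight answers are made up for illustration; they are not exit poll data.

```python
def sample_proportion(responses, candidate):
    """Share of sampled voters who report voting for `candidate`."""
    if not responses:
        raise ValueError("the sample is empty")
    return sum(1 for r in responses if r == candidate) / len(responses)

# Hypothetical answers from eight sampled voters (illustrative only).
sample = ["A", "B", "A", "A", "B", "A", "B", "A"]

# The statistic: the proportion of the SAMPLE that voted for candidate A.
# It estimates the parameter: the proportion of ALL voters who voted for A.
statistic = sample_proportion(sample, "A")
print(f"Sample proportion for A: {statistic:.3f}")
```

The parameter itself stays unknown until the votes are counted; the statistic is only our estimate of it.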
Why a Random Sample?
The best way
to choose a sample is by some process of random sampling. This means
that we allow impersonal chance to do the choosing. This method
prevents favoritism by the sampler and self-selection by the respondents.
In fact, random sampling gives each and every individual in the
population a known chance to be a part of the sample. There are
many ways to do this. We used computer software to randomly select
the sampling units in the exit poll.
What kinds of surveys are there?
There are many
different kinds of sampling methods. Methods of sampling can be
classified into two groups, according to how the sample is selected:
probability sampling or non-probability sampling.
In probability
sampling, researchers control the chance (or probability) of an individual
in the population being selected for the sample. In the simplest
kind of probability sampling, a Simple Random Sample (SRS), every
set of individuals of the same size has the same chance of being
selected. For the exit poll, we're interested in asking questions
of individual voters. For an SRS, we would randomly select voters
from a list of all registered voters in Utah.
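A simple random sample from a list might be sketched as follows. The voter list here is a hypothetical stand-in for a real registration list, and the function name is our own.

```python
import random

def simple_random_sample(voter_list, n, seed=None):
    """Draw n distinct voters so that every set of n is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(voter_list, n)

# Hypothetical list of 1,000 registered voters (placeholder names).
voters = [f"voter_{i}" for i in range(1000)]
chosen = simple_random_sample(voters, 10, seed=1)
print(chosen)
```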
In another kind
of probability sample, a stratified sample, we first divide our
population into different groups, or strata, based on a predetermined
characteristic. For example, at a university, students might be
grouped into strata according to their major. Then, we select a
certain number of individuals from each stratum. The exit poll uses
a stratified sample. Each of the large, densely populated counties
in the state, as well as any county that had a participating school
in it, was included. The counties in the rural portion of the state
were grouped into strata based on the percentage of voters who voted
Democratic in the last election (2000). We then took a probability
sample of counties from each stratum.
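The idea of sampling within strata can be sketched like this. The strata labels and county names are placeholders, and drawing an equal number of counties from each stratum is an assumption made for simplicity, not a description of the actual exit poll design.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of units separately within each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, min(n_per_stratum, len(units)))
        for name, units in strata.items()
    }

# Hypothetical rural counties grouped by past Democratic vote share.
strata = {
    "low": ["county_a", "county_b", "county_c"],
    "medium": ["county_d", "county_e", "county_f", "county_g"],
    "high": ["county_h", "county_i", "county_j"],
}
picked = stratified_sample(strata, n_per_stratum=2, seed=5)
print(picked)
```

Because every stratum contributes to the sample, no group of counties can be left out by chance.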
A third commonly
used kind of probability sample is called a multistage sample.
In a multistage sample, we choose the sample in stages. For example,
a nationwide exit poll might first choose a sample of states (say,
ten states). Then, from each of the ten states, the pollsters would choose a
sample of counties. Within each county, they would choose a sample of
polling places. Finally, at each polling place they would choose a sample
of voters to fill out their survey. The Utah Colleges Exit Poll
uses both stratified sampling and multistage sampling to ultimately
select the individuals to interview.
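A multistage selection could be sketched as nested random draws. The sampling frame below (counties, polling places, voters) is entirely hypothetical, and the equal-probability draws at every stage are a simplifying assumption rather than the poll's actual procedure.

```python
import random

def multistage_sample(frame, n_counties, n_places, n_voters, seed=None):
    """Sample counties, then polling places within them, then voters."""
    rng = random.Random(seed)
    selected = []
    for county in rng.sample(sorted(frame), n_counties):          # stage 1
        places = frame[county]
        for place in rng.sample(sorted(places), n_places):        # stage 2
            for voter in rng.sample(places[place], n_voters):     # stage 3
                selected.append((county, place, voter))
    return selected

# Hypothetical frame: 4 counties, 3 polling places each, 20 voters each.
frame = {
    f"county_{c}": {
        f"place_{c}_{p}": [f"voter_{c}_{p}_{v}" for v in range(20)]
        for p in range(3)
    }
    for c in range(4)
}
picked = multistage_sample(frame, n_counties=2, n_places=2, n_voters=5, seed=7)
print(len(picked), "voters selected")
```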
There are other
ways of selecting samples as well. These samples are called non-probability,
or non-scientific samples--in part because no randomization takes
place when the individuals are selected to participate in a survey.
Sometimes the individuals are self-selected; the individuals volunteer
themselves to answer a survey. Examples of this kind of survey are
web surveys, or 1-900 number phone surveys, where you can call in
and respond if you desire. These surveys are called voluntary response
surveys.
Another non-probability
survey is called a convenience sample. In this kind of sample, individuals
are chosen because they are easily accessible. An example of this
might be an exit poll where the pollster chose to interview the
first twenty voters at the first polling place she came to. Mall-intercept
samples are similar to convenience samples; in these samples, interviewers
ask questions of mall shoppers.
Finally, in
quota sampling, interviewers simply administer the survey until
they have reached a certain goal or quota with regard to race, gender,
age, etc. These non-probability surveys are all limited because
we cannot know how well they represent the whole population. That's why
most survey companies choose to use probability sampling.
How do we judge the quality of a sample survey?
The quality
of a statistic is based on several factors, among them:
- The sampling method
- The sample size
- The margin of error
- Who is included in the sample
- How the questions that are asked are worded
- How the questions are ordered
- When the poll is conducted
- What is in the news during the polling
A good statistic
will be based on good information, so the pollster will not feel
the need to hide his information from the reader.
When selecting
a sample, some type of probability sampling should be used. This
means that everyone in the population has a known, nonzero chance of
being selected for the sample. Call-in and internet surveys that rely
on volunteers do not meet this requirement.
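When choosing
an appropriate size for a sample, the general rule is, the bigger
the better. For a probability sample, the more people surveyed, the
more precise the statistic will be, although each additional
respondent improves precision a little less than the one before.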
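To see how sample size affects precision, here is a rough sketch using the standard large-sample formula for a proportion. It assumes a simple random sample and a 95% confidence level (z = 1.96); a stratified, multistage design like the exit poll's would need a more involved calculation.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p from an SRS of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case: it gives the largest margin for a given n.
for n in (100, 400, 1600):
    print(f"n = {n:5d}  margin of error = {margin_of_error(n):.1%}")
```

Under these assumptions, quadrupling the sample size only halves the margin of error, so the gains from a larger sample shrink as the sample grows.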
It is important
to know what population the statistic applies to. When we are told
that 47% of all people prefer vanilla ice cream to chocolate, it
is important to know whether that 47% is for all the people in the
United States, or only those living in Utah, or just the people
in Provo.
A very important
quality of a sample survey is its margin of error. The margin of
error should always be available for a good statistic. If it is
not given, it may be too large to be trustworthy, or the sample
was not randomly selected. These problems undermine the validity
of the statistic.
Is there any protection for the public against bad statistics?
Unfortunately,
the answer is no. Some people use unethical methods to create a
statistic that says what they want it to say. Your best bet is to
be an informed reader and to always carefully question and analyze
each statistic that you encounter.
How and why do we predict voter turnout?
For each of
the counties in our sample, we estimate how many voters will vote
on election day. This is for both statistical and practical purposes.
For instance, we use these estimates for choosing which voting precincts,
or more specifically, which polling places to include in our sample.
(A precinct is a small geographic division within a county, created
to manage the voting process in a county. One or more precincts
are assigned to vote at each polling place, so all the voters do
not overrun a single voting station.) We designed our sample so
that polling places with more expected voters are more likely to
appear in the sample.
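One simple way to make larger polling places more likely to be chosen is to draw them one at a time with probability proportional to expected turnout. This is only an illustrative sketch; the polling place names and turnout figures are invented, and the poll's actual selection procedure is not described here.

```python
import random

def weighted_sample_without_replacement(expected_voters, k, seed=None):
    """Pick k distinct polling places, favoring those with more expected voters."""
    rng = random.Random(seed)
    remaining = dict(expected_voters)
    chosen = []
    for _ in range(min(k, len(remaining))):
        names = list(remaining)
        weights = [remaining[name] for name in names]
        # Each draw is proportional to expected turnout among places left.
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen

# Hypothetical expected turnout per polling place.
expected = {"place_1": 2500, "place_2": 1200, "place_3": 600, "place_4": 300}
print(weighted_sample_without_replacement(expected, k=2, seed=11))
```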
To predict the
voter turnout of each county, we use statistical methods based on
turnouts from past elections. Next, we estimate
the number of voters who will come to the selected polling places
within those counties. Again, we use past data to find the proportion
of voters in that county who voted at these polling places and distribute
the county turnout according to those proportions.
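The second step, spreading a county's predicted turnout across its polling places, amounts to simple proportional allocation. The sketch below uses made-up numbers and a hypothetical function name; it does not cover the statistical model used to predict the county turnout itself.

```python
def allocate_turnout(county_turnout, past_votes_by_place):
    """Split predicted county turnout by each place's past share of votes."""
    total = sum(past_votes_by_place.values())
    if total == 0:
        raise ValueError("no past votes recorded for this county")
    return {
        place: county_turnout * votes / total
        for place, votes in past_votes_by_place.items()
    }

# Hypothetical county: 10,000 predicted voters, three polling places.
past_votes = {"place_1": 600, "place_2": 300, "place_3": 100}
predicted = allocate_turnout(10_000, past_votes)
print(predicted)
```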
Just how close
are our estimates? We won't know until Election Day!
What is non-response and how can it affect survey accuracy?
Simply put,
the rate of non-response is the proportion of people who refuse
to participate in the survey when asked. Because we use probability
sampling in the exit poll, we identify certain voters as the sample,
and these individuals are the people we ask to participate in the
poll. If they refuse to do so, we do not know how they would have
responded, and so our results may be inaccurate. For example, in
the exit poll, if everyone who refuses to participate votes for
the same candidate, our results will show a lower proportion for
that candidate than actually occurred, because we were not able
to gather data from the non-respondents.
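The effect can be illustrated with a small calculation. The numbers are invented for the example: 1,000 sampled voters, 550 of whom voted for candidate A, and 200 refusals that all come from A's voters.

```python
def observed_share(supporters, others, refusing_supporters):
    """Share for a candidate among respondents when some supporters refuse."""
    responding_supporters = supporters - refusing_supporters
    return responding_supporters / (responding_supporters + others)

# Hypothetical sample: 550 A voters, 450 other voters.
true_share = 550 / (550 + 450)
seen_share = observed_share(supporters=550, others=450, refusing_supporters=200)

print(f"True share for A in the sample: {true_share:.1%}")
print(f"Share among those who responded: {seen_share:.1%}")
```

In this extreme case the poll would show A trailing even though A actually leads. Real non-response is rarely this one-sided, but the direction of the distortion is the same.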
What is a margin of error?
We take surveys
to estimate some measurement of the entire group on the basis of
a smaller sample of individuals within that group. We are estimating
this value because we don't know the true value. No matter how good
a survey is, you will rarely get an estimate that perfectly matches
the actual results. That is the nature of statistics. As a result,
we want to calculate an interval that we are reasonably certain
will contain the true value. Statisticians use the margin of error
to calculate this interval.
Say that a political
poll gives a certain candidate 55% of the popular vote with a margin
of error of ± 3%. What this means is that the poll is reasonably
certain that the actual percent of popular vote is between 52% and
58%. When the resulting intervals for the two candidates overlap, it
is difficult to say for certain who is really ahead.
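The interval arithmetic and the overlap check can be sketched as follows. The 55% and ± 3% figures come from the example above; the other percentages are hypothetical.

```python
def interval(estimate, margin):
    """Interval (low, high) implied by an estimate and its margin of error."""
    return (estimate - margin, estimate + margin)

def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Percentages are kept as whole numbers to avoid rounding noise.
leader = interval(55, 3)    # the example above: 52% to 58%
trailer = interval(45, 3)   # hypothetical opponent: 42% to 48%
print(leader, trailer, intervals_overlap(leader, trailer))

# A hypothetical closer race: 51% vs 49%, both with a 3-point margin.
close_a, close_b = interval(51, 3), interval(49, 3)
print(close_a, close_b, intervals_overlap(close_a, close_b))
```

In the first race the intervals do not overlap, so the leader is reasonably secure. In the second they do, so the poll alone cannot say who is ahead.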
The lower the
margin of error, the better the survey. There is only one way to
be 100% sure that your results are correct, and that is to take
a census. As an informed reader, be suspicious of any survey that
either omits the margin of error entirely or has an unusually high
margin of error.
For further information regarding how we obtained our sample, click
here
For more information regarding survey research methods in general, visit:
- The Survey Research Methods Section of the American Statistical Association
- The American Association for Public Opinion Research