What Is Statistical Bias? (3 Important Things To Remember)

Back in 1976, Ann Landers, the famous advice columnist, posed this question to parents: “If you had to do it over again, would you become a parent?” Over 10,000 people voluntarily responded, and the result was surprising: over 70% of respondents replied “no.”

Even a simple yes/no survey question can introduce statistical bias, depending on how the question is worded, who it is sent to, who responds, and so forth.

That seems like a lot of unhappy parents. What’s going on? Was this truly an accurate indicator of the overall population? This case illustrates what happens when surveys aren’t conducted properly.

Often, studies inadvertently introduce bias into their sample surveys. Let’s dig a little deeper and figure out what statistical bias is.

Simply put, bias is the systematic tendency for a statistic to overestimate or underestimate the population parameter we’re trying to measure. For example, suppose you conduct a survey to determine the average height of people in your school.

When finding average height or weight, statistical bias can skew your results if your sample overrepresents or underrepresents a part of the population.

You determine that the average height is 5’5”, but in actuality the population’s average height is 5’10”. Perhaps some statistical bias crept into your analysis. In this case, the statistic underestimated the population’s average height.
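To make the idea concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: a hypothetical school of 2,000 students whose heights center near 70 inches (5’10”), and a flawed survey that can only reach the shorter 60% of students. It compares the long-run average of the sample mean under each procedure with the true population mean.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical school: 2,000 heights in inches, centered near 70 in (5'10").
population = [rng.gauss(70, 3) for _ in range(2000)]
true_mean = statistics.mean(population)

# Assumed flaw: the survey can only reach the shorter 60% of students.
reachable = sorted(population)[:1200]


def average_estimate(pool, sample_size=50, repeats=1000):
    """Average the sample mean over many repeated samples drawn from pool."""
    estimates = [
        statistics.mean(rng.sample(pool, sample_size)) for _ in range(repeats)
    ]
    return statistics.mean(estimates)


random_estimate = average_estimate(population)  # samples from everyone
biased_estimate = average_estimate(reachable)   # samples from the reachable group

print(f"true mean:             {true_mean:.2f}")
print(f"random sampling:       {random_estimate:.2f}")
print(f"flawed sampling:       {biased_estimate:.2f}")
print(f"bias of flawed method: {biased_estimate - true_mean:.2f}")
```

Under these assumptions, the random procedure should land close to the true mean on average, while the flawed procedure should come in consistently low. That consistent gap, not the wobble of any single sample, is what we mean by bias.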

When conducting a survey, one of the most important things is to avoid statistical bias. We want to make sure that the sample we’re choosing accurately represents the population we’re studying.

For example, if we wanted to study the holiday shopping habits of all teenagers and surveyed teens as they passed by at the local mall, we would end up with pretty faulty results.

If our goal is to learn the shopping habits of all teens during the holidays, we are now underrepresenting all the teens that don’t go to the local mall. We could miss all those that shop online or those teens that don’t shop at all. Our sample is biased.

If we don’t avoid bias in the sampling, our results can be misleading. So, it always makes sense to consider how we are going to sample in order to avoid misleading or even useless results.

Types of Bias

Statistical bias comes in many forms. We’re going to look at several types of bias and then discuss how to minimize bias in studies. Remember, the goal of a sample survey is to select a subset of the overall population we’re studying in order to get an accurate picture of what’s happening with the population.

Selection Bias

This type of bias can occur if one group is inadvertently selected for the survey over another group. Think back to our teenage holiday shopping habits example.

We’re interested in learning the holiday shopping habits of all teenagers, but we only surveyed those that are at a mall. Online teen shoppers as well as non-shoppers aren’t represented at all.

A famous example of selection bias concerns the 1936 presidential election. The Literary Digest, a weekly general interest magazine, conducted an opinion poll that predicted a landslide victory for the Republican Alfred Landon over Franklin D. Roosevelt in the presidential election.

Selection bias occurs when the sample from a population is selected in a way that overrepresents one group.

When the magazine conducted the survey, they polled their subscribers and then used lists of registered automobile owners and telephone users. Think about that.

The magazine surveyed people with cars and telephones. The year was 1936, in the middle of the Great Depression, and cars and telephones were far more common in wealthier households.

This created a biased poll since most of those polled were wealthy and tended to vote Republican. The magazine received about 2 million responses and still incorrectly predicted the outcome of the election! That’s how important bias is in a study.
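The lesson that a huge sample can’t rescue a biased sampling method can be sketched with a toy simulation. The numbers below are made up and are not historical data: we assume 30% of voters own a car or phone, that 65% of owners favor Landon, and that 30% of non-owners do. We then compare a 50,000-person poll of owners only with a 1,000-person random poll of everyone.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Made-up electorate, NOT historical data: each voter is a pair
# (owns_car_or_phone, favors_landon).
voters = []
for _ in range(200_000):
    owner = rng.random() < 0.30
    favors_landon = rng.random() < (0.65 if owner else 0.30)
    voters.append((owner, favors_landon))


def landon_share(sample):
    """Fraction of the sample that favors Landon."""
    return sum(favors for _, favors in sample) / len(sample)


true_share = landon_share(voters)

# Huge poll drawn only from car and phone owners.
owners = [voter for voter in voters if voter[0]]
big_biased_poll = rng.sample(owners, 50_000)

# Much smaller poll drawn at random from all voters.
small_random_poll = rng.sample(voters, 1_000)

print(f"true Landon share:        {true_share:.1%}")
print(f"50,000 owners-only poll:  {landon_share(big_biased_poll):.1%}")
print(f"1,000 random-sample poll: {landon_share(small_random_poll):.1%}")
```

In this toy setup, the big owners-only poll should call the race for Landon even though he trails in the full electorate, while the small random poll should land near the true share. More responses only make a biased poll more confidently wrong.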

Voluntary Response Bias (also known as self-selection bias)

This type of bias occurs when a survey relies on participants who volunteer to take the survey. We can end up with a group that does not represent the research population.

For example, suppose an airline sends out an email survey to all of the travelers on a recent flight from Paris to Boston. It’s quite possible that most of the people who take the time to fill out the survey were unhappy with some aspect of the flight: the timeliness of the flight, the food quality, the temperature of the cabin, the lack of legroom, etc.

If you survey every passenger on a given flight, you may only hear from those with extreme opinions, due to voluntary response bias (self-selection bias).

Those that found the flight to be perfectly fine perhaps aren’t as likely to reply to the email. Will the airline get a good understanding of how all 234 passengers felt about the flight? Probably not.

Often, people who fill out surveys have strong feelings about the subject. This can lead to skewed results since those with more neutral feelings are underrepresented.
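Here is a rough sketch of how self-selection can distort a satisfaction survey. All of the numbers are assumptions chosen for illustration: the mix of satisfaction scores, the chance that a traveler with each score replies, and a pool of 20,000 emailed travelers (far more than one flight, so that the pattern isn’t drowned out by chance).

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical satisfaction scores (1 = very unhappy, 5 = delighted).
# Most travelers are assumed to feel neutral to fairly happy.
scores = [1, 2, 3, 4, 5]
score_weights = [0.05, 0.10, 0.35, 0.35, 0.15]

# Assumed chance of replying to the email, by score: strong feelings
# (especially unhappy ones) make a reply much more likely.
reply_chance = {1: 0.60, 2: 0.40, 3: 0.05, 4: 0.05, 5: 0.20}

travelers = rng.choices(scores, weights=score_weights, k=20_000)
respondents = [s for s in travelers if rng.random() < reply_chance[s]]


def middling_share(group):
    """Fraction of the group that gave a middling score of 3 or 4."""
    return sum(s in (3, 4) for s in group) / len(group)


all_mean = statistics.mean(travelers)
respondent_mean = statistics.mean(respondents)

print(f"average score, all travelers: {all_mean:.2f}")
print(f"average score, respondents:   {respondent_mean:.2f}")
print(f"middling share, all:          {middling_share(travelers):.0%}")
print(f"middling share, respondents:  {middling_share(respondents):.0%}")
```

With these assumed reply rates, the respondents should look noticeably less satisfied than travelers as a whole, and the large middle group should nearly vanish from the results.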

Funding Bias (also known as sponsorship bias)

This occurs when the sponsor of the survey or study has an interest in the results of the study.

Funding bias (sponsorship bias) can influence the outcome of a statistical study in favor of whoever paid for the research.

For example, a 2006 review of experimental studies examining the health effects of cell phone use found that studies funded exclusively by those in the cell phone industry were least likely to report a direct connection between cell phone use and ill health.

This isn’t surprising, since a reported link between cell phone use and ill health wouldn’t help cell phone sales!

Nonresponse Bias

This type of bias can happen in a few different ways.

Suppose a survey is mailed to residents in a town to determine how they feel about an upcoming town vote. Many people may not bother to take the time to fill out the survey and mail it back in.

Nonresponse bias can also arise when people refuse to participate because the topic being surveyed is embarrassing or even involves illegal activity.

Nonresponse bias occurs when people fail to respond to a survey.

In other cases, certain people may be more likely to respond to the survey than others. For example, people who consistently exercise are more inclined to answer a survey about exercise habits.

There can also be overlap in the different types of biases. In our example concerning the 1936 presidential election, the survey suffered from selection bias, as we’ve already noted. When investigated further, it came to light that the survey also suffered greatly from nonresponse bias.

Apparently, the people that mailed in their responses were strongly opposed to Roosevelt. And, it was discovered that most Roosevelt supporters didn’t respond to the survey.

Response Bias (also called survey bias)

This type of bias occurs when survey respondents answer questions untruthfully or in a misleading way. Why would they do that?

Perhaps the goal of the study is to determine the exercise habits of middle-aged people. Some inactive people may not want to admit how sedentary their lifestyle has become. So, if a survey question asks, “In a given week, how often do you exercise?” they may answer untruthfully, “1-2 times per week,” just so they’re not perceived as lazy!

Response bias occurs when survey respondents answer untruthfully (for example, to cover up the fact that they don’t exercise much!).

Undercoverage Bias

This is a common type of sampling bias, and it happens when some groups in the population are poorly represented or not represented at all in the study sample.

The goal in conducting a survey is to draw conclusions about the population as a whole by investigating a subset of the population. If part of the population is underrepresented in the sample, then the survey results will be an inaccurate representation of the population.

This type of bias can occur when researchers use convenience sampling, which is exactly what it sounds like: including people in the sample because they are easily available.

For example, you could stand in the center of town at lunch time and poll people about the town’s plans to build a new school. What can go wrong with this type of survey?

You may get undercoverage bias if you use convenience sampling (including people in the sample because they are readily available, such as in a crowded space in a city).

Well, if you’re trying to determine the opinions of the town residents as a whole, the sample you obtain from standing in the center of town on a weekday will likely exclude school students, those that are confined to their homes, as well as those who work out of town. This can result in undercoverage bias.

Wording Bias

This type of bias crops up when the person conducting the survey uses non-neutral language.

Consider this scenario: the campaign manager for a candidate for college class president parks herself at the campus center and asks passersby about the incumbent, “Do you plan to vote for Chris as class president even though he was such a disastrous leader this past year?”

Survey results can be skewed by wording bias if the language of a question is suggestive or misleading.

Well, it’s pretty obvious that the campaign manager is wording the question in such a way as to elicit a certain response. Language is important when conducting a survey.

Again, notice that there can be considerable overlap between types of bias. A survey may have more than one type of bias. Non-neutral questions in a survey can have both response bias and wording bias. Selection bias and undercoverage bias pair up frequently.

How do you avoid statistical bias?

You might think – why bother with sampling at all? Why not conduct a census – that is, question everyone in the entire population about the particular issue?

That’s a lot more difficult than it sounds. A census is a costly, time-consuming way to gather data and is impractical on many levels. Even actual censuses aren’t foolproof. The US Census, which is conducted every 10 years, can still have problems collecting data from every person living in the United States.

Homeless people, those who frequently move, those who don’t submit a census form by mail or online, and those who don’t open the door to strangers may all be missed in a census. So, even a census won’t be perfect.

It turns out that the best way to avoid bias in how your sample is chosen is to select your study participants at random. Random selection guards against selection bias and undercoverage bias, although it can’t fix problems like loaded wording or untruthful answers.

It also means that every individual in the population has the same chance of being selected for the study. There are many ways to choose a random sample.

One method is to assign a number to each person in the population to be studied and then use a random number generator to select the participants. That way, each person in the population will have an equal chance of being included in the sample group.
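The numbering approach described above can be sketched in a few lines of Python. The roster size (500 people) and the sample size (25) are hypothetical, chosen only for illustration. The standard library’s `random.sample` draws without replacement, so nobody is picked twice.

```python
import random

# Hypothetical roster: the numbers 1 through 500 stand in for the people
# in the population being studied.
population_ids = list(range(1, 501))

rng = random.Random(42)  # fixed seed so the draw can be reproduced

# Draw 25 distinct IDs; every person has the same 25/500 = 5% chance
# of ending up in the sample.
participants = rng.sample(population_ids, k=25)

print(sorted(participants))
```

Fixing the seed is handy when you need to show others exactly how the sample was drawn. For a real study you would want to choose the seed in a way that nobody can steer toward a preferred sample.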

Conclusion

The bottom line: to get meaningful results when conducting a statistical survey, we need to avoid bias in our sampling method. “Bias is the bane of sampling – the one thing above all to avoid.” (Stats: Modeling the World, by Bock, Velleman, and De Veaux). The best way to do that is to use random samples! And that’s a topic for another day.

About the author:
Jean-Marie Gard is an independent math teacher and tutor based in Massachusetts. You can get in touch with Jean-Marie at https://testpreptoday.com/.