10.7 Measurement error and reducing measurement bias
Learning Objectives
Learners will be able to…
- Explain the difference between random and systematic error
- Assess for systematic error and provide examples for how to reduce systematic error in measurement
As you can see, measures never perfectly describe what exists in the real world. Good measures demonstrate validity and reliability but will always contain some degree of error. Measurement error comes in two forms: random error and systematic error.
Random error is unpredictable and does not push scores consistently higher or lower on a given measure. If you’ve ever stepped on a bathroom scale twice and gotten two slightly different results, maybe a difference of a tenth of a pound, then you’ve experienced random error. Maybe you were standing slightly differently or had a fraction of your foot off of the scale the first time. If you were to take enough measures of your weight on the same scale, you’d be able to figure out your true weight. If you administered a depression scale to a group of people, it may include someone who just lost their job or someone who had just fallen in love. These people may have scores that don’t accurately reflect their true level of depression. Any measure is subject to influence by the random occurrences of life, which means random error is present in any measurement. However, it will likely average out across participants.
Put another way, random error can be attributed to a set of unknown and uncontrollable external factors that randomly influence some observations but not others. For example, at the time of measurement, some respondents may be in a better mood than others, which may influence how they respond to the measurement items: respondents in a good mood may respond more positively to constructs like self-esteem, satisfaction, and happiness than those in a poor mood. It is not possible to anticipate which respondent will be in which mood or to control for the effect of mood in a research study. Likewise, at an organizational level, if we are measuring firm performance, regulatory or environmental changes may affect the performance of some firms in an observed sample but not others. Hence, random error is considered “noise” in measurement and is generally ignored.
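To see how random error “averages out,” consider a minimal simulation of the bathroom-scale example, written here in Python. The true weight, the size of the random error, and the number of readings are illustrative assumptions, not values from the chapter:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

true_weight = 150.0  # the "true score" the scale is trying to capture, in pounds

# Each reading is the true score plus zero-mean random error
# (roughly a tenth of a pound, as in the example above).
readings = true_weight + rng.normal(loc=0.0, scale=0.1, size=10_000)

print(readings[:3])     # individual readings wobble around 150
print(readings.mean())  # the average of many readings is very close to 150.0
```

Because the error is as likely to push a reading up as down, the errors cancel in the long run: no single reading is exactly right, but the average converges on the true score.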
In contrast, systematic error in our instruments (i.e., “measurement bias”) causes our measures to consistently return incorrect results in one direction or another, usually due to an identifiable process. Imagine you created a measure of height, but you didn’t put an option for anyone over six feet tall. If you gave that measure to your local college or university, some of the taller students might not be measured accurately. In fact, you would be under the mistaken impression that the tallest person at your school was six feet tall, when in actuality there are likely people taller than six feet at your school. This error seems innocent, but if you were using that measure to help you build a new building, those people might hit their heads!
Put differently, systematic error is introduced by factors that affect all observations of a construct across an entire sample. Returning to the earlier example of firm performance: a financial crisis impacts the performance of financial firms disproportionately more than other types of firms, such as manufacturing or service firms, so if our sample consisted only of financial firms, we would expect a systematic reduction in the performance of all firms in our sample due to the crisis. Unlike random error, which may be positive, negative, or zero across the observations in a sample, systematic error tends to be consistently positive or negative across the entire sample. Hence, systematic error is considered “bias” in measurement and should be corrected.
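Returning to the height measure, a companion simulation shows why systematic error behaves differently. In this sketch (the population mean, standard deviation, and sample size are again illustrative assumptions), the six-foot ceiling biases the recorded average downward no matter how many people we measure:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# True heights in inches (mean 68", sd 4"), so some people are over six feet (72")
true_heights = rng.normal(loc=68.0, scale=4.0, size=10_000)

# A flawed instrument with no option above six feet records 72" for anyone taller
recorded = np.minimum(true_heights, 72.0)

print(true_heights.mean())           # ~68.0 (the true average)
print(recorded.mean())               # consistently below the true average
print((true_heights > 72.0).mean())  # share of people mismeasured (~0.16)
```

Unlike random error, this bias does not shrink as the sample grows; the only fix is to correct the instrument itself.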
Types of measurement bias
Another form of error arises when researchers word questions in a way that might cause participants to think one answer choice is preferable to another. For example, if I were to ask you “Do you think global warming is caused by human activity?” you would probably feel comfortable answering honestly. But what if I asked you “Do you agree with 99% of scientists that global warming is caused by human activity?” Would you feel comfortable saying no, if that’s what you honestly felt? I doubt it. That is an example of a leading question, a question with wording that influences how a participant responds. We’ll discuss leading questions and other problems in question wording in greater detail in Chapter 12.
Social desirability bias occurs when participants in a research study want to present themselves in a positive, socially desirable way to the researcher. People in your study may want to seem tolerant, open-minded, and intelligent, but their true feelings may be closed-minded, simple, and biased. Participants might provide answers that reflect what they want others to think about them rather than their own reality. An example of this is political polling, which may show greater support for a candidate from a minority race, gender, or political party than actually exists in the electorate.
Relatedly, many respondents tend to avoid negative opinions or embarrassing comments about themselves, their employers, family, or friends. With negatively worded questions such as “Do you think that your project team is dysfunctional?”, “Is there a lot of office politics in your workplace?”, or “Have you ever illegally downloaded music files from the internet?”, the researcher may not get truthful responses. This tendency among respondents to “spin the truth” in order to portray themselves in a socially desirable manner hurts the validity of responses obtained from survey research. There is practically no way to overcome social desirability bias in a questionnaire survey, but in an interview setting, an astute interviewer may be able to spot inconsistent answers and ask probing questions or use personal observations to supplement respondents’ comments.
Another source of systematic error is called acquiescence bias, also known as “yea-saying.” It occurs when people say yes or agree to the items on a questionnaire, even when doing so doesn’t reflect their true feelings or contradicts their previous answers. For example, a person might say yes to both “I am a confident leader in group discussions” and “I feel anxious interacting in group discussions.” Those two responses are unlikely to both be true for the same person. Why would someone do this? Similar to social desirability, people may want to be agreeable. Or they might ignore their own contradictory feelings. They might not be fully engaged with the research process. You could interpret this as someone saying “yeah, I guess.” Respondents may also have cultural reasons for agreeing, trying to “save face” for themselves or for the person asking the questions by avoiding disagreement with the interviewer. Regardless of the reason, the results of your measure won’t match what the person truly feels.
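One practical way to screen for acquiescence bias is to include reverse-worded item pairs and flag respondents who agree with both, as in the confident/anxious example above. Here is a minimal sketch; the item names, the 1–5 response scale, and the cutoff of 4 are hypothetical choices for illustration:

```python
# Hypothetical reverse-worded pairs: agreeing with both items of a pair
# is contradictory and may signal yea-saying.
REVERSED_PAIRS = [("confident_leader", "anxious_in_groups")]
AGREE = 4  # on a 1-5 scale, treat 4 ("agree") and 5 ("strongly agree") as agreement

def flag_acquiescence(responses: dict[str, int]) -> list[tuple[str, str]]:
    """Return the reverse-worded pairs this respondent agreed with on both items."""
    return [
        (item_a, item_b)
        for item_a, item_b in REVERSED_PAIRS
        if responses.get(item_a, 0) >= AGREE and responses.get(item_b, 0) >= AGREE
    ]

respondent = {"confident_leader": 5, "anxious_in_groups": 4}
print(flag_acquiescence(respondent))  # [('confident_leader', 'anxious_in_groups')]
```

A flagged response is not proof of yea-saying, but it identifies questionnaires worth a second look before analysis.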
Reducing measurement bias in research
So far, we have discussed sources of error that come from choices made by respondents or researchers. Systematic error produces responses that are incorrect in one particular direction. Systematic error can be reduced with careful planning and implementation, but random error can never be eliminated. Thus, social scientists speak with humility about our measures. We must always acknowledge that, even when we work diligently to reduce systematic error, our measures are only an approximation of reality. Humility is important in scientific measurement because errors can have real consequences. Imagine someone taking a pregnancy test from a pharmacy. If the test said they were pregnant when they were not, that would be a false positive. If the test indicated that they were not pregnant when they were, that would be a false negative. Even if the test is 99% accurate, one in a hundred people will get an erroneous result when they use a home pregnancy test. For some, a false positive would be exciting at first, then devastating once they found out they were not actually having a child. A false negative would be disappointing at first, then quite shocking when they found out they were indeed having a child. While neither result is very likely for home pregnancy tests (when taken correctly), this is a good example of how measurement error can have consequences for the people being measured.
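The arithmetic behind the 99% figure is worth making explicit. This sketch assumes, purely for illustration, that the test is equally accurate for pregnant and non-pregnant users and that 30% of test-takers are actually pregnant; neither assumption comes from the chapter:

```python
accuracy = 0.99   # the 99% figure from the example above
n_users = 10_000
base_rate = 0.30  # assumed share of test-takers who are actually pregnant

errors = n_users * (1 - accuracy)  # expected wrong results: ~100 of 10,000

# Splitting the errors by true status (assuming equal accuracy in both groups):
false_negatives = n_users * base_rate * (1 - accuracy)        # pregnant, test says no
false_positives = n_users * (1 - base_rate) * (1 - accuracy)  # not pregnant, test says yes

print(f"expected erroneous results: {errors:.0f}")  # 100
print(f"  false negatives: {false_negatives:.0f}")  # 30
print(f"  false positives: {false_positives:.0f}")  # 70
```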
Systematic measurement error can be a substantial problem for conducting rigorous, ethical research. What can we do while selecting or creating our instruments to minimize the potential for error? The headings below are guidelines for reducing measurement bias. Explore the dropdown items for some ideas about good research practices.
Guidelines for reducing measurement bias
1. Make sure that you engage in a rigorous literature review so that you understand the concept that you are studying. This means understanding the different ways that your concept may manifest itself. This review should include a search for existing instruments.[1]
- Do you understand all the dimensions of your concept? Do you have a good understanding of the content dimensions of your concept(s)?
- What instruments exist? How many items are on the existing instruments? Are these instruments appropriate for your population?
- Are these instruments standardized? Note: If an instrument is standardized, that means it has been rigorously studied and tested.
2. Consult content experts to review your instrument. This is a good way to check the face validity of your items. Content experts can also help you assess the content validity of your instrument.[2]
- Do you have access to a reasonable number of content experts? If not, how can you locate them?
- Did you provide a list of critical questions for your content reviewers to use in the reviewing process?
3. Pilot test your instrument on a sufficient number of people and get detailed feedback.[3] Ask your group to provide feedback on the wording and clarity of items. Keep detailed notes and make adjustments BEFORE you administer your final tool.
- How many people will you use in your pilot testing?
- How will you set up your pilot testing so that it mimics the actual process of administering your tool?
- How will you receive feedback from your pilot testing group? Have you provided a list of questions for your group to think about?
4. Provide training for anyone collecting data for your project.[4] You should provide those helping you with a written research protocol that explains all of the steps of the project. You should also problem solve and answer any questions that those helping you may have. This will increase the chances that your tool will be administered in a consistent manner.
- How will you conduct your orientation/training? How long will it be? What modality?
- How will you select those who will administer your tool? What qualifications do they need?
5. When operationalizing variables, use a higher level of measurement if possible. This will provide more information, and you can always downgrade to a lower level of measurement later.
- Have you examined your items and their levels of measurement?
- Have you thought about whether you need to modify the type of data you are collecting? Specifically, are you asking for information that is too specific (at a higher level of measurement), which may reduce participants’ willingness to participate?
6. Use multiple indicators for a variable.[5] Think about the number of items that you will include in your tool. (A minimal reliability check for a multiple-item measure is sketched after this list.)
- Do you have enough items? Enough indicators? The correct indicators?
7. Conduct an item-by-item assessment of multiple-item measures. When you do this assessment, think about each word and how it changes the meaning of your item.
- Are there items that are redundant? Do you need to modify, delete, or add items?
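As noted in guideline 6, one common way to check whether multiple indicators “hang together” is Cronbach’s alpha, an internal-consistency statistic. The guidelines above do not prescribe a particular statistic, and the scores below are hypothetical, so treat this as a minimal sketch:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) array of item scores."""
    k = scores.shape[1]
    sum_item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical 1-5 responses from five people to a three-item measure
scores = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 2],
    [1, 2, 1],
])
print(round(cronbach_alpha(scores), 2))  # ~0.93: the items hang together well
```

A very low alpha suggests the indicators may not be measuring the same concept, while a very high alpha can signal redundant items, which connects to the item-by-item review in guideline 7.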
Standardized Instruments
The idea of coming up with your own measurement tool might sound pretty intimidating at this point. The good news is that you can often find an existing instrument in the literature that works for you and use it (with proper attribution, of course). If there are only pieces of it that you like, you may be able to reuse those pieces (with proper attribution and a description and justification of any changes). You don’t always have to start from scratch!
Key Takeaways
- Systematic error may arise from the researcher, participant, or measurement instrument.
- Systematic error biases results in a particular direction, whereas random error can be in any direction.
- All measures are prone to error and should be interpreted with humility.
- Using standardized instruments that have been validated with your population is a good way to reduce potential systematic error in measurement.
- Sullivan, G. M. (2011). A primer on the validity of assessment instruments. Journal of Graduate Medical Education, 3(2), 119–120. https://doi.org/10.4300/JGME-D-11-00075.1
- Sullivan, G. M. (2011). A primer on the validity of assessment instruments. Journal of Graduate Medical Education, 3(2), 119–120. https://doi.org/10.4300/JGME-D-11-00075.1
- Engel, R., & Schutt, R. (2013). The practice of research in social work (3rd ed.). Thousand Oaks, CA: SAGE.
- Engel, R., & Schutt, R. (2013). The practice of research in social work (3rd ed.). Thousand Oaks, CA: SAGE.
- Engel, R., & Schutt, R. (2013). The practice of research in social work (3rd ed.). Thousand Oaks, CA: SAGE.
Unpredictable error that does not push scores consistently higher or lower on a given measure but nevertheless makes them inaccurate.
(also known as bias) refers to when a measure consistently outputs incorrect data, usually in one direction and due to an identifiable process.
When a participant's answer to a question is altered by the way in which the question is written. In essence, the question leads the participant to answer in a specific way.
Social desirability bias occurs when questions lead respondents to answer in ways that don't reflect their genuine thoughts or feelings, in order to avoid being perceived negatively.
In a measure, when people say yes to whatever the researcher asks, even when doing so contradicts previous answers.
When a measure indicates the presence of a phenomenon that is in reality not present.
When a measure fails to indicate the presence of a phenomenon that is in reality present.