11.5 Sample quality

Learning Objectives

Learners will be able to…

  • Identify the considerations that go into producing a representative sample
  • Distinguish between sampling error and sampling bias and explain the factors that lead to each
  • Describe sources of sampling bias

Okay, so you’ve chosen where you’re going to get your data (setting), which elements you want and don’t want in your sample (inclusion/exclusion criteria), how you will select and recruit participants (sampling strategy and recruitment), and what size your sample should be. Unfortunately, even if you make good choices and do everything the way you’re supposed to, you can still draw a poor sample.

The purpose of a sample is to accurately represent the characteristics of the population without having to include every element of the population in the study. A representative sample is, “roughly speaking…a scaled-down version of the population, capturing its characteristics” (Grafström & Schelin, 2014).[1] If a representative sample is important to your study (e.g., studies that aim to establish valid nomothetic explanations), it would be best to use a sampling strategy that increases the likelihood of representativeness.

If you are investigating a research question using quantitative methods, the best choice is some kind of probability sampling, but aside from that, how do you know a good sample from a bad sample? As an example, we’ll use a bad sample I collected as part of a research project that didn’t go so well. Hopefully, your sampling will go much better than mine did, but we can always learn from what didn’t work.

 

How representative of the coastline is the image in the frame compared to the whole photograph?

For my study on how much it costs to get an LCSW in each state, I did not get a sample that looked like the overall population to which I wanted to generalize. My sample had a few states with more than ten responses and most states with no responses. That does not look like the true distribution of social workers across the country. I could compare the number of social workers in each state, based on data from the National Association of Social Workers, or the number of recent clinical MSW graduates from the Council on Social Work Education. More than that, I could see whether my sample matched the overall population of clinical social workers in gender, race, age, or any other important characteristics. Sadly, it wasn’t even close. So, I wasn’t able to use the data to publish a report.

Exercises

TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

Critique the representativeness of the sample you are planning to gather.

  • Will the sample of people (or documents) look like the population to which you want to generalize?
  • Specifically, what characteristics are important in determining whether a sample is representative of the population? How do these characteristics relate to your research question?

Consider returning to this question once you have completed the sampling process and evaluate whether the sample in your study was similar to what you designed in this section.

TRACK 2 (IF YOU AREN’T CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

Imagine you are studying the disproportionate rates of abuse and sexual assault for people with intellectual and developmental disabilities. You are interested in learning more about abuse prevention strategies, such as healthy relationship education, for this population.

Based on the previous section, critique the representativeness of the sample you planned to gather.

  • Will the sample of people (or documents) look like the population to which you want to generalize?
  • Specifically, what characteristics are important in determining whether a sample is representative of the population? How do these characteristics relate to your research question?

In my study, if I wanted to make sure I had a certain number of people from each state (with states serving as the strata), making the proportion of social workers from each state in my sample similar to the overall population, I would need to know which email addresses were from which states. That was not information I had. So, instead I conducted simple random sampling and randomly selected 5,000 of 100,000 email addresses on the NASW list. There was less of a guarantee of representativeness, but whatever variation existed between my sample and the population would be due to random chance. We will discuss non-random differences later in the chapter when we talk about bias. For now, just remember that the representativeness of a sample is helped by using random sampling, though it is not a guarantee.
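As a rough illustration of simple random sampling, the sketch below draws 5,000 entries from a list of 100,000 and compares each state's share in the sample with its share in the list. It assumes something the study described above did not have: a sampling frame that records each person's state. The four states and their counts are invented for illustration only.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame of 100,000 entries, each tagged with a state.
# These counts are invented; they are not NASW data.
state_counts = {"CA": 40_000, "TX": 30_000, "NY": 20_000, "VT": 10_000}
frame = [state for state, n in state_counts.items() for _ in range(n)]

# Simple random sample: every entry has the same chance of selection.
sample = random.sample(frame, k=5_000)
sample_counts = Counter(sample)

for state, n in state_counts.items():
    frame_share = n / len(frame)
    sample_share = sample_counts[state] / len(sample)
    print(f"{state}: frame {frame_share:.3f}, sample {sample_share:.3f}")
```

The shares in the sample should land close to the shares in the frame, but they will rarely match exactly. Those small gaps come from chance alone, because the selection process does not favor any state.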

Assessing representativeness should start prior to data collection. I mentioned that I drew my sample from the NASW email list, which NASW (like most organizations) sells to companies or researchers who need to advertise to social workers. How representative of my population is my sampling frame? Well, the first question to ask is what proportion of my sampling frame would actually meet my inclusion and exclusion criteria. Since my study focused specifically on clinical social workers, my sampling frame likely included social workers who were not clinical social workers, like macro social workers or social work managers. However, I knew, based on the information from NASW marketers, that many people who received my recruitment email would be clinical social workers or those working towards licensure, so I was satisfied with that. Anyone who didn’t meet my inclusion criteria and opened the survey would be greeted with clear instructions that this survey did not apply to them.

At the same time, I should have assessed whether the demographics of the NASW email list and the demographics of clinical social workers more broadly were similar. Unfortunately, this was not information I could gather. I had to trust that this was likely going to be the best sample I could draw and the most representative of all social workers.

Let’s say you work for a children’s mental health agency, and you want a representative sample to study children who have experienced abuse. Walking through the steps here might proceed like this:

  1. Think about or ask your coworkers how many of the clients at your agency have experienced this issue. If it’s common, then clients at your agency would probably make a good sampling frame for your study. If not, then you may want to adjust your research question or consider a different agency to sample. You could also narrow your target population so that your sample is more likely to represent it. For example, while your agency’s clients may not be representative of all children who have survived abuse, they may be more representative of abuse survivors in your state, region, or county. In this way, you can draw conclusions about a smaller population, rather than everyone in the world who has experienced child abuse.
  2. Think about those characteristics that are important for individuals in your sample to have or not have. Obviously, the variables in your research question are important, but so are the variables related to them. Take a look at the empirical literature on your topic. Are there different demographic characteristics or covariates that are relevant to your topic?
  3. All of this assumes that you can actually access information about your sampling frame prior to collecting data. This is a challenge in the real world. Even if you ask around your office about client characteristics, there is no way for you to know for sure until you complete your study whether it was the most representative sampling frame you could find. You will need to consider feasibility and address any shortcomings in sampling within the limitations section of your article.
  4. Even if you choose a sampling frame that is representative of your population and use a probability sampling approach, there is no guarantee that the sample you are able to collect will be representative. Random chance may mean that the people in your sample differ on important characteristics from your target population. Or perhaps people with certain important characteristics don’t respond to your recruitment efforts, which would make the sample non-representative in a systematic way.

Sampling error

The following passage is from Bhattacherjee (2012).

When you measure a certain observation from a given unit, such as a person’s response to a Likert-scaled item, that observation is called a response (see Figure 8.2). In other words, a response is a measurement value provided by a sampled unit (e.g., a person’s response on a questionnaire). Each respondent will give you different responses to different items in an instrument. Responses from different respondents to the same item or observation can be graphed into a frequency distribution based on their frequency of occurrences. For a large number of responses in a sample, a graph of this frequency distribution tends to resemble a bell-shaped curve (called a normal distribution). The frequency distribution from the sample can be used to estimate the population’s overall characteristics such as the mean (average of all observations) or standard deviation (variability or spread of the observations). In a sample, these values are called sample statistics (a “statistic” is a value that is estimated from observed data in a sample). Populations also have means and standard deviations that could be obtained if we could do a census of the entire population. However, since this is unfeasible or impossible for most populations, population characteristics remain unknown, and are called population parameters. (The true values for characteristics of a population are not called “statistics” because they are not statistically estimated from data.) Sample statistics may differ from population parameters if the sample is not perfectly representative of the population; the difference between the two is called sampling error. Theoretically, if we could gradually increase the sample size so that the sample approaches closer and closer to the population, then sampling error will decrease and a sample statistic will increasingly approximate the corresponding population parameter.
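The claim that sampling error shrinks as the sample grows can be checked with a small simulation. The sketch below is only an illustration: it invents a population of 100,000 responses to a 1-to-5 Likert-type item, so the population mean (the parameter) is known, which is never the case in a real study. It then draws many random samples of each size and reports the average gap between the sample mean and the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100,000 responses to a 1-5 Likert-type item.
population = [random.choice([1, 2, 3, 4, 5]) for _ in range(100_000)]
population_mean = statistics.mean(population)  # the population parameter

def average_sampling_error(sample_size, draws=200):
    """Average absolute gap between sample mean and population mean."""
    gaps = []
    for _ in range(draws):
        sample = random.sample(population, k=sample_size)
        gaps.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(gaps)

errors = {n: average_sampling_error(n) for n in (10, 100, 1_000)}
for n, err in errors.items():
    print(f"n = {n:>5}: average sampling error = {err:.3f}")
```

In this setup the average gap should fall as the sample size rises from 10 to 1,000, although it never reaches zero unless the whole population is measured.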

===============

In my ill-fated study of clinical social workers, I received 87 complete responses. That is far below the hundred thousand licensed or license-eligible clinical social workers. Moreover, since I wanted to conduct state-by-state estimates, there was no way I had enough people in each state to do so. Regardless of whether you are conducting exploratory, descriptive, or explanatory research, it is important to recruit the appropriate number of participants. For example, if your agency conducts a community scan of people in your service area on what services they need, the results will inform the direction of your agency, which grants it applies for, who it hires, and its mission for the next several years. Being overly confident in your sample could result in wasted resources for clients.

 

Honestly, I did not do a power analysis for my study. Instead, I asked for 5,000 surveys with the hope that 1,000 would come back. Given that only 87 came back, a power analysis conducted after the survey was complete would likely reveal that I did not have enough statistical power to answer my research questions.
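To give a sense of what a power analysis before data collection involves, here is a minimal sketch. It makes several assumptions that are not part of the study described above: a simple two-group comparison of means, a two-sided test at alpha = 0.05, 80% power, and the common normal approximation rather than an exact calculation. The function name `n_per_group` is made up for this example. The only figures taken from the text are the 87 responses and 5,000 invitations.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group to compare two means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Response rate from the text: 87 complete responses out of 5,000 invitations.
response_rate = 87 / 5_000

# Effect sizes 0.2, 0.5, and 0.8 are conventional benchmarks, assumed here.
for d in (0.2, 0.5, 0.8):
    total_needed = 2 * n_per_group(d)
    invitations = math.ceil(total_needed / response_rate)
    print(f"d = {d}: {total_needed} participants, about {invitations} invitations")
```

Under these assumptions, a medium effect (d = 0.5) calls for roughly 63 participants per group. At a response rate under 2%, even that modest target would require more invitations than the 5,000 that were sent.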

Bias

Sampling bias is a systematic difference between the sample and population in a specific direction, such as towards those who have time to check their junk mail. Bias may be introduced by the sampling method used or due to conscious or unconscious bias introduced by the researcher (Rubin & Babbie, 2017).[2] A researcher might select people who “look like good research participants,” in the process transferring their unconscious biases to their sample. They might exclude people from the sampling frame who “would not do well with the intervention.” Careful researchers can avoid these, but unconscious and structural biases can be challenging to root out.

One of the interesting things about surveying professionals is that sometimes, they email you about what they perceive to be a problem with your study. I got an email from a well-meaning participant in my LCSW study saying that my results were going to be biased! She pointed out that respondents who had been in practice a long time, before clinical supervision was required, would not have paid anything for supervision. This would lead me to draw conclusions that supervision was cheap, when in fact, it was expensive. My email back to her explained that she hit on one of my hypotheses, that social workers in practice for a longer period of time faced fewer costs to becoming licensed. Her email reinforced that I needed to account for the impact of length of practice on the costs of licensure I found across the sample. She was right to be on the lookout for bias in the sample.

One of the key questions you can ask is whether there is something about your process that makes it more likely you will select a certain type of person for your sample, making it less representative of the overall population. In my project, it’s worth thinking more about who is more likely to respond to an email advertisement for a research study. I know that my work email and personal email filter out advertisements, so it’s unlikely I would even see the recruitment for my own study (probably something I should have thought about before using grant funds to sample the NASW email list). Perhaps an older demographic that does not screen advertisements as closely, or those whose NASW account was linked to a personal email with fewer junk filters would be more likely to respond. To the extent I made conclusions about clinical social workers of all ages based on a sample that was biased towards older social workers, my results would be biased. This is called selection bias, a bias in which the elements in the sample differ systematically from the overall population. The Catalogue of Bias website provides some helpful examples of selection bias.

Another potential source of sampling bias is nonresponse bias, or self-selection bias. Because people do not often respond to email advertisements (no matter how well-written they are), my sample is likely to be representative of people with characteristics that make them more likely to respond. They may have more time on their hands to take surveys and respond to their junk mail. To the extent that the sample is comprised of social workers with a lot of time on their hands (who are those people?) my sample will be biased and not representative of the overall population.

Other sources of bias worth reviewing include:

  • Survivorship bias (see https://thedecisionlab.com/biases/survivorship-bias)
  • The healthy user effect, the healthy adherer effect, confounding by functional status or cognitive impairment, and confounding by selective prescribing (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3077477/)
  • Undercoverage bias (see https://www.scribbr.com/research-bias/undercoverage-bias/)

 

The following two sections are from Scholarpedia (CC-BY-SA 3.0): http://www.scholarpedia.org/article/Sampling_bias

Causes of sampling bias

A common cause of sampling bias lies in the design of the study or in the data collection procedure, both of which may favor or disfavor collecting data from certain classes or individuals or in certain conditions. Sampling bias is also particularly prominent whenever researchers adopt sampling strategies based on judgment or convenience, in which the criterion used to select samples is somehow related to the variables of interest. For example, in an opinion poll, an academic researcher collecting opinion data may choose, because of convenience, to collect opinions mostly from college students because they happen to live nearby, and this will bias the sampling toward the opinion prevalent in the social class living in the neighborhood.

Figure 1: Possible sources of bias occurring in the selection of a sample from a population.

In social and economic sciences, extracting random samples typically requires a sampling frame such as the list of the units of the whole population, or some auxiliary information on some key characteristics of the target population to be sampled. For instance, conducting a study about primary schools in a certain country requires obtaining a list of all schools in the country, from which a sample can be extracted. However, using a sampling frame does not necessarily prevent sampling bias. For example, one may fail to correctly determine the target population or use outdated and incomplete information, thereby excluding sections of the target population. Furthermore, even when the sampling frame is selected properly, sampling bias can arise from non-responsive sampling units (e.g., certain classes of subjects might be more likely to refuse to participate, or may be harder to contact). Non-responses are particularly likely to cause bias whenever the reason for non-response is related to the phenomenon under study. Figure 1 illustrates how the mismatches between sampling frame and target population, as well as non-responses, could bias the sample.

In experiments in physical and biological sciences, sampling bias often occurs when the target variable to be measured during the experiment (e.g., the energy of a physical system) is correlated with other factors (e.g., the temperature of the system) that are kept fixed or confined within a controlled range during the experiment. Consider, for example, the determination of the probability distribution of the speed of all cars on British roads at any time during a certain day. Speed is definitely related to location; therefore, measuring speed only at certain types of locations may bias the sample. For instance, if all measures are taken at busy traffic junctions in the city centre, the sampled distribution of car speeds will not be representative of Britain’s cars and will be strongly biased toward slow speeds, because it neglects cars travelling on motorways and on other fast roads. It is important to note that a systematic distortion of a sampled distribution of a random variable can result also from factors other than sampling bias, such as a systematic error in the instruments used to collect the sample data. Consider again the example of the distribution of the speed of cars in Britain, and suppose that the experimenter has access to the simultaneous reading of the speedometers placed on every car, so that there is no sampling bias. If most speedometers are tuned to overestimate the speed, and to overestimate it more at higher speed, then the resulting sampled distribution will be biased toward high velocities.

Correction and reduction of sampling bias

To reduce sampling bias, the two most important steps when designing a study or an experiment are (i) to avoid judgment or convenience sampling and (ii) to ensure that the target population is properly defined and that the sample frame matches it as much as possible. When finite resources or efficiency reasons limit the possibility to sample the entire population, care should be taken to ensure that the excluded populations do not differ from the overall one in terms of the statistics to be measured. In social sciences, population-representative surveys most commonly are not simple random samples, but follow more complex sample designs (Cochran 1977). For instance, in a typical household survey a sample of households is selected in two stages: in a first stage there is a selection of villages or parts of cities (clusters), and in a second stage a set number of households is selected within each selected cluster. When adopting such complex sample designs it is essential to ensure that the sample frame information is used properly and that the probability and random selection are implemented and documented at each stage of the sampling process. In fact, such information will be essential to compute unbiased estimates for the population using sampling weights (the inverse of the probability of selection) and taking into account the sampling design in order to properly compute the sampling error. In clustered sample designs like this one, the sampling error will typically be larger than in a simple random sample of the same size (Cochran 1977).
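The idea of a sampling weight (the inverse of the probability of selection) can be sketched with a design that is simpler than the two-stage household survey described above: a sample that deliberately takes the same number of households from a large stratum and a small one. Everything in the sketch is hypothetical, including the strata names, their sizes, and the outcome scores. It is meant only to show why ignoring unequal selection probabilities distorts the estimate.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical population: an outcome score for households in two strata.
population = {
    "urban": [random.gauss(50, 10) for _ in range(9_000)],
    "rural": [random.gauss(30, 10) for _ in range(1_000)],
}
true_mean = statistics.fmean(population["urban"] + population["rural"])

# Design: 100 households per stratum, so rural households are oversampled.
sample_size = 100
weighted_total = 0.0
weight_sum = 0.0
unweighted_values = []
for stratum, values in population.items():
    prob_selection = sample_size / len(values)  # known from the design
    weight = 1 / prob_selection                 # sampling weight
    sample = random.sample(values, k=sample_size)
    unweighted_values.extend(sample)
    weighted_total += weight * sum(sample)
    weight_sum += weight * len(sample)

weighted_mean = weighted_total / weight_sum
unweighted_mean = statistics.fmean(unweighted_values)
print(f"population mean: {true_mean:.1f}")
print(f"unweighted mean: {unweighted_mean:.1f}")
print(f"weighted mean:   {weighted_mean:.1f}")
```

Each sampled rural household stands in for 10 households and each sampled urban household for 90, so the weighted mean should sit much closer to the population mean than the unweighted one. This only works because the selection probabilities were documented as part of the design.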

Whenever the sampling frame includes units that do not exist anymore (e.g., because the sample frames are incorrect and outdated) it will be impossible to obtain any samples from such nonexistent units. This situation does not bias the estimates, provided that such cases are not substituted using non-random methods, and that original sampling weights are properly adjusted to take into account such sample frame imperfections (nevertheless, sample frame imperfections clearly have cost implications, and if the sample size is reduced this also influences the size of the sampling error).

Solutions to the bias due to non-response are more complex and can generally be divided into ex-ante and ex-post solutions (Groves et al. 1998). Ex-ante solutions try to prevent and minimize non-response in various ways (for instance, specific training of enumerators or several attempts to interview the respondent), whereas ex-post solutions try to gather auxiliary information about non-respondents. That information is then used to calculate a probability of response for different population sub-groups and to re-weight response data by the inverse of that probability or, alternatively, to apply post-stratification and calibration.
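One ex-post adjustment, post-stratification, can be sketched in a few lines. The example assumes that the population share of each sub-group is known from an outside source, and it uses invented numbers throughout: two age groups, their population shares, and the average licensure cost reported by respondents in each group. None of these figures come from an actual study.

```python
# Hypothetical auxiliary information: known population share of each group.
population_share = {"under_40": 0.60, "40_plus": 0.40}

# Hypothetical respondents: the older group responded far more often.
respondents = {
    "under_40": {"n": 20, "mean_cost": 5000.0},
    "40_plus": {"n": 80, "mean_cost": 2000.0},
}
total_n = sum(group["n"] for group in respondents.values())

# Unadjusted estimate: every respondent counts equally.
unadjusted = sum(g["n"] * g["mean_cost"] for g in respondents.values()) / total_n

# Post-stratification weight = population share / share among respondents.
weights = {
    name: population_share[name] / (group["n"] / total_n)
    for name, group in respondents.items()
}

adjusted = (
    sum(weights[k] * g["n"] * g["mean_cost"] for k, g in respondents.items())
    / sum(weights[k] * g["n"] for k, g in respondents.items())
)
print(f"unadjusted mean cost: {unadjusted:.0f}")
print(f"adjusted mean cost:   {adjusted:.0f}")
```

Re-weighting pulls the estimate toward the under-represented group. It corrects only for the characteristics used to build the weights, so respondents and non-respondents within a group are still assumed to be alike.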

Sampling error versus bias

It’s important to note that both bias and error describe how samples differ from the overall population. (Somewhat confusingly, a sample that is not representative of the population is sometimes called “biased” even if its lack of representativeness comes from sampling error and not sampling bias!)

To recap:

  • Sampling error describes random variations between the sample and the population, due to chance (i.e., who was randomly included in the sample) and is unavoidable when sampling.
  • Sampling bias is systematic and can be reduced by careful research planning and execution.
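The contrast in this recap can be illustrated with a simulation that compares a random selection process with a biased one at several sample sizes. The setup is invented: a population of weekly working hours, and a recruitment process in which people who work fewer than 40 hours are three times as likely to respond. The response probabilities are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly working hours for 50,000 people.
population = [random.gauss(40, 8) for _ in range(50_000)]
true_mean = statistics.fmean(population)

# Biased process: people working under 40 hours are three times as likely
# to respond (0.6 versus 0.2; both probabilities are assumed).
responders = [h for h in population if random.random() < (0.6 if h < 40 else 0.2)]

def average_gap(pool, sample_size, draws=200):
    """Average absolute gap between a sample mean and the population mean."""
    gaps = [
        abs(statistics.fmean(random.sample(pool, k=sample_size)) - true_mean)
        for _ in range(draws)
    ]
    return statistics.fmean(gaps)

random_gap = {n: average_gap(population, n) for n in (20, 200, 2_000)}
biased_gap = {n: average_gap(responders, n) for n in (20, 200, 2_000)}
for n in (20, 200, 2_000):
    print(f"n = {n:>4}: random {random_gap[n]:.2f}, biased {biased_gap[n]:.2f}")
```

With random selection the gap should keep shrinking as the sample grows. With the biased process it should level off at roughly the same distance from the population mean, because collecting more data from a skewed pool does not remove the skew.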

 

Exercises

TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

  • Identify potential sources of bias in your sample and brainstorm ways you can minimize them, if possible.

TRACK 2 (IF YOU AREN’T CREATING A RESEARCH PROPOSAL FOR THIS CLASS):

Imagine you are studying the disproportionate rates of abuse and sexual assault for people with intellectual and developmental disabilities. You are interested in learning more about abuse prevention strategies, such as healthy relationship education, for this population.

  • Identify potential sources of bias in your sample and brainstorm ways you can minimize them, if possible.

Critical considerations

Think back to your undergraduate education. Did you ever participate in a research project as part of an introductory psychology or sociology course? Social science researchers on college campuses have a luxury that researchers elsewhere may not share: access to a whole bunch of (presumably) willing and able human guinea pigs. But that luxury comes at a cost to sample representativeness. One study of top academic journals in psychology found that over two-thirds (68%) of participants in studies published by those journals came from the United States (Arnett, 2008).[3] Further, the study found that two-thirds of the U.S. samples in studies published in the Journal of Personality and Social Psychology were made up entirely of American undergraduate students taking psychology courses.

These findings certainly raise the question: What do we actually learn from social science studies and about whom do we learn it? That is exactly the concern raised by Joseph Henrich and colleagues (Henrich, Heine, & Norenzayan, 2010),[4] authors of the article “The Weirdest People in the World?” In their piece, Henrich and colleagues point out that behavioral scientists very commonly make sweeping claims about human nature based on samples drawn only from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) societies, and often based on even narrower samples, as is the case with studies relying on samples drawn from U.S. college classrooms. As it turns out, findings about the nature of human behavior when it comes to fairness, cooperation, visual perception, trust, and other behaviors are based on studies that excluded participants from outside the United States and sometimes excluded anyone outside the college classroom (Begley, 2010).[5] This certainly raises questions about what we really know about human behavior as opposed to U.S. resident or U.S. undergraduate behavior. Of course, not all research findings are based on samples of WEIRD folks like college students. But even then, it would behoove us to pay attention to the population on which studies are based and the claims being made about those to whom the studies apply.

Another thing to keep in mind is that even if a sample is representative in all respects that a researcher thinks are relevant, there may be relevant aspects that didn’t occur to the researcher. In 2008, you might not have thought that a person’s phone would have much to do with their voting preferences, for example. But had pollsters making predictions about the results of the 2008 presidential election not been careful to include both cell phone-only and landline households in their surveys, it is possible that their predictions would have underestimated Barack Obama’s lead over John McCain because Obama was much more popular among cell phone-only users than McCain (Keeter, Dimock, & Christian, 2008).[6]

 

Putting it all together

So how do we know how good our sample is or how good the samples gathered by other researchers are? There are a couple of things we can keep in mind as we read the claims researchers make about their findings.

First, remember that sample quality is determined only by the sample actually obtained, not by the sampling method itself. A researcher may set out to administer a survey to a representative sample by correctly employing a random sampling approach with impeccable recruitment materials. But, if only a handful of the people sampled actually respond to the survey, the researcher should not make claims as if their sampling went according to plan.

Another thing to keep in mind, as demonstrated by the preceding discussion, is that researchers may be drawn to talking about implications of their findings as though they apply to some group other than the population actually sampled. Whether the sampling frame does not match the population or the sample and population differ on important criteria, the resulting lack of representativeness can lead to bad science.

We’ve talked previously about the perils of generalizing social science findings from students in the United States and other Western countries to all cultures in the world, imposing a Western view as the right and correct view of the social world. As consumers of theory and research, it is our responsibility to be attentive to this. And as researchers, it is our responsibility to make sure that we only make conclusions about populations from samples that are likely representative. A larger sample size and probability sampling can improve the representativeness and generalizability of the study’s findings to larger populations, though neither is a guarantee.

At their core, questions about sample quality should address who has been sampled, how they were sampled, and for what purpose they were sampled. Being able to answer those questions will help you better understand, and more responsibly interpret, research results. For your study, keep the following questions in mind.

  • Are your sample size and your sampling approach appropriate for your research question(s)?
  • How much do you know about your sampling frame ahead of time? How will that impact the feasibility of different sampling approaches?
  • What gatekeepers and stakeholders are necessary to engage in order to access your sampling frame?
  • Are there any ethical issues that may make it difficult to sample those who have first-hand knowledge about your topic?
  • Does your sampling frame look like your population along important characteristics? Once you get your data, ask the same question of the sample you successfully recruit.
  • What about your population might make it more difficult or easier to sample?
  • How many people can you feasibly sample in the time you have to complete your project?
  • Are there steps in your sampling procedure that may bias your sample to render it not representative of the population?
  • If you want to skip sampling altogether, are there sources of secondary data you can use? Or might you be able to answer your questions by sampling documents or media, rather than people?

Key Takeaways

  • The sampling plan you implement should have a reasonable likelihood of producing a representative sample.
  • The sample you collect is one of a very large number of potential samples that could have been drawn. To the extent the data in your sample varies from the data in the entire population, it includes some error or bias. Error is the result of random variations. Bias is systematic error that pushes the data in a given direction.
  • Even if you do everything right, there is no guarantee that you will draw a good sample.
  • A flawed sample limits how far you can generalize beyond your specific participants.
  • Historically, samples were drawn from dominant groups and generalized to all people. This shortcoming is a limitation of some social science literature and should be considered a colonialist scientific practice.

Post-awareness check (Emotion)

After reading this chapter and completing the Track 1 & 2 exercises, what have you noticed about your interest in studying your research topic?


  1. Grafström, A. and Schelin, L. (2014). How to select representative samples. Scandinavian Journal of Statistics, 41, 277-290. https://doi.org/10.1111/sjos.12016
  2. Rubin, A., & Babbie, E. R. (2017). Research methods for social work (9th edition). Boston, MA: Cengage.
  3. Arnett, J. J. (2008). The neglected 95%: Why American psychology needs to become less American. American Psychologist, 63, 602–614.
  4. Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33, 61–135.
  5. Newsweek magazine published an interesting story about Henrich and his colleagues’ study: Begley, S. (2010). What’s really human? The trouble with student guinea pigs. Retrieved from http://www.newsweek.com/2010/07/23/what-s-really-human.html
  6. Keeter, S., Dimock, M., & Christian, L. (2008). Calling cell phones in ’08 pre-election polls. The Pew Research Center for the People and the Press. Retrieved from http://people-press.org/files/legacy-pdf/cell-phone-commentary.pdf

License

Doctoral Research Methods in Social Work Copyright © by Mavs Open Press. All Rights Reserved.
