14.1 What is experimental design and when should you use it?
Learning Objectives
Learners will be able to…
- Describe the purpose of experimental design research
- Describe nomothetic causality and the logic of experimental design
- Identify the characteristics of a basic experiment
- Discuss the relationship between dependent and independent variables in experiments
- Identify the three major types of experimental designs
Pre-awareness check (Knowledge)
What are your thoughts on the term ‘experiment’ in the realm of social sciences? In an experiment, what is the independent variable?
The basics of experiments
In social work research, experimental design is used to test the effects of treatments, interventions, programs, or other conditions to which individuals, groups, organizations, or communities may be exposed. Social work researchers can use experiments to explore topics such as treatments for depression, the impact of school-based mental health on student outcomes, or the prevention of abuse of people with disabilities. The American Psychological Association defines an experiment as:
a series of observations conducted under controlled conditions to study a relationship with the purpose of drawing causal inferences about that relationship. An experiment involves the manipulation of an independent variable, the measurement of a dependent variable, and the exposure of various participants to one or more of the conditions being studied. Random selection of participants and their random assignment to conditions also are necessary in experiments.
In experimental design, the independent variable is the intervention, treatment, or condition that is being investigated as a potential cause of change (i.e., the experimental condition). The effect, or outcome, of the experimental condition is the dependent variable. In everyday speech, we often call things like trying out a new restaurant or dating a new person “experiments.” However, a true social science experiment would include recruiting a large enough sample, randomly assigning participants to control and experimental groups, exposing those in the experimental group to an experimental condition, and collecting observations at the end of the experiment.
Social scientists use this level of rigor and control to maximize the internal validity of their research. Internal validity is the confidence researchers have about whether the independent variable (e.g., treatment) truly produces a change in the dependent, or outcome, variable. The logic and features of experimental design are intended to help establish causality and to reduce threats to internal validity, which we will discuss in Section 14.5.
Experiments attempt to establish a nomothetic causal relationship between two variables—the treatment and its intended outcome. We discussed the four criteria for establishing nomothetic causality in Section 4.3:
- plausibility,
- covariation,
- temporality, and
- nonspuriousness.
Experimenters should establish plausibility by offering a plausible reason why their intervention would cause changes in the dependent variable. Usually, a theoretical framework or previous empirical evidence will indicate the plausibility of a causal relationship.
Covariation can be established for causal explanations by showing that the “cause” and the “effect” change together. In experiments, the cause is an intervention, treatment, or other experimental condition. Whether or not a research participant is exposed to the experimental condition is the independent variable. The effect in an experiment is the outcome being assessed and is the dependent variable in the study. When the independent and dependent variables covary, they can have a positive association (e.g., those exposed to the intervention have increased self-esteem) or a negative association (e.g., those exposed to the intervention have reduced anxiety).
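To make covariation concrete, here is a minimal sketch that compares the average outcome for participants who were and were not exposed to an intervention. The scores are invented for illustration (they are not data from any study), and the function name is our own. A difference below zero corresponds to the negative association described above (reduced anxiety among those exposed).

```python
from statistics import mean

# Hypothetical post-test anxiety scores (higher = more anxious); not real data.
experimental = [12, 15, 11, 14, 10, 13]  # exposed to the intervention
control = [18, 20, 17, 19, 21, 16]       # not exposed


def mean_difference(exposed, not_exposed):
    """Mean outcome of the exposed group minus that of the unexposed group."""
    return mean(exposed) - mean(not_exposed)


diff = mean_difference(experimental, control)

# Label the direction of the association between exposure and outcome.
if diff < 0:
    direction = "negative"
elif diff > 0:
    direction = "positive"
else:
    direction = "no"

print(f"Mean difference: {diff:.1f} ({direction} association)")
```

A difference in means shows only that the two variables change together. On its own it says nothing about plausibility, temporality, or nonspuriousness.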
Since the researcher controls when the intervention is administered, they can be assured that changes in the independent variable (the treatment) happen before changes in the dependent variable (the outcome). In this way, experiments assure temporality.
Finally, one of the most important features of experiments is that they allow researchers to eliminate spurious variables to support the criterion of nonspuriousness. True experiments are usually conducted under strictly controlled conditions. The intervention is given in the same way to each person, with a minimal number of other variables that might cause their post-test scores to change.
The logic of experimental design
How do we know that one phenomenon causes another? The complexity of the social world in which we practice and conduct research means that causes of social problems are rarely cut and dried. Uncovering explanations for social problems is key to helping clients address them, and experimental research designs are one road to finding answers.
Just because two phenomena are related in some way doesn’t mean that one causes the other. Ice cream sales increase in the summer, and so does the rate of violent crime; does that mean that eating ice cream is going to make me violent? Obviously not, because ice cream is great. The reality of that association is far more complex—it could be that hot weather makes people more irritable and, at times, violent, while also making people want ice cream. More likely, though, there are other social factors not accounted for in the way we just described this association.
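The ice cream example can be simulated. The sketch below is a deliberately simplified model, not a description of real crime data: it assumes that temperature is the only thing driving both ice cream sales and violent crime, and every number in it is made up. Under that assumption, the two outcomes are clearly correlated, yet the correlation nearly vanishes once temperature is accounted for (here by correlating the residuals left over after regressing each variable on temperature).

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def residuals(y, x):
    """What is left of y after a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(42)  # fixed seed so the simulation is repeatable

# Assumed model: temperature drives BOTH outcomes; neither affects the other.
temperature = [rng.gauss(25, 5) for _ in range(2000)]
ice_cream = [2 * t + rng.gauss(0, 10) for t in temperature]
crime = [t + rng.gauss(0, 5) for t in temperature]

raw_r = pearson(ice_cream, crime)
adjusted_r = pearson(residuals(ice_cream, temperature),
                     residuals(crime, temperature))

print(f"Correlation ignoring temperature:   {raw_r:.2f}")
print(f"Correlation adjusting for temperature: {adjusted_r:.2f}")
```

In real research the third variable is often unknown or unmeasured, so this kind of statistical adjustment is not always possible. That is one reason experiments control extraneous variables through their design.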
As we have discussed, experimental designs can help clear up at least some of this fog by allowing researchers to isolate the effect of interventions on dependent variables by controlling extraneous variables. In true experimental design (discussed in the next section) and quasi-experimental design, researchers accomplish this with a control group or comparison group and the experimental group. The experimental group is sometimes called the treatment group because people in the experimental group receive the treatment or are exposed to the experimental condition (we will call it the experimental group in this chapter). The control/comparison group does not receive the treatment or intervention. Instead they may receive what is known as “treatment as usual” or perhaps no treatment at all.
In a well-designed experiment, the control group should look almost identical to the experimental group in terms of demographics and other relevant factors. What if we want to know the effect of CBT on social anxiety, but we have learned in prior research that men tend to have a more difficult time overcoming social anxiety? We would want our control and experimental groups to have similar proportions of men, since ostensibly, both groups’ results would be affected by the men in the group. If your control group has 5 women, 6 men, and 4 non-binary people, then your experimental group should be made up of roughly the same gender balance to help control for the influence of gender on the outcome of your intervention. (In reality, the groups should be similar along other dimensions, as well, and your group will likely be much larger.) The researcher will use the same outcome measures for both groups and compare them, and assuming the experiment was designed correctly, get a pretty good answer about whether the intervention had an effect on social anxiety.
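A quick way to check whether two groups have "roughly the same gender balance" is to compare the proportion of each category across groups. The sketch below uses the control group from the example above (5 women, 6 men, 4 non-binary people). The experimental group's composition is a hypothetical one chosen for illustration, and the helper names are our own.

```python
from collections import Counter

# Control group from the example in the text.
control = ["woman"] * 5 + ["man"] * 6 + ["non-binary"] * 4
# Hypothetical experimental group with a similar, but not identical, makeup.
experimental = ["woman"] * 6 + ["man"] * 5 + ["non-binary"] * 4


def proportions(group):
    """Share of the group that falls in each category."""
    counts = Counter(group)
    return {category: n / len(group) for category, n in counts.items()}


def largest_gap(group_a, group_b):
    """Largest absolute difference in any category's share between groups."""
    pa, pb = proportions(group_a), proportions(group_b)
    categories = set(pa) | set(pb)
    return max(abs(pa.get(c, 0.0) - pb.get(c, 0.0)) for c in categories)


gap = largest_gap(control, experimental)
print(f"Largest difference in gender proportions: {gap:.1%}")
```

How small a gap counts as "roughly the same" is a judgment call that depends on the sample size and on how strongly the characteristic is expected to affect the outcome.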
Random assignment, also called randomization, entails using a random process to decide which participants are put into the control or experimental group (which participants receive an intervention and which do not). By randomly assigning participants to a group, you can reduce the effect of extraneous variables on your research because there won’t be a systematic difference between the groups.
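The following sketch shows one simple way to carry out random assignment, and what "no systematic difference" means in practice. The participants and their baseline anxiety scores are invented, and shuffling then splitting in half is only one of several acceptable randomization procedures. Any single randomization can leave the groups a little different by chance, but across many randomizations the average difference is close to zero.

```python
import random
from statistics import mean


def randomly_assign(participants, rng):
    """Shuffle a copy of the participants and split it into two halves."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (experimental, control)


rng = random.Random(7)  # fixed seed so the example is repeatable

# Hypothetical participants: (id, baseline anxiety score from 0 to 40).
participants = [(pid, rng.randint(0, 40)) for pid in range(40)]

# One randomization, as a researcher would actually do it.
experimental, control = randomly_assign(participants, rng)

# Repeat the randomization many times to see that the gap in baseline
# scores averages out to about zero, i.e. it is not systematic.
gaps = []
for _ in range(2000):
    exp, ctl = randomly_assign(participants, rng)
    exp_mean = mean(score for pid, score in exp)
    ctl_mean = mean(score for pid, score in ctl)
    gaps.append(exp_mean - ctl_mean)

average_gap = mean(gaps)
print(f"Average baseline gap over 2000 randomizations: {average_gap:.2f}")
```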
Do not confuse random assignment with random sampling. Random sampling is a method for selecting a sample from a population and is rarely used in psychological research. Random assignment is a method for assigning participants in a sample to the different conditions, and it is an important element of all experimental research in psychology and other related fields. Random sampling helps a great deal with external validity, or generalizability, whereas random assignment increases internal validity.
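The two procedures answer different questions, which a short sketch can make visible. Random sampling decides who is in the study at all, and random assignment decides which condition each person in the study receives. The population list, the sample size of 40, and the group sizes below are arbitrary choices made for illustration.

```python
import random

rng = random.Random(2024)  # fixed seed so the example is repeatable

# Hypothetical sampling frame: everyone the study could, in principle, include.
population = [f"person_{i}" for i in range(1000)]

# Random sampling: every member of the population has the same chance of
# being selected INTO the study. This supports external validity.
sample = rng.sample(population, 40)

# Random assignment: every member of the sample has the same chance of
# ending up in EITHER condition. This supports internal validity.
shuffled = sample[:]
rng.shuffle(shuffled)
experimental, control = shuffled[:20], shuffled[20:]

print(f"Sampled {len(sample)} of {len(population)} people")
print(f"Assigned {len(experimental)} to experimental, {len(control)} to control")
```

A study can use either step without the other. Many experiments randomly assign a convenience sample, which is why the two are easy to confuse.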
Other Features of Experiments that Help Establish Causality
To control for spuriousness (as well as meeting the three other criteria for establishing causality), experiments try to control as many aspects of the research process as possible: using control groups, having large enough sample sizes, standardizing the treatment, etc. Researchers in large experiments often employ clinicians or other research staff to help them. Researchers train their staff members exhaustively, provide pre-scripted responses to common questions, and control the physical environment of the experiment so each person who participates receives the exact same treatment. Experimental researchers also document their procedures, so that others can review them and make changes in future research if they think it will improve on the ability to control for spurious variables.
An interesting example is Bruce Alexander’s (2010) Rat Park experiments. Much of the early research conducted on addictive drugs, like heroin and cocaine, was conducted on animals other than humans, usually mice or rats. The scientific consensus up until Alexander’s experiments was that cocaine and heroin were so addictive that rats, if offered the drugs, would consume them repeatedly until they perished. Researchers claimed this behavior explained how addiction worked in humans, but Alexander was not so sure. He knew rats were social animals and the experimental procedure from previous experiments did not allow them to socialize. Instead, rats were kept isolated in small cages with only food, water, and metal walls. To Alexander, social isolation was a spurious variable, causing changes in addictive behavior not due to the drug itself. Alexander created an experiment of his own, in which rats were allowed to run freely in an interesting environment, socialize and mate with other rats, and of course, drink from a solution that contained an addictive drug. In this environment, rats did not become hopelessly addicted to drugs. In fact, they had little interest in the substance. To Alexander, the results of his experiment demonstrated that social isolation was more of a causal factor for addiction than the drug itself.
One challenge with Alexander’s findings is that subsequent researchers have had mixed success replicating his findings (e.g., Petrie, 1996; Solinas, Thiriet, El Rawas, Lardeux, & Jaber, 2009). Replication involves conducting another researcher’s experiment in the same manner and seeing if it produces the same results. If the causal relationship is real, it should occur in all (or at least most) rigorous replications of the experiment.
Replicability
To allow for easier replication, researchers should describe their experimental methods diligently. Researchers with the Open Science Collaboration (2015)[1] conducted the Reproducibility Project, which caused a significant controversy regarding the validity of psychological studies. The researchers with the project attempted to reproduce the results of 100 experiments published in major psychology journals since 2008. What they found was shocking. Although 97% of the original studies reported significant results, only 36% of the replicated studies had significant findings. The average effect size in the replication studies was half that of the original studies. The implications of the Reproducibility Project are potentially staggering. They encourage social scientists to carefully consider the validity of their reported findings, and they encourage the scientific community to take steps to ensure researchers do not cherry-pick data or change their hypotheses simply to get published.
Generalizability
Let’s return to Alexander’s Rat Park study and consider the implications of his experiment for substance use professionals. The conclusions he drew from his experiments on rats were meant to be generalized to the human population. If this could be done, the experiment would have a high degree of external validity, which is the degree to which conclusions generalize to larger populations and different situations. Alexander argues his conclusions about addiction and social isolation help us understand why people living in deprived, isolated environments may become addicted to drugs more often than those in more enriching environments. Similarly, earlier rat researchers argued their results showed these drugs were instantly addictive to humans, often to the point of death.
Neither study’s results will match up perfectly with real life. There are clients in social work practice who may fit into Alexander’s social isolation model, but social isolation is complex. Clients can live in environments with other sociable humans, work jobs, and have romantic relationships; does this mean they are not socially isolated? On the other hand, clients may face structural racism, poverty, trauma, and other challenges that may contribute to their social environment. Alexander’s work helps us understand clients’ experiences, but the explanation is incomplete. Human existence is more complicated than the experimental conditions in Rat Park.
Effectiveness versus Efficacy
Social workers are especially attentive to how social context shapes social life. This consideration points out a potential weakness of experiments. They can be rather artificial. When an experiment demonstrates causality under ideal, controlled circumstances, it establishes the efficacy of an intervention.
How often do real-world social interactions occur in the same way that they do in a controlled experiment? Experiments that are conducted in community settings by community practitioners are less easily controlled than those conducted in a lab or with researchers who adhere strictly to research protocols delivering the intervention. When an experiment demonstrates causality in a real-world setting that is not tightly controlled, it establishes the effectiveness of the intervention.
The distinction between efficacy and effectiveness demonstrates the tension between internal and external validity. Internal validity and external validity are conceptually linked. Internal validity refers to the degree to which the intervention causes its intended outcomes, and external validity refers to how well that relationship applies to groups and circumstances beyond those in the experiment. However, the more tightly researchers control the environment to ensure internal validity, the more they may risk external validity for generalizing their results to different populations and circumstances. Correspondingly, researchers whose settings are just like the real world will be less able to ensure internal validity, as there are many factors that could pollute the research process. This is not to suggest that experimental research findings cannot have high levels of both internal and external validity, but that experimental researchers must always be aware of this potential weakness and clearly report limitations in their research reports.
Types of Experimental Designs
Experimental design is an umbrella term for a research method that is designed to test hypotheses related to causality under controlled conditions. Table 14.1 describes the three major types of experimental design (pre-experimental, quasi-experimental, and true experimental) and presents subtypes for each. As we will see in the coming sections, some types of experimental design are better at establishing causality than others. It’s also worth considering that true experiments, which most effectively establish causality, are often difficult and expensive to implement. Although the other experimental designs aren’t perfect, they still produce useful, valid evidence and may be more feasible to carry out.
Design type/subtype | Basic characteristics | Sample research questions |
1. Pre-experimental design (Section 14.4) | 1. No comparison group | |
A. One-group pretest posttest | A. Pre- and posttests are administered, but no comparison group | XXXX |
B. One-shot case study | B. No pretest | What is the average level of loneliness among graduates of a peer support training program? What percent of graduates rate their social support as “good” or “excellent”? |
2. Quasi-experimental design (Section 14.3) | 2. Comparison group, but no random assignment | |
C. Nonequivalent comparison group design | C. Similar to classical experimental design only without random assignment | XXXX |
D. Static-group design | D. No pretest, posttest administered after the intervention | |
E. Natural experiments | E. Naturally occurring event becomes “experimental condition”; observational study in which some cases are exposed to condition (which becomes the “experimental condition”) and others are not; changes in “experimental” group can be assessed | |
3. True experimental design (Section 14.2) | 3. Random assignment to experimental and control groups; Intervention provided to the experimental group; Posttest (at a minimum) is administered | XXXX |
F. Classical experimental design | F. Pre- and posttest; control group | |
G. Posttest only control group | G. Does not use a pretest and assumes random assignment results in equivalent groups | |
H. Solomon four group design | H. Random assignment, two experimental and two control groups, pretests for half of the groups and posttests for all | |
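The three families in Table 14.1 can be told apart by two yes/no questions: is there a comparison group, and were participants randomly assigned? The small function below is a sketch of that decision rule. The function name and its error handling are our own, and it ignores the finer distinctions among subtypes, such as whether a pretest is used.

```python
def classify_design(has_comparison_group: bool, random_assignment: bool) -> str:
    """Return the broad family of experimental design from Table 14.1."""
    if random_assignment:
        if not has_comparison_group:
            # Random assignment only makes sense when there are at least
            # two groups to assign participants to.
            raise ValueError("random assignment requires a comparison group")
        return "true experimental"
    if has_comparison_group:
        return "quasi-experimental"
    return "pre-experimental"


# Example: a nonequivalent comparison group design has a comparison group
# but no random assignment.
print(classify_design(has_comparison_group=True, random_assignment=False))
```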
Key Takeaways
- Experimental designs are useful for establishing causality, but some types of experimental design do this better than others.
- Experiments help researchers isolate the effect of the independent variable on the dependent variable by controlling for the effect of extraneous variables.
- Experiments use a control/comparison group and an experimental group to test the effects of interventions. These groups should be as similar to each other as possible in terms of demographics and other relevant factors.
- True experiments have control groups with randomly assigned participants; quasi-experimental types of experiments have comparison groups to which participants are not randomly assigned; pre-experimental designs do not have a comparison group.
Exercises
TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS):
- Think about the research project you’ve been designing so far. How might you use a basic experiment to answer your question? If your question isn’t explanatory, try to formulate a new explanatory question and consider the usefulness of an experiment.
- Why is establishing a simple relationship between two variables not indicative of one causing the other?
TRACK 2 (IF YOU AREN’T CREATING A RESEARCH PROPOSAL FOR THIS CLASS):
Imagine you are interested in studying child welfare practice. You are interested in learning more about community-based programs that aim to prevent child maltreatment and out-of-home placement for children.
- Think about the research project stated above. How might you use a basic experiment to look more into this research topic? Try to formulate an explanatory question and consider the usefulness of an experiment.
- Why is establishing a simple relationship between two variables not indicative of one causing the other?
- Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. doi: 10.1126/science.aac4716
experiment: an operation or procedure carried out under controlled conditions in order to discover an unknown effect or law, to test or establish a hypothesis, or to illustrate a known law
experimental condition: the treatment, intervention, or experience that is being tested in an experiment (the independent variable) and that is received by the experimental group but not by the control group
internal validity: the ability to say that one variable "causes" something to happen to another variable; very important to assess when thinking about studies that examine causation, such as experimental or quasi-experimental designs
threats to internal validity: circumstances or events that may affect the outcome of an experiment, resulting in changes in the research participants that are not a result of the intervention, treatment, or experimental condition being tested
nomothetic causal relationships: causal explanations that can be universally applied to groups, such as scientific laws or universal truths
plausibility: as a criterion for a causal relationship, the relationship must make logical sense and seem possible
covariation: when the values of two variables change at the same time
temporality: as a criterion for a causal relationship, the cause must come before the effect
nonspuriousness: an association between two variables that is NOT caused by a third variable
extraneous variables: variables and characteristics that have an effect on your outcome, but aren't the primary variable whose influence you're interested in testing
control group: the group of participants in our study who do not receive the intervention we are researching in experiments with random assignment
comparison group: the group of participants in our study who do not receive the intervention we are researching in experiments without random assignment
experimental group: in experimental design, the group of participants in our study who do receive the intervention we are researching
generalizability: the ability to apply research findings beyond the study sample to some broader population
external validity: a synonymous term for generalizability, the ability to apply the findings of a study beyond the sample to a broader population
efficacy: the performance of an intervention under ideal and controlled circumstances, such as in a lab or delivered by trained researcher-interventionists
effectiveness: the performance of an intervention under "real-world" conditions that are not closely controlled and ideal
causality: the idea that one event, behavior, or belief will result in the occurrence of another, subsequent event, behavior, or belief