5.4 Operationalization
Learning Objectives
- Define and give an example of indicators for a variable
- Identify the three components of an operational definition
- Describe the purpose of multi-dimensional measures such as indexes, scales, and typologies and why they are used
Now that we have figured out how to define, or conceptualize, our terms we’ll need to think about operationalizing them. Operationalization is the process by which researchers conducting quantitative research spell out precisely how a concept will be measured. It involves identifying the specific research procedures we will use to gather data about our concepts. This of course requires that we know what research method(s) we will employ to learn about our concepts, and we’ll examine specific research methods later on in the text. For now, let’s take a broad look at how operationalization works. We can then revisit how this process works when we examine specific methods of data collection in later chapters. Remember, operationalization is only a process in quantitative research. Measurement in qualitative research will be discussed at the end of this section.
Operationalization works by identifying specific indicators that will be taken to represent the ideas we are interested in studying. If, for example, we are interested in studying masculinity, indicators for that concept might include some of the social roles prescribed to men in society such as breadwinning or fatherhood. Being a breadwinner or a father might therefore be considered indicators of a person’s masculinity. The extent to which a man fulfills either, or both, of these roles might be understood as clues (or indicators) about the extent to which he is viewed as masculine.
Let’s look at another example of indicators. Each day, Gallup researchers poll 1,000 randomly selected Americans to ask them about their well-being. To measure well-being, Gallup asks these people to respond to questions covering six broad areas: physical health, emotional health, work environment, life evaluation, healthy behaviors, and access to basic necessities. Gallup uses these six factors as indicators of the concept that they are really interested in, which is well-being.
Identifying indicators can be even simpler than the examples described thus far. What are the possible indicators of the concept of gender? Most of us would probably agree that “man” and “woman” are both reasonable indicators of gender, but you may want to include other options for people who identify as non-binary or other genders. Political party is another relatively easy concept for which to identify indicators. In the United States, likely indicators include Democrat and Republican and, depending on your research interest, you may include additional indicators such as Independent, Green, or Libertarian as well. Age and birthplace are additional examples of concepts for which identifying indicators is a relatively simple process. What concepts are of interest to you, and what are the possible indicators of those concepts?
We have now considered a few examples of concepts and their indicators, but it is important we don’t make the process of coming up with indicators too arbitrary or casual. One way to avoid taking an overly casual approach in identifying indicators, as described previously, is to turn to prior theoretical and empirical work in your area. Theories will point you in the direction of relevant concepts and possible indicators; empirical work will give you some very specific examples of how the important concepts in an area have been measured in the past and what sorts of indicators have been used. Often, it makes sense to use the same indicators as researchers who have come before you. On the other hand, perhaps you notice some possible weaknesses in measures that have been used in the past that your own methodological approach will enable you to overcome.
Speaking of your methodological approach, another very important thing to think about when deciding on indicators and how you will measure your key concepts is the strategy you will use for data collection. A survey implies one way of measuring concepts, while focus groups imply a quite different way of measuring concepts. Your design choices will play an important role in shaping how you measure your concepts.
Operationalizing your variables
Moving from identifying concepts to conceptualizing them and then to operationalizing them is a matter of increasing specificity. You begin the research process with a general interest, identify a few concepts that are essential for studying that interest, work to define those concepts, and then spell out precisely how you will measure them. In quantitative research, that final stage is called operationalization.
An operational definition consists of the following components: (1) the variable being measured, (2) the measure you will use, and (3) how you plan to interpret the results of that measure.
The first component, the variable, should be the easiest part. In much quantitative research, there is a research question that has at least one independent and at least one dependent variable. Remember that variables have to be able to vary. For example, the United States is not a variable. Country of birth is a variable, as is patriotism. Similarly, if your sample only includes men, gender is a constant in your study, not a variable.
Let’s pick a social work research question and walk through the process of operationalizing variables. Suppose we hypothesize that individuals on a residential psychiatric unit who are more depressed are less likely to be satisfied with care than those who are less depressed. Remember, this would be a negative relationship: as depression increases, satisfaction decreases. In this question, depression is the independent variable (the cause) and satisfaction with care is the dependent variable (the effect). We have our two variables, depression and satisfaction with care, so the first component is done. Now, we move on to the second component: the measure.
How do you measure depression or satisfaction? Many students begin by thinking that they could look at body language to see if a person were depressed. Maybe a depressed person would also verbally express feelings of sadness or hopelessness more often. A satisfied person might be happy around service providers and express gratitude more often. These behaviors may indicate depression or satisfaction, but taken together they do not add up to a coherent measure. Unfortunately, what this “measure” is actually saying is that “I know depression and satisfaction when I see them.” While you are likely a decent judge of depression and satisfaction, you need to provide more information in a research study for how you plan to measure your variables. Your judgment is subjective, based on your own idiosyncratic experiences with depression and satisfaction. Judgments like these could not be replicated by another researcher, nor could they be made consistently for a large group of people. Operationalization requires that you come up with a specific and rigorous measure for seeing who is depressed or satisfied.
Finding a good measure for your variable can take less than a minute. To measure a variable like age, you would probably put a question on a survey that asked, “How old are you?” To evaluate someone’s length of stay in a hospital, you might ask for access to their medical records and count the days from when they were admitted to when they were discharged. Measuring a variable like income might require some more thought, though. Are you interested in this person’s individual income or the income of their family unit? This might matter if your participant does not work or is dependent on other family members for income. Do you count income from social welfare programs? Are you interested in their income per month or per year? Measures must be specific and clear.
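To see what “specific and clear” looks like in practice, here is a minimal sketch of the length-of-stay measure described above. The function name and the example dates are hypothetical, and the sketch assumes that the day of discharge itself is not counted; a real study would need to state that rule explicitly.

```python
from datetime import date


def length_of_stay_days(admitted: date, discharged: date) -> int:
    """Return the number of days from admission to discharge.

    Assumption for illustration: the discharge day is not counted,
    so a same-day admission and discharge yields 0.
    """
    if discharged < admitted:
        raise ValueError("discharge date precedes admission date")
    return (discharged - admitted).days


# Hypothetical record: admitted March 1, 2024; discharged March 15, 2024.
stay = length_of_stay_days(date(2024, 3, 1), date(2024, 3, 15))
print(stay)  # 14
```

Writing the rule down this way forces decisions (such as whether the discharge day counts) that a vague plan to “look at the records” would leave open.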
Depending on your research design, your measure may be something you put on a survey or pre/post-test that you give to your participants. For a variable like age or income, one well-worded question may suffice. Unfortunately, most variables in the social world are not so simple. Depression and satisfaction are multi-dimensional variables, as they each contain multiple elements. Asking someone “Are you depressed?” does not do justice to the complexity of depression, which includes issues with mood, sleeping, eating, relationships, and happiness. Asking someone “Are you satisfied with the services you received?” similarly omits multiple dimensions of satisfaction, such as timeliness, respect, meeting needs, and likelihood of recommending to a friend, among many others.
INDICES, SCALES, AND TYPOLOGIES
To account for a variable’s dimensions, a researcher might rely on an index, scale, or typology. An index is a type of measure that contains several indicators and is used to summarize some more general concept. An index of depression might ask if the person has experienced any of the following indicators in the past month: pervasive feelings of hopelessness, thoughts of suicide, over- or under-eating, and a lack of enjoyment in normal activities. On their own, some of these indicators like over- or under-eating might not be considered depression, but collectively, the answers to each of these indicators add up to an overall experience of depression. The index allows the researcher in this case to better understand what shape a respondent’s depression experience takes. If the researcher had only asked whether a respondent had ever experienced depression, she wouldn’t know what sorts of behaviors actually made up that respondent’s experience of depression.
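As a rough illustration, the depression index described above could be scored by simply counting endorsed indicators. This is only a sketch: the indicator names come from the paragraph, but the yes/no response format and the variable names are assumptions made for the example, not a published instrument.

```python
# Indicators named in the text. The yes/no ("experienced in the past month")
# response format is an assumption made for this illustration.
DEPRESSION_INDICATORS = (
    "hopelessness",
    "suicidal_thoughts",
    "over_or_under_eating",
    "lack_of_enjoyment",
)


def index_score(responses: dict) -> int:
    """Count how many indicators the respondent endorsed (each counts equally)."""
    missing = [name for name in DEPRESSION_INDICATORS if name not in responses]
    if missing:
        raise ValueError(f"missing responses: {missing}")
    return sum(1 for name in DEPRESSION_INDICATORS if responses[name])


# Hypothetical respondent.
respondent = {
    "hopelessness": True,
    "suicidal_thoughts": False,
    "over_or_under_eating": True,
    "lack_of_enjoyment": False,
}
print(index_score(respondent))  # 2
```

Because the individual responses are kept alongside the total, the researcher can still see which behaviors make up a respondent’s score.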
Taking things one step further, if the researcher decides to rank order the various behaviors that make up depression, perhaps weighting suicidal thoughts more heavily than eating disturbances, then she will have created a scale rather than an index. Like an index, a scale is also a measure composed of multiple items or questions. But unlike indexes, scales are designed in a way that accounts for the possibility that different items may vary in intensity.
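The difference between an index and a scale can be sketched by attaching weights to the items. The weights below are invented purely for illustration; the text says only that suicidal thoughts might be weighted more heavily than eating disturbances.

```python
# Hypothetical weights, invented for this example only.
ITEM_WEIGHTS = {
    "hopelessness": 2,
    "suicidal_thoughts": 3,
    "over_or_under_eating": 1,
    "lack_of_enjoyment": 2,
}


def scale_score(responses: dict) -> int:
    """Sum the weights of the endorsed items, so more intense items count more."""
    return sum(
        weight
        for item, weight in ITEM_WEIGHTS.items()
        if responses.get(item, False)
    )


# Two hypothetical respondents who each endorse exactly two items.
respondent_a = {"hopelessness": True, "suicidal_thoughts": True}
respondent_b = {"over_or_under_eating": True, "lack_of_enjoyment": True}

print(scale_score(respondent_a))  # 5
print(scale_score(respondent_b))  # 3
```

An unweighted index would give both respondents a score of 2; the weighted version separates them because their items differ in intensity.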
If creating your own scale sounds complicated, don’t worry! For most variables, this work has already been done by other researchers. You do not need to create a scale for depression because scales such as the Patient Health Questionnaire (PHQ-9), the Center for Epidemiologic Studies Depression Scale (CES-D), and Beck’s Depression Inventory (BDI) have been developed and refined over decades to measure variables like depression. Similarly, scales such as the Patient Satisfaction Questionnaire (PSQ-18) have been developed to measure satisfaction with medical care. As we will discuss in the next section, these scales have been shown to be reliable and valid. While you could create a new scale to measure depression or satisfaction, a rigorous study would pilot test and refine that scale over time to make sure it measures the concept accurately and consistently. This high level of rigor is often unachievable in student research projects, so using existing scales is recommended.
Another reason existing scales are preferable is that they can save time and effort. The Mental Measurements Yearbook provides a searchable database of measures for different variables. You can access this database from your library’s list of databases. At the University of Texas at Arlington, the Mental Measurements Yearbook can be searched directly or viewed online. If you can’t find anything in there, your next stop should be the methods section of the articles in your literature review. The methods section of each article will detail how the researchers measured their variables. In a quantitative study, researchers likely used a scale to measure key variables and will provide a brief description of that scale. A Google Scholar search such as “depression scale” or “satisfaction scale” should also provide some relevant results. As a last resort, a general web search may bring you to a scale for your variable.
Unfortunately, none of these approaches guarantees that you will be able to actually see the scale itself or get information on how it is interpreted. Many scales cost money to use and may require training to properly administer. You may also find scales that are related to your variable but would need to be slightly modified to match your study’s needs. Adapting a scale to fit your study is a possibility; however, you should remember that changing even small parts of a scale can influence its accuracy and consistency. Pilot testing is always recommended for adapted scales.
A final way of measuring multidimensional variables is a typology. A typology is a way of categorizing concepts according to particular themes. Probably the most familiar version of a typology is the micro, meso, macro framework. Students classify specific elements of the social world by their ecological relationship with the person. Let’s take the example of depression again. The lack of sleep associated with depression would be classified as a micro-level element while a severe economic recession would be classified as a macro-level element. Typologies require clearly stated rules on what data will get assigned to what categories, so carefully following the rules of the typology is important.
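One way to make a typology’s rules explicit is to write them down as a lookup. The sketch below uses the micro and macro examples from the paragraph above; the meso example and all of the names are hypothetical additions for illustration.

```python
# Explicit assignment rules for a micro/meso/macro typology.
# "lack of sleep" and "severe economic recession" come from the text;
# the meso entry is a hypothetical example added for illustration.
TYPOLOGY_RULES = {
    "lack of sleep": "micro",
    "conflict within a support group": "meso",
    "severe economic recession": "macro",
}


def classify(element: str) -> str:
    """Assign an element to a level, refusing anything the rules do not cover."""
    try:
        return TYPOLOGY_RULES[element]
    except KeyError:
        raise ValueError(f"no rule for {element!r}; extend the typology first")


print(classify("lack of sleep"))              # micro
print(classify("severe economic recession"))  # macro
```

Raising an error for an unlisted element, rather than guessing a category, mirrors the point that every assignment should follow a stated rule.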
Once you have (1) your variable and (2) your measure, you will need to (3) describe how you plan to interpret your measure. Sometimes, interpreting a measure is incredibly easy. If you ask someone their age, you’ll probably interpret the results by noting the raw number (e.g., 22) someone provides. However, you could also re-code age into categories that do not overlap (e.g., under 20, 20-29 years old, etc.). An index may also be simple to interpret. If there is a checklist of problem behaviors, one might simply add up the number of behaviors checked off, with a higher total indicating worse behavior. Sometimes an index will assign people to categories (e.g., normal, borderline, moderate, significant, severe) based on their total number of checkmarks. As long as the rules are clearly spelled out, you are welcome to interpret measures in a way that makes sense to you. Theory might guide you to use some categories or you might be influenced by the types of statistical tests you plan to run later on in data analysis.
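Spelling out interpretation rules can be as simple as the following sketch, which re-codes a raw age into categories. The cut points and labels are hypothetical; what matters is that the categories do not overlap and that every possible age falls into exactly one of them.

```python
def recode_age(age: int) -> str:
    """Re-code a raw age in years into one of four non-overlapping categories.

    The cut points are hypothetical and chosen only for illustration.
    """
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 20:
        return "under 20"
    if age < 30:
        return "20-29"
    if age < 40:
        return "30-39"
    return "40 and over"


print(recode_age(22))  # 20-29
```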
For more complicated measures like scales, you should look at the information provided by the scale’s authors for how to interpret the scale. If you can’t find enough information from the scale’s creator, look at how the results of that scale are reported in the results section of research articles. For example, Beck’s Depression Inventory (BDI-II) uses 21 questions to measure depression. A person indicates on a scale of 0-3 how much they agree with a statement. The results for each question are added up, and the respondent is put into one of three categories: low levels of depression (1-16), moderate levels of depression (17-30), or severe levels of depression (31 and over).
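The scoring steps just described can be sketched as follows. The number of items, the 0-3 response range, and the three cut points are taken from the paragraph above; the function names are hypothetical, and placing a total of 0 in the lowest category is an assumption, since the lowest band in the text starts at 1. Before using any published scale, check the scoring guidance from the scale’s authors rather than relying on this sketch.

```python
def total_score(item_responses: list) -> int:
    """Add up 21 item responses, each scored 0-3."""
    if len(item_responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(response not in (0, 1, 2, 3) for response in item_responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(item_responses)


def interpret(total: int) -> str:
    """Map a total score to the three categories given in the text.

    Assumption: a total of 0 is grouped with the lowest category.
    """
    if not 0 <= total <= 63:
        raise ValueError("total must be between 0 and 63")
    if total <= 16:
        return "low"
    if total <= 30:
        return "moderate"
    return "severe"


# Hypothetical respondent: ten items scored 1, five scored 2, six scored 0.
answers = [1] * 10 + [2] * 5 + [0] * 6
score = total_score(answers)
print(score, interpret(score))  # 20 moderate
```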
In sum, operationalization specifies what measure you will be using to measure your variable and how you plan to interpret that measure. Operationalization is probably the trickiest component of basic research methods. Don’t get frustrated if it takes a few drafts and a lot of feedback to get to a workable definition.
Qualitative research and operationalization
As we discussed in the previous section, qualitative research takes a more open approach towards defining the concepts in your research question. The questions you choose to ask in your interview, focus group, or content analysis will determine what data you end up getting from your participants. For example, if you are researching depression qualitatively, you would not use a scale like the Beck’s Depression Inventory, which is a quantitative measure we described above. Instead, you would start off with a tentative definition of what depression means based on your literature review and use that definition to come up with questions for your participants. We will cover how those questions fit into qualitative research designs later on in the textbook. For now, remember that qualitative researchers use the questions they ask participants to measure their variables and that qualitative researchers can change their questions as they gather more information from participants. Ultimately, the concepts in a qualitative study will be defined by the researcher’s interpretation of what her participants say. Unlike in quantitative research in which definitions must be explicitly spelled out in advance, qualitative research allows the definitions of concepts to emerge during data analysis.
Spotlight on UTA School of Social Work
Are interactions with a social robot associated with changes in depression and loneliness?
Robust measurement is very important in research. Furthermore, providing a clear explanation of the measures used in a study helps others to understand the concepts being studied and interpret the findings, and it helps other researchers to accurately replicate the study in different settings.
Dr. Noelle Fields and Ling Xu from the University of Texas at Arlington’s School of Social Work collaborated with Dr. Julienne Greer from the College of Liberal Arts on a pilot study that incorporated a participatory arts intervention with the social robot, NAO. The intervention took place with older adults living in an assisted living facility. The overall aim of this study was to help older adults improve their psychological well-being through participation in a theatre arts activity led by NAO.
The key outcome variables for this pilot study were psychological well-being measured by depression, loneliness, and engagement with the robot. Depression and loneliness were measured by two standardized scales: the 15-item Geriatric Depression Scale (Sheikh & Yesavage, 1986) and the revised 3-item UCLA loneliness scale (Hughes, Waite, Hawkley, & Cacioppo, 2004). However, engagement with the robot did not have a standardized measure. Thus, the research team utilized a measure to capture engagement with the robot based on previous research.
In this study, engagement with the robot was defined as the degree of interaction or involvement with a robot. One way to measure engagement is for members of the research team (i.e., observers) to rate the level of participant engagement (see Table 1).
Table 1. Please circle 0-5 to indicate the participant’s engagement level (definitions for each level can be found in the example column).
Rating | Meaning | Example |
0 | Intense noncompliance | Participant stood and walked away from the table on which the robot interaction took place |
1 | Noncompliance | Participant hung head and refused to comply with interviewer’s request to speak to the robot |
2 | Neutral | Participant complied with instructions to speak with the robot after several prompts from the confederate |
3 | Slight interest | Participant required two or three prompts from the confederate before responding to the robot |
4 | Engagement | Participant complied immediately following the confederate’s request to speak with the robot |
5 | Intense engagement | Participant spontaneously engaged with the robot |
This measurement was easy to apply in this study; however, it may lack the sensitivity to capture more detailed information about engagement, especially among older adult populations. Therefore, the researchers in this pilot study designed additional indicators to describe the participants’ reactions when interacting with a robot. More specifically, after watching a video of each participant interacting with NAO, each researcher gave an engagement score based on the following concepts: (1) attentiveness, including focus on the face or gestures of the robot, (2) smiling and/or laughter, (3) head nodding, and (4) facial/vocal expression, including eyes widening, eyebrows arching, and tonal changes in voice. Through video analysis, each of the concepts was counted and tabulated by independent researchers, and the mean score among researchers on each concept was then calculated. Sum scores for total engagement were also used in the analysis. See Table 2 for detailed information on this measurement.
Table 2. Each researcher should provide a score on each item below based on their observation of the participant’s interaction with the robot.
Item | 1. Strongly disagree | 2. Disagree | 3. Neither agree nor disagree | 4. Agree | 5. Strongly agree |
Attentive | | | | | |
a. Focus on face of robot | | | | | |
b. Focus on gesture of robot | | | | | |
Smiling and/or laughter | | | | | |
Head nodding | | | | | |
Facial/vocal expression | | | | | |
a. Eyes widen | | | | | |
b. Eyebrow arch | | | | | |
c. Tonal changes in voice | | | | | |
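The scoring procedure described for these engagement items (a mean across researchers for each item, then a total) can be sketched as follows. The ratings, rater labels, and function names are hypothetical, and summing the item means is only one plausible reading of how the total engagement score was formed; the study itself may have computed it differently.

```python
from statistics import mean

# Scored items from Table 2 (sub-items listed individually).
ITEMS = (
    "focus_on_face",
    "focus_on_gesture",
    "smiling_or_laughter",
    "head_nodding",
    "eyes_widen",
    "eyebrow_arch",
    "tonal_changes",
)

# Hypothetical 1-5 ratings from two researchers for one participant.
ratings = {
    "rater_a": dict(zip(ITEMS, (4, 3, 5, 2, 4, 3, 4))),
    "rater_b": dict(zip(ITEMS, (4, 5, 5, 4, 2, 3, 2))),
}


def item_means(all_ratings: dict) -> dict:
    """Average each item's rating across researchers."""
    return {
        item: mean(rater[item] for rater in all_ratings.values())
        for item in ITEMS
    }


def total_engagement(all_ratings: dict) -> float:
    """Sum the per-item means (an assumed way of forming the total)."""
    return sum(item_means(all_ratings).values())


print(item_means(ratings)["focus_on_gesture"])  # 4
print(total_engagement(ratings))                # 25
```

Averaging across researchers before summing keeps any single observer’s judgment from dominating the participant’s score.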
The study found that participants reported improvements in mood, loneliness, and depression. The degree of change was slightly greater in participants without dementia, perhaps suggesting that social engagement and connection were more profound attributes in cognitively intact older adults. Further research would be needed to confirm this hypothesis. Although the study is limited by its small scale and non-intervention control group, this exploratory pilot study supports the continuing development of participatory arts interventions with older adults using a social robotic platform. The benefits of performative participatory art between social robots and older adults are an emerging research area for human-robot social interactions and communications.
Key Takeaways
- Operationalization involves spelling out precisely how a concept will be measured.
- Operational definitions must include the variable, the measure, and how you plan to interpret the measure.
- Multi-dimensional concepts can be measured by an index, a scale, or a typology.
- It’s a good idea to look at how researchers have measured the concept in previous studies.
Glossary
- Index- measure that contains several indicators and is used to summarize a more general concept
- Indicators- represent the concepts that we are interested in studying
- Operationalization- process by which researchers conducting quantitative research spell out precisely how a concept will be measured and how to interpret that measure
- Scale- composite measure designed in a way that accounts for the possibility that different items on an index may vary in intensity
- Typology- measure that categorizes concepts according to particular themes