12 Conclusions

12.1 Overview

This textbook sought to teach future public health professionals, specifically Master of Public Health (MPH) students, how to conduct basic applied data analysis using secondary data collected from national health surveys. The goal was to close gaps in knowledge, skills, and analytical abilities that may prevent MPH graduates from succeeding in entry-level public health practice and research-focused positions. A brief recap of what was covered in each chapter of this textbook is provided in the following sections. Results from each of the case studies covered in Chapters 6-10 are also provided.

12.2 Introduction and Basic Applied Data Analysis Recap

The first section included three chapters. Chapter 1 provided an overview of the textbook by outlining its purpose: to train future public health professionals in the knowledge and skills needed to conduct applied secondary data analysis using national health surveys. Chapter 2 provided a general overview of the surveys used for the case studies presented in this textbook, including the National Health Interview Survey (NHIS) in Chapter 6, Medical Expenditure Panel Survey (MEPS) in Chapter 7, Health Information National Trends Survey (HINTS) in Chapter 8, Behavioral Risk Factor Surveillance System (BRFSS) in Chapter 9, and the National Health and Nutrition Examination Survey (NHANES) in Chapter 10. Chapter 3 included a literature review of previous studies that have used national health surveys to answer public health and health services research questions.

The second section included two chapters. Chapter 4 reviewed basic statistical functions commonly used for public health research questions. While this textbook was written for learners with some background knowledge of research methods and epidemiologic study designs, this chapter included basic terminology on the types of data collected and on the descriptive (frequencies/percentages, means/standard deviations) and analytical (chi square, logistic regression) statistical procedures used for analysis of national health surveys. Chapter 5 included details on the additional survey design features that must be considered when analyzing complex surveys, including weights, primary sampling units, and stratum variables. Data from the NHIS were used for SAS programming examples in these chapters.

 

12.3 National Health Interview Survey (NHIS) Recap

Chapter 6 covered the background and details on how to obtain and analyze NHIS data. The objective of the NHIS case study was to explore whether Arab American adults were more or less likely to receive an annual flu vaccine than other racial/ethnic groups, such as other non-Hispanic Whites, using the 2018 person and sample adult files. The following specific aims were examined using chi square tests and logistic regression analyses.

  • Aim 6.1: Compare socioeconomic and health-related characteristics of Arab Americans with those of US-born and foreign-born non-Hispanic Whites from Europe and Russia (including former USSR countries)
  • Aim 6.2: Determine associations between region of birth and flu vaccine uptake among Arab Americans compared to US-born non-Hispanic Whites

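Table 12.1 provides results from the chi square tests used to meet specific aim 6.1. Unweighted frequencies and weighted percentages are presented. There were statistically significant differences in flu vaccine uptake by region of birth, age, sex, and highest level of education among non-Hispanic Whites (all p-values <.05). Compared with adults who did not have a flu vaccine in the last 12 months, adults who did were more likely to be US-born (96.9% vs. 95.9%), female (56.3% vs. 48.7%), and to have a bachelor’s degree or higher level of education (31.3% vs. 21.8%). The largest share of vaccinated adults were ages 35-54 years (34.5%), but compared with adults who were not vaccinated, they were more likely to be ages 55 years or older (40.4% vs. 19.7%).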

Table 12.1. Sociodemographic characteristics of non-Hispanic White adults in the US by flu vaccine uptake in the last 12 months, NHIS 2018

| Characteristic | No N (%) | Yes N (%) | p-value |
| --- | --- | --- | --- |
| Region of birth among non-Hispanic Whites | | | .0476 |
| United States | 18,303 (95.9) | 17,436 (96.9) | |
| Europe/Russia | 513 (3.0) | 406 (2.4) | |
| Arab/Middle East | 182 (1.1) | 90 (0.7) | |
| Age | | | <.0001 |
| 18-34 years | 10,109 (39.4) | 5,004 (25.1) | |
| 35-54 years | 13,270 (40.9) | 8,508 (34.5) | |
| 55-64 years | 4,316 (11.7) | 4,318 (16.7) | |
| 65+ years | 3,844 (8.0) | 8,292 (23.7) | |
| Sex | | | <.0001 |
| Male | 15,127 (51.3) | 10,774 (43.7) | |
| Female | 16,412 (48.7) | 15,348 (56.3) | |
| Highest level of education | | | <.0001 |
| <High school | 9,526 (32.9) | 6,209 (26.5) | |
| High school diploma or GED | 5,734 (20.0) | 4,378 (17.8) | |
| Some college or Associate degree | 7,387 (25.3) | 6,011 (24.4) | |
| Bachelor’s degree or higher | 6,829 (21.8) | 8,020 (31.3) | |

Table 12.2 provides results from logistic regression analyses used for meeting specific aim 6.2. Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group is US-born non-Hispanic Whites. In the unadjusted model (Model 1), foreign-born Arab Americans had lower odds of receiving a flu vaccine in the past 12 months than US-born non-Hispanic Whites (OR=0.65). However, because the confidence interval crosses the line of no effect at 1.00, the comparison is not statistically significant (95% CI=0.39, 1.10). This result differs from that for foreign-born non-Hispanic Whites from Europe/Russia, whose odds of reporting a flu vaccine were statistically significantly lower than those of US-born non-Hispanic Whites (OR=0.78; 95% CI=0.62, 0.97). In the adjusted model (Model 2), the result for Arab Americans became statistically significant. After adjusting for age, sex, and education, foreign-born Arab Americans had 45% lower odds (OR=0.55; 95% CI=0.32, 0.94) of reporting a flu vaccine in the past 12 months compared to US-born non-Hispanic Whites. This odds ratio was lower than that for foreign-born non-Hispanic Whites from Europe/Russia (OR=0.69; 95% CI=0.55, 0.86), although both groups had statistically significantly lower odds than US-born non-Hispanic Whites. This result highlights the need to separate Arab American individuals from other non-Hispanic Whites so that their health outcomes are not masked under the White racial group.

Table 12.2. Crude and multivariable logistic regression results, NHIS 2018

| Region of birth among non-Hispanic Whites | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for age, sex, and education OR (95% CI) |
| --- | --- | --- |
| United States | 1.00 | 1.00 |
| Europe/Russia | 0.78 (0.62, 0.97) | 0.69 (0.55, 0.86) |
| Arab/Middle East | 0.65 (0.39, 1.10) | 0.55 (0.32, 0.94) |
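
The rule used to read Table 12.2 (a result is statistically significant only when the 95% CI excludes 1.00) can be applied mechanically. The following is a minimal Python sketch for illustration only; it is not part of the textbook's SAS workflow, and the helper name `describe_or` is hypothetical. It takes the published odds ratios and confidence limits as inputs and does not re-estimate anything from the survey data.

```python
def describe_or(odds_ratio, lower, upper):
    """Summarize an odds ratio and its 95% CI (hypothetical helper)."""
    # The CI excludes 1.00 only if both limits fall on the same side of it.
    significant = lower > 1.0 or upper < 1.0
    # Percent change in odds relative to the reference group
    # (a negative value means lower odds than the reference group).
    pct_change = round((odds_ratio - 1.0) * 100.0, 1)
    return {"significant": significant, "pct_change_in_odds": pct_change}


# (OR, lower limit, upper limit) copied from Table 12.2
table_12_2 = {
    ("Europe/Russia", "unadjusted"): (0.78, 0.62, 0.97),
    ("Europe/Russia", "adjusted"): (0.69, 0.55, 0.86),
    ("Arab/Middle East", "unadjusted"): (0.65, 0.39, 1.10),
    ("Arab/Middle East", "adjusted"): (0.55, 0.32, 0.94),
}

summaries = {key: describe_or(*values) for key, values in table_12_2.items()}

for (group, model), summary in summaries.items():
    print(group, model, summary)
```

Under this reading, only the unadjusted Arab/Middle East estimate is flagged as not significant, which matches the interpretation given for Table 12.2.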

12.4 Medical Expenditure Panel Survey (MEPS) Recap

Chapter 7 covered the background and details on how to obtain and analyze MEPS data. The objective of the MEPS case study was to explore whether adults who perceived their physician provided quality patient-provider communication (PPC) were more or less likely to receive an annual flu vaccine in comparison to those who did not receive quality PPC using household level in-person and self-administered questionnaire data. The following specific aims were examined using chi square tests and logistic regression analyses.

  • Aim 7.1: Compare sociodemographic and health-related characteristics of adults by influenza vaccine uptake
  • Aim 7.2: Determine association between adults’ perceptions of PPC qualities and their likelihood of receiving an influenza vaccine before and after controlling for covariates

The two PPC qualities examined in this case study were whether instructions given to patients were easy for them to understand and whether the health care provider asked the patient to “teach-back,” or describe how they will follow the instructions given to them. Table 12.3 provides results from the chi square tests used to meet specific aim 7.1. Unweighted frequencies and weighted percentages are presented. There were no statistically significant differences in flu vaccine uptake for either PPC quality evaluated. However, there were statistically significant differences in flu vaccine uptake by age and race/ethnicity (both p-values <.0001). Adults who did not receive a flu vaccine in the last 12 months were more likely to be younger (ages 18-44 years). Hispanic and non-Hispanic Black adults made up a larger share of adults who did not receive a flu vaccine (18.6% and 13.2%) than of adults who did (12.5% and 9.7%), whereas non-Hispanic White adults and non-Hispanic adults of other or multiple races, inclusive of non-Hispanic Asians, made up a larger share of adults who were vaccinated.

Table 12.3. Patient-provider communication qualities and sociodemographic characteristics by flu vaccine uptake in the last 12 months in the US, MEPS 2015-2016

| Characteristic | Flu vaccine in last 12 months: No N (%) | Flu vaccine in last 12 months: Yes N (%) | p-value |
| --- | --- | --- | --- |
| Instructions provided easy to understand | | | .8679 |
| Not Always | 2,049 (31.1) | 2,462 (30.9) | |
| Always | 4,226 (68.9) | 5,088 (69.1) | |
| Asked to describe how you will follow instructions | | | .4991 |
| Not Always | 4,105 (69.0) | 5,003 (69.7) | |
| Always | 2,156 (31.0) | 2,529 (30.3) | |
| Age | | | <.0001 |
| 18-44 years | 10,107 (55.6) | 4,740 (33.2) | |
| 45-64 years | 5,937 (33.9) | 4,765 (34.4) | |
| 65+ years | 1,731 (10.5) | 4,231 (32.4) | |
| Race/Ethnicity | | | <.0001 |
| Hispanic | 5,774 (18.6) | 3,238 (12.5) | |
| Non-Hispanic White | 6,738 (59.6) | 6,611 (68.7) | |
| Non-Hispanic Black | 3,569 (13.2) | 2,338 (9.7) | |
| Non-Hispanic Other (including Asian/Multiple) | 1,694 (8.6) | 1,549 (9.1) | |

Table 12.4 provides results from the logistic regression analyses used to meet specific aim 7.2. Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group is those who did “not always” perceive that their health care provider exhibited each PPC quality. In the unadjusted models, there were no statistically significant differences in flu vaccine uptake between adults who perceived that their health care provider always provided instructions that were easy to understand, or always asked them to describe how they will follow instructions, and those who did not. Results remained not statistically significant after adjusting for age and race/ethnicity for both PPC qualities. None of the logistic regression results were statistically significant because every 95% confidence interval crosses the line of no effect at 1.00.

Table 12.4. Crude and multivariable logistic regression results, MEPS 2015-2016

| PPC quality | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for age and race/ethnicity OR (95% CI) |
| --- | --- | --- |
| Instructions provided were easy to understand | | |
| Not Always | 1.00 | 1.00 |
| Always | 1.01 (0.92, 1.10) | 1.06 (0.96, 1.16) |
| Asked to describe how you will follow instructions (teach-back) | | |
| Not Always | 1.00 | 1.00 |
| Always | 0.97 (0.88, 1.07) | 1.03 (0.93, 1.14) |

12.5 Health Information National Trends Survey (HINTS) Recap

Chapter 8 covered the background and details on how to obtain and analyze HINTS data. The objective of the HINTS case study was to explore associations between e-mail PPC and colon cancer screening uptake using HINTS 5 Cycle 3 data. The following specific aims were examined using chi square tests and logistic regression analyses.

  • Aim 8.1: Compare sociodemographic and health-related characteristics of adults who use e-mail to communicate with their health care provider
  • Aim 8.2: Determine associations between e-mail PPC and adults’ likelihood of receiving a colon cancer screening before and after controlling for covariates

Table 12.5 provides results from the chi square tests used to meet specific aim 8.1. Unweighted frequencies and weighted percentages are presented. There were no statistically significant differences in colon cancer screening uptake among adults who did and did not use e-mail to communicate with their health care provider. Furthermore, there were no statistically significant differences by gender. Older adults (ages 60-69 years and ages 70+ years) were more likely to receive a colon cancer screening than adults ages 50-59 years (p<.0001).

Table 12.5. E-mail PPC and sociodemographic characteristics by colon cancer screening uptake in the last 12 months in the US, HINTS 5 Cycle 3

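| Characteristic | Colon cancer screening in last 12 months: No N (%) | Colon cancer screening in last 12 months: Yes N (%) | p-value |
| --- | --- | --- | --- |
| Communicate with health care provider by e-mail | | | .2261 |
| No | 1,097 (55.2) | 1,853 (57.8) | |
| Yes | 886 (44.8) | 1,342 (42.2) | |
| Gender | | | .3134 |
| Male | 767 (50.3) | 1,316 (48.1) | |
| Female | 1,097 (49.7) | 1,676 (51.9) | |
| Age | | | <.0001 |
| 50-59 years | 305 (69.1) | 717 (38.0) | |
| 60-69 years | 163 (18.9) | 1,077 (32.0) | |
| 70+ years | 131 (12.0) | 1,166 (30.0) | |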

Table 12.6 provides results from the logistic regression analyses used to meet specific aim 8.2. Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group includes those who did not communicate with their health care provider by e-mail. In the unadjusted model, there was no statistically significant difference in colon cancer screening between adults who did and did not use e-mail to communicate with their health care provider (OR=0.90), because the 95% confidence interval crosses the line of no effect at 1.00 (95% CI=0.76, 1.07). However, after adjusting for gender and age, adults who used e-mail to communicate with their health care provider had 2.15 times greater odds (95% CI=1.45, 3.19) of receiving a colon cancer screening.

Table 12.6. Crude and multivariable logistic regression results, HINTS 5 Cycle 3

| Communicate with health care provider by e-mail | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for gender and age OR (95% CI) |
| --- | --- | --- |
| No | 1.00 | 1.00 |
| Yes | 0.90 (0.76, 1.07) | 2.15 (1.45, 3.19) |
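
The unadjusted odds ratio in Table 12.6 can be approximately recovered from the weighted column percentages in Table 12.5, which is a useful consistency check when reading a crude model next to its descriptive table. The sketch below is written in Python for illustration rather than in the SAS used elsewhere in this textbook, and the function name `odds_ratio_from_column_pct` is hypothetical. It relies on the symmetry of the odds ratio: the odds of e-mail use among screened versus unscreened adults equals the odds of screening among e-mail users versus non-users. Because the published percentages are rounded, the result can differ from the model estimate in the second decimal place.

```python
def odds_ratio_from_column_pct(exposed_pct_outcome_yes, exposed_pct_outcome_no):
    """Crude OR from the weighted percent exposed in each outcome column.

    Both arguments are percentages (0-100) for a two-category exposure.
    """
    # Odds of exposure among those with the outcome
    odds_yes = exposed_pct_outcome_yes / (100.0 - exposed_pct_outcome_yes)
    # Odds of exposure among those without the outcome
    odds_no = exposed_pct_outcome_no / (100.0 - exposed_pct_outcome_no)
    return odds_yes / odds_no


# Table 12.5: 42.2% of screened adults and 44.8% of unscreened adults
# communicated with their health care provider by e-mail.
hints_email_or = odds_ratio_from_column_pct(42.2, 44.8)
print(round(hints_email_or, 2))
```

The rounded value agrees with the unadjusted estimate of 0.90 reported in Table 12.6. The adjusted estimate cannot be reproduced this way because it depends on the joint distribution of e-mail use, gender, and age.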

12.6 Behavioral Risk Factor Surveillance System (BRFSS) Recap

Chapter 9 covered the background and details on how to obtain and analyze BRFSS data. The objective of the BRFSS case study was to explore whether differences in Alzheimer’s disease and related dementia (ADRD) caregiving experiences among urban (metro) and rural (non-metro) adults in Texas were moderated by race and ethnicity. Differences between metro and non-metro adults were first reported for all caregivers collectively and then stratified by racial and ethnic group. Data from the 2019 BRFSS were used to fulfil the aims. The following specific aims were examined using chi square tests and logistic regression analyses.

  • Aim 9.1: Determine whether ADRD caregiving experiences differ across metro and non-metro geographic contexts among adults in Texas
  • Aim 9.2: Determine whether the relationship between geographic context and ADRD caregiving experiences is moderated by the caregiver’s race/ethnicity among metro and non-metro adults in Texas

Table 12.7 provides results from the chi square tests used to meet specific aim 9.1. Unweighted frequencies and weighted percentages are presented. There were no statistically significant differences in race/ethnicity, caregiver sex, caregiver relationship to care recipient, or caregiver employment status between metro and non-metro ADRD caregivers (all p-values >.05).

Table 12.7. Selected characteristics of ADRD caregivers by geographic context, BRFSS 2019

| Characteristic | Care Recipient Geographic Context: Metro N (weighted %) | Care Recipient Geographic Context: Non-Metro N (weighted %) | p-value |
| --- | --- | --- | --- |
| Race/Ethnicity | | | 0.2318 |
| Non-Hispanic White (Majority group) | 142 (49.2) | 60 (69.42) | |
| Other Races (All minority groups) | 79 (50.8) | 14 (30.58) | |
| Caregiver Sex | | | 0.2986 |
| Male | 65 (35.9) | 22 (48.88) | |
| Female | 156 (64.1) | 52 (51.12) | |
| Caregiver Relationship to Care Recipient | | | 0.1021 |
| Mother, Father, In-laws | 80 (39.66) | 30 (50.84) | |
| Child | 24 (11.37) | 5 (1.11) | |
| Husband, wife, live in partner | 33 (9.79) | 16 (22.16) | |
| Other relative | 40 (24.21) | 11 (13.50) | |
| Non-relative/family friend | 42 (14.96) | 11 (12.39) | |
| Caregiver Employment Status | | | 0.1039 |
| Employed | 98 (60.65) | 27 (39.80) | |
| Retired | 71 (18.84) | 33 (41.22) | |

Table 12.8 provides results from the logistic regression analyses used to meet specific aim 9.2 among non-Hispanic Whites (the majority group). Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group includes those whose care recipient lives in a metro (urban) area. There were no statistically significant differences in household or personal caregiving experiences among non-Hispanic White ADRD caregivers from metro and non-metro geographic contexts. None of the logistic regression results were statistically significant because every 95% confidence interval crosses the line of no effect at 1.00.

Table 12.8. Crude and adjusted logistic regression results for non-Hispanic White ADRD caregivers, BRFSS 2019 Texas

| Geographic Context | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for sex, work, relationship OR (95% CI) |
| --- | --- | --- |
| Household Caregiving Experiences | | |
| Metro | 1.00 | 1.00 |
| Non-Metro | 0.92 (0.26, 3.17) | 0.72 (0.22, 2.34) |
| Personal Caregiving Experiences | | |
| Metro | 1.00 | 1.00 |
| Non-Metro | 2.26 (0.76, 6.71) | 1.91 (0.59, 6.17) |

Table 12.9 provides results from the logistic regression analyses used to meet specific aim 9.2 among caregivers from minority groups, including non-Hispanic Blacks, Hispanics, non-Hispanic Asians, and all others. Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group includes those whose care recipient lives in a metro (urban) area. There were no statistically significant differences in household or personal caregiving experiences among minority ADRD caregivers from metro and non-metro geographic contexts. None of the logistic regression results were statistically significant because every 95% confidence interval crosses the line of no effect at 1.00.

Table 12.9. Crude and adjusted logistic regression results for minority ADRD caregivers, BRFSS 2019 Texas

| Geographic Context | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for sex, work, relationship OR (95% CI) |
| --- | --- | --- |
| Household Caregiving Experiences | | |
| Metro | 1.00 | 1.00 |
| Non-Metro | 5.82 (0.73, 46.59) | 3.05 (0.09, 103.33) |
| Personal Caregiving Experiences | | |
| Metro | 1.00 | 1.00 |
| Non-Metro | 9.80 (0.98, 97.73) | 3.73 (0.32, 43.15) |
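
The confidence intervals in Tables 12.8 and 12.9 are very wide, which reflects how imprecise the estimates are. One way to quantify this is to back out an approximate standard error of the log odds ratio from the published limits. The Python sketch below is illustrative only and rests on a stated assumption: that each interval is a symmetric Wald-type interval on the log scale with a critical value of 1.96. Survey procedures may instead use a t-distribution with design-based degrees of freedom, so the values are approximations, and the function name `log_or_se_from_ci` is hypothetical.

```python
import math


def log_or_se_from_ci(lower, upper, z=1.96):
    """Approximate SE of log(OR) from a 95% CI.

    Assumes a symmetric Wald interval on the log scale; z=1.96 is an
    assumption, not a value taken from the textbook's SAS output.
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Adjusted household caregiving estimates for non-metro caregivers
se_white = log_or_se_from_ci(0.22, 2.34)       # Table 12.8
se_minority = log_or_se_from_ci(0.09, 103.33)  # Table 12.9

print(round(se_white, 2), round(se_minority, 2))
```

The approximate standard error is roughly three times larger for minority caregivers than for non-Hispanic White caregivers, which is consistent with the small number of non-metro minority caregivers (N=14) shown in Table 12.7.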

12.7 National Health and Nutrition Examination Survey (NHANES) Recap

Chapter 10 covered the background and details on how to obtain and analyze NHANES data. The objective of the NHANES case study was to determine racial and ethnic differences in sedentary behavior guideline adherence among US- and foreign-born Hispanics, non-Hispanic Whites, non-Hispanic Blacks, and non-Hispanic Asians. Data from the NHANES 2017-March 2020 pre-pandemic data files were used to fulfil the aims. The following specific aims were examined using chi square tests and logistic regression analyses.

  • Aim 10.1: Compare the prevalence of adherence to 24-hour sedentary behavior guidelines in US adults by race, ethnicity, and nativity status
  • Aim 10.2: Determine associations between race, ethnicity, and nativity and sedentary guideline adherence among racially and ethnically diverse foreign-born adults compared to their US-born counterparts

Table 12.10 provides results from the chi square tests used to meet specific aim 10.1. Unweighted frequencies and weighted percentages are presented. There were statistically significant differences in sedentary guideline adherence by nativity status among non-Hispanic Black and Hispanic adults. Among non-Hispanic Black adults, foreign-born adults made up a larger share of the adherent group than of the non-adherent group (11.3% vs. 8.5%; p=.0127). The same direction, with a larger gap, was found among Hispanic adults: foreign-born adults made up 61.3% of the adherent group compared to 37.2% of the non-adherent group (p=.0003). There were no statistically significant differences among non-Hispanic White or non-Hispanic Asian adults by nativity status (both p-values >.05). There were no statistically significant differences by age (p=.0727) or gender (p=.4735). However, there was a statistically significant difference in sedentary guideline adherence by BMI (p<.0001).

Table 12.10. Selected characteristics of adults by 24-hour movement sedentary guideline adherence, NHANES 2017-March 2020 pre-pandemic data

| Characteristic | Sedentary Behavior Guideline: Not adherent N (Weighted %) | Sedentary Behavior Guideline: Adherent N (Weighted %) | p-value |
| --- | --- | --- | --- |
| Non-Hispanic White | | | .2880 |
| US-born White | 1,078 (95.7) | 2,125 (94.4) | |
| Foreign-born White | 44 (4.3) | 96 (5.6) | |
| Non-Hispanic Black | | | .0127 |
| US-born Black | 665 (91.5) | 1,639 (88.7) | |
| Foreign-born Black | 56 (8.5) | 169 (11.3) | |
| Hispanic | | | .0003 |
| US-born Hispanic | 205 (62.8) | 601 (38.7) | |
| Foreign-born Hispanic | 147 (37.2) | 1,149 (61.3) | |
| Non-Hispanic Asian | | | .7071 |
| US-born Asian | 56 (15.4) | 94 (14.4) | |
| Foreign-born Asian | 315 (84.6) | 698 (85.6) | |
| Age | | | .0727 |
| 18-64 years | 2,051 (81.3) | 5,177 (78.9) | |
| 65+ years | 666 (18.7) | 1,716 (21.1) | |
| Gender | | | .4735 |
| Male | 1,312 (47.6) | 3,374 (48.7) | |
| Female | 1,405 (52.4) | 3,519 (51.3) | |
| BMI | | | <.0001 |
| <25.00 BMI | 524 (22.0) | 1,558 (26.4) | |
| 25.00-29.99 BMI | 701 (28.2) | 2,045 (34.0) | |
| 30.00+ BMI | 1,164 (49.8) | 2,487 (39.6) | |

Table 12.11 provides results from the logistic regression analyses used to meet specific aim 10.2. Odds ratios (OR) and the corresponding 95% confidence intervals (CI) are presented. The reference group for each comparison is the US-born adults of the same race/ethnicity. There were no statistically significant differences in sedentary guideline adherence by nativity status among non-Hispanic White or non-Hispanic Asian adults. Foreign-born non-Hispanic Black adults had 1.37 times higher odds (95% CI=1.09, 1.71) of meeting sedentary behavior guidelines compared to US-born non-Hispanic Black adults in the unadjusted model. Results remained statistically significant after adjusting for age, gender, and BMI (OR=1.32; 95% CI=1.03, 1.69). Similar results were found for Hispanic adults. Foreign-born Hispanic adults had 2.67 times higher odds (95% CI=1.93, 3.70) of meeting sedentary behavior guidelines compared to US-born Hispanic adults in the unadjusted model. The odds ratio increased to 2.85 (95% CI=2.02, 4.02) in the adjusted model.

Table 12.11. Unadjusted and adjusted logistic regression results, NHANES 2017-March 2020 pre-pandemic data

| Race, Ethnicity, Nativity Status | Model 1: Unadjusted OR (95% CI) | Model 2: Adjusted for age, gender, BMI OR (95% CI) |
| --- | --- | --- |
| Non-Hispanic White | | |
| US-born White | 1.00 | 1.00 |
| Foreign-born White | 1.31 (0.78, 2.22) | 1.35 (0.82, 2.21) |
| Non-Hispanic Black | | |
| US-born Black | 1.00 | 1.00 |
| Foreign-born Black | 1.37 (1.09, 1.71) | 1.32 (1.03, 1.69) |
| Hispanic | | |
| US-born Hispanic | 1.00 | 1.00 |
| Foreign-born Hispanic | 2.67 (1.93, 3.70) | 2.85 (2.02, 4.02) |
| Non-Hispanic Asian | | |
| US-born Asian | 1.00 | 1.00 |
| Foreign-born Asian | 1.08 (0.70, 1.67) | 0.96 (0.59, 1.54) |

12.8 Dissemination Recap

Chapter 11 covered the dissemination of research studies using secondary data from national health surveys. It included details on how to disseminate results through abstracts, presentations, and original research manuscripts. Examples of poster presentations were provided, as well as a thorough overview of writing each section of a scientific manuscript (Abstract, Introduction, Methods, Results, Discussion).

12.9 Summary

The examples used in this textbook stem from previous studies and the current research laboratory focus of its primary author, Tiffany Kindratt, PhD, MPH. The wide range of research topics covered may be of interest to undergraduate, graduate, and doctoral-level students interested in national health surveys. Because all of the examples use SAS statistical software, future versions of this textbook and companion files will take into account other statistical software programs.

License

Big Data for Epidemiology: Applied Data Analysis Using National Health Surveys Copyright © 2022 by Tiffany B. Kindratt is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.