8 Health Information National Trends Survey

8.1 Introduction

Chapter 8 covers the Health Information National Trends Survey (HINTS). The National Cancer Institute (NCI) has collected the HINTS since 2003 to monitor national trends in health communication, information technology use, and knowledge, attitudes, and practices related to cancer prevention and care.1 The first iteration (HINTS 1) was collected in 2003. HINTS 2 and HINTS 3 were collected in 2005 and 2007-2008, respectively. Beginning with HINTS 4, data collection was split into cycles. HINTS 4 has four cycles, collected in 2011 (Cycle 1), 2012 (Cycle 2), 2013 (Cycle 3), and 2014 (Cycle 4). In 2015, the U.S. Food and Drug Administration (FDA) partnered with the NCI to evaluate tobacco use and related communications, as well as public knowledge, beliefs, and behaviors regarding dietary supplements and medical products.1

This chapter includes details on: how data are collected; how data are made publicly available as machine-actionable data files; what variables must be included to address design features of the complex sample; the strengths and limitations of the survey; practical tips for conducting statistical analysis; and how to answer research questions using a case study. The practical tips provided for analysis of HINTS data are based on the author’s previous experience analyzing HINTS 4 data to answer questions about associations between predisposing and enabling factors that contribute to morbidity, mortality, and health services use. The HINTS case study will explore how e-mail communication between patients and health care providers between visits influences colon cancer screening uptake. The bulk of the chapter consists of section 8.6: Case Study, which gives the reader hands-on practice downloading and cleaning large databases and conducting basic categorical data analysis using PROC SURVEYFREQ and PROC SURVEYLOGISTIC. The syntax provided was created for use with SAS 9.4.

8.2 Data Collection

Data collection methods for HINTS have evolved since 2003 to increase participation using information technology. Surveys can be completed in English or Spanish. The sampling frame includes two strata to ensure inclusion of minority and non-minority populations. The most recent iteration as of this writing (HINTS 5 Cycle 3) includes self-administered paper and web-based options. Prior to HINTS 5, self-administered questionnaires were collected by mail.1

HINTS 5 Cycle 3 included a “web pilot” with two experimental conditions (“Web Option” or “Web Bonus”). Participants randomly selected for the Web Option group chose whether they wanted to complete the survey on paper or online. Participants randomly selected for the Web Bonus group were given the same choices but were offered a $10 incentive to complete the survey online. Participants recruited to complete the self-administered questionnaire by mail received the initial survey, a reminder postcard, and up to two follow-up mailings with additional copies of the survey. Participants recruited to complete the web-based options received instructions by mail with a website link and a PIN/access code to complete the survey.

The adult in the household with the “next birthday” was asked to complete the questionnaire. For example, if a household with two adults (one with a birthday in May, the other with a birthday in October) received the survey in January, the adult whose birthday was in May would be asked to fill it out. A $2 incentive was included with this questionnaire to promote participation. HINTS 5 Cycle 4 data were not available as of this writing. Further details of the HINTS sample design and data collection methods are reported on the HINTS website and in published research.1–4

8.3 Data File

The HINTS comprises one data file per cycle, which can be combined with other iterations and cycles to increase sample size. Each HINTS administration includes questions that measure the following core constructs: sociodemographics; technology use and access; health care use and access; health information-seeking; cancer prevention and screening knowledge and behavior; cancer risk perceptions; and cancer-related behavior.2 The HINTS 5 Cycle 3 questionnaire includes questions on the following 14 topics, labeled Sections A through O (there is no Section I):5

  • A) Looking for health information
  • B) Using the internet to find information
  • C) Your health care
  • D) Medical records
  • E) Caregiving
  • F) Your overall health
  • G) Health and nutrition
  • H) Physical activity and exercise
  • J) Sun & UV exposure
  • K) Tobacco products
  • L) Cancer screening and awareness
  • M) Your cancer history
  • N) Beliefs about cancer
  • O) You and your household

8.4 Strengths and Limitations

A strength of the HINTS is its measurement of the rapidly changing health information and communication landscape using nationally representative samples. The HINTS is unique in its ability to provide data on cancer patients and survivors. The public-use data tools are readily available on the website, and several supporting documents are provided for data users.6 HINTS abides by the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles for data management.2 Several survey questions are repeated across survey iterations and cycles, which allows analysts to examine trends over time or to pool cycles to increase the sample size.

Despite these strengths, the HINTS is a cross-sectional survey, so its results cannot be used to determine causality. Questions measuring health behaviors during the same time period (e.g. last 12 months) may suffer from temporality bias because it is not possible to determine whether an exposure (e.g. communication with a health care provider by e-mail or patient portal) occurred before the outcome (e.g. cancer screenings). Due to the small sample size, statewide estimates cannot be determined. However, the HINTS allows for regional estimates by Census region and division. Similar to other national health surveys, response rates have decreased over time. HINTS results may also be biased by large amounts of missing data. To help address this bias, the HINTS recommends jackknife replicate weighting rather than Taylor series linearization methods.7

8.5 Design Features

Data analysts must use special procedures to account for the complex sample design used by the HINTS. Although analysts can use complex survey methods similar to those used for other national health surveys (e.g. cluster and stratification variables), the recommended approach for analyzing HINTS data is to use jackknife replicate weights to ensure that variance estimates are computed correctly.8 Each HINTS cycle includes a set of 50 replicate weights. HINTS 5 Cycle 3 provides a final sample weight and a set of replicate weights for each of five analytic samples. In each case, the final sample weight is used to calculate population estimates and the replicate weights are used to calculate accurate standard errors of the estimates. Sets of 50 replicate weights are provided for the combined sample, the paper-only sample, the web-option sample, and the web-bonus sample. A set of 150 replicate weights is provided for the combined sample when controlling for group differences. Although not the recommended approach for population estimates, stratum and cluster variables are available for analysis using Taylor series linearization methods.8 An overview of the weight, cluster, and stratum variables is provided in Table 8.1.

Table 8.1. Overview of important analytic variables for weighting and complex sample characteristics, HINTS 5 Cycle 3

Jackknife Replication Methods

| Sample | Final Sample Weight | Replicate Weights | Degrees of Freedom |
|---|---|---|---|
| Combined Sample | TG_all_FINWT0 | TG_all_FINWT1 through TG_all_FINWT50 | 49 |
| Paper-Only | TG1_all_FINWT0 | TG1_all_FINWT1 through TG1_all_FINWT50 | 49 |
| Web-Option | TG2_all_FINWT0 | TG2_all_FINWT1 through TG2_all_FINWT50 | 49 |
| Web-Bonus | TG3_all_FINWT0 | TG3_all_FINWT1 through TG3_all_FINWT50 | 49 |
| Combined Sample (Control for Group Differences) | NWGT0 | NWGT1 through NWGT150 | 147 |

Taylor Linearization Methods: VAR_STRATUM and VAR_CLUSTER
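
To make the role of the replicate weights concrete, the delete-one jackknife variance estimate can be written out as below. This is a sketch based on the standard jackknife formula rather than a quotation from the HINTS documentation. The symbols are introduced here for illustration only: \hat{\theta} is an estimate computed with the final sample weight, \hat{\theta}_{(r)} is the same estimate recomputed with the r-th replicate weight, and R is the number of replicate weights in the set.

```latex
\widehat{\mathrm{Var}}\left(\hat{\theta}\right)
  = \frac{R-1}{R} \sum_{r=1}^{R} \left( \hat{\theta}_{(r)} - \hat{\theta} \right)^{2},
\qquad
R = 50 \;\Rightarrow\; \frac{R-1}{R} = \frac{49}{50} = 0.98
```

With R = 50, the multiplier is 0.98 and the degrees of freedom are R - 1 = 49, which matches the value listed for the 50-weight sets in Table 8.1.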

8.6 HINTS Case Study: The Influence of E-mail PPC on Colon Cancer Screening

Communicating with health care providers by e-mail, patient portals, mobile applications, and text messaging has increased substantially over the past several years. E-mail patient-provider communication (PPC) describes communication between health care providers and patients between visits using e-mail or direct messaging through patient portals.9 E-mail PPC use differs by age, sex, education, race/ethnicity, and history of chronic diseases such as diabetes and hypertension.9 Few studies have evaluated how e-mail PPC may improve individuals’ likelihood of receiving preventive services. Previous research by Kindratt and colleagues found mixed results when evaluating the influence of e-mail PPC on preventive service utilization. Using National Health Interview Survey data, they found that adults who used e-mail to communicate with their provider had greater odds of reporting that they received an influenza vaccine, mammogram, Pap test, and colon cancer screening in the past 12 months.10,11 However, using HINTS 4 Cycles 1 through 4 data, Kindratt and colleagues did not find any associations between e-mail PPC and cancer screening uptake.12 In this case study, we will determine the association between e-mail PPC and adults’ likelihood of receiving a colon cancer screening using HINTS 5 Cycle 3 data.

8.6.1 Specific Aims

  • Aim 8.1: Compare sociodemographic and health-related characteristics of adults who do and do not use e-mail to communicate with their health care provider.
  • Aim 8.2: Determine associations between e-mail PPC and adults’ likelihood of receiving a colon cancer screening before and after controlling for covariates.

8.6.2 Methods

Complete the following steps to download, clean, recode, and analyze HINTS 5 Cycle 3 data to answer the specific aims.

Step 1: Download HINTS 5 Cycle 3 Public Use Data Files 

  • Go to the HINTS website to access public use data for download
  • Review “HINTS Data Terms of Use”
  • Check the box at the bottom to indicate you will comply with terms of use
  • Enter your e-mail address and click “Accept”
  • You will be taken to an updated page with “Public Use Datasets”
  • Under the heading “HINTS 5, Cycle 3 (2019) dataset, updated March 2020,” click on the “SAS data and supporting documents” link to download a zip file
  • Unzip the file and save the documents to your computer. I recommend creating a folder on the “C Drive” labeled HINTS, with a separate subfolder for each iteration (if using more than one iteration or cycle). This will be consistent with the location statements used in the textbook examples (HINTS 5.3 representing HINTS 5 Cycle 3).

The downloaded zip file should contain the following files:

  1. HINTS 5 Cycle 3 Public Codebook
  2. HINTS 5 Cycle 3 Public Format Assignments
  3. HINTS 5 Cycle 3 Public Formats
  4. HINTS 5 Cycle 3 Public History Document
  5. HINTS 5 Cycle 3 Survey Overview & Data Analysis Recommendations
  6. HINTS 5 Cycle 3 Methodology Report
  7. HINTS 5 Cycle 3 SAS Public-Use Dataset
  8. HINTS 5 Cycle3 Annotated Instruments English and Spanish
  9. HINTS 5 Cycle3 Web Pilot Results Report

The most useful files for conducting the statistical analysis in SAS are the public codebook, public format assignments, public formats, and public-use dataset files. The public codebook contains an overview of all variable names, labels, formats, response options, and weighted and unweighted sample sizes and proportions. SAS programming statements (.sas files) are provided in the public format assignments and public formats files. These statements can be used to apply labels to your data file. For example, if you do not use the formatting files, your output will display the numeric code “1” instead of the survey response label “Yes.” The public-use dataset file stores the responses to each variable in numerical format.

Step 2: Run SAS programming statements to create library and labels for HINTS 5 Cycle 3 Data

Sample SAS programs to create the libraries and format the HINTS 5 Cycle 3 data with labels are provided in Box 8.1 and Box 8.2, respectively. The full SAS programs are available for download in the chapter 8 folder in the Open ICPSR data repository.

HINTS 5 Cycle 3 SAS Public Formats File

  • Open the HINTS 5 Cycle 3 Public Formats File
  • Create a LIBNAME statement that houses the data and files associated with the analysis. I recommend naming the library after the survey (e.g. “HINTS”) and pointing it to the same location where the data files were saved on the C drive (e.g. “C:\HINTS\HINTS 5.3”)
  • Highlight all programming statements and click RUN.

Box 8.1. SAS program for public formats, HINTS 5 Cycle 3

[Image: SAS syntax for public formats, HINTS 5 Cycle 3]
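
As a rough illustration of what a formats program of this kind does, the sketch below creates the library and stores one user-defined format in it. It is a minimal example written for this text, not the program distributed with the HINTS download. The library path follows the folder recommendation above, and the format name YESNOF is hypothetical. The actual public formats file defines many formats with its own names.

```sas
/* Library pointing at the folder where the HINTS 5 Cycle 3 files were unzipped */
libname HINTS "C:\HINTS\HINTS 5.3";

/* Store user-defined formats in the HINTS library (catalog HINTS.FORMATS) */
proc format library=HINTS;
  /* YESNOF is a hypothetical format name used only for illustration */
  value yesnof
    -9 = "Missing data (Not Ascertained)"
    -7 = "Missing data (Web partial - Question Never Seen)"
     1 = "Yes"
     2 = "No";
run;
```

Saving the formats in a permanent library means they only need to be created once and can be reused in later SAS sessions.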

HINTS 5 Cycle 3 SAS Public Format Assignments File

  • Open the HINTS 5 Cycle 3 Public Format Assignments file
  • Enter the LIBNAME in the “options fmtsearch” statement and update file name
  • Highlight all programming statements and click RUN

Box 8.2. SAS program for public format assignments, HINTS 5 Cycle 3

[Image: SAS syntax for public format assignments, HINTS 5 Cycle 3]
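
The general pattern behind a format assignments program can be sketched as follows. This is an assumption-laden illustration, not the distributed file. The dataset names hints5_cycle3_public and hints53_formatted are placeholders, and YESNOF. is a hypothetical format that is assumed to already exist in the HINTS.FORMATS catalog.

```sas
libname HINTS "C:\HINTS\HINTS 5.3";

/* Tell SAS to search the HINTS library when resolving format names */
options fmtsearch=(HINTS);

/* Input and output dataset names are assumed for illustration */
data HINTS.hints53_formatted;
  set HINTS.hints5_cycle3_public;
  /* Attach the hypothetical YESNOF. format to two yes/no survey items */
  format Electronic_TalkDoctor EverTestedColonCa yesnof.;
run;
```

Once the formats are attached, procedures print the response labels rather than the underlying numeric codes.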

Step 3: Select Variables for HINTS 5 Cycle 3 Analysis

Once formats and labels have been assigned to the dataset, you can remove any variables that are not needed for your analysis. This will reduce the size of the dataset and shorten processing time when running SAS programming statements. In this case study, I have kept the following variables (Table 8.2), which capture the survey design features and are used to create the independent variable, dependent variable, and selected covariates.

Table 8.2. Overview of variables used for HINTS case study

| Variable Name | Variable Description |
|---|---|
| Design Variables | |
| TG_all_FINWT0 | Final person-level sample weight – all modalities combined |
| TG_all_FINWT1 – TG_all_FINWT50 | Final person-level replicate weights 1-50 – all modalities combined |
| Independent Variable | |
| Electronic_TalkDoctor | In the past 12 months have you used a computer, smart phone, or other electronic means to use e-mail or the internet to communicate with a doctor? |
| Dependent Variable | |
| EverTestedColonCa | Have you ever had one of these tests to check for colon cancer? |
| Covariates/Inclusion Criteria | |
| FreqGoProvider | In the past 12 months, not counting times you went to an emergency room, how many times did you go to a doctor, nurse, or other health professional to get care for yourself? |
| Age | What is your age? |
| SelfGender | Self-reported gender |
| Race_Cat2 | Derived variable to categorize responses given in O6 (Race) |
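
One way to keep only the variables in Table 8.2 is a KEEP= dataset option, sketched below. The input and output dataset names are assumptions made for illustration. The numbered range list TG_all_FINWT0-TG_all_FINWT50 selects the final weight and all 50 replicate weights in one step.

```sas
libname HINTS "C:\HINTS\HINTS 5.3";

/* Dataset names are assumed; replace them with the names in your library */
data HINTS.hints53_subset;
  set HINTS.hints5_cycle3_public
      (keep=TG_all_FINWT0-TG_all_FINWT50      /* final and replicate weights */
            Electronic_TalkDoctor             /* independent variable */
            EverTestedColonCa                 /* dependent variable */
            FreqGoProvider Age SelfGender Race_Cat2);  /* covariates */
run;
```

Applying KEEP= on the SET statement means the unneeded variables are never read into the DATA step, which is usually faster than dropping them afterwards.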

Step 4: Recode and rename variables

Questionnaire responses often need to be recoded or collapsed prior to conducting statistical analysis. For example, the HINTS has response options “-9=Missing Data (Not ascertained)” and “-7=Missing data (Web-partial, Question Never Seen)” for several questions. These responses are typically set to missing prior to analysis. Furthermore, the numbers that represent certain values may need to be changed for easier interpretation of statistical analysis results. For example, HINTS has response options “1=Yes” and “2=No.” It is common practice to recode “no” responses to 0 (“0=No”). It is best practice to save recoded values under a new variable name instead of overwriting the original variable.

An overview of the variables recoded and renamed for analysis in this case study is provided in Table 8.3.

Table 8.3. Overview of HINTS variables recoded and renamed to meet research aims

| Question Description | Original Variable | Original Responses | Renamed Variable | Recoded Responses |
|---|---|---|---|---|
| In the past 12 months have you used a computer, smart phone, or other electronic means to use e-mail or the internet to communicate with a doctor or a doctor’s office? | Electronic_TalkDoctor | -9=Missing data (Not Ascertained); 1=Yes; 2=No | EMAIL_PPC | 0=No; 1=Yes |
| Have you ever had one of these tests to check for colon cancer? | EverTestedColonCa | -9=Missing data; -7=Missing data (Web partial); 1=Yes; 2=No | COL_NEW | 0=No; 1=Yes |
| What is your age? | Age | -9=Missing data (Not Ascertained); -4=Unreadable non-conforming numeric response; 18-98 years | AGE_NEW | 1=50-59 years; 2=60-69 years; 3=70 and older |
| Self: Gender | SelfGender | -9=Missing data (Not Ascertained); -7=Missing data (Web partial – Question Never Seen); 1=Male; 2=Female | GENDER | 1=Male; 2=Female |

A sample SAS program for recoding and renaming HINTS 5 Cycle 3 data for this case study is provided in Box 8.3.

Box 8.3. Sample SAS program to recode and rename HINTS variables

[Image: SAS syntax to recode and rename HINTS variables]
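
The recodes listed in Table 8.3 could be written along the following lines. This is a sketch under stated assumptions, not the author's program. The dataset names are placeholders, and leaving respondents younger than 50 with a missing AGE_NEW value is my reading of the age categories in Table 8.3.

```sas
libname HINTS "C:\HINTS\HINTS 5.3";

/* Dataset names are assumed for illustration */
data HINTS.hints53_recoded;
  set HINTS.hints53_subset;

  /* E-mail PPC: 1=Yes stays 1, 2=No becomes 0, negative codes become missing */
  if Electronic_TalkDoctor = 1 then EMAIL_PPC = 1;
  else if Electronic_TalkDoctor = 2 then EMAIL_PPC = 0;
  else EMAIL_PPC = .;

  /* Colon cancer screening: same yes/no recode */
  if EverTestedColonCa = 1 then COL_NEW = 1;
  else if EverTestedColonCa = 2 then COL_NEW = 0;
  else COL_NEW = .;

  /* Age groups; ages under 50 and negative codes are left missing (assumption) */
  if 50 <= Age <= 59 then AGE_NEW = 1;
  else if 60 <= Age <= 69 then AGE_NEW = 2;
  else if Age >= 70 then AGE_NEW = 3;
  else AGE_NEW = .;

  /* Gender: keep 1=Male and 2=Female, set missing-data codes to missing */
  if SelfGender in (1, 2) then GENDER = SelfGender;
  else GENDER = .;
run;
```

Each recode writes to a new variable, so the original survey responses remain available for checking the recodes with a cross-tabulation.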

Step 5: Conduct Descriptive Statistical Analysis

Once all variables are recoded, collapsed, and renamed, they can be used for statistical analysis. Statistical analysis should always start with descriptive analysis to describe the data source. Chi-square analyses should be conducted to make categorical comparisons between the independent variable, covariates, and dependent variable. It is important to remember that all analysis of HINTS data needs to be conducted with SAS survey procedures due to the complex sample design. It is recommended to use the final sample weight (variable: TG_all_FINWT0), the replicate weights (variables TG_all_FINWT1 to TG_all_FINWT50), and 49 degrees of freedom (50-1=49) for the “delete one” jackknife replication method. More details on the replicate weighting are available in the HINTS 5 Cycle 3 Methodology Report.8

A sample SAS program for conducting chi-square tests using HINTS 5 Cycle 3 data for this case study is provided in Box 8.4.

Box 8.4. Sample SAS program for running descriptive statistics (chi-square)

[Image: SAS syntax for running descriptive statistics (chi-square)]
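
A weighted cross-tabulation with jackknife replicate weights might look like the sketch below. The dataset name is an assumption, and the recoded variables are those named in Table 8.3. The JKCOEFS value of 0.98 is 49/50, the multiplier for a delete-one jackknife with 50 replicates. WCHISQ requests a Wald chi-square test, which is one of several design-based chi-square options and is chosen here only for illustration.

```sas
libname HINTS "C:\HINTS\HINTS 5.3";

/* Dataset name is assumed; variables come from Table 8.3 */
proc surveyfreq data=HINTS.hints53_recoded varmethod=jackknife;
  weight TG_all_FINWT0;                                    /* final sample weight */
  repweights TG_all_FINWT1-TG_all_FINWT50 / df=49 jkcoefs=0.98;  /* 50 replicates */
  /* Row percentages and Wald chi-square for each characteristic by e-mail PPC */
  tables AGE_NEW*EMAIL_PPC GENDER*EMAIL_PPC / row wchisq;
run;
```

The same WEIGHT and REPWEIGHTS statements carry over unchanged to the other SAS survey procedures.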

Step 6: Conduct Inferential Statistical Analysis

After calculating descriptive statistics, inferential statistical analysis can be conducted. Crude and multivariable logistic regression models can be fitted to determine associations between e-mail PPC (using a computer, smart phone, or other electronic means to use e-mail or the internet to communicate with a doctor or a doctor’s office) and colon cancer screening. Crude logistic regression models are used to determine the association between the independent and dependent variables without adjusting for other factors. Multivariable logistic regression models are used to determine associations between the independent and dependent variables after adjusting for potential covariates (e.g. age, sex). A reference category for the independent variable is needed. For this analysis, the reference group is “No.” Results compare adults who did and did not use a computer, smart phone, or other electronic means to use e-mail or the internet to communicate with their doctor or doctor’s office. A sample SAS program for conducting logistic regression analysis using HINTS 5 Cycle 3 data for this case study is provided in Box 8.5.

Box 8.5. Sample SAS program for running HINTS inferential statistics (logistic regression)

[Image: SAS syntax for running HINTS inferential statistics (logistic regression)]
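
The crude and adjusted models described above could be specified roughly as follows. This is a minimal sketch, not the author's program. The dataset name is assumed, the covariates are limited to the recoded age and gender variables from Table 8.3, and the reference levels for the covariates are chosen only for illustration.

```sas
libname HINTS "C:\HINTS\HINTS 5.3";

/* Crude model: colon cancer screening on e-mail PPC only */
proc surveylogistic data=HINTS.hints53_recoded varmethod=jackknife;
  weight TG_all_FINWT0;
  repweights TG_all_FINWT1-TG_all_FINWT50 / df=49 jkcoefs=0.98;
  class EMAIL_PPC (ref="0") / param=ref;          /* reference group: No */
  model COL_NEW (event="1") = EMAIL_PPC;          /* models the odds of Yes */
run;

/* Adjusted model: adds age group and gender as covariates (assumed choices) */
proc surveylogistic data=HINTS.hints53_recoded varmethod=jackknife;
  weight TG_all_FINWT0;
  repweights TG_all_FINWT1-TG_all_FINWT50 / df=49 jkcoefs=0.98;
  class EMAIL_PPC (ref="0") AGE_NEW (ref="1") GENDER (ref="1") / param=ref;
  model COL_NEW (event="1") = EMAIL_PPC AGE_NEW GENDER;
run;
```

Comparing the odds ratio for EMAIL_PPC across the two models shows how much of the crude association is explained by the covariates.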

8.7 Summary

This chapter provided an overview of the HINTS and ways to conduct basic statistical analysis using HINTS 5 Cycle 3 public-use data files. The HINTS case study explored whether adults who used e-mail to communicate with their health care provider were more or less likely to receive a colon cancer screening. Sample SAS programming statements were provided for downloading data files, labeling and formatting data files, recoding and renaming variables, and conducting categorical descriptive and inferential statistical analysis. The dataset and full SAS programming statements for the HINTS case study are available in the chapter 8 folder in the Open ICPSR data repository.

8.8 References 

  1. National Cancer Institute. Health Information National Trends Survey (HINTS): Overview of the HINTS 5 Cycle 3 Survey and Data Analysis Recommendations, January 2020.
  2. Finney Rutten LJ, Blake KD, Skolnick VG, Davis T, Moser RP, Hesse BW. Data Resource Profile: The National Cancer Institute’s Health Information National Trends Survey (HINTS). Int J Epidemiol. 2020;49(1):17-17j. doi:10.1093/ije/dyz083
  3. Finney Rutten LJ, Davis T, Beckjord EB, Blake K, Moser RP, Hesse BW. Picking up the pace: changes in method and frame for the health information national trends survey (2011-2014). J Health Commun. 2012;17(8):979-989. doi:10.1080/10810730.2012.700998
  4. Nelson DE, Kreps GL, Hesse BW, et al. The Health Information National Trends Survey (HINTS): development, design, and dissemination. J Health Commun. 2004;9(5):443-460; discussion 81-84. doi:10.1080/10810730490504233
  5. National Institutes of Health. Health Information National Trends Survey (HINTS): Annotated Form Cycle 3/Web Pilot, English Version.
  6. National Cancer Institute. What is HINTS? https://hints.cancer.gov/
  7. Maitland A, Lin A, Cantor D, et al. A Nonresponse Bias Analysis of the Health Information National Trends Survey (HINTS). J Health Commun. 2017;22(7):545-553. doi:10.1080/10810730.2017.1324539
  8. Westat. Health Information National Trends Survey 5 (HINTS 5) Cycle 3 Methodology Report, August 2019.
  9. Ye J, Rust G, Fry-Johnson Y, Strothers H. E-mail in patient-provider communication: a systematic review. Patient Educ Couns. 2010;80(2):266-273. doi:10.1016/j.pec.2009.09.038
  10. Kindratt T, Callender L, Cobbaert M, Wondrack J, Bandiera F, Salvo D. Health information technology use and influenza vaccine uptake among US adults. Int J Med Inform. 2019;129:37-42. doi:10.1016/j.ijmedinf.2019.05.025
  11. Kindratt TB, Allicock M, Atem F, Dallo FJ, Balasubramanian BA. Email Patient-Provider Communication and Cancer Screenings Among US Adults: Cross-sectional Study. JMIR Cancer. 2021;7(3):e23790. doi:10.2196/23790
  12. Kindratt TB, Atem F, Dallo FJ, Allicock M, Balasubramanian BA. The Influence of Patient–Provider Communication on Cancer Screening. Journal of Patient Experience. Published online May 11, 2020:2374373520924993. doi:10.1177/2374373520924993

License


Big Data for Epidemiology: Applied Data Analysis Using National Health Surveys Copyright © 2022 by Tiffany B. Kindratt is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.
