Area-level measures are often used to approximate socioeconomic status (SES) when individual-level data are not available. However, no national studies have examined the validity of these measures in approximating individual-level SES.
Data came from approximately 3,471,000 participants in the Mortality Disparities in American Communities study, which links data from the 2008 American Community Survey to the National Death Index (through 2015). We calculated correlations, specificity, sensitivity, and odds ratios to summarize the concordance between individual-, census tract-, and county-level SES indicators (e.g., household income, college degree, unemployment). We estimated the association between each SES measure and mortality to illustrate the implications of misclassification for estimates of the SES-mortality association.
Participants with high individual-level SES were more likely than other participants to live in high-SES areas. For example, individuals with high household incomes were more likely to live in census tracts (r = 0.232; odds ratio [OR] = 2.284) or counties (r = 0.157; OR = 1.325) whose median household income was above the US median. Across indicators, mortality was higher among low-SES groups (all p < .0001). Compared with county-level measures, census tract-level measures more closely approximated individual-level associations with mortality.
Moderate agreement emerged among binary indicators of SES across individual, census tract, and county levels, with increased precision for census tract compared to county measures when approximating individual-level values. When area-level measures were used as proxies for individual SES, the SES-mortality associations were systematically underestimated. Studies using area-level SES proxies should use caution when selecting, analyzing, and interpreting associations with health outcomes.
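The concordance measures named in the methods above (sensitivity, specificity, and the odds ratio) can all be read off a 2x2 table that cross-classifies a binary individual-level indicator against a binary area-level proxy. The following is a minimal sketch under stated assumptions: the function name `concordance`, the 0/1 coding (1 = high SES), and the toy data are hypothetical, and the study's own estimation procedure may differ.

```python
def concordance(individual, area):
    """Compare a binary area-level proxy with a binary individual-level indicator.

    Both inputs are equal-length sequences of 0/1 values (1 = high SES).
    The individual-level value is treated as the reference standard
    (an assumption made for this illustration).
    """
    if len(individual) != len(area):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(individual, area))
    tp = sum(1 for i, a in pairs if i == 1 and a == 1)  # high individual SES, high-SES area
    fn = sum(1 for i, a in pairs if i == 1 and a == 0)  # high individual SES, low-SES area
    fp = sum(1 for i, a in pairs if i == 0 and a == 1)  # low individual SES, high-SES area
    tn = sum(1 for i, a in pairs if i == 0 and a == 0)  # low individual SES, low-SES area

    def ratio(num, den):
        # Return None instead of dividing by zero for an empty cell.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "odds_ratio": ratio(tp * tn, fp * fn),
    }


# Hypothetical toy data: eight people, individual indicator vs. area proxy.
individual = [1, 1, 1, 0, 0, 0, 0, 1]
area = [1, 1, 0, 0, 0, 1, 0, 0]
result = concordance(individual, area)
```

An odds ratio above 1 in this table means that people with high individual-level SES are more likely than others to live in a high-SES area, which is the direction of the associations reported above.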
The National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program provides a rich source of data stratified according to tumor biomarkers that play an important role in cancer surveillance research. These data are useful for analyzing trends in cancer incidence and survival. These tumor markers, however, are often prone to missing observations. To address the problem of missing data, the authors employed sequential regression multivariate imputation for breast cancer variables, with a particular focus on estrogen receptor status, using data from 13 SEER registries covering the period 1992-2007. In this paper, they present an approach to accounting for missing information through the creation of imputed data sets that can be analyzed using existing software (e.g., SEER*Stat) developed for analyzing cancer registry data. Bias in age-adjusted trends in female breast cancer incidence is shown graphically before and after imputation of estrogen receptor status, stratified by age and race. The imputed data set will be made available in SEER*Stat (http://seer.cancer.gov/analysis/index.html) to facilitate accurate estimation of breast cancer incidence trends. To ensure that the imputed data set is used correctly, the authors provide detailed, step-by-step instructions for conducting analyses. This is the first time that a nationally representative, population-based cancer registry data set has been imputed and made available to researchers for conducting a variety of analyses of breast cancer incidence trends.
Health disparities are commonplace and of broad interest to policy makers, but are also challenging to measure and communicate. The Health Disparity Calculator software (HD*Calc, v1.2.4) offers Monte Carlo simulation (MCS)-based confidence interval (CI) estimation of eleven disparity measures. The MCS approach provides accurate CI estimation, except when data are scarce (e.g., rare cancers). To address sparse data challenges to CI estimation, we propose two solutions: 1) employing the gamma distribution in the MCS and 2) utilizing a zero-inflated Poisson estimate for Poisson sampling in simulation experiments. We evaluate each solution through simulation studies using female breast, female brain, lung, and cervical cancer data from the Surveillance, Epidemiology, and End Results (SEER) program. We compare the coverage probabilities (CPs) of eleven health disparity measures based on simulated datasets. The truncated normal distribution implemented in the MCS with the standard Poisson samples (the default setting of HD*Calc) leads to less-than-optimal coverage probabilities (<95%). When both the gamma distribution and the estimated mean from the zero-inflated Poisson are used for the MCS, the coverage probabilities are close to the nominal level of 95%. Simulation studies also demonstrate that collapsing age categories for better CI estimation is not a pragmatic solution.
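To make the idea of a gamma-based Monte Carlo confidence interval concrete, the sketch below simulates group-specific rates from gamma distributions and takes percentiles of a simple disparity measure (the rate ratio between two groups). It is an illustration only and is not HD*Calc's implementation: the function name, the gamma parameterization (shape = observed count, scale = 1 / person-years), the percentile interval, and the example counts are all assumptions, and groups with zero counts are deliberately not handled here.

```python
import random


def gamma_mc_rate_ratio_ci(count_a, py_a, count_b, py_b,
                           n_sims=10000, level=0.95, seed=0):
    """Percentile CI for the rate ratio (group A / group B) by Monte Carlo simulation.

    Each simulated rate is drawn from a gamma distribution with
    shape = observed count and scale = 1 / person-years (an assumed
    parameterization chosen for this sketch). Gamma draws are always
    positive, so no truncation at zero is needed.
    """
    if count_a <= 0 or count_b <= 0:
        raise ValueError("this sketch requires positive counts in both groups")

    rng = random.Random(seed)
    ratios = []
    for _ in range(n_sims):
        rate_a = rng.gammavariate(count_a, 1.0 / py_a)
        rate_b = rng.gammavariate(count_b, 1.0 / py_b)
        ratios.append(rate_a / rate_b)

    ratios.sort()
    lower_index = int((1 - level) / 2 * n_sims)
    upper_index = int((1 + level) / 2 * n_sims) - 1
    return ratios[lower_index], ratios[upper_index]


# Hypothetical sparse-data example: 30 vs. 20 cases per 100,000 person-years.
lower, upper = gamma_mc_rate_ratio_ci(30, 100_000, 20, 100_000)
```

The same loop could be reused for other disparity measures by replacing the ratio with a different function of the simulated rates.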
More than half of all smartphone app downloads involve weight, diet, and exercise. If successful, these lifestyle apps may have far-reaching effects for disease prevention and health cost-savings, but few researchers have analyzed data from these apps.
The purposes of this study were to analyze data from a commercial health app (Lose It!) in order to identify successful weight loss subgroups via exploratory analyses and to verify the stability of the results.
Cross-sectional, de-identified data from Lose It! were analyzed. This dataset (n=12,427,196) was randomly split into 24 subsamples, and this study used 3 subsamples (combined n=972,687). Classification and regression tree methods were used to explore groupings of weight loss with one subsample, with descriptive analyses to examine other group characteristics. Data mining validation methods were conducted with 2 additional subsamples.
In subsample 1, 14.96% of users lost 5% or more of their starting body weight. Classification and regression tree analysis identified 3 distinct subgroups: "the occasional users" had the lowest proportion (4.87%) of individuals who successfully lost weight; "the basic users" had 37.61% weight loss success; and "the power users" achieved the highest percentage of weight loss success at 72.70%. Behavioral factors delineated the subgroups, though app-related behavioral characteristics further distinguished them. Results were replicated in further analyses with separate subsamples.
This study demonstrates that distinct subgroups can be identified in "messy" commercial app data and that the identified subgroups can be replicated in independent samples. Behavioral factors and use of custom app features characterized the subgroups. Targeting and tailoring information to particular subgroups could enhance weight loss success. Future studies should replicate data mining analyses to increase methodological rigor.
Purpose: The rapid proliferation of mobile devices offers unprecedented opportunities for patients and health care professionals to exchange health information electronically, but little is known about patients' willingness to exchange various types of health information using these devices. We examined willingness to exchange different types of health information via mobile devices, and assessed whether sociodemographic characteristics and trust in clinicians were associated with willingness in a nationally representative sample. Methods: We analyzed data for 3,165 patients captured in the 2013 Health Information National Trends Survey. Multinomial logistic regression analysis was conducted to test differences in willingness. Ordinal logistic regression analysis assessed correlates of willingness to exchange 9 types of information separately. Results: Participants were very willing to exchange appointment reminders (odds ratio [OR] = 6.66; 95% CI, 5.68–7.81), general health tips (OR = 2.03; 95% CI, 1.74–2.38), medication reminders (OR = 2.73; 95% CI, 2.35–3.19), laboratory/test results (OR = 1.76; 95% CI, 1.62–1.92), vital signs (OR = 1.63; 95% CI, 1.48–1.80), lifestyle behaviors (OR = 1.40; 95% CI, 1.24–1.58), and symptoms (OR = 1.62; 95% CI, 1.46–1.79) as compared with diagnostic information. Older adults had lower odds of being more willing to exchange any type of information. Education, income, and trust in health care professional information correlated with willingness to exchange certain types of information. Conclusions: Respondents were less willing to exchange via mobile devices information that may be considered sensitive or complex. Age, socioeconomic factors, and trust in professional information were associated with willingness to engage in mobile health information exchange. Both information type and demographic group should be considered when developing and tailoring mobile technologies for patient-clinician communication.
The Physical Activity Monitor component was introduced into the 2003–2004 National Health and Nutrition Examination Survey (NHANES) to collect objective information on physical activity including both movement intensity counts and ambulatory steps. Because of an error in the accelerometer device initialization process, the steps data were missing for all participants in several primary sampling units, typically a single county or group of contiguous counties, who had intensity count data from their accelerometers. To avoid potential bias and loss in efficiency in estimation and inference involving the steps data, we considered methods to accurately impute the missing values for steps collected in the 2003–2004 NHANES. The objective was to develop an efficient imputation method that minimized model-based assumptions. We adopted a multiple imputation approach based on additive regression, bootstrapping, and predictive mean matching methods. This method fits alternating conditional expectation (ace) models, which use an automated procedure to estimate optimal transformations for both the predictor and response variables. This paper describes the approaches used in this imputation and evaluates the methods by comparing the distributions of the original and the imputed data. A simulation study using the observed data is also conducted as part of the model diagnostics. Finally, some real data analyses are performed to compare the before and after imputation results. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
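Predictive mean matching, one of the ingredients named above, can be illustrated in a few lines. The sketch below is a simplified stand-in and not the paper's procedure: it swaps in an ordinary least-squares line with a single predictor in place of the additive regression with estimated transformations, it omits the bootstrap step, and the function names, the choice of `k`, and the toy values are all hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=3, seed=0):
    """Fill missing y values (None) by predictive mean matching.

    For each case with missing y, find the k observed cases whose predicted
    means are closest to its own predicted mean, then copy the observed y
    of one randomly chosen donor. Imputed values are therefore always
    values that were actually observed.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    intercept, slope = fit_line([o[0] for o in observed], [o[1] for o in observed])

    # Pair each donor's predicted mean with its observed value.
    donors = [(intercept + slope * xi, yi) for xi, yi in observed]

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = intercept + slope * xi
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])
    return completed


# Hypothetical toy data: x plays the role of intensity counts, y of steps.
intensity = [1, 2, 3, 4, 5, 6]
steps = [10, 20, None, 40, 50, None]
completed_steps = pmm_impute(intensity, steps, k=2)
```

For multiple imputation, a routine like this would be run several times with different random draws, and the analyses of the completed data sets would then be combined.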
Evidence is building that strength training may reduce complications associated with cancer such as fatigue, muscle wasting, and lymphedema, particularly among breast and prostate cancer survivors. Population estimates are available for rates of aerobic physical activity; however, data on strength training in this population are limited. The objective of this study was to identify rates of meeting public health recommendations for strength training and aerobic activity among cancer survivors and individuals with no cancer history.
Data from the Health Information National Trends Survey (HINTS), Iteration 4 Cycle 1 and Cycle 2 were combined to conduct the analyses. Missing data were imputed, and weighted statistical analyses were conducted in SAS.
The proportions of individuals meeting both strength training and aerobic guidelines were low for both cancer survivors and those without a history of cancer. The odds of meeting strength training guidelines were significantly lower for women with a history of any cancer except breast, compared with women with no history of cancer (OR: 0.70, 95% CI: 0.51-0.96).
More work needs to be done to understand why women with cancers other than breast may be less inclined to engage in aerobic physical activity and strength training.
Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities.
We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value and the non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from the two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented.
The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method.
We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with more than two levels for assessing the distribution of the outcome. In terms of the choices for the working models, we suggest a multinomial logistic regression for predicting the missing outcome and a binary logistic regression for predicting the missingness probability.
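The donor-selection step described in this abstract can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the two predictive scores are taken as already computed by the working models (no regression is fitted here), each score is reduced to a single number per observation, the weighted Euclidean form of the distance is assumed, and the function names, `weight`, `k`, and the toy data are hypothetical.

```python
import math
import random


def nn_impute_once(outcome, outcome_score, miss_score, weight=0.5, k=3, rng=None):
    """Impute missing categories (None) once by nearest-neighbour donor selection.

    The distance between a missing case i and an observed case j combines the
    two predictive scores with an assumed weighted Euclidean form:
    sqrt(weight * (outcome-score gap)^2 + (1 - weight) * (missingness-score gap)^2).
    """
    rng = rng or random.Random(0)
    donor_indices = [j for j, y in enumerate(outcome) if y is not None]
    completed = list(outcome)

    for i, y in enumerate(outcome):
        if y is not None:
            continue

        def distance(j):
            return math.sqrt(
                weight * (outcome_score[i] - outcome_score[j]) ** 2
                + (1 - weight) * (miss_score[i] - miss_score[j]) ** 2
            )

        # Keep the k closest observed cases and copy one of their categories.
        nearest = sorted(donor_indices, key=distance)[:k]
        completed[i] = outcome[rng.choice(nearest)]
    return completed


def multiply_imputed_proportions(outcome, outcome_score, miss_score,
                                 weight=0.5, k=3, n_imputations=5, seed=0):
    """Average the category proportions over several imputed data sets."""
    rng = random.Random(seed)
    categories = sorted({y for y in outcome if y is not None})
    totals = {c: 0.0 for c in categories}
    for _ in range(n_imputations):
        completed = nn_impute_once(outcome, outcome_score, miss_score,
                                   weight=weight, k=k, rng=rng)
        for c in categories:
            totals[c] += completed.count(c) / len(completed)
    return {c: total / n_imputations for c, total in totals.items()}


# Hypothetical toy data: one missing outcome and precomputed scores.
outcome = ["a", "a", "b", None, "b", "c"]
outcome_score = [0.1, 0.2, 0.8, 0.79, 0.9, 0.5]
miss_score = [0.3, 0.3, 0.4, 0.4, 0.5, 0.2]
proportions = multiply_imputed_proportions(outcome, outcome_score, miss_score, k=1)
```

Setting `weight` closer to 1 leans on the outcome model, and setting it closer to 0 leans on the missingness model, which mirrors the balancing of the two working models discussed above.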
Over the past several decades, advances in lung cancer research and practice have led to refinements of histologic diagnosis of lung cancer. The differential use and subsequent alterations of nonspecific morphology codes, however, may have caused artifactual fluctuations in the incidence rates for histologic subtypes, thus biasing temporal trends.
We developed a multiple imputation (MI) method to correct lung cancer incidence for nonspecific histology using data from the Surveillance, Epidemiology, and End Results Program during 1975 to 2010.
For adenocarcinoma in men and squamous in both genders, the change to an increasing trend around 2005, after more than 10 years of decreasing incidence, is apparently an artifact of the changes in histopathology practice and coding system. After imputation, the rates remained decreasing for adenocarcinoma and squamous in men, and became constant for squamous in women.
As molecular features of distinct histologies are increasingly identified by new technologies, accurate histologic distinctions are becoming increasingly relevant to more effective "targeted" therapies, and therefore, are important to track in patients. However, without incorporating the coding changes, the incidence trends estimated for histologic subtypes could be misleading.
The MI approach provides a valuable tool for bridging the different histology definitions, thus permitting meaningful inferences about the long-term trends of lung cancer by histologic subtype.
Unrealistically optimistic or pessimistic risk perceptions may be associated with maladaptive health behaviors. This study characterized factors associated with unrealistic optimism (UO) and unrealistic pessimism (UP) about breast cancer. Data from the 2005 National Health Interview Survey were analyzed (N = 14,426 women). After accounting for objective risk status, many (43.8%) women displayed UO, 12.3% displayed UP, 34.5% had accurate risk perceptions (their perceived risk matched their calculated risk), and 9.5% indicated “don’t know/no response.” Multivariate multinomial logistic regression indicated that UO was associated with higher education and never smoking. UP was associated with lower education, lower income, being non-Hispanic Black, having ≥3 comorbidities, current smoking, and being overweight. UO was more likely to emerge in younger and older than in middle-aged individuals. UO and UP are associated with different demographic, health, and behavioral characteristics. Population segments that are already vulnerable to negative health outcomes displayed more UP than less vulnerable populations.