Digital technologies such as smartphones are transforming the way scientists conduct biomedical research. Several remotely conducted studies have recruited thousands of participants over a span of a few months, allowing researchers to collect real-world data at scale and at a fraction of the cost of traditional research. Unfortunately, remote studies have been hampered by substantial participant attrition, calling into question the representativeness of the collected data and the generalizability of outcomes. We report findings on recruitment and retention from eight remote digital health studies conducted between 2014 and 2019 that provided individual-level study-app usage data from more than 100,000 participants completing nearly 3.5 million remote health evaluations over a cumulative 850,000 days of participation. Median participant retention varied widely across the eight studies, from 2 to 26 days (median across all studies = 5.5 days). Survival analysis revealed several factors significantly associated with an increase in participant retention time, including (i) referral by a clinician to the study (increase of 40 days in median retention time); (ii) compensation for participation (increase of 22 days, 1 study); (iii) having the clinical condition of interest in the study (increase of 7 days compared with controls); and (iv) older age (increase of 4 days). Additionally, unsupervised clustering identified four distinct patterns of daily app usage behavior, which were also associated with participant demographics. Most studies were not able to recruit a sample that was representative of the race/ethnicity or geographical diversity of the US. Together, these findings can help inform recruitment and retention strategies to enable equitable participation of populations in future digital health research.
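The median retention times reported above come from survival analysis. As a rough illustration only, and not the authors' actual pipeline, the following sketch computes a Kaplan-Meier median from per-participant durations. The names `kaplan_meier` and `median_retention`, and the `observed` flag (True when a participant is known to have dropped out, False when follow-up simply ended), are assumptions introduced for this example.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 observed: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct dropout time."""
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Participants still enrolled just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Participants observed to drop out exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


def median_retention(durations: Sequence[float],
                     observed: Sequence[bool]) -> Optional[float]:
    """First time at which estimated survival falls to 0.5 or below.

    Returns None when the curve never reaches 0.5 (heavy censoring).
    """
    for t, s in kaplan_meier(durations, observed):
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical retention data in days; the second participant is censored.
    days = [1, 2, 3, 4, 5]
    dropped_out = [True, False, True, True, True]
    print(median_retention(days, dropped_out))
```

Censored participants still count toward the at-risk set until their last observed day, which is why a naive median of raw durations would tend to understate retention.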
Collection of high-dimensional, longitudinal digital health data has the potential to support a wide variety of research and clinical applications, including diagnostics and longitudinal health tracking. Algorithms that process these data and inform digital diagnostics are typically developed using training and test sets generated from multiple repeated measures collected across a set of individuals. However, the inclusion of repeated measurements is not always appropriately taken into account in the analytical evaluations of predictive performance. The assignment of repeated measurements from each individual to both the training and the test sets (“record-wise” data split) is a common practice and can lead to massive underestimation of the prediction error due to the presence of “identity confounding.” In essence, these models learn to identify subjects in addition to learning the diagnostic signal. Here, we present a method that can be used to effectively calculate the amount of identity confounding learned by classifiers developed using a record-wise data split. By applying this method to several real datasets, we demonstrate that identity confounding is a serious issue in digital health studies and that record-wise data splits for machine learning-based applications need to be avoided.
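The difference between a record-wise split and a split that keeps each individual's records together can be sketched as follows. This is a minimal illustration of the two splitting schemes only; it does not implement the authors' method for quantifying identity confounding. The function names and the `subject_ids` input (one subject label per record) are assumptions made for the example.

```python
import random
from typing import List, Sequence, Set, Tuple


def record_wise_split(subject_ids: Sequence[str], test_fraction: float = 0.25,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle individual records, ignoring which subject they belong to."""
    rng = random.Random(seed)
    indices = list(range(len(subject_ids)))
    rng.shuffle(indices)
    n_test = max(1, round(test_fraction * len(indices)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def subject_wise_split(subject_ids: Sequence[str], test_fraction: float = 0.25,
                       seed: int = 0) -> Tuple[List[int], List[int]]:
    """Assign whole subjects to train or test so no subject appears in both."""
    rng = random.Random(seed)
    subjects = sorted(set(subject_ids))
    rng.shuffle(subjects)
    n_test = max(1, round(test_fraction * len(subjects)))
    test_subjects = set(subjects[:n_test])
    train = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train, test


def shared_subjects(subject_ids: Sequence[str], train: Sequence[int],
                    test: Sequence[int]) -> Set[str]:
    """Subjects whose records occur on both sides of a split."""
    return {subject_ids[i] for i in train} & {subject_ids[i] for i in test}


if __name__ == "__main__":
    # Hypothetical data: four subjects with five repeated measurements each.
    ids = [s for s in "abcd" for _ in range(5)]
    for split in (record_wise_split, subject_wise_split):
        train_idx, test_idx = split(ids)
        print(split.__name__, sorted(shared_subjects(ids, train_idx, test_idx)))
```

Any subject reported as shared by the record-wise split is one a classifier could recognize in the test set from its training records, which is the mechanism behind identity confounding.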
Parkinson's disease (PD) is a neurodegenerative disorder associated with motor and non-motor symptoms. Current treatments primarily focus on managing the severity of motor symptoms such as tremor, bradykinesia, and rigidity. However, as the disease progresses, treatment side effects such as on/off periods and dyskinesia can emerge. The objective of the Levodopa Response Study was to identify whether wearable sensor data can be used to objectively quantify symptom severity in individuals with PD exhibiting motor fluctuations. Thirty-one subjects with PD were recruited from 2 sites to participate in a 4-day study. Data were collected using 2 wrist-worn accelerometers and a waist-worn smartphone. During Days 1 and 4, a portion of the data was collected in the laboratory while subjects performed a battery of motor tasks as clinicians rated symptom severity. The remainder of the recordings were performed in home and community settings. To our knowledge, this is the first dataset collected using wearable accelerometers, with a specific focus on individuals with PD experiencing motor fluctuations, that is made available via an open data repository.
Parkinson's disease (PD) is a neurodegenerative disorder characterized by motor and non-motor symptoms. Dyskinesia and motor fluctuations are complications of PD medications. An objective measure of on/off time with/without dyskinesia has been sought for some time because it would facilitate the titration of medications. The objective of the dataset presented here is to assess whether wearable sensor data can be used to generate accurate estimates of limb-specific symptom severity. Nineteen subjects with PD experiencing motor fluctuations were asked to wear a total of five wearable sensors on both forearms and shanks, as well as on the lower back. Accelerometer data were collected for four days, including two laboratory visits lasting 3 to 4 hours each, while the remainder of the time was spent at home and in the community. During the laboratory visits, subjects performed a battery of motor tasks while clinicians rated limb-specific symptom severity. At home, subjects were instructed to use a smartphone app that guided the periodic performance of a set of motor tasks.
Sparse-grid methods have recently gained interest as a way to reduce the computational cost of solving high-dimensional kinetic equations. In this paper, we construct adaptive and hybrid sparse-grid methods for the Vlasov–Poisson–Lenard–Bernstein (VPLB) model. This model has applications to plasma physics and is simulated in two reduced geometries: a 0x3v space-homogeneous geometry and a 1x3v slab geometry. We use the discontinuous Galerkin (DG) method as a base discretization due to its high-order accuracy and ability to preserve important structural properties of partial differential equations. We utilize a multiwavelet basis expansion to determine the sparse-grid basis and the adaptive mesh criteria. We analyze the proposed sparse-grid methods on a suite of three test problems by computing the savings afforded by sparse grids in comparison to standard solutions of the DG method. The results are obtained using the adaptive sparse-grid discretization library ASGarD.
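To give a sense of the savings a sparse grid can offer, the sketch below counts degrees of freedom for a full tensor-product grid and for a sparse grid built from a hierarchical multiwavelet decomposition. It assumes the textbook selection rule that keeps level vectors whose entries sum to at most the maximum level; the adaptive and hybrid variants in the paper, and the ASGarD library itself, are not reproduced here. The function names and the parameter `k` (basis functions per cell in each dimension) are assumptions for illustration.

```python
from itertools import product


def wavelet_count(level: int) -> int:
    """Number of cells in the 1D hierarchical increment at a given level."""
    return 1 if level == 0 else 2 ** (level - 1)


def full_grid_dof(dim: int, max_level: int, k: int = 1) -> int:
    """Degrees of freedom of a full tensor grid with 2**max_level cells per axis."""
    return (k * 2 ** max_level) ** dim


def sparse_grid_dof(dim: int, max_level: int, k: int = 1) -> int:
    """Degrees of freedom when only level vectors with sum <= max_level are kept."""
    total = 0
    for levels in product(range(max_level + 1), repeat=dim):
        if sum(levels) <= max_level:
            cells = 1
            for level in levels:
                cells *= wavelet_count(level)
            total += cells
    return total * k ** dim


if __name__ == "__main__":
    # Hypothetical settings: three velocity dimensions, as in a 0x3v geometry.
    for n in range(1, 7):
        print(n, full_grid_dof(3, n), sparse_grid_dof(3, n))
```

In one dimension the two counts coincide, since the hierarchical increments up to level n span the same space as the finest grid; the gap opens as the dimension grows.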
Increasing pH and decreasing Al in surface waters recovering from acidification have been accompanied by increasing concentrations of dissolved organic carbon (DOC) and associated organic acids that partially offset pH increases and complicate assessments of recovery from acidification. To better understand the processes of recovery, monthly chemistry data from 42 lakes in the Adirondack region, NY, collected from 1994 to 2011, were used to (1) evaluate long-term changes in DOC and associated strongly acidic organic acids and (2) assess, using the base-cation surplus (BCS) as a chemical index, the effects of increasing DOC concentrations on the Al chemistry of these lakes. Over the study period, the BCS increased (p < 0.01) and concentrations of toxic inorganic monomeric Al (IMAl) decreased (p < 0.01). The decreases in IMAl were greater than expected from the increases in the BCS. Higher DOC concentrations, which increased organic complexation of Al, resulted in a decrease in the IMAl fraction of total monomeric Al from 57% in 1994 to 23% in 2011. Increasing DOC concentrations have accelerated recovery in terms of decreasing toxic Al beyond that directly accomplished by reducing atmospheric deposition of strong mineral acids.