The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain ...restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.
Vocal performance degradation is a common symptom for the vast majority of Parkinson's disease (PD) subjects, who typically follow personalized one-to-one periodic rehabilitation meetings with speech ...experts over a long-term period. Recently, a novel computer program called Lee Silverman voice treatment (LSVT) Companion was developed to allow PD subjects to independently progress through a rehabilitative treatment session. This study is part of the assessment of the LSVT Companion, aiming to investigate the potential of using sustained vowel phonations towards objectively and automatically replicating the speech experts' assessments of PD subjects' voices as "acceptable" (a clinician would allow persisting during in-person rehabilitation treatment) or "unacceptable" (a clinician would not allow persisting during in-person rehabilitation treatment). We characterize each of the 156 sustained vowel /a/ phonations with 309 dysphonia measures, select a parsimonious subset using a robust feature selection algorithm, and automatically distinguish the two cohorts (acceptable versus unacceptable) with about 90% overall accuracy. Moreover, we illustrate the potential of the proposed methodology as a probabilistic decision support tool to speech experts to assess a phonation as "acceptable" or "unacceptable." We envisage the findings of this study being a first step towards improving the effectiveness of an automated rehabilitative speech assessment tool.
In this paper, we present an assessment of the practical value of existing traditional and nonstandard measures for discriminating healthy people from people with Parkinson's disease (PD) by ...detecting dysphonia. We introduce a new measure of dysphonia, pitch period entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected ten highly uncorrelated measures, and an exhaustive search of all possible combinations of these measures finds four that in combination lead to overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that nonstandard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected nonstandard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well suited to telemonitoring applications.
Wearable devices can capture objective day-to-day data about Parkinson's Disease (PD). This study aims to assess the feasibility of implementing wearable technology to collect data from multiple ...sensors during the daily lives of PD patients. The Parkinson@home study is an observational, two-cohort (North America, NAM; The Netherlands, NL) study. To recruit participants, different strategies were used between sites. Main enrolment criteria were self-reported diagnosis of PD, possession of a smartphone and age≥18 years. Participants used the Fox Wearable Companion app on a smartwatch and smartphone for a minimum of 6 weeks (NAM) or 13 weeks (NL). Sensor-derived measures estimated information about movement. Additionally, medication intake and symptoms were collected via self-reports in the app. A total of 953 participants were included (NL: 304, NAM: 649). Enrolment rate was 88% in the NL (n = 304) and 51% (n = 649) in NAM. Overall, 84% (n = 805) of participants contributed sensor data. Participants were compliant for 68% (16.3 hours/participant/day) of the study period in NL and for 62% (14.8 hours/participant/day) in NAM. Daily accelerometer data collection decreased 23% in the NL after 13 weeks, and 27% in NAM after 6 weeks. Data contribution was not affected by demographics, clinical characteristics or attitude towards technology, but was by the platform usability score in the NL (χ2 (2) = 32.014, p<0.001), and self-reported depression in NAM (χ2(2) = 6.397, p = .04). The Parkinson@home study shows that it is feasible to collect objective data using multiple wearable sensors in PD during daily life in a large cohort.
The process of collecting and organizing sets of observations represents a common theme throughout the history of science. However, despite the ubiquity of scientists measuring, recording and ...analysing the dynamics of different processes, an extensive organization of scientific time-series data and analysis methods has never been performed. Addressing this, annotated collections of over 35 000 real-world and model-generated time series, and over 9000 time-series analysis algorithms are analysed in this work. We introduce reduced representations of both time series, in terms of their properties measured by diverse scientific methods, and of time-series analysis methods, in terms of their behaviour on empirical time series, and use them to organize these interdisciplinary resources. This new approach to comparing across diverse scientific data and methods allows us to organize time-series datasets automatically according to their properties, retrieve alternatives to particular analysis methods developed in other scientific disciplines and automate the selection of useful methods for time-series classification and regression tasks. The broad scientific utility of these tools is demonstrated on datasets of electroencephalograms, self-affine time series, heartbeat intervals, speech signals and others, in each case contributing novel analysis techniques to the existing literature. Highly comparative techniques that compare across an interdisciplinary literature can thus be used to guide more focused research in time-series analysis for applications across the scientific disciplines.
Despite the large number of studies that have investigated the use of wearable sensors to detect gait disturbances such as Freezing of gait (FOG) and falls, there is little consensus regarding ...appropriate methodologies for how to optimally apply such devices. Here, an overview of the use of wearable systems to assess FOG and falls in Parkinson’s disease (PD) and validation performance is presented. A systematic search in the PubMed and Web of Science databases was performed using a group of concept key words. The final search was performed in January 2017, and articles were selected based upon a set of eligibility criteria. In total, 27 articles were selected. Of those, 23 related to FOG and 4 to falls. FOG studies were performed in either laboratory or home settings, with sample sizes ranging from 1 PD up to 48 PD presenting Hoehn and Yahr stage from 2 to 4. The shin was the most common sensor location and accelerometer was the most frequently used sensor type. Validity measures ranged from 73–100% for sensitivity and 67–100% for specificity. Falls and fall risk studies were all home-based, including samples sizes of 1 PD up to 107 PD, mostly using one sensor containing accelerometers, worn at various body locations. Despite the promising validation initiatives reported in these studies, they were all performed in relatively small sample sizes, and there was a significant variability in outcomes measured and results reported. Given these limitations, the validation of sensor-derived assessments of PD features would benefit from more focused research efforts, increased collaboration among researchers, aligning data collection protocols, and sharing data sets.
The standard reference clinical score quantifying average Parkinson's disease (PD) symptom severity is the Unified Parkinson's Disease Rating Scale (UPDRS). At present, UPDRS is determined by the ...subjective clinical evaluation of the patient's ability to adequately cope with a range of tasks. In this study, we extend recent findings that UPDRS can be objectively assessed to clinically useful accuracy using simple, self-administered speech tests, without requiring the patient's physical presence in the clinic. We apply a wide range of known speech signal processing algorithms to a large database (approx. 6000 recordings from 42 PD patients, recruited to a six-month, multi-centre trial) and propose a number of novel, nonlinear signal processing algorithms which reveal pathological characteristics in PD more accurately than existing approaches. Robust feature selection algorithms select the optimal subset of these algorithms, which is fed into non-parametric regression and classification algorithms, mapping the signal processing algorithm outputs to UPDRS. We demonstrate rapid, accurate replication of the UPDRS assessment with clinically useful accuracy (about 2 UPDRS points difference from the clinicians' estimates, p < 0.001). This study supports the viability of frequent, remote, cost-effective, objective, accurate UPDRS telemonitoring based on self-administered speech tests. This technology could facilitate large-scale clinical trials into novel PD treatments.
Parkinson's disease is a complex and heterogeneous condition, and there are many gaps in the medical community's scientific and practical understanding of the disease. Closing these gaps relies on ...objective data about symptoms and signs, collected over long durations. Smartphones contain sensor devices which can be used to remotely capture behavioral signals. From these signals, computational algorithms can distill metrics of symptom severity and progression. This brief review introduces the main concepts of the discipline, addressing the experimental, hardware and software logistics, and computational analysis. The article finishes with an exploration of future prospects for the technology.