Conversation has been a primary means for the exchange of information since ancient times. Understanding patterns of information flow in conversations is a critical step in assessing and improving communication quality. In this paper, we describe COnversational DYnamics Model (CODYM) analysis, a novel approach for studying patterns of information flow in conversations. CODYMs are Markov Models that capture sequential dependencies in the lengths of speaker turns. The proposed method is automated and scalable, and preserves the privacy of the conversational participants. The primary function of CODYM analysis is to quantify and visualize patterns of information flow, concisely summarized over sequential turns from one or more conversations. Our approach is general and complements existing methods, providing a new tool for use in the analysis of any type of conversation. As an important first application, we demonstrate the model on transcribed conversations between palliative care clinicians and seriously ill patients. These conversations are dynamic and complex, taking place amidst heavy emotions, and include difficult topics such as end-of-life preferences and patient values. We use CODYMs to identify normative patterns of information flow in serious illness conversations, show how these normative patterns change over the course of the conversations, and show how they differ in conversations where the patient does or doesn't audibly express anger or fear. Potential applications of CODYMs range from assessment and training of effective healthcare communication to comparing conversational dynamics across languages, cultures, and contexts with the prospect of identifying universal similarities and unique "fingerprints" of information flow.
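The core of a CODYM, as described above, is a Markov model over speaker-turn lengths. A minimal sketch of the idea follows; the turn-length bin edges and toy data are illustrative assumptions, not the paper's actual parameterization:

```python
import numpy as np

def codym_transitions(turn_lengths, edges=(5, 20)):
    """First-order Markov transition matrix over discretized turn lengths.
    Bins (short < 5 words, medium < 20, long otherwise) are illustrative."""
    states = np.digitize(turn_lengths, edges)   # 0=short, 1=medium, 2=long
    counts = np.zeros((3, 3))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1                       # tally turn-to-turn transitions
    rows = counts.sum(axis=1, keepdims=True)
    # Normalize each row to probabilities, leaving unseen states at zero.
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

turns = [3, 12, 40, 2, 25, 7, 30, 4]  # words per speaker turn (toy data)
P = codym_transitions(turns)          # P[a, b] = P(next state b | state a)
```

Visualizing `P` across many conversations is one way such sequential dependencies can be summarized and compared.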
Studying the hysteretic relationships embedded in high‐frequency suspended‐sediment concentration and river discharge data over 600+ storm events provides insight into the drivers and sources of riverine sediment during storm events. However, the literature to date remains limited to a simple visual classification system (linear, clockwise, counter‐clockwise, and figure‐eight patterns) or the collapse of hysteresis patterns to an index. This study leverages 3 years of suspended‐sediment and discharge data to show proof‐of‐concept for automating the classification and assessment of event sediment dynamics using machine learning. Across all catchment sites, 600+ storm events were captured and classified into 14 hysteresis patterns. Event classification was automated using a restricted Boltzmann machine (RBM), a type of artificial neural network, trained on 2‐D images of the suspended‐sediment discharge (hysteresis) plots. Expansion of the hysteresis patterns to 14 classes allowed for new insight into drivers of the sediment‐discharge event dynamics including spatial scale, antecedent conditions, hydrology, and rainfall. The probabilistic RBM correctly classified hysteresis patterns (to the exact class or next most similar class) 70% of the time. With increased availability of high‐frequency sensor data, this approach can be used to inform watershed management efforts to identify sediment sources and reduce fine sediment export.
Plain Language Summary
In this study, the river stage (water level) and amount of suspended sediment (soil particles) within a river and five of its tributaries were monitored for 3 years; more than 600 storm events were captured across all six sites. For each storm event, traces of the sediment concentration and river stage were plotted against each other, and the emerging patterns such as clockwise, counter‐clockwise, and figure‐eight (hysteresis) loops were grouped into 14 recurring patterns. We also developed a machine‐learning (artificial intelligence) tool to recognize the 14 patterns using only the visual sediment‐stage image, in the same way that handwritten characters are recognized by computers. This allowed classification of the individual storm events to be automated. To better understand what these patterns tell us about the physics associated with the storm events and where on the landscape sediments may originate, we analyzed the 14 storm categories using measured rainfall, soil moisture, sediment, and river level data. The machine‐learning tool helped capture the linkages between the visual images and the types and origin of erosion using only data monitored at the river outlet during storm events.
Key Points
Storm‐event classification is automated using hysteresis images from high‐resolution turbidity sensors and a restricted Boltzmann machine
New hysteresis patterns in suspended‐sediment discharge relationships are identified
Distributions of hysteresis types show linkages in space and season, and provide insight into stream sediment source connectivity
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that can cause significant social, communication, and behavioral challenges. Diagnosis of ASD is complicated and there is an urgent need to identify ASD-associated biomarkers and features to help automate diagnostics and develop predictive ASD models. The present study adopts a novel evolutionary algorithm, the conjunctive clause evolutionary algorithm (CCEA), to select features most significant for distinguishing individuals with and without ASD, and is able to accommodate datasets having a small number of samples with a large number of feature measurements. The dataset is unique and comprises both behavioral and neuroimaging measurements from a total of 28 children from 7 to 14 years old. Potential biomarker candidates identified include brain volume, area, cortical thickness, and mean curvature in specific regions around the cingulate cortex, frontal cortex, and temporal-parietal junction, as well as behavioral features associated with theory of mind. A separate machine learning classifier (i.e., k-nearest neighbors algorithm) was used to validate the CCEA feature selection and for ASD prediction. Study findings demonstrate how machine learning tools might help move the needle on improving diagnostic and predictive models of ASD.
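The kNN validation step described above can be sketched as cross-validation on a small sample. The data here are synthetic stand-ins; the study's actual features, labels, and validation protocol are not reproduced:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
# Stand-in: 28 children x 5 CCEA-selected features (synthetic values,
# not the study's behavioral/neuroimaging measurements).
X = rng.normal(size=(28, 5))
y = np.array([0, 1] * 14)  # balanced ASD / non-ASD labels (toy)

knn = KNeighborsClassifier(n_neighbors=3)
# Small-sample cross-validation: each fold scores the classifier on held-out
# children to check that the selected features generalize.
scores = cross_val_score(knn, X, y, cv=4)
```

With only 28 samples, fold-to-fold variance is high, which is one reason a separate classifier is useful as a sanity check on the feature selection rather than as a final diagnostic model.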
There is demand for scalable algorithms capable of clustering and analyzing large time series data. The Kohonen self-organizing map (SOM) is an unsupervised artificial neural network for clustering, visualizing, and reducing the dimensionality of complex data. Like all clustering methods, it requires a measure of similarity between input data (in this work time series). Dynamic time warping (DTW) is one such measure, and a top performer that accommodates distortions when aligning time series. Despite its popularity in clustering, DTW is limited in practice because its runtime complexity is quadratic with the length of the time series. To address this, we present a new self-organizing map for clustering TIME Series, called SOMTimeS, which uses DTW as the distance measure. The method has similar accuracy compared with other DTW-based clustering algorithms, yet scales better and runs faster. The computational performance stems from the pruning of unnecessary DTW computations during the SOM’s training phase. For comparison, we implement a similar pruning strategy for K-means, and call the latter K-TimeS. SOMTimeS and K-TimeS pruned 43% and 50% of the total DTW computations, respectively. Pruning effectiveness, accuracy, execution time, and scalability are evaluated using 112 benchmark time series datasets from the UC Riverside classification archive; the results show that, for similar accuracy, SOMTimeS and K-TimeS achieve an average speed-up of 1.8×, with rates varying between 1× and 18× depending on the dataset. We also apply SOMTimeS to a healthcare study of patient-clinician serious illness conversations to demonstrate the algorithm’s utility with complex, temporally sequenced natural language.
A spatially explicit agent-based vehicle consumer choice model is developed to explore sensitivities and nonlinear interactions between various potential influences on plug-in hybrid vehicle (PHEV) market penetration. The model accounts for spatial and social effects (including threshold effects, homophily, and conformity) and media influences. Preliminary simulations demonstrate how such a model could be used to identify nonlinear interactions among potential leverage points, inform policies affecting PHEV market penetration, and help identify future data collection necessary to more accurately model the system. We examine sensitivity of the model to gasoline prices, to accuracy in estimation of fuel costs, to agent willingness to adopt the PHEV technology, to PHEV purchase price and rebates, to PHEV battery range, and to heuristic values related to gasoline usage. Our simulations indicate that PHEV market penetration could be enhanced significantly by providing consumers with ready estimates of expected lifetime fuel costs associated with different vehicles (e.g., on vehicle stickers), and that increases in gasoline prices could nonlinearly magnify the impact on fleet efficiency. We also infer a potential synergy from a gasoline tax whose proceeds are used to fund research into longer-range, lower-cost PHEV batteries.
► We model consumer agents to study potential market penetration of PHEVs. ► The model accounts for spatial, social, and media effects. ► We identify interactions among potential leverage points that could inform policy. ► Consumer access to expected lifetime fuel costs may enhance PHEV market penetration. ► Increasing PHEV battery range has synergistic effects on fleet efficiency.
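The threshold-and-conformity adoption mechanic described above can be sketched as a toy simulation. The ring network, random thresholds, and media weight below are illustrative assumptions, not the model's calibrated structure or values:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
thresholds = rng.random(N)        # per-agent adoption thresholds (toy values)
adopted = rng.random(N) < 0.05    # a few seed adopters
initial_adopters = adopted.sum()

def step(adopted, media=0.05):
    """One time step: an agent adopts when the adopting fraction of its
    ring-network neighbors, plus a constant media nudge, meets its threshold."""
    new = adopted.copy()
    for i in range(N):
        nbrs = [(i + d) % N for d in (-2, -1, 1, 2)]   # 4 nearest neighbors
        if adopted[nbrs].mean() + media >= thresholds[i]:
            new[i] = True
    return new

for _ in range(20):
    adopted = step(adopted)
```

Even this minimal setup exhibits the threshold cascades that make market penetration respond nonlinearly to seed adoption and media influence.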
This paper presents the first time series clustering benchmark utilizing all time series datasets currently available in the University of California Riverside (UCR) archive — the state-of-the-art repository of time series data. Specifically, the benchmark examines eight popular clustering methods representing three categories of clustering algorithms (partitional, hierarchical and density-based) and three types of distance measures (Euclidean, dynamic time warping, and shape-based), while adhering to six restrictions on datasets and methods to make the comparison as unbiased as possible. A phased evaluation approach was then designed for summarizing dataset-level assessment metrics and discussing the results. The benchmark study presented can be a useful reference for the research community on its own; and the dataset-level assessment metrics reported may be used for designing evaluation frameworks to answer different research questions.
•This paper presents the first benchmark study on time series clustering using the UCR archive.
•This paper frames and follows six restrictions that guide the construction of the time series benchmark.
•This paper demonstrates a phased evaluation as an approach to controlled benchmark testing.
•This paper finds that the clustering methods tested vary significantly in accuracy depending on the dataset.
We present evidence of increasing persistence in daily precipitation in the northeastern United States that suggests that global circulation changes are affecting regional precipitation patterns. Meteorological data from 222 stations in 10 northeastern states are analyzed using Markov chain parameter estimates to demonstrate that a significant mode of precipitation variability is the persistence of precipitation events. We find that the largest region‐wide trend in wet persistence (i.e., the probability of precipitation on a given day, given precipitation on the preceding day) occurs in June (+0.9% probability per decade over all stations). We also find that the study region is experiencing an increase in the magnitude of high‐intensity precipitation events. The largest increases in the 95th percentile of daily precipitation occurred in April with a trend of +0.7 mm/d/decade. We discuss the implications of the observed precipitation signals for watershed hydrology and flood risk.
Key Points
Precipitation in the northeastern United States is becoming more persistent
Precipitation in the northeastern United States is becoming more intense
Observed trends constitute an important hydrological impact of climate change
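The wet-persistence parameter analyzed above is the Markov chain transition probability P(wet today | wet yesterday). A minimal estimate from a daily precipitation series; the 0.1 mm wet-day threshold is an illustrative assumption, not necessarily the study's definition:

```python
import numpy as np

def wet_persistence(precip, wet_threshold=0.1):
    """Estimate P(wet today | wet yesterday) from daily precipitation (mm)."""
    wet = np.asarray(precip) >= wet_threshold
    prev, curr = wet[:-1], wet[1:]            # consecutive-day pairs
    n_wet_prev = prev.sum()
    # Fraction of wet days that are followed by another wet day.
    return (prev & curr).sum() / n_wet_prev if n_wet_prev else np.nan

series = [0.0, 5.2, 3.1, 0.0, 0.0, 12.4, 0.3, 0.0]  # toy daily totals (mm)
p_ww = wet_persistence(series)  # → 0.5 for this toy series
```

Computing this statistic per station and per month, then regressing it against time, is the kind of procedure that yields a trend such as the +0.9% probability per decade reported for June.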
Enhancing Quality of Life (QOL) has long been an explicit or implicit goal for individuals, communities, nations, and the world. But defining QOL and measuring progress toward meeting this goal have been elusive. Diverse “objective” and “subjective” indicators across a range of disciplines and scales, and recent work on subjective well-being (SWB) surveys and the psychology of happiness have spurred interest. Drawing from multiple disciplines, we present an integrative definition of QOL that combines measures of human needs with subjective well-being or happiness. QOL is proposed as a multi-scale, multi-dimensional concept that contains interacting objective and subjective elements. We relate QOL to the opportunities that are provided to meet human needs in the forms of built, human, social and natural capital (in addition to time) and the policy options that are available to enhance these opportunities. Issues related to defining, measuring, and scaling these concepts are discussed, and a research agenda is elaborated. Policy implications include strategies for investing in opportunities to maximize QOL enhancement at the individual, community, and national scales.
The shallow and deep hypothesis suggests that stream concentration‐discharge (CQ) relationships are shaped by distinct source waters from different depths. Under this hypothesis, baseflows are typically dominated by groundwater and mostly reflect groundwater chemistry, whereas high flows are typically dominated by shallow soil water and mostly reflect soil water chemistry. Aspects of this hypothesis draw on applications like end member mixing analyses and hydrograph separation, yet direct data support for the hypothesis remains scarce. This work tests the shallow and deep hypothesis using co‐located measurements of soil water, groundwater, and streamwater chemistry at two intensively monitored sites, the W‐9 catchment at Sleepers River (Vermont, United States) and the Hafren catchment at Plynlimon (Wales). At both sites, depth profiles of subsurface water chemistry and stream CQ relationships for the 10 solutes analyzed are broadly consistent with the hypothesis. Solutes that are more abundant at depth (e.g., calcium) exhibit dilution patterns (concentration decreases with increasing discharge). Conversely, solutes enriched in shallow soils (e.g., nitrate) generally exhibit flushing patterns (concentration increases with increasing discharge). The hypothesis may hold broadly true for catchments that share such biogeochemical stratifications in the subsurface. Soil water and groundwater chemistries were estimated from high‐ and low‐flow stream chemistries with average relative errors ranging from 24% to 82%. This indicates that streams mirror subsurface waters: stream chemistry can be used to infer scarcely measured subsurface water chemistry, especially where there are distinct shallow and deep end members.
Key Points
Collated depth profiles of subsurface water chemistry and stream chemistry provide direct data support for the shallow and deep hypothesis
Stream chemistry at high and low discharge can be used to approximate shallow soil water and deep groundwater chemistry, respectively
The shallow‐versus‐deep concentration contrast (Cratio) can predict the concentration‐discharge (CQ) power law slope (b)
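The CQ power-law slope b referenced in the key points is the exponent of C = aQ^b, conventionally fit in log-log space. A minimal sketch with synthetic data (the fitting details of the study itself may differ):

```python
import numpy as np

def cq_slope(C, Q):
    """Least-squares fit of log C = log a + b log Q; returns the slope b.
    b < 0 indicates a dilution pattern, b > 0 a flushing pattern."""
    b, _log_a = np.polyfit(np.log(Q), np.log(C), 1)
    return b

Q = np.array([1.0, 2.0, 4.0, 8.0])   # discharge (arbitrary units)
C = 10.0 * Q ** -0.3                 # synthetic dilution pattern
b = cq_slope(C, Q)                   # recovers the exponent, approx. -0.3
```

Under the shallow and deep hypothesis, b reflects the contrast between shallow and deep end-member concentrations, which is why the Cratio can serve as a predictor of the fitted slope.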