An adaptive weight estimation approach is proposed to provide robust latent ability estimation in computerized adaptive testing (CAT) with response revision. This approach assigns different weights to each distinct response to the same item when response revision is allowed in CAT. Two types of weight estimation procedures, nonfunctional and functional weights, are proposed to determine the weights adaptively based on the compatibility of each revised response with the assumed statistical model in relation to the remaining observations. The application of this estimation approach to a data set collected from a large-scale multistage adaptive test demonstrates that the method reveals more information about the test taker's latent ability by using the full valid response path rather than only the very last response. Limited simulation studies were conducted to evaluate the proposed ability estimation method and to compare it with several other estimation procedures in the literature. Results indicate that the proposed approach provides robust estimation in two test-taking scenarios.
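To make the idea concrete, here is a minimal sketch of weighted ability estimation under an assumed 2PL item response model: every scored response, including an abandoned initial answer to a revised item, enters the log-likelihood with its own weight. The item parameters and fixed weights below are illustrative placeholders; the paper's nonfunctional and functional procedures determine the weights adaptively.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def weighted_theta_estimate(a, b, responses, weights):
    """Maximize the weighted log-likelihood over theta.

    a, b      : item discrimination/difficulty, one entry per response
    responses : 0/1 scored responses (a revision appears as an extra
                entry for the same item)
    weights   : weight assigned to each response
    """
    def neg_wll(theta):
        p = p_correct(theta, a, b)
        ll = responses * np.log(p) + (1 - responses) * np.log(1 - p)
        return -np.sum(weights * ll)
    return minimize_scalar(neg_wll, bounds=(-4, 4), method="bounded").x

# Item 2 was answered twice (initial response, then a revision); the
# revision receives more weight than the abandoned initial response.
a = np.array([1.2, 0.8, 0.8, 1.5])
b = np.array([-0.5, 0.3, 0.3, 1.0])
responses = np.array([1, 0, 1, 0])
weights = np.array([1.0, 0.3, 0.7, 1.0])
print(weighted_theta_estimate(a, b, responses, weights))
```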
Diagnosis prediction aims to forecast the future health status of patients from their historical electronic health records (EHR), an important yet challenging task in healthcare informatics. Existing diagnosis prediction approaches mainly employ recurrent neural networks (RNN) with attention mechanisms to make predictions. However, these approaches ignore the importance of code descriptions, i.e., the medical definitions of diagnosis codes. We believe that taking diagnosis code descriptions into account can help state-of-the-art models not only learn meaningful code representations but also improve predictive performance, especially when the EHR data are insufficient.
We propose a simple but general diagnosis prediction framework with two basic components: diagnosis code embedding and a predictive model. To learn interpretable code embeddings, we apply convolutional neural networks (CNN) to model the medical descriptions of diagnosis codes extracted from online medical websites. The learned medical embedding matrix is used to embed the input visits into vector representations, which are fed into the predictive model. Any existing diagnosis prediction approach (referred to as the base model) can be cast into the proposed framework as the predictive model (called the enhanced model).
We conduct experiments on two real medical datasets: the MIMIC-III dataset and the Heart Failure claim dataset. Experimental results show that the enhanced diagnosis prediction approaches significantly improve the prediction performance. Moreover, we validate the effectiveness of the proposed framework with insufficient EHR data. Finally, we visualize the learned medical code embeddings to show the interpretability of the proposed framework.
Given the historical visit records of a patient, the proposed framework is able to predict the next visit information by incorporating medical code descriptions.
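A minimal PyTorch sketch of the described pipeline, assuming a GRU as the base model: a CNN encodes each code's textual description into an embedding, visits are embedded through the resulting matrix, and the base model predicts the next visit's codes. All class names and dimensions are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class CodeDescriptionEncoder(nn.Module):
    """CNN over the words of each diagnosis-code description, producing one
    embedding per code (the 'medical embedding matrix' in the abstract)."""
    def __init__(self, vocab_size, word_dim=100, code_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim, padding_idx=0)
        self.conv = nn.Conv1d(word_dim, code_dim, kernel_size=3, padding=1)

    def forward(self, desc_tokens):             # (n_codes, max_desc_len)
        x = self.word_emb(desc_tokens)          # (n_codes, len, word_dim)
        x = self.conv(x.transpose(1, 2))        # (n_codes, code_dim, len)
        return torch.relu(x).max(dim=2).values  # max-pool over words

class EnhancedPredictor(nn.Module):
    """Embeds each visit as the sum of its codes' description embeddings,
    then feeds the visit sequence into a GRU base model to predict the
    codes of the next visit."""
    def __init__(self, encoder, n_codes, code_dim=128, hidden=256):
        super().__init__()
        self.encoder = encoder
        self.rnn = nn.GRU(code_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_codes)

    def forward(self, visits, desc_tokens):
        # visits: (batch, n_visits, n_codes) multi-hot
        E = self.encoder(desc_tokens)           # (n_codes, code_dim)
        v = visits @ E                          # (batch, n_visits, code_dim)
        h, _ = self.rnn(v)
        return self.out(h[:, -1])               # logits over next-visit codes
```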
Truth discovery algorithms have been widely applied to identify true claims from conflicting information provided by multiple sources. In general, they conduct an iterative procedure that estimates source reliability degrees as weights and infers the true claims via weighted voting. However, there is little prior work providing theoretical analysis of the convergence of truth discovery methods. In this paper, we formulate the truth discovery task as a joint maximum likelihood estimation (JMLE) problem over the unknown source reliabilities and true claims. Within this framework, we propose a Unified Truth Discovery (UTD) algorithm to compute the numerical solution to the JMLE for the truths and source reliabilities. Under mild conditions, we prove the consistency of the JMLE and the convergence of the proposed UTD algorithm. In addition, the proposed UTD algorithm includes many existing truth discovery algorithms as special cases, which guarantees that our theoretical results apply to those algorithms as well. We further conduct extensive experiments on synthetic data sets as well as five real-world data sets; the results of these numerical analyses support the theoretical findings for the proposed UTD algorithm and the other state-of-the-art truth discovery algorithms.
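The generic iterative procedure the abstract refers to can be sketched in a few lines. The toy below alternates weighted voting for categorical truths with a log-odds reweighting of sources; it illustrates the family of schemes that UTD unifies, not the UTD algorithm itself.

```python
import numpy as np

def truth_discovery(claims, n_iter=50, eps=1e-8):
    """Minimal iterative truth discovery for categorical claims.

    claims[s][o] is the value source s asserts for object o (sources may
    cover different objects). Alternates (i) weighted voting for truths and
    (ii) reweighting each source by how often it agrees with the truths.
    """
    n_sources = len(claims)
    objects = {o for s in claims for o in s}
    w = np.ones(n_sources)
    truths = {}
    for _ in range(n_iter):
        # Truth update: weighted vote per object.
        for o in objects:
            votes = {}
            for s, claim in enumerate(claims):
                if o in claim:
                    votes[claim[o]] = votes.get(claim[o], 0.0) + w[s]
            truths[o] = max(votes, key=votes.get)
        # Weight update: log-odds of each source's agreement rate.
        for s, claim in enumerate(claims):
            agree = sum(truths[o] == v for o, v in claim.items())
            rate = (agree + eps) / (len(claim) + 2 * eps)
            w[s] = max(np.log(rate / (1 - rate) + eps), eps)
    return truths, w

claims = [{"capital_FR": "Paris", "capital_DE": "Berlin"},
          {"capital_FR": "Lyon",  "capital_DE": "Berlin"},
          {"capital_FR": "Paris"}]
print(truth_discovery(claims))
```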
The recent proliferation of human-carried mobile devices has given rise to mobile crowd sensing (MCS) systems that outsource the collection of sensory data to the public crowd equipped with various mobile devices. A fundamental issue in such systems is how to effectively incentivize worker participation. However, rather than being an isolated module, the incentive mechanism usually interacts with other components that may affect its performance, such as the data aggregation component that aggregates workers' data and the data perturbation component that protects workers' privacy. Therefore, departing from the past literature, we capture this interactive effect and propose INCEPTION, a novel MCS system framework that integrates an incentive mechanism, a data aggregation mechanism, and a data perturbation mechanism. Specifically, its incentive mechanism selects workers who are more likely to provide reliable data and compensates their costs for both sensing and privacy leakage. Its data aggregation mechanism also incorporates workers' reliability to generate highly accurate aggregated results, and its data perturbation mechanism ensures satisfactory protection of workers' privacy and desirable accuracy of the final perturbed results. We validate the desirable properties of INCEPTION through theoretical analysis as well as extensive simulations.
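As a rough illustration of how the aggregation and perturbation components interact, the sketch below computes a reliability-weighted aggregate of worker readings and publishes a Laplace-perturbed result. It is a toy under assumed scalar readings and a given privacy budget, not INCEPTION's actual mechanisms.

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregate_and_perturb(data, reliability, epsilon=1.0, sensitivity=1.0):
    """Reliability-weighted aggregation followed by Laplace perturbation.

    data        : one scalar reading per selected worker
    reliability : weight reflecting each worker's estimated reliability
    epsilon     : privacy budget (smaller = more noise in the output)
    """
    w = reliability / reliability.sum()
    aggregate = float(np.dot(w, data))             # reliability-weighted mean
    noise = rng.laplace(scale=sensitivity / epsilon)
    return aggregate + noise                       # perturbed published result

data = np.array([21.0, 20.5, 24.0, 20.8])          # e.g., temperature readings
reliability = np.array([0.9, 0.8, 0.2, 0.85])      # low weight for the outlier
print(aggregate_and_perturb(data, reliability))
```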
Charting by machines
Murray, Scott; Xia, Yusen; Xiao, Houping
Journal of Financial Economics, Vol. 153, March 2024
Journal article, peer reviewed
We test the efficient market hypothesis by using machine learning to forecast stock returns from historical performance. These forecasts strongly predict the cross-section of future stock returns. The predictive power holds in most subperiods and is strong among the largest 500 stocks. The forecasting function has important nonlinearities and interactions, is remarkably stable through time, and captures effects distinct from momentum, reversal, and extant technical signals. These findings question the efficient market hypothesis and indicate that technical analysis and charting have merit. We also demonstrate that machine learning models that perform well in optimization continue to perform well out-of-sample.
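A stylized version of this forecasting setup, on synthetic data: trailing monthly returns as features, the next month's return as the target, a gradient-boosted tree as an assumed nonlinear learner, and a chronological split for out-of-sample evaluation. The paper's data, models, and tests are of course far richer.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic panel: for each stock-month, the features are trailing monthly
# returns over the past 12 months; the target is the next month's return.
n_obs, n_lags = 5000, 12
past_returns = rng.normal(0.01, 0.05, size=(n_obs, n_lags))
# Toy data-generating process with a mild short-term reversal effect.
next_return = -0.1 * past_returns[:, -1] + rng.normal(0, 0.05, n_obs)

split = 4000  # chronological train/test split, no look-ahead
model = GradientBoostingRegressor(max_depth=3, n_estimators=200)
model.fit(past_returns[:split], next_return[:split])

forecast = model.predict(past_returns[split:])
ic = np.corrcoef(forecast, next_return[split:])[0, 1]  # information coefficient
print(f"out-of-sample IC: {ic:.3f}")
```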
In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). In such an environment, many machine learning problems can be reformulated as a consensus optimization problem, which consists of an objective and constraint terms split into N parts (each corresponding to a node). Such a problem can be solved efficiently in a distributed manner via the Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., that the data from all nodes are equally informative for estimating the global model parameters. As a consequence, they may produce inaccurate estimates in the presence of nodes with low QoI. To overcome this challenge, we propose a novel consensus optimization framework for distributed machine learning that incorporates this crucial metric, QoI. Theoretically, we prove that the convergence rate of the proposed framework is linear in the number of iterations, with a tighter upper bound than ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets, owing to its faster convergence rate and higher accuracy.
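A minimal sketch of QoI-weighted consensus optimization, assuming distributed least squares: each node's local loss is scaled by its QoI weight q_i inside an otherwise standard consensus ADMM loop. This is an illustrative formulation, not the paper's exact algorithm or convergence analysis.

```python
import numpy as np

def qoi_consensus_admm(A_list, b_list, qoi, rho=1.0, n_iter=100):
    """Consensus ADMM for distributed least squares, with each node's loss
    scaled by its quality-of-information weight q_i:

        minimize  sum_i q_i ||A_i x_i - b_i||^2   s.t.  x_i = z  for all i
    """
    d, N = A_list[0].shape[1], len(A_list)
    x = [np.zeros(d) for _ in range(N)]
    u = [np.zeros(d) for _ in range(N)]
    z = np.zeros(d)
    for _ in range(n_iter):
        for i in range(N):
            # Local x-update: ridge-like solve toward the consensus variable.
            H = 2 * qoi[i] * A_list[i].T @ A_list[i] + rho * np.eye(d)
            g = 2 * qoi[i] * A_list[i].T @ b_list[i] + rho * (z - u[i])
            x[i] = np.linalg.solve(H, g)
        z = np.mean([x[i] + u[i] for i in range(N)], axis=0)  # consensus step
        for i in range(N):
            u[i] += x[i] - z                                   # dual update
    return z

rng = np.random.default_rng(0)
x_true = rng.normal(size=5)
A_list, b_list = [], []
for noise in (0.1, 0.1, 2.0):          # the third node has low-quality data
    A = rng.normal(size=(50, 5))
    A_list.append(A)
    b_list.append(A @ x_true + rng.normal(0, noise, 50))
z = qoi_consensus_admm(A_list, b_list, qoi=[1.0, 1.0, 0.05])
print(np.linalg.norm(z - x_true))      # small error despite the noisy node
```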
This paper examines video games, a form of digital innovation, and seeks to predict a successful game from the composition of the game development team. Team composition is measured with observable features generated from a graph network built from development team information derived from individual team members' work on previous games. Features include network features, such as team member closeness, success percentile, and failure percentile, and non-network features, such as the number of games previously published by the studio. We propose a novel framework using these features to predict the chance of success for new games with an accuracy above 92%. Further, we investigate which features matter most for prediction and provide model interpretability for practical implementations. We then build a decision support tool that allows video game producers, and associated stakeholders such as investors, to understand how the predictive model reaches its predictions and recommendations. The findings have implications for those seeking to proactively influence digital product performance through graph-network-generated features of team composition, where the features are directly observable, as opposed to features that are more challenging to observe, such as personality.
• Network analysis of team composition can predict digital product performance.
• Observable features can reduce reliance on difficult-to-see traits like personality.
• High prediction accuracy shows network analysis's value in team composition.
• Interpret how networks may help predict and explain digital product performance.
• A reliable approach has been proposed for making team composition decisions.
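A small sketch of how such observable features might be derived, using networkx on an assumed collaboration graph; the feature names and percentile choices here are illustrative, not the paper's exact feature set.

```python
import networkx as nx
import numpy as np

def team_features(G, team, success):
    """Derive observable team-composition features from a collaboration graph.

    G       : graph with one node per developer, edges for past co-work
    team    : list of developer ids on the new game's team
    success : dict mapping developer -> historical success rate in [0, 1]
    """
    sub = G.subgraph(team)
    closeness = np.mean([nx.closeness_centrality(G, n) for n in team])
    density = nx.density(sub) if len(team) > 1 else 0.0
    past = [success.get(n, 0.0) for n in team]
    return {
        "mean_closeness": closeness,        # how central members are overall
        "team_density": density,            # how tightly the team has co-worked
        "success_p50": float(np.median(past)),
        "success_p90": float(np.percentile(past, 90)),
    }

G = nx.Graph([("ann", "bo"), ("bo", "cy"), ("cy", "dee"), ("ann", "cy")])
success = {"ann": 0.8, "bo": 0.4, "cy": 0.7, "dee": 0.2}
print(team_features(G, ["ann", "bo", "cy"], success))
```

Feature vectors like these would then be fed to an ordinary classifier to predict success, which is also where the model-interpretability tooling the abstract mentions would attach.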
Computerized adaptive testing (CAT) is a widely embraced approach for delivering personalized educational assessments, tailoring each test to the real-time performance of individual examinees. Despite its potential advantages, CAT's application in small-scale assessments has been limited due to the complexities of calibrating the item bank with sparse response data and small sample sizes. This study addresses these challenges by developing a two-step item bank calibration strategy that leverages the 1-bit matrix completion method in conjunction with two distinct incomplete pretesting designs. We introduce two novel 1-bit matrix-completion-based imputation methods specifically designed to tackle item calibration in the presence of sparse response data and limited sample sizes. To demonstrate the effectiveness of these approaches, we conduct a comparative assessment against several established item parameter estimation methods capable of handling missing data. This evaluation is carried out through two sets of simulation studies, each featuring different pretesting designs, item bank structures, and sample sizes. Furthermore, we illustrate the practical application of the investigated methods using empirical data collected from small-scale assessments.
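For intuition, here is a minimal sketch of 1-bit matrix completion via logistic matrix factorization: observed 0/1 responses are modeled as the sigmoid of a low-rank matrix, and the fit imputes the unobserved entries of a sparse pretesting design. The paper's two imputation methods and the subsequent calibration step are more elaborate.

```python
import numpy as np

def one_bit_complete(R, mask, rank=3, lr=0.05, n_iter=500, lam=0.1):
    """1-bit matrix completion via logistic matrix factorization: observed
    0/1 responses R (where mask == 1) are modeled as sigmoid(U @ V.T);
    missing entries are imputed from the fitted low-rank matrix."""
    n, m = R.shape
    rng = np.random.default_rng(0)
    U = 0.1 * rng.normal(size=(n, rank))
    V = 0.1 * rng.normal(size=(m, rank))
    for _ in range(n_iter):
        P = 1.0 / (1.0 + np.exp(-U @ V.T))
        G = mask * (P - R)                 # gradient of the masked logistic loss
        # Tuple assignment evaluates both gradients before updating U and V.
        U, V = (U - lr * (G @ V + lam * U),
                V - lr * (G.T @ U + lam * V))
    return 1.0 / (1.0 + np.exp(-U @ V.T)) # imputed response probabilities

rng = np.random.default_rng(1)
true_p = 1 / (1 + np.exp(-rng.normal(size=(40, 1)) @ rng.normal(size=(1, 20))))
R = (rng.random((40, 20)) < true_p).astype(float)
mask = (rng.random((40, 20)) < 0.4).astype(float)  # sparse pretesting design
P_hat = one_bit_complete(R * mask, mask)
print(np.abs(P_hat - true_p)[mask == 0].mean())    # error on unobserved cells
```

Item parameters would then be estimated from the completed response matrix in the second calibration step.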
Design flows are explicit combinations of design transformations, primarily involved in the synthesis, placement, and routing processes, used to accomplish the design of Integrated Circuits (ICs) and Systems-on-Chip (SoC). Mostly, these flows are developed based on expert knowledge. However, due to the large search space of design flows and increasing design complexity, developing Intellectual Property (IP)-specific synthesis flows that provide high Quality of Result (QoR) is extremely challenging. This work presents a fully autonomous framework that artificially produces design-specific synthesis flows without human guidance or baseline flows, using a Convolutional Neural Network (CNN). We demonstrate the framework by successfully designing logic synthesis flows for three large-scale designs.
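One plausible reading of the setup, sketched below: a synthesis flow is encoded as a one-hot sequence of transformations (ABC-style operator names used purely for illustration), and a small CNN scores its expected QoR so that candidate flows can be ranked. The encoding and architecture are assumptions, not the paper's.

```python
import torch
import torch.nn as nn

TRANSFORMS = ["balance", "rewrite", "refactor", "resub", "rewrite -z"]

class FlowQoRNet(nn.Module):
    """CNN that scores a synthesis flow, encoded as a one-hot sequence of
    transformations; a trained scorer can then rank candidate flows."""
    def __init__(self, n_ops=len(TRANSFORMS), flow_len=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_ops, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
            nn.Linear(32, 1),   # predicted QoR (e.g., area or delay)
        )

    def forward(self, flows):   # (batch, n_ops, flow_len) one-hot
        return self.net(flows)

def encode(flow, flow_len=10):
    x = torch.zeros(len(TRANSFORMS), flow_len)
    for t, op in enumerate(flow):
        x[TRANSFORMS.index(op), t] = 1.0
    return x

flow = ["balance", "rewrite", "resub", "balance", "rewrite -z"]
model = FlowQoRNet()
print(model(encode(flow).unsqueeze(0)))  # untrained score, shape (1, 1)
```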
The demand for automatic extraction of true information (i.e., truths) from conflicting multi-source data has soared recently. A variety of truth discovery methods have achieved great success by jointly estimating source reliability and truths. All existing truth discovery methods focus on providing a point estimate of each object's truth, but in many real-world applications confidence interval estimation of the truths is more desirable, since a confidence interval carries richer information. To address this challenge, we propose a novel truth discovery method (ETCIBoot) that constructs confidence interval estimates as well as identifies truths, with bootstrapping techniques integrated into the truth discovery procedure. Owing to the properties of bootstrapping, the estimators obtained by ETCIBoot are more accurate and robust than those of state-of-the-art truth discovery approaches. The proposed framework is further adapted to handle large-scale truth discovery tasks in a distributed paradigm. Theoretically, we prove the asymptotic consistency of the confidence intervals obtained by ETCIBoot. Experimentally, we demonstrate that ETCIBoot is not only effective in constructing confidence intervals but also able to obtain better truth estimates.
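The bootstrap idea can be illustrated for a single object with a continuous truth: resample the sources, recompute a reliability-weighted estimate each time, and take percentile bounds. This toy assumes the reliability weights are given, and is not ETCIBoot itself, which integrates the resampling into the full truth discovery iteration.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_truth_ci(claims, weights, n_boot=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for one object's continuous
    truth, estimated as a reliability-weighted mean of source claims."""
    claims, weights = np.asarray(claims), np.asarray(weights)
    n = len(claims)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample sources with replacement
        w = weights[idx]
        estimates[b] = np.dot(w, claims[idx]) / w.sum()
    point = np.dot(weights, claims) / weights.sum()
    lo, hi = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return point, (lo, hi)

claims = [9.8, 10.1, 10.0, 13.5, 9.9]   # one outlying source
weights = [1.0, 1.0, 1.0, 0.1, 1.0]     # down-weighted by its reliability
print(bootstrap_truth_ci(claims, weights))
```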