In many practical applications of supervised learning the task involves the prediction of multiple target variables from a common set of input variables. When the prediction targets are binary the task is called multi-label classification, while when the targets are continuous the task is called multi-target regression. In both tasks, target variables often exhibit statistical dependencies and exploiting them in order to improve predictive accuracy is a core challenge. A family of multi-label classification methods addresses this challenge by building a separate model for each target on an expanded input space where other targets are treated as additional input variables. Despite the success of these methods in the multi-label classification domain, their applicability and effectiveness in multi-target regression has not been studied until now. In this paper, we introduce two new methods for multi-target regression, called stacked single-target and ensemble of regressor chains, by adapting two popular multi-label classification methods of this family. Furthermore, we highlight an inherent problem of these methods, namely a discrepancy between the values of the additional input variables at training and at prediction time, and develop extensions that use out-of-sample estimates of the target variables during training in order to tackle this problem. The results of an extensive experimental evaluation carried out on a large and diverse collection of datasets show that, when the discrepancy is appropriately mitigated, the proposed methods attain consistent improvements over the independent regressions baseline. Moreover, two versions of ensemble of regressor chains perform significantly better than four state-of-the-art methods, including regularization-based multi-task learning methods and a multi-objective random forest approach.
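The mechanics of a regressor chain, and the train/test discrepancy the abstract highlights, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names are invented here and a trivial 1-nearest-neighbour regressor stands in for whatever base learner the method actually uses.

```python
class OneNN:
    """Trivial 1-nearest-neighbour regressor, a stand-in base model."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        def nearest(x):
            j = min(range(len(self.X)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(self.X[i], x)))
            return self.y[j]
        return [nearest(x) for x in X]

def fit_chain(X, Y, order, base=OneNN):
    """The model for target order[p] sees X plus the *true* values of the
    targets earlier in the chain -- the source of the discrepancy, since
    at prediction time those values must be estimates."""
    models = []
    for p, t in enumerate(order):
        Xa = [x + [Y[i][u] for u in order[:p]] for i, x in enumerate(X)]
        models.append(base().fit(Xa, [row[t] for row in Y]))
    return models

def predict_chain(models, X, order):
    """At prediction time, feed each later model the chain's own
    estimates of the earlier targets."""
    preds = [[0.0] * len(order) for _ in X]
    for p, (t, m) in enumerate(zip(order, models)):
        Xa = [x + [preds[i][u] for u in order[:p]] for i, x in enumerate(X)]
        for i, yhat in enumerate(m.predict(Xa)):
            preds[i][t] = yhat
    return preds
```

An ensemble of regressor chains trains several such chains over random permutations of `order` and averages their predictions; the extensions described above would replace the true values `Y[i][u]` inside `fit_chain` with out-of-sample (e.g. cross-validated) estimates.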
• Three novel multi-target support vector regressor models are proposed.
• The first builds an independent single-target support vector regressor for each output variable.
• The second builds an ensemble of random chains using the ERCC methodology.
• The third calculates the correlations among the targets and builds a single regressor model over the resulting chain.
• Results show that a single maximum correlation chain model obtains better performance than the ensemble of random chains while learning only one model.
Multi-target regression is a challenging task that consists of creating predictive models for problems with multiple continuous target outputs. Despite the increasing attention paid to multi-label classification, there are fewer studies concerning multi-target (MT) regression. The current leading MT models are based on ensembles of regressor chains, where random, differently ordered chains of the target variables are created and used to build separate regression models, using the previous target predictions in the chain. The challenges of building MT models stem from trying to capture and exploit possible correlations among the target variables during training. This paper presents three multi-target support vector regression models. The first involves building independent, single-target Support Vector Regression (SVR) models for each output variable. The second builds an ensemble of random chains using the first method as a base model. The third calculates the targets' correlations and forms a maximum correlation chain, which is used to build a single chained support vector regression model, improving prediction performance while reducing computational complexity. The experimental study evaluates and compares the performance of the three approaches with seven other state-of-the-art multi-target regressors on 24 multi-target datasets. The experimental results are then analyzed using non-parametric statistical tests. The results show that the maximum correlation SVR approach improves on the performance of the ensemble of random chains.
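One plausible reading of the maximum-correlation-chain idea can be sketched as follows: compute pairwise correlations between targets, start from the target with the largest total absolute correlation, and greedily append the target most correlated with the one just placed. The function names are invented here and the exact ordering rule in the paper may differ.

```python
import math
import statistics

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def max_correlation_order(Y):
    """Greedy chain order over the columns (targets) of Y: seed with the
    target whose total |correlation| to the others is largest, then keep
    appending the remaining target most correlated with the last one."""
    d = len(Y[0])
    cols = [[row[t] for row in Y] for t in range(d)]
    tot = [sum(abs(pearson(cols[t], cols[u])) for u in range(d) if u != t)
           for t in range(d)]
    order = [max(range(d), key=tot.__getitem__)]
    remaining = set(range(d)) - set(order)
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda u: abs(pearson(cols[last], cols[u])))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

The appeal over an ensemble of random chains is visible in the cost: a single chained model is trained over one data-driven order instead of many random ones.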
Air pollution is considered one of the biggest threats to ecological systems and human existence. Therefore, air quality monitoring has become a necessity in urban and industrial areas. Recently, the emergence of machine learning techniques has justified the application of statistical approaches to environmental modeling, especially in air quality forecasting. In this context, we propose a novel feature ranking method, termed Ensemble of Regressor Chains-guided Feature Ranking (ERCFR), to forecast multiple air pollutants simultaneously over two cities. This approach combines one of the most powerful ensemble methods for multi-target regression problems (Ensemble of Regressor Chains) with the Random Forest permutation importance measure. Feature selection thus allowed the model to obtain the best results with a restricted subset of features. The experimental results reveal the superiority of the proposed approach over other state-of-the-art methods, although some caution is needed to improve runtime performance and to reduce its sensitivity to extreme and outlier values.
• Forecasting multiple air pollutant concentrations simultaneously.
• The combination of a multi-target regression method and the Random Forest paradigm.
• The proposed method ensures better performance in air quality forecasting.
• A state-of-the-art deep tree-ensemble method for multi-target regression and multi-label classification.
• Low-dimensional tree-embeddings are more representative than output features in deep-forest architectures.
• A stopping (pruning) criterion to determine the optimal number of layers, as well as mechanisms to avoid overfitting.
• An extensive evaluation on 41 datasets, comparing our approach to state-of-the-art methods.
Recently, deep neural networks have advanced the state of the art in various scientific fields and provided solutions to long-standing problems across multiple application domains. Nevertheless, they also suffer from weaknesses, since their optimal performance depends on massive amounts of training data and the tuning of a large number of parameters. As a countermeasure, several deep-forest methods have recently been proposed as efficient, low-scale alternatives. These approaches, however, simply employ label classification probabilities as induced features and primarily focus on traditional classification and regression tasks, leaving multi-output prediction under-explored. Moreover, recent work has demonstrated that tree-embeddings are highly representative, especially in structured output prediction. In this direction, we propose a novel deep tree-ensemble (DTE) model, where every layer enriches the original feature set with a representation-learning component based on tree-embeddings. In this paper, we specifically focus on two structured output prediction tasks, namely multi-label classification and multi-target regression. We conducted experiments on multiple benchmark datasets, and the obtained results confirm that our method outperforms state-of-the-art methods in both tasks.
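The layer-wise feature enrichment described above can be illustrated with a toy cascade. This is only a structural sketch under strong simplifications: single-feature decision stumps stand in for trained forests, and their one-bit "leaf index" plays the role of the tree-embeddings; the DTE model itself learns far richer embeddings.

```python
import random

def stump_embedding(X, stumps):
    """'Leaf index' (0 or 1) of each stump, a one-bit stand-in for a
    leaf-based tree-embedding; stumps are (feature, threshold) pairs."""
    return [[1.0 if x[f] > t else 0.0 for f, t in stumps] for x in X]

def cascade(X, n_layers, stumps_per_layer, rng):
    """Each layer appends the embedding of the current representation to
    the running feature set, as in a deep-forest cascade."""
    rep = [list(x) for x in X]
    for _ in range(n_layers):
        d = len(rep[0])
        stumps = [(rng.randrange(d), 0.5) for _ in range(stumps_per_layer)]
        rep = [r + e for r, e in zip(rep, stump_embedding(rep, stumps))]
    return rep
```

Each pass widens the representation by `stumps_per_layer` embedding features while keeping the original inputs, which is the cascading pattern the abstract refers to; a stopping criterion would decide `n_layers` from validation performance rather than fixing it in advance.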
Metric Learning for Multi-Output Tasks
Liu, Weiwei; Xu, Donna; Tsang, Ivor W.
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 2, February 2019.
Journal Article · Peer reviewed
Multi-output learning, with the task of simultaneously predicting multiple outputs for an input, has increasingly attracted interest from researchers due to its wide application. The k nearest neighbor (kNN) algorithm is one of the most popular frameworks for handling multi-output problems. The performance of kNN depends crucially on the metric used to compute the distance between different instances. However, our experimental results show that existing advanced metric learning techniques cannot provide an appropriate distance metric for multi-output tasks. This paper systematically studies how to efficiently learn an appropriate distance metric for multi-output problems with provable guarantees. In particular, we present a novel large-margin metric learning paradigm for multi-output tasks, which projects both the input and the output into the same embedding space and then learns a distance metric to discover output dependency, such that instances with very different multiple outputs are moved far apart. Several strategies are then proposed to speed up training and testing. Moreover, we study the generalization error bound of our method for three learning tasks, which shows that our method converges to the optimal solutions. Experiments on three multi-output learning tasks (multi-label classification, multi-target regression, and multi-concept retrieval) validate the effectiveness and scalability of the proposed method.
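The role the learned metric plays inside kNN can be shown with a small sketch. This is not the paper's large-margin formulation: the function name is invented, and a fixed diagonal weight vector `w` merely stands in for whatever metric the learning procedure would produce.

```python
def knn_multi_output(Xtr, Ytr, x, k, w):
    """kNN prediction for a multi-output task: average the full output
    vectors of the k nearest training points, where nearness is measured
    by a diagonally weighted squared distance (w plays the role of a
    learned metric)."""
    def dist(i):
        return sum(wj * (a - b) ** 2 for wj, a, b in zip(w, Xtr[i], x))
    nearest = sorted(range(len(Xtr)), key=dist)[:k]
    m = len(Ytr[0])
    return [sum(Ytr[i][t] for i in nearest) / k for t in range(m)]
```

The point of metric learning is precisely that a better `w` (or a full Mahalanobis matrix) changes which neighbours are selected, and therefore which output vectors are averaged.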
Multi-Target Regression via Robust Low-Rank Learning
Zhen, Xiantong; Yu, Mengyang; He, Xiaofei
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 2, February 2018.
Journal Article · Peer reviewed · Open access
Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis, while great challenges arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR) which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR can explicitly encode inter-target correlations in a structure matrix by matrix elastic nets (MEN); the MMR can work in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; the MMR can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression which is endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that our MMR can achieve consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.
Unattributed-identity multi-target regression (UIMTR) is defined as a multi-target regression problem in which the identity of the target and predictor variables is not predefined. It is a problem that arises in several real-world applications, for example, when historical data is available from a set of devices but real-time data can only be requested from a subset of them (so-called sentinels). Estimating the real-time status of the non-sentinels then requires multi-target regression models, so attributing the identity of the real-time communicators (sentinels), i.e., the predictor variables, is a critical aspect. Moreover, unlike classical feature selection problems, the set of target variables is determined after applying the selection methods and not before, so some adaptations are necessary. We introduce three novel methods to solve the UIMTR problem and, after extensive evaluation, we demonstrate: (i) the feasibility of the methods, (ii) the usefulness of the approach, and (iii) the improvement over other classical techniques. The results have been evaluated from three perspectives: (i) the quality of the predictions, (ii) the stability of the methods, and (iii) the execution time.
Multi-target regression has always been a challenging task in engineering applications; in some scenarios it is easy to encounter problems such as low accuracy and inadequate robustness. To address these issues, an ensemble strategy considering correlations, named Ensemble-Adaptive Tree-based Correlation Chains, is proposed. Specifically, a Follow-up Correlation Chaining strategy is suggested that quantifies the relationships among targets by ranking the L1 norms of their correlations. Compared with other related strategies, it allows these relationships to be represented through a single regressor chain. Under the proposed framework, the ensemble strategy integrates ten chains, where each chain adaptively updates the sample weights during training; this process employs out-of-sample observations with new convergence criteria. Furthermore, eXtreme Gradient Boosting is introduced as the base regressor to enhance the overall accuracy of the method. Finally, the proposed method is validated on 25 multi-target datasets and a lightweight design of a high-speed rail bogie. The results demonstrate superior accuracy and robustness compared to other state-of-the-art methods. In general, this study provides reliable predictions for specific scenarios and has practical significance for addressing related problems.
• Emotion recognition technology is a very promising idea for decision systems.
• Environment assessment based on isolated attributes is unreliable.
• EMOTIF provides a new quality of information in the form of Fused Features.
• EMOTIF enables assessment of phenomena diversity based on emotion similarity.
• The synergy of ER, CV, and NN allows useful information about the environment to be acquired.
It is widely known that decisions concern objects representing a set of characteristics whose importance is unique to every decision-maker. However, including this aspect in analyses is a challenge for many researchers. Classically applied information analysis methods fail to consider the synergy of these characteristics and ignore the impact of behavioural aspects that are inseparable from the decision-maker. This study proposes a solution based on emotion detection technology using Computer Vision and Neural Networks. The presented approach comprises three main components: the detection of emotions using a CNN, which acquires the input vector elements for the model evaluating space features; an MLP for the assessment of anthropogenic and natural space features; and the verification of the utilitarian nature, usability, and suitability for use of the developed solution. The novelty of the paper lies in demonstrating that the assessment of the impact of an object's features is a synergistic, inseparable conglomerate (Fusion Features), which indicates the greater usability of the results of such studies in the analysis of a particular phenomenon, structure, or system.
• Explainable ML models for polymerization modeling are proposed.
• Data for training the ML models are provided by an in-house kinetic Monte Carlo simulator.
• An ML-based approach for reverse engineering of polymerization processes is proposed.
• The proposed ML models allow the creation of polymers with tailored properties.
Due to the complex polymerization technique and the statistical composition of the polymer, tailoring its characteristics is a challenging task. Modeling the polymerizations can contribute deeper insights into the process. This study applies state-of-the-art machine learning (ML) methods for modeling and reverse engineering of polymerization processes. ML methods (random forest, XGBoost and CatBoost) are trained on data sets generated by an in-house-developed kinetic Monte Carlo simulator. The applied ML models predict monomer concentration, average molar masses and full molar mass distributions with excellent accuracy (R² > 0.96). Reverse engineering results delivering the polymerization recipe for a targeted molar mass distribution are less accurate, but only minor deviations from the targeted molar mass distribution are seen. The influences of the input variables in the ML models, obtained by explainability methods, correspond to expert expectations.