A principled methodology for solving imbalanced binary classification problems has recently been introduced. It makes it possible to obtain high-performance designs while avoiding the degradation risks that other ...procedures suffer from. The corresponding paper, Benítez-Buenache et al. (2019), shows evidence of these facts by applying direct versions, i.e., using just one of the possible rebalancing techniques and applying full rebalancing.
In this contribution, we extend the above study to maximize the performance of the resulting designs. To this end, we combine principled techniques in order to benefit from their different characteristics. The combination weights, as well as the rebalancing degree, are selected by means of a simple cross-validation search. A number of experiments with different kinds of databases show significant performance improvements. At the same time, the database characteristics that limit these improvements, such as small size and noisy samples, are identified.
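To make the cross-validation search concrete, the sketch below is only an illustration, not the paper's procedure: it combines a rebalanced copy of the training set with the original one through a weight alpha, trains a logistic-regression classifier, and selects the (alpha, rebalancing degree) pair by stratified cross-validation. The oversampling routine, the use of AUC as the selection criterion, and all function names are assumptions introduced for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def oversample_minority(X, y, degree, rng):
    """Randomly replicate minority samples; degree=0 keeps the data as is,
    degree=1 fully balances the classes."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    n_extra = int(degree * (len(majority) - len(minority)))
    extra = rng.choice(minority, size=n_extra, replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]


def fit_combination(X, y, alpha, degree, rng):
    """Illustrative convex combination of two 'views' of the training set:
    a rebalanced copy (weight alpha) and the original data (weight 1 - alpha)."""
    Xa, ya = oversample_minority(X, y, degree, rng)
    Xc = np.vstack([Xa, X])
    yc = np.concatenate([ya, y])
    w = np.concatenate([np.full(len(ya), alpha), np.full(len(y), 1.0 - alpha)])
    return LogisticRegression(max_iter=1000).fit(Xc, yc, sample_weight=w)


def cv_search(X, y, alphas, degrees, n_splits=5, seed=0):
    """Pick the (combination weight, rebalancing degree) pair with best CV AUC."""
    rng = np.random.default_rng(seed)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best, best_auc = None, -np.inf
    for alpha in alphas:
        for degree in degrees:
            aucs = []
            for tr, te in cv.split(X, y):
                clf = fit_combination(X[tr], y[tr], alpha, degree, rng)
                aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
            if np.mean(aucs) > best_auc:
                best, best_auc = (alpha, degree), float(np.mean(aucs))
    return best, best_auc
```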
• Combining different (neutral) principled rebalancing techniques is proposed.
• The combination degree and the rebalancing intensity are found by cross-validation.
• Extensive experiments support the effectiveness of the proposal.
• Shallow and deep neural networks and ensembles are used in the experiments.
• The database characteristics that reduce the performance of the combinations are detected.
• A principled method for imbalanced classification is presented.
• The iff conditions for principled re-balancing are established.
• Informed two-step re-balancing techniques are introduced.
• Extensive ...examples support the analysis.
This contribution proves that neutral re-balancing mechanisms, which do not alter the likelihood ratio, and training discriminative machines with Bregman divergences as surrogate costs are necessary and sufficient conditions for consistently estimating the likelihood ratio in imbalanced binary classification problems. These two conditions permit the estimation of the theoretical Neyman–Pearson operating characteristic of the problem under study. In practice, a classifier operates at a certain working point corresponding to, for example, a given false positive rate. This perspective allows the introduction of an additional principled procedure to improve classification performance by means of a second design step in which more weight is assigned to the appropriate training samples. The paper includes a number of examples that demonstrate the performance capabilities of the methods presented, and concludes with a discussion of relevant research directions and open problems in the area.
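As an illustration of these two ingredients, the minimal sketch below makes simplifying assumptions: random oversampling of the minority class plays the role of a neutral re-balancing mechanism (it leaves the class-conditional densities, and hence the likelihood ratio, unchanged), the cross-entropy cost of logistic regression serves as the Bregman-divergence surrogate, and the working point is set by thresholding the scores at a target false positive rate. This is not the paper's exact design, and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def neutral_oversample(X, y, rng):
    """Replicate minority samples uniformly at random until the classes balance."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
    idx = np.concatenate([majority, minority, extra])
    return X[idx], y[idx]


def threshold_for_fpr(scores_neg, target_fpr):
    """Score threshold whose empirical false positive rate is about target_fpr."""
    return np.quantile(scores_neg, 1.0 - target_fpr)


rng = np.random.default_rng(0)
# Toy imbalanced data: 1000 negatives, 50 positives.
X = np.vstack([rng.normal(0.0, 1.0, (1000, 2)), rng.normal(1.5, 1.0, (50, 2))])
y = np.concatenate([np.zeros(1000), np.ones(50)])

Xb, yb = neutral_oversample(X, y, rng)
clf = LogisticRegression(max_iter=1000).fit(Xb, yb)   # cross-entropy surrogate

scores = clf.predict_proba(X)[:, 1]
thr = threshold_for_fpr(scores[y == 0], target_fpr=0.05)
y_hat = (scores >= thr).astype(int)                   # operate at roughly 5% FPR
```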
Variational quantum algorithms (VQA) have emerged as a promising quantum alternative for solving optimization and machine learning problems using parameterized quantum circuits (PQCs). The design of ...these circuits influences the ability of the algorithm to efficiently explore the solution space and converge to better solutions. Choosing an appropriate circuit topology, gate set, and parameterization scheme is decisive for achieving good performance. Moreover, this choice is not only problem-dependent: the quantum hardware used also has a significant impact on the results. Therefore, we present BPQCO, a Bayesian Optimization-based strategy to search for optimal PQCs adapted to the problem to be solved and to the characteristics and limitations of the chosen quantum hardware. To this end, we experimentally demonstrate the influence of the circuit design on the performance obtained for two classification problems (a synthetic dataset and the well-known Iris dataset), focusing on the design of the circuit ansatz. In addition, we study the degradation of the obtained circuits in the presence of noise when simulating real quantum computers. To mitigate the effect of noise, two alternative optimization strategies based on the characteristics of the quantum system are proposed. The results obtained confirm the relevance of the presented approach and allow its adoption in further work based on the use of PQCs.
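The sketch below shows how such a search could be wired up with an off-the-shelf Bayesian-optimization library (scikit-optimize); it is not the BPQCO implementation. The search space (circuit depth, entanglement pattern, rotation gate) and the evaluate_ansatz objective are assumptions, and the objective returns a synthetic proxy only so the loop runs end to end. In a real setup it would build and train the PQC on the target (possibly noisy) simulator or hardware and return the validation error.

```python
from skopt import gp_minimize
from skopt.space import Categorical, Integer

# Illustrative ansatz hyperparameter space (an assumption, not BPQCO's).
search_space = [
    Integer(1, 6, name="n_layers"),                        # ansatz depth
    Categorical(["linear", "circular", "full"], name="entanglement"),
    Categorical(["rx", "ry", "rz"], name="rotation_gate"),
]


def evaluate_ansatz(params):
    """Hypothetical objective: in practice, build the PQC described by `params`,
    train the variational classifier on the chosen (noisy) simulator or hardware,
    and return its validation error. A cheap synthetic proxy is returned here
    only so the script runs end to end."""
    n_layers, entanglement, rotation_gate = params
    depth_penalty = 0.02 * n_layers                        # deeper circuits accumulate more noise
    ent_bonus = {"linear": 0.00, "circular": 0.02, "full": 0.04}[entanglement]
    gate_bonus = 0.01 if rotation_gate == "ry" else 0.0
    return 0.30 + depth_penalty - ent_bonus - gate_bonus   # dummy "validation error"


# Gaussian-process Bayesian optimization guides the circuit-design search.
result = gp_minimize(evaluate_ansatz, search_space, n_calls=30, random_state=0)
print("best ansatz configuration:", result.x, "estimated error:", result.fun)
```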