Hepatocellular carcinoma (HCC) is the most common liver cancer in adults. Many factors make it difficult to diagnose in humans. In this paper, a novel diagnostic approach based on machine learning techniques is presented. Logistic regression is one of the most classic machine learning models used to solve binary classification problems. In typical implementations, logistic regression coefficients are optimized using iterative methods. Additionally, parameters such as the solver, the regularization parameter C, and the number of iterations of the algorithm should be selected. In our research, we propose a combination of logistic regression with genetic algorithms. We present three experiments showing the fusion of those methods. In the first experiment, we genetically select the logistic regression parameters, while the second experiment extends this approach by including a genetic selection of features. The third experiment presents a novel approach to training the logistic regression model: the genetic selection of coefficients (weights). Our models are tested on the survival prediction of hepatocellular carcinoma based on patient data collected at Coimbra's Hospital and University Center (CHUC), Portugal. The proposed model achieved a classification accuracy of 94.55% and an F1-score of 93.56%. Our results show that machine learning techniques optimized by the proposed concept can bring a new and accurate approach to HCC diagnosis.
•New ML model based on logistic regression and genetic algorithm optimization with a final accuracy of 94.55%.
•Comparison of different paths to logistic regression and evolutionary computation fusion, with genetic optimization of weights.
•Comparison of two biology-inspired algorithms to optimize the ML model: genetic algorithms and particle swarm optimization.
•Assessment of the influence of the missing-value filling method on the final effectiveness of the proposed ML models.
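As an illustration of the coefficient-evolution idea from the third experiment, a minimal stdlib-only sketch on synthetic data follows (the toy data, population size, and mutation scheme are assumptions for illustration, not the paper's actual configuration):

```python
import math
import random

random.seed(0)

# Toy linearly separable data: label is 1 when x0 + x1 > 1 (illustrative only).
X = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(200)]
y = [1 if x0 + x1 > 1.0 else 0 for x0, x1 in X]

def predict(w, x):
    # Logistic model: sigmoid(bias + w1*x0 + w2*x1), thresholded at 0.5.
    z = w[0] + w[1] * x[0] + w[2] * x[1]
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0

def accuracy(w):
    return sum(predict(w, x) == t for x, t in zip(X, y)) / len(X)

def evolve(generations=60, pop_size=30, mut_rate=0.3):
    # Each individual is a full coefficient vector [bias, w1, w2].
    pop = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=accuracy, reverse=True)
        elite = pop[: pop_size // 2]               # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, 3)           # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mut_rate:         # Gaussian mutation of one gene
                i = random.randrange(3)
                child[i] += random.gauss(0, 0.5)
            children.append(child)
        pop = elite + children
    return max(pop, key=accuracy)

best = evolve()
print(round(accuracy(best), 2))
```

Here classification accuracy itself is the fitness function, so no gradient-based solver is involved at all.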
This study is focused on applying genetic algorithms (GAs) to model and band selection in hyperspectral image classification. We use a forensic-inspired data set of seven hyperspectral images with blood and five visually similar substances to test GA-optimised classifiers in two scenarios: when the training and test data come from the same image, and when they come from different images, which is a more challenging task due to significant spectral differences. In our experiments, we compare the GA with classic model optimisation through a grid search (GS). Our results show that GA-based model optimisation can reduce the number of bands and create an accurate classifier that outperforms the GS-based reference models, provided that, during model optimisation, it has access to examples similar to the test data. We illustrate this with experiments highlighting the importance of a validation set.
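The GA-based band selection described above can be sketched as a bit-mask search wrapped around a simple classifier (an illustrative stdlib-only toy: the nearest-centroid classifier, the synthetic 8-band spectra, and the GA settings are assumptions, not the study's actual pipeline):

```python
import random

random.seed(1)
N_BANDS = 8
INFORMATIVE = {1, 4}                                 # only these bands separate the classes

def make_spectrum(label):
    # Noise in every band, class signal only in the informative bands.
    s = [random.gauss(0.5, 0.2) for _ in range(N_BANDS)]
    for b in INFORMATIVE:
        s[b] = random.gauss(0.8 if label else 0.2, 0.1)
    return s

train = [(make_spectrum(l), l) for l in (0, 1) for _ in range(40)]
valid = [(make_spectrum(l), l) for l in (0, 1) for _ in range(40)]

def centroid_accuracy(mask, fit_set, eval_set):
    # Nearest-centroid classifier restricted to the bands selected by the mask.
    bands = [i for i in range(N_BANDS) if mask[i]]
    if not bands:
        return 0.0
    cents = {}
    for lab in (0, 1):
        rows = [s for s, l in fit_set if l == lab]
        cents[lab] = [sum(r[b] for r in rows) / len(rows) for b in bands]
    def classify(s):
        d = {lab: sum((s[b] - c[j]) ** 2 for j, b in enumerate(bands))
             for lab, c in cents.items()}
        return min(d, key=d.get)
    return sum(classify(s) == l for s, l in eval_set) / len(eval_set)

def fitness(mask):
    # Validation accuracy minus a small penalty per selected band,
    # rewarding accurate AND compact band subsets.
    return centroid_accuracy(mask, train, valid) - 0.01 * sum(mask)

pop = [[random.randint(0, 1) for _ in range(N_BANDS)] for _ in range(20)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]
    pop = elite + [[g if random.random() > 0.1 else 1 - g   # bit-flip mutation
                    for g in random.choice(elite)] for _ in range(10)]
best = max(pop, key=fitness)
print(best, sum(best))
```

Note that fitness is measured on a held-out validation set, mirroring the study's point about validation data during optimisation.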
The increasing prevalence of mental disorders among youth worldwide is one of society's most pressing issues. The proposed methodology introduces an artificial intelligence-based approach for comprehending and analyzing the prevalence of neurological disorders. This work draws upon the analysis of the Cities Health Initiative dataset. It employs advanced machine learning and deep learning techniques, integrated with data science, statistics, optimization, and mathematical modeling, to correlate various lifestyle and environmental factors with the incidence of these mental disorders. In this work, a variety of machine learning and deep learning models with hyper-parameter tuning are utilized to forecast trends in the occurrence of mental disorders in relation to lifestyle choices such as smoking and alcohol consumption, as well as environmental factors like air and noise pollution. Among these models, the convolutional neural network (CNN) architecture, termed DNN1 in this paper, accurately predicts mental health occurrences relative to the population mean with a maximum accuracy of 99.79%. Among the machine learning models, the XGBoost technique yields an accuracy of 95.30%, with an area under the ROC curve of 0.9985, indicating robust training. The research also involves extracting feature importance scores for the XGBoost classifier, with Stroop test performance results attaining the highest importance score of 0.135. Attributes related to addiction, namely smoking and alcohol consumption, hold importance scores of 0.0273 and 0.0212, respectively. Statistical tests on the training models reveal that XGBoost performs best on the mean squared error and R-squared tests, achieving scores of 0.013356 and 0.946481, respectively. These statistical evaluations bolster the models' credibility and affirm the best-fit models' accuracy. The proposed research in the domains of mental health, addiction, and pollution stands to aid healthcare professionals in promptly diagnosing and treating neurological disorders in both youth and adults through the use of predictive models. Furthermore, it aims to provide valuable insights for policymakers in formulating new regulations on pollution and addiction.
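The statistical measures reported above, mean squared error and R-squared, can be written out as plain formulas (stdlib-only sketch; the toy predictions are illustrative, not the study's data):

```python
def mse(y_true, y_pred):
    # Mean squared error: average of squared residuals.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    # R^2 = 1 - SS_res / SS_tot: the fraction of variance explained by the model.
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

# Illustrative targets and predictions only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(mse(y_true, y_pred), r_squared(y_true, y_pred))
```

A low MSE together with an R-squared close to 1 is exactly the pattern the study reports for its best-fit model.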
As quantum computation grows, the number of qubits involved in a given quantum computer increases. But due to the physical limitations on the number of qubits of a single quantum device, the computation should be performed in a distributed system. In this paper, a new model of quantum computation based on the matrix representation of quantum circuits is proposed. Then, using this model, we propose a novel approach for reducing the number of teleportations in a distributed quantum circuit. The proposed method consists of two phases: the pre-processing phase and the optimization phase. In the pre-processing phase, it considers the bi-partitioning of quantum circuits by the Non-Dominated Sorting Genetic Algorithm (NSGA-III) to minimize the number of global gates and to distribute the quantum circuit into two balanced parts with an equal number of qubits and a minimum number of global gates. In the optimization phase, two heuristics, named Heuristic I and Heuristic II, are proposed to optimize the number of teleportations according to the partitioning obtained from the pre-processing phase. Finally, the proposed approach is evaluated on many benchmark quantum circuits. The results of these evaluations show an average improvement of 22.16% in the teleportation cost of the proposed approach compared to existing works in the literature.
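The central quantity here, the number of global gates induced by a bipartition (a two-qubit gate is global when its operands lie in different parts), can be sketched as follows (an illustrative stdlib-only toy in which brute-force enumeration of balanced splits stands in for NSGA-III; the example circuit is an assumption):

```python
from itertools import combinations

# A circuit as a list of two-qubit gates, each a (control, target) qubit pair.
circuit = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]

def global_gate_count(circuit, part_a):
    """Count gates whose two qubits lie in different partitions."""
    part_a = set(part_a)
    return sum((c in part_a) != (t in part_a) for c, t in circuit)

# Balanced bipartition of qubits {0,1,2,3}: try every split with two qubits
# per part and keep the one with the fewest global gates.
best = min(combinations(range(4), 2),
           key=lambda p: global_gate_count(circuit, p))
print(best, global_gate_count(circuit, best))
```

Each remaining global gate then requires teleportation(s) to execute, which is the cost the paper's two heuristics subsequently optimize.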
Hepatocellular carcinoma (HCC) is one of the major challenges facing biomedical research. Despite its high lethality, methods to predict mortality for this type of aggressive malignant tumor are insufficient. Machine learning is recognized by many authors as a valuable, yet poorly studied, tool in this field. Undoubtedly, searching for new feature selection methods is significant in building an effective machine-learning model. In this study, we propose a novel hybrid model using neighborhood components analysis, a genetic algorithm and a support vector machine classifier (NCA-GA-SVM). Because SVM with default parameters is characterized by low classification results, we decided to use GA for proper optimization and feature selection. As reported in the available literature, NCA and GA obtain high classification results. Here, we decided to combine these approaches, building a two-level algorithm for HCC fatality prognosis. We used a well-known dataset collected from 165 patients at Coimbra's Hospital and University Center, Portugal. Our results revealed 96.36% classification accuracy and a 95.52% F1-score. Additionally, we compared all data for these metrics published so far. We demonstrated that our algorithm achieved the highest accuracy and can be successfully applied to the assessment of hepatocellular carcinoma mortality in the future. Our findings bring methodological value for future HCC studies and emphasize the possibility of using machine-learning techniques to improve the quality of medical decisions.
The article presents a new method for predicting mortality among patients with hepatocellular carcinoma based on neighborhood component analysis, a genetic algorithm and a support vector machine classifier. The proposed two-level algorithm achieved the highest results known in the literature (96.36% classification accuracy and 95.52% F1-score). This novel hybrid model can significantly improve the quality of the clinical decision-making process.
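To make the SVM level concrete, a minimal Pegasos-style linear SVM trained by stochastic sub-gradient descent on synthetic data is sketched below (the NCA and GA levels are omitted; the toy data, learning-rate schedule, and regularization constant are illustrative assumptions, not the paper's configuration):

```python
import random

random.seed(2)

# Toy 2-D data: class +1 above the line x0 + x1 = 1, class -1 below.
points = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(200)]
labeled = [(x, 1 if x[0] + x[1] > 1 else -1) for x in points]

def train_svm(samples, lam=0.01, epochs=50):
    # Pegasos-style stochastic sub-gradient descent on the hinge loss.
    w = [0.0, 0.0]
    b = 0.0
    t = 0
    for _ in range(epochs):
        random.shuffle(samples)
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)                   # decreasing step size
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            w = [wi * (1 - eta * lam) for wi in w]  # regularization shrink
            if margin < 1:                          # hinge-loss sub-gradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b

w, b = train_svm(labeled)
acc = sum((1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1) == y
          for x, y in labeled) / len(labeled)
print(round(acc, 2))
```

In the paper's pipeline, the GA level would tune such an SVM's hyperparameters and select its input features rather than leave them fixed.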
Image classification (categorization) can be considered one of the most compelling domains of contemporary research. Indeed, people cannot hide their faces and related lineaments, since these are highly needed for daily communications. Therefore, face recognition is extensively used in biometric applications for security and personnel attendance control. In this study, a novel face recognition method based on a perceptual hash is presented. The proposed perceptual hash is utilized in the preprocessing and feature extraction phases. The Discrete Wavelet Transform (DWT) and a novel graph-based binary pattern, called the quintet triple binary pattern (QTBP), are used. The K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) algorithms are employed for the classification task. The proposed face recognition method is tested on five well-known face datasets: AT&T, Face94, CIE, AR and LFW. Our proposed method achieved 100.0% classification accuracy for the AT&T, Face94 and CIE datasets, 99.4% for the AR dataset and 97.1% for the LFW dataset. The time cost of the proposed method is O(n log n). The obtained results and comparisons distinctly indicate that our proposed method has very good classification capability with a short execution time.
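The perceptual-hash idea can be sketched with its simplest variant, an average hash compared by Hamming distance (illustrative stdlib-only toy; the paper's actual hash is built on DWT and QTBP, which are not reproduced here):

```python
def average_hash(pixels):
    """Perceptual (average) hash: bit is 1 where a pixel exceeds the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    # Number of differing bits; small distance means perceptually similar images.
    return sum(a != b for a, b in zip(h1, h2))

# Two tiny 4x4 "grayscale images": identical except one corner pixel.
img_a = [[10, 10, 200, 200],
         [10, 10, 200, 200],
         [200, 200, 10, 10],
         [200, 200, 10, 10]]
img_b = [row[:] for row in img_a]
img_b[0][0] = 250                      # brighten one pixel

print(hamming(average_hash(img_a), average_hash(img_b)))
```

The key property of any perceptual hash is this robustness: a small change to the image flips only a few hash bits, unlike a cryptographic hash.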
•Automated classification of normal and CAD classes.
•A new work-related coronary artery disease (CAD) data set.
•Proposed a novel heterogeneous hybrid feature selection (2HFS) algorithm to pre-process the CAD data.
•Workplace, environmental and clinical features are used.
•Obtained maximum performance using various data sets.
Coronary artery disease (CAD) is a leading cause of death worldwide and is associated with high healthcare expenditure. Researchers are motivated to apply machine learning (ML) for quick and accurate detection of CAD. The performance of automated systems depends on the quality of the features used. Clinical CAD datasets contain different features with varying degrees of association with CAD. To extract such features, we developed a novel hybrid feature selection algorithm called heterogeneous hybrid feature selection (2HFS). In this work, we used the Nasarian CAD dataset, in which workplace and environmental features are considered in addition to other clinical features. The synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) are used to handle the imbalance in the dataset. Decision tree (DT), Gaussian Naive Bayes (GNB), random forest (RF), and XGBoost classifiers are used. 2HFS-selected features are then input into these classifier algorithms. Our results show that the proposed feature selection method yielded a classification accuracy of 81.23% with SMOTE and the XGBoost classifier. We have also tested our approach with other well-known CAD datasets, obtaining 83.94%, 81.58% and 92.58% accuracy for the Hungarian, Long-beach-va, and Z-Alizadeh Sani datasets, respectively. Hence, our experimental results confirm the effectiveness of our proposed feature selection algorithm, which yielded outstanding results for the development of automated CAD systems compared to existing state-of-the-art techniques.
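The SMOTE step mentioned above can be sketched as nearest-neighbour interpolation among minority samples (a minimal stdlib-only sketch; the toy minority set and the choice of k are illustrative assumptions):

```python
import random

random.seed(3)

def smote(minority, n_new, k=3):
    """Generate synthetic minority samples by interpolating a random amount
    of the way toward one of the k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        x = random.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: sum((a - b) ** 2
                                              for a, b in zip(x, p)))[:k]
        n = random.choice(neighbours)
        gap = random.random()          # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, n)))
    return synthetic

# Tiny minority class; four synthetic samples balance it against a
# hypothetical majority of eight.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2)]
new = smote(minority, 4)
print(new)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside its original region of feature space.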
Computational intelligence (CI) methods achieve high efficiency in the analysis of multidimensional data from an e-nose, the equivalent of the human sense of smell. This paper presents and compares selected CI algorithms applied to the approximation of five concentration levels of phenol. The measured responses of an array of 18 semiconductor gas sensors formed the input vectors used for further analysis. The initial data processing consisted of standardization, principal component analysis, data normalization, and reduction. The nine soft computing systems can be divided into single-method systems using neural networks or fuzzy systems, and hybrid systems such as evolutionary-neural, neuro-fuzzy, and evolutionary-fuzzy systems. All the presented systems were evaluated based on accuracy (errors generated) and complexity (number of parameters and training time) criteria. A method of forming the input data vector by aggregation of the first three principal components is also presented. The key contribution is applying and comparing nine CI techniques for estimating phenol concentration based on signals from a metal-oxide sensor array.
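The principal component aggregation mentioned above relies on PCA; the first principal component can be sketched via power iteration on the covariance matrix (a stdlib-only toy with two correlated "sensor" channels; the data and iteration count are illustrative assumptions):

```python
import random

random.seed(4)

def first_pc(data, iters=200):
    """First principal component via power iteration on the covariance matrix."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [random.random() for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]      # renormalize each iteration
    return v

# Toy sensor responses: two channels tracking the same signal plus small
# noise, so the dominant direction is roughly (1, 1) / sqrt(2).
data = [[t + random.gauss(0, 0.05), t + random.gauss(0, 0.05)]
        for t in [0.1 * i for i in range(30)]]
pc1 = first_pc(data)
print(pc1)
```

Repeating this with deflation would yield the second and third components, whose scores can then be aggregated into the reduced input vector described above.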
Obstructive sleep apnea (OSA) is a long-term sleep disorder that causes temporary disruption in breathing while sleeping. Polysomnography (PSG) is the technique for monitoring different signals during the patient's sleep cycle, including electroencephalogram (EEG), electromyography (EMG), electrocardiogram (ECG), and oxygen saturation (SpO2). Due to the high cost and inconvenience of polysomnography, the usefulness of ECG signals in detecting OSA is explored in this work, which proposes a two-dimensional convolutional neural network (2D-CNN) model for detecting OSA using ECG signals. A publicly available apnea ECG database from PhysioNet is used for experimentation. Further, a constant-Q transform (CQT) is applied for segmentation, filtering, and conversion of ECG beats into images. The proposed CNN model demonstrates an average accuracy, sensitivity and specificity of 91.34%, 90.68% and 90.70%, respectively. The findings obtained using the proposed approach are comparable to those of many other existing methods for automatic detection of OSA.
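The core operation of the 2D-CNN, a two-dimensional convolution over an image, can be sketched as follows (a minimal stdlib-only sketch; the edge-detecting kernel and toy image are illustrative assumptions, not the model's learned filters):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

# A vertical-edge-detecting kernel applied to an image with a step edge:
# the output responds only where the edge sits under the kernel.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]
print(conv2d(image, kernel))   # -> [[0, 2, 0], [0, 2, 0]]
```

In the proposed model such filters are learned from the CQT images of ECG beats rather than hand-designed.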