•Introduce the Gaussian process classification (GPC) technique for diabetes classification in a machine learning framework.
•GPC uses three kernel types: linear, polynomial and radial basis.
•Comparison of GPC with existing classification techniques such as LDA, QDA and NB.
•The GPC-based model gave the highest accuracy, sensitivity, specificity and other performance parameters.
•Machine learning systems are very useful for the classification of data on diabetes, one of the deadliest diseases in the world.
Diabetes is a silent killer. The main cause of this disease is the presence of excessive amounts of metabolites such as glucose. There were about 387 million diabetic people worldwide in 2014. The financial burden of this disease has been estimated at about $13,700 per year. According to the World Health Organization (WHO), these figures will more than double by the year 2030. This cost could be reduced dramatically if diabetes could be predicted statistically on the basis of a set of covariates. Although several classification techniques are available, diabetes remains difficult to classify. The main objectives of this paper are: (i) to present Gaussian process classification (GPC), (ii) to apply it as a comparative classifier for diabetes data classification, (iii) to analyse the data using a cross-validation approach, (iv) to interpret the results of the analysis and (v) to benchmark our method against others.
To classify diabetes, several classification techniques are used, such as linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and naive Bayes (NB). However, most medical data show non-normality, non-linearity and an inherent correlation structure. In this paper we therefore adapted a Gaussian process (GP)-based classification technique using three kernels: linear, polynomial and radial basis. We also investigate the performance of the GP-based classification technique in comparison to existing techniques such as LDA, QDA and NB. Performance is evaluated using accuracy (ACC), sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV) and receiver operating characteristic (ROC) curves.
The Pima Indian diabetes dataset is used in this study. It consists of 768 patients, of whom 268 are diabetic and 500 are controls. Our machine learning system shows the performance of the GP-based model as: ACC 81.97%, SE 91.79%, SP 63.33%, PPV 84.91% and NPV 62.50%, which are higher than those of the other methods.
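As a sketch of how metrics such as ACC, SE, SP, PPV and NPV are conventionally derived from a binary confusion matrix (the counts below are hypothetical illustrations, not this study's actual confusion matrix):

```python
# Hedged sketch: standard definitions of the reported performance metrics,
# computed from the four cells of a binary confusion matrix.

def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, specificity, PPV and NPV (as fractions)."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # overall correctness
    se = tp / (tp + fn)                     # recall on the positive (diabetic) class
    sp = tn / (tn + fp)                     # recall on the negative (control) class
    ppv = tp / (tp + fp)                    # precision of a positive call
    npv = tn / (tn + fn)                    # precision of a negative call
    return acc, se, sp, ppv, npv

# Illustrative counts only (hypothetical):
acc, se, sp, ppv, npv = binary_metrics(tp=90, fn=10, tn=60, fp=40)
```

The same five quantities can then be compared across classifiers such as LDA, QDA, NB and GPC.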
Abstract Purpose Fatty Liver Disease (FLD) is one of the most common diseases of the liver. Early detection can improve the prognosis considerably. Using ultrasound for FLD detection is highly desirable due to its non-radiation nature, low cost and ease of use. However, manual detection can be slow and ambiguous. The lack of computer-trained systems leads to low image quality and inefficient disease classification. Thus, the current study proposes a novel, accurate and reliable detection system for FLD using a computer-based training system. Materials and Methods One hundred twenty-four ultrasound sample images were selected retrospectively from a database of 62 patients, consisting of normal and cancerous cases. The proposed training system generated offline parameters using a training liver image database. The classifier applied the transformation parameters to an online system in order to facilitate real-time detection during the ultrasound scan. The system utilized six sets of features (a total of 128 features), namely Haralick, basic geometric, Fourier transform, discrete cosine transform, Gupta transform and Gabor transform features. These features were extracted for both offline training and online testing. A Levenberg-Marquardt Back Propagation Network (BPN) classifier was used to classify the liver disease into normal and abnormal categories. Results A random partitioning approach was adopted to evaluate the classifier performance and compute its accuracy. Utilizing all six sets of 128 features, the computer-aided diagnosis (CAD) system achieved a classification accuracy of 97.58%. Furthermore, the four performance metrics consisting of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) reached 98.08%, 97.22%, 96.23%, and 98.59%, respectively. Conclusion The proposed system was able to detect and classify FLD successfully. Furthermore, the proposed system was benchmarked against previous methods.
The comparison established that the advanced set of features in the Levenberg-Marquardt Back Propagation Network yields a significant improvement over existing techniques.
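The random partitioning evaluation described above can be sketched as repeated random train/test splits averaged over several runs. The split ratio, number of repetitions, and the toy threshold "classifier" below are illustrative assumptions, not the paper's BPN or its exact protocol:

```python
import random

def random_partition_accuracy(samples, labels, train, train_frac=0.8, runs=10, seed=0):
    """Repeatedly split the data at random into train/test subsets and
    return the mean test accuracy of the classifier produced by `train`."""
    rng = random.Random(seed)
    n, accs = len(samples), []
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)
        cut = int(n * train_frac)
        tr, te = idx[:cut], idx[cut:]
        model = train([samples[i] for i in tr], [labels[i] for i in tr])
        accs.append(sum(model(samples[i]) == labels[i] for i in te) / len(te))
    return sum(accs) / len(accs)

def train_threshold(xs, ys):
    """Toy stand-in for a trained classifier: threshold at the training mean."""
    t = sum(xs) / len(xs)
    return lambda x: int(x > t)
```

In the study itself, the `train` step would fit the BPN on the extracted 128-feature vectors rather than this one-dimensional toy rule.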
Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world's population was diabetic in 2017. It is projected that this figure will reach nearly 10% by 2045. The major challenge is that applying machine learning-based classifiers to such data sets for risk stratification leads to lower performance. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that replacing missing values or outliers with a median configuration will yield higher risk stratification accuracy. This ML-based risk stratification system is designed, optimized and evaluated as follows: the features are extracted and optimized using six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest) under the hypothesis that replacing both missing values and outliers with computed medians will improve the risk stratification accuracy. The Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that replacing the missing values and outliers by group median and median values, respectively, and further using the combination of random forest feature selection and random forest classification yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value and area under the curve of:
92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in the literature. The system was validated for its stability and reliability. The RF-based model showed the best performance when outliers are replaced by median values.
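A minimal sketch of the median-replacement preprocessing idea is given below. The Tukey IQR fence used to flag outliers is an assumption for illustration; the paper's exact outlier rule may differ:

```python
import statistics

def clean_feature(values, k=1.5):
    """Replace missing values (None) and outliers in one feature column
    with the column median. Outliers are flagged with the Tukey IQR fence
    (an illustrative choice, not necessarily the study's rule)."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    q = statistics.quantiles(present, n=4)            # q[0] = Q1, q[2] = Q3
    lo = q[0] - k * (q[2] - q[0])                     # lower fence
    hi = q[2] + k * (q[2] - q[0])                     # upper fence
    return [med if v is None or v < lo or v > hi else v for v in values]
```

Applied column by column to the Pima dataset, this yields a complete, outlier-free feature matrix before feature selection and classification.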
A World Health Organization (WHO) report of February 2018 showed that the mortality rate due to brain or central nervous system (CNS) cancer is highest in the Asian continent. It is of critical importance that cancer be detected early so that many of these lives can be saved. Cancer grading is an important aspect of targeted therapy. As cancer diagnosis is highly invasive, time-consuming and expensive, there is an immediate need to develop non-invasive, cost-effective and efficient tools for brain cancer characterization and grade estimation. Brain scans using magnetic resonance imaging (MRI), computed tomography (CT), and other imaging modalities are fast and safer methods for tumor detection. In this paper, we summarize the pathophysiology of brain cancer, the imaging modalities used for brain cancer, and automatic computer-assisted methods for brain cancer characterization in a machine and deep learning paradigm. Another objective of this paper is to identify the current issues in existing engineering methods and to project a future paradigm. Further, we highlight the relationship between brain cancer and other brain disorders such as stroke, Alzheimer's, Parkinson's, and Wilson's disease, leukoaraiosis, and other neurological disorders in the context of the machine learning and deep learning paradigms.
Human activity recognition (HAR) has multifaceted applications due to the worldwide use of acquisition devices such as smartphones and video cameras and their ability to capture human activity data. While electronic devices and their applications are steadily growing, advances in artificial intelligence (AI) have revolutionized the ability to extract deeply hidden information for accurate detection and interpretation. This yields a better understanding of the rapidly growing field spanning acquisition devices, AI, and applications, the three pillars of HAR, under one roof. Many review articles have been published on the general characteristics of HAR, but few have compared all HAR devices at the same time, and few have explored the impact of evolving AI architectures. In our proposed review, a detailed narration of the three pillars of HAR is presented, covering the period from 2011 to 2021. Further, the review presents recommendations for an improved HAR design and for its reliability and stability. Five major findings were: (1) HAR constitutes three major pillars: devices, AI and applications; (2) HAR has dominated the healthcare industry; (3) hybrid AI models are in their infancy and need considerable work to provide a stable and reliable design; further, these trained models need solid prediction, high accuracy, generalization, and finally, to meet the objectives of the applications without bias; (4) little work was observed on abnormality detection during actions; and (5) almost no work has been done on forecasting actions. We conclude that: (a) the HAR industry will evolve in terms of the three pillars of electronic devices, applications and the type of AI; and (b) AI will provide a powerful impetus to the HAR industry in the future.
The left atrium (LA) has a crucial function in maintaining left ventricular filling, which is responsible for about one-third of all cardiac filling. A growing body of evidence shows that LA is involved in several cardiovascular diseases from a clinical and prognostic standpoint. LA enlargement has been recognized as a predictor of the outcomes of many diseases. However, LA enlargement itself does not explain the whole LA's function during the cardiac cycle. For this reason, the recently proposed assessment of atrial strain at advanced cardiac magnetic resonance (CMR) enables the usual limitations of the sole LA volumetric measurement to be overcome. Moreover, the left atrial strain impairment might allow several cardiovascular diseases to be detected at an earlier stage. While traditional CMR has a central role in assessing LA volume and, through cine sequences, a marginal role in evaluating LA function, feature tracking at advanced CMR (CMR-FT) has been increasingly confirmed as a feasible and reproducible technique for assessing LA function through strain. In comparison to atrial function evaluations via speckle tracking echocardiography, CMR-FT has a higher spatial resolution, larger field of view, and better reproducibility. In this literature review on atrial strain analysis, we describe the strengths, limitations, recent applications, and promising developments of studying atrial function using CMR-FT in clinical practice.
Key Points
• The left atrium has a crucial function in maintaining left ventricular filling; left atrial size has been recognized as a predictor of the outcomes of many diseases.
• Left atrial strain has been confirmed as a marker of atrial functional status and demonstrated to be a sensitive tool in the subclinical phase of a disease.
• A comprehensive evaluation of the three phases of atrial function by CMR-FT demonstrates an impairment before the onset of atrial enlargement, thus helping clinicians in their decision-making and improving patient outcomes.
•Identification of high-risk differential gene expression using statistical tests.
•Development of a machine learning strategy for predicting cancerous genes.
•Four statistical tests and ten machine learning classifiers were experimentally performed, validated and compared.
Colon microarray data are a repository of thousands of gene expressions with different strengths for each cancer cell. It is necessary to detect which genes are responsible for cancer growth. This study presents an exhaustive comparative study of different machine learning (ML) systems that serves two major purposes: (a) identification of high-risk differential genes using statistical tests and (b) development of an ML strategy for predicting cancerous genes.
Four statistical tests, namely the Wilcoxon sign rank sum (WCSRS) test, the t-test, the Kruskal–Wallis (KW) test, and the F-test, were adapted for cancerous gene identification using their p-values. The extracted gene set was used to classify cancer patients using ten classifiers, namely: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), naïve Bayes (NB), Gaussian process classification (GPC), support vector machine (SVM), artificial neural network (ANN), logistic regression (LR), decision tree (DT), Adaboost (AB), and random forest (RF). Performance was then evaluated using cross-validation protocols and standardized metrics, viz. accuracy (ACC) and area under the curve (AUC).
The colon cancer dataset consists of 2000 genes from 62 patients (40 cancer vs. 22 controls). The overall mean ACC of our ML system using all four statistical tests and all ten classifiers was 90.50%. The ML system showed an ACC of 99.81% using a combination of the WCSRS test and the RF-based classifier. This is an improvement of 8% over previously published values in the literature.
The RF-based model with statistical tests for the detection of high-risk genes showed the best performance for accurate cancer classification in multi-center clinical trials.
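The gene-filtering step can be sketched as follows. As a simplification, a Welch two-sample t statistic with a magnitude threshold stands in for the p-value cut-offs of the four tests used in the study; the threshold value and toy data are assumptions:

```python
import statistics

def t_statistic(a, b):
    """Welch two-sample t statistic between expression values of one gene
    in the cancer group (a) and the control group (b)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def select_genes(expr_cancer, expr_control, threshold=2.0):
    """Keep indices of genes whose |t| exceeds the threshold; a stand-in
    for selecting genes whose test p-value falls below a cut-off."""
    return [g for g in range(len(expr_cancer))
            if abs(t_statistic(expr_cancer[g], expr_control[g])) > threshold]
```

The reduced gene set returned by such a filter is what feeds the downstream classifiers (LDA through RF) under cross-validation.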
•A Deep Learning (DL) technique is applied for the detection of hypoechoic FLD and the stratification of normal and abnormal US liver images under the class of Symtosis.
•This paper provides a comprehensive analysis and comparison of three ML-based classification methodologies: support vector machines, extreme learning machines and deep learning.
•A specialized deep learning operation called inception is comprehensively investigated.
Fatty Liver Disease (FLD), a disease caused by the deposition of fat in liver cells, is a precursor to terminal diseases such as liver cancer. The machine learning (ML) techniques applied to FLD detection and risk stratification using ultrasound (US) have limitations in computing tissue characterization features, thereby limiting accuracy.
Under the class of Symtosis for FLD detection and risk stratification, this study presents a Deep Learning (DL)-based paradigm that computes nearly seven million weights per image when an image is passed through a 22-layer neural network during the cross-validation (training and testing) paradigm. The DL architecture consists of cascaded layers of operations such as convolution, pooling, rectified linear units, and dropout, and a special block called the inception model that provides speed and efficiency. All data analysis is performed in an optimized tissue region, obtained by removing background information. We benchmark the DL system against the conventional ML protocols: support vector machine (SVM) and extreme learning machine (ELM).
The liver US data consist of 63 patients (27 normal/36 abnormal). Using the K10 cross-validation protocol (90% training and 10% testing), the detection and risk stratification accuracies are 82%, 92% and 100% for the SVM, ELM and DL systems, respectively. The corresponding areas under the curve are 0.79, 0.92 and 1.0, respectively. We further validate our DL system using two-class biometric facial data, which yields an accuracy of 99%.
The DL system shows superior performance for liver disease detection and risk stratification compared to the conventional machine learning systems SVM and ELM.
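The K10 protocol mentioned above (90% training / 10% testing per fold) amounts to a standard 10-fold split of the patient indices. A minimal sketch, with the shuffling seed as an assumption:

```python
import random

def kfold_indices(n, k=10, seed=0):
    """Split sample indices 0..n-1 into k roughly equal folds for K-fold
    cross-validation; each fold serves once as the ~10% test set while the
    remaining ~90% form the training set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]                      # round-robin folds
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]
```

For the 63 liver-US patients, this produces ten disjoint test folds of six or seven patients each, covering every patient exactly once.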
Automated EEG analysis of epilepsy: A review. Acharya, U. Rajendra; Vinitha Sree, S.; Swapna, G. Knowledge-Based Systems, vol. 45, June 2013. Peer-reviewed journal article.
Epilepsy is an electrophysiological disorder of the brain, characterized by recurrent seizures. The electroencephalogram (EEG) is a test that measures and records the electrical activity of the brain, and is widely used in the detection and analysis of epileptic seizures. However, it is often difficult to identify subtle but critical changes in the EEG waveform by visual inspection, which opens up a vast research area for biomedical engineers to develop and implement intelligent algorithms for the identification of such subtle changes. Moreover, EEG signals are nonlinear and non-stationary in nature, which contributes to further complexities in their manual interpretation and in the detection of normal and abnormal (interictal and ictal) activities. Hence, it is necessary to develop a Computer Aided Diagnostic (CAD) system to automatically identify normal and abnormal activities using a minimum number of highly discriminating features in classifiers. It has been found that nonlinear features are able to capture complex physiological phenomena such as abrupt transitions and chaotic behavior in EEG signals. In this review, we discuss various feature extraction methods and the results of different automated epilepsy stage detection techniques in detail. We also briefly present the various open-ended challenges that need to be addressed before a CAD-based epilepsy detection system can be set up in a clinical setting.
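As one concrete example of the kind of compact waveform feature such a CAD system might compute, the classic Hjorth descriptors summarize amplitude, mean frequency content and waveform shape. The choice of Hjorth parameters here is illustrative, not one prescribed by the review:

```python
import statistics

def hjorth(signal):
    """Hjorth activity, mobility and complexity of a 1-D sampled signal;
    simple descriptors often used to characterize EEG segments."""
    d1 = [b - a for a, b in zip(signal, signal[1:])]   # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]           # second difference
    activity = statistics.pvariance(signal)            # signal power (variance)
    mobility = (statistics.pvariance(d1) / activity) ** 0.5
    complexity = (statistics.pvariance(d2) / statistics.pvariance(d1)) ** 0.5 / mobility
    return activity, mobility, complexity
```

Features like these, alongside the nonlinear measures discussed in the review, would be fed to a classifier to separate normal, interictal and ictal segments.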
Texture analysis has arisen as a tool to explore the amount of data contained in images that cannot be explored by humans visually. Radiomics is a method that extracts a large number of features from radiographic medical images using data-characterisation algorithms. These features, termed radiomic features, have the potential to uncover disease characteristics. The goal of both radiomics and texture analysis is to go beyond size-based or human-eye-based semantic descriptors and to enable the non-invasive extraction of quantitative radiological data that can be correlated with clinical outcomes or pathological characteristics. In recent years, a sub-field of radiology has flourished in which texture analysis and radiomics have been used in many settings. It is difficult for the clinical radiologist to cope with such an amount of data across all the different radiological sub-fields and to identify the most significant papers. The aim of this review is to provide a tool for better understanding the basic principles underlying texture analysis and radiological data mining, together with a summary of the most significant papers of recent years.
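To make the notion of a texture feature concrete, a minimal sketch of a grey-level co-occurrence matrix (GLCM) with two classic Haralick-style descriptors is given below. The single horizontal offset and the tiny toy image are simplifying assumptions; real radiomics pipelines use many offsets, grey-level bins and feature families:

```python
from collections import Counter

def glcm_features(image):
    """Build a grey-level co-occurrence matrix for the horizontal (0, 1)
    offset and reduce it to two classic texture features:
    contrast (local intensity variation) and energy (texture uniformity)."""
    pairs = Counter((row[j], row[j + 1]) for row in image for j in range(len(row) - 1))
    total = sum(pairs.values())
    p = {k: v / total for k, v in pairs.items()}               # normalised GLCM
    contrast = sum(pij * (i - j) ** 2 for (i, j), pij in p.items())
    energy = sum(pij ** 2 for pij in p.values())
    return contrast, energy
```

Descriptors like these, computed over a region of interest, are examples of the quantitative radiological data that radiomics studies correlate with clinical outcomes.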