Cancer genome and other sequencing initiatives are generating extensive data on non-synonymous single nucleotide polymorphisms (nsSNPs) in human and other genomes. In order to understand the impacts of nsSNPs on the structure and function of the proteome, as well as to guide protein engineering, accurate in silico methodologies are required to study and predict their effects on protein stability. Despite the diversity of available computational methods in the literature, none has proven accurate and dependable on its own under all scenarios where mutation analysis is required. Here we present DUET, a web server for an integrated computational approach to study missense mutations in proteins. DUET consolidates two complementary approaches (mCSM and SDM) in a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM). We demonstrate that the proposed method improves overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. The DUET web server is freely and openly available at http://structure.bioc.cam.ac.uk/duet.
There has been increased interest in speech pattern analysis applications of Parkinsonism for building predictive telediagnosis and telemonitoring models. For this purpose, we have collected a wide variety of voice samples, including sustained vowels, words, and sentences compiled from a set of speaking exercises for people with Parkinson's disease. There are two main issues in learning from such a dataset, which consists of multiple speech recordings per subject: 1) How predictive are the various types of voice samples, e.g., sustained vowels versus words, in Parkinson's disease (PD) diagnosis? 2) How well do central tendency and dispersion metrics serve as representatives of all sample recordings of a subject? In this paper, investigating our Parkinson dataset using well-known machine learning tools, we find that, as reported in the literature, sustained vowels carry more PD-discriminative information. We have also found that rather than using each voice recording of each subject as an independent data sample, representing the samples of a subject with central tendency and dispersion metrics improves generalization of the predictive model.
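The per-subject representation described above can be sketched as follows. This is only an illustration, not the authors' pipeline: the choice of mean, median and standard deviation as the central tendency and dispersion metrics, the function name `summarize_subject`, and the toy feature values are all assumptions.

```python
from statistics import mean, median, pstdev

def summarize_subject(recordings):
    """Collapse one subject's recordings into a single feature vector.

    recordings: list of equal-length feature lists, one per voice recording.
    Returns [mean, median, std] for each feature, concatenated.
    The three metrics are an assumed choice, not taken from the paper.
    """
    summary = []
    for values in zip(*recordings):  # iterate feature by feature
        summary.extend([mean(values), median(values), pstdev(values)])
    return summary

# Hypothetical data: 3 recordings of one subject, 2 acoustic features each.
subject = [[0.2, 1.0], [0.4, 3.0], [0.6, 2.0]]
row = summarize_subject(subject)  # one row per subject instead of three
```

A classifier would then be trained on one such row per subject, so that recordings from the same person cannot end up on both sides of a train/test split.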
In the clinical field, the diagnosis of many diseases and their development stages depends on the detection of the corresponding bacteria. Raman spectroscopy and laser‐induced breakdown spectroscopy (LIBS) are two novel spectral diagnostic technologies for clinical bacteria identification. Both have been used in clinical detection in combination with an optimized support vector machine (SVM). In this paper, two feature‐level fusion methods (before feature selection fusion, BFSF, and after feature selection fusion, AFSF) were proposed to improve the performance of the SVM classifier and simultaneously reduce the analysis time (including the parameter tuning, model training, and testing time) by combining LIBS and Raman spectroscopy data. Using the 10 most important feature lines as the inputs of the optimized classifier, the analysis time could be reduced to 1 to 2 min and the correct classification rate (CCR) reached 95.67%. Even without optimizing the SVM parameters, the two proposed methods could still achieve rapid and accurate classification of pathogenic bacteria. The AFSF method showed better results with fewer fused features, while the BFSF method achieved a higher CCR of 100% with more features. Both methods took around 0.2 s for analysis. These results indicate that the proposed feature‐level fusion methods can improve the performance of SVM for bacteria detection. While maintaining the highest diagnostic accuracy, they can achieve the minimum analysis time.
In this paper, we proposed two feature‐level fusion methods (before feature selection fusion, BFSF, and after feature selection fusion, AFSF) to improve the performance of the SVM classifier and simultaneously reduce the analysis time by combining LIBS and Raman spectroscopy data. The AFSF method showed better results with fewer fused features, while the BFSF method achieved a higher accuracy of 100% with more features. The proposed feature‐level fusion methods can achieve LIBS‐Raman diagnosis of bacteria in around 0.2 s.
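The difference between the two fusion orders can be sketched as below. This is a hedged illustration rather than the paper's implementation: ranking features by variance is a stand-in for whatever importance measure the authors used, and all function names and toy spectra are hypothetical.

```python
from statistics import pvariance

def top_k(features, k):
    """Indices of the k columns with the largest variance.

    Variance is only a placeholder ranking score (an assumption).
    """
    columns = list(zip(*features))
    ranked = sorted(range(len(columns)),
                    key=lambda j: pvariance(columns[j]), reverse=True)
    return sorted(ranked[:k])

def select(features, idx):
    """Keep only the columns listed in idx."""
    return [[row[j] for j in idx] for row in features]

def concat(a, b):
    """Join two feature matrices sample by sample."""
    return [ra + rb for ra, rb in zip(a, b)]

def bfsf(libs, raman, k):
    """Before-feature-selection fusion: concatenate, then select k features."""
    fused = concat(libs, raman)
    return select(fused, top_k(fused, k))

def afsf(libs, raman, k_each):
    """After-feature-selection fusion: select per modality, then concatenate."""
    return concat(select(libs, top_k(libs, k_each)),
                  select(raman, top_k(raman, k_each)))

# Hypothetical toy data: 3 samples, 3 LIBS features and 2 Raman features.
libs = [[1, 10, 5], [5, 30, 5], [9, 20, 5]]
raman = [[0, 7], [0, 9], [1, 11]]
fused_before = bfsf(libs, raman, k=2)      # both picks may come from one modality
fused_after = afsf(libs, raman, k_each=1)  # one feature guaranteed per modality
```

In this toy case BFSF keeps two LIBS columns because they dominate the joint ranking, whereas AFSF always retains at least one feature from each instrument.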
Wind speed forecasting is a crucial issue in the wind power industry. However, the disadvantage of the existing wind speed forecasting models is that they often ignore similar fluctuation information between the adjacent WTGs (wind turbine generators), which leads to poor forecasting accuracy. This paper proposes a hybrid wind speed forecasting model to overcome this disadvantage. Specifically, grey correlation analysis is applied to select useful fluctuation information from the adjacent and observed WTGs, and the chosen fluctuation information is fed into the v-SVM (v-support vector machine), which offers good capability in nonlinear fitting, to perform wind speed forecasting of the observed WTGs. Meanwhile, to reduce the impacts of the model parameters on the final forecasting performance, CS (cuckoo search) is used to tune the parameters in the v-SVM. The results from two case studies show that the proposed model, which considers the fluctuation information of the adjacent WTG, offers greater accuracy than the other compared models. As concluded from the results of three accuracy tests, the performances of v-SVM and ε-SVM (ε-support vector machine) show no significant difference, and the CS algorithm is more efficient than the PSO (particle swarm optimization) for tuning of the parameters in the v-SVM.
• The proposed hybrid model takes advantage of the fluctuation information of the adjacent WTGs.
• The relational degree between all inputs and the target is analyzed by grey relation analysis.
• Predictive accuracy tests show that CS is more efficient than PSO in optimizing parameters in v-SVM.
• The performances of v-SVM and ε-SVM produce no obvious differences in short-term wind speed forecasting in our study.
• The hybrid model can improve forecasting accuracy for single WTG.
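The grey relational analysis used to rank candidate inputs can be sketched as follows. This is a minimal version of Deng's grey relational grade and not necessarily the exact variant used in the paper; the distinguishing coefficient `rho = 0.5`, the assumption that all series are already normalized, and the toy wind series are illustrative choices.

```python
def grey_relational_grades(target, candidates, rho=0.5):
    """Grey relational grade of each candidate series against the target.

    All series are assumed to be normalized to a comparable scale already.
    rho is the distinguishing coefficient (0.5 is a common default).
    """
    deltas = [[abs(t - c) for t, c in zip(target, series)]
              for series in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate is identical to the target
        return [1.0] * len(candidates)
    grades = []
    for row in deltas:
        # Relational coefficient at each time step, then averaged.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

# Hypothetical normalized wind speeds: the observed WTG and two neighbours.
observed = [0.1, 0.5, 0.9]
neighbours = [[0.1, 0.5, 0.9], [0.9, 0.5, 0.1]]
grades = grey_relational_grades(observed, neighbours)
```

Series with a higher grade fluctuate more like the target and would be the ones passed on to the forecasting model.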
• A contribution to industrial data elaboration for predictive maintenance was provided.
• Data were analysed to develop a new solution for real-time monitoring.
• A challenging input was proposed, resulting in data management time savings.
• A classifier for tool condition assessment in a new and promising form was presented.
• An original model was proposed for degradation assessment and prediction.
Effective transition from raw industrial data to knowledge-based executive actions without human intervention requires developing new analytical tools, which also means new challenges for expert and intelligent systems. Studies must be conducted especially on developing effective analytical solutions for intelligent modules of Computerized Maintenance Management Systems that take advantage of data analysis and decision support tools to predict and prevent the potential failure of machines or their elements. This is why the idea of a new classifier for condition assessment and Remaining Useful Life (RUL) prediction was presented as an expert system tool for real-time monitoring of the manufacturing process. Based on monitoring and current system check data, a new method enabling both early prediction of the machine tool’s remaining useful life and classification of its current condition was devised. Its failure and normal properties were distinguished as well. To this end, it was proposed that the remaining useful life prediction should be made via the combined use of the Support Vector Machine (SVM) as a classification tool and AutoRegressive Integrated Moving Average (ARIMA) based identification. This would provide process engineers and machine operators with an expert system that is easy to implement and use at the operational level, thus allowing them to confidently perform technological processes according to the acceptable failure probability.
The insufficient information from the minority examples cannot exactly represent the inherent structure of the dataset, which leads to a low prediction accuracy for the minority class with existing classification methods. Over- and under-sampling methods help to increase the prediction accuracy of the minority. However, these methods either lose important information or add trivial information for classification, which in turn affects the prediction accuracy of the minority. Therefore, a new different contribution sampling method (DCS), based on the contributions of the support vectors (SVs) and the nonsupport vectors (NSVs) to classification, is proposed in this paper. The proposed DCS method applies different sampling methods to the SVs and the NSVs and uses the biased support vector machine (B-SVM) method to identify the SVs and the NSVs of an imbalanced dataset. Moreover, the synthetic minority over-sampling technique (SMOTE) and the random under-sampling technique (RUS) are used in the proposed method to re-sample the SVs in the minority and the NSVs in the majority, respectively. Examples are labeled by an ensemble of support vector machines (SVMen). Experiments are carried out on imbalanced datasets selected from the UCI, AVU06a, Statlog, DP01a, JP98a and CWH03a repositories. Experimental results show that for the imbalanced datasets, the proposed DCS method achieves better performance in terms of the Receiver Operating Characteristic (ROC) curve than other methods. The proposed DCS method improves the geometric mean prediction accuracy (Gmean) by 20.80%, 5.97%, 8.66% and 9.35% compared with the NS, the US, the SMOTE and the ROS, respectively.
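The Gmean metric reported above is the geometric mean of the per-class recalls, which is why it suits imbalanced data better than plain accuracy. A minimal sketch follows; the function name, the label convention (1 marks the minority class) and the toy predictions are assumptions made for illustration.

```python
from math import sqrt

def g_mean(y_true, y_pred, minority=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return sqrt((tp / (tp + fn)) * (tn / (tn + fp)))

# Hypothetical labels: 2 minority and 6 majority examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 8                    # ignores the minority entirely
balanced_guess = [1, 0, 0, 0, 0, 0, 0, 1]    # finds one minority example

score_majority = g_mean(y_true, always_majority)
score_balanced = g_mean(y_true, balanced_guess)
```

Both toy predictors get 6 of 8 labels right, yet only the second has a non-zero Gmean, because the first never recognizes a minority example.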
In this paper, islanding detection in a hybrid distributed generation (DG) system is analyzed using the hyperbolic S-transform (HST), time-time transform, and mathematical morphology methods. The merits of these methods are thoroughly compared against the commonly adopted wavelet transform (WT) and S-transform (ST) techniques, as a new contribution to earlier studies. The hybrid DG system consists of photovoltaic and wind energy systems connected to the grid within the IEEE 30-bus system. The negative sequence component of the voltage signal is extracted at the point of common coupling and passed through the above-mentioned techniques. The efficacy of the proposed methods is also compared using an energy-based technique with proper threshold selection to accurately detect the islanding phenomena. Further, to improve the accuracy of the result, classification is performed using a support vector machine (SVM) to distinguish islanding from other power quality (PQ) disturbances. The results demonstrate effective performance and feasibility of the proposed techniques for islanding detection under both noise-free and noisy environments, and also in the presence of harmonics.
Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNA primarily expressed in germ cells that can silence transposons at the post-transcriptional level. Accurate prediction of piRNAs remains a significant challenge.
We developed a program for piRNA annotation (Piano) using piRNA-transposon interaction information. We downloaded 13,848 Drosophila piRNAs and 261,500 Drosophila transposons. The piRNAs were aligned to transposons with a maximum of three mismatches. Then, piRNA-transposon interactions were predicted by RNAplex. Triplet elements combining structure and sequence information were extracted from piRNA-transposon matching/pairing duplexes. A support vector machine (SVM) was used on these triplet elements to classify real and pseudo piRNAs, achieving 95.3 ± 0.33% accuracy and 96.0 ± 0.5% sensitivity. The SVM classifier can be used to correctly predict human, mouse and rat piRNAs, with overall accuracy of 90.6%. We used Piano to predict piRNAs for the rice stem borer, Chilo suppressalis, an important rice insect pest that causes huge yield loss. As a result, 82,639 piRNAs were predicted in C. suppressalis.
Piano demonstrates excellent piRNA prediction performance by using both structure and sequence features of piRNA-transposon interactions. Piano is freely available to the academic community at http://ento.njau.edu.cn/Piano.html.
Landslide hazard assessment is critical for preventing and mitigating landslide disasters. The tuning of hyperparameters is of great importance to achieve better accuracy in a landslide hazard assessment model. In this study, a novel approach is proposed for landslide hazard assessment with support vector machine (SVM) as the primary model and the Bayesian optimization (BO) algorithm as the parameter tuning method. This study describes 1711 historical landslide disaster points in Nanping City, and a total of 12 landslide conditioning factors including elevation, slope, aspect, curvature, lithology, soil type, soil erosion, rainfall, river, land use, highway, and railway were selected. A multicollinearity diagnosis was performed on the factors using the Spearman correlation coefficient. For model validation, 1711 landslides and 1711 non-landslides were collected as the dataset and divided into a training dataset (50%) and a testing dataset (50%). The performance of the model was evaluated by the confusion matrix and the receiver operating characteristic (ROC) curve. The confusion matrix accuracy and the area under the ROC curve showed that the BO-SVM model (89.53%, 0.97) performed better than the SVM model (84.91%, 0.93). In addition, the landslide hazard maps generated by the BO-SVM model had better overall results than those generated by the SVM model.
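The two evaluation measures quoted above can be computed as in the sketch below. It is a generic illustration, not the study's code: the function names and toy scores are hypothetical, and the AUC is computed with the pairwise-ranking definition (the probability that a randomly chosen landslide point scores higher than a randomly chosen non-landslide point), which suits small examples only.

```python
def accuracy(tp, fp, tn, fn):
    """Overall accuracy from the four confusion-matrix counts."""
    return (tp + tn) / (tp + fp + tn + fn)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive example outscores a negative one;
    ties contribute one half. Quadratic in the number of samples.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for 2 landslide and 2 non-landslide cells.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)
acc = accuracy(tp=1, fp=1, tn=1, fn=1)
```

Accuracy depends on the decision threshold applied to the scores, while the AUC summarizes ranking quality across all thresholds, which is why the abstract reports both.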
In this research, we develop an affective computing method based on machine learning for emotion recognition using a wireless protocol and a custom-designed wearable electroencephalography (EEG) device. The system collects EEG signals using an eight-electrode placement on the scalp; two of these electrodes were placed in the frontal lobe, and the other six electrodes were placed in the temporal lobe. We performed experiments on eight subjects while they watched emotive videos. Six entropy measures were employed for extracting suitable features from the EEG signals. Next, we evaluated our proposed models using three popular classifiers for emotion classification: a support vector machine (SVM), a multi-layer perceptron (MLP), and a one-dimensional convolutional neural network (1D-CNN); both subject-dependent and subject-independent strategies were used. Our experimental results showed that the highest average accuracies achieved in the subject-dependent and subject-independent cases were 85.81% and 78.52%, respectively; these accuracies were achieved using a combination of the sample entropy measure and 1D-CNN. Moreover, through electrode selection, our study identifies the T8 position (above the right ear) in the temporal lobe as the most critical channel among the proposed measurement positions for emotion classification. Our results demonstrate the feasibility and efficiency of our proposed EEG-based affective computing method for emotion recognition in real-world applications.
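Sample entropy, the feature that gave the best results above, can be sketched as follows. This is a plain textbook-style implementation under stated assumptions, not the authors' code: the embedding dimension `m = 2`, an absolute tolerance `r = 0.2` (in practice `r` is often set relative to the signal's standard deviation), and the toy signals are all illustrative.

```python
from math import log

def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy: -ln(A / B).

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance), A counts the same for length m + 1. Self-matches are excluded.
    """
    n = len(signal)

    def count_matches(length):
        # Use n - m templates for both lengths so the counts are comparable.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined for this signal length and tolerance
    return -log(a / b)

# Hypothetical signals: perfectly regular patterns have zero sample entropy.
flat = sample_entropy([1.0] * 20)
alternating = sample_entropy([0.0, 1.0] * 10)
```

Irregular signals yield larger values, so computing this per EEG channel and time window gives a compact feature vector for the classifiers.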