Support vector machines (SVMs) are supervised classifiers that have been applied successfully in a wide range of real-life applications. However, they suffer from high time and memory training complexities, both of which depend on the training set size. This issue is especially challenging nowadays, since the amount of data generated every second is becoming tremendously large in many domains. This review provides an extensive survey of existing methods for selecting SVM training data from large datasets. We divide the state-of-the-art techniques into several categories. These categories help in understanding the underlying ideas behind the algorithms, which may be useful in designing new methods to deal with this important problem. The review is complemented by a discussion of future research pathways that can make SVMs easier to exploit in practice.
In this paper, two multifault diagnosis methods based on improved support vector machines (SVMs) are proposed for sensor fault detection and identification, respectively. First, an online sparse least squares support vector machine (OS-LSSVM) is utilized to detect and predict sensor faults. Then, a method that combines the SVM with error-correcting output codes (ECOC), called ECOC-SVM, is proposed to solve the sensor fault feature extraction and online identification problem. We use a nonlinear transformation of the initial characteristics as the classifier input to enhance their separability. ECOC-SVM is utilized to classify the fault states. Some typical faults are investigated, and the experimental results indicate that ECOC-SVM has high identification accuracy and can be implemented in real time to meet the requirements of online fault identification. This method can also be extended to solve other related problems.
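The ECOC idea mentioned above can be illustrated with a minimal sketch of the decoding step. Everything here is an assumption made for illustration: the fault class names, the 7-bit codebook, and the use of Hamming-distance decoding are not taken from the paper, and the binary SVMs that would produce each bit are replaced by a hand-written bit vector.

```python
# Hypothetical fault classes and codewords (an exhaustive 7-bit code for
# four classes); these are illustrative and not taken from the paper.
CODEBOOK = {
    "normal": (1, 1, 1, 1, 1, 1, 1),
    "bias":   (0, 0, 0, 0, 1, 1, 1),
    "drift":  (0, 0, 1, 1, 0, 0, 1),
    "stuck":  (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits.

    In a full ECOC-SVM, each bit would come from one binary SVM; here the
    bits are passed in directly.
    """
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], bits))


# The "drift" codeword with its first bit flipped, as if one binary
# classifier had made a mistake.
noisy_bits = (1, 0, 1, 1, 0, 0, 1)
print(decode(noisy_bits))
```

Every pair of codewords in this codebook differs in four positions, so a single wrong binary decision still decodes to the intended class. That redundancy is the usual motivation for ECOC in multi-class problems.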
In sensorless control of permanent magnet synchronous motor (PMSM) drives, initial rotor position detection is critical to ensure smooth start-up operation. However, extracting initial position information for machines with ultra-low saliency ratios is challenging. To address this issue, this paper proposes a rotor position estimation method using image tracking based on high-frequency (HF) rotating voltage vector injection. Specifically, this paper extracts positive- and negative-sequence currents in the stationary reference frame and reconstructs the induced current vector. The resulting current vector images, which contain amplitude and phase information about the position, are used to train a support vector machine (SVM) that establishes the correlation between the current image and the rotor position. Ultimately, the electrical position can be obtained from image recognition. Since this method employs image tracking for position estimation, it accounts for cross saturation and multiple saliencies, leading to relatively high precision of position estimation. Experimental evaluation shows that the proposed method achieves higher precision than conventional observers.
A variety of machine learning methods, such as naive Bayesian classifiers, support vector machines and, more recently, deep neural networks, are demonstrating their utility for drug discovery and development. These leverage the generally bigger datasets created from high-throughput screening data and allow prediction of bioactivities for targets and molecular properties with increased levels of accuracy. We have only just begun to exploit the potential of these techniques, but they may already be fundamentally changing the research process for identifying new molecules and/or repurposing old drugs. The integrated, end-to-end (E2E) application of such machine learning models is broadly relevant and has considerable implications for developing future therapies and their targeting.
Forecasting techniques are affected by the randomness of renewable sources. Prediction models with more accurate results and lower error are necessary for the future development of microgrid projects and of the economic dispatch sector. The LS-SVM (Least Square Support Vector Machine), a relatively unexplored neural network known as GMDH (Group Method of Data Handling), and a novel hybrid algorithm, GLSSVM (Group Least Square Support Vector Machine), based on the combination of the first two models, were implemented to forecast the PV (Photovoltaic) output power at several time horizons up to 24 h. In order to improve the forecasting accuracy, each model was combined with three strategies for multi-step ahead forecasting (Direct, Recursive and DirRec). A detailed analysis of the normalized mean error is carried out to compare the different forecasting methods, using the historical PV output power data of a 960 kWp grid-connected PV system in the south of Italy. The outcomes demonstrate that the GLSSVM method with the DirRec strategy can give a normalized error of 2.92% under different weather conditions, with evident improvements with respect to the traditional ANN (Artificial Neural Network).
• Photovoltaic forecast is performed by the historical PV power data.
• LS-SVM and the GMDH models are applied to predict the PV output power at 24 h.
• Multi-step ahead forecasting strategies (Direct, Recursive and DirRec) were implemented.
• A comparative analysis based on the mean error is performed to evaluate the accuracy.
• A hybrid method GLSSVM has been investigated.
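To make the difference between the multi-step strategies concrete, the following is a minimal sketch of the Recursive and Direct strategies only (DirRec is not shown). It rests on assumptions: a one-lag ordinary least squares line stands in for the LS-SVM, GMDH and GLSSVM models, and the function names and toy series are hypothetical rather than taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b.

    This is a simple stand-in for the paper's regression models.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def recursive_forecast(series, horizon):
    """Fit one one-step-ahead model and feed its predictions back in."""
    a, b = fit_line(series[:-1], series[1:])
    predictions, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        predictions.append(last)
    return predictions


def direct_forecast(series, horizon):
    """Fit a separate model for each step h, always using observed inputs."""
    predictions = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        predictions.append(a * series[-1] + b)
    return predictions


# Toy series, not PV data: a perfectly linear ramp.
toy_series = [float(v) for v in range(1, 11)]
print(recursive_forecast(toy_series, 3))
print(direct_forecast(toy_series, 3))
```

The Recursive strategy reuses one model, so its errors can accumulate over the horizon, while the Direct strategy trains one model per horizon and never consumes its own predictions. On the noiseless ramp above, the two strategies agree.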
Network security is becoming increasingly important in our daily lives, not only for organizations but also for individuals. Intrusion detection systems have been widely used to prevent information from being compromised, and various machine-learning techniques have been proposed to enhance the performance of intrusion detection systems. However, higher-quality training data is an essential determinant that could improve detection performance. It is well known that the marginal density ratio is the most powerful univariate classifier. In this paper, we propose an effective intrusion detection framework based on a support vector machine (SVM) with augmented features. More specifically, we apply the logarithm marginal density ratios transformation to the original features with the goal of obtaining new and better-quality transformed features that can greatly improve the detection capability of an SVM-based detection model. The NSL-KDD dataset is used to evaluate the proposed method, and the empirical results show that it achieves a better and more robust performance than existing methods in terms of accuracy, detection rate, false alarm rate and training speed.
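A rough sketch of a logarithm marginal density ratio transformation for a single feature is given below. Several choices are assumptions made for illustration and may differ from the paper: the densities are estimated with equal-width histograms and add-one smoothing, the ratio is taken as attack over normal, and the function names, bin edges and sample values are hypothetical.

```python
import math


def bin_index(value, edges):
    """Index of the histogram bin containing value, clamped to the ends."""
    for i in range(len(edges) - 1):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2


def fit_histogram(values, edges):
    """Smoothed bin probabilities (add-one smoothing avoids log of zero)."""
    counts = [1.0] * (len(edges) - 1)
    for v in values:
        counts[bin_index(v, edges)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def log_density_ratio(x, attack_values, normal_values, edges):
    """Map a raw feature value to log(p_attack(x) / p_normal(x)).

    With equal-width bins, the ratio of bin probabilities equals the
    ratio of the histogram density estimates.
    """
    p_attack = fit_histogram(attack_values, edges)
    p_normal = fit_histogram(normal_values, edges)
    i = bin_index(x, edges)
    return math.log(p_attack[i] / p_normal[i])


# Hypothetical values of one feature for each class.
attack = [2.5, 2.6, 2.7]
normal = [0.1, 0.2, 0.3]
edges = [0.0, 1.0, 2.0, 3.0]
for x in (0.2, 1.5, 2.5):
    print(x, log_density_ratio(x, attack, normal, edges))
```

Under this sign convention, positive transformed values indicate that the raw value is more typical of attack traffic and negative values indicate normal traffic. Applying the transformation to each feature independently would yield the transformed feature vector passed to the SVM.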
In order to protect industrial safety, improve the operational stability of industrial control systems, respond to external attacks on the network environment, and realize simulation in a virtual reality environment, this study first proposes a class and sample weighted C-support vector machine (CSWC-SVM) algorithm. Then, an intrusion detection model for industrial control networks is built based on the CSWC-SVM algorithm. Finally, KDD CUP 1999 data are used to carry out simulation experiments on the constructed model in the virtual reality simulation environment. The results show that when the penalty factor of the polynomial kernel function, radial basis kernel function, and sigmoid kernel function is 104, the average number of support vectors is 45, 46, and 37, respectively; the average training times are about 0.43, 0.45, and 0.47 s, and the average test times are about 9.7, 9.9, and 10.2 s, respectively; the average recognition accuracy is about 85.7%, 86.2%, and 86.7%, and the false positive rate is 3.8%, 2.8%, and 2.3%, respectively; the accuracy of the CSWC-SVM algorithm for different sample sizes (1000-6000) can be kept above 90%. The operation error rate of the CSWC-SVM algorithm is lower than that of C-SVM, C-SVM, and RS-SVM algorithms under different validation data sets. After dimension reduction, the classification accuracy of the CSWC-SVM algorithm is higher than that of C-SVM and WC-SVM algorithms. As the weight value increases from 0 to 200, the number of model errors on 1000, 2000, and 3000 pieces of data decreases significantly. When the weight value is 200, the number of errors drops to 0, and the classification accuracy reaches 100%. In short, the CSWC-SVM algorithm constructed in this study performs well in response to attacks on the industrial control system in the virtual reality simulation environment, which is of practical significance for the application of virtual reality in industrial monitoring.
Contamination from pesticides and nitrate in groundwater is a significant threat to water quality in general and in agriculturally intensive regions in particular. Three widely used machine learning models, namely, artificial neural networks (ANN), support vector machines (SVM), and extreme gradient boosting (XGB), were evaluated for their efficacy in predicting contamination levels using sparse data with non-linear relationships. The predictive ability of the models was assessed using a dataset consisting of 303 wells across 12 Midwestern states in the USA. Multiple hydrogeologic, water quality, and land use features were chosen as the independent variables, and classes were based on measured concentration ranges of nitrate and pesticide. This study evaluates the classification performance of the models for two, three, and four class scenarios and compares them with the corresponding regression models. The study also examines the issue of class imbalance and tests the efficacy of three class imbalance mitigation techniques: oversampling, weighting, and oversampling and weighting, for all the scenarios. The models’ performance is reported using multiple metrics, both insensitive to class imbalance (accuracy) and sensitive to class imbalance (F1 score and MCC). Finally, the study assesses the importance of features using game-theoretic Shapley values to rank features consistently and offer model interpretability.
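The reason for reporting several metrics on imbalanced data can be seen in a small worked example. The sketch below computes accuracy, F1 score and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The counts are invented for illustration and are not results from the study, and returning 0 when a denominator is zero is a convention assumed here.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1 and MCC from binary confusion-matrix counts.

    F1 and MCC are set to 0.0 when their denominators are zero, which is
    a common convention but an assumption of this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return accuracy, f1, mcc


# Invented example: 5 contaminated wells out of 100, and a model that
# labels every well as uncontaminated.
print(binary_metrics(tp=0, fp=0, fn=5, tn=95))
```

In this invented case the accuracy is 0.95 even though no contaminated well is detected, while F1 and MCC are both 0. Accuracy alone hides the failure on the minority class, which F1 and MCC expose.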
Bearing defects have been accepted as one of the major causes of failure in rotating machinery. It is important to identify and diagnose the failure behavior of bearings for the reliable operation of equipment. In this paper, a low-cost non-contact vibration sensor has been developed for detecting faults in bearings. The supervised learning method, support vector machine (SVM), has been employed as a tool to validate the effectiveness of the developed sensor. Experimental vibration data collected for different bearing defects under various loading and running conditions have been analyzed to develop a system for diagnosing faults for machine health monitoring. Fault diagnosis has been accomplished using the discrete wavelet transform for denoising the signal. The Mahalanobis distance criterion has been employed for selecting the strongest features from the extracted relevant features. Finally, these selected features have been passed to the SVM classifier for identifying and classifying the various bearing defects. The results reveal that the vibration signatures obtained from the developed non-contact sensor compare well with the accelerometer data obtained under the same conditions. The developed sensor is a promising tool for detecting bearing damage and identifying its class. The SVM results have established the effectiveness of the developed non-contact sensor as a vibration measuring instrument, which makes it a cost-effective tool for the condition monitoring of rotating machines.
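One plausible reading of a Mahalanobis-distance feature selection step is sketched below; the abstract does not give the exact procedure, so this is an assumption. For a single feature, the Mahalanobis distance between two class means reduces to the absolute difference of the means divided by the pooled standard deviation, and features can be ranked by that score. The feature names and sample values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pooled_std(a, b):
    """Pooled standard deviation of two samples of one feature."""
    mean_a, mean_b = mean(a), mean(b)
    squares = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return math.sqrt(squares / (len(a) + len(b) - 2))


def separation(a, b):
    """One-dimensional Mahalanobis distance between two class means."""
    return abs(mean(a) - mean(b)) / pooled_std(a, b)


def rank_features(healthy, faulty):
    """Feature names sorted from most to least separating."""
    scores = {name: separation(healthy[name], faulty[name]) for name in healthy}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical feature values for healthy and faulty bearings.
healthy = {"rms": [1.0, 1.1, 0.9], "kurtosis": [3.0, 3.5, 2.5]}
faulty = {"rms": [2.0, 2.1, 1.9], "kurtosis": [3.1, 3.6, 2.6]}
print(rank_features(healthy, faulty))
```

In this toy data the hypothetical "rms" feature separates the two classes far better than "kurtosis", so it would be ranked first and passed on to the classifier.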
Support vector ordinal regression (SVOR) is a popular method to tackle ordinal regression problems. However, until now no effective algorithms have been proposed to address incremental SVOR learning, due to the complicated formulations of SVOR. Recently, an interesting accurate on-line algorithm was proposed for training ν-support vector classification (ν-SVC), which can handle a quadratic formulation with a pair of equality constraints. In this paper, we first present a modified SVOR formulation based on a sum-of-margins strategy. The formulation has multiple constraints, and each constraint includes a mixture of an equality and an inequality. Then, we extend the accurate on-line ν-SVC algorithm to the modified formulation, and propose an effective incremental SVOR algorithm. The algorithm can handle a quadratic formulation with multiple constraints, where each constraint consists of an equality and an inequality. More importantly, it tackles the conflicts between the equality and inequality constraints. We also provide a finite convergence analysis for the algorithm. Numerical experiments on several benchmark and real-world data sets show that the incremental algorithm can converge to the optimal solution in a finite number of steps, and is faster than the existing batch and incremental SVOR algorithms. Meanwhile, the modified formulation has better accuracy than the existing incremental SVOR algorithm, and is as accurate as the sum-of-margins based formulation of Shashua and Levin.