Human action recognition (HAR) has gained much attention in recent years due to its numerous applications, including human activity monitoring, robotics, and visual surveillance, to name but a few. Most previously proposed HAR systems have focused on hand-crafted image features. However, these features cover limited aspects of the problem and show performance degradation on large and complex datasets. Therefore, in this work, we propose a novel HAR system based on the fusion of conventional hand-crafted features, using the histogram of oriented gradients (HoG), with deep features. Initially, the human silhouette is extracted with the help of a saliency-based method implemented in two phases. In the first phase, motion and geometric features are extracted from the selected channel, whilst the second phase calculates the Chi-square distance between the extracted features and threshold-based minimum-distance features. Afterwards, the extracted deep CNN and hand-crafted features are fused to generate a resultant vector. Moreover, to cope with the curse of dimensionality, an entropy-based feature selection technique is also proposed to identify the most discriminant features for classification using a multi-class support vector machine (M-SVM). All simulations are performed on five publicly available benchmark datasets: Weizmann, UCF11 (YouTube), UCF Sports, IXMAS, and UT-Interaction. A comparative evaluation is also presented to show that the proposed model achieves superior performance in comparison to several existing methods.
•Motion and geometric features are extracted for human flow estimation and silhouette extraction.
•Deep CNN and hand-crafted features are fused through a parallel approach.
•An entropy-controlled Chi-square approach is proposed for best-feature selection.
•Experiments are performed on several well-known datasets.
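The entropy-based feature selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the histogram bin count, the "keep the highest-entropy columns" rule, and the random stand-in features are all assumptions.

```python
import numpy as np

def entropy_select(features, k):
    """Rank feature columns by Shannon entropy and keep the top k.

    features: (n_samples, n_features) fused feature matrix.
    Columns with higher entropy are treated as more informative.
    """
    n_features = features.shape[1]
    scores = np.zeros(n_features)
    for j in range(n_features):
        # histogram-based entropy estimate per feature column (16 bins assumed)
        hist, _ = np.histogram(features[:, j], bins=16)
        p = hist[hist > 0] / hist.sum()
        scores[j] = -np.sum(p * np.log2(p))
    keep = np.argsort(scores)[::-1][:k]
    return features[:, keep], keep

# toy example: fuse hand-crafted and deep descriptors, then select
rng = np.random.default_rng(0)
hog_feats = rng.normal(size=(100, 50))    # stand-in HoG descriptors
cnn_feats = rng.normal(size=(100, 200))   # stand-in deep CNN descriptors
fused = np.concatenate([hog_feats, cnn_feats], axis=1)
selected, idx = entropy_select(fused, k=64)
```

The selected matrix would then be handed to the M-SVM classifier in place of the full fused vector.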
In the agricultural farming business, weeds, pests, and other plant diseases are a major cause of monetary losses around the globe. This is an imperative factor, as it causes a significant diminution in both the quality and capacity of crop growing. Therefore, the detection and taxonomy of various plant diseases are crucial and demand utmost attention. This loss can be minimized by detecting crop diseases at their earlier stages. In this article, we primarily focus on a cucumber leaf disease detection and classification method comprising five stages: image enhancement, infected-spot segmentation, deep feature extraction, feature selection, and finally classification. Image enhancement is performed as a pre-processing step; it efficiently improves local contrast and makes infected regions more visible, which are later segmented with a novel Sharif saliency-based (SHSB) method. The segmentation results are further improved by fusing active contour segmentation with the proposed saliency method. This step is important for correct and useful feature extraction. In this work, pre-trained models (VGG-19 and VGG-M) are utilized for feature extraction, and the most prominent features are later selected based on three parameters: local entropy, local standard deviation, and the local interquartile range. These refined features are finally fed to a multi-class support vector machine for disease identification. To prove the authenticity of the proposed algorithm, five cucumber leaf diseases are considered and classified, achieving a classification accuracy of 98.08% in 10.52 seconds. Additionally, the proposed method is compared with recent techniques to demonstrate its effectiveness.
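Selecting features by entropy, standard deviation, and interquartile range, as described above, can be sketched like this. The equal weighting of the three scores, the bin count, and the keep ratio are assumptions for illustration, not the article's actual parameters.

```python
import numpy as np

def stat_rank_select(X, keep_ratio=0.5):
    """Rank feature columns by entropy, standard deviation and
    interquartile range; keep the top fraction.
    Equal weighting of the three scores is an assumption."""
    X = np.asarray(X, dtype=float)
    stds = X.std(axis=0)
    iqrs = np.percentile(X, 75, axis=0) - np.percentile(X, 25, axis=0)
    ents = np.empty(X.shape[1])
    for j, col in enumerate(X.T):
        hist, _ = np.histogram(col, bins=16)
        p = hist[hist > 0] / hist.sum()
        ents[j] = -(p * np.log2(p)).sum()

    def norm(v):  # rescale each score to [0, 1] before summing
        return (v - v.min()) / (v.max() - v.min() + 1e-12)

    score = norm(ents) + norm(stds) + norm(iqrs)
    k = max(1, int(keep_ratio * X.shape[1]))
    return np.argsort(score)[::-1][:k]

rng = np.random.default_rng(0)
deep_feats = rng.normal(size=(50, 120))   # stand-in for VGG-19/VGG-M features
kept = stat_rank_select(deep_feats, keep_ratio=0.25)
```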
Brain tumor detection is a tough job because of variations in tumor shape, size, and appearance. In this manuscript, a deep learning model is deployed to predict input slices as tumor (unhealthy) or non-tumor (healthy). A high-pass-filtered image is employed to highlight the inhomogeneity field effect of the MR slices and is fused with the input slices. Moreover, a median filter is applied to the fused slices. The quality of the resultant slices is improved, with smoothed and highlighted edges of the input slices. After that, based on the intensity of these slices, a 4-connected seed growing algorithm is applied, where an optimal threshold clusters the similar pixels of the input slices. The segmented slices are then supplied to the proposed fine-tuned two-layer stacked sparse autoencoder (SSAE) model. The hyperparameters of the model are selected after extensive experiments: 200 hidden units are utilized at the first layer and 400 at the second. Testing is performed on the softmax layer for the prediction of images with and without tumors. The suggested model is trained and checked on the BRATS datasets, i.e., 2012 (challenge and synthetic), 2013, 2013 Leaderboard, 2014, and 2015. The presented model is evaluated with a number of performance metrics, which demonstrate its improved performance.
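The 4-connected seed growing step can be sketched as a simple flood fill over pixel intensities. The fixed intensity threshold and the toy image below are assumptions; the manuscript selects an optimal threshold rather than a hard-coded one.

```python
import numpy as np
from collections import deque

def region_grow(img, seed, thresh):
    """4-connected seed growing: flood out from `seed`, accepting
    neighbours whose intensity differs from the seed's intensity
    by at most `thresh`."""
    h, w = img.shape
    seed_val = float(img[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    q = deque([seed])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connectivity
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(img[nr, nc]) - seed_val) <= thresh:
                    mask[nr, nc] = True
                    q.append((nr, nc))
    return mask

# toy "slice": a bright square region on a dark background
img = np.zeros((8, 8))
img[2:6, 2:6] = 200
mask = region_grow(img, (3, 3), thresh=50)  # grows over the 4x4 bright region
```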
•CNN is used as a building block to represent pedestrian head-pose and body-orientation classes with low-resolution images.
•The proposed system is applicable to both still images and image sequences.
•Two separate big datasets for head pose and body orientation are prepared to employ deep learning.
•Only grayscale images from 2D cameras are considered as input to the proposed model.
•Promising classification results are achieved and compared to current state-of-the-art approaches.
Pedestrian orientation recognition, including head and body directions, is a demanding task in human activity-recognition scenarios. While moving in one direction, a pedestrian may be directing their visual attention in another. The analysis of such orientations via computer-vision applications is desirable for automated pedestrian intention and behavior analysis. This paper addresses appearance-based pedestrian head-pose and full-body orientation prediction by employing a deep-learning mechanism. A supervised deep convolutional neural-network model is presented as a deep-learning building block for classification. Two separate datasets are prepared for head-pose and full-body orientation estimation, and the proposed model is subsequently trained separately on the two prepared datasets with eight orientation bins. Testing of the proposed model is performed with publicly available datasets, as well as self-captured real-time image sequences. The experiments reveal mean accuracies of 0.91 for head-pose estimation and 0.92 for full-body orientation estimation. The performance results illustrate that the proposed approach effectively classifies head poses and body orientations simultaneously in different setups, and a comparison with existing state-of-the-art approaches demonstrates its effectiveness.
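The eight orientation bins mentioned above can be obtained by discretizing a continuous heading angle. The paper does not specify its binning convention, so the bin width and the choice of centring bin 0 on 0° are assumptions in this minimal sketch:

```python
def angle_to_bin(angle_deg: float, n_bins: int = 8) -> int:
    """Map a heading angle in degrees to one of n_bins discrete
    orientation classes; bin 0 is centred on 0 degrees (assumed)."""
    width = 360.0 / n_bins
    return int(((angle_deg + width / 2.0) % 360.0) // width)

# e.g. 0° -> bin 0, 90° -> bin 2, 180° -> bin 4, 350° -> bin 0
```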
The physical appearance of a brain tumor in human beings may be an indication of problems in psychological (cognitive) functions. Such functions include learning, understanding, problem solving, decision making, and planning. Early brain tumor detection can be achieved through proper screening procedures. MRI is used for the detection of disease staging and follow-up without ionizing radiation. In this manuscript, an automated system is proposed for the analysis of brain data and the detection of cognitive-function abnormalities. The region of interest (ROI) is enhanced using a proposed partial differential diffusion filter (PDDF), a modified form of the anisotropic diffusion filter. The Otsu algorithm is then used for improved segmentation. Moreover, a new feature extraction method is also proposed: a concatenation of the local binary pattern (LBP) and the gray-level co-occurrence matrix (C2LBPGLCM). The proposed method accurately distinguishes between healthy and unhealthy images with high specificity, sensitivity, and area under the curve.
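The concatenated LBP + GLCM descriptor can be sketched as below. This is a simplified stand-in for the proposed C2LBPGLCM, not its actual definition: the 8-neighbour LBP variant, the single horizontal GLCM offset, the quantization level, and the two Haralick statistics (contrast and energy) are all assumptions.

```python
import numpy as np

def lbp_hist(img):
    """Basic 8-neighbour LBP histogram (256 bins, normalised)."""
    c = img[1:-1, 1:-1]  # interior pixels (centres)
    code = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(shifts):
        nb = img[1 + dr:img.shape[0] - 1 + dr, 1 + dc:img.shape[1] - 1 + dc]
        code |= (nb >= c).astype(np.uint8) << bit  # set bit if neighbour >= centre
    hist = np.bincount(code.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def glcm_feats(img, levels=8):
    """Horizontal-offset GLCM with contrast and energy descriptors."""
    q = (img.astype(float) / 256 * levels).astype(int).clip(0, levels - 1)
    glcm = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    contrast = ((i - j) ** 2 * p).sum()
    energy = (p ** 2).sum()
    return np.array([contrast, energy])

# toy grayscale patch; the concatenation yields one joint feature vector
img = (np.arange(64).reshape(8, 8) * 4).astype(np.uint8)
fv = np.concatenate([lbp_hist(img), glcm_feats(img)])
```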
Computer-aided classification of diseases of the gastrointestinal tract (GIT) has become a crucial area of research. Medical science and artificial intelligence have helped medical experts find GIT diseases through endoscopic procedures. Wired endoscopy is a controlled procedure that helps the medical expert in disease diagnosis. Manual screening of endoscopic frames is a challenging and time-consuming task for medical experts and also increases the miss rate of GIT diseases. An early diagnosis of GIT disease can save human beings from fatal diseases. An automatic deep feature learning-based system is proposed for GIT disease classification. Adaptive gamma correction and weighting distribution (AGCWD) preprocessing is the first stage of the proposed work and is used to enhance the intensity of the frames. Deep features are extracted from the frames by deep learning models, including InceptionNetV3 and GITNet. An Ant Colony Optimization (ACO) procedure is employed for feature optimization, and the optimized features are fused serially. The classification operation is performed by variants of the support vector machine (SVM) classifier, including the Cubic SVM (CSVM), Coarse Gaussian SVM (CGSVM), Quadratic SVM (QSVM), and Linear SVM (LSVM). The intended model is assessed on two challenging datasets, KVASIR and NERTHUS, which consist of eight and four classes, respectively. The model outperforms existing methods, achieving an accuracy of 99.32% on the KVASIR dataset and 99.89% on the NERTHUS dataset.
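The AGCWD preprocessing stage follows a well-known recipe: weight the intensity histogram, build its cumulative distribution, and apply a per-level gamma of 1 − cdf. The sketch below assumes 8-bit grayscale frames and a weighting exponent of 0.5; the exact parameters used in the paper are not stated.

```python
import numpy as np

def agcwd(img, alpha=0.5):
    """Adaptive gamma correction with weighting distribution (sketch).

    img: uint8 grayscale frame. Builds a weighted pdf of intensities,
    then maps each level l through 255 * (l/255) ** (1 - cdf_w(l)).
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    pdf = hist / hist.sum()
    # weighting distribution compresses the dynamic range of the pdf
    pdf_w = pdf.max() * ((pdf - pdf.min()) / (pdf.max() - pdf.min() + 1e-12)) ** alpha
    cdf_w = np.cumsum(pdf_w) / pdf_w.sum()
    levels = np.arange(256) / 255.0
    table = np.round(255 * levels ** (1 - cdf_w)).astype(np.uint8)
    return table[img]  # lookup-table application per pixel

# toy dark frame: intensity ramp 0, 32, ..., 224 repeated over 8 rows
frame = np.tile(np.arange(0, 256, 32, dtype=np.uint8), (8, 1))
enhanced = agcwd(frame)  # dark mid-tones are brightened
```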
Traditional methods for behavior detection of distracted drivers are not capable of capturing driver behavior related to complex temporal features. With the goal of improving transportation safety and reducing fatal road accidents, this research article presents a Hybrid Scheme for the Detection of Distracted Driving (HSDDD). The scheme is based on a strategy of aggregating handcrafted and deep CNN features, and rests on a three-tiered architecture: the Coordination tier, the Concatenation tier, and the Classification tier. We first obtain HOG features using handcrafted algorithms; then, at the Coordination tier, we leverage four deep CNN models, AlexNet, Inception V3, ResNet50, and VGG-16, to extract DCNN features. The DCNN-extracted features are fused with the HOG-extracted features at the Concatenation tier. PCA is then used as a feature selection technique: it removes redundant and irrelevant information from the extracted features, improving classification performance. After feature fusion and feature selection, two classifiers at the Classification tier, KNN and SVM, take the selected features and classify the ten classes of distracted driving behaviors. We evaluate the proposed scheme and observe its performance using accuracy metrics.
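The Concatenation tier followed by PCA reduction can be sketched with a numpy-only projection. The component count and the random stand-in descriptors are assumptions; the article fuses real HOG and DCNN features.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project feature rows onto the top principal components.

    Numpy-only PCA sketch: centre the data, take the SVD, and keep
    the directions with the largest singular values.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = components
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(1)
hog = rng.normal(size=(120, 324))    # stand-in HOG descriptors (120 frames)
dcnn = rng.normal(size=(120, 512))   # stand-in DCNN descriptors
fused = np.hstack([hog, dcnn])       # Concatenation tier
reduced = pca_reduce(fused, 50)      # feature selection before KNN/SVM
```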
As the number of internet users increases, so does the number of malicious attacks using malware. The detection of malicious code is becoming critical, and existing approaches need to be improved. Here, we propose a feature fusion method that combines the features extracted from the pre-trained AlexNet and Inception-v3 deep neural networks with features attained using segmentation-based fractal texture analysis (SFTA) of images representing the malware code. The purpose of deep convolutional neural network (CNN) feature extraction from two models is to improve the accuracy of the malware classifier, because the two models extract complementary features. This technique produces a fusion of features that builds a multimodal representation of malicious code, which is used to classify the grayscale images into 25 malware classes. The features extracted from the malware images are then classified using different variants of the support vector machine (SVM), k-nearest neighbor (KNN), decision tree (DT), and other classifiers. To improve the classification results, we also adopt data augmentation based on affine image transforms. The presented method is evaluated on the Malimg malware image dataset, achieving an accuracy of 99.3%, the best among the competing approaches.
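Grayscale images of malware code, such as those in the Malimg dataset, are commonly produced by rendering the binary one byte per pixel at a fixed width. A minimal sketch of that conversion follows; the width of 64 and the zero-padding of the last row are assumptions for illustration.

```python
import numpy as np

def bytes_to_grayscale(raw, width=64):
    """Render a binary as a grayscale image: one byte per pixel,
    fixed row width, zero-padded to fill the final row."""
    buf = np.frombuffer(raw, dtype=np.uint8)
    rows = int(np.ceil(buf.size / width))
    padded = np.zeros(rows * width, dtype=np.uint8)
    padded[:buf.size] = buf
    return padded.reshape(rows, width)

# toy "binary": 20 repetitions of the byte values 0..255
img = bytes_to_grayscale(bytes(range(256)) * 20, width=64)
```

The resulting 2-D array can then be fed to the CNN feature extractors and SFTA just like any grayscale image.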
White blood cells (WBCs) are an indispensable constituent of the immune system. Efficient and accurate categorization of WBCs is a critical task in disease diagnosis by medical experts, as it helps in the correct identification of medical problems. In this research work, WBC classes are categorized with the help of a transfer learning model in combination with our proposed virtual hexagonal trellis (VHT) feature extraction method. The VHT feature extractor is a kernel-based filter model designed over a square lattice. In the first step, the Graft Net CNN model is used to extract features from augmented dataset images; the VHT-based feature extractor then extracts further useful features. The CNN-extracted features are passed to an ant colony optimization (ACO) module for optimal feature acquisition. The features extracted from the VHT-based filter and ACO are serially merged to create a single feature vector, which is passed to support vector machine (SVM) variants for classification. Our strategy yields 99.9% accuracy, outperforming other existing methods.
Appearance-based gender classification is one of the key areas of pedestrian analysis and has many useful applications, such as visual surveillance, demographic-statistics prediction, population prediction, and human–computer interaction. For pedestrian gender classification, traditional and deep convolutional neural network (CNN) approaches have been employed individually. However, they face issues such as weak discriminative feature representations, low classification accuracy, and small sample sizes for model learning. To address these issues, this article proposes a framework that combines traditional and deep CNN approaches for gender classification. To realize it, HOG- and LOMO-assisted low-level features are extracted to handle rotation, viewpoint, and illumination variations in the images. Simultaneously, VGG19- and ResNet101-based standard deep CNN architectures are employed to acquire deep features, which are robust against pose variations. To avoid ambiguous and unnecessary feature representations, entropy-controlled features are picked from both the low-level and deep representations, reducing the dimensionality of the computed features. By merging the selected low-level features with the deep features, we obtain a robust joint feature representation. Extensive experiments are conducted on the PETA and MIT datasets, and the computed results suggest that integrating low-level and deep feature representations improves performance compared with using either representation individually. The proposed framework achieves an AU-ROC of 96% and accuracy of 89.3% on the PETA dataset, and an AU-ROC of 86% and accuracy of 82% on the MIT dataset. The experimental outcomes show that the proposed J-LDFR framework outperforms existing gender classification methods.