Vision-based human action recognition (HAR) has been an active research topic over the past decade because of applications such as visual surveillance and robotics. Accurate action recognition requires various local and global points, known as features. These features change as human movement varies. However, because several human actions differ only slightly, their features overlap, which degrades recognition performance. In this article, we design a new 26-layered Convolutional Neural Network (CNN) architecture for accurate complex action recognition. Features are extracted from the global average pooling layer and the fully connected (FC) layer and fused by a proposed high entropy-based approach. Further, we propose a feature selection method named Poisson distribution along with Univariate Measures (PDaUM). Some of the fused CNN features are irrelevant and some are redundant, which leads to incorrect predictions among complex human actions. Therefore, the proposed PDaUM-based approach selects only the strongest features, which are then passed to the Extreme Learning Machine (ELM) and Softmax for final recognition. Four datasets are used for experimental analysis: HMDB51 (51 classes), UCF Sports (10 classes), KTH (6 classes), and Weizmann (10 classes). On these datasets, the ELM classifier gives improved performance compared to the Softmax classifier. The achieved accuracy on the four datasets is 81.4%, 99.2%, 98.3%, and 98.7%, respectively. A comparison with existing techniques shows that the proposed architecture gives better performance in terms of accuracy and testing time.
Convolutional neural networks (CNNs) have achieved excellent results in image recognition, where objects in images are classified. A typical CNN has a deep architecture that uses a large number of weights and layers to achieve high performance. CNNs require relatively large memory space and incur high computational costs, which not only increase the time to train the model but also limit the real-time application of the trained model. For this reason, various neural network compression methodologies have been studied to use CNNs efficiently on small embedded hardware such as mobile and edge devices. In this paper, we propose a kernel density estimation based non-uniform quantization methodology that can perform compression efficiently. The proposed method performs efficient weight quantization using a significantly smaller number of sampled weights than the number of original weights. Four-bit quantization experiments on the classification of the ImageNet dataset with various CNN architectures show that the proposed methodology can perform weight quantization efficiently in terms of computational costs without a significant reduction in model performance.
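The idea of deriving non-uniform quantization levels from a kernel density estimate fitted on a small weight sample can be illustrated with a minimal sketch. This is not the authors' algorithm: the Gaussian kernel, Silverman's bandwidth rule, the equal-probability placement of levels, and all function names and default parameters below are assumptions made for illustration only.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kde_quantization_levels(weights, n_bits=4, n_samples=200, grid_size=512, seed=0):
    """Pick 2**n_bits non-uniform levels from a KDE fitted on a small weight sample."""
    rng = random.Random(seed)
    sample = rng.sample(weights, min(n_samples, len(weights)))
    # Silverman's rule of thumb for the bandwidth (an assumption of this sketch).
    # Note: this fails if every sampled weight is identical (zero spread).
    mean = sum(sample) / len(sample)
    std = math.sqrt(sum((s - mean) ** 2 for s in sample) / (len(sample) - 1))
    bandwidth = 1.06 * std * len(sample) ** (-0.2)
    density = gaussian_kde(sample, bandwidth)
    # Evaluate the density on an evenly spaced grid over the sampled range.
    lo, hi = min(sample), max(sample)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    # Cumulative density on the grid, normalised so that it ends at 1.
    cdf, total = [], 0.0
    for x in grid:
        total += density(x)
        cdf.append(total)
    cdf = [c / total for c in cdf]
    # Place one level at the centre of each equal-probability bin, so that
    # dense regions of the weight distribution receive more levels.
    n_levels = 2 ** n_bits
    levels, j = [], 0
    for k in range(n_levels):
        target = (k + 0.5) / n_levels
        while cdf[j] < target:
            j += 1
        levels.append(grid[j])
    return levels


def quantize(weights, levels):
    """Map each weight to its nearest level; return (codes, dequantized weights)."""
    codes = [min(range(len(levels)), key=lambda i: abs(w - levels[i])) for w in weights]
    return codes, [levels[i] for i in codes]
```

With four bits, each weight is stored as a code between 0 and 15 plus a shared table of 16 levels. Only the sampled weights are used to fit the density, which is where the computational saving in a scheme of this kind would come from.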
As augmented reality technologies develop, real-time interactions between objects present in the real world and virtual space are required. Generally, recognition and location estimation in augmented reality are carried out using tracking techniques, typically based on markers. However, using markers creates spatial constraints in the simultaneous tracking of space and objects. Therefore, we propose a system that enables camera tracking in the real world and visualizes virtual visual information through the recognition and positioning of objects. We scan the space using an RGB-D camera. A three-dimensional (3D) dense point cloud map is created using point clouds generated from video images. From the generated point cloud information, objects are detected and retrieved based on pre-learned data. Finally, using the predicted pose of the detected objects, other information may be augmented. Our system performs object recognition and 3D pose estimation based on simple camera information, enabling the viewing of virtual visual information based on object location.
Neurofibromatosis type 1 (NF1) is a relatively common inherited disorder characterized by the formation of neurofibromas, pigmentary abnormalities of the skin, Lisch nodules of the iris, and skeletal abnormalities. Multiple cutaneous neurofibromas are benign nerve sheath tumors and the main manifestation of NF1. Cardiac neurofibroma associated with NF1 is very rare, and few cases have been reported in the literature. Herein, we present the CT and MRI findings of a surgically confirmed left ventricular neurofibroma in a 32-year-old female with NF1.
Multimodal sentiment analysis extends traditional language-based sentiment analysis by using data from other relevant modalities. It usually applies visual, textual, and acoustic representations for sentiment prediction. Recently, various data fusion methodologies have been proposed for multimodal sentiment analysis. In most cases, the textual modality plays a major role, and the visual and acoustic modalities are used as auxiliary sources. However, in general multimedia such as video, text transcripts of an individual's speech are not provided. Research on an audio-visual sentiment analysis methodology that does not depend on the text modality is therefore essential for multimodal sentiment analysis in real-world industrial applications. It is important to improve audio-visual sentiment analysis because it currently exhibits lower performance than multimodal sentiment analysis that includes the text modality. In this paper, we propose heterogeneous modality transfer learning (HMTL), which uses the knowledge of aligned text data as a source modality in transfer learning to improve audio-visual sentiment analysis performance. Our approach uses a decoder and adversarial learning techniques to reduce the gap between the source and target modalities in the embedded space for multimodal representation. In experiments, our proposed methodology outperformed recent unimodal and bimodal audio-visual sentiment analysis results.
This study proposes a framework for an intelligent agent information service using digital human and deep learning technology. The framework can recognize the identity of individuals using facial features and provide personalized services through a digital human. The personalized service is defined by a relevance graph based on personal data collected in advance. The proposed system can continuously evolve to recommend customized services using relevance graphs and dynamic data processing, gradually becoming more intelligent as additional data are collected. Moreover, it uses animation keyframe interpolation for natural and seamless digital human interaction and provides visual effects that are synchronized with specific collected information to make the service intuitive. The proposed system was tested in a school domain for two months, and a statistical domain feedback system based on a mathematical model that predicts service usage per unit time was developed using the recorded information. Additionally, we evaluate our system through user experience surveys.
• Detecting and categorizing the severity of Fusarium wilt of radish by thresholding a range of color features.
• Segmenting the radish regions from other regions of the field, such as ground and mulching film, using K-means clustering.
• Proposing two datasets, which will be shared publicly for further experimentation and simulations.
The importance of plants is evident in the dependence of animals and humans on them. Plants provide oxygen, materials, food, and the beauty of the world. Climate change, the decline in pollinators, and plant diseases are causing a significant decline in both the quality and the coverage ratio of plants and crops on a global scale. In developed countries, more than 80 percent of rural production comes from sharecropping. However, due to widespread plant diseases, yields are reported to have declined by more than half. These diseases are identified and diagnosed by the agricultural and forestry department. Manual inspection of large areas of fields requires a huge amount of time and effort, which significantly reduces its effectiveness. To counter this problem, we propose an automatic disease detection and classification method for radish fields. A camera attached to an unmanned aerial vehicle (UAV) captures high-quality images of the fields, which are analyzed by extracting both color and texture features. K-means clustering is then used to filter the radish regions, which are fed into a fine-tuned GoogLeNet to detect Fusarium wilt of radish efficiently at an early stage. This allows the authorities to take timely action to ensure food safety for current and future generations.
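The K-means step that separates radish regions from ground and mulching film can be sketched on raw pixel colors. This is a plain Lloyd's algorithm under stated assumptions, not the authors' pipeline: the function name, the use of RGB triples as features, and the starting centroids are all hypothetical.

```python
def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's algorithm on tuples of numbers; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical RGB pixels: greenish (plants), brownish (ground), dark (film).
pixels = [(40, 160, 50), (50, 170, 60), (45, 150, 55),
          (120, 90, 60), (130, 95, 65),
          (20, 20, 25), (25, 22, 30)]
# One starting centroid per expected region (an assumption of this sketch).
centers, labels = kmeans(pixels, [pixels[0], pixels[3], pixels[5]])
```

In a setup like this, the pixels whose label matches the plant cluster would be kept and passed on to the classifier, while the other clusters are discarded.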
Across the globe, health consciousness is increasing, and everyone wants to maintain a healthy and normal life. However, in today's fast-moving world, obesity and related issues have become major health problems. According to medical experts, a person is defined as obese when their BMI is greater than 30 kg/m². Obesity leads to many diseases, such as high cholesterol, liver failure, breathing issues, heart problems, diabetes, and sometimes cancer. Eating healthy foods with high nutritional value and low calorie content can help control obesity. People often cannot control their appetite and tend to eat the foods they like most, which leads to obesity. Many people have difficulty choosing food items that have good nutritional value and low calorific values. A system that suggests foods and reports their calorific values could help address the obesity problem. In this paper, food type identification and calorific value estimation are performed using a multilayer perceptron (MLP) model, and the results are discussed. From the mixed food items, a region of interest is selected, from which features are extracted. The extracted features are fed as input to the MLP. The calories present in the food are calculated based on the food volume. The algorithm is implemented in the MATLAB environment for fruits and food items. The results show that the level of food item detection and the accuracy of the calorific value estimate were acceptable.
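The obesity criterion mentioned above follows from the standard definition of BMI as body weight in kilograms divided by the square of height in meters. A minimal sketch, with hypothetical function names and the threshold of 30 kg/m² taken from the text:

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def is_obese(weight_kg, height_m, threshold=30.0):
    """Apply the criterion from the text: BMI greater than 30 kg/m^2."""
    return bmi(weight_kg, height_m) > threshold
```

For example, a person weighing 90 kg at a height of 1.70 m has a BMI of roughly 31 kg/m² and would be classified as obese under this criterion.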
Physical trauma-related mortality places a heavy burden on society. Estimating the mortality risk in physical trauma patients is crucial to enhance treatment efficiency and reduce this burden. The most popular and accurate model is the Injury Severity Score (ISS), which is based on the Abbreviated Injury Scale (AIS), an anatomical injury severity scoring system. However, the AIS requires specialists to code the injury scale by reviewing a patient's medical record; therefore, applying the model to every hospital is impossible.
We aimed to develop an artificial intelligence (AI) model to predict in-hospital mortality in physical trauma patients using the International Classification of Diseases, 10th Revision (ICD-10), triage scale, procedure codes, and other clinical features.
We used the Korean National Emergency Department Information System (NEDIS) data set (N=778,111) compiled from over 400 hospitals between 2016 and 2019. To predict in-hospital mortality, we used the following input features: ICD-10, patient age, gender, intentionality, injury mechanism, emergent symptom, Alert/Verbal/Painful/Unresponsive (AVPU) scale, Korean Triage and Acuity Scale (KTAS), and procedure codes. We proposed an ensemble of deep neural networks (EDNN) built via 5-fold cross-validation and compared it with other state-of-the-art machine learning models, including traditional prediction models. We further investigated the effect of the features.
Our proposed EDNN with all features provided the highest area under the receiver operating characteristic curve (AUROC) of 0.9507, outperforming other state-of-the-art models, including the following traditional prediction models: Adaptive Boosting (AdaBoost; AUROC of 0.9433), Extreme Gradient Boosting (XGBoost; AUROC of 0.9331), ICD-based ISS (AUROC of 0.8699 for an inclusive model and AUROC of 0.8224 for an exclusive model), and KTAS (AUROC of 0.1841). In addition, using all features yielded a higher AUROC than any partial feature set, namely, EDNN with ICD-10 features only (AUROC of 0.8964) and EDNN with all features except ICD-10 (AUROC of 0.9383).
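For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen positive case (here, an in-hospital death) receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition. The function name and the example values are hypothetical and are not taken from the study data.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died in hospital) and predicted risk scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))  # 0.75: three of the four pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking, 1.0 to perfect ranking, and a score that orders every pair in reverse yields 0.0. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for a data set of the size described here.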
Our proposed EDNN with all features outperforms other state-of-the-art models, including the traditional diagnostic code-based prediction model and triage scale.