Macular edema (ME) is a retinal condition that affects a patient's central vision. ME results from the accumulation of fluid in the macular region, which causes the macula to swell. Optical coherence tomography (OCT) and fundus photography are the two most widely used retinal examination techniques for detecting ME. Many researchers have utilized retinal fundus and OCT imaging for detecting ME. However, to the best of our knowledge, no work in the literature fuses the findings from both retinal imaging modalities for a more effective and reliable diagnosis of ME. In this paper, we propose an automated framework for classifying ME and healthy eyes using retinal fundus and OCT scans. The proposed framework is based on deep ensemble learning: the input fundus and OCT scans are recognized through a deep convolutional neural network (CNN) and processed accordingly. The processed scans are then passed to the second layer of the deep CNN model, which extracts the required feature descriptors from both images. The extracted descriptors are concatenated and passed to a supervised hybrid classifier built as an ensemble of artificial neural networks, support vector machines, and naïve Bayes. The proposed framework has been trained on 73,791 retinal scans and validated on 5100 scans from the publicly available Zhang and Rabbani datasets. It achieved an accuracy of 94.33% for distinguishing ME from healthy subjects, and mean dice coefficients of 0.9019 ± 0.04 for extracting retinal fluids, 0.7069 ± 0.11 for extracting hard exudates, and 0.8203 ± 0.03 for extracting retinal blood vessels, against clinical markings.
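The fusion step described above can be sketched as follows: concatenate the per-modality CNN descriptors, then combine the base classifiers' outputs by soft voting. This is a minimal illustration, not the paper's implementation; the feature dimensions and the probability values are assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical CNN feature descriptors for one subject (dimensions assumed).
fundus_feat = rng.standard_normal(256)   # from the fundus branch
oct_feat = rng.standard_normal(256)      # from the OCT branch

# Fusion step: concatenate the two modality descriptors into one vector.
fused = np.concatenate([fundus_feat, oct_feat])

# Stand-in probabilistic outputs of the three base classifiers
# (ANN, SVM, naive Bayes) for the classes [healthy, ME].
probs = np.array([
    [0.30, 0.70],   # ANN
    [0.45, 0.55],   # SVM
    [0.20, 0.80],   # naive Bayes
])

# Soft-voting ensemble: average the class probabilities, then argmax.
mean_probs = probs.mean(axis=0)
prediction = int(mean_probs.argmax())   # 1 -> ME for these toy values
```

Soft voting is one common way to realize such a hybrid ensemble; majority (hard) voting over per-classifier labels is an equally plausible reading of the abstract.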
Tomatoes are a major crop worldwide, and accurately classifying their maturity is important for many agricultural applications, such as harvesting, grading, and quality control. In this paper, the authors propose a novel method for tomato maturity classification using a convolutional transformer, a hybrid architecture that combines the strengths of convolutional neural networks (CNNs) and transformers. Additionally, this study introduces a new tomato dataset named KUTomaData, explicitly designed for training deep-learning models for tomato segmentation and classification. KUTomaData is a compilation of images sourced from a greenhouse in the UAE, with approximately 700 images available for training and testing. The dataset was captured under various lighting conditions and viewing perspectives and with different mobile camera sensors, distinguishing it from existing datasets. The contributions of this paper are threefold: first, the authors propose a novel method for tomato maturity classification using a modular convolutional transformer. Second, the authors introduce a new tomato image dataset containing images of tomatoes at different maturity levels. Last, the authors show that the convolutional transformer outperforms state-of-the-art methods for tomato maturity classification. The effectiveness of the proposed framework in handling cluttered and occluded tomato instances was evaluated using two additional public datasets, Laboro Tomato and Rob2Pheno Annotated Tomato, as benchmarks. The evaluation results across these three datasets demonstrate the exceptional performance of the proposed framework, surpassing the state of the art by 58.14%, 65.42%, and 66.39% in mean average precision on KUTomaData, Laboro Tomato, and Rob2Pheno Annotated Tomato, respectively. This work can potentially improve the efficiency and accuracy of tomato harvesting, grading, and quality control processes.
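The convolutional-transformer idea can be sketched at a toy scale: a convolutional-style patch embedding produces tokens, a self-attention layer mixes them, and a linear head classifies maturity. This is a minimal numpy illustration under assumed shapes and random weights, not the authors' architecture; the three maturity classes are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def self_attention(tokens, d_k):
    """Single-head scaled dot-product attention (the transformer half)."""
    Wq = rng.standard_normal((tokens.shape[1], d_k))
    Wk = rng.standard_normal((tokens.shape[1], d_k))
    Wv = rng.standard_normal((tokens.shape[1], d_k))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Toy 32x32 RGB tomato crop (random values stand in for pixels).
image = rng.standard_normal((32, 32, 3))

# "Convolutional" half: non-overlapping 8x8 patch embedding, the same
# operation a strided convolution performs in hybrid CNN/transformer stems.
patches = image.reshape(4, 8, 4, 8, 3).transpose(0, 2, 1, 3, 4).reshape(16, -1)
embed = patches @ rng.standard_normal((patches.shape[1], 64))  # 16 tokens, dim 64

# Transformer half: attention over the token sequence, then mean pooling.
attended = self_attention(embed, d_k=64)
pooled = attended.mean(axis=0)

# Linear head over hypothetical maturity classes (green/breaker/ripe).
logits = pooled @ rng.standard_normal((64, 3))
maturity = int(logits.argmax())
```

The design intuition is that the convolutional stem supplies local texture cues (color gradients on the fruit surface) while attention relates distant patches, which helps with occluded or clustered fruit.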
This paper presents a deep learning-driven portable, accurate, low-cost, and easy-to-use device that performs Reverse-Transcription Loop-Mediated Isothermal Amplification (RT-LAMP) to facilitate rapid detection of COVID-19. The 3D-printed device, powered using only a 5 V AC-DC adapter, can perform 16 simultaneous RT-LAMP reactions and can be used multiple times. Moreover, the experimental protocol is devised to obviate the need for separate, expensive RNA-extraction equipment and to eliminate sample evaporation. The entire process, from sample preparation to the qualitative assessment of LAMP amplification, takes only 45 min (10 min for pre-heating and 35 min for the RT-LAMP reactions). On completion of the amplification reaction, a pH indicator dye turns fuchsia for negative samples and either yellow or orange for positive samples. The device is coupled with a novel deep learning system that automatically analyzes the amplification results, attending to the pH indicator dye to screen COVID-19 subjects. The proposed device has been rigorously tested on 250 RT-LAMP clinical samples, where it achieved an overall specificity and sensitivity of 0.9666 and 0.9722, respectively, with a recall of 0.9892 for Ct < 30. The proposed system can thus be widely used as an accurate, sensitive, rapid, and portable tool to detect COVID-19 in settings where access to a lab is difficult or results are urgently required.
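The screening metrics reported above are defined on the binary confusion matrix. A short worked example, using purely illustrative counts (not the paper's clinical results):

```python
# Hypothetical confusion-matrix counts for a binary COVID-19 screen;
# the numbers are illustrative, not the paper's clinical results.
tp, fn = 105, 3    # positive samples: detected / missed
tn, fp = 138, 4    # negative samples: correctly cleared / false alarms

sensitivity = tp / (tp + fn)   # a.k.a. recall: fraction of positives caught
specificity = tn / (tn + fp)   # fraction of negatives correctly identified

# For a screening tool, a missed positive (fn) is usually costlier than a
# false alarm (fp), so sensitivity is the headline number.
```

With these counts, sensitivity ≈ 0.972 and specificity ≈ 0.972, illustrating how the paper's figures are derived from raw sample outcomes.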
Screening baggage against potential threats has become one of the prime aviation security concerns worldwide, and manual detection of prohibited items is a time-consuming and tedious process. Many researchers have developed autonomous systems to recognize baggage threats using security X-ray scans. However, all of these frameworks struggle to screen cluttered and concealed contraband items. Furthermore, to the best of our knowledge, no framework can recognize baggage threats across multiple scanner specifications without an explicit retraining process. To overcome this, we present a novel meta-transfer learning-driven tensor-shot detector that decomposes the candidate scan into dual-energy tensors and employs a meta-one-shot classification backbone to recognize and localize cluttered baggage threats. In addition, the proposed detection framework generalizes well to multiple scanner specifications because it generates object proposals from the unified tensor maps rather than from diversified raw scans. We have rigorously evaluated the proposed tensor-shot detector on the publicly available SIXray and GDXray datasets (containing a combined 1,067,381 grayscale and colored baggage X-ray scans). On the SIXray dataset, the proposed framework achieved a mean average precision (mAP) of 0.6457, and on the GDXray dataset it achieved a precision of 0.9441 and an F1 score of 0.9598. Furthermore, it outperforms state-of-the-art frameworks by 8.03% in mAP on SIXray and by 1.49% in precision and 0.573% in F1 on GDXray.
Automated systems designed to screen contraband items in X-ray imagery still face difficulties with high clutter, concealment, and extreme occlusion. In this paper, we address this challenge with a novel multi-scale contour instance segmentation framework that effectively identifies cluttered contraband items within baggage X-ray scans. Unlike standard models that employ region-based or keypoint-based techniques to generate multiple boxes around objects, we propose deriving proposals according to the hierarchy of regions defined by contours. The proposed framework is rigorously validated on three public datasets, namely GDXray, SIXray, and OPIXray, where it outperforms state-of-the-art methods with mean average precision scores of 0.9779, 0.9614, and 0.8396, respectively. Furthermore, to the best of our knowledge, this is the first contour instance segmentation framework that leverages multi-scale information to recognize cluttered and concealed contraband items in colored and grayscale security X-ray imagery.
In anti-vascular endothelial growth factor (anti-VEGF) therapy, an accurate estimation of multi-class retinal fluid (MRF) is required for prescribing treatment activity and intravitreal dosing. This study proposes an end-to-end deep learning-based retinal fluids segmentation network (RFS-Net) to segment and recognize three MRF lesion manifestations, namely intraretinal fluid (IRF), subretinal fluid (SRF), and pigment epithelial detachment (PED), from multi-vendor optical coherence tomography (OCT) imagery. The proposed image analysis tool will optimize anti-VEGF therapy and help reduce inter- and intra-observer variability.
The proposed RFS-Net architecture integrates atrous spatial pyramid pooling (ASPP), residual, and inception modules in the encoder path to learn better features and conserve more global information for precise segmentation and characterization of MRF lesions. The RFS-Net model is trained and validated on OCT scans from multiple vendors (Topcon, Cirrus, Spectralis), collected from three publicly available datasets. The first dataset, consisting of OCT volumes from 112 subjects (11,334 B-scans in total), is used for both training and evaluation. The remaining two datasets, containing a total of 1572 OCT B-scans from 1255 subjects, are used only for evaluation, to check the trained RFS-Net's generalizability to unseen OCT scans. The performance of the proposed RFS-Net model is assessed through various evaluation metrics.
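The ASPP idea mentioned above rests on atrous (dilated) convolution: spacing the kernel taps apart enlarges the receptive field without adding parameters, and running several dilation rates in parallel captures lesions of different sizes. A minimal 1-D sketch, not the RFS-Net implementation; the signal, kernel, and dilation rates are assumed for illustration:

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation):
    """1-D dilated (atrous) convolution, the building block of ASPP:
    kernel taps are spaced `dilation` samples apart."""
    span = dilation * (len(kernel) - 1)
    out = np.zeros(len(signal) - span)
    for i in range(len(out)):
        taps = signal[i:i + span + 1:dilation]   # every `dilation`-th sample
        out[i] = float(np.dot(taps, kernel))
    return out

# Toy intensity profile standing in for one OCT A-scan column.
profile = np.arange(10, dtype=float)
kernel = np.array([1.0, 0.0, -1.0])   # simple edge-detecting kernel

# ASPP-style multi-rate bank: the same kernel at several dilations in
# parallel; real ASPP concatenates the resulting feature maps.
multi_scale = [dilated_conv1d(profile, kernel, d) for d in (1, 2, 4)]
```

On this linear ramp, each rate measures the intensity difference across a wider gap (2, 4, then 8 samples), which is exactly the multi-scale context that helps distinguish small IRF pockets from broad PED regions.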
The proposed RFS-Net model achieved mean F1 scores of 0.762, 0.796, and 0.805 for segmenting IRF, SRF, and PED, respectively. Moreover, by automating the segmentation of the three retinal manifestations, RFS-Net brings a considerable gain in efficiency over the tedious and demanding manual segmentation of MRF.
Our proposed RFS-Net is a potential diagnostic tool for the automatic segmentation of MRF (IRF, SRF, and PED) lesions. It is expected to strengthen inter-observer agreement and, as a result, to standardize dosimetry.
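The per-class scores above are overlap metrics on binary masks; for segmentation masks, the F1 score coincides with the Dice coefficient. A minimal sketch with illustrative toy masks (not data from the study):

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks; 1.0 means perfect agreement.
    On masks, Dice equals the F1 score (2*TP / (2*TP + FP + FN))."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 4x4 masks for one fluid class (e.g., IRF); values are illustrative.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])

score = dice_coefficient(pred, truth)   # 2*3 / (4+3) = 6/7
```

In practice such a score is computed per B-scan and per class (IRF, SRF, PED), then averaged, which is how mean F1 figures like those reported above arise.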
• A deep learning-based retinal fluids segmentation network (RFS-Net) is proposed.
• It segments multi-class retinal fluid (MRF) lesions (IRF, SRF, and PED) in OCT scans.
• RFS-Net integrates a comprehensive pre-processing mechanism to boost performance.
• Using recurring ASPP modules, RFS-Net learns inherent multi-scale MRF features.
• RFS-Net outperforms state-of-the-art schemes in segmenting MRF lesions in OCT.
Human beings tend to learn incrementally from a rapidly changing environment without compromising or forgetting already learned representations. Although deep learning has the potential to mimic such human behavior to some extent, it suffers from catastrophic forgetting: its performance on already learned tasks drastically decreases while it learns new knowledge. Many researchers have proposed promising solutions to eliminate such catastrophic forgetting during the knowledge distillation process. However, to the best of our knowledge, no literature to date exploits the complex relationships between these solutions and utilizes them for effective learning that spans multiple datasets and even multiple domains. In this paper, we propose a continual learning objective that encompasses a mutual distillation loss to capture these complex relationships and allows deep learning models to effectively retain prior knowledge while adapting to new classes, new datasets, and even new applications. The proposed objective was rigorously tested on nine publicly available, multi-vendor, and multimodal datasets spanning three applications, achieving a top-1 accuracy of 0.9863 and an F1-score of 0.9930.
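A mutual distillation term of the kind mentioned above is commonly built from the KL divergence between two models' temperature-softened outputs, applied symmetrically so each model learns from the other. This is a generic sketch of that pattern, not the paper's exact objective; the logits and the temperature are assumed for illustration:

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = np.asarray(z, dtype=float) / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

# Hypothetical logits for one sample from two peer models.
logits_a = np.array([2.0, 0.5, -1.0])
logits_b = np.array([1.5, 0.8, -0.5])

T = 2.0  # temperature softens the distributions before matching
p = softmax(logits_a, T)
q = softmax(logits_b, T)

# Mutual (symmetric) distillation term: each model matches the other,
# so knowledge flows in both directions rather than teacher-to-student only.
mutual_loss = 0.5 * (kl(p, q) + kl(q, p))
```

In a continual-learning objective, a term like this is typically added to the task loss so that the model adapting to new classes stays close to the frozen snapshot holding the old knowledge.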
Currently, most grasping systems are designed to grasp static objects only, and grasping dynamic objects has received less attention in the literature. In traditional manipulation schemes, achieving dynamic grasping requires either a highly precise dynamic model or sophisticated predefined grasping states and gestures, both of which are hard to obtain and tedious to design. In this paper, we develop a novel reinforcement learning (RL)-based dynamic grasping framework with a trajectory prediction module to address these issues. In particular, we divide dynamic grasping into two parts: RL-based grasping strategy learning and trajectory prediction. In simulation, an RL agent is trained to grasp a static object. When this well-trained agent is transferred to the real world, its observation is augmented with predictions from an LSTM-based trajectory prediction module. We validated the proposed method on an experimental setup involving a Baxter manipulator with two-finger grippers and an object placed on a moving car. We also evaluated how well the RL agent performs both with and without the trajectory prediction module. Experimental results demonstrate that our method can grasp objects on different trajectories at various speeds.
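The observation-augmentation step can be sketched concretely. Here a constant-velocity extrapolator stands in for the paper's learned LSTM predictor, and the proprioceptive state, track values, and prediction horizon are all assumed for illustration:

```python
import numpy as np

def predict_next(history, horizon=3):
    """Stand-in for the LSTM predictor: extrapolate the object's track
    with a constant-velocity model (the paper uses a learned LSTM)."""
    velocity = history[-1] - history[-2]
    return np.array([history[-1] + (k + 1) * velocity for k in range(horizon)])

# Observed 2-D positions of the object on the moving car (illustrative).
track = np.array([[0.00, 0.50],
                  [0.05, 0.50],
                  [0.10, 0.50]])

future = predict_next(track)                 # forecast of the next 3 positions
gripper_state = np.array([0.3, 0.4, 0.2])    # hypothetical proprioception

# Augmented observation fed to the RL policy: current state + forecast.
# The policy trained on static objects can then aim at where the object
# will be, rather than where it is.
observation = np.concatenate([gripper_state, future.ravel()])
```

The design point is that the policy itself is unchanged between simulation and the real world; only its observation vector grows to include the forecast, which is what lets a statically trained agent handle a moving target.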
Internet-of-Things (IoT)-based sensor networks have gained unprecedented popularity in recent years and have become crucial for supporting high-data-rate, real-time applications. For efficient data transmission within IoT networks, each IoT node must learn and adapt to the recent time/spectral characteristics of channels to maximize throughput and perform channel swapping wherever required. Many researchers have proposed channel allocation and channel quality measurement protocols for multichannel sensor networks. However, to the best of our knowledge, no literature proposes an automated and adaptive protocol that learns and adapts to changing channel characteristics in an IoT network to achieve maximum data transmission and throughput. Therefore, this paper proposes a fully automated, self-learning, and adaptive protocol that transmits multiuser data by efficiently exploiting channel time/spectral characteristics. The proposed protocol is unique in that it learns and adapts to increasing network density based on network metrics. It also allows each node in the IoT network to automatically detect neighboring channel attributes so that nodes can swap channels to achieve maximum data transfer. This is accomplished by continuously extracting distinct features from the network topology. From these features, the proposed protocol selects the best channel for an incoming node, provides the best channel utilization based on its time/spectral attributes, and detects and allocates the unused spectrum of neighboring channels through a classification model combining multistage Gaussian radial basis functions and multilayer perceptron-based nonlinear support vector machines. Simulation results demonstrate the superiority of the proposed protocol in terms of throughput, successful reporting probability, average blocking probability, fairness, and classification accuracy.
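The Gaussian radial basis stage ahead of the classifier can be sketched as follows: each channel's feature vector is re-expressed by its similarity to a set of prototype channel states, and the expanded representation then drives channel selection. This is a simplified illustration, not the protocol's implementation; the per-channel features, prototypes, and gamma value are all assumed:

```python
import numpy as np

def rbf_features(x, centers, gamma=1.0):
    """Gaussian radial basis expansion: each row of x is mapped to its
    similarity (exp of negative squared distance) to every center."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

# Toy per-channel descriptors: [occupancy, normalized SNR] (illustrative).
channels = np.array([[0.1, 0.90],
                     [0.8, 0.30],
                     [0.2, 0.85]])
prototypes = np.array([[0.0, 1.0],   # "idle, clean" channel state
                       [1.0, 0.0]])  # "busy, noisy" channel state

phi = rbf_features(channels, prototypes, gamma=2.0)

# Pick the best channel for an incoming node: the one whose descriptor
# is most similar to the "idle, clean" prototype.
best = int(phi[:, 0].argmax())
```

In the full protocol this expansion would feed the nonlinear SVM/MLP classification stage; here the argmax over the first prototype column serves as a stand-in decision rule to show the data flow.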