We introduce a lightweight, power-efficient, and general-purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise dilated separable convolutions to learn representations from a large effective receptive field with fewer FLOPs and parameters. The performance of our network is evaluated on four different tasks: (1) object classification, (2) semantic segmentation, (3) object detection, and (4) language modeling. Experiments on these tasks, including image classification on ImageNet and language modeling on the Penn Treebank dataset, demonstrate the superior performance of our method over the state-of-the-art methods. Our network outperforms ESPNet by 4-5% and has 2-4x fewer FLOPs on the PASCAL VOC and Cityscapes datasets. Compared to YOLOv2 on MS-COCO object detection, ESPNetv2 delivers 4.4% higher accuracy with 6x fewer FLOPs. Our experiments show that ESPNetv2 is much more power efficient than existing state-of-the-art efficient methods including ShuffleNets and MobileNets. Our code is open-source and available at https://github.com/sacmehta/ESPNetv2.
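To make the claim about fewer parameters concrete, the following is a minimal sketch of the standard weight-count arithmetic for the convolution types named above. It is not the ESPNetv2 architecture; the function names and the channel sizes in the example are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    return c_in * k * k + c_in * c_out


def group_pointwise_params(c_in, c_out, groups):
    """Weights in a 1 x 1 convolution split into `groups` independent groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by the number of groups")
    return (c_in // groups) * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes, chosen only for illustration.
print(standard_conv_params(64, 64, 3))        # 36864
print(depthwise_separable_params(64, 64, 3))  # 4672
print(group_pointwise_params(64, 64, 4))      # 1024
print(effective_kernel_size(3, 4))            # 9
```

Dilation leaves the weight count unchanged while widening the region each kernel covers, which is how a large effective receptive field can be reached without extra parameters.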
Visual Question Answering (VQA) in its ideal form lets us study reasoning in the joint space of vision and language and serves as a proxy for the AI task of scene understanding. However, most VQA benchmarks to date are focused on questions such as simple counting, visual attributes, and object detection that do not require reasoning or knowledge beyond what is in the image. In this paper, we address the task of knowledge-based visual question answering and provide a benchmark, called OK-VQA, where the image content is not sufficient to answer the questions, encouraging methods that rely on external knowledge resources. Our new dataset includes more than 14,000 questions that require external knowledge to answer. We show that the performance of the state-of-the-art VQA models degrades drastically in this new setting. Our analysis shows that our knowledge-based VQA task is diverse, difficult, and large compared to previous knowledge-based VQA datasets. We hope that this dataset enables researchers to open up new avenues for research in this domain.
Learning is an inherently continuous phenomenon. When humans learn a new task, there is no explicit distinction between training and inference. As we learn a task, we keep learning about it while performing the task. What we learn and how we learn it vary during different stages of learning. Learning how to learn and adapt is a key property that enables us to generalize effortlessly to new settings. This is in contrast with conventional settings in machine learning where a trained model is frozen during inference. In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation. A fundamental challenge in navigation is generalization to unseen scenes. We propose a self-adaptive visual navigation method (SAVN) which learns to adapt to new environments without any explicit supervision. Our solution is a meta-reinforcement learning approach where an agent learns a self-supervised interaction loss that encourages effective navigation. Our experiments, performed in the AI2-THOR framework, show major improvements in both success rate and SPL for visual navigation in novel scenes. Our code and data are available at: https://github.com/allenai/savn.
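The abstract reports SPL without spelling it out. A minimal sketch of the commonly used definition (Success weighted by Path Length: each successful episode is weighted by the ratio of the shortest path length to the length of the path the agent actually took) is given below. The episode tuples are made-up values, and the function is not taken from the SAVN code base.

```python
def spl(episodes):
    """Success weighted by Path Length.

    `episodes` is a list of (success, shortest_path_length, agent_path_length)
    tuples; shortest_path_length is assumed to be positive.
    """
    if not episodes:
        raise ValueError("at least one episode is required")
    total = 0.0
    for success, shortest, taken in episodes:
        if shortest <= 0:
            raise ValueError("shortest path length must be positive")
        if success:
            # A detour lowers the score; an optimal path scores 1.
            total += shortest / max(taken, shortest)
    return total / len(episodes)


# Hypothetical episodes: optimal success, success with a detour, failure.
example = [(True, 10.0, 10.0), (True, 10.0, 20.0), (False, 10.0, 5.0)]
print(spl(example))  # 0.5
```

Unlike plain success rate, which would be 2/3 for these episodes, SPL also penalizes the second episode for taking a path twice as long as necessary.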
A polyacrylonitrile (PAN) coating was fabricated by the electrospinning technique for use as a new extraction phase in mechanical stir bar sorptive extraction (MSBSE). The setup, coupled with gas chromatography (GC), was proposed and applied for the extraction and determination of some low molecular weight polycyclic aromatic hydrocarbons (PAHs) in water samples. The parameters affecting the extraction process were optimized by response surface methodology (RSM) based on the central composite design (CCD). Under the optimum conditions, the limits of detection (LOD) and limits of quantification (LOQ) were 0.008–0.4 ng mL⁻¹ and 0.02–1.3 ng mL⁻¹, respectively. The relative standard deviations (RSDs) for the analytes, including acenaphthene (Ace), anthracene (Ant), naphthalene (Nap) and phenanthrene (Phe), were less than 12.7%. The relative recoveries (RRs) were above 94%. The extraction of the analytes by the MSBSE coated with PAN consumed about 256 kWh m⁻³ of electric energy. The coating can be used 200 times without significant changes in its performance because of the hydrophobicity and high mechanical strength of PAN. The proposed approach to improving SBSE for extracting PAHs from water samples is cost-effective and time-efficient because the introduced coating can be fabricated in a short time and from inexpensive raw materials.
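As an illustration of the two figures of merit reported above, the sketch below computes a relative standard deviation and a relative recovery from replicate measurements using their usual textbook definitions. The abstract does not state the exact formulas used, so these definitions are an assumption, and all numbers are hypothetical.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation: sample standard deviation over the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def relative_recovery_percent(found_spiked, found_unspiked, added):
    """Share of a spiked amount that is recovered from a real sample, in percent."""
    return 100.0 * (found_spiked - found_unspiked) / added


# Hypothetical replicate peak areas for one analyte.
replicates = [9.0, 10.0, 11.0]
print(round(rsd_percent(replicates), 2))  # 10.0

# Hypothetical concentrations in ng/mL: spiked sample, unspiked sample, amount added.
print(round(relative_recovery_percent(14.6, 5.0, 10.0), 2))  # 96.0
```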
LCNN: Lookup-Based Convolutional Neural Network Bagherinezhad, Hessam; Rastegari, Mohammad; Farhadi, Ali
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
07/2017
Conference Proceeding
Open access
Porting state-of-the-art deep learning algorithms to resource-constrained compute platforms (e.g. VR, AR, wearables) is extremely challenging. We propose a fast, compact, and accurate model for convolutional neural networks that enables efficient learning and inference. We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions by a few lookups to a dictionary that is trained to cover the space of weights in CNNs. Training LCNN involves jointly learning a dictionary and a small set of linear combinations. The size of the dictionary naturally traces a spectrum of trade-offs between efficiency and accuracy. Our experimental results on the ImageNet challenge show that LCNN can offer a 3.2x speedup while achieving 55.1% top-1 accuracy using the AlexNet architecture. Our fastest LCNN offers a 37.6x speedup over AlexNet while maintaining 44.3% top-1 accuracy. LCNN not only offers dramatic speedups at inference, but it also enables efficient training. In this paper, we show the benefits of LCNN in few-shot learning and few-iteration learning, two crucial aspects of on-device training of deep learning models.
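The idea of encoding a filter as a few lookups into a shared dictionary can be sketched as follows. This is a toy illustration under stated assumptions, not the LCNN implementation: the dictionary, lookup indices, coefficients, and function names are all hypothetical, and a single flattened input patch stands in for a convolution window.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reconstruct_weight(dictionary, indices, coeffs):
    """Build a filter as a sparse linear combination of dictionary vectors."""
    weight = [0.0] * len(dictionary[0])
    for index, coeff in zip(indices, coeffs):
        for j, value in enumerate(dictionary[index]):
            weight[j] += coeff * value
    return weight


def lookup_response(dictionary_responses, indices, coeffs):
    """Filter response computed from dictionary responses that are shared by all filters."""
    return sum(coeff * dictionary_responses[index] for index, coeff in zip(indices, coeffs))


# Hypothetical dictionary of three vectors and one filter that uses two of them.
dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
indices, coeffs = [0, 2], [0.5, -1.0]
patch = [2.0, 3.0, 4.0]

# Direct route: materialize the filter, then apply it to the input patch.
direct = dot(reconstruct_weight(dictionary, indices, coeffs), patch)

# Lookup route: apply the dictionary once, then combine a few looked-up responses.
shared = [dot(entry, patch) for entry in dictionary]
via_lookup = lookup_response(shared, indices, coeffs)

print(direct, via_lookup)  # -8.0 -8.0
```

Because the dictionary responses are computed once per input location and reused by every filter, the per-filter cost drops to the number of lookups, which is where a smaller dictionary trades accuracy for speed.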
We present images with binary codes in a way that balances discrimination and learnability of the codes. In our method, each image claims its own code in a way that maintains discrimination while being predictable from visual data. Category memberships are usually good proxies for visual similarity but should not be enforced as a hard constraint. Our method learns codes that maximize separability of categories unless there is strong visual evidence against it. Simple linear SVMs can achieve state-of-the-art results with our short codes. In fact, our method produces state-of-the-art results on Caltech256 with only 128-dimensional bit vectors and outperforms the state of the art when using longer codes. We also evaluate our method on ImageNet and show that it outperforms state-of-the-art binary code methods on this large-scale dataset. Lastly, our codes can discover a discriminative set of attributes.
Rapid industrialization and urbanization have resulted in environmental pollution and unsustainable development of cities. The concentrations of 12 potentially toxic metal(loid)s in windowsill dust samples (n = 50) were investigated from different functional areas of Qom city, which has the highest level of urbanization in Iran. Spatial analyses (ArcGIS 10.3) and multivariate statistics including Principal Component Analysis and Spearman correlation (using STATISTICA-V.12) were adopted to scrutinize the possible sources of pollution. The windowsill dust was very highly enriched with Sb (50 mg/kg) and Pb (1686 mg/kg). The modified degree of contamination (mC) and the pollution load indices (PLI) indicate that windowsill dust in all functional areas was polluted in the order of industrial > commercial > residential > green space. Arsenic, Cd, Mo, Pb, Sb, Cu, and Zn were sourced from a mixture of traffic and industrial activities, while Mn in the dust mainly stemmed from mining activities. The non-carcinogenic health risk (HI) indicated chronic exposure to Pb for children in the industrial zone (HI = 1.73). The estimations suggest a possible carcinogenic risk from As, Pb, and Cr in the dust. The findings of this study reveal poor environmental management of the city. Emergency plans should be developed to minimize the health risks of dust to residents.
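For readers unfamiliar with the pollution load index, the sketch below uses its common definition as the geometric mean of per-element contamination factors (measured concentration divided by a background concentration). The abstract does not give the formula or the background values used in the study, so both the definition applied here and every number in the example are assumptions.

```python
import math


def contamination_factor(concentration, background):
    """Ratio of a measured concentration to its background (reference) concentration."""
    if background <= 0:
        raise ValueError("background concentration must be positive")
    return concentration / background


def pollution_load_index(contamination_factors):
    """Geometric mean of the contamination factors of all elements considered."""
    if not contamination_factors:
        raise ValueError("at least one contamination factor is required")
    if any(cf <= 0 for cf in contamination_factors):
        raise ValueError("contamination factors must be positive")
    log_sum = sum(math.log(cf) for cf in contamination_factors)
    return math.exp(log_sum / len(contamination_factors))


# Hypothetical (concentration, background) pairs in mg/kg for three elements.
samples = [(20.0, 20.0), (80.0, 20.0), (160.0, 10.0)]
factors = [contamination_factor(c, b) for c, b in samples]
print(factors)                                   # [1.0, 4.0, 16.0]
print(round(pollution_load_index(factors), 6))   # 4.0
```

Under this definition a value above 1 is conventionally read as polluted relative to background, which is the sense in which functional areas can be ranked against each other.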
A polyacrylonitrile (PAN)/polydimethylsiloxane (PDMS) coating was electrospun on the shaft of the mechanical stir bar sorptive extraction (MSBSE) setup. MSBSE equipped with the PAN/PDMS coating and coupled with gas chromatography/flame ionization detector (MSBSE-PAN/PDMS-GC/FID) was used to determine some polycyclic aromatic hydrocarbons (PAHs) in non-alcoholic beer samples. The parameters affecting the extraction, including extraction time, desorption time, stirring speed and sample volume, were optimized by response surface methodology (RSM) based on the central composite design (CCD). The optimum conditions were an extraction time of 60 min, a desorption time of 10 min, a stirring speed of 1300 rpm and a sample volume of 16 mL. The calibration curves of PAHs were linear in the range 5–1000 ng mL⁻¹. The limits of detection (LODs) were in the range of 0.009–0.5 ng mL⁻¹ and the limits of quantification (LOQs) varied between 0.03 and 1.5 ng mL⁻¹. The relative recoveries and enrichment factors were in the range of 94–98% and 117–126, respectively, and the relative standard deviations (RSDs) were below 11.4%.
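The structure of a central composite design for the four factors named above can be sketched in coded units as follows. This assumes a full-factorial core with a rotatable axial distance; the abstract does not report the axial distance or the number of center points, so both are hypothetical here, as is the function name.

```python
from itertools import product


def central_composite_design(k, n_center=1, alpha=None):
    """Return CCD runs in coded units: factorial, then axial, then center points."""
    if alpha is None:
        # Rotatable design: alpha is the fourth root of the number of factorial runs.
        alpha = (2 ** k) ** 0.25
    factorial = [list(levels) for levels in product((-1.0, 1.0), repeat=k)]
    axial = []
    for factor in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[factor] = sign * alpha
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


# Four factors: extraction time, desorption time, stirring speed, sample volume.
# Six center points is an assumed value, used only for illustration.
design = central_composite_design(4, n_center=6)
print(len(design))  # 30 runs: 16 factorial + 8 axial + 6 center
print(design[16])   # [-2.0, 0.0, 0.0, 0.0], the first axial point
```

Each coded level would then be mapped to a physical setting (for example minutes or rpm) before the runs are carried out and a second-order response surface is fitted.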