• To the best of our knowledge, Mask-CNN is the first end-to-end model that selects deep convolutional descriptors for object recognition, especially for fine-grained image recognition.
• We present a novel and efficient part-based three-stream model for fine-grained recognition. By discarding the fully connected layers, the proposed M-CNN is computationally efficient (cf. Table 1 and Table 4 in experiments). Additionally, compared with state-of-the-art methods, M-CNN has a smaller feature dimensionality. Beyond that, it achieves the highest classification accuracy on CUB200-2011 and Birdsnap among published methods.
• The part localization performance of the proposed model exceeds that of other part-based fine-grained approaches that require additional bounding boxes. In particular, M-CNN is 12.76% higher than the state of the art for head localization on CUB200-2011.
Fine-grained image recognition is a challenging computer vision problem, due to the small inter-class variations caused by highly similar subordinate categories, and the large intra-class variations in poses, scales and rotations. In this paper, we show that selecting useful deep descriptors contributes substantially to fine-grained image recognition. Specifically, a novel Mask-CNN model without the fully connected layers is proposed. Based on the part annotations, the proposed model consists of a fully convolutional network that both locates the discriminative parts (e.g., head and torso) and, more importantly, generates weighted object/part masks for selecting useful and meaningful convolutional descriptors. After that, a three-stream Mask-CNN model is built for aggregating the selected object- and part-level descriptors simultaneously. Thanks to discarding the parameter-redundant fully connected layers, our Mask-CNN has a small feature dimensionality and efficient inference speed compared with other fine-grained approaches. Furthermore, we obtain a new state-of-the-art accuracy on two challenging fine-grained bird species categorization datasets, which validates the effectiveness of both the descriptor selection scheme and the proposed Mask-CNN model.
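As a rough illustration of the descriptor selection idea described above, the following Python sketch keeps only the descriptors at positions where a mask is positive, weights them by the mask value, and then average- and max-pools the survivors into one vector. The function name `select_and_pool`, the toy 2x2 feature map, and the choice to concatenate average and max pooling are assumptions made for illustration; this is not the authors' implementation.

```python
from typing import List

Descriptor = List[float]


def select_and_pool(feature_map: List[List[Descriptor]],
                    mask: List[List[float]]) -> List[float]:
    """Pool only the descriptors that a mask selects.

    feature_map: h x w grid of d-dimensional descriptors.
    mask: h x w grid of non-negative weights; 0 means "discard".
    Returns the average-pooled vector followed by the max-pooled vector.
    """
    selected = [
        [m * value for value in descriptor]
        for feature_row, mask_row in zip(feature_map, mask)
        for descriptor, m in zip(feature_row, mask_row)
        if m > 0
    ]
    if not selected:
        raise ValueError("mask selects no descriptors")
    dim = len(selected[0])
    avg = [sum(d[i] for d in selected) / len(selected) for i in range(dim)]
    mx = [max(d[i] for d in selected) for i in range(dim)]
    return avg + mx


# Toy 2x2 feature map with 2-dimensional descriptors (hypothetical values).
toy_features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
# Hypothetical binary mask that keeps the top-left and bottom-right positions.
toy_mask = [
    [1.0, 0.0],
    [0.0, 1.0],
]

pooled = select_and_pool(toy_features, toy_mask)
print(pooled)  # [4.0, 5.0, 7.0, 8.0]
```

In a three-stream setting, the same pooling would be run once per mask (for example whole object, head, and torso) and the resulting vectors concatenated; that arrangement is likewise an assumption of this sketch.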
FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. This paper describes a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OSX, and can be directly used without any dependencies or pre-configurations. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. SeqKit is open source and available on GitHub at https://github.com/shenwei356/seqkit.
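To make two of the manipulations listed above concrete (deduplication and filtering), here is a minimal Python sketch that parses FASTA text, drops records whose sequence was already seen, and keeps only sequences of a minimum length. It is independent of SeqKit and says nothing about how SeqKit is implemented; the function names and the example records are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs, joining multi-line sequences."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedup_by_sequence(records: Iterable[Record]) -> Iterator[Record]:
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            yield header, seq


def filter_by_min_length(records: Iterable[Record], min_len: int) -> Iterator[Record]:
    """Keep records whose sequence has at least min_len characters."""
    for header, seq in records:
        if len(seq) >= min_len:
            yield header, seq


EXAMPLE = """>seq1 first
ACGT
ACGT
>seq2 duplicate of seq1
acgtacgt
>seq3 short
AC
"""


def run_example() -> List[Record]:
    records = parse_fasta(EXAMPLE.splitlines())
    return list(filter_by_min_length(dedup_by_sequence(records), 4))


print(run_example())  # [('seq1 first', 'ACGTACGT')]
```

Because each step is a generator, records stream through the pipeline one at a time, though the set of seen sequences still grows with the number of distinct sequences.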
Humans are capable of learning a new fine-grained concept with very little supervision, e.g., a few exemplary images for a species of bird, yet our best deep learning systems need hundreds or thousands of labeled examples. In this paper, we try to reduce this gap by studying the fine-grained image recognition problem in a challenging few-shot learning setting, termed few-shot fine-grained recognition (FSFG). The task of FSFG requires the learning systems to build classifiers for the novel fine-grained categories from few examples (only one or fewer than five). To solve this problem, we propose an end-to-end trainable deep network, which is inspired by the state-of-the-art fine-grained recognition model and is tailored for the FSFG task. Specifically, our network consists of a bilinear feature learning module and a classifier mapping module: while the former encodes the discriminative information of an exemplar image into a feature vector, the latter maps the intermediate feature into the decision boundary of the novel category. The key novelty of our model is a "piecewise mappings" function in the classifier mapping module, which generates the decision boundary via learning a set of more attainable sub-classifiers in a more parameter-economic way. We learn the exemplar-to-classifier mapping based on an auxiliary dataset in a meta-learning fashion, which is expected to be able to generalize to novel categories. By conducting comprehensive experiments on three fine-grained datasets, we demonstrate that the proposed method achieves superior performance over the competing baselines.
With the rapid development of artificial intelligence, the simulation of the human brain for neuromorphic computing has demonstrated unprecedented progress. Photonic artificial synapses are strongly desirable owing to their higher neuron selectivity, lower crosstalk, wavelength multiplexing capabilities, and low operating power compared to their electric counterparts. This study demonstrates a highly transparent and flexible artificial synapse with a two‐terminal architecture that emulates photonic synaptic functionalities. This optically triggered artificial synapse exhibits clear synaptic characteristics such as paired‐pulse facilitation, short/long‐term memory, and synaptic behavior analogous to that of the iris in the human eye. Ultraviolet light illumination‐induced neuromorphic characteristics exhibited by the synapse are attributed to carrier trapping and detrapping in the SnO2 nanoparticles and CsPbCl3 perovskite interface. Moreover, the ability to detect deep red light without changes in synaptic behavior indicates the potential for dual‐mode operation. This study establishes a novel two‐terminal architecture for a highly transparent and flexible photonic artificial synapse that can help facilitate higher integration density of transparent 3D stacking memristors, and make it possible to approach optical learning, memory, computing, and visual recognition.
An inorganic CsPbCl3 perovskite artificial photonic synapse is demonstrated for the first time. This work shows the promising potential of multilevel storage capacity devices that can emulate synaptic functionalities via tuning of light intensity and frequency. The two‐terminal architecture synapse device exhibits the potential of dual‐mode operation, high transparency, and flexibility, which enable optical learning, memory, computing, and visual recognition.
Deep Q-Network (DQN), as one type of deep reinforcement learning model, aims to train an intelligent agent that acquires optimal actions while interacting with an environment. The model is well known for its ability to surpass professional human players across many Atari 2600 games. Despite the superhuman performance, in-depth understanding of the model and interpreting the sophisticated behaviors of the DQN agent remain challenging tasks, due to the long model training process and the large number of experiences dynamically generated by the agent. In this work, we propose DQNViz, a visual analytics system to expose details of the blind training process in four levels, and enable users to dive into the large experience space of the agent for comprehensive analysis. As an initial attempt in visualizing DQN models, our work focuses more on Atari games with a simple action space, most notably the Breakout game. From our visual analytics of the agent's experiences, we extract useful action/reward patterns that help to interpret the model and control the training. Through multiple case studies conducted together with deep learning experts, we demonstrate that DQNViz can effectively help domain experts to understand, diagnose, and potentially improve DQN models.
Scalable Algorithms for Multi-Instance Learning
Wei, Xiu-Shen; Wu, Jianxin; Zhou, Zhi-Hua
IEEE Transactions on Neural Networks and Learning Systems, 04/2017, Volume 28, Issue 4
Journal Article
Multi-instance learning (MIL) has been widely applied to diverse applications involving complicated data objects, such as images and genes. However, most existing MIL algorithms can only handle small- or moderate-sized data. In order to deal with large-scale MIL problems, we propose MIL based on the vector of locally aggregated descriptors representation (miVLAD) and MIL based on the Fisher vector representation (miFV), two efficient and scalable MIL algorithms. They map the original MIL bags into new vector representations using their corresponding mapping functions. The new feature representations keep essential bag-level information, and at the same time lead to excellent MIL performance even when linear classifiers are used. Thanks to the low computational cost in the mapping step and the scalability of linear classifiers, miVLAD and miFV can handle large-scale MIL data efficiently and effectively. Experiments show that miVLAD and miFV not only achieve accuracy rates comparable with those of state-of-the-art MIL algorithms, but are also hundreds of times faster. Moreover, we can regard the new miVLAD and miFV representations as multiview data, which improves the accuracy rates in most cases. In addition, our algorithms perform well even when they are used without parameter tuning (i.e., adopting the default parameters), which is convenient for practical MIL applications.
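The bag-to-vector mapping can be illustrated with a small sketch of generic VLAD encoding: each instance in a bag is assigned to its nearest codebook center, the residuals to that center are accumulated, and the per-center sums are concatenated into one fixed-length vector. The codebook below is a hypothetical hand-picked one (in practice it would typically be learned, for example by k-means), and the signed square root and L2 normalization are common VLAD post-processing steps assumed here rather than taken from the abstract.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_center(x: Vector, centers: Sequence[Vector]) -> int:
    """Index of the center with the smallest squared Euclidean distance to x."""
    return min(
        range(len(centers)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centers[k])),
    )


def vlad_encode(bag: Sequence[Vector], centers: Sequence[Vector]) -> List[float]:
    """Map a bag of instances to one vector of length len(centers) * dim."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for x in bag:
        k = nearest_center(x, centers)
        for d in range(dim):
            residuals[k][d] += x[d] - centers[k][d]
    flat = [v for row in residuals for v in row]
    # Assumed post-processing: signed square root, then L2 normalization.
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Hypothetical codebook with two centers in a 2-dimensional instance space.
codebook = [[0.0, 0.0], [10.0, 10.0]]
# Two bags of different sizes map to vectors of the same length.
bag_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
bag_b = [[1.0, 1.0]]

vec_a = vlad_encode(bag_a, codebook)
vec_b = vlad_encode(bag_b, codebook)
print(len(vec_a), len(vec_b))  # 4 4
```

Since every bag becomes a vector of the same length regardless of how many instances it holds, an ordinary linear classifier can then be trained on the encoded bags.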
As an indispensable component of recombinant fusion proteins, linkers have shown increasing importance in the construction of stable, bioactive fusion proteins. This review covers the current knowledge of fusion protein linkers and summarizes examples for their design and application. The general properties of linkers derived from naturally-occurring multi-domain proteins can be considered as the foundation in linker design. Empirical linkers designed by researchers are generally classified into 3 categories according to their structures: flexible linkers, rigid linkers, and in vivo cleavable linkers. Besides the basic role in linking the functional domains together (as in flexible and rigid linkers) or releasing the free functional domain in vivo (as in in vivo cleavable linkers), linkers may offer many other advantages for the production of fusion proteins, such as improving biological activity, increasing expression yield, and achieving desirable pharmacokinetic profiles.
The large number of potential applications from bridging web data with knowledge bases has led to an increase in entity linking research. Entity linking is the task of linking entity mentions in text with their corresponding entities in a knowledge base. Potential applications include information extraction, information retrieval, and knowledge base population. However, this task is challenging due to name variations and entity ambiguity. In this survey, we present a thorough overview and analysis of the main approaches to entity linking, and discuss various applications, the evaluation of entity linking systems, and future directions.
Deep Neural Networks (DNNs) have been extensively used in multiple disciplines due to their superior performance. However, in most cases, DNNs are considered as black-boxes and the interpretation of their internal working mechanism is usually challenging. Given that model trust is often built on the understanding of how a model works, the interpretation of DNNs becomes more important, especially in safety-critical applications (e.g., medical diagnosis, autonomous driving). In this paper, we propose DeepVID, a Deep learning approach to Visually Interpret and Diagnose DNN models, especially image classifiers. In detail, we train a small locally-faithful model to mimic the behavior of an original cumbersome DNN around a particular data instance of interest, and the local model is sufficiently simple such that it can be visually interpreted (e.g., a linear model). Knowledge distillation is used to transfer the knowledge from the cumbersome DNN to the small model, and a deep generative model (i.e., variational auto-encoder) is used to generate neighbors around the instance of interest. Those neighbors, which come with small feature variances and semantic meanings, can effectively probe the DNN's behaviors around the instance of interest and help the small model to learn those behaviors. Through comprehensive evaluations, as well as case studies conducted together with deep learning experts, we validate the effectiveness of DeepVID.
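The core idea of a locally faithful surrogate can be sketched in a few lines of Python. This sketch deliberately swaps in simpler pieces than the abstract describes: neighbors are drawn with Gaussian noise rather than from a variational auto-encoder, the small model is fit by ordinary least squares rather than by knowledge distillation, and `black_box` is a hypothetical stand-in for a trained DNN. All names and parameter values are assumptions for illustration.

```python
import math
import random
from typing import List


def black_box(x: List[float]) -> float:
    """Hypothetical stand-in for a complex model's probability output."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1])))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_local_linear(x: List[float], n: int = 200, sigma: float = 0.05,
                     seed: int = 0) -> List[float]:
    """Fit a linear surrogate to black_box around x.

    Returns one weight per feature followed by the intercept.
    """
    rng = random.Random(seed)
    # Assumption: Gaussian perturbations replace the generative neighbor model.
    neighbors = [[xi + rng.gauss(0.0, sigma) for xi in x] for _ in range(n)]
    rows = [[p[i] - x[i] for i in range(len(x))] + [1.0] for p in neighbors]
    targets = [black_box(p) for p in neighbors]
    d = len(rows[0])
    # Normal equations of least squares: (A^T A) w = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(d)]
    return solve(ata, atb)


weights = fit_local_linear([0.0, 0.0])
print([round(w, 2) for w in weights])
```

Around the origin, the stand-in model has gradient (0.75, -0.5) and value 0.5, so the fitted feature weights and intercept should land close to those numbers; the sign and size of each weight are what a user would read off as the local explanation.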
Generative models hold promise for learning data representations in an unsupervised fashion with deep learning. Generative Adversarial Nets (GAN) is one of the most popular frameworks in this arena. Despite the promising results from different types of GANs, in-depth understanding of the adversarial training process of the models remains a challenge to domain experts. The complexity and the potentially long training process of the models make it hard to evaluate, interpret, and optimize them. In this work, guided by practical needs from domain experts, we design and develop a visual analytics system, GANViz, aiming to help experts understand the adversarial process of GANs in-depth. Specifically, GANViz evaluates the model performance of two subnetworks of GANs, provides evidence and interpretations of the models' performance, and empowers comparative analysis with the evidence. Through our case studies with two real-world datasets, we demonstrate that GANViz can provide useful insight into helping domain experts understand, interpret, evaluate, and potentially improve GAN models.