The Flickr30k dataset has become a standard benchmark for sentence-based image description. This paper presents Flickr30k Entities, which augments the 158k captions from Flickr30k with 244k coreference chains, linking mentions of the same entities across different captions for the same image, and associating them with 276k manually annotated bounding boxes. Such annotations are essential for continued progress in automatic image description and grounded language understanding. They enable us to define a new benchmark for localization of textual entity mentions in an image. We present a strong baseline for this task that combines an image-text embedding, detectors for common objects, a color classifier, and a bias towards selecting larger objects. While our baseline rivals more complex state-of-the-art models in accuracy, we show that its gains cannot be easily parlayed into improvements on tasks such as image-sentence retrieval, thus underlining the limitations of current methods and the need for further research.
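The cue combination in such a baseline can be sketched as a simple linear fusion over candidate boxes. Everything below is illustrative: the weights, per-box scores, and boxes are invented for the example, not the paper's tuned components.

```python
import numpy as np

def rank_boxes(boxes, emb_score, det_score, color_score,
               w=(1.0, 0.5, 0.3, 0.2)):
    """Linear fusion of per-box cues for grounding a textual mention:
    image-text embedding similarity, object-detector score,
    color-classifier score, and a bias toward larger boxes.
    Weights `w` are illustrative, not tuned values."""
    boxes = np.asarray(boxes, float)            # (x1, y1, x2, y2) per box
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    size_bias = area / area.max()               # larger boxes score higher
    score = (w[0] * np.asarray(emb_score)
             + w[1] * np.asarray(det_score)
             + w[2] * np.asarray(color_score)
             + w[3] * size_bias)
    return int(np.argmax(score)), score

# Three hypothetical candidate boxes for one entity mention.
boxes = [[0, 0, 50, 50], [10, 10, 200, 150], [5, 5, 30, 30]]
best, scores = rank_boxes(boxes,
                          emb_score=[0.2, 0.9, 0.1],
                          det_score=[0.5, 0.8, 0.3],
                          color_score=[0.1, 0.6, 0.2])
```

The size bias acts as a prior: absent strong evidence from the other cues, a mention is grounded to a larger, more salient region.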
Existing super-resolution fluorescence microscopes compromise acquisition speed to provide subdiffractive sample information. We report an analog implementation of structured illumination microscopy that enables three-dimensional (3D) super-resolution imaging with a lateral resolution of 145 nm and an axial resolution of 350 nm at acquisition speeds up to 100 Hz. By using optical instead of digital image-processing operations, we removed the need to capture, store and combine multiple camera exposures, increasing data acquisition rates 10- to 100-fold over other super-resolution microscopes and acquiring and displaying super-resolution images in real time. Low excitation intensities allow imaging over hundreds of 2D sections, and combined physical and computational sectioning allows depth penetration similar to that of spinning-disk confocal microscopy. We demonstrate the capability of our system by imaging fine, rapidly moving structures including motor-driven organelles in human lung fibroblasts and the cytoskeleton of flowing blood cells within developing zebrafish embryos.
In a disaster situation, local and municipal governments need to distribute relief supplies and provide administrative support to evacuees. Although people are supposed to evacuate to evacuation shelters designated by local governments, some people take refuge at non-designated facilities, called non-designated evacuation shelters, due to unavoidable circumstances such as damage to the access routes to designated evacuation shelters. Upon occurrence of a disaster, therefore, it is necessary for local governments to quickly find the locations of non-designated evacuation shelters. In this paper, we propose a method that detects non-designated evacuation shelters through autoencoder (AE)-based anomaly detection on real-time population dynamics generated from operation data of cellular phone networks. We assume that the reconstruction errors of an AE model include both errors due to characteristic differences between locations and errors due to anomalies in population dynamics. We therefore propose using the ratio of the reconstruction errors before and after the earthquake to determine the threshold for anomaly detection. We evaluate the performance of the proposed method on data from three actual earthquakes in Japan. The evaluation results show that our reconstruction-error-based approach achieves better accuracy on the actual disaster data than a baseline method that exploits statistical anomaly detection.
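A minimal sketch of the reconstruction-error-ratio idea, substituting a rank-k linear autoencoder (PCA) for the paper's trained AE; the population counts, the anomaly pattern, and the threshold of 10 are all hypothetical:

```python
import numpy as np

def fit_linear_ae(X, k):
    """Fit a rank-k linear autoencoder (PCA with tied weights) to rows of X."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]                     # mean and encoder/decoder weights

def recon_error(X, mu, W):
    """Per-row squared reconstruction error."""
    Z = (X - mu) @ W.T                    # encode
    Xh = mu + Z @ W                       # decode
    return ((X - Xh) ** 2).sum(axis=1)

rng = np.random.default_rng(0)
# Hypothetical hourly population counts: 200 locations x 24 hours.
pre = rng.normal(100, 5, size=(200, 24))
post = pre.copy()
post[:5] += 60    # 5 locations with an abnormal post-quake population influx

mu, W = fit_linear_ae(pre, k=3)
# Ratio of post- to pre-event reconstruction error cancels the per-location
# "characteristic difference" term, leaving the anomaly term.
ratio = recon_error(post, mu, W) / (recon_error(pre, mu, W) + 1e-9)
flagged = np.where(ratio > 10.0)[0]       # candidate non-designated shelters
```

Dividing by the pre-event error is what makes the score comparable across locations whose dynamics the AE reconstructs unequally well.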
In the era of "big data," science is increasingly information driven, and the potential for computers to store, manage, and integrate massive amounts of data has given rise to such new disciplinary fields as biomedical informatics. Applied ontology offers a strategy for the organization of scientific information in computer-tractable form, drawing on concepts not only from computer and information science but also from linguistics, logic, and philosophy. This book provides an introduction to the field of applied ontology that is of particular relevance to biomedicine, covering theoretical components of ontologies, best practices for ontology design, and examples of biomedical ontologies in use. After defining an ontology as a representation of the types of entities in a given domain, the book distinguishes between different kinds of ontologies and taxonomies, and shows how applied ontology draws on more traditional ideas from metaphysics. It presents the core features of the Basic Formal Ontology (BFO), now used by over one hundred ontology projects around the world, and offers examples of domain ontologies that utilize BFO. The book also describes Web Ontology Language (OWL), a common framework for Semantic Web technologies. Throughout, the book provides concrete recommendations for the design and construction of domain ontologies.
Interpersonal relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. We investigate whether such fine-grained and high-level relation traits can be characterized and quantified from face images in the wild. We address this challenging problem by first studying a deep network architecture for robust recognition of facial expressions. Unlike existing models that typically learn from facial expression labels alone, we devise an effective multitask network that is capable of learning from rich auxiliary attributes such as gender, age, and head pose, beyond just facial expression data. While conventional supervised training requires datasets with complete labels (e.g., all samples must be labeled with gender, age, and expression), we show that this requirement can be relaxed via a novel attribute propagation method. The approach further allows us to leverage the inherent correspondences between heterogeneous attribute sources despite the disparate distributions of different datasets. With this network we demonstrate state-of-the-art results on existing facial expression recognition benchmarks. To predict interpersonal relation, we use the expression recognition network as branches of a Siamese model. Extensive experiments show that our model is capable of mining mutual context of faces for accurate fine-grained interpersonal prediction.
We introduce a novel matching algorithm, called DeepMatching, to compute dense correspondences between images. DeepMatching relies on a hierarchical, multi-layer, correlational architecture designed for matching images and was inspired by deep convolutional approaches. The proposed matching algorithm can handle non-rigid deformations and repetitive textures and efficiently determines dense correspondences in the presence of significant changes between images. We evaluate the performance of DeepMatching, in comparison with state-of-the-art matching algorithms, on the Mikolajczyk (Mikolajczyk et al., A comparison of affine region detectors, 2005), the MPI-Sintel (Butler et al., A naturalistic open source movie for optical flow evaluation, 2012) and the KITTI (Geiger et al., Vision meets robotics: The KITTI dataset, 2013) datasets. DeepMatching outperforms the state-of-the-art algorithms and shows excellent results in particular for repetitive textures. We also apply DeepMatching to the computation of optical flow, called DeepFlow, by integrating it in the large displacement optical flow (LDOF) approach of Brox and Malik (Large displacement optical flow: descriptor matching in variational motion estimation, 2011). Additional robustness to large displacements and complex motion is obtained thanks to our matching approach. DeepFlow obtains competitive performance on public benchmarks for optical flow estimation.
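The correlation operation at the heart of such matching can be illustrated at a single level with exhaustive normalized cross-correlation of one patch; the actual DeepMatching algorithm aggregates such correlation maps across a multi-layer pyramid rather than matching a single patch:

```python
import numpy as np

def best_match(patch, image):
    """Exhaustive normalized cross-correlation of one patch over an image.
    A single-level sketch only: the hierarchical algorithm builds and
    aggregates many such correlation maps."""
    ph, pw = patch.shape
    p = (patch - patch.mean()) / (patch.std() + 1e-9)
    best, best_pos = -np.inf, None
    H, W = image.shape
    for y in range(H - ph + 1):
        for x in range(W - pw + 1):
            win = image[y:y + ph, x:x + pw]
            wn = (win - win.mean()) / (win.std() + 1e-9)
            score = (p * wn).mean()      # NCC in [-1, 1]
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best

rng = np.random.default_rng(1)
img1 = rng.random((40, 40))
img2 = np.roll(img1, shift=(3, 5), axis=(0, 1))   # simulated translation
patch = img1[10:18, 10:18]
pos, score = best_match(patch, img2)    # recovers the (3, 5) displacement
```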
Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition. For the task of capturing temporal structure in video, however, there still remain numerous open research questions. Current research suggests using a simple temporal feature pooling strategy to take into account the temporal aspect of video. We demonstrate that this method is not sufficient for gesture recognition, where temporal information is more discriminative compared to general video classification tasks. We explore deep architectures for gesture recognition in video and propose a new end-to-end trainable neural network architecture incorporating temporal convolutions and bidirectional recurrence. Our main contributions are twofold: first, we show that recurrence is crucial for this task; second, we show that adding temporal convolutions leads to significant improvements. We evaluate the different approaches on the Montalbano gesture recognition dataset, where we achieve state-of-the-art results.
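The two ingredients, temporal convolution followed by bidirectional recurrence, can be sketched in plain NumPy; the shapes, kernel size, and random weights below are illustrative and are not the paper's architecture:

```python
import numpy as np

def temporal_conv(x, w):
    """1-D temporal convolution over a (T, C) frame-feature sequence with a
    kernel of shape (k, C, C_out), 'valid' padding, followed by ReLU."""
    T, _ = x.shape
    k = w.shape[0]
    out = np.stack([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                    for t in range(T - k + 1)])
    return np.maximum(out, 0.0)

def bidirectional_rnn(x, Wx, Wh):
    """Minimal tanh RNN run forward and backward over time; the two hidden
    sequences are concatenated per time step."""
    def run(seq):
        h = np.zeros(Wh.shape[0])
        hs = []
        for t in range(seq.shape[0]):
            h = np.tanh(seq[t] @ Wx + h @ Wh)
            hs.append(h)
        return np.stack(hs)
    fwd = run(x)
    bwd = run(x[::-1])[::-1]             # backward pass, re-aligned in time
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
frames = rng.normal(size=(16, 8))             # 16 frames, 8 features each
conv_w = rng.normal(size=(3, 8, 4)) * 0.1     # temporal kernel size 3
feats = temporal_conv(frames, conv_w)         # local motion features, (14, 4)
H = 5
out = bidirectional_rnn(feats,
                        rng.normal(size=(4, H)) * 0.1,
                        rng.normal(size=(H, H)) * 0.1)   # (14, 2*H)
```

The convolution captures short-range motion cues; the bidirectional pass lets each frame's output depend on both past and future context, which is what pooling strategies discard.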
Several reports have shown that radiomic features are affected by acquisition and reconstruction parameters, thus hampering multicenter studies. We propose a method that, by removing the center effect while preserving patient-specific effects, standardizes features measured from PET images obtained using different imaging protocols.
Pretreatment ¹⁸F-FDG PET images of patients with breast cancer were included. In one nuclear medicine department (department A), 63 patients were scanned on a time-of-flight PET/CT scanner, and 16 lesions were triple-negative (TN). In another nuclear medicine department (department B), 74 patients underwent PET/CT on a different brand of scanner and a different reconstruction protocol, and 15 lesions were TN. The images from department A were smoothed using a gaussian filter to mimic data from a third department (department A-S). The primary lesion was segmented to obtain a lesion volume of interest (VOI), and a spheric VOI was set in healthy liver tissue. Three SUVs and 6 textural features were computed in all VOIs. A harmonization method initially described for genomic data was used to estimate the department effect based on the observed feature values. Feature distributions in each department were compared before and after harmonization.
In healthy liver tissue, the distributions significantly differed for 4 of 9 features between departments A and B and for 6 of 9 between departments A and A-S (P < 0.05, Wilcoxon test). After harmonization, none of the 9 feature distributions significantly differed between any 2 departments (P > 0.1). The same trend was observed in lesions, with a realignment of feature distributions between the departments after harmonization. Identification of TN lesions was largely enhanced after harmonization when the cutoffs were determined on data from one department and applied to data from the other department.
The proposed harmonization method is efficient at removing the multicenter effect for textural features and SUVs. The method is easy to use, retains biologic variations not related to a center effect, and does not require any feature recalculation. Such harmonization allows for multicenter studies and for external validation of radiomic models or cutoffs and should facilitate the use of radiomic models in clinical practice.
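The location-scale core of such a harmonization can be sketched as follows; this is a simplified ComBat-style adjustment without the empirical-Bayes shrinkage of the genomic method the paper adapts, and the SUV values are simulated:

```python
import numpy as np

def harmonize(values, centers):
    """Location-scale alignment of one feature across centers: remove the
    per-center mean/variance shift while keeping within-center variation.
    A simplified ComBat-style adjustment (no empirical-Bayes shrinkage)."""
    values = np.asarray(values, float)
    grand_mu, grand_sd = values.mean(), values.std()
    out = np.empty_like(values)
    for c in np.unique(centers):
        m = centers == c
        # Standardize within the center, then map to the pooled scale.
        out[m] = (values[m] - values[m].mean()) / (values[m].std() + 1e-9)
        out[m] = out[m] * grand_sd + grand_mu
    return out

rng = np.random.default_rng(0)
# Hypothetical SUVmax values with a center offset (scanner/protocol effect).
suv_a = rng.normal(6.0, 1.0, 60)        # department A
suv_b = rng.normal(7.5, 1.6, 70)        # department B: shifted and wider
vals = np.concatenate([suv_a, suv_b])
ctrs = np.array(["A"] * 60 + ["B"] * 70)
harm = harmonize(vals, ctrs)            # center effect removed
```

After alignment, a cutoff learned on one department's values can be applied to the other's, which is the multicenter use case the abstract describes.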
The quality of super-resolution images obtained by single-molecule localization microscopy (SMLM) depends largely on the software used to detect and accurately localize point sources. In this work, we focus on the computational aspects of super-resolution microscopy and present a comprehensive evaluation of localization software packages. Our philosophy is to evaluate each package as a whole, thus maintaining the integrity of the software. We prepared synthetic data that represent three-dimensional structures modeled after biological components, taking excitation parameters, noise sources, point-spread functions and pixelation into account. We then asked developers to run their software on our data; most responded favorably, allowing us to present a broad picture of the methods available. We evaluated their results using quantitative and user-interpretable criteria: detection rate, accuracy, quality of image reconstruction, resolution, software usability and computational resources. These metrics reflect the various tradeoffs of SMLM software packages and help users to choose the software that fits their needs.
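The detection-rate and accuracy criteria can be illustrated with a simple tolerance-based matching of predicted localizations to ground-truth emitters; this greedy scheme is a stand-in for, not a description of, the benchmark's actual matching procedure:

```python
import numpy as np

def evaluate_localizations(gt, pred, tol):
    """Greedily match predicted localizations to ground-truth emitters
    within radius `tol`; return detection rate (recall) and the RMSE of
    the matched localizations."""
    gt, pred = np.asarray(gt, float), np.asarray(pred, float)
    unused = list(range(len(pred)))      # each prediction matched at most once
    errs = []
    for g in gt:
        if not unused:
            break
        d = np.linalg.norm(pred[unused] - g, axis=1)
        j = int(np.argmin(d))
        if d[j] <= tol:
            errs.append(d[j])
            unused.pop(j)
    recall = len(errs) / len(gt)
    rmse = float(np.sqrt(np.mean(np.square(errs)))) if errs else float("nan")
    return recall, rmse

# Hypothetical 2-D positions in nanometres.
gt = [[10, 10], [50, 50], [80, 20]]
pred = [[11, 10], [49, 51], [200, 200]]   # two hits, one spurious detection
recall, rmse = evaluate_localizations(gt, pred, tol=5.0)
```

The tolerance radius embodies the tradeoff the abstract mentions: a permissive radius inflates detection rate at the cost of reported accuracy.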
Deep learning for biological image classification. Affonso, Carlos; Rossi, André Luis Debiaso; Vieira, Fábio Henrique Antunes. Expert Systems with Applications, vol. 85, November 2017. Journal article, peer reviewed, open access.
Highlights: compare a deep learning architecture with machine learning techniques; classify the quality of wood boards; extract texture descriptors from wood images.
A number of industries use human inspection to visually classify the quality of their products and of the raw materials used in the production process; this process could be done automatically through digital image processing. Industries are not always interested in the most accurate technique for a given problem, but in the one most appropriate for the expected results: there must be a balance between accuracy and computational cost. This paper investigates the classification of the quality of wood boards based on their images. For this purpose, it compares the use of deep learning, particularly convolutional neural networks, with the combination of texture-based feature extraction techniques and traditional classification techniques: decision tree induction algorithms, neural networks, nearest neighbors and support vector machines. Reported studies show that deep learning techniques applied to image-processing tasks have achieved predictive performance superior to traditional classification techniques, mainly in highly complex scenarios. One of the reasons pointed out is their embedded feature extraction mechanism: deep learning techniques directly identify and extract the features they deem relevant in a given image dataset. However, empirical results for this image dataset show that the proposed texture descriptor method, regardless of the strategy employed, is very competitive with convolutional neural networks across all the performed experiments. The strong performance of the texture descriptor method could be caused by the nature of the image dataset. Finally, some perspectives on future developments involving active learning and semi-supervised methods are pointed out.
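A texture descriptor of the kind pitted against CNNs here can be illustrated with a small gray-level co-occurrence computation; the specific descriptor and the synthetic test images below are illustrative, not the paper's proposed method:

```python
import numpy as np

def glcm_features(img, levels=8):
    """Gray-level co-occurrence matrix (horizontal neighbour, distance 1)
    reduced to two classic texture descriptors: contrast and homogeneity.
    `img` is expected to hold intensities in [0, 1]."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)  # quantize
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm /= glcm.sum()                     # joint neighbour-pair distribution
    i, j = np.indices(glcm.shape)
    contrast = float(((i - j) ** 2 * glcm).sum())
    homogeneity = float((glcm / (1.0 + np.abs(i - j))).sum())
    return contrast, homogeneity

rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0, 1, 32), (32, 1))   # smooth, defect-free grain
noisy = rng.random((32, 32))                       # rough, knotty texture
c_s, h_s = glcm_features(smooth)
c_n, h_n = glcm_features(noisy)
```

Such hand-crafted descriptors feed a conventional classifier (decision tree, k-NN, SVM), which is exactly the pipeline the paper found competitive with a CNN on its wood-board images.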