Multiple Hypothesis Video Segmentation (MHVS) is a method for the unsupervised photometric segmentation of video sequences. MHVS segments arbitrarily long video streams by considering only a few frames at a time, and handles the automatic creation, continuation and termination of labels with no user initialization or supervision. The process begins by generating several pre-segmentations per frame and enumerating multiple possible trajectories of pixel regions within a short time window. After assigning each trajectory a score, we let the trajectories compete with each other to segment the sequence. We determine the solution of this segmentation problem as the MAP labeling of a higher-order random field. This framework allows MHVS to achieve spatial and temporal long-range label consistency while operating in an on-line manner. We test MHVS on several videos of natural scenes with arbitrary camera and object motion.
Recent progress in the measurement of surface reflectance has created a demand for non-parametric appearance representations that are accurate, compact, and easy to use for rendering. Another crucial goal, which has so far received little attention, is editability: for practical use, we must be able to change both the directional and spatial behavior of surface reflectance (e.g., making one material shinier, another more anisotropic, and changing the spatial "texture maps" indicating where each material appears). We introduce an Inverse Shade Tree framework that provides a general approach to estimating the "leaves" of a user-specified shade tree from high-dimensional measured datasets of appearance. These leaves are sampled 1- and 2-dimensional functions that capture both the directional behavior of individual materials and their spatial mixing patterns. In order to compute these shade trees automatically, we map the problem to matrix factorization and introduce a flexible new algorithm that allows for constraints such as non-negativity, sparsity, and energy conservation. Although we cannot infer every type of shade tree, we demonstrate the ability to reduce multi-gigabyte measured datasets of the Spatially-Varying Bidirectional Reflectance Distribution Function (SVBRDF) into a compact representation that may be edited in real time.
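The mapping to constrained matrix factorization can be illustrated with a small sketch. The abstract does not specify the authors' algorithm, so the code below substitutes plain non-negative matrix factorization with the well-known multiplicative updates; it enforces only non-negativity, not sparsity or energy conservation. The function names, the toy matrix, and the reading of rows as spatial samples and columns as directional samples are all assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix v (n x m) into w (n x rank) and h (rank x m).

    Uses standard multiplicative updates; this is NOT the paper's algorithm.
    Entries stay non-negative because each update multiplies by a
    non-negative ratio.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the residual v - w*h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb)) ** 0.5

# Hypothetical toy data: 3 spatial samples x 3 directional samples,
# generated from a single material (a rank-1 matrix).
spatial = [1.0, 2.0, 3.0]
directional = [1.0, 0.5, 2.0]
v = [[s * d for d in directional] for s in spatial]

w, h = nmf(v, rank=1)
error = frobenius_error(v, w, h)
```

In this reading, the column of `w` would play the role of a spatial blending map and the row of `h` a sampled directional function, which is the kind of editable "leaf" the abstract describes.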
The facial performance of an individual is inherently rich in subtle deformation and timing details. Although these subtleties make the performance realistic and compelling, they often elude both motion capture and hand animation. We present a technique for adding fine-scale details and expressiveness to low-resolution art-directed facial performances, such as those created manually using a rig, via marker-based capture, by fitting a morphable model to a video, or through Kinect reconstruction using recent faceshift technology. We employ a high-resolution facial performance capture system to acquire a representative performance of an individual in which he or she explores the full range of facial expressiveness. From the captured data, our system extracts an expressiveness model that encodes subtle spatial and temporal deformation details specific to that particular individual. Once this model has been built, these details can be transferred to low-resolution art-directed performances. We demonstrate results on various forms of input; after our enhancement, the resulting animations exhibit the same nuances and fine spatial details as the captured performance, with optional temporal enhancement to match the dynamics of the actor. Finally, we show that our technique outperforms the current state-of-the-art in example-based facial animation.
This paper presents a technique to recover geometry from time‐lapse sequences of outdoor scenes. We build upon photometric stereo techniques to recover approximate shadowing, shading and normal components, allowing us to alter the material and normals of the scene. Previous work in analyzing such images has faced two fundamental difficulties: 1. the illumination in outdoor images consists of time‐varying sunlight and skylight, and 2. the motion of the sun is restricted to a near‐planar arc through the sky, making surface normal recovery unstable. We develop methods to estimate the reflection component due to skylight illumination. We also show that sunlight directions are usually non‐planar, thus making surface normal recovery possible. This allows us to estimate approximate surface normals for outdoor scenes using a single day of data. We demonstrate the use of these surface normals for a number of image editing applications including reflectance, lighting, and normal editing.
We report on a controlled user study comparing three visualization environments for common 3D exploration. Our environments differ in how they exploit natural human perception and interaction capabilities. We compare an augmented-reality head-mounted display (Microsoft HoloLens), a handheld tablet, and a desktop setup. The novel head-mounted HoloLens display projects stereoscopic images of virtual content into a user's real world and allows for interaction in-situ at the spatial position of the 3D hologram. The tablet is able to interact with 3D content through touch, spatial positioning, and tangible markers; however, 3D content is still presented on a 2D surface. Our hypothesis is that visualization environments that match human perceptual and interaction capabilities better to the task at hand improve understanding of 3D visualizations. To better understand the space of display and interaction modalities in visualization environments, we first propose a classification based on three dimensions: perception, interaction, and the spatial and cognitive proximity of the two. Each technique in our study is located at a different position along these three dimensions. We asked 15 participants to perform four tasks, each task having different levels of difficulty for both spatial perception and degrees of freedom for interaction. Our results show that each of the tested environments is more effective for certain tasks, but that generally the desktop environment is still fastest and most precise in almost all cases.
Video matting is the process of pulling a high-quality alpha matte and foreground from a video sequence. Current techniques require either a known background (e.g., a blue screen) or extensive user interaction (e.g., to specify known foreground and background elements). The matting problem is generally under-constrained, since not enough information has been collected at capture time. We propose a novel, fully autonomous method for pulling a matte using multiple synchronized video streams that share a point of view but differ in their plane of focus. The solution is obtained by directly minimizing the error in filter-based image formation equations, which are over-constrained by our rich data stream. Our system solves the fully dynamic video matting problem without user assistance: both the foreground and background may be high frequency and have dynamic content, the foreground may resemble the background, and the scene is lit by natural (as opposed to polarized or collimated) illumination.
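Why matting from a single image is under-constrained can be seen in a short sketch. It assumes the standard per-pixel compositing model, observed = alpha * foreground + (1 - alpha) * background, which the abstract does not state explicitly; the function name and the pixel values are illustrative only, and this is not the filter-based formulation the authors minimize.

```python
def composite(alpha, foreground, background):
    """Blend one RGB foreground pixel over one RGB background pixel."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))

black = (0.0, 0.0, 0.0)

# Two different explanations of the same observed pixel:
# a half-transparent white foreground, or an opaque mid-gray foreground.
observed_a = composite(0.5, (1.0, 1.0, 1.0), black)
observed_b = composite(1.0, (0.5, 0.5, 0.5), black)
```

Both calls produce the same observed color even though the alpha values differ, so a single observation cannot separate alpha from foreground color. Additional measurements of the same pixel, such as the differently focused streams described above, add equations that constrain the unknowns.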
Immunotherapy shows durable response but only in a subset of patients, and testing for predictive biomarkers requires procedures in addition to the routine workflow. We proposed a confounder-aware representation learning-based system, genopathomic biomarker for immunotherapy response (PITER), that uses only diagnosis-acquired hematoxylin-eosin (H&E)-stained pathological slides by leveraging histopathological and genetic characteristics to identify candidates for immunotherapy. PITER was generated and tested with three datasets containing 1944 slides of 1239 patients. PITER was found to be a useful biomarker to identify patients with lung adenocarcinoma with both favorable progression-free and overall survival in the immunotherapy cohort (p < 0.05). PITER was significantly associated with pathways involved in active cell division and a more immune-activating microenvironment, which indicated the biological basis for identifying patients with a favorable outcome of immunotherapy. Thus, PITER may be a potential biomarker to identify patients with lung adenocarcinoma with a good response to immunotherapy, and potentially provide precise treatment.
• Genetic profile could be extracted by deep learning from digital histologic slides
• Extracted pathomic signature could identify patients’ response to immunotherapy
• The extracted pathomic signature was involved in an active immune microenvironment
Immunology; Cancer; Artificial intelligence.
Upcoming technologies enable routine collection of highly multiplexed (20-60 channel), subcellular resolution images of mammalian tissues for research and diagnosis. Extracting single cell data from such images requires accurate image segmentation, a challenging problem commonly tackled with deep learning. In this paper, we report two findings that substantially improve image segmentation of tissues using a range of machine learning architectures. First, we unexpectedly find that the inclusion of intentionally defocused and saturated images in training data substantially improves subsequent image segmentation. Such real augmentation outperforms computational augmentation (Gaussian blurring). In addition, we find that it is practical to image the nuclear envelope in multiple tissues using an antibody cocktail, thereby better identifying nuclear outlines and improving segmentation. The two approaches cumulatively and substantially improve segmentation on a wide range of tissue types. We speculate that the use of real augmentations will have applications in image processing outside of microscopy.
Evaluating 'Graphical Perception' with CNNs
Haehn, Daniel; Tompkin, James; Pfister, Hanspeter
IEEE Transactions on Visualization and Computer Graphics, 01/2019, Volume 25, Issue 1
Journal Article, Peer reviewed, Open access
Convolutional neural networks can successfully perform many computer vision tasks on images. For visualization, how do CNNs perform when applied to graphical perception tasks? We investigate this question by reproducing Cleveland and McGill's seminal 1984 experiments, which measured human perception efficiency of different visual encodings and defined elementary perceptual tasks for visualization. We measure the graphical perceptual capabilities of four network architectures on five different visualization tasks and compare to existing and new human performance baselines. While under limited circumstances CNNs are able to meet or outperform human task performance, we find that CNNs are not currently a good model for human graphical perception. We present the results of these experiments to foster the understanding of how CNNs succeed and fail when applied to data visualizations.
Deblurring Images via Dark Channel Prior
Pan, Jinshan; Sun, Deqing; Pfister, Hanspeter ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 10/2018, Volume 40, Issue 10
Journal Article, Peer reviewed, Open access
We present an effective blind image deblurring algorithm based on the dark channel prior. The motivation of this work is an interesting observation that the dark channel of blurred images is less sparse. While most patches in a clean image contain some dark pixels, this is not the case when they are averaged with neighboring ones by motion blur. This change in sparsity of the dark channel pixels is an inherent property of the motion blur process, which we prove mathematically and validate using image data. Enforcing sparsity of the dark channel thus helps blind deblurring in various scenarios such as natural, face, text, and low-illumination images. However, imposing sparsity of the dark channel introduces a non-convex non-linear optimization problem. In this work, we introduce a linear approximation to address this issue. Extensive experiments demonstrate that the proposed deblurring algorithm achieves state-of-the-art results on natural images and performs favorably against methods designed for specific scenarios. In addition, we show that the proposed method can be applied to image dehazing.
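The observation that blur makes the dark channel less sparse can be checked on a toy image. The sketch below assumes the usual definition of the dark channel (the minimum over color channels, followed by a minimum over a local patch), which the abstract does not spell out. It uses a 3-pixel horizontal average as a stand-in for motion blur; the image, patch radius, and function names are illustrative assumptions, and the code does not implement the deblurring algorithm itself.

```python
def dark_channel(image, radius=1):
    """Dark channel of an image given as rows of (r, g, b) tuples in [0, 1]."""
    h, w = len(image), len(image[0])
    # Per-pixel minimum over the three color channels.
    min_rgb = [[min(px) for px in row] for row in image]
    dark = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            # Minimum over the local patch, clipped at the image border.
            dark[y][x] = min(min_rgb[j][i] for j in ys for i in xs)
    return dark

def motion_blur_horizontal(image, radius=1):
    """Average each pixel with its horizontal neighbors (toy motion blur)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(tuple(sum(image[y][i][c] for i in xs) / len(xs)
                             for c in range(3)))
        out.append(row)
    return out

def count_zeros(channel, eps=1e-9):
    """Number of (near-)zero entries, used here as a simple sparsity measure."""
    return sum(1 for row in channel for v in row if v <= eps)

WHITE, BLACK = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
# Hypothetical 9x9 test image: white with one black pixel in every 3x3 patch.
clean = [[BLACK if (y % 3 == 1 and x % 3 == 1) else WHITE for x in range(9)]
         for y in range(9)]
blurred = motion_blur_horizontal(clean)

clean_zeros = count_zeros(dark_channel(clean))
blurred_zeros = count_zeros(dark_channel(blurred))
```

On this image every patch of the clean input contains a black pixel, so its dark channel is zero everywhere, while the blur mixes each black pixel with white neighbors and leaves no zero entries. Counting zeros is only one way to measure sparsity and is chosen here for simplicity.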