D-NeRF: Neural Radiance Fields for Dynamic Scenes Pumarola, Albert; Corona, Enric; Pons-Moll, Gerard ...
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2021
Conference Proceeding
Open access
Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, Neural Radiance Fields (NeRF) [31] stands out: it trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) to a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism in the generated images, NeRF is only applicable to static scenes, where the same spatial location can be queried from different images. In this paper we introduce D-NeRF, a method that extends neural radiance fields to the dynamic domain, making it possible to reconstruct and render novel images of objects under rigid and non-rigid motions from a single camera moving around the scene. For this purpose we consider time as an additional input to the system and split the learning process into two main stages: one that encodes the scene into a canonical space, and another that maps this canonical representation into the deformed scene at a particular time. Both mappings are learned simultaneously using fully connected networks. Once the networks are trained, D-NeRF can render novel images, controlling both the camera view and the time variable, and thus the object movement. We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions. Code, model weights and the dynamic scenes dataset will be made publicly available.
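As a concrete picture of the two-stage mapping described above, the following minimal PyTorch sketch composes a deformation network (point and time to a displacement into canonical space) with a canonical NeRF head. It is an illustrative sketch only: positional encoding and the volume-rendering integral are omitted, and the class names and layer sizes are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class DeformationNet(nn.Module):
    """Maps a 3D point x and time t to a displacement into canonical space."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )
    def forward(self, x, t):
        return self.mlp(torch.cat([x, t], dim=-1))  # Δx

class CanonicalNet(nn.Module):
    """Standard NeRF head: canonical point + view direction -> (density, rgb)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density + 3 RGB
        )
    def forward(self, x_canonical, d):
        out = self.mlp(torch.cat([x_canonical, d], dim=-1))
        sigma, rgb = out[..., :1], torch.sigmoid(out[..., 1:])
        return sigma, rgb

deform, canonical = DeformationNet(), CanonicalNet()
x = torch.rand(1024, 3)            # sampled points along rays
d = torch.rand(1024, 3)            # viewing directions
t = torch.full((1024, 1), 0.5)     # scene time in [0, 1]
sigma, rgb = canonical(x + deform(x, t), d)  # then integrated along rays as in NeRF
```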
We present an approach to synthesize highly photorealistic images of 3D object models, which we use to train a convolutional neural network for detecting the objects in real images. The proposed approach has three key ingredients: (1) 3D object models are rendered in 3D models of complete scenes with realistic materials and lighting, (2) plausible geometric configurations of objects and cameras in a scene are generated using physics simulation, and (3) high photorealism of the synthesized images is achieved by physically based rendering. When trained on images synthesized by the proposed approach, the Faster R-CNN object detector [1] achieves a 24% absolute improvement of mAP@.75IoU on the Rutgers APC [2] and 11% on the LineMod-Occluded [3] datasets, compared to a baseline where the training images are synthesized by rendering object models on top of random photographs. This work is a step towards being able to effectively train object detectors without capturing or annotating any real images. A dataset of 400K synthetic images with ground-truth annotations for various computer vision tasks will be released on the project website: thodan.github.io/objectsynth.
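Ingredient (2) above, physics-based pose generation, can be illustrated with a short PyBullet sketch that drops objects onto a surface and lets them settle into plausible contact poses before rendering. PyBullet and its bundled assets are a stand-in assumption here; the paper does not name its simulator.

```python
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                          # headless physics server
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                     # stand-in for a shelf/table surface

# Drop a few objects from above so they settle into plausible contact poses.
objects = [p.loadURDF("duck_vhacd.urdf", basePosition=[0, 0, 0.3 + 0.2 * i])
           for i in range(3)]
for _ in range(240):                         # ~1 s at the default 240 Hz step
    p.stepSimulation()

# The settled poses would then be handed to a physically based renderer.
for obj in objects:
    pos, orn = p.getBasePositionAndOrientation(obj)
    print(pos, orn)
```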
White light-emitting diodes (WLEDs) with high color rendering are strongly required in some color-critical lighting applications. However, developing a stable color converter with broad emission remains an outstanding challenge for high-power WLEDs. Herein, a stacked color converter, with a red CaAlSiN3:Eu2+ (CASN) phosphor-in-silicone (PiS) layer coated on a green Y3Al3.08Ga1.92O12:Ce3+ (YAGG) phosphor-in-glass (PiG), was fabricated to realize ultra-high color rendering for high-power WLEDs. The YAGG PiG was prepared by printing and sintering a mixture of YAGG phosphor and a borosilicate glass matrix, and then the CASN PiS layer was coated onto the YAGG PiG. The PiS thickness was controlled to adjust the luminescence of the converters. The converter exhibits broad emission due to compensation of the cyan valley. The converter enables pure white light with ultra-high color rendering (Ra = 96.1, R9 = 90.1, R13 = 98.5; Rf = 92, Rg = 100) and chromaticity coordinates of (0.367, 0.372). The WLED achieves stable color quality under various driving currents.
• The stacked color converter with broad emission was proposed for high-power WLEDs.
• The stacked converters were fabricated by coating red CASN PiS on green YAGG PiG.
• The luminescence of the converters was optimized by adjusting the PiS layer thickness.
• The stacked-converter-based WLED achieves ultra-high color rendering.
We introduce NeuroConstruct, a novel end-to-end application for the segmentation, registration, and visualization of brain volumes imaged using wide-field microscopy. NeuroConstruct offers a Segmentation Toolbox with various annotation helper functions that aid experts in effectively and precisely annotating micrometer-resolution neurites. It also offers automatic neurite segmentation using convolutional neural networks (CNNs) trained on the Toolbox annotations, and soma segmentation using thresholding. To visualize neurites in a given volume, NeuroConstruct offers hybrid rendering, combining iso-surface rendering of high-confidence classified neurites with real-time rendering of the raw volume using a 2D transfer function over voxel classification score versus voxel intensity. For complete reconstruction of 3D neurites, we introduce a Registration Toolbox that provides automatic coarse-to-fine alignment of serially sectioned samples. Quantitative and qualitative analyses show that NeuroConstruct outperforms the state of the art in all design aspects. NeuroConstruct was developed as a collaboration between computer scientists and neuroscientists, with an application to the study of cholinergic neurons, which are severely affected in Alzheimer's disease.
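The 2D transfer function mentioned above maps each voxel's (classification score, intensity) pair to color and opacity. The sketch below shows that lookup in NumPy; the table resolution and the particular color ramp are illustrative assumptions, not NeuroConstruct's actual table.

```python
import numpy as np

TF_RES = 64
# tf[score_bin, intensity_bin] -> RGBA; emphasize high-confidence neurites.
tf = np.zeros((TF_RES, TF_RES, 4), dtype=np.float32)
ss, ii = np.meshgrid(np.linspace(0, 1, TF_RES),
                     np.linspace(0, 1, TF_RES), indexing="ij")
tf[..., 0] = ss                 # red channel follows classification score
tf[..., 1] = 1.0 - ss           # low-confidence voxels rendered greenish
tf[..., 3] = ss * ii            # opaque only when score AND intensity are high

def classify_voxel(score, intensity):
    """Map a voxel's (score, intensity) pair to an RGBA sample."""
    s = min(int(score * (TF_RES - 1)), TF_RES - 1)
    i = min(int(intensity * (TF_RES - 1)), TF_RES - 1)
    return tf[s, i]

print(classify_voxel(0.9, 0.7))   # high-confidence, bright voxel -> strong opacity
```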
One of the unresolved problems in the field of computer creativity is developing a robot capable of creating full-color images with artistic paints in a human-like manner. While several advanced painting machines have been presented to date, high-grade color rendition remains one of the main bottlenecks in robotic painting. In this paper, we present a robotic setup for realistic grayscale painting. The key feature of our robot is a special paint-mixing device aimed at improving tone rendition in comparison to previously reported approaches. We describe the main algorithmic and hardware solutions implemented in our robot as well as the first experimental results. The robot is a 3-DoF CNC machine equipped with a brush, the paint-mixing device and a syringe-pump block for paint supply. In our study, we focus on monochrome painting with black and white acrylic paints. A mathematical model of primary paint mixing is described. Two realistic artworks were created during test runs, and their reproductions are given together with the source images. The accuracy of tone rendition was experimentally tested. Further research will be aimed at high-grade color rendition and creating full-color paintings.
• A novel mixing device coupled with the brush is proposed for the artistic painting robot.
• A novel algorithm for painterly rendering is adapted for the robotic painting setup.
• Experiments reveal high-quality tone rendition in monochrome painting by the proposed robot.
• Two photograph-based artworks have been created and analyzed as examples (see the sketch below for the mixing idea).
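The abstract mentions a mathematical model of primary paint mixing but does not state it; the linear volume-fraction model below is only a plausible first-order assumption for black/white acrylics, not the paper's actual model.

```python
def mix_ratio(target_tone, tone_white=0.95, tone_black=0.05):
    """Return the white-paint volume fraction w that yields target_tone under
    a linear mixing assumption: tone = w*tone_white + (1-w)*tone_black."""
    w = (target_tone - tone_black) / (tone_white - tone_black)
    return min(max(w, 0.0), 1.0)   # clamp to physically valid fractions

# e.g. a mid-gray of 0.5 would need roughly equal parts of both paints:
print(mix_ratio(0.5))  # ~0.5
```

Real pigment mixing is nonlinear (e.g., Kubelka-Munk theory), so a calibrated lookup from mixing experiments, as the tone-rendition tests in the paper suggest, would replace this linear sketch in practice.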
The camera offset space Hladky, Jozef; Seidel, Hans-Peter; Steinberger, Markus
ACM Transactions on Graphics, 11/2019, Volume 38, Issue 6
Journal Article
Peer reviewed
Potential visibility has historically been important whenever rendering performance was insufficient. With the rise of virtual reality, rendering power may once again be insufficient, e.g., for the integrated graphics of head-mounted displays. To tackle the issue of efficient potential-visibility computations on modern graphics hardware, we introduce the camera offset space (COS). In contrast to traditional visibility computations, where one determines which pixels are covered by an object under all potential viewpoints, the COS describes under which camera movement a sample location is covered by a triangle. In this way, the COS opens up a new set of possibilities for visibility computations. By evaluating the pairwise relations of triangles in the COS, we show how to efficiently determine occluded triangles. Constructing the COS for all pixels of a rendered view leads to a complete potentially visible set (PVS) for complex scenes. By fusing triangles into larger occluders, including locations between pixel centers, and considering camera rotations, we describe an exact PVS algorithm that includes all viewing directions inside a view cell. Implementing the COS is a combination of real-time rendering and compute steps. We provide the first GPU PVS implementation that works without preprocessing, on the fly, on unconnected triangles. This opens the door to a new approach to rendering for virtual-reality head-mounted displays and to server-client settings for streaming 3D applications such as video games.
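To make the inversion of viewpoints concrete: for a fixed pixel, each projected triangle edge yields a constraint on the camera offset, so "under which offsets is this pixel covered?" becomes a constraint intersection. The toy sketch below does this for a 1D offset with linearized edge functions; the real COS handles full offsets and rotations, so this is an illustrative assumption, not the paper's algorithm.

```python
def coverage_interval(edges, d_min=-1.0, d_max=1.0):
    """edges: list of (e0, de) with edge value e(d) = e0 + de*d; the pixel is
    inside the triangle when every e(d) >= 0. Returns the offset interval
    over which all edges are satisfied, or None if the pixel is never covered."""
    lo, hi = d_min, d_max
    for e0, de in edges:
        if de == 0.0:
            if e0 < 0.0:
                return None          # violated regardless of camera offset
        elif de > 0.0:
            lo = max(lo, -e0 / de)   # inside only for d >= -e0/de
        else:
            hi = min(hi, -e0 / de)   # inside only for d <= -e0/de
    return (lo, hi) if lo <= hi else None

# Pixel covered by the triangle only for small positive offsets:
print(coverage_interval([(0.2, -1.0), (-0.1, 1.0), (0.5, 0.0)]))  # (0.1, 0.2)
```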
Multi-layer images are currently the most prominent scene representation for viewing natural scenes under full-motion parallax in virtual reality. Layers ordered in diopter space contain color and transparency so that a complete image is formed when the layers are composited in a view-dependent manner. Once baked, the same limitations apply to multi-layer images as to conventional single-layer photography, making it challenging to remove obstructive objects or otherwise edit the content. Object removal before baking can benefit from filling disoccluded layers with pixels from background layers. However, if no such background pixels have been observed, an inpainting algorithm must fill the empty spots with fitting synthetic content. We present and study a multi-layer inpainting approach that addresses this problem in two stages: first, a volumetric area of interest specified by the user is classified with respect to whether the background pixels have been observed or not; second, the unobserved pixels are filled with multi-layer inpainting. We report on experiments using multiple variants of multi-layer inpainting and compare our solution to conventional inpainting methods that consider each layer individually.
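For readers unfamiliar with the representation: compositing a multi-layer image amounts to blending the diopter-ordered layers back to front with the standard "over" operator, as in the sketch below. The view-dependent per-layer warping is omitted for brevity.

```python
import numpy as np

def composite(layers):
    """layers: list of (rgb, alpha) ordered far to near; rgb is HxWx3, alpha HxW."""
    out = np.zeros_like(layers[0][0])
    for rgb, alpha in layers:                      # far -> near
        a = alpha[..., None]
        out = rgb * a + out * (1.0 - a)            # "over" blending
    return out

far  = (np.ones((4, 4, 3)) * 0.2, np.ones((4, 4)))        # opaque background
near = (np.ones((4, 4, 3)) * 0.9, np.full((4, 4), 0.5))   # half-transparent object
print(composite([far, near])[0, 0])   # 0.9*0.5 + 0.2*0.5 = 0.55 per channel
```

This also shows why object removal is hard after baking: deleting the near layer's pixels exposes background that may never have been observed, which is exactly the case the paper's inpainting stage handles.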
Text is a crucial component of 3D environments and virtual worlds for user interfaces and wayfinding. Implementing text using standard antialiased texture mapping leads to blurry and illegible writing, which hinders usability and navigation. While supersampling removes some of these artifacts, distracting artifacts can still impede legibility, especially on recent high-resolution head-mounted displays. We propose an analytic antialiasing technique that efficiently computes the coverage of text glyphs over pixel footprints and is designed to run at real-time rates. It decomposes glyphs into piecewise-biquadratic curves and trapezoids that can be quickly area-integrated over a pixel footprint to provide crisp, legible, antialiased text, even when mapped onto an arbitrary surface in a 3D virtual environment.
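The core primitive behind such analytic coverage is the exact area of a pixel lying below one glyph edge; trapezoid pieces of a glyph sum such areas with signs, and the biquadratic case integrates a curved edge in the same spirit. A minimal sketch for a linear edge over a unit pixel, with an illustrative function name, follows.

```python
def edge_coverage(m, b):
    """Exact area of {(x, y) in [0,1]^2 : y <= m*x + b}: the pixel coverage
    contributed by one linear (trapezoid) glyph edge."""
    def f(x):                         # edge height, clamped to the pixel
        return min(max(m * x + b, 0.0), 1.0)
    xs = [0.0, 1.0]                   # split where the edge crosses y=0 and y=1
    if m != 0.0:
        xs += [(y - b) / m for y in (0.0, 1.0) if 0.0 < (y - b) / m < 1.0]
    xs.sort()
    # f is piecewise linear between breakpoints, so the trapezoid rule is exact
    return sum(0.5 * (f(x0) + f(x1)) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))

print(edge_coverage(0.0, 0.5))   # horizontal edge through pixel center -> 0.5
print(edge_coverage(1.0, 0.0))   # diagonal edge -> 0.5
```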
This paper introduces a generative rendering network (GRN) based on a U-shaped discriminator for novel-view image synthesis. Recently, some generative adversarial networks have started to explore 3D space and synthesize new images via rendering methods. Compared with 2D-only networks, they have achieved impressive results in synthesizing novel-view images. However, they still share a common shortcoming: the discriminator cannot provide detailed enough information for training a strong generator. To address this problem, we design a U-shaped discriminator that extracts both global information and pixel-wise information. This discriminator feeds back more instructive cues for training a better generator. Benefiting from this strategy, our synthesis results on ShapeNet v2 "cars" and "chairs" and on CelebA outperform comparative methods.
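The U-shaped discriminator idea can be sketched as an encoder-decoder with skip connections that returns both a single real/fake score and a per-pixel decision map, so the generator receives global and dense feedback at once. Channel sizes and depth below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class UShapeDiscriminator(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 4, 2, 1), nn.LeakyReLU(0.2))
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 4, 2, 1), nn.LeakyReLU(0.2))
        self.global_head = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                         nn.Flatten(), nn.Linear(ch * 2, 1))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1),
                                  nn.LeakyReLU(0.2))
        self.pixel_head = nn.ConvTranspose2d(ch * 2, 1, 4, 2, 1)

    def forward(self, x):
        e1 = self.enc1(x)                              # H/2 features
        e2 = self.enc2(e1)                             # H/4 bottleneck
        g = self.global_head(e2)                       # one score per image
        d1 = self.dec1(e2)                             # decoder back to H/2
        pix = self.pixel_head(torch.cat([d1, e1], 1))  # per-pixel scores via skip
        return g, pix

disc = UShapeDiscriminator()
g, pix = disc(torch.rand(2, 3, 64, 64))
print(g.shape, pix.shape)   # torch.Size([2, 1]) torch.Size([2, 1, 64, 64])
```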
Despite the potential of neural scene representations to effectively compress 3D scalar fields at high reconstruction quality, the computational complexity of the training and data-reconstruction steps using scene representation networks limits their use in practical applications. In this paper, we analyse whether scene representation networks can be modified to reduce these limitations and whether such architectures can also be used for temporal reconstruction tasks. We propose a novel design of scene representation networks using GPU tensor cores to integrate the reconstruction seamlessly into on-chip raytracing kernels, and compare the quality and performance of this network to alternative network- and non-network-based compression schemes. The results indicate competitive quality of our design at high compression rates, and significantly faster decoding times and lower memory consumption during data reconstruction. We investigate how density gradients can be computed using the network and show an extension where density, gradient and curvature are predicted jointly. As an alternative to spatial super-resolution approaches for time-varying fields, we propose a solution that builds upon latent-space interpolation to enable random-access reconstruction at arbitrary granularity. We summarize our findings in the form of an assessment of the strengths and limitations of scene representation networks for compression-domain volume rendering, and outline future research directions. Source code: https://github.com/shamanDevel/fV-SRN
To improve computational efficiency, we propose a novel design of scene representation networks with a volumetric latent grid, using GPU tensor cores to integrate the reconstruction seamlessly into on-chip raytracing kernels.
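The latent-space interpolation idea for time-varying fields can be sketched as follows: latent vectors trained per keyframe are blended linearly at an arbitrary time t and decoded by the shared network, giving random-access temporal reconstruction. The network shape, latent size, and function names below are illustrative assumptions, not the fV-SRN code.

```python
import torch
import torch.nn as nn

decoder = nn.Sequential(                     # shared scene-representation MLP
    nn.Linear(3 + 16, 64), nn.ReLU(),        # (position, latent) -> density
    nn.Linear(64, 1),
)
latents = nn.Parameter(torch.randn(10, 16))  # one latent per trained keyframe

def density(x, t):
    """Evaluate the field at positions x (N, 3) and continuous time t in [0, 9]."""
    k = min(int(t), latents.shape[0] - 2)    # index of the left keyframe
    a = t - k                                # fractional position between keyframes
    z = (1 - a) * latents[k] + a * latents[k + 1]   # latent interpolation
    return decoder(torch.cat([x, z.expand(x.shape[0], -1)], dim=-1))

print(density(torch.rand(8, 3), t=3.4).shape)   # torch.Size([8, 1])
```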