We can represent the orientation of a plane in 3D by its normal vector. However, every plane has two normal vectors that are negatives of each other. We propose four novel representations of vectors in 3D that are negation invariant and can be used by a neural network to predict orientation. Our proposed solution is the first to introduce representations that are negation invariant, continuous and easily parallelisable on the GPU. We evaluate the representations by predicting the orientation of a plane on a toy task, and by applying them to synthetic seismic tomographic data, where we predict the presence and orientation of faults for every voxel in the volume. We further use the fault orientations in a GPU post-processing algorithm that separates the faults into non-intersecting segments (i.e. instances), which allows us to selectively visualise faults in 3D. We demonstrate the utility of the representations by deploying the model on the Laminaria 3D Seismic volume as a case study. We quantitatively compare the model's predictions against human interpretations of slices through the volume as well as existing interpretations in the literature. Our analysis shows good agreement (F1 score of 88%) between the model and human interpretation at shallow levels, where ambient noise is lower, but this agreement degrades at deeper levels (F1 score of 68%). We explore possible reasons for this degradation.
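To illustrate the negation-invariance property this abstract refers to, the sketch below shows one well-known negation-invariant encoding of a unit normal, the outer product n nᵀ; it is only an illustrative example and is not claimed to be one of the paper's four representations.

```python
import numpy as np

def outer_product_representation(n):
    """Map a unit normal n to the 6 unique entries of n n^T.

    Since (-n)(-n)^T = n n^T, both normals of a plane map to the same
    features, so the encoding is negation invariant (illustrative only;
    not necessarily one of the paper's four representations).
    """
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)          # ensure unit length
    M = np.outer(n, n)                 # 3x3 symmetric, rank-1
    return M[np.triu_indices(3)]       # upper triangle incl. diagonal

def recover_normal(features):
    """Recover a normal (up to sign) from the 6-vector by eigen-decomposition."""
    M = np.zeros((3, 3))
    M[np.triu_indices(3)] = features
    M = M + M.T - np.diag(np.diag(M))  # symmetrise
    w, v = np.linalg.eigh(M)
    return v[:, np.argmax(w)]          # eigenvector of the largest eigenvalue

# Both normals of the same plane give identical representations:
n = np.array([0.3, -0.4, 0.866])
assert np.allclose(outer_product_representation(n),
                   outer_product_representation(-n))
```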
Image highlights caused by specular reflections always conceal or attenuate feature information of the samples in vision measurement. This paper investigates countermeasures to highlights in specular surface measurement. A measurement system is developed to capture a series of images at different polarization angles; the highlighted regions of these images are used as training samples for a back-propagation neural network whose initial weights are drawn from Gaussian distributions. Experimental results show that the proposed method efficiently increases stereo matching accuracy and hence recovers the information in highlighted regions in vision measurements of specular surfaces.
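As a rough illustration of the training setup described above, the sketch below initialises a small back-propagation network with Gaussian-distributed weights; the layer sizes, activation function, and standard deviation are assumptions, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_gaussian_mlp(layer_sizes, sigma=0.1):
    """Initialise a small back-propagation network with Gaussian weights.

    The abstract only states that initial weights follow Gaussian
    distributions; sigma=0.1 and the layer sizes are illustrative assumptions.
    """
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, sigma, size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Plain feed-forward pass with sigmoid hidden units."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = 1.0 / (1.0 + np.exp(-x))   # sigmoid activation
    return x

# e.g. intensities at several polarization angles in, recovered value out
net = init_gaussian_mlp([4, 16, 1])
print(forward(net, rng.random((8, 4))).shape)      # (8, 1)
```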
Edge computing of reliable multimodal (2D RGB/3D RGB-Depth) data has a wide range of applications. However, many currently reported visual processors cannot flexibly handle multimodal data, e.g., the visual streams of RGB-Depth data. The key challenge is that these prior visual processors do not come with an efficient, unified instruction set architecture (ISA) for both conventional and intelligent cognition on 2D/3D multimodal sensory data. To fill this gap, this paper proposes a programmable intelligent visual vector processor compatible with multimodal 2D/3D visual data processing (1920×1080-pixel resolution). The processor consists of a reconfigurable processing element (PE) array, a memory access network flexibly configurable to be fine- or coarse-grained, and a high-throughput I/O interface. The vectorial PE array with neighbor PE access increases the data reuse rate and parallel computation efficiency, and can implement both convolutional neural networks (CNNs) and conventional image processing algorithms. The proposed ISA is customized and tailored for 2D/3D image processing from RGB/Time-of-Flight (ToF) raw data to intelligent inference results. The chip is fabricated in a 55-nm CMOS process. Experimental results show that the area efficiency, peak performance, and peak throughput of the chip reach 14.41 GOPS/mm², 409.6 GOPS, and 9.6 Gbps at 200 MHz, respectively. The measured processing speeds of the chip are 87 fps (480×270) or 31 fps (1920×1080) for ToF depth reconstruction, 219 fps (256×256) for 3D object classification, and 36 fps (256×256) for CNN-based 2D object tracking.
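A quick back-of-the-envelope check, using only the figures quoted in the abstract, relates the reported peak performance, clock rate, and area efficiency; the derived quantities (operations per cycle and implied compute area) are our inferences, not numbers stated by the authors.

```python
# Back-of-the-envelope checks using only the figures quoted in the abstract.
peak_gops = 409.6     # peak performance (GOPS)
clock_mhz = 200.0     # operating frequency (MHz)
area_eff  = 14.41     # area efficiency (GOPS/mm^2)

ops_per_cycle = peak_gops * 1e9 / (clock_mhz * 1e6)
implied_area  = peak_gops / area_eff

print(f"ops per cycle : {ops_per_cycle:.0f}")        # 2048, consistent with a wide PE array
print(f"implied area  : {implied_area:.1f} mm^2")    # ~28.4 mm^2 of compute area
```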
A stable multimodal system is developed by combining two common-path digital holographic microscopes (DHMs), one coherent and one incoherent, for simultaneous recording and retrieval of three-dimensional (3-D) phase and 3-D fluorescence imaging (FI), respectively, of a biological specimen. The 3-D FI is realized by a single-shot common-path off-axis fluorescent DHM developed recently by our group. In addition, we accomplish the phase imaging with another single-shot, highly stable common-path off-axis DHM based on a beam splitter. In this DHM configuration, a beam splitter divides the incoming object beam into two beams. One beam serves as the object beam carrying the useful information of the object under study, whereas the other beam is spatially filtered at its Fourier plane using a pinhole and serves as a reference beam. Owing to its common-path geometry, this DHM setup is less vibration-sensitive and more compact, with a similar field of view but higher temporal phase stability than a two-beam Mach–Zehnder-type DHM. The performance of the proposed common-path DHM and the multimodal system is verified through experiments on fluorescent microspheres and fluorescent protein-labeled living cells of the moss Physcomitrella patens. Moreover, the potential of the proposed multimodal system for 3-D live fluorescence and phase imaging of fluorescent beads is also demonstrated. The experimental results corroborate the feasibility of the proposed multimodal system and indicate its potential for analyzing the functional and structural behavior of biological specimens and for improving the understanding of physiological mechanisms and various diseases.
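For context, the sketch below shows the textbook Fourier-domain demodulation of an off-axis hologram (sideband filtering followed by an inverse transform); it is a generic illustration of off-axis phase retrieval, not the paper's exact reconstruction pipeline, and the carrier offset and window size are assumptions.

```python
import numpy as np

def off_axis_phase(hologram, carrier_shift, window=64):
    """Generic off-axis hologram demodulation (textbook method, not the
    paper's pipeline): isolate one sideband in the Fourier domain,
    re-centre it, and take the argument of the inverse transform.

    carrier_shift (row/col offset of the +1 order) and window are
    illustrative assumptions that depend on the actual fringe carrier.
    """
    H = np.fft.fftshift(np.fft.fft2(hologram))
    cy, cx = np.array(H.shape) // 2
    sy, sx = carrier_shift
    # Crop a window around the +1 order and move it to the centre.
    sideband = H[cy + sy - window:cy + sy + window,
                 cx + sx - window:cx + sx + window]
    padded = np.zeros_like(H)
    padded[cy - window:cy + window, cx - window:cx + window] = sideband
    field = np.fft.ifft2(np.fft.ifftshift(padded))
    return np.angle(field)   # wrapped phase; unwrap for quantitative maps
```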
The behavior of multicamera interference in 3D images (e.g., depth maps) based on infrared (IR) light is not well understood. When multicamera interference is present in 3D images, the number of zero-value pixels increases, resulting in a loss of depth information. In this work, we demonstrate a framework for synthetically generating direct and indirect multicamera interference using a combination of a probabilistic model and ray tracing. Our mathematical model predicts the locations and probabilities of zero-value pixels in depth maps that contain multicamera interference, and thus accurately predicts where depth information may be lost when such interference is present. We compare the proposed synthetic 3D interference images with controlled 3D interference images captured in our laboratory. The proposed framework achieves an average root mean square error (RMSE) of 0.0625, an average peak signal-to-noise ratio (PSNR) of 24.1277 dB, and an average structural similarity index measure (SSIM) of 0.9007 for predicting direct multicamera interference, and an average RMSE of 0.0312, an average PSNR of 26.2280 dB, and an average SSIM of 0.9064 for predicting indirect multicamera interference. The framework can be used to develop and test interference mitigation techniques that will be crucial for the successful proliferation of these devices.
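The comparison metrics cited in this abstract (RMSE, PSNR, SSIM) can be computed with a short helper such as the one below; the assumption that both depth maps are normalised to a common data range is ours, not the paper's.

```python
import numpy as np
from skimage.metrics import structural_similarity

def depth_map_metrics(predicted, measured, data_range=1.0):
    """Compare a synthetic interference depth map against a captured one
    using RMSE, PSNR, and SSIM. Assumes both maps are normalised to
    [0, data_range]; that normalisation is an assumption, not from the paper.
    """
    err = predicted.astype(float) - measured.astype(float)
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    psnr = 10.0 * np.log10(data_range ** 2 / mse)
    ssim = structural_similarity(predicted, measured, data_range=data_range)
    return rmse, psnr, ssim
```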
Traditional seed and fruit phenotyping is mainly accomplished by manual measurement or by extracting morphological properties from two-dimensional images. These methods are not only low-throughput but also unable to capture three-dimensional (3D) characteristics and internal morphology. X-ray computed tomography (CT) scanning, which provides a convenient means of non-destructively recording the external and internal 3D structure of seeds and fruits, offers a way to overcome these limitations. However, current CT equipment cannot scan seeds and fruits at high throughput, and there is no specialized software for automatic extraction of phenotypes from CT images. Here, we introduce a high-throughput image acquisition approach that mounts a specially designed seed-fruit container on the scanning bed. The corresponding 3D image analysis software, 3DPheno-Seed&Fruit, was created for automatic segmentation and rapid quantification of eight morphological phenotypes of the internal and external compartments of seeds and fruits. 3DPheno-Seed&Fruit is user-friendly software with a graphical user interface and phenotype visualization functions. We describe the software in detail and benchmark it on CT image analyses of seeds of soybean, wheat, peanut, pine nut, and pistachio nut and of dwarf Russian almond fruit. Correlation values between the extracted and manual measurements of seed length, width, thickness, and radius ranged from 0.80 to 0.96 for soybean and wheat. High correlations were also found between the 2D (length, width, thickness, and radius) and 3D (volume and surface area) phenotypes for soybean. Overall, our methods provide robust and novel tools for phenotyping the morphological seed and fruit traits of various plant species, which could benefit crop breeding and functional genomics.
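If the reported 0.80 to 0.96 values are per-trait correlation coefficients between CT-extracted and manual measurements, they could be computed as in the hypothetical example below; the numbers are illustrative and are not data from the paper.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical validation: manual calliper measurements vs. CT-extracted
# values for one trait (e.g. seed length, mm). Values are illustrative only.
manual    = np.array([6.8, 7.1, 6.5, 7.4, 6.9, 7.0])
extracted = np.array([6.7, 7.2, 6.6, 7.3, 6.8, 7.1])

r, p = pearsonr(manual, extracted)
print(f"r = {r:.2f} (the paper reports trait-wise values of 0.80-0.96)")
```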
Dual-wavelength digital holography has recently become a promising tool for extending the axial measurement range in microscopy. However, digital filtering (such as a Fourier transform) or a separate hologram acquisition process is usually required to retrieve the quantitative phase information for each wavelength. In this paper, we propose a quantitative phase reconstruction method based on an infrared-like single-wave conversion in dual-wavelength phase-shift digital holography that does not require any of the processes mentioned above. With the proposed method, the quantitative phase information of the specimen over the extended axial range is obtained simply by a phase shift of a quarter of an infrared-like single wave whose magnitude equals that of the synthetic wavelength. This approach simplifies not only the hologram acquisition process but also the numerical reconstruction process. The proposed theory is verified and evaluated through simulations and experimental validation.
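The "infrared-like" single wave follows from the standard synthetic-wavelength relation Λ = λ₁λ₂/|λ₁ − λ₂|; the sketch below evaluates it for an illustrative wavelength pair that is not taken from the paper.

```python
def synthetic_wavelength(lam1_nm, lam2_nm):
    """Synthetic (beat) wavelength of dual-wavelength holography:
    Lambda = lam1 * lam2 / |lam1 - lam2| (standard relation, not specific
    to this paper's reconstruction method)."""
    return lam1_nm * lam2_nm / abs(lam1_nm - lam2_nm)

# Illustrative wavelengths (not from the paper): 632.8 nm and 532 nm lasers
print(f"{synthetic_wavelength(632.8, 532.0):.0f} nm")  # ~3340 nm, i.e. infrared-like
```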
Manual analysis and optimisation of ergonomic parameters can be tedious when the variance of the process and of workers' body sizes is high. Automating this process would reduce workload and enable the development of assistance systems for worker support. This paper presents a system that computes body-part positions from input depth images and assesses ergonomic scores. The method is based on Particle Swarm Optimisation (PSO). Through parallel processing on graphics hardware (GPU), the system is able to provide ergonomic feedback within a few seconds.
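The abstract does not detail the PSO variant, cost function, or GPU kernel, so the sketch below is only a minimal CPU-side particle swarm optimiser with assumed hyper-parameters, included to make the optimisation loop concrete.

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_minimise(cost, dim, n_particles=32, iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0)):
    """Minimal particle swarm optimiser. The cost function, dimensionality,
    and hyper-parameters here are illustrative assumptions; the paper's
    GPU-parallel body-pose fitting is not reproduced.
    """
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))       # particle positions
    v = np.zeros_like(x)                              # particle velocities
    pbest, pbest_f = x.copy(), np.array([cost(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)]
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([cost(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)]
    return gbest, pbest_f.min()

# Toy cost: squared distance to a fixed "observed" pose vector.
target = rng.uniform(-1, 1, 10)
best, best_f = pso_minimise(lambda p: np.sum((p - target) ** 2), dim=10)
```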
This article focuses on an adaptive and fault-tolerant vision-guided robotic system that can choose the most appropriate control action if a partial or complete short-term failure of the vision system occurs. Moreover, the autonomous robotic system takes physical and operational constraints into account to perform a specific visual servoing task while minimizing a cost function. A hierarchical control architecture is developed based on the interwoven integration of a variant of iterative closest point (ICP) image registration, a constrained noise-adaptive Kalman filter, a fault detection logic and recovery system, and a constrained optimal path planner. The dynamic estimator estimates the unknown states and uncertain parameters required for motion prediction while imposing a set of inequality constraints for consistency of the estimation process and adaptively adjusting the Kalman filter parameters in the face of unexpected vision errors. This is followed by a fault recovery strategy based on a fault detection logic that monitors the health of the visual feedback using the metric fit error of the image registration. Subsequently, the estimated/predicted pose and parameters are passed to an optimal path planner in order to bring the robot end-effector to the grasping point of a moving target as quickly as possible subject to multiple constraints, such as an acceleration limit, smooth capture, and the line-of-sight angle of the target. Experimental results demonstrate that such a visual servoing system succeeded in capturing a free-floating object despite complete failure of the vision system due to occlusion during the last 10 s of the approach and capture operation.
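A toy version of the fault-handling idea (monitoring the ICP fit error and falling back to prediction-only Kalman updates when it exceeds a threshold) is sketched below; the constant motion model, noise matrices, and threshold are assumptions, and the real system's constrained, noise-adaptive filter is considerably richer.

```python
import numpy as np

class VisionFaultMonitor:
    """Toy illustration of the fault-detection logic described above: when
    the ICP metric fit error exceeds a threshold, the filter stops correcting
    with measurements and relies on prediction alone. The motion model,
    threshold, and noise levels are illustrative assumptions.
    """
    def __init__(self, x0, P0, F, Q, H, R, fit_error_threshold=0.05):
        self.x, self.P = x0, P0
        self.F, self.Q, self.H, self.R = F, Q, H, R
        self.threshold = fit_error_threshold

    def step(self, z, icp_fit_error):
        # Predict with the motion model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        if icp_fit_error > self.threshold:
            return self.x            # vision declared faulty: predict only
        # Healthy vision: standard Kalman correction with the measured pose.
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ self.H) @ self.P
        return self.x
```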