With the widespread deployment of wavelength division multiplexing (WDM), optical transceivers increasingly use many glass micro-optical components (GMOC). Visual inspection of these GMOCs is a critical manufacturing step to ensure quality and reliability. However, manual inspection is labor-intensive and time-consuming because the glass components are transparent and the defects are small and randomly located in three dimensions. Although automated optical inspection (AOI) exists, it has not yet been able to provide the desired level of accuracy and efficiency.
This paper reports the development of an AOI platform for 3D defect detection on GMOCs. The platform incorporates 3D video acquisition and a novel two-stage neural-network machine-learning algorithm. It includes a robotic arm for moving parts in 3D, a camera with an illumination module for video acquisition, and a video-stream processing unit with a machine vision algorithm for real-time defect detection on a production line. The robotic arm enables multi-perspective video capture of a test sample without refocusing. The two-stage machine learning network uses a modified YOLOv4 architecture with color channel separation (CCS) convolution, an image quality evaluation (IQE) module, and a frame fusion module to integrate the single-frame detection results. This network can process multi-perspective video streams in real time for defect detection in a coarse-to-fine manner. The AOI platform was trained with only 30 samples and achieved promising performance: a recall of 1.0, a detection accuracy of 97%, and an inspection time of 48 s per part.
•In order to solve the challenging problem of defect detection on transparent optical components with 3D shapes and random defect locations, a novel video-stream-based optical inspection platform is developed to capture multi-perspective images of the object, and machine-learning-based algorithms are used to identify the defects.
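The coarse-to-fine idea above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: per-frame detections from a single-frame detector are filtered by an image quality score (the IQE step) and then merged across video frames by voting (the frame-fusion step). All function names, thresholds, and data are illustrative assumptions.

```python
import math

def fuse_detections(frame_results, iqe_threshold=0.5, vote_ratio=0.5):
    """frame_results: list of (quality_score, detected_defect_ids) per frame.
    Returns the set of defects confirmed across the video stream."""
    # IQE step: keep only frames whose quality score passes the threshold.
    usable = [defects for quality, defects in frame_results
              if quality >= iqe_threshold]
    if not usable:
        return set()
    # Frame-fusion step: a defect is confirmed when it appears in at least
    # vote_ratio of the usable frames, suppressing single-frame false alarms.
    min_votes = max(1, math.ceil(vote_ratio * len(usable)))
    votes = {}
    for defects in usable:
        for d in set(defects):
            votes[d] = votes.get(d, 0) + 1
    return {d for d, v in votes.items() if v >= min_votes}

results = [
    (0.9, ["scratch_A"]),            # sharp frame, sees defect A
    (0.8, ["scratch_A", "dust_B"]),  # one-off spurious detection of B
    (0.2, ["ghost_C"]),              # blurry frame, rejected by IQE
    (0.7, ["scratch_A"]),
]
print(fuse_detections(results))  # → {'scratch_A'}
```

The voting step is one plausible way to realize "integrating the single frame detection results"; the paper's actual fusion module may differ.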
This study aimed to compare the effectiveness of a dynamic navigation system and a three-dimensional microscope in retrieving broken rotary Nickel-Titanium files when using trepan burs and the extractor system.
Thirty maxillary first bicuspids with 60 separate roots were split into 2 comparable groups based on a comprehensive cone beam computed tomography analysis of root length and curvature. After standardized access opening, glide paths, and patency attainment with the K file (sizes 10 and 15), the teeth were arranged on 3D models (three per quadrant, six per model). Subsequently, controlled-memory heat-treated Nickel-Titanium rotary files (#25/0.04) were notched 4 mm from the tips and fractured at the apical third of the roots. The C-FR1 Endo file removal system was employed under both guidance modalities to retrieve the fragments, and the success rate, canal aberration, treatment time, and volumetric changes were measured. The statistical analysis was performed using IBM SPSS software at a significance level of 0.05.
The microscope-guided group had a higher success rate than the dynamic navigation system group, but the difference was not statistically significant (P > .05). In addition, the microscope-guided drills resulted in a substantially lower proportion of canal aberration, a shorter fragment retrieval time, and less change in root canal volume (P < .05).
Although dynamically guided trephining with the extractor can retrieve separated instruments, it is inferior to three-dimensional microscope guidance regarding treatment time, procedural errors, and volume change.
Overview of the MVC+D 3D video coding standard
Chen, Ying; Hannuksela, Miska M.; Suzuki, Teruhiko
Journal of Visual Communication and Image Representation, May 2014, Volume 25, Issue 4
Journal Article, Peer-reviewed
•MVC+D supports depth-image-based rendering for advanced 3D video use cases.•MVC+D decoders can reuse H.264/AVC or MVC hardware decoder implementation modules.•MVC+D supports packet-level bandwidth and device decoding capability adaptation.•MVC+D allows asymmetric depth with a different spatial resolution than the texture.•MVC+D typically requires about twice the bitrate of 2D video coded with H.264/AVC.
3D video services are emerging in various application domains including cinema, TV broadcasting, Blu-ray discs, streaming, and smartphones. A majority of the 3D video content on the market is still based on stereo video, which is typically coded with the multiview video coding (MVC) extension of the Advanced Video Coding (H.264/AVC) standard or as frame-compatible stereoscopic video. However, 3D video technologies face challenges as well as opportunities in supporting more demanding application scenarios, such as immersive 3D telepresence with numerous views and 3D perception adaptation for heterogeneous 3D devices and/or user preferences. The Multiview Video plus Depth (MVD) format enables depth-image-based rendering (DIBR) of additional viewpoints on the decoding side and hence helps in such advanced application scenarios. This paper reviews the MVC+D standard, which specifies an MVC-compatible MVD coding format.
•Multiple features are exploited to satisfy the copyright protection requirements of DIBR 3D video.•Suitable features are designed for both 2D frames and depth maps to protect them well.•RPSR features are designed to counter geometric attacks and enhance discriminative capacity.•An attention-based fusion is designed to offer an optimal copyright protection solution.•A logistic-logistic chaotic system is used to encrypt these multiple features.
Zero-watermarking is a key technique for achieving lossless and flexible copyright protection of depth image-based rendering (DIBR) videos. Existing approaches extract features of both 2D frames and depth maps via a single mechanism to protect them simultaneously. However, it is difficult for these schemes to fully satisfy the copyright protection requirements of the two components, including the remarkable discriminative capability of 3D videos and robustness against various attacks. Hence, in this paper, we propose a novel multiple-feature-based zero-watermarking scheme to protect the copyright of DIBR 3D videos. To the best of our knowledge, this is the first scheme that integrates multiple features to improve both the discriminative capability and robustness against various attacks. Specifically, dual-tree complex wavelet transform and discrete cosine transform features enhance the robustness against DIBR conversion and noise addition, respectively, while ring-partition statistical residual features ensure robustness against geometric attacks and provide sufficient discriminative capacity. In addition, we use a logistic-logistic chaotic system to encrypt these multiple features for enhanced security and design an attention-based fusion approach to offer an optimal copyright protection solution. Extensive experimental results demonstrate that our proposed scheme has stronger robustness and discriminative capacity compared to state-of-the-art zero-watermarking methods.
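The core zero-watermarking principle the abstract relies on can be sketched in a few lines. This is a simplified illustration of the general technique, not the proposed scheme: a robust binary feature extracted from the unmodified video is XORed with the copyright watermark to form a stored "master share", leaving the video itself untouched; verification XORs the re-extracted feature with the master share to recover the watermark. Feature choice, encryption, and fusion are omitted, and all names are assumptions.

```python
def to_bits(feature, threshold):
    """Binarize a feature vector against a threshold (e.g. its mean)."""
    return [1 if v >= threshold else 0 for v in feature]

def make_master_share(feature_bits, watermark_bits):
    # Ownership is registered by storing feature XOR watermark; the video
    # is never modified, which is what makes the scheme "lossless".
    return [f ^ w for f, w in zip(feature_bits, watermark_bits)]

def verify(feature_bits, master_share):
    # XOR with the master share recovers the watermark when the extracted
    # feature matches (i.e. the video is the protected one).
    return [f ^ m for f, m in zip(feature_bits, master_share)]

feature = [0.9, 0.1, 0.6, 0.4]   # e.g. illustrative per-block transform coefficients
bits = to_bits(feature, sum(feature) / len(feature))
watermark = [1, 0, 1, 1]
share = make_master_share(bits, watermark)
print(verify(bits, share) == watermark)  # → True: watermark recovered losslessly
```

Robustness in practice comes from the feature extraction (here trivially thresholded); the paper's contribution is choosing multiple features so that the bits survive DIBR conversion, noise, and geometric attacks.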
This paper investigates the problem of proactive caching for multi-view 3D videos in fifth generation (5G) networks. We establish a mathematical model for this problem and point out that it is difficult to solve with traditional dynamic programming; we therefore propose a deep reinforcement learning approach. First, we model the proactive caching system for multi-view 3D videos as a Markov decision process that jointly considers view selection and local memory allocation. Then, we present an actor-critic, model-free algorithm based on the deep deterministic policy gradient to find an effective proactive caching policy. Since the action space is affected by the system state, we embed a dynamic k-Nearest Neighbor algorithm into the actor-critic algorithm so that the deep reinforcement learning algorithm can work in an action space of variable size. Finally, numerical results demonstrate that the proposed solution can effectively maintain a high-quality user experience for high-mobility 5G users moving among small cells. We also investigate the impact of critical parameter configurations on the performance of the algorithm.
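The k-Nearest-Neighbor action refinement mentioned above can be sketched as follows. This is a toy illustration of the general idea (as in Wolpertinger-style architectures), not the paper's algorithm: the actor emits a continuous "proto-action", the k nearest valid discrete caching actions are looked up, and the critic picks the best of them. The Q-function and action encoding here are stand-in assumptions.

```python
def knn_action(proto, valid_actions, q_value, k=3):
    """Map a continuous proto-action to the best nearby discrete action."""
    # Find the k discrete actions closest (squared distance) to the
    # actor's continuous output; valid_actions may change with the state.
    by_dist = sorted(valid_actions,
                     key=lambda a: sum((x - y) ** 2 for x, y in zip(a, proto)))
    candidates = by_dist[:k]
    # Refine with the critic: execute the highest-valued nearby action.
    return max(candidates, key=q_value)

# Toy example: actions encode which of 3 views to cache (1 = cache).
actions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
proto = (0.9, 0.8, 0.1)               # actor prefers views 0 and 1
q = lambda a: a[0] + a[1]             # toy critic: views 0 and 1 are valuable
print(knn_action(proto, actions, q))  # → (1, 1, 0)
```

The point of the lookup is that the policy network can stay continuous while the executed action is always a member of the current, state-dependent discrete action set.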
The VirtualCube system is a 3D video conference system that attempts to overcome some limitations of conventional technologies. The key ingredient is VirtualCube, an abstract representation of a real-world cubicle instrumented with RGBD cameras for capturing the user's 3D geometry and texture. We design VirtualCube so that the task of data capturing is standardized and significantly simplified, and everything can be built using off-the-shelf hardware. We use VirtualCubes as the basic building blocks of a virtual conferencing environment, and we provide each VirtualCube user with a surrounding display showing life-size videos of remote participants. To achieve real-time rendering of remote participants, we develop the V-Cube View algorithm, which uses multi-view stereo for more accurate depth estimation and Lumi-Net rendering for better rendering quality. The VirtualCube system correctly preserves the mutual eye gaze between participants, allowing them to establish eye contact and be aware of who is visually paying attention to them. The system also allows a participant to have side discussions with remote participants as if they were in the same room. Finally, the system sheds light on how to support the shared space of work items (e.g., documents and applications) and track participants' visual attention to work items.
Although 3D video has become popular, successful 3D video streaming over wireless networks involves a number of challenges. Due to the frequent frame damages and losses in wireless networks, temporal asynchrony occurs and results in serious visual fatigue for viewers. In order to provide higher quality 3D video, this paper proposes a new scheme termed the Temporal Synchronization Scheme (TSS) for live 3D video streaming. TSS delivers video frames for the left and right views in the same frame sequence with the same transmission priority, and it compensates for frame damage and losses during the decoding phase. In addition, a new metric called the Stereoscopic Temporal Variation Index (STVI) is proposed; it measures the degree of temporal asynchrony in 3D video. Subjective assessments demonstrate that STVI is an objective metric for measuring subjective quality, given that it exhibits a strong positive correlation with the subjective Mean Opinion Score (MOS) rating. This paper demonstrates, using STVI and MOS, that TSS significantly improves the visual quality of 3D videos even when frame damage and losses occur. The contributions of the paper are as follows: i) The proposed TSS is the first scheme to address the temporal asynchrony issue in live 3D video streaming over wireless networks. ii) TSS only requires slight modifications to decoders. iii) A new metric (i.e., STVI) is proposed to measure the degree of temporal asynchrony in 3D videos. Therefore, 3D video streaming over wireless networks can be performed with temporal synchronization, and it is expected that using TSS will reduce visual fatigue.
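The temporal asynchrony that STVI quantifies can be illustrated with a minimal sketch. This is a hypothetical simplification, not the STVI formula: when a frame of one view is lost, the decoder may repeat a stale frame, and the index gap between the concurrently shown left/right frames captures the asynchrony a viewer perceives.

```python
def mean_view_asynchrony(left_shown, right_shown):
    """left_shown/right_shown: frame index actually displayed at each tick
    (a lost frame repeats the previous index). Returns the average
    left/right index gap, a simple proxy for temporal asynchrony."""
    gaps = [abs(l - r) for l, r in zip(left_shown, right_shown)]
    return sum(gaps) / len(gaps)

left = [0, 1, 2, 3, 4, 5]
right = [0, 1, 1, 1, 4, 5]   # right-view frames 2-3 lost, frame 1 repeated
print(mean_view_asynchrony(left, right))  # → 0.5  (gaps 0,0,1,2,0,0)
```

A synchronized stream scores 0; TSS aims to keep this kind of gap small by transporting both views in the same sequence with equal priority and compensating for losses at the decoder.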
Epilepsy is a major disorder affecting millions of people. Although modern electrophysiological and imaging approaches provide high-resolution access to the multi-scale brain circuit malfunctions in epilepsy, our understanding of how behavior changes with epilepsy has remained rudimentary. As a result, screening for new therapies for children and adults with devastating epilepsies still relies on the inherently subjective, semi-quantitative assessment of a handful of pre-selected behavioral signs of epilepsy in animal models. Here, we use machine learning-assisted 3D video analysis to reveal hidden behavioral phenotypes in mice with acquired and genetic epilepsies and track their alterations during post-insult epileptogenesis and in response to anti-epileptic drugs. These results show the persistent reconfiguration of behavioral fingerprints in epilepsy and indicate that they can be employed for rapid, automated anti-epileptic drug testing at scale.
•Automated behavioral analysis enables high-throughput screening of epileptic mice.•Characteristic behavioral phenotypes are found in acquired and genetic epilepsies.•Inter-ictal behavior can be used to assess epileptogenesis and for drug screening.•A purely data-driven analysis can facilitate seizure assessment.
Gschwind et al. show that machine learning-assisted behavioral analysis allows an unbiased assessment of epilepsy in animal models. The automated identification of behavioral phenotypes, including anti-epileptic drug responses, during the readily available inter-ictal periods bears great potential to accelerate rigorous, reproducible preclinical research into epilepsies.
•We propose a playback length changeable 3D video data chunk segmentation algorithm.•A novel hybrid-priority based 3D video P2P data scheduling algorithm is presented.•Bandwidth-adaptive 3D video P2P streaming for heterogeneous networks is implemented.•The proposed method obtains superior error-resilience and network utilization performance.
3D video distribution over P2P networks has been regarded as a promising way to bring 3D video into the home. The convergence of scalable 3D video coding and P2P streaming can provide diverse 3D experiences to heterogeneous clients with high distribution efficiency. However, conventional chunk segmentation and scheduling algorithms, originally designed for non-scalable 2D video streaming, are not very efficient for scalable 3D video streaming over P2P networks due to the particular data characteristics of scalable 3D video. Motivated by this, this paper first presents a playback-length-changeable 3D video chunk segmentation (PLC3DCS) algorithm to provide different error resilience strengths to video and depth, as well as to layers of different importance levels, in 3D video transmission. Then, a hybrid-priority-based chunk scheduling (HPS) algorithm is proposed to work with the chunk segmentation algorithm and further improve the overall 3D video P2P streaming performance. The simulation results show that the proposed PLC3DCS algorithm, together with HPS, can increase the successful delivery rates of chunks of higher importance levels and further improve the user's quality of 3D experience.
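A hybrid-priority scheduler of the kind described above can be sketched as follows. This is an illustrative assumption, not the paper's HPS formula: each chunk's request priority combines playback urgency (deadline proximity) with its layer importance (base video and depth layers before enhancement layers), and peers request chunks in descending priority order.

```python
def schedule(chunks, now, urgency_weight=2.0):
    """chunks: list of (chunk_id, deadline, importance), where higher
    importance marks a more critical layer. Returns chunk ids in
    descending request priority."""
    def priority(chunk):
        _, deadline, importance = chunk
        # Closer deadline → more urgent; the weight trading urgency
        # against importance is an arbitrary illustrative choice.
        urgency = 1.0 / max(deadline - now, 1e-6)
        return urgency_weight * urgency + importance
    return [cid for cid, _, _ in sorted(chunks, key=priority, reverse=True)]

chunks = [
    ("enh_layer_t1", 1.0, 1),   # due soon, low-importance enhancement layer
    ("base_video_t5", 5.0, 3),  # later deadline, critical base layer
    ("depth_t2", 1.5, 2),
]
print(schedule(chunks, now=0.0))  # → ['base_video_t5', 'depth_t2', 'enh_layer_t1']
```

Blending the two terms is what makes the priority "hybrid": a pure deadline scheduler would fetch the enhancement chunk first, while this one protects the base and depth layers that matter most for 3D quality.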
•Fuse zero watermark and retrieval techniques to protect large-scale DIBR videos.•Design robust and discriminative features for precise and reliable protection.•Design attention-based fusion method to provide flexible copyright protection.
Digital rights management (DRM) of depth-image-based rendering (DIBR) 3D video is an emerging area of research. Existing schemes for DIBR 3D video cause video distortions, are vulnerable to severe signal and geometric attacks, cannot protect 2D frames and depth maps independently, or have difficulty handling large-scale videos. To address these issues, a novel zero-watermark scheme based on invariant features and similarity-based retrieval to protect DIBR 3D video (RZW-SR) is proposed in this study. In RZW-SR, invariant features are extracted to generate master and ownership shares to provide distortion-free, robust and discriminative copyright identification under various attacks. Different from conventional zero-watermark schemes, our proposed scheme stores features and ownership shares correlatively and designs a similarity-based retrieval phase to provide effective solutions for large-scale videos. In addition, flexible mechanisms based on attention-based fusion are designed to protect 2D frames and depth maps, either independently or simultaneously. The experimental results demonstrate that RZW-SR has superior DRM performance compared to existing schemes. First, RZW-SR can obtain the ownership shares relevant to a particular 3D video precisely and reliably for effective copyright identification of large-scale videos. Second, RZW-SR ensures lossless, precise, reliable and flexible copyright identification for 2D frames and depth maps of 3D videos.