Ray-triangle intersection is an important algorithm, not only in realistic rendering (based on ray tracing) but also in physics simulation, collision detection, modeling, etc. The speed of implementations of this well-defined algorithm matters because rendering and simulation applications call such a routine very many times. Contemporary fast intersection algorithms, which use SIMD instructions, focus on the intersection of ray packets against triangles. For intersection between single rays and triangles, operations such as horizontal addition or dot product are required. The SSE4 instruction set adds a dot product instruction that can be used for this purpose. This paper presents a new modification of the commonly used fast ray-triangle intersection algorithms which, when implemented on SSE4, outperforms the current state-of-the-art algorithms. It also allows both single-ray and ray-packet intersection calculation with the same precomputed data. The speed gain measurements are described and discussed in the paper.
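To make the role of dot products in a single-ray test concrete, the following is a minimal scalar sketch of the widely used Möller-Trumbore ray-triangle test. It is not the modified algorithm the paper presents and uses neither SSE4 nor precomputed data; the function names (`intersect`, `dot`, `cross`, `sub`) and the `eps` tolerance are illustrative assumptions.

```python
# Scalar sketch of the standard Moller-Trumbore test (NOT the paper's
# SSE4 variant). All names here are hypothetical, chosen for illustration.

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect(orig, direction, v0, v1, v2, eps=1e-9):
    """Return (t, u, v) for a hit, or None if the ray misses the triangle."""
    e1 = sub(v1, v0)
    e2 = sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)              # dot product 1: determinant
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, p) * inv_det    # dot product 2: first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(tvec, e1)
    v = dot(direction, q) * inv_det  # dot product 3: second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv_det      # dot product 4: distance along the ray
    if t < 0.0:                   # intersection lies behind the ray origin
        return None
    return (t, u, v)
```

Every stage of the test ends in a three-component dot product, which is the kind of operation a hardware dot product instruction can evaluate for a single ray.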
This article presents an optimized algorithm for Nonnegative Tensor Factorization (NTF), implemented in the CUDA (Compute Unified Device Architecture) framework, that runs on contemporary graphics processors and exploits their massive parallelism. The NTF implementation is primarily targeted at the analysis of high-dimensional spectral images, including dimensionality reduction, feature extraction, and other tasks related to spectral imaging; however, the algorithm and its implementation are not limited to spectral imaging. The speedups measured on real spectral images are around 60 to 100× compared to a traditional C implementation compiled with an optimizing compiler. Since common problems in the field of spectral imaging may take hours on a state-of-the-art CPU, the speedup achieved using a graphics card is attractive. The implementation is publicly available in the form of a dynamically linked library, including an interface to MATLAB, and thus may be of help to researchers and engineers using NTF on large problems.
Detection of lines in raster images is often performed using the Hough transform. This paper presents a new parameterization of lines and a modification of the Hough transform: PClines. PClines are based on parallel coordinates, a coordinate system used mostly or solely for high-dimensional data visualization. The PClines algorithm is described in the paper; its accuracy is evaluated numerically and compared to the commonly used line detectors based on the Hough transform. The results show that PClines outperform the existing approaches in terms of accuracy. In addition, PClines are computationally extremely efficient, require no floating-point operations, and can be easily accelerated by different hardware architectures.
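The idea behind a parallel-coordinates parameterization can be illustrated with the textbook point-line duality of parallel coordinates: an image point (x, y) becomes a line joining (0, x) on one axis and (d, y) on the other, and all points of an image line y = m·x + b map to lines meeting at one point. The sketch below shows only this basic duality, not the full PClines algorithm from the paper; the function names and the axis distance `d` are assumptions made for illustration.

```python
# Sketch of the basic point-line duality in parallel coordinates.
# Hypothetical helper names; not the paper's implementation.

def point_to_pc_line(x, y, d=1.0):
    """Image point (x, y) -> line v(u) = slope * u + intercept that joins
    (0, x) on the first axis and (d, y) on the second axis."""
    return ((y - x) / d, x)

def line_to_pc_point(m, b, d=1.0):
    """Image line y = m*x + b -> the point (u, v) shared by the
    parallel-coordinate lines of all its points. Undefined for m == 1,
    where those lines are parallel."""
    if m == 1:
        raise ValueError("slope 1 maps to a point at infinity")
    return (d / (1.0 - m), b / (1.0 - m))

def evaluate(pc_line, u):
    """Evaluate a parallel-coordinate line (slope, intercept) at u."""
    slope, intercept = pc_line
    return slope * u + intercept
```

For example, with m = 0.5, b = 2 and d = 1, `line_to_pc_point` gives (2.0, 4.0), and the parallel-coordinate line of every point on that image line evaluates to 4.0 at u = 2.0. A Hough-style detector can therefore accumulate straight lines instead of sinusoids and look for the cell where they meet.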
In various applications, a wider area needs to be covered by fiduciary markers, but a large marker cannot be used because only a fraction of the area is viewed by the camera at a time. Such an area can be covered by a number of small markers with unique identifiers. However, with the camera freely moving in the scene and with occluders present, it is difficult to ensure that at least one of the individual markers is completely visible, unless the markers are small and numerous. In that case, the markers are not recognizable from larger distances. In this paper we introduce the concept of Marker Fields, which overcome this limitation. The Marker Field covers a large-scale planar (or non-planar) area and is composed of mutually overlapping partial markers. We propose a particular arrangement of the Marker Field: a Uniform Checker-Board Marker Field, which is a black-and-white checkerboard whose square modules are defined by aperiodic 4-orientable binary n²-window arrays (De Bruijn tori). We propose a genetic algorithm for the construction of 4-orientable n²-window arrays. We used a supercomputer to synthesize large 4-orientable 4²-window arrays and offer them publicly for download. We prototyped an algorithm for detection of the checkerboard marker fields and measured its performance. When processing input video from a cellphone camera, the algorithm visits only about 5 % of image pixels for reliable detection, and the processing time is about 1 ms on a mid-range PC processor. The Uniform Marker Field increases freedom of camera movement, especially with occluders present in the scene. The detection algorithm is efficient, and real-time marker field detection will be feasible on ultramobile devices.
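The window property can be illustrated with a brute-force check. The sketch below assumes one reading of "4-orientable": every n×n window of the array, taken in each of its four 90° rotations, occurs at most once, so that a detected window identifies both position and orientation. It treats the array as aperiodic (windows do not wrap around the edges). The function names are hypothetical, and this is a verifier only, not the genetic construction algorithm from the paper.

```python
# Brute-force check of an ASSUMED definition of a 4-orientable n x n
# window array. Names are illustrative; this is not the paper's code.

def rotate(window):
    """Rotate a tuple-of-tuples window by 90 degrees clockwise."""
    return tuple(zip(*window[::-1]))

def is_4_orientable(grid, n):
    """True if no n x n window, in any of its four rotations, repeats
    anywhere in grid (including a window matching its own rotation)."""
    seen = set()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - n + 1):
        for c in range(cols - n + 1):
            w = tuple(tuple(grid[r + i][c:c + n]) for i in range(n))
            for _ in range(4):
                if w in seen:
                    return False
                seen.add(w)
                w = rotate(w)
    return True
```

A check like this costs time proportional to the number of windows, so it could serve as a fitness or validity test inside a search procedure, under the assumed definition.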
Augmented reality does not make any sense for fixed cameras. Or does it? In this work, we deal with static cameras and their usability for interactive augmented reality applications. Knowing that the camera does not move makes camera pose estimation both less and more difficult: one does not have to deal with pose change over time, but on the other hand, obtaining some level of understanding of the scene from a single viewpoint is challenging. We propose several ways to take advantage of the camera being static, and a pipeline for a system that broadcasts a video stream enriched with the information needed for its interactive visual augmentation: Interactive Camera Streams (INCAST). We present a proof-of-concept system showing the usability of INCAST in several use cases: non-interactive demos and simple AR games.
One limitation of existing fiduciary markers is that the camera motion is tightly limited: the marker (one of the markers) must be visible and it must be observed at a proper scale. This paper introduces a fractal structure of markers similar to matrix codes (such as QR-code or Data Matrix): the Fractal Marker Field. The FMF allows for embedding markers of a virtually unlimited number of scales. At the same time, for each of the scales it guarantees a constant density of markers at that scale over the whole marker field's surface. The Fractal Marker Field can provide unprecedented freedom of motion to camera-based augmented reality applications.
The presented system (Unicam) offers complex, state-of-the-art machine vision equipment and technology for automated video-based vehicle detection devices dedicated to traffic monitoring applications. The system provides real-time video image capture, digital signal processing, compression, storage, and transmission over communication interfaces. It uses proprietary artificial intelligence algorithms and special image processing modules to achieve highly accurate vehicle detection. According to the users' needs, the system can be used for detection of red-light violations at road intersections, speed measurement, traffic data collection, video recording, or surveillance. Other possible applications of the system are surveys based on license plate recognition for transportation engineers, stolen car searching, and toll-tag data collection. The system functionality has been improved by coupling camera sensors with specialized real-time processing units and adding networking capability. The implementation of video detection algorithms, hardware design units, and networking features is also discussed.
Particle rendering engine in DSP and FPGA Zemcik, P.; Herout, A.; Crha, L. ...
Proceedings. 11th IEEE International Conference and Workshop on the Engineering of Computer-Based Systems, 2004,
2004
Conference Proceeding
We present an algorithm for rendering 3D point-clouds, which exploits an FPGA chip coupled with a DSP processor on an experimental board. Point-clouds are sets of graphical data in 3D space, which seem to be more suitable for potentially many purposes than the most frequently used triangle meshes. The actual experimental implementation, which verifies the concept and reports promising results, is also described.
The use of statistical classifiers, namely AdaBoost and its modifications, in object detection and pattern recognition is a contemporary and popular trend. The computational performance of these classifiers largely depends on the low-level image features they use, both in terms of the amount of information a feature provides and the execution time of its evaluation. Local rank difference (LRD) is an image feature that is an alternative to the commonly used Haar features. It is suitable for implementation in programmable (FPGA) or specialized (ASIC) hardware as well as graphics hardware (GPU). Additionally, as shown in this paper, it performs very well on common CPUs. The paper discusses the LRD features and their properties, describes an experimental implementation of LRD using the multimedia instruction set of current general-purpose processors, presents its empirical performance measures compared to alternative approaches, offers several notes on the practical usage of LRD, and proposes directions for future work.
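As a rough illustration of what such a feature computes, the sketch below follows the formulation of LRD commonly given in the literature: pixel sums are taken over a 3×3 grid of equally sized cells, and the feature value is the rank of one chosen cell minus the rank of another, where a cell's rank is the number of cells with a strictly smaller sum. This is a plain scalar sketch under that assumption, not the multimedia-instruction implementation described in the paper; the function names and the row-major cell indexing are illustrative choices.

```python
# Scalar sketch of a local rank difference (LRD) feature over a 3x3 grid
# of cells. Hypothetical names; no SIMD, unlike the paper's implementation.

def cell_sums(image, x, y, w, h):
    """Sums of the nine w-by-h cells of a 3x3 grid whose top-left pixel
    is (x, y); image is a list of rows. Cells are returned row-major."""
    return [
        sum(image[y + j * h + dy][x + i * w + dx]
            for dy in range(h) for dx in range(w))
        for j in range(3) for i in range(3)
    ]

def rank(values, idx):
    """Number of cells whose sum is strictly smaller than cell idx."""
    return sum(1 for v in values if v < values[idx])

def lrd(image, x, y, w, h, a, b):
    """Rank of cell a minus rank of cell b (indices 0..8, row-major)."""
    sums = cell_sums(image, x, y, w, h)
    return rank(sums, a) - rank(sums, b)
```

Because only the ordering of the cell sums matters, the value is unchanged by any strictly increasing transformation of the sums (for example, a uniform brightness shift), and it needs only integer additions and comparisons, which is consistent with the hardware-friendliness noted above.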