We propose a method for localizing an acoustic source with distributed microphone networks. Time Differences of Arrival (TDOAs) of signals pertaining to the same sensor are estimated through Generalized Cross-Correlation (GCC). After a TDOA filtering stage that discards potentially unreliable measurements, source localization is performed by minimizing a fourth-order polynomial that combines hyperbolic constraints from multiple sensors. The algorithm turns out to exhibit a significantly lower computational cost than state-of-the-art techniques, while retaining excellent localization accuracy in fairly reverberant conditions.
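As a rough illustration of the TDOA estimation step, the following sketch implements GCC with the common PHAT weighting; the abstract does not specify the weighting function, so PHAT is an assumption here:

```python
import numpy as np

def gcc_phat_tdoa(x, y, fs):
    """Estimate the TDOA between two microphone signals via GCC with
    PHAT weighting (a common choice; the paper's GCC variant may differ).
    A positive result means y is delayed with respect to x."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = np.conj(X) * Y
    cross /= np.abs(cross) + 1e-12        # PHAT: keep phase information only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

Real systems would add sub-sample interpolation around the correlation peak; this version returns the delay at integer-sample resolution.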
In the last decade, the increased possibility to produce, edit, and disseminate multimedia contents has not been adequately balanced by similar advances in protecting these contents from unauthorized diffusion of forged copies. When the goal is to detect whether a digital content has been tampered with in order to alter its semantics, multimedia hashes turn out to be an effective solution for offering proof of legitimacy and possibly identifying the introduced tampering. We propose an image hashing algorithm based on compressive sensing principles, which solves both the authentication and the tampering identification problems. The original content producer generates a hash using a small bit budget by quantizing a limited number of random projections of the authentic image. The content user receives the (possibly altered) image and uses the hash to estimate the mean square error distortion between the original and the received image. In addition, if the introduced tampering is sparse in some orthonormal basis or redundant dictionary, an approximation of it is given in the pixel domain. We emphasize that the hash is universal, i.e., the same hash signature can be used to detect and identify different types of tampering. At the cost of additional complexity at the decoder, the proposed algorithm is robust to moderate content-preserving transformations, including cropping, scaling, and rotation. In addition, in order to keep the size of the hash small, hash encoding/decoding takes advantage of distributed source coding.
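A minimal sketch of the hashing idea follows, assuming Gaussian random projections with uniform quantization, and an MSE estimate based on the norm-preservation property of random projections; the function names, the quantizer, and the parameter values are illustrative, not the paper's (in particular, the distributed-source-coding compression of the hash is omitted):

```python
import numpy as np

def sensing_matrix(n_pixels, n_proj, key=0):
    # Shared pseudorandom matrix; the seed plays the role of a secret key
    rng = np.random.default_rng(key)
    return rng.standard_normal((n_proj, n_pixels))

def image_hash(img, A, step=1.0):
    # Hash = coarsely quantized random projections of the image
    return np.round((A @ img.ravel()) / step).astype(np.int64)

def estimate_mse(h, img_recv, A, step=1.0):
    # Random projections approximately preserve (scaled) Euclidean norms,
    # so the projection-domain distance estimates the pixel-domain MSE
    diff = h * step - A @ img_recv.ravel()
    m, n = A.shape
    return float(diff @ diff) / (m * n)
```

The estimate concentrates around the true MSE as the number of projections grows; the quantization step trades hash size against estimation accuracy.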
In this paper, we propose a robust and low-complexity acoustic source localization technique based on time differences of arrival (TDOA), which addresses the scenario of distributed sensor networks in 3D environments. Network nodes are assumed to be unsynchronized, i.e., TDOAs between microphones belonging to different nodes are not available. We begin by showing how to select feasible TDOAs for each sensor node, exploiting both geometrical considerations and a characterization of the overall generalized cross-correlation (GCC) shape. We then show how to localize sources in the space-range reference frame, where TDOA measurements have a clear geometrical interpretation that can be fruitfully used in the scenario of unsynchronized sensors. In this framework, in fact, the source corresponds to the apex of a hypercone passing through points described solely by the microphone positions and TDOA measurements. The localization problem is therefore approached as a hypercone fitting problem. Finally, in order to improve the robustness of the estimate, we include an outlier detection procedure based on the evaluation of the hypercone fitting residuals. A refinement of the source location estimate is then performed, ignoring the contributions of outlier measurements. A set of simulations shows the performance of the individual blocks of the system, with particular focus on the effect of TDOA selection on the source localization and refinement steps. Experiments on real data validate the localization algorithm in an everyday scenario, proving that good accuracy can be obtained while saving computational cost in comparison with state-of-the-art techniques.
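As an illustration of the residual-based outlier stage, the sketch below discards measurements by a median-absolute-deviation rule; the abstract does not specify the exact criterion, so the MAD rule and the threshold `k` are assumptions:

```python
import numpy as np

def discard_outliers(residuals, measurements, k=2.0):
    """Flag TDOA measurements whose hypercone-fitting residual deviates
    from the median by more than k scaled MADs, so the fit can be
    repeated on the surviving measurements. The MAD rule and threshold
    are illustrative, not the paper's exact criterion."""
    residuals = np.asarray(residuals, dtype=float)
    dev = np.abs(residuals - np.median(residuals))
    mad = np.median(dev) + 1e-12          # robust scale estimate
    keep = dev <= k * 1.4826 * mad        # 1.4826: MAD -> std under Gaussianity
    return [m for m, ok in zip(measurements, keep) if ok]
```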
When video is transmitted over a packet-switched network, the sequence reconstructed at the receiver side might suffer from impairments introduced by packet losses, which can only be partially healed by error concealment techniques. In this context, we propose NORM (NO-Reference video quality Monitoring), an algorithm to assess the quality degradation of H.264/AVC video affected by channel errors. NORM works at the receiver side, where neither the original nor an uncorrupted copy of the video content is available. We explicitly account for the distortion introduced by spatial and temporal error concealment, together with the effect of temporal motion compensation. NORM provides an estimate of the mean square error distortion at the macroblock level, showing good linear correlation (correlation coefficient greater than 0.80) with the distortion computed in full-reference mode. In addition, the estimate at the macroblock level can be successfully exploited by forward quality monitoring systems that compute objective quality metrics to predict mean opinion score (MOS) values. As a proof of concept, we feed the output of NORM to a reduced-reference quality monitoring system that computes an estimate of the structural similarity (SSIM) score, which is known to be well correlated with perceptual quality.
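To illustrate how channel distortion propagates through motion-compensated prediction, here is a first-order toy model; the geometric-decay assumption and all constants are illustrative, not NORM's actual estimator:

```python
def propagate_distortion(loss_events, n_frames, leak=0.9, d_conceal=100.0):
    """First-order model of MSE propagation through motion-compensated
    prediction: a packet loss injects concealment distortion that then
    decays geometrically over subsequent predicted frames (leak < 1
    models spatial filtering and intra refresh). All constants are
    illustrative, not the paper's."""
    d = [0.0] * n_frames
    for t in range(n_frames):
        inherited = leak * d[t - 1] if t > 0 else 0.0
        injected = d_conceal if t in loss_events else 0.0
        d[t] = inherited + injected
    return d
```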
In this paper, we propose a novel acoustic source localization method that accommodates the general scenario of multiple independent microphone arrays. The method is based on a 3-D parameter space defined by the 2-D spatial location of a source and the range difference extracted from the time difference of arrival (TDOA). In this space, the set of points that correspond to a given range lies on a circle that expands as the range increases; these circles form a cone whose apex is the actual location of the source. In this parameter space, the lack of synchronization between arrays means that clusters of data associated with individual arrays are free to shift along the range axis. The cone constraint, in fact, enables the realignment of such clusters while positioning the cone apex (the source location), thus resulting in joint data re-synchronization and source localization. We also propose a novel and general analysis methodology for swiftly assessing the localization error as a function of the TDOA uncertainties, which is remarkably accurate for small localization bias. With the aid of this methodology, simulations, and experiments on real data, we show that the cone-fitting process offers excellent localization accuracy in the scenario of multiple unsynchronized arrays, as well as in simpler single-array scenarios, also in comparison with state-of-the-art techniques. We also show that the proposed method offers the desired flexibility for adapting to arbitrary geometries of microphone clusters.
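A compact sketch of the cone-fitting idea, assuming 2-D microphone positions and a generic nonlinear least-squares solver; eliminating the unknown per-array range offset in closed form is one possible way to handle the missing synchronization, not necessarily the paper's exact solver:

```python
import numpy as np
from scipy.optimize import least_squares

def localize(mics, dranges, array_ids, s0):
    """Joint source localization and per-array realignment along the
    range axis. mics: (N, 2) microphone positions; dranges: range
    differences (TDOA times sound speed) w.r.t. each array's reference
    microphone; array_ids: which array each microphone belongs to."""
    mics = np.asarray(mics, dtype=float)
    dranges = np.asarray(dranges, dtype=float)
    array_ids = np.asarray(array_ids)

    def residuals(s):
        r = np.linalg.norm(mics - s, axis=1)   # ranges from candidate apex
        res = []
        for k in np.unique(array_ids):
            sel = array_ids == k
            # the unknown per-array offset (reference range) has a
            # closed-form optimum for a fixed candidate source
            off = np.mean(r[sel] - dranges[sel])
            res.append(r[sel] - dranges[sel] - off)
        return np.concatenate(res)

    return least_squares(residuals, np.asarray(s0, dtype=float)).x
```

With noise-free range differences the residuals vanish at the true source, so the solver recovers it exactly; with noisy TDOAs the same objective yields the least-squares cone fit.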
Overhead images can be obtained using different acquisition and processing techniques, and they are becoming more and more popular. As with common photographs, they can be forged and manipulated by malicious users. However, not all image forensics methods tailored to normal photos can be successfully applied out of the box to overhead images. In this paper, we consider the problem of localizing copy-paste forgeries on panchromatic images acquired by different satellites. We leverage a set of Convolutional Neural Networks (CNNs) that extract traces of the acquisition satellite directly from image patches. We then determine whether an image region appears to have been acquired with a different satellite than the rest of the picture. Results show that the proposed technique outperforms more sophisticated image forensics tools tailored to common photographs.
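The patch-wise decision logic can be sketched as follows, where `classify_patch` is a stand-in for the paper's CNNs and the majority-vote rule is an illustrative simplification:

```python
import numpy as np

def localize_foreign_patches(image, classify_patch, patch=32, stride=32):
    """Patch-wise attribution: flag regions whose predicted acquisition
    satellite disagrees with the image-level majority vote.
    `classify_patch` stands in for the CNNs described in the paper."""
    h, w = image.shape
    labels, coords = [], []
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            labels.append(classify_patch(image[r:r + patch, c:c + patch]))
            coords.append((r, c))
    labels = np.asarray(labels)
    vals, counts = np.unique(labels, return_counts=True)
    majority = vals[np.argmax(counts)]          # assumed donor satellite
    suspicious = [coords[i] for i in np.flatnonzero(labels != majority)]
    return majority, suspicious
```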
In the last few years, several companies have started offering the possibility of buying different kinds of overhead images acquired by satellites orbiting the planet. This market is interesting for several customers, from those who simply fancy a shot of their house from space to those aiming to acquire strategic information on portions of land. Due to the sensitive nature of these data, which can be maliciously altered by anyone, the forensic community has started investigating methodologies to verify the authenticity and integrity of overhead imagery. Within this context, in this paper we investigate the possibility of using Convolutional Neural Networks (CNNs) to attribute a panchromatic satellite image to the satellite used to acquire it. In our investigation we tackle both the closed-set attribution problem and, by adapting Deep Ensemble (DE) and Monte Carlo Dropout (MCD) techniques, the open-set one.
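A minimal sketch of the open-set decision, assuming the class posteriors of the ensemble members (DE models, or MCD forward passes) are already available; the averaging-plus-threshold rule is a common way to use such outputs, and the threshold value is illustrative:

```python
import numpy as np

def open_set_attribute(ensemble_probs, threshold=0.6):
    """Open-set decision from an ensemble: average the per-member class
    posteriors and reject as 'unknown satellite' when the mean
    confidence falls below a threshold. When the members disagree, the
    averaged posterior flattens and the sample is rejected."""
    mean = np.mean(ensemble_probs, axis=0)     # (n_classes,)
    conf = float(np.max(mean))
    return (int(np.argmax(mean)), conf) if conf >= threshold else (None, conf)
```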
In this paper, we address the problem of video transmission over unreliable networks, such as the Internet, where packet losses occur. The most recent literature indicates multiple description (MD) coding as a promising approach to handle this issue. Moreover, it has also been shown how important motion-compensated prediction is in an MD coding scheme. This paper proposes two architectures for multiple description video coding, both based on a motion-compensated prediction loop. The common characteristic of the two architectures is the use of a polyphase down-sampling technique to create the MDs and to introduce cross-redundancy among the descriptions. The first scheme, which we call drift-compensation multiple description video coder (DC-MDVC), appears very robust when used in an error-prone environment, but it can provide only two descriptions. The second architecture, called independent flow multiple description video coder (IF-MDVC), generates multiple sets of data before the motion compensation loop; in this case, there are no severe limitations on the number of descriptions used by the coder.
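The polyphase down-sampling step can be sketched as follows, together with a naive neighbour-averaging concealment that illustrates the cross-redundancy among descriptions; the concealment rule is illustrative, not the paper's decoder:

```python
import numpy as np

def polyphase_split(frame):
    # Four descriptions: one pixel of every 2x2 block goes to each one
    return {(i, j): frame[i::2, j::2] for i in (0, 1) for j in (0, 1)}

def polyphase_merge(descs, shape):
    # Reassemble the received descriptions; missing samples stay NaN
    out = np.full(shape, np.nan)
    for (i, j), d in descs.items():
        out[i::2, j::2] = d
    return out

def conceal_missing(frame):
    # Replace missing samples with the average of their horizontal
    # neighbours, which belong to other descriptions (cross-redundancy)
    mask = np.isnan(frame)
    left = np.roll(frame, 1, axis=1)
    right = np.roll(frame, -1, axis=1)
    est = np.nanmean(np.stack([left, right]), axis=0)
    out = frame.copy()
    out[mask] = est[mask]
    return out
```

Because neighbouring pixels travel in different descriptions, losing one description leaves every missing pixel surrounded by received samples, which is what makes simple concealment effective.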
The beam tracing method can be used for the fast tracing of a large number of acoustic paths through a direct lookup of a special tree-like data structure (beam tree) that describes the iterated visibility information from one specific position. This structure describes the branching of bundles of rays (beams) as they encounter reflectors in their paths. For this reason, beam tracing is suitable for real-time acoustic rendering even when the receiver is moving. In this paper, we propose a novel technique that enables the fast tracing of a large number of acoustic beams through the iterative lookup of a special data structure that describes the global visibility between reflectors. The method enables the immediate generation of the beam tree corresponding to an arbitrary source location, which can then be used for path tracing through direct lookup. In practice, this technique generalizes the traditional beam-tracing method, as it makes it suitable for real-time acoustic rendering not just when the receiver is moving but also when the source is moving. The method enables real-time modeling of acoustic propagation and real-time auralization in complex 2-D and 2-D × 1-D environments (e.g., vertical walls limited by a horizontal floor and ceiling), which makes it suitable for applications of real-time virtual acoustics, immersive gaming, and advanced acoustic rendering. Experimental results show the effectiveness of fast beam tracing with respect to the state of the art in acoustic beam tracing.
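A toy version of the precomputed-visibility idea: the beam tree below depends only on wall-to-wall visibility, so it can be built once and then queried by direct lookup for any source; geometry, beam clipping, and source placement are deliberately omitted:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class BeamNode:
    reflector: Optional[int]                # wall that spawned this beam (None = root)
    children: List["BeamNode"] = field(default_factory=list)

def build_beam_tree(visibility, max_order, reflector=None, order=0):
    # The branching depends only on which walls can "see" which other
    # walls, so the tree is independent of the source position
    node = BeamNode(reflector)
    if order < max_order:
        candidates = visibility.keys() if reflector is None else visibility[reflector]
        for w in candidates:
            node.children.append(build_beam_tree(visibility, max_order, w, order + 1))
    return node

def reflection_sequences(node, prefix=()):
    # Direct lookup: enumerate every reflection path up to max_order
    seq = prefix if node.reflector is None else prefix + (node.reflector,)
    if seq:
        yield seq
    for child in node.children:
        yield from reflection_sequences(child, seq)
```

In a full implementation each node would also store the beam's spatial extent, and infeasible branches would be pruned against the actual source position.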
Recent research on image forensics has led to the design of algorithms that study the phylogenetic relationship between near-duplicate (ND) images. The proposed solutions aim at reconstructing the image phylogeny tree (IPT) and have immediate applications in security, law and copyright enforcement, and news tracking services. However, the effectiveness of such strategies strictly depends on the accuracy with which image similarities are characterized. In this paper, we show that it is possible to take additional information into account to better reconstruct the IPT. More specifically, we propose a set of features that blindly model the processing age of an image, i.e., how much an image has been edited during its lifetime. By exploiting these features, it is possible to improve the performance of IPT reconstruction, increasing the accuracy and reducing the computational complexity.
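As a toy illustration of how a processing-age cue can help, the sketch below roots the IPT at the least-processed image and attaches every other image to its most similar lower-age candidate; the greedy rule and the scalar age are illustrative stand-ins for the paper's features and reconstruction algorithm:

```python
import numpy as np

def reconstruct_ipt(dissim, age):
    """Toy IPT reconstruction. dissim[i][j]: dissimilarity between
    images i and j; age[i]: estimated processing age of image i.
    Orienting edges from lower to higher age prunes the search space,
    which is where the complexity reduction comes from."""
    order = np.argsort(age)                 # increasing processing age
    parent = {int(order[0]): None}          # least-processed image = root
    for idx in order[1:]:
        # candidate parents: already-placed (lower-age) images
        best = min(parent, key=lambda p: dissim[p][int(idx)])
        parent[int(idx)] = int(best)
    return parent
```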