Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
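The abstract does not state the index itself, but the structural similarity (SSIM) formula is well known: it compares local luminance, contrast, and structure via means, variances, and covariance. The sketch below computes a single-window (global) SSIM value; the published index averages this quantity over local Gaussian-weighted windows, so treat this as a minimal illustration, not the reference implementation.

```python
import numpy as np

def ssim_global(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two images.  The published index
    averages this quantity over local sliding windows; computing it
    once over the whole image is a simplification for illustration."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (k1 * data_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2  # stabilizes the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(64, 64))
noisy = np.clip(img + rng.normal(0, 25, img.shape), 0, 255)
print(ssim_global(img, img))    # identical images score exactly 1.0
print(ssim_global(img, noisy))  # distortion lowers the score
```

Note that, unlike mean squared error, the score is invariant to changes that preserve structure (e.g., a uniform luminance shift moves only the first factor, which is stabilized by c1).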
We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear estimates over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive white Gaussian noise that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error.
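The key computation — a weighted average of linear (Wiener) estimates over the hidden multiplier — can be sketched in the scalar case. Here a single Gaussian scale mixture coefficient x = sqrt(z)·u is observed as c = x + n; conditioned on z the optimal estimate is Wiener, and the full estimate averages these with posterior weights p(z | c). The multiplier grid and prior below are illustrative placeholders, not the neighborhood model or prior of the paper.

```python
import numpy as np

def bls_gsm_scalar(c, noise_var, z_grid, z_prior):
    """Bayes least-squares estimate of a scalar GSM coefficient x from
    its noisy observation c = x + n.  Conditioned on the hidden
    multiplier z, the estimate is linear (Wiener); the BLS solution is
    the posterior-weighted average of those linear estimates."""
    # observation likelihood under each multiplier: c | z ~ N(0, z + noise_var)
    var = z_grid + noise_var
    lik = np.exp(-0.5 * c**2 / var) / np.sqrt(2 * np.pi * var)
    post = lik * z_prior
    post /= post.sum()                 # posterior p(z | c) on the grid
    wiener = z_grid / var * c          # per-z linear (Wiener) estimate
    return np.sum(post * wiener)       # weighted average over z

z_grid = np.linspace(0.01, 10.0, 500)
z_prior = 1.0 / z_grid                 # broad scale prior, purely illustrative
z_prior /= z_prior.sum()
est = bls_gsm_scalar(2.0, 1.0, z_grid, z_prior)
print(est)  # shrunk toward zero relative to the observation 2.0
```

Because every conditional Wiener gain z/(z + σ²) lies below one, the estimate always shrinks the observed coefficient, with the amount of shrinkage adapted to how plausible large multipliers are given the data.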
It has long been assumed that sensory neurons are adapted, through both evolutionary and developmental processes, to the statistical properties of the signals to which they are exposed. Attneave (1954) and Barlow (1961) proposed that information theory could provide a link between environmental statistics and neural responses through the concept of coding efficiency. Recent developments in statistical modeling, along with powerful computational tools, have enabled researchers to study more sophisticated statistical models for visual images, to validate these models empirically against large sets of data, and to begin experimentally testing the efficient coding hypothesis for both individual neurons and populations of neurons.
We describe the design of finite-size linear-phase separable kernels for differentiation of discrete multidimensional signals. The problem is formulated as an optimization of the rotation-invariance of the gradient operator, which results in a simultaneous constraint on a set of one-dimensional low-pass prefilter and differentiator filters up to the desired order. We also develop extensions of this formulation to both higher dimensions and higher order directional derivatives. We develop a numerical procedure for optimizing the constraint, and demonstrate its use in constructing a set of example filters. The resulting filters are significantly more accurate than those commonly used in the image and multidimensional signal processing literature.
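The separable structure described above means each axis of the image is convolved with either the low-pass prefilter p or the differentiator d, and the 2-D derivative kernel is their outer product. The taps below are a simple binomial/central-difference pair chosen only to illustrate the mechanics; the paper's jointly optimized filters are more accurate and are not reproduced here.

```python
import numpy as np

# Illustrative (not optimized) matched prefilter/differentiator pair.
p = np.array([0.25, 0.5, 0.25])    # low-pass prefilter
d = np.array([-0.5, 0.0, 0.5])     # central-difference differentiator

def grad_x(img):
    """d/dx via separable convolution: differentiate along x (rows),
    prefilter along y (columns).  np.convolve flips its kernel, so the
    differentiator is reversed to perform correlation."""
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, d[::-1], mode='same'), 1, img)
    return np.apply_along_axis(
        lambda c: np.convolve(c, p[::-1], mode='same'), 0, rows)

# On a linear ramp img[y, x] = 3*x the interior gradient is exactly 3.
img = 3.0 * np.arange(16)[None, :] * np.ones((16, 1))
gx = grad_x(img)
print(gx[8, 8])  # → 3.0 in the interior
```

The y-derivative is obtained by swapping the roles of the two axes; the rotation-invariance optimized in the paper concerns how accurately the resulting (gx, gy) pair estimates gradients at arbitrary orientations, which simple pairs like the one above only approximate.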
We consider the problem of decomposing a signal into a linear combination of features, each a continuously translated version of one of a small set of elementary features. Although these constituents are drawn from a continuous family, most current signal decomposition methods rely on a finite dictionary of discrete examples selected from this family (e.g., shifted copies of a set of basic waveforms), and apply sparse optimization methods to select and solve for the relevant coefficients. Here, we generate a dictionary that includes auxiliary interpolation functions that approximate translates of features via adjustment of their coefficients. We formulate a constrained convex optimization problem, in which the full set of dictionary coefficients represents a linear approximation of the signal, the auxiliary coefficients are constrained so as to only represent translated features, and sparsity is imposed on the primary coefficients using an L1 penalty. The basis pursuit denoising (BP) method may be seen as a special case, in which the auxiliary interpolation functions are omitted, and we thus refer to our methodology as continuous basis pursuit (CBP). We develop two implementations of CBP for a one-dimensional translation-invariant source, one using a first-order Taylor approximation, and another using a form of trigonometric spline. We examine the tradeoff between sparsity and signal reconstruction accuracy in these methods, demonstrating empirically that trigonometric CBP substantially outperforms Taylor CBP, which, in turn, offers substantial gains over ordinary BP. In addition, the CBP bases can generally achieve equally good or better approximations with much coarser sampling than BP, leading to a reduction in dictionary dimensionality.
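The Taylor variant's auxiliary interpolator is just the feature's derivative: a sub-grid shift τ is captured as f(t − τ) ≈ f(t) − τ f′(t), so one extra coefficient per dictionary element represents continuous translation. The sketch below checks that this two-element Taylor dictionary approximates a shifted feature far better than the nearest gridded copy alone; the Gaussian feature shape is an illustrative choice, and the full convex optimization of CBP is not shown.

```python
import numpy as np

t = np.linspace(-4.0, 4.0, 801)

def f(t):
    """Illustrative elementary feature: a unit-width Gaussian bump."""
    return np.exp(-t**2 / 2.0)

tau = 0.23                       # true sub-grid shift of the feature
target = f(t - tau)              # continuously translated feature

# Ordinary BP can only use the gridded copy f(t); Taylor CBP augments it
# with the derivative, whose coefficient encodes the shift amount tau.
fprime = np.gradient(f(t), t)    # numerical derivative of the feature
taylor = f(t) - tau * fprime     # first-order Taylor approximation

err_grid = np.linalg.norm(target - f(t))
err_taylor = np.linalg.norm(target - taylor)
print(err_grid, err_taylor)      # Taylor term removes most of the error
```

In the full method the auxiliary coefficient is constrained (here, effectively |τ| bounded by half the grid spacing) so that the pair can only represent genuine translates; the trigonometric-spline variant replaces the Taylor pair with a more accurate interpolating basis.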
It is widely believed that visual systems are optimized for the visual properties of the environment inhabited by the organism. A specific instance of this principle is known as the Efficient Coding Hypothesis, which holds that the purpose of early visual processing is to produce an efficient representation of the incoming visual signal. The theory provides a quantitative link between the statistical properties of the world and the structure of the visual system. As such, specific instances of this theory have been tested experimentally, and have been used to motivate and constrain models for early visual processing.
We describe a form of nonlinear decomposition that is well-suited for efficient encoding of natural signals. Signals are initially decomposed using a bank of linear filters. Each filter response is then rectified and divided by a weighted sum of rectified responses of neighboring filters. We show that this decomposition, with parameters optimized for the statistics of a generic ensemble of natural images or sounds, provides a good characterization of the nonlinear response properties of typical neurons in primary visual cortex or auditory nerve, respectively. These results suggest that nonlinear response properties of sensory neurons are not an accident of biological implementation, but have an important functional role.
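The rectify-and-divide step (divisive normalization) is compact enough to sketch directly. Each squared filter response is divided by a stabilizing constant plus a weighted pool of its neighbors' squared responses; in the paper the weights and constant are optimized for natural-signal statistics, whereas the values below are placeholders.

```python
import numpy as np

def normalize(responses, weights, sigma=0.1):
    """Divisive normalization: each rectified (squared) linear filter
    response is divided by a weighted sum of rectified neighboring
    responses plus a constant.  Weights and sigma are illustrative; in
    the paper they are fit to natural image/sound statistics."""
    r = np.square(responses)         # rectifying nonlinearity
    denom = sigma**2 + weights @ r   # weighted pool of neighbors
    return r / denom

resp = np.array([1.0, 0.5, 0.2, 0.1])      # linear filter outputs
W = np.full((4, 4), 0.25)                  # uniform pooling, illustrative
out1 = normalize(resp, W)
out2 = normalize(2.0 * resp, W)            # double the input contrast
print(out2 / out1)  # far less than the factor of 4 a squarer alone gives
```

The demo shows the characteristic effect: doubling the input drives the numerator up fourfold but the pooled denominator grows too, so responses saturate with contrast while relative responses across the filter bank are preserved.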
Neurons in primary visual cortex (V1) are commonly classified as simple or complex based upon their sensitivity to the sign of stimulus contrast. The responses of both cell types can be described by a general model in which the outputs of a set of linear filters are nonlinearly combined. We estimated the model for a population of V1 neurons by analyzing the mean and covariance of the spatiotemporal distribution of random bar stimuli that were associated with spikes. This analysis reveals an unsuspected richness of neuronal computation within V1. Specifically, simple and complex cell responses are best described using more linear filters than the one or two found in standard models. Many filters revealed by the model contribute suppressive signals that appear to have a predominantly divisive influence on neuronal firing. Suppressive signals are especially potent in direction-selective cells, where they reduce responses to stimuli moving in the nonpreferred direction.
We examine a cascade encoding model for neural response in which a linear filtering stage is followed by a noisy, leaky, integrate-and-fire spike generation mechanism. This model provides a biophysically more realistic alternative to models based on Poisson (memoryless) spike generation, and can effectively reproduce a variety of spiking behaviors seen in vivo. We describe the maximum likelihood estimator for the model parameters, given only extracellular spike train responses (not intracellular voltage data). Specifically, we prove that the log-likelihood function is concave and thus has an essentially unique global maximum that can be found using gradient ascent techniques. We develop an efficient algorithm for computing the maximum likelihood solution, demonstrate the effectiveness of the resulting estimator with numerical simulations, and discuss a method of testing the model's validity using time-rescaling and density evolution techniques.
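The generative side of the cascade is straightforward to simulate: a stimulus is passed through a linear temporal filter, and the filter output drives a leaky integrate-and-fire unit with additive noise, spiking and resetting at threshold. The sketch below shows only this forward model, with illustrative filter and cell parameters; the paper's contribution, the concave maximum-likelihood estimator from extracellular spikes alone, involves density-evolution machinery not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

dt, tau = 0.001, 0.02              # time step and membrane time constant (s)
v_th, v_reset = 0.5, 0.0           # spike threshold and reset voltage
k = np.exp(-0.5 * np.arange(30))   # illustrative causal temporal filter

stim = rng.standard_normal(5000)                 # white-noise stimulus
drive = np.convolve(stim, k)[:len(stim)]         # linear filtering stage

v, spikes = 0.0, []
for t, i_t in enumerate(drive):
    # noisy leaky integration of the filtered input
    v += dt / tau * (-v + 2.0 * i_t) + 0.3 * np.sqrt(dt) * rng.standard_normal()
    if v >= v_th:                  # threshold crossing emits a spike
        spikes.append(t)
        v = v_reset                # membrane resets after each spike
print(len(spikes))
```

Unlike a Poisson model, this mechanism has memory: the post-spike reset and the leaky voltage trajectory make spike timing depend on the recent spiking history as well as the stimulus.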
Human visual speed perception is qualitatively consistent with a Bayesian observer that optimally combines noisy measurements with a prior preference for lower speeds. Quantitative validation of this model, however, is difficult because the precise noise characteristics and prior expectations are unknown. Here, we present an augmented observer model that accounts for the variability of subjective responses in a speed discrimination task. This allowed us to infer the shape of the prior probability as well as the internal noise characteristics directly from psychophysical data. For all subjects, we found that the fitted model provides an accurate description of the data across a wide range of stimulus parameters. The inferred prior distribution shows significantly heavier tails than a Gaussian, and the amplitude of the internal noise is approximately proportional to stimulus speed and depends inversely on stimulus contrast. The framework is general and should prove applicable to other experiments and perceptual modalities.
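The core Bayesian computation can be sketched on a grid: the posterior over speed is the product of a noisy measurement likelihood and a slow-speed prior, and noisier measurements (as at low contrast) let the prior pull the estimate further toward zero. The exponential prior and the fixed Gaussian noise levels below are illustrative stand-ins; the paper infers the prior shape and speed-dependent noise from psychophysical data rather than assuming them.

```python
import numpy as np

v = np.linspace(0.01, 20.0, 2000)   # candidate speeds (deg/s)
prior = np.exp(-v / 4.0)            # illustrative slow-speed prior
prior /= prior.sum()

def posterior_mean(measured, noise_sd):
    """Posterior-mean speed estimate: Gaussian measurement likelihood
    combined with the slow-speed prior on a discrete grid."""
    lik = np.exp(-0.5 * (v - measured)**2 / noise_sd**2)
    post = lik * prior
    post /= post.sum()
    return np.sum(post * v)

true_speed = 10.0
est_high_contrast = posterior_mean(true_speed, 1.0)  # reliable measurement
est_low_contrast = posterior_mean(true_speed, 4.0)   # noisy measurement
print(est_high_contrast, est_low_contrast)
```

Both estimates are biased below the true speed, and the bias grows as measurement noise grows, which is the Bayesian account of why low-contrast stimuli appear to move more slowly.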