Building the picture
Complex visual scenes are made up of many component features, such as edges and textures. Neurons in the early stages of the visual system are sensitive to individual features; it is implicitly believed that the nervous system must put them back together to signal conjunctions of different features, but how this is achieved is unknown. Yan Karklin and Michael Lewicki have developed a computational model of visual processing in which neural activity encodes statistical variations of features in images, establishing which ones are most likely to be associated with each other. Aspects of the model echo the nonlinear properties of some visual neurons, hinting at a possible functional interpretation for these properties.
A fundamental function of the visual system is to encode the building blocks of natural scenes (edges, textures and shapes) that subserve visual tasks such as object recognition and scene understanding. Essential to this process is the formation of abstract representations that generalize from specific instances of visual input. A common view holds that neurons in the early visual system signal conjunctions of image features [1, 2], but how these produce invariant representations is poorly understood. Here we propose that to generalize over similar images, higher-level visual neurons encode statistical variations that characterize local image regions. We present a model in which neural activity encodes the probability distribution most consistent with a given image. Trained on natural images, the model generalizes by learning a compact set of dictionary elements for image distributions typically encountered in natural scenes. Model neurons show a diverse range of properties observed in cortical cells. These results provide a new functional explanation for nonlinear effects in complex cells [3, 4, 5, 6] and offer insight into coding strategies in primary visual cortex (V1) and higher visual areas.
A fundamental task of a sensory system is to infer information about the environment. It has long been suggested that an important goal of the first stage of this process is to encode the raw sensory signal efficiently by reducing its redundancy in the neural representation. Some redundancy, however, would be expected because it can provide robustness to noise inherent in the system. Encoding the raw sensory signal itself is also problematic, because it contains distortion and noise. The optimal solution would be constrained further by limited biological resources. Here, we analyze a simple theoretical model that incorporates these key aspects of sensory coding and apply it to conditions in the retina. The model specifies the optimal way to incorporate redundancy in a population of noisy neurons, while also optimally compensating for sensory distortion and noise. Importantly, it allows an arbitrary input-to-output cell ratio between sensory units (photoreceptors) and encoding units (retinal ganglion cells), providing predictions of retinal codes at different eccentricities. Compared to earlier models based on redundancy reduction, the proposed model conveys more information about the original signal. Interestingly, redundancy reduction can be near-optimal when the number of encoding units is limited, such as in the peripheral retina. We show that there exist multiple, equally optimal solutions whose receptive field structure and organization vary significantly. Among these, the one which maximizes the spatial locality of the computation, but not the sparsity of either synaptic weights or neural responses, is consistent with known basic properties of retinal receptive fields. The model further predicts that receptive field structure changes less with light adaptation at higher input-to-output cell ratios, such as in the periphery.
The auditory system encodes sound by decomposing the amplitude signal arriving at the ear into multiple frequency bands whose center frequencies and bandwidths are approximately exponential functions of the distance from the stapes. This organization is thought to result from the adaptation of cochlear mechanisms to the animal's auditory environment. Here we report that several basic auditory nerve fiber tuning properties can be accounted for by adapting a population of filter shapes to encode natural sounds efficiently. The form of the code depends on sound class, resembling a Fourier transformation when optimized for animal vocalizations and a wavelet transformation when optimized for non-biological environmental sounds. Only for the combined set does the optimal code follow scaling characteristics of physiological data. These results suggest that auditory nerve fibers encode a broad set of natural sounds in a manner consistent with information theoretic principles.
Efficient auditory coding
Smith, Evan C.; Lewicki, Michael S.
Nature, 02/2006, Volume 439, Issue 7079
Journal Article, Peer-reviewed
The auditory neural code must serve a wide range of auditory tasks that require great sensitivity in time and frequency and be effective over the diverse array of sounds present in natural acoustic environments. It has been suggested [1, 2, 3, 4, 5] that sensory systems might have evolved highly efficient coding strategies to maximize the information conveyed to the brain while minimizing the required energy and neural resources. Here we show that, for natural sounds, the complete acoustic waveform can be represented efficiently with a nonlinear model based on a population spike code. In this model, idealized spikes encode the precise temporal positions and magnitudes of underlying acoustic features. We find that when the features are optimized for coding either natural sounds or speech, they show striking similarities to time-domain cochlear filter estimates, have a frequency-bandwidth dependence similar to that of auditory nerve fibres, and yield significantly greater coding efficiency than conventional signal representations. These results indicate that the auditory code might approach an information theoretic optimum and that the acoustic structure of speech might be adapted to the coding capacity of the mammalian auditory system.
The detection of neural spike activity is a technical challenge that is a prerequisite for studying many types of brain function. Measuring the activity of individual neurons accurately can be difficult due to large amounts of background noise and the difficulty in distinguishing the action potentials of one neuron from those of others in the local area. This article reviews algorithms and methods for detecting and classifying action potentials, a problem commonly referred to as spike sorting. The article first discusses the challenges of measuring neural activity and the basic issues of signal detection and classification. It reviews and illustrates algorithms and techniques that have been applied to many of the problems in spike sorting and discusses the advantages and limitations of each and the applicability of these methods for different types of experimental demands. The article is written both for the physiologist wanting to use simple methods that will improve experimental yield and minimize the selection biases of traditional techniques and for those who want to apply or extend more sophisticated algorithms to meet new experimental challenges.
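The basic spike-sorting pipeline the review covers (event detection, feature extraction, classification) can be sketched in a few lines. This is a minimal illustration rather than any specific algorithm from the article; the window length, upward threshold-crossing rule, PCA feature count, and farthest-point k-means seeding are all illustrative assumptions.

```python
import numpy as np

def detect_spikes(signal, thresh, win=32):
    """Find upward threshold crossings and cut a fixed-length snippet at each,
    enforcing a refractory gap so one spike yields one snippet."""
    idx = np.flatnonzero((signal[1:] > thresh) & (signal[:-1] <= thresh)) + 1
    keep, last = [], -win
    for i in idx:
        if i - last >= win and i + win <= len(signal):
            keep.append(i)
            last = i
    return np.array([signal[i:i + win] for i in keep])

def sort_spikes(snips, k=2, n_pc=2, iters=50):
    """Project snippets onto principal components, then cluster with k-means
    using deterministic farthest-point seeding."""
    x = snips - snips.mean(axis=0)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    feats = x @ vt[:n_pc].T                      # low-dimensional waveform features
    centers = [feats[0]]
    for _ in range(1, k):                        # farthest-point initialization
        d = np.min([((feats - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(feats[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):                       # standard k-means updates
        labels = np.argmin(((feats[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = feats[labels == j].mean(axis=0)
    return labels
```

In practice each stage is where the review's trade-offs appear: the threshold sets the miss/false-alarm balance, and the clustering step is where overlapping units and non-Gaussian waveform variability break simple methods.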
Learning Overcomplete Representations
Lewicki, Michael S.; Sejnowski, Terrence J.
Neural Computation, 02/2000, Volume 12, Issue 2
Journal Article, Peer-reviewed
In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete codes have also been proposed as a model of some of the response properties of neurons in primary visual cortex. Previous work has focused on finding the best representation of a signal using a fixed overcomplete basis (or dictionary). We present an algorithm for learning an overcomplete basis by viewing it as a probabilistic model of the observed data. We show that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency. This can be viewed as a generalization of the technique of independent component analysis and provides a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.
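As a rough numerical sketch of the general approach, learning alternates sparse coefficient inference with gradient updates of the basis. The code below is one common way to implement this, not the paper's exact algorithm: the ISTA inference loop, the L1 penalty standing in for a sparse prior, and the learning rate are illustrative assumptions.

```python
import numpy as np

def infer_coeffs(x, A, lam=0.1, steps=100):
    """ISTA: sparse coefficients s minimizing (1/2)||x - A s||^2 + lam * ||s||_1."""
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(steps):
        g = A.T @ (A @ s - x)                     # gradient of the quadratic term
        s = s - g / L
        s = np.sign(s) * np.maximum(np.abs(s) - lam / L, 0.0)  # soft threshold
    return s

def learn_basis(X, n_basis, lam=0.1, epochs=20, lr=0.05, seed=0):
    """Alternate sparse inference and gradient updates on the dictionary A."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], n_basis))
    A /= np.linalg.norm(A, axis=0)
    for _ in range(epochs):
        for x in X:
            s = infer_coeffs(x, A, lam)
            A += lr * np.outer(x - A @ s, s)      # reduce reconstruction error
            A /= np.linalg.norm(A, axis=0)        # keep basis vectors unit norm
    return A
```

Note that with n_basis greater than the input dimension the coefficients for a given input are not unique; the sparse penalty is what selects one representation among the many possible ones.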
Capturing statistical regularities in complex, high-dimensional data is an important problem in machine learning and signal processing. Models such as principal component analysis (PCA) and independent component analysis (ICA) make few assumptions about the structure in the data and have good scaling properties, but they are limited to representing linear statistical regularities and assume that the distribution of the data is stationary. For many natural, complex signals, the latent variables often exhibit residual dependencies as well as nonstationary statistics. Here we present a hierarchical Bayesian model that is able to capture higher-order nonlinear structure and represent nonstationary data distributions. The model is a generalization of ICA in which the basis function coefficients are no longer assumed to be independent; instead, the dependencies in their magnitudes are captured by a set of density components. Each density component describes a common pattern of deviation from the marginal density of the pattern ensemble; in different combinations, they can describe nonstationary distributions. Adapting the model to image or audio data yields a nonlinear, distributed code for higher-order statistical regularities that reflect more abstract, invariant properties of the signal.
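The core idea, density components setting per-coefficient variances, can be illustrated with a toy generative sketch under simplifying Gaussian assumptions; the actual model and its adaptation procedure are considerably richer.

```python
import numpy as np

def sample_hierarchical(A, B, v, rng):
    """Two-layer sample: density-component activities v set per-coefficient
    log-variances via B, coefficients u are drawn with those variances,
    and the observed patch is the linear synthesis x = A u."""
    log_var = B @ v                     # each density component shifts the variances
    u = rng.standard_normal(B.shape[0]) * np.exp(0.5 * log_var)
    return A @ u
```

Different settings of v yield the same linear basis but different coefficient distributions, which is how the model represents nonstationary statistics with a fixed dictionary.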
Nonstationary acoustic features provide essential cues for many auditory tasks, including sound localization, auditory stream analysis, and speech recognition. These features can best be characterized relative to a precise point in time, such as the onset of a sound or the beginning of a harmonic periodicity. Extracting these types of features is a difficult problem. Part of the difficulty is that with standard block-based signal analysis methods, the representation is sensitive to the arbitrary alignment of the blocks with respect to the signal. Convolutional techniques such as shift-invariant transformations can reduce this sensitivity, but these do not yield a code that is efficient, that is, one that forms a nonredundant representation of the underlying structure. Here, we develop a non-block-based method for signal representation that is both time relative and efficient. Signals are represented using a linear superposition of time-shiftable kernel functions, each with an associated magnitude and temporal position. Signal decomposition in this method is a nonlinear process that consists of optimizing the kernel function scaling coefficients and temporal positions to form an efficient, shift-invariant representation. We demonstrate the properties of this representation for the purpose of characterizing structure in various types of nonstationary acoustic signals. The computational problem investigated here has direct relevance to the neural coding at the auditory nerve and the more general issue of how to encode complex, time-varying signals with a population of spiking neurons.
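The decomposition into time-shiftable kernels with magnitudes and temporal positions can be approximated greedily with matching pursuit. The sketch below assumes the kernel shapes are given (the actual work adapts them to the signal class) and uses a fixed event budget in place of a principled stopping criterion.

```python
import numpy as np

def matching_pursuit(signal, kernels, n_spikes):
    """Greedy decomposition: repeatedly place the kernel/shift pair that best
    matches the residual, recording (kernel index, time, amplitude) events."""
    residual = signal.copy()
    events = []
    for _ in range(n_spikes):
        best = None
        for k, ker in enumerate(kernels):
            corr = np.correlate(residual, ker, mode='valid')  # dot product at every shift
            t = np.argmax(np.abs(corr))
            if best is None or abs(corr[t]) > abs(best[2]):
                best = (k, t, corr[t])
        k, t, c = best
        a = c / np.dot(kernels[k], kernels[k])    # projection coefficient
        residual[t:t + len(kernels[k])] -= a * kernels[k]
        events.append((k, t, a))
    return events, residual
```

Because each event carries its own time stamp, the code is unchanged when the signal is shifted, which is exactly the block-alignment sensitivity the abstract argues against.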
Scene analysis in the natural environment
Lewicki, Michael S.; Olshausen, Bruno A.; Surlykke, Annemarie ...
Frontiers in Psychology, 04/2014, Volume 5
Journal Article, Peer-reviewed, Open access
The problem of scene analysis has been studied in a number of different fields over the past decades. These studies have led to important insights into problems of scene analysis, but not all of these insights are widely appreciated, and there remain critical shortcomings in current approaches that hinder further progress. Here we take the view that scene analysis is a universal problem solved by all animals, and that we can gain new insight by studying the problems that animals face in complex natural environments. In particular, the jumping spider, songbird, echolocating bat, and electric fish, all exhibit behaviors that require robust solutions to scene analysis problems encountered in the natural environment. By examining the behaviors of these seemingly disparate animals, we emerge with a framework for studying scene analysis comprising four essential properties: (1) the ability to solve ill-posed problems, (2) the ability to integrate and store information across time and modality, (3) efficient recovery and representation of 3D scene structure, and (4) the use of optimal motor actions for acquiring information to progress toward behavioral goals.
An unsupervised classification algorithm is derived by modeling observed data as a mixture of several mutually exclusive classes that are each described by linear combinations of independent, non-Gaussian densities. The algorithm estimates the density of each class and is able to model class distributions with non-Gaussian structure. The new algorithm can improve classification accuracy compared with standard Gaussian mixture models. When applied to blind source separation in nonstationary environments, the method can switch automatically between classes, which correspond to contexts with different mixing properties. The algorithm can learn efficient codes for images containing both natural scenes and text. This method shows promise for modeling non-Gaussian structure in high-dimensional data and has many potential applications.
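Once per-class ICA models have been learned, assigning a data point to a class reduces to comparing class-conditional log-likelihoods. The sketch below assumes a Laplacian source prior and already-known unmixing matrices; the actual algorithm also learns the class densities from data.

```python
import numpy as np

def class_loglik(x, W):
    """Log-likelihood of x under one ICA class: sources s = W x with a
    unit-scale Laplacian prior on each, plus the log-determinant volume term."""
    s = W @ x
    return -np.sum(np.abs(s)) + np.log(np.abs(np.linalg.det(W)))

def classify(x, Ws):
    """Assign x to the ICA class (unmixing matrix) with the highest likelihood."""
    return int(np.argmax([class_loglik(x, W) for W in Ws]))
```

The log-determinant term matters: without it, classes whose unmixing matrices expand or contract volume differently would be compared on incompatible scales.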