Statistical Mechanics of Deep Learning
Bahri, Yasaman; Kadmon, Jonathan; Pennington, Jeffrey ...
Annual Review of Condensed Matter Physics, 03/2020, Volume 11, Issue 1
Journal Article
Peer reviewed
Open access
The recent striking success of deep neural networks in machine learning raises profound questions about the theoretical principles underlying their success. For example, what can such deep networks compute? How can we train them? How does information propagate through them? Why can they generalize? And how can we teach them to imagine? We review recent work in which methods of physical analysis rooted in statistical mechanics have begun to provide conceptual insights into these questions. These insights yield connections between deep learning and diverse physical and mathematical topics, including random landscapes, spin glasses, jamming, dynamical phase transitions, chaos, Riemannian geometry, random matrix theory, free probability, and nonequilibrium statistical mechanics. Indeed, the fields of statistical mechanics and machine learning have long enjoyed a rich history of strongly coupled interactions, and recent advances at the intersection of statistical mechanics and deep learning suggest these interactions will only deepen going forward.
The Panoramic Cameras on NASA's Mars Exploration Rovers have each returned more than 17,000 images of their calibration targets. In order to make optimal use of this data set for reflectance calibration, a correction must be made for the presence of air fall dust. Here we present an improved dust correction procedure based on a two‐layer scattering model, and we present a dust reflectance spectrum derived from long‐term trends in the data set. The dust on the calibration targets appears brighter than dusty areas of the Martian surface. We derive detailed histories of dust deposition and removal revealing two distinct environments: At the Spirit landing site, half the year is dominated by dust deposition, the other half by dust removal, usually in brief, sharp events. At the Opportunity landing site the Martian year has a semiannual dust cycle with dust removal happening gradually throughout two removal seasons each year. The highest observed optical depth of settled dust on the calibration target is 1.5 on Spirit and 1.1 on Opportunity (at 601 nm). We derive a general prediction for dust deposition rates of 0.004 ± 0.001 in units of surface optical depth deposited per sol (Martian solar day) per unit atmospheric optical depth. We expect this procedure to lead to improved reflectance‐calibration of the Panoramic Camera data set. In addition, it is easily adapted to similar data sets from other missions in order to deliver improved reflectance calibration as well as data on dust reflectance properties and deposition and removal history.
Key Points
We present an improved method for dust‐correcting calibration target images
The maximum deposited optical depth is 1.5 for Spirit and 1.1 for Opportunity
The two MER landing sites exhibit very different dust histories
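As a rough illustration of the deposition-rate figure quoted in the abstract above (0.004 ± 0.001 surface optical depth per sol per unit atmospheric optical depth), the following sketch accumulates deposited dust over a run of sols. It assumes deposition only, with no removal events, and that the rate uncertainty simply scales with the summed atmospheric optical depth. The function name and the sample input values are hypothetical and do not come from the paper.

```python
def deposited_optical_depth(tau_atm_by_sol, rate=0.004, rate_err=0.001):
    """Estimate surface optical depth deposited over a sequence of sols.

    tau_atm_by_sol: atmospheric optical depth for each sol (one value per sol).
    rate, rate_err: deposition rate and its uncertainty, in surface optical
        depth per sol per unit atmospheric optical depth (values quoted in
        the abstract).

    Assumption: deposition only, no dust removal during the interval.
    Returns (estimate, uncertainty).
    """
    tau_sum = sum(tau_atm_by_sol)
    return rate * tau_sum, rate_err * tau_sum


# Hypothetical input: atmospheric optical depth held at 0.5 for 100 sols.
estimate, uncertainty = deposited_optical_depth([0.5] * 100)
print(f"deposited optical depth: {estimate:.2f} +/- {uncertainty:.2f}")
```

Under these assumptions, 100 sols at an atmospheric optical depth of 0.5 would deposit a surface optical depth of roughly 0.2. Any real application would also need to account for the removal events the abstract describes.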
The Panoramic Camera (Pancam) on the Mars Exploration Rover mission has acquired in excess of 20,000 images of the Pancam calibration targets on the rovers. Analysis of this data set allows estimates of the rate of deposition and removal of aeolian dust on both rovers. During the first 150–170 sols there was gradual dust accumulation on the rovers but no evidence for dust removal. After that time there is ample evidence for both dust removal and dust deposition on both rover decks. We analyze data from early in both rover missions using a diffusive reflectance mixing model. Assuming a dust settling rate proportional to the atmospheric optical depth, we derive spectra of optically thick layers of airfall dust that are consistent with spectra from dusty regions on the Martian surface. Airfall dust reflectance at the Opportunity site appears greater than at the Spirit site, consistent with other observations. We estimate the optical depth of dust deposited on the Spirit calibration target by sol 150 to be 0.44 ± 0.13. For Opportunity the value was 0.39 ± 0.12. Assuming 80% pore space, we estimate that the dust layer grew at a rate of one grain diameter per ∼100 sols on the Spirit calibration target. On Opportunity the rate was one grain diameter per ∼125 sols. These numbers are consistent with dust deposition rates observed by Mars Pathfinder taking into account the lower atmospheric dust optical depth during the Mars Pathfinder mission.
Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations: subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.
Fitting probabilistic models to data is often difficult, due to the general intractability of the partition function. We propose a new parameter fitting method, minimum probability flow (MPF), which is applicable to any parametric model. We demonstrate parameter estimation using MPF in two cases: a continuous state space model, and an Ising spin glass. In the latter case, MPF outperforms current techniques by at least an order of magnitude in convergence time with lower error in the recovered coupling parameters.
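To make the idea concrete, here is a minimal sketch of one commonly used form of the MPF objective for an Ising model, in which each observed state is connected to the states reachable by a single spin flip. It is an illustration under stated assumptions, not the authors' implementation. The assumptions are: spins take values in {-1, +1}; the energy convention is E(x) = -1/2 Σ J_ij x_i x_j - Σ b_i x_i with symmetric J; the constant prefactor is dropped; and neighbouring states are not checked for membership in the data set. The function names are hypothetical. The point of the construction is that the objective depends only on energy differences, so the partition function never has to be computed.

```python
import math


def ising_energy(x, J, b):
    """E(x) = -1/2 * sum_{i != j} J[i][j] x_i x_j - sum_i b[i] x_i (assumed convention)."""
    n = len(x)
    pair = sum(J[i][j] * x[i] * x[j]
               for i in range(n) for j in range(n) if i != j)
    return -0.5 * pair - sum(b[i] * x[i] for i in range(n))


def mpf_objective(samples, J, b):
    """Average over data states of sum_i exp((E(x) - E(x with spin i flipped)) / 2).

    Assumes J is symmetric, so the energy change from flipping spin i is
    2 * x_i * (sum_{j != i} J[i][j] x_j + b[i]).
    """
    n = len(b)
    total = 0.0
    for x in samples:
        for i in range(n):
            # Local field acting on spin i from the other spins and the bias.
            field = sum(J[i][j] * x[j] for j in range(n) if j != i) + b[i]
            delta_e = 2.0 * x[i] * field  # E(flipped state) - E(x)
            total += math.exp(-0.5 * delta_e)
    return total / len(samples)


# Hypothetical toy data: three spins, two observed states.
samples = [[1, -1, 1], [-1, -1, 1]]
J = [[0.0, 0.5, -0.3],
     [0.5, 0.0, 0.2],
     [-0.3, 0.2, 0.0]]
b = [0.1, -0.2, 0.3]
print(mpf_objective(samples, J, b))
```

In a fitting loop, this quantity would be minimized with respect to J and b by any gradient-based optimizer. Lower values mean that less probability flows from the data states to their single-flip neighbours.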
The mammalian neocortex is a highly interconnected network of different types of neurons organized into both layers and columns. Overlaid on this structural organization is a pattern of functional connectivity that can be rapidly and flexibly altered during behavior. Parvalbumin-positive (PV+) inhibitory neurons, which are implicated in cortical oscillations and can change neuronal selectivity, may play a pivotal role in these dynamic changes. We found that optogenetic activation of PV+ neurons in the auditory cortex enhanced feedforward functional connectivity in the putative thalamorecipient circuit and in cortical columnar circuits. In contrast, stimulation of PV+ neurons induced no change in connectivity between sites in the same layers. The activity of PV+ neurons may thus serve as a gating mechanism to enhance feedforward, but not lateral or feedback, information flow in cortical circuits. Functionally, it may preferentially enhance the contribution of bottom-up sensory inputs to perception.
• Ising models recover known canonical circuit connectivity in the auditory cortex
• PV+ neuron activity increases functional connectivity in cortical columns
• PV+ neuron activity does not change horizontal connectivity in cortical layers
• PV+ neuron activity may increase bottom-up sensory input for perception
The neocortex contains neurons of many subtypes arranged into layers and columns, but how this structure relates to network activity remains unknown. Hamilton et al. show that optogenetically activating PV+ inhibitory interneurons enhances feedforward functional connectivity in the auditory cortex.
Laboratory visible/near‐infrared multispectral observations of Mars Exploration Rover Pancam calibration target materials coated with different thicknesses of Mars spectral analog dust were acquired under variable illumination geometries using the Bloomsburg University Goniometer. The data were fit with a two‐layer radiative transfer model that combines a Hapke formulation for the dust with measured values of the substrate interpolated using a He‐Torrance approach. We first determined the single‐scattering albedo, phase function, opposition effect width, and amplitude for the dust using the entire data set (six coating thicknesses, three substrates, four wavelengths, and phase angles 3°–117°). The dust exhibited single‐scattering albedo values similar to other Mars analog soils and to Mars Pathfinder dust and a dominantly forward scattering behavior whose scattering lobe became narrower at longer wavelengths. Opacity values for each dust thickness corresponded well to those predicted from the particle sizes of the Mars analog dust. We then restricted the number of substrates, dust thicknesses, and incidence angles input to the model. The results suggest that the dust properties are best characterized when using substrates whose reflectances are brighter and darker than those of the deposited dust and data that span a wide range of dust thicknesses. The model also determined the dust photometric properties relatively well despite limitations placed on the range of incidence angles. The model presented here will help determine the photometric properties of dust deposited on the MER rovers and track the multiple episodes of dust deposition and erosion that have occurred at both landing sites.
We statistically characterize the population spiking activity obtained from simultaneous recordings of neurons across all layers of a cortical microcolumn. Three types of models are compared: an Ising model which captures pairwise correlations between units, a Restricted Boltzmann Machine (RBM) which allows for modeling of higher-order correlations, and a semi-Restricted Boltzmann Machine which is a combination of Ising and RBM models. Model parameters were estimated in a fast and efficient manner using minimum probability flow, and log likelihoods were compared using annealed importance sampling. The higher-order models reveal localized activity patterns which reflect the laminar organization of neurons within a cortical column. The higher-order models also outperformed the Ising model in log-likelihood: On populations of 20 cells, the RBM had 10% higher log-likelihood (relative to an independent model) than a pairwise model, increasing to 45% gain in a larger network with 100 spatiotemporal elements, consisting of 10 neurons over 10 time steps. We further removed the need to model stimulus-induced correlations by incorporating a peri-stimulus time histogram term, in which case the higher order models continued to perform best. These results demonstrate the importance of higher-order interactions to describe the structure of correlated activity in cortical networks. Boltzmann Machines with hidden units provide a succinct and effective way to capture these dependencies without increasing the difficulty of model estimation and evaluation.