Abstract
The accurate estimation of photometric redshifts is crucial to many upcoming galaxy surveys, for example, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). Almost all Rubin extragalactic and cosmological science requires accurate and precise calculation of photometric redshifts; many diverse approaches to this problem are currently being developed, validated, and tested. In this work, we use the photometric redshift code GPz to examine two realistically complex training-set imperfection scenarios for machine-learning-based photometric redshift calculation: (i) where the spectroscopic training set has a very different distribution in color–magnitude space to the test set, and (ii) where emission-line confusion causes a fraction of the training spectroscopic sample to have incorrect redshifts. By evaluating the sensitivity of GPz to a range of increasingly severe imperfections, with a range of metrics (for both photo-z point estimates and posterior probability distribution functions, PDFs), we quantify the degree to which predictions get worse with higher degrees of degradation. In particular, we find that there is a substantial drop-off in photo-z quality when line confusion goes above ∼1%, and when sample incompleteness occurs below a redshift of 1.5, for an experimental setup using data from the Buzzard Flock synthetic sky catalogs.
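To make the emission-line confusion scenario concrete, here is a minimal Python sketch of how a misidentified line shifts a spectroscopic redshift, and how a chosen fraction of a training sample could be contaminated this way. It is an illustration only, not the procedure used with GPz: the function names, the choice of lines (Hα mistaken for [O II]), and the random selection scheme are all assumptions. The only physics used is that the observed wavelength satisfies λ_obs = λ_true (1 + z_true) = λ_assumed (1 + z_wrong).

```python
import random

# Approximate rest-frame wavelengths in angstroms.
OII = 3727.0     # [O II] doublet (blended)
HALPHA = 6563.0  # H-alpha


def confused_redshift(z_true, lam_true, lam_assumed):
    """Redshift inferred when a line emitted at lam_true is read as lam_assumed."""
    lam_obs = lam_true * (1.0 + z_true)  # where the real line lands in the spectrum
    return lam_obs / lam_assumed - 1.0


def inject_line_confusion(z_spec, fraction, lam_true, lam_assumed, seed=0):
    """Return a copy of z_spec with roughly `fraction` of entries line-confused.

    Hypothetical helper: each galaxy is contaminated independently with
    probability `fraction`. Unphysical (negative) results are left uncorrupted.
    """
    rng = random.Random(seed)
    out = []
    for z in z_spec:
        if rng.random() < fraction:
            z_bad = confused_redshift(z, lam_true, lam_assumed)
            out.append(z_bad if z_bad >= 0.0 else z)
        else:
            out.append(z)
    return out


if __name__ == "__main__":
    z_train = [0.1, 0.2, 0.4, 0.8]
    # Contaminate about 1% of the sample: H-alpha mistaken for [O II].
    print(inject_line_confusion(z_train, 0.01, HALPHA, OII))
```

Because the confused redshift is a deterministic function of the true one, even a small contaminated fraction places training galaxies at systematically wrong redshifts rather than adding symmetric noise.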
Estimating redshifts from broadband photometry is often limited by how accurately we can map the colors of galaxies to an underlying spectral template. Current techniques utilize spectrophotometric samples of galaxies or spectra derived from spectral synthesis models. Both of these approaches have their limitations: either the sample sizes are small and often not representative of the diversity of galaxy colors, or the model colors can be biased (often as a function of wavelength), which introduces systematics in the derived redshifts. In this paper, we learn the underlying spectral energy distributions from an ensemble of ∼100,000 galaxies with measured redshifts and colors. We show that, for a sample of 20 spectral templates, we can reconstruct emission and absorption lines at a significantly higher resolution than the broadband filters used to measure the photometry. We find that our training algorithm reduces the fraction of outliers in the derived photometric redshifts by up to 28%, bias by up to 91%, and scatter by up to 25%, when compared to estimates using a standard set of spectral templates. We discuss the current limitations of this approach and its applicability for recovering the underlying properties of galaxies. Our derived templates and the code used to produce these results are publicly available in a dedicated GitHub repository: https://github.com/dirac-institute/photoz_template_learning.
Abstract
The Vera C. Rubin Observatory will, over a period of 10 yr, repeatedly survey the southern sky. To ensure that images generated by Rubin meet the quality requirements for precision science, the observatory will use an active-optics system (AOS) to correct for alignment and mirror surface perturbations introduced by gravity and temperature gradients in the optical system. To accomplish this, Rubin will use out-of-focus images from sensors located at the edge of the focal plane to learn and correct for perturbations to the wave front. We have designed and integrated a deep-learning (DL) model for wave-front estimation into the AOS pipeline. In this paper, we compare the performance of this DL approach to Rubin’s baseline algorithm when applied to images from two different simulations of the Rubin optical system. We show the DL approach is faster and more accurate, achieving the atmospheric error floor both for high-quality images and low-quality images with heavy blending and vignetting. Compared to the baseline algorithm, the DL model is 40× faster, the median error 2× better under ideal conditions, 5× better in the presence of vignetting by the Rubin camera, and 14× better in the presence of blending in crowded fields. In addition, the DL model surpasses the required optical quality in simulations of the AOS closed loop. This system promises to increase the survey area useful for precision science by up to 8%. We discuss how this system might be deployed when commissioning and operating Rubin.
Abstract
Large imaging surveys will rely on photometric redshifts (photo-z's), which are typically estimated through machine-learning methods. Currently planned spectroscopic surveys will not be deep enough to produce a representative training sample for the Legacy Survey of Space and Time (LSST), so we seek methods to improve the photo-z estimates that arise from nonrepresentative training samples. Spectroscopic training samples for photo-z's are biased toward redder, brighter galaxies, which also tend to be at lower redshift than the typical galaxy observed by LSST, leading to poor photo-z estimates with outlier fractions nearly 4 times larger than for a representative training sample. In this Letter, we apply the concept of training sample augmentation, where we augment simulated nonrepresentative training samples with simulated galaxies possessing otherwise unrepresented features. When we select simulated galaxies with (g − z) color, i-band magnitude, and redshift outside the range of the original training sample, we are able to reduce the outlier fraction of the photo-z estimates for simulated LSST data by nearly 50% and the normalized median absolute deviation (NMAD) by 56%. When compared to a fully representative training sample, augmentation can recover nearly 70% of the degradation in the outlier fraction and 80% of the degradation in NMAD. Training sample augmentation is a simple and effective way to improve training samples for photo-z's without requiring additional spectroscopic samples.
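The two point-estimate metrics quoted above can be computed as in the following Python sketch. It uses conventions that are common in the photo-z literature but are assumptions here, since the abstract does not state its exact definitions: residuals are normalized as Δz = (z_phot − z_spec)/(1 + z_spec), NMAD is 1.4826 times the median absolute deviation of Δz, and an outlier is a galaxy with |Δz| above a fixed cut (0.15 by default). The function name is hypothetical.

```python
from statistics import median


def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Bias, NMAD, and outlier fraction of photo-z point estimates.

    Assumed conventions (not taken from the abstract):
      dz = (z_phot - z_spec) / (1 + z_spec)
      NMAD = 1.4826 * median(|dz - median(dz)|)
      outlier: |dz| > outlier_cut
    """
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    med = median(dz)
    nmad = 1.4826 * median(abs(d - med) for d in dz)
    outlier_fraction = sum(abs(d) > outlier_cut for d in dz) / len(dz)
    return {"bias": med, "nmad": nmad, "outlier_fraction": outlier_fraction}


if __name__ == "__main__":
    z_spec = [0.5, 1.0, 1.5, 2.0]
    z_phot = [0.5, 1.0, 1.5, 3.0]  # one catastrophic failure
    print(photoz_metrics(z_phot, z_spec))
```

NMAD is built from medians, so a few catastrophic failures barely move it, which is why it is usually reported alongside a separate outlier fraction.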
The Vera C. Rubin Observatory is a unique facility for survey astronomy that will soon be commissioned and begin operations. Crucial to many of its scientific goals is the achievement of sustained high image quality, limited only by the seeing at the site. This will be maintained through an Active Optics System (AOS) that controls optical element misalignments and corrects mirror figure error to minimize aberrations caused by both thermal and gravitational distortions. However, the large number of adjustment degrees of freedom available on the Rubin Observatory introduces a range of degeneracies, including many that are noise-induced due to imperfect measurement of the wavefront errors. We present a structured methodology for identifying these degeneracies through an analysis of image noise level. We also present a novel scaling strategy based on Truncated Singular Value Decomposition (TSVD) that mitigates the degeneracy, and optimally distributes the adjustment over the available degrees of freedom. Our approach ensures the attainment of optimal image quality, while avoiding excursions around the noise-induced subspace of degeneracies, marking a significant improvement over the previous techniques adopted for Rubin, which were based on an Optimal Integral Controller (OIC). This new approach is likely to also yield significant benefits for all telescopes that incorporate large numbers of degrees of freedom of adjustment.
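As background for the TSVD-based strategy mentioned above, the following sketch shows a plain truncated-SVD least-squares solve with NumPy. It is the generic textbook technique, not the scaling strategy or degeneracy analysis of the paper. The sensitivity matrix `A` (mapping adjustment degrees of freedom to measured wavefront terms), the measurement vector `y`, and the function name are hypothetical.

```python
import numpy as np


def tsvd_solve(A, y, rank):
    """Minimum-norm least-squares solution of A x = y using only the
    `rank` largest singular values of A.

    Directions associated with the discarded (small) singular values get
    zero adjustment, so measurement noise is not amplified along them.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(rank, len(s))
    coeffs = (U[:, :k].T @ y) / s[:k]  # project y, then invert the kept singular values
    return Vt[:k].T @ coeffs


if __name__ == "__main__":
    # Toy system: the second degree of freedom is almost unobservable.
    A = np.diag([1.0, 1e-6])
    y = np.array([1.0, 5e-6])
    print(tsvd_solve(A, y, rank=2))  # full inversion moves both degrees of freedom
    print(tsvd_solve(A, y, rank=1))  # truncation leaves the weak direction untouched
```

In the toy system a measurement of only 5e-6 in the weakly observed direction demands an adjustment of 5 under full inversion, so noise of that size would drive large, spurious motions. Truncating at rank 1 suppresses them.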
Evaluating the accuracy and calibration of the redshift posteriors produced by photometric redshift (photo-z) estimators is vital for enabling precision cosmology and extragalactic astrophysics with modern wide-field photometric surveys. Evaluating photo-z posteriors on a per-galaxy basis is difficult, however, as real galaxies have a true redshift but not a true redshift posterior. We introduce PZFlow, a Python package for the probabilistic forward modeling of galaxy catalogs with normalizing flows. For catalogs simulated with PZFlow, there is a natural notion of "true" redshift posteriors that can be used for photo-z validation. We use PZFlow to simulate a photometric galaxy catalog where each galaxy has a redshift, noisy photometry, shape information, and a true redshift posterior. We also demonstrate the use of an ensemble of normalizing flows for photo-z estimation. We discuss how PZFlow will be used to validate the photo-z estimation pipeline of the Dark Energy Science Collaboration (DESC), and the wider applicability of PZFlow for statistical modeling of any tabular data.