ABSTRACT
Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior ...probability density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing 12 photo-z algorithms applied to mock data produced for The Rubin Observatory Legacy Survey of Space and Time Dark Energy Science Collaboration. By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/underbreadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performance metrics.
ABSTRACT
Wide-area imaging surveys are one of the key ways of advancing our understanding of cosmology, galaxy formation physics, and the large-scale structure of the Universe in the coming years. ...These surveys typically require calculating redshifts for huge numbers (hundreds of millions to billions) of galaxies – almost all of which must be derived from photometry rather than spectroscopy. In this paper, we investigate how using statistical models to understand the populations that make up the colour–magnitude distribution of galaxies can be combined with machine learning photometric redshift codes to improve redshift estimates. In particular, we combine the use of Gaussian mixture models with the high-performing machine-learning photo-z algorithm GPz and show that modelling and accounting for the different colour–magnitude distributions of training and test data separately can give improved redshift estimates, reduce the bias on estimates by up to a half, and speed up the run-time of the algorithm. These methods are illustrated using data from deep optical and near-infrared data in two separate deep fields, where training and test data of different colour–magnitude distributions are constructed from the galaxies with known spectroscopic redshifts, derived from several heterogeneous surveys.
Wide-area imaging surveys are one of the key ways of advancing our understanding of cosmology, galaxy formation physics, and the large-scale structure of the Universe in the coming years. These ...surveys typically require calculating redshifts for huge numbers (hundreds of millions to billions) of galaxies - almost all of which must be derived from photometry rather than spectroscopy. In this paper we investigate how using statistical models to understand the populations that make up the colour-magnitude distribution of galaxies can be combined with machine learning photometric redshift codes to improve redshift estimates. In particular we combine the use of Gaussian Mixture Models with the high performing machine learning photo-z algorithm GPz and show that modelling and accounting for the different colour-magnitude distributions of training and test data separately can give improved redshift estimates, reduce the bias on estimates by up to a half, and speed up the run-time of the algorithm. These methods are illustrated using data from deep optical and near infrared data in two separate deep fields, where training and test data of different colour-magnitude distributions are constructed from the galaxies with known spectroscopic redshifts, derived from several heterogeneous surveys.
Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior probability ...density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing twelve photo-z algorithms applied to mock data produced for The Rubin Observatory Legacy Survey of Space and Time (LSST) Dark Energy Science Collaboration (DESC). By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/under-breadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate (CDE) loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performancemetrics.