A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-z). A plethora of methods have been developed, based either on template model fitting or on empirical explorations of the photometric parameter space. Machine-learning-based techniques are not explicitly dependent on physical priors and are able to produce accurate photo-z estimations within the photometric ranges derived from the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z probability density function (PDF), because the analytical relation mapping the photometric parameters onto the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine-learning model chosen to predict photo-z. We present a summary of results on SDSS-DR9 galaxy data, used also to perform a direct comparison with PDFs obtained by the LE PHARE spectral energy distribution template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.
ABSTRACT
We study the nuclear (AGN) activity in the local Universe (z < 0.33) and its correlation with the host galaxy properties, derived from a Sloan Digital Sky Survey sample with spectroscopic star-formation rate (SFR) and stellar mass determination. To quantify the level of AGN activity we used the XMM-Newton Serendipitous Source Catalogue. Applying multiwavelength selection criteria (optical BPT diagrams, X-ray/optical ratio, etc.), we found that 24 per cent of the detected sources are efficiently accreting AGN with moderate-to-high X-ray luminosity, twice as likely to be hosted by star-forming galaxies as by quiescent ones. The distribution of the specific Black Hole accretion rate (λsBHAR) shows that nuclear activity in local, non-AGN dominated galaxies peaks at very low accretion rates (−4 ≲ log λsBHAR ≲ −3) in all stellar mass ranges. We observe systematically larger values of λsBHAR for galaxies with active star formation than for quiescent ones, and an increase of the mean λsBHAR with SFR for both star-forming and quiescent galaxies. These findings confirm the decrease in AGN activity with cosmic time and are consistent with a scenario where both star formation and AGN activity are fuelled by a common gas reservoir.
The Multi Layer Perceptron with Quasi Newton Algorithm (MLPQNA) is a machine learning method that can be used to cope with regression and classification problems on complex and massive data sets. In this paper, we give a formal description of the method and present the results of its application to the evaluation of photometric redshifts for quasars. The data set used for the experiment was obtained by merging four different surveys (Sloan Digital Sky Survey, GALEX, UKIDSS, and WISE), thus covering a wide range of wavelengths from the UV to the mid-infrared. The method is able (1) to achieve a very high accuracy, (2) to drastically reduce the number of outliers and catastrophic objects, and (3) to discriminate among parameters (or features) on the basis of their significance, so that the number of features used for training and analysis can be optimized in order to reduce both the computational demands and the effects of degeneracy. The best experiment, which makes use of a selected combination of parameters drawn from the four surveys, leads, in terms of Δz_norm (i.e., (z_spec − z_phot)/(1 + z_spec)), to an average of Δz_norm = 0.004, a standard deviation of σ = 0.069, and a median absolute deviation of MAD = 0.02, over the whole redshift range (i.e., z_spec ≤ 3.6) defined by the four-survey cross-matched spectroscopic sample. The fraction of catastrophic outliers, i.e., of objects with photo-z deviating more than 2σ from the spectroscopic value, is <3%, leading to σ = 0.035 after their removal, over the same redshift range. The method is made available to the community through the DAMEWARE Web application.
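The statistics quoted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `photoz_stats` and its arguments are hypothetical, the MAD is assumed to be the median of |Δz_norm| (some works subtract the median first), and outliers are assumed to be objects with |Δz_norm| above `n_sigma` times the standard deviation.

```python
import numpy as np


def photoz_stats(z_spec, z_phot, n_sigma=2.0):
    """Summary statistics of the normalized photo-z residuals.

    Hypothetical helper for illustration only. Residuals follow the
    abstract's definition: (z_spec - z_phot) / (1 + z_spec).
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)

    # Normalized residual, as defined in the abstract.
    dz = (z_spec - z_phot) / (1.0 + z_spec)

    mean = dz.mean()
    sigma = dz.std()  # population standard deviation (ddof=0)

    # Assumption: MAD taken as the median of the absolute residuals.
    mad = np.median(np.abs(dz))

    # Assumption: an outlier deviates by more than n_sigma * sigma.
    is_outlier = np.abs(dz) > n_sigma * sigma
    kept = dz[~is_outlier]
    sigma_clean = kept.std() if kept.size else float("nan")

    return {
        "mean": float(mean),
        "sigma": float(sigma),
        "mad": float(mad),
        "outlier_fraction": float(is_outlier.mean()),
        "sigma_clean": float(sigma_clean),
    }
```

Calling `photoz_stats(z_spec, z_phot)` on two equal-length arrays returns a dictionary with the mean, σ, MAD, outlier fraction, and σ after outlier removal, which are the five quantities reported above.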
We discuss whether modern machine learning methods can be used to characterize the physical nature of the large number of objects sampled by modern multiband digital surveys. In particular, we applied the MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) method to the optical data of the Sloan Digital Sky Survey (SDSS) Data Release 10, investigating whether photometric data alone suffice to disentangle different classes of objects as they are defined in the SDSS spectroscopic classification. We discuss three groups of classification problems: (i) the simultaneous classification of galaxies, quasars and stars; (ii) the separation of stars from quasars; (iii) the separation of galaxies with normal spectral energy distribution from those with peculiar spectra, such as starburst or star-forming galaxies and AGN. While confirming the difficulty of disentangling AGN from normal galaxies on a photometric basis only, MLPQNA proved to be quite effective in the three-class separation. In disentangling quasars from stars and galaxies, our method achieved an overall efficiency of 91.31 per cent and a QSO class purity of ∼95 per cent. The resulting catalogue of candidate quasars/AGNs consists of ∼3.6 million objects, of which about half a million are also flagged as robust candidates, and will be made available through the CDS VizieR facility.
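The efficiency and purity figures quoted in this abstract can be illustrated with a short Python sketch. The abstract does not define them, so the conventional definitions are assumed here: overall efficiency as the fraction of all objects classified correctly, purity as the fraction of objects assigned to a class that truly belong to it, and completeness as the fraction of true members of a class that are recovered. All function names are hypothetical.

```python
import numpy as np


def confusion_matrix(true_labels, pred_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[index[t], index[p]] += 1
    return cm


def overall_efficiency(cm):
    """Assumed definition: correctly classified objects over all objects."""
    return float(np.trace(cm) / cm.sum())


def purity(cm, k):
    """Assumed definition: correct members of class k over all objects
    predicted as class k (column sum)."""
    predicted = cm[:, k].sum()
    return float(cm[k, k] / predicted) if predicted else float("nan")


def completeness(cm, k):
    """Assumed definition: correct members of class k over all true
    members of class k (row sum)."""
    actual = cm[k, :].sum()
    return float(cm[k, k] / actual) if actual else float("nan")
```

For a three-class problem one would build the matrix with `classes = ["galaxy", "qso", "star"]` and read the QSO purity from `purity(cm, 1)`.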
With the successful launch of eROSITA (extended Roentgen Survey with an Imaging Telescope Array) on 2019 July 13, we are facing the challenge of computing reliable photometric redshifts for 3 million active galactic nuclei (AGNs) over the entire sky, having available only patchy and inhomogeneous ancillary data. While we have a good understanding of the photo-z quality obtainable for AGN using spectral energy distribution (SED)-fitting techniques, we tested the capability of machine learning (ML), usually reliable in computing photo-z for QSO in wide and shallow areas with rich spectroscopic samples. Using MLPQNA as an example of ML, we computed photo-z for the X-ray-selected sources in Stripe 82X, using the publicly available photometric and spectroscopic catalogues. Stripe 82X is at least as deep as eROSITA will be and wide enough to also include rare and bright AGNs. In addition, the availability of ancillary data mimics what can be available in the whole sky. We found that when optical, and near- and mid-infrared data are available, ML and SED fitting perform comparably well in terms of overall accuracy, realistic redshift probability density functions, and fraction of outliers, although they are not the same for the two methods. The results could further improve if the available photometry is accurate and includes morphological information. Assuming that we can gather sufficient spectroscopy to build a representative training sample, with the current photometry coverage we can obtain reliable photo-z for a large fraction of sources in the Southern hemisphere well before the spectroscopic follow-up, thus enabling the eROSITA science return in a timely manner. The photo-z catalogue is released here.
Star formation rates (SFRs) are crucial to constrain theories of galaxy formation and evolution. SFRs are usually estimated via spectroscopic observations requiring large amounts of telescope time. We explore an alternative approach based on the photometric estimation of global SFRs for large samples of galaxies, by using methods such as automatic parameter space optimisation and supervised machine learning models. We demonstrate that, with such an approach, accurate multiband photometry allows reliable SFRs to be estimated. We also investigate how the use of photometric rather than spectroscopic redshifts affects the accuracy of derived global SFRs. Finally, we provide a publicly available catalogue of SFRs for more than 27 million galaxies extracted from the Sloan Digital Sky Survey Data Release 7. The catalogue will be made available through the VizieR facility.
Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior probability density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing 12 photo-z algorithms applied to mock data produced for The Rubin Observatory Legacy Survey of Space and Time Dark Energy Science Collaboration. By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/underbreadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performance metrics.
The exploitation of present and future synoptic (multiband and multi-epoch) surveys requires an extensive use of automatic methods for data processing and data interpretation. In this work, using data extracted from the Catalina Real Time Transient Survey (CRTS), we investigate the classification performance of some well-tested methods: Random Forest, MultiLayer Perceptron with Quasi Newton Algorithm and K-Nearest Neighbours, paying special attention to the feature selection phase. To this end, several classification experiments were performed, namely the identification of cataclysmic variables, the separation between galactic and extragalactic objects, and the identification of supernovae.