A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-z). A plethora of methods has been developed, based either on ...template model fitting or on empirical explorations of the photometric parameter space. Machine-learning-based techniques are not explicitly dependent on physical priors and are able to produce accurate photo-z estimations within the photometric ranges derived from the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z probability density function (PDF), because the analytical relation mapping the photometric parameters on to the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine-learning model chosen to predict photo-z. We present a summary of results on SDSS-DR9 galaxy data, used also to perform a direct comparison with PDFs obtained by the LE PHARE spectral energy distribution template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.
ABSTRACT
With the launch of eROSITA (extended Roentgen Survey with an Imaging Telescope Array), which took place successfully on 2019 July 13, we are facing the challenge of computing reliable photometric ...redshifts for 3 million active galactic nuclei (AGNs) over the entire sky, having available only patchy and inhomogeneous ancillary data. While we have a good understanding of the photo-z quality obtainable for AGN using the spectral energy distribution (SED)-fitting technique, we tested the capability of machine learning (ML), usually reliable in computing photo-z for QSO in wide and shallow areas with rich spectroscopic samples. Using MLPQNA as an example of ML, we computed photo-z for the X-ray-selected sources in Stripe 82X, using the publicly available photometric and spectroscopic catalogues. Stripe 82X is at least as deep as eROSITA will be and wide enough to also include rare and bright AGNs. In addition, the availability of ancillary data mimics what can be available in the whole sky. We found that when optical, and near- and mid-infrared data are available, ML and SED fitting perform comparably well in terms of overall accuracy, realistic redshift probability density functions, and fraction of outliers, although they are not the same for the two methods. The results could further improve if the available photometry is accurate and includes morphological information. Assuming that we can gather sufficient spectroscopy to build a representative training sample, with the current photometry coverage we can obtain reliable photo-z for a large fraction of sources in the Southern hemisphere well before the spectroscopic follow-up, thus enabling a timely eROSITA science return. The photo-z catalogue is released here.
The Multi Layer Perceptron with Quasi Newton Algorithm (MLPQNA) is a machine learning method that can be used to cope with regression and classification problems on complex and massive data sets. In ...this paper, we give a formal description of the method and present the results of its application to the evaluation of photometric redshifts for quasars. The data set used for the experiment was obtained by merging four different surveys (Sloan Digital Sky Survey, GALEX, UKIDSS, and WISE), thus covering a wide range of wavelengths from the UV to the mid-infrared. The method is able (1) to achieve a very high accuracy, (2) to drastically reduce the number of outliers and catastrophic objects, and (3) to discriminate among parameters (or features) on the basis of their significance, so that the number of features used for training and analysis can be optimized in order to reduce both the computational demands and the effects of degeneracy. The best experiment, which makes use of a selected combination of parameters drawn from the four surveys, leads, in terms of $\Delta z_{\rm norm}$ (i.e., $(z_{\rm spec} - z_{\rm phot})/(1 + z_{\rm spec})$), to an average of $\Delta z_{\rm norm} = 0.004$, a standard deviation of $\sigma = 0.069$, and a median absolute deviation, MAD = 0.02, over the whole redshift range (i.e., $z_{\rm spec} \leq 3.6$), defined by the four-survey cross-matched spectroscopic sample. The fraction of catastrophic outliers, i.e., of objects with photo-z deviating more than $2\sigma$ from the spectroscopic value, is <3%, leading to $\sigma = 0.035$ after their removal, over the same redshift range. The method is made available to the community through the DAMEWARE Web application.
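The statistics quoted in this abstract (mean and standard deviation of the normalized residual, MAD, and the fraction of catastrophic outliers) can be illustrated with a minimal sketch. It is not the authors' code: the function name `photoz_stats`, the use of the population standard deviation, the MAD computed as the median absolute deviation about the median, and the outlier rule |Δz_norm| > 2σ are all assumptions made here for illustration.

```python
import numpy as np

def photoz_stats(z_spec, z_phot, n_sigma=2.0):
    """Summary statistics of the normalized photo-z residual.

    Hypothetical helper: the definitions below are assumptions,
    not taken from the paper's implementation.
    """
    z_spec = np.asarray(z_spec, dtype=float)
    z_phot = np.asarray(z_phot, dtype=float)

    # Normalized residual: (z_spec - z_phot) / (1 + z_spec)
    dz = (z_spec - z_phot) / (1.0 + z_spec)

    mean = dz.mean()
    sigma = dz.std()  # population standard deviation (assumed)

    # Median absolute deviation about the median (assumed definition)
    mad = np.median(np.abs(dz - np.median(dz)))

    # Catastrophic outliers: residual beyond n_sigma * sigma (assumed rule)
    is_outlier = np.abs(dz) > n_sigma * sigma
    kept = dz[~is_outlier]

    return {
        "mean": mean,
        "sigma": sigma,
        "mad": mad,
        "outlier_fraction": is_outlier.mean(),
        # Standard deviation after removing the outliers
        "sigma_clean": kept.std() if kept.size else float("nan"),
    }

if __name__ == "__main__":
    # Toy values only, not data from any survey
    z_spec = np.ones(10)
    z_phot = np.array([1.0] * 9 + [3.0])
    print(photoz_stats(z_spec, z_phot))
```

Returning a dictionary keeps the individual quantities addressable by name, so the same helper can be reused to compare runs with different feature sets.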
We discuss whether modern machine learning methods can be used to characterize the physical nature of the large number of objects sampled by the modern multiband digital surveys. In particular, we ...applied the MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) method to the optical data of the Sloan Digital Sky Survey (SDSS) Data Release 10, investigating whether photometric data alone suffice to disentangle different classes of objects as they are defined in the SDSS spectroscopic classification. We discuss three groups of classification problems: (i) the simultaneous classification of galaxies, quasars and stars; (ii) the separation of stars from quasars; (iii) the separation of galaxies with normal spectral energy distribution from those with peculiar spectra, such as starburst or star-forming galaxies and AGN. While confirming the difficulty of disentangling AGN from normal galaxies on a photometric basis only, MLPQNA proved to be quite effective in the three-class separation. In disentangling quasars from stars and galaxies, our method achieved an overall efficiency of 91.31 per cent and a QSO class purity of ∼95 per cent. The resulting catalogue of candidate quasars/AGNs consists of ∼3.6 million objects, of which about half a million are also flagged as robust candidates, and will be made available on the CDS VizieR facility.
Abstract
Star formation rates (SFRs) are crucial to constrain theories of galaxy formation and evolution. SFRs are usually estimated via spectroscopic observations requiring large amounts of ...telescope time. We explore an alternative approach based on the photometric estimation of global SFRs for large samples of galaxies, by using methods such as automatic parameter space optimisation and supervised machine learning models. We demonstrate that, with such an approach, accurate multiband photometry allows reliable SFRs to be estimated. We also investigate how the use of photometric rather than spectroscopic redshifts affects the accuracy of derived global SFRs. Finally, we provide a publicly available catalogue of SFRs for more than 27 million galaxies extracted from the Sloan Digital Sky Survey Data Release 7. The catalogue will be made available through the VizieR facility.
ABSTRACT
Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior ...probability density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing 12 photo-z algorithms applied to mock data produced for The Rubin Observatory Legacy Survey of Space and Time Dark Energy Science Collaboration. By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/underbreadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performance metrics.
Abstract
Hi-GAL (Herschel InfraRed Galactic Plane Survey) is a large-scale survey of the Galactic plane, performed with Herschel in five infrared continuum bands between 70 and 500 μm. We present a ...band-merged catalogue of spatially matched sources and their properties derived from fits to the spectral energy distributions (SEDs) and heliocentric distances, based on the photometric catalogues presented in Molinari et al., covering the portion of Galactic plane $-71.0^{\circ} < \ell < 67.0^{\circ}$. The band-merged catalogue contains 100 922 sources with a regular SED, 24 584 of which show a 70-μm counterpart and are thus considered protostellar, while the remainder are considered starless. Thanks to this huge number of sources, we are able to carry out a preliminary analysis of early stages of star formation, identifying the conditions that characterize different evolutionary phases on a statistically significant basis. We calculate surface densities to investigate the gravitational stability of clumps and their potential to form massive stars. We also explore evolutionary status metrics such as the dust temperature, luminosity and bolometric temperature, finding that these are higher in protostellar sources compared to pre-stellar ones. The surface density of sources follows an increasing trend as they evolve from pre-stellar to protostellar, but then it is found to decrease again in the majority of the most evolved clumps. Finally, we study the physical parameters of sources with respect to Galactic longitude and the association with spiral arms, finding only minor or no differences between the average evolutionary status of sources in the fourth and first Galactic quadrants, or between ‘on-arm’ and ‘interarm’ positions.
We present a catalog of quasars selected from broad-band photometric ugri data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are identified by the random forest (RF) supervised ...machine learning model, trained on Sloan Digital Sky Survey (SDSS) DR14 spectroscopic data. We first cleaned the input KiDS data of entries with excessively noisy, missing or otherwise problematic measurements. Applying a feature importance analysis, we then tune the algorithm and identify in the KiDS multiband catalog the 17 most useful features for the classification, namely magnitudes, colors, magnitude ratios, and the stellarity index. We used the t-SNE algorithm to map the multidimensional photometric data onto 2D planes and compare the coverage of the training and inference sets. We limited the inference set to r < 22 to avoid extrapolation beyond the feature space covered by training, as the SDSS spectroscopic sample is considerably shallower than KiDS. This gives 3.4 million objects in the final inference sample, from which the random forest identified 190 000 quasar candidates. Accuracy of 97% (percentage of correctly classified objects), purity of 91% (percentage of true quasars within the objects classified as such), and completeness of 87% (detection ratio of all true quasars), as derived from a test set extracted from SDSS and not used in the training, are confirmed by comparison with external spectroscopic and photometric QSO catalogs overlapping with the KiDS footprint. The robustness of our results is strengthened by number counts of the quasar candidates in the r band, as well as by their mid-infrared colors available from the Wide-field Infrared Survey Explorer (WISE). An analysis of parallaxes and proper motions of our QSO candidates found also in Gaia DR2 suggests that a probability cut of pQSO > 0.8 is optimal for purity, whereas pQSO > 0.7 is preferable for better completeness. 
Our study presents the first comprehensive quasar selection from deep high-quality KiDS data and will serve as the basis for versatile studies of the QSO population detected by this survey.
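The accuracy, purity and completeness figures defined in this abstract, together with the probability cut on pQSO, can be sketched as follows. This is an illustrative simplification, not the survey pipeline: it treats the problem as binary (quasar versus everything else), and the names `classification_scores` and `select_candidates` are hypothetical.

```python
import numpy as np

def classification_scores(is_qso_true, is_qso_pred):
    """Accuracy, purity and completeness for a binary QSO classification.

    Hypothetical helper; the binary setup is an assumption made
    for illustration.
    """
    truth = np.asarray(is_qso_true, dtype=bool)
    pred = np.asarray(is_qso_pred, dtype=bool)

    tp = np.sum(truth & pred)    # true quasars classified as quasars
    fp = np.sum(~truth & pred)   # non-quasars classified as quasars
    fn = np.sum(truth & ~pred)   # true quasars that were missed

    nan = float("nan")
    return {
        # Fraction of correctly classified objects
        "accuracy": float(np.mean(truth == pred)),
        # Fraction of true quasars among objects classified as such
        "purity": float(tp / (tp + fp)) if (tp + fp) else nan,
        # Fraction of all true quasars that were recovered
        "completeness": float(tp / (tp + fn)) if (tp + fn) else nan,
    }

def select_candidates(p_qso, threshold=0.8):
    """Boolean mask of objects whose QSO probability exceeds the cut."""
    return np.asarray(p_qso, dtype=float) > threshold

if __name__ == "__main__":
    # Toy labels and probabilities, not catalogue data
    truth = [True, True, True, False]
    p_qso = [0.95, 0.85, 0.75, 0.10]
    for cut in (0.8, 0.7):
        print(cut, classification_scores(truth, select_candidates(p_qso, cut)))
```

Raising the threshold generally trades completeness for purity, which is the trade-off behind the two cuts discussed in the abstract.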
ABSTRACT
The recent data collected by Herschel have confirmed that interstellar structures with a filamentary shape are ubiquitously present in the Milky Way. Filaments are thought to be formed by ...several physical mechanisms acting from large Galactic scales down to subparsec fractions of molecular clouds, and they might represent a possible link between star formation and the large-scale structure of the Galaxy. In order to study this potential link, a statistically significant sample of filaments spread throughout the Galaxy is required. In this work, we present the first catalogue of 32 059 candidate filaments automatically identified in the Herschel Infrared Galactic plane Survey (Hi-GAL) of the entire Galactic plane. For these objects, we determined morphological (length la and geometrical shape) and physical (average column density $N_{\rm H_{2}}$ and average temperature T) properties. We identified filaments with a wide range of properties: 2 ≤ la ≤ 100 arcmin, $10^{20} \le N_{\rm H_{2}} \le 10^{23}$ cm$^{-2}$ and 10 ≤ T ≤ 35 K. We discuss their association with the Hi-GAL compact sources, finding that the most tenuous (and stable) structures do not host any major condensation. We also assign a distance to ∼18 400 filaments, for which we determine mass, physical size, stability conditions and Galactic distribution. When compared with the spiral arms structure, we find no significant difference between the physical properties of on-arm and inter-arm filaments. We compare our sample with previous studies, finding that our Hi-GAL filament catalogue represents a significant extension in terms of Galactic coverage and sensitivity. This catalogue represents a unique and important tool for future studies devoted to understanding the filament life-cycle.