We present a modern machine learning (ML) approach for cluster dynamical mass measurements that is a factor-of-two improvement over using a conventional scaling relation. Different methods are tested against a mock cluster catalog constructed using massive halos from Multidark's publicly available N-body MDPL halo catalog. In the conventional method, we use a standard M(σv) power-law scaling relation to infer cluster mass, M, from line-of-sight (LOS) galaxy velocity dispersion, σv. The resulting fractional mass error distribution is broad, as measured by its 68% scatter width, and has extended high-error tails. The standard scaling relation can be simply enhanced by including higher-order moments of the LOS velocity distribution: applying the kurtosis as a correction term reduces the width of the error distribution by 16%. ML can be used to take full advantage of all the information in the velocity distribution. We employ the Support Distribution Machines (SDMs) algorithm, which learns from distributions of data to predict single values. SDMs trained and tested on the distribution of LOS velocities reduce the width by 47%. Furthermore, the problematic tails of the mass error distribution are effectively eliminated. Decreasing cluster mass errors will improve measurements of the growth of structure and lead to tighter constraints on cosmological parameters.
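The scaling-relation baseline described above can be sketched in a few lines. The sketch below computes the LOS velocity dispersion, the excess kurtosis used as a correction term, and a power-law mass estimate; the normalization, pivot, and slope are illustrative placeholders (a virial-type slope of roughly 3 is assumed), not the fitted values from the paper.

```python
import math

def los_velocity_dispersion(velocities):
    """Sample standard deviation of LOS galaxy velocities (km/s)."""
    n = len(velocities)
    mean = sum(velocities) / n
    return math.sqrt(sum((v - mean) ** 2 for v in velocities) / (n - 1))

def kurtosis(velocities):
    """Excess kurtosis of the LOS velocity distribution (Gaussian -> 0)."""
    n = len(velocities)
    mean = sum(velocities) / n
    m2 = sum((v - mean) ** 2 for v in velocities) / n
    m4 = sum((v - mean) ** 4 for v in velocities) / n
    return m4 / m2 ** 2 - 3.0

def powerlaw_mass(sigma_v, norm=1e15, sigma_pivot=1000.0, alpha=3.0):
    """M(sigma_v) = norm * (sigma_v / sigma_pivot)**alpha, in Msun.
    norm, sigma_pivot and alpha are placeholder values for illustration,
    not the calibrated parameters of the published scaling relation."""
    return norm * (sigma_v / sigma_pivot) ** alpha
```

For example, under these placeholder parameters a dispersion of 800 km/s maps to a mass of about 5.1 × 10^14 Msun; in practice the parameters would be fit to a simulated training catalog.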
We present a new approach for quantifying the abundance of galaxy clusters and constraining cosmological parameters using dynamical measurements. In the standard method, galaxy line-of-sight velocities, v, or velocity dispersions are used to infer cluster masses, M, to quantify the halo mass function (HMF), dn(M)/dlog(M), which is strongly affected by mass measurement errors. In our new method, the probability distributions of velocities for each cluster in the sample are summed to create a new statistic called the velocity distribution function (VDF), dn(v)/dv. The VDF can be measured more directly and precisely than the HMF and can be robustly predicted with cosmological simulations that capture the dynamics of subhalos or galaxies. We apply these two methods to realistic (ideal) mock cluster catalogs with (without) interlopers and forecast the bias and constraints on the matter density parameter Ωm and the amplitude of matter fluctuations σ8 in flat ΛCDM cosmologies. For an example observation of 200 massive clusters, the VDF with (without) interloping galaxies constrains the parameter combination σ8 Ωm^0.29(0.29) = 0.589 ± 0.014 (0.584 ± 0.011) and shows only minor bias. However, the HMF with interlopers is biased to low Ωm and high σ8, and the fiducial model lies well outside of the forecast constraints, prior to accounting for Eddington bias. When the VDF is combined with constraints from the cosmic microwave background, the degeneracy between cosmological parameters can be significantly reduced. Upcoming spectroscopic surveys that probe larger volumes and fainter magnitudes will provide clusters for applying the VDF as a cosmological probe.
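Conceptually, the VDF is built by pooling every cluster's LOS galaxy velocities into one binned number density per velocity interval. The sketch below assumes absolute LOS velocities and a per-unit-volume normalization; the published estimator's exact binning and normalization conventions may differ.

```python
def velocity_distribution_function(cluster_velocities, v_edges, volume):
    """Sum per-cluster LOS velocity distributions into dn(v)/dv.

    cluster_velocities: list of per-cluster lists of LOS velocities (km/s)
    v_edges: monotonically increasing bin edges (km/s)
    volume: survey volume (e.g. Mpc^3) used to normalize counts to a density

    Illustrative sketch: uses |v| and simple histogram counting.
    """
    nbins = len(v_edges) - 1
    counts = [0.0] * nbins
    for velocities in cluster_velocities:
        for v in velocities:
            s = abs(v)
            for i in range(nbins):
                if v_edges[i] <= s < v_edges[i + 1]:
                    counts[i] += 1.0
                    break
    widths = [v_edges[i + 1] - v_edges[i] for i in range(nbins)]
    return [c / (w * volume) for c, w in zip(counts, widths)]
```

Because galaxies are counted directly rather than first being converted into cluster masses, no per-cluster mass inference (and its associated error) enters the statistic.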
Machine learning has rapidly become a tool of choice for the astronomical community. It is being applied across a wide range of wavelengths and problems, from the classification of transients to neural network emulators of cosmological simulations, and is shifting paradigms about how we generate and report scientific results. At the same time, this class of method comes with its own set of best practices, challenges, and drawbacks, which, at present, are often reported on incompletely in the astrophysical literature. With this paper, we aim to provide a primer to the astronomical community, including authors, reviewers, and editors, on how to implement machine learning models and report their results in a way that ensures the accuracy of the results, reproducibility of the findings, and usefulness of the method.
We present a machine-learning approach for estimating galaxy cluster masses from Chandra mock images. We utilize a Convolutional Neural Network (CNN), a deep machine learning tool commonly used in image recognition tasks. The CNN is trained and tested on our sample of 7,896 Chandra X-ray mock observations, which are based on 329 massive clusters from the IllustrisTNG simulation. Our CNN learns from a low-resolution spatial distribution of photon counts and does not use spectral information. Despite this simplifying assumption, the resulting mass values estimated by the CNN exhibit small bias in comparison to the true masses of the simulated clusters (−0.02 dex) and reproduce the cluster masses with low intrinsic scatter: 8% in our best fold and 12% averaging over all folds. In contrast, a more standard core-excised luminosity method achieves 15–18% scatter. We interpret the results with an approach inspired by Google DeepDream and find that the CNN ignores the central regions of clusters, which are known to have high scatter with mass.
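The image-to-mass regression idea can be illustrated with a toy forward pass: a convolution over the photon-count map, a nonlinearity, global pooling, and an affine readout to a (log) mass. This is a hypothetical minimal sketch, not the paper's architecture; a real model would stack many trained layers in a deep learning framework, and the weights below are placeholders.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation: the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

def relu(x):
    """Elementwise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in x]

def global_avg_pool(x):
    """Collapse a feature map to a single scalar by averaging."""
    vals = [v for row in x for v in row]
    return sum(vals) / len(vals)

def predict_log_mass(image, kernel, weight, bias):
    """Toy CNN regressor: conv -> ReLU -> global pooling -> affine readout.
    kernel, weight and bias stand in for trained parameters (placeholder
    values here, not learned from data)."""
    return weight * global_avg_pool(relu(conv2d(image, kernel))) + bias
```

Training would adjust the kernel and readout parameters to minimize the squared error between predicted and true log masses over the mock-observation sample.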
The origin of the diverse population of galaxy clusters remains an unexplained aspect of large-scale structure formation and cluster evolution. We present a novel method of using X-ray images to identify cool core (CC), weak cool core (WCC), and non-cool core (NCC) clusters of galaxies that are defined by their central cooling times. We employ a convolutional neural network, ResNet-18, which is commonly used for image analysis, to classify clusters. We produce mock Chandra X-ray observations for a sample of 318 massive clusters drawn from the IllustrisTNG simulations. The network is trained and tested with low-resolution mock Chandra images covering a central 1 Mpc square for the clusters in our sample. Without any spectral information, the deep learning algorithm is able to identify CC, WCC, and NCC clusters, achieving balanced accuracies (BAcc) of 92%, 81%, and 83%, respectively. The performance is superior to classification by conventional methods using central gas densities, with an average BAcc = 81%, or surface brightness concentrations, giving BAcc = 73%. We use class activation mapping to localize discriminative regions for the classification decision. From this analysis, we observe that the network has utilized regions from cluster centers out to r ≈ 300 kpc and r ≈ 500 kpc to identify CC and NCC clusters, respectively. It may have recognized features in the intracluster medium that are associated with AGN feedback and disruptive major mergers.
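The conventional surface-brightness-concentration baseline mentioned above has a simple form: the ratio of flux inside a small central aperture to flux inside a larger one, with cuts separating CC, WCC, and NCC classes. The sketch below uses the commonly adopted 40 and 400 kpc apertures; the class thresholds are illustrative values in the spirit of published cuts, not the calibration used in the paper.

```python
def surface_brightness_concentration(radii, fluxes, r_in=40.0, r_out=400.0):
    """c = (flux within r_in) / (flux within r_out), radii in kpc.

    radii, fluxes: per-pixel (or per-photon) projected radius and flux.
    The 40/400 kpc apertures follow a widely used concentration definition.
    """
    inner = sum(f for r, f in zip(radii, fluxes) if r < r_in)
    outer = sum(f for r, f in zip(radii, fluxes) if r < r_out)
    return inner / outer if outer > 0 else 0.0

def classify_cool_core(c, cc_cut=0.155, wcc_cut=0.075):
    """Threshold the concentration into CC / WCC / NCC.
    cc_cut and wcc_cut are illustrative, not the paper's calibrated values."""
    if c > cc_cut:
        return "CC"
    if c > wcc_cut:
        return "WCC"
    return "NCC"
```

The CNN outperforms this baseline precisely because it is not restricted to a single fixed-aperture summary statistic of the image.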
We study dynamical mass measurements of galaxy clusters contaminated by interlopers and show that a modern machine learning (ML) algorithm can predict masses by better than a factor of two compared to a standard scaling relation approach. We create two mock catalogs from Multidark's publicly available \(N\)-body MDPL1 simulation, one with perfect galaxy cluster membership information and the other where a simple cylindrical cut around the cluster center allows interlopers to contaminate the clusters. In the standard approach, we use a power-law scaling relation to infer cluster mass from galaxy line-of-sight (LOS) velocity dispersion. Assuming perfect membership knowledge, this unrealistic case produces a wide fractional mass error distribution, with a width of \(\Delta\epsilon\approx0.87\). Interlopers introduce additional scatter, significantly widening the error distribution further (\(\Delta\epsilon\approx2.13\)). We employ the support distribution machine (SDM) class of algorithms to learn from distributions of data to predict single values. Applied to distributions of galaxy observables such as LOS velocity and projected distance from the cluster center, SDM yields better than a factor-of-two improvement (\(\Delta\epsilon\approx0.67\)) for the contaminated case. Remarkably, SDM applied to contaminated clusters is better able to recover masses than even the scaling relation approach applied to uncontaminated clusters. We show that the SDM method more accurately reproduces the cluster mass function, making it a valuable tool for employing cluster observations to evaluate cosmological models.
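SDM itself regresses on estimated divergences between full sample distributions inside a kernel machine; as a simplified, hypothetical stand-in, the sketch below captures the same distribution-to-scalar idea by summarizing each cluster's LOS velocity distribution with a few quantile features and regressing mass with k-nearest neighbours. It illustrates the concept only and is not the SDM algorithm.

```python
def quantile_features(velocities, qs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Summarize a cluster's |LOS velocity| distribution by a few quantiles.
    A crude stand-in for SDM's use of the full distribution."""
    s = sorted(abs(v) for v in velocities)
    n = len(s)
    return [s[min(int(q * n), n - 1)] for q in qs]

def knn_predict(train_feats, train_masses, feats, k=3):
    """Distribution-to-mass regression: average the masses of the k training
    clusters whose feature vectors lie nearest to the query cluster's."""
    dists = []
    for tf, m in zip(train_feats, train_masses):
        d = sum((a - b) ** 2 for a, b in zip(tf, feats))
        dists.append((d, m))
    dists.sort(key=lambda t: t[0])
    nearest = dists[:k]
    return sum(m for _, m in nearest) / len(nearest)
```

Like SDM, this mapping can absorb the systematic distortions interlopers impose on the velocity distribution, because contaminated training clusters teach the regressor what contaminated distributions of a given mass look like.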