Abstract
We introduce the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker, an astronomical alert broker designed to provide a rapid and self-consistent classification of large etendue telescope alert streams, such as that provided by the Zwicky Transient Facility (ZTF) and, in the future, the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST). ALeRCE is a Chilean-led broker run by an interdisciplinary team of astronomers and engineers working to become intermediaries between survey and follow-up facilities. ALeRCE uses a pipeline that includes the real-time ingestion, aggregation, cross-matching, machine-learning (ML) classification, and visualization of the ZTF alert stream. We use two classifiers: a stamp-based classifier, designed for rapid classification, and a light curve–based classifier, which uses the multiband flux evolution to achieve a more refined classification. We describe in detail our pipeline, data products, tools, and services, which are made public for the community (see https://alerce.science). Since we began operating our real-time ML classification of the ZTF alert stream in early 2019, we have grown a large community of active users around the globe. We describe our results to date, including the real-time processing of 1.5 × 10⁸ alerts, the stamp classification of 3.4 × 10⁷ objects, the light-curve classification of 1.1 × 10⁶ objects, the report of 6162 supernova candidates, and different experiments using LSST-like alert streams. Finally, we discuss the challenges ahead in going from a single stream of alerts such as ZTF to a multistream ecosystem dominated by LSST.
ABSTRACT
We present the first results of the High Cadence Transient Survey (HiTS), a survey whose objective is to detect and follow up optical transients with characteristic timescales from hours to days, especially the earliest hours of supernova (SN) explosions. HiTS uses the Dark Energy Camera and a custom pipeline for image subtraction, candidate filtering, and candidate visualization, which runs in real time so that we can react rapidly to new transients. We discuss the survey design, the technical challenges associated with the real-time analysis of these large volumes of data, and our first results. In our 2013, 2014, and 2015 campaigns, we detected more than 120 young SN candidates, but we did not find a clear signature from the short-lived SN shock breakouts (SBOs) originating after the core collapse of red supergiant stars, which was the initial science aim of this survey. Using the empirical distribution of limiting magnitudes from our observational campaigns, we measured the expected recovery fraction of randomly injected SN light curves, which included SBO optical peaks produced with models from Tominaga et al. (2011) and Nakar & Sari (2010). From this analysis, we cannot rule out the models from Tominaga et al. (2011) under any reasonable distributions of progenitor masses, but we can marginally rule out the brighter and longer-lived SBO models from Nakar & Sari (2010) under our best-guess distribution of progenitor masses. Finally, we highlight the implications of this work for future massive data sets produced by astronomical observatories, such as LSST.
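The recovery-fraction estimate mentioned in this abstract can be illustrated with a small Monte Carlo sketch. This is not the HiTS pipeline: the function names, the toy light curve, the made-up limiting magnitudes, and the requirement of at least two detections are all assumptions chosen for illustration.

```python
import random


def recovery_fraction(light_curve, limiting_mags, n_epochs,
                      n_trials=1000, min_detections=2, seed=0):
    """Fraction of randomly injected light curves that would be recovered.

    light_curve: function mapping (epoch - explosion epoch) to a magnitude
        (a hypothetical model, not one of the published SBO models).
    limiting_mags: empirical sample of limiting magnitudes; one value is
        drawn at random for every epoch.
    A point counts as detected when it is brighter (numerically smaller
    magnitude) than the limiting magnitude drawn for that epoch.
    """
    rng = random.Random(seed)
    recovered = 0
    for _ in range(n_trials):
        # Inject the event at a random epoch of the campaign.
        t_explosion = rng.randrange(n_epochs)
        detections = 0
        for epoch in range(n_epochs):
            mag = light_curve(epoch - t_explosion)
            limit = rng.choice(limiting_mags)
            if mag < limit:
                detections += 1
        # Assumed recovery criterion: a minimum number of detections.
        if detections >= min_detections:
            recovered += 1
    return recovered / n_trials


def toy_light_curve(t):
    """Hypothetical light curve: invisible before explosion, then fading."""
    if t < 0:
        return 99.0
    return 22.0 + 0.5 * t


limits = [23.0, 23.5, 24.0, 24.5]  # made-up limiting magnitudes
fraction = recovery_fraction(toy_light_curve, limits, n_epochs=10)
```

Replacing `toy_light_curve` with a physically motivated model and `limits` with the measured limiting magnitudes would give the kind of comparison the abstract describes, under whatever detection criterion the survey actually adopts.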
Abstract
We present the first version of the Automatic Learning for the Rapid Classification of Events (ALeRCE) broker light curve classifier. ALeRCE is currently processing the Zwicky Transient Facility (ZTF) alert stream, in preparation for the Vera C. Rubin Observatory. The ALeRCE light curve classifier uses variability features computed from the ZTF alert stream and colors obtained from AllWISE and ZTF photometry. We apply a balanced random forest algorithm with a two-level scheme where the top level classifies each source as periodic, stochastic, or transient, and the bottom level further resolves each of these hierarchical classes among 15 total classes. This classifier corresponds to the first attempt to classify multiple classes of stochastic variables (including core- and host-dominated active galactic nuclei, blazars, young stellar objects, and cataclysmic variables) in addition to different classes of periodic and transient sources, using real data. We created a labeled set using various public catalogs (such as the Catalina Surveys and Gaia DR2 variable stars catalogs, and the Million Quasars catalog), and we classify all objects with ≥6 g-band or ≥6 r-band detections in ZTF (868,371 sources as of 2020 June 9), providing updated classifications for sources with new alerts every day. For the top level we obtain macro-averaged precision and recall scores of 0.96 and 0.99, respectively, and for the bottom level we obtain macro-averaged precision and recall scores of 0.57 and 0.76, respectively. Updated classifications from the light curve classifier can be found at the ALeRCE Explorer website (http://alerce.online).
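A two-level scheme like the one described here can be sketched as follows. The abstract does not say how the two levels are combined; this sketch assumes the final probability of a subclass is the top-level probability multiplied by the bottom-level probability within that branch. The function name, the probabilities, and the placeholder subclass names are hypothetical.

```python
def combine_two_level(top_probs, bottom_probs):
    """Combine top-level and bottom-level probabilities by multiplication.

    top_probs: {top_class: P(top_class)}, summing to 1.
    bottom_probs: {top_class: {subclass: P(subclass | top_class)}}, where
        each inner dictionary sums to 1.
    Returns {subclass: P(top_class) * P(subclass | top_class)}.
    """
    final = {}
    for top_class, p_top in top_probs.items():
        for subclass, p_sub in bottom_probs[top_class].items():
            final[subclass] = p_top * p_sub
    return final


# Made-up probabilities for a single source; subclass names are placeholders.
top = {"periodic": 0.7, "stochastic": 0.2, "transient": 0.1}
bottom = {
    "periodic": {"periodic_A": 0.6, "periodic_B": 0.4},
    "stochastic": {"blazar": 0.5, "YSO": 0.5},
    "transient": {"transient_A": 0.9, "transient_B": 0.1},
}

final = combine_two_level(top, bottom)
best = max(final, key=final.get)  # subclass with the highest combined probability
```

Because every branch is normalized, the combined probabilities also sum to one, so they can be compared directly across branches.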
ABSTRACT
Machine learning has achieved an important role in the automatic classification of variable stars, and several classifiers have been proposed over the last decade. These classifiers have achieved impressive performance in several astronomical catalogues. However, some scientific articles have also shown that the training data therein contain multiple sources of bias. Hence, the performance of those classifiers on objects not belonging to the training data is uncertain, potentially resulting in the selection of incorrect models. It can also lead to the deployment of misleading classifiers, for example through the creation of open-source labelled catalogues with biased predictions. In this paper, we develop a method based on an informative marginal likelihood to evaluate variable star classifiers. We collect deterministic rules that are based on physical descriptors of RR Lyrae stars, and then, to mitigate the biases, we introduce those rules into the marginal likelihood estimation. We perform experiments with a set of Bayesian logistic regressions, which are trained to classify RR Lyraes, and we find that our method outperforms traditional non-informative cross-validation strategies, even when penalized models are assessed. Our methodology provides a more rigorous alternative to assess machine learning models using astronomical knowledge. From this approach, applications to other classes of variable stars and algorithmic improvements can be developed.
Context. In the last six years, the VISTA Variable in the Vía Láctea (VVV) survey mapped 562 sq. deg. across the bulge and southern disk of the Galaxy. However, a detailed study of these regions, which include ~36 globular clusters (GCs) and thousands of open clusters, is by no means easy. High differential reddening and severe crowding along the line of sight make it very difficult to reliably distinguish stars belonging to different populations and/or systems. Aims. The aim of this study is to separate stars that likely belong to the Galactic GC NGC 6544 from its surrounding field by means of proper motion (PM) techniques. Methods. This work was based upon a new astrometric reduction method optimized for images of the VVV survey. Results. PSF-fitting photometry over the six-year baseline of the survey allowed us to obtain a mean precision of ~0.51 mas yr⁻¹, in each PM coordinate, for stars with Ks < 15 mag. In the area studied here, cluster stars separate very well from field stars, down to the main sequence turnoff and below, allowing us to derive for the first time the absolute PM of NGC 6544. Isochrone fitting on the clean and differential reddening corrected cluster color magnitude diagram yields an age of ~11−13 Gyr, and metallicity [Fe/H] = −1.5 dex, in agreement with previous studies restricted to the cluster core. We were able to derive the cluster orbit assuming an axisymmetric model of the Galaxy and conclude that NGC 6544 is likely a halo GC. We have not detected tidal tail signatures associated with the cluster, but a remarkable elongation in the direction of the Galactic center has been found. The precision achieved in the PM determination also allows us to separate bulge stars from foreground disk stars, enabling the kinematical selection of bona fide bulge stars across the whole survey area. Conclusions. Kinematical techniques are a fundamental step toward disentangling different stellar populations that overlap in a studied field. Our results show that VVV data are perfectly suitable for this kind of analysis.
Abstract
We report the observations of solar system objects during the 2015 campaign of the High cadence Transient Survey (HiTS). We found 5740 bodies (mostly Main Belt asteroids), 1203 of which were detected on different nights and in both g′ and r′. Objects were linked in the barycenter system and their orbital parameters were computed assuming Keplerian motion. We identified 6 near Earth objects, 1738 Main Belt asteroids and 4 Trans-Neptunian objects. We did not find a g′−r′ color–size correlation for 14 < H_g′ < 18 (1 < D < 10 km) asteroids. We show asteroids’ colors are disturbed by HiTS’ 1.6 hr cadence and estimate that observations should be separated by at most 14 minutes to avoid confusion in future wide-field surveys like LSST. The size distribution for the Main Belt objects can be characterized as a simple power law with slope ∼0.9, steeper than in any other survey, while data from the 2014 HiTS campaign have a distribution consistent with previous ones (slopes ∼0.68 at the bright end and ∼0.34 at the faint end). This difference is likely due to the ecliptic distribution of the Main Belt, since the 2015 campaign surveyed farther from the ecliptic than did the 2014 campaign and most previous surveys.
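As an illustration of what a power-law slope of this kind means, the sketch below fits a straight line to the logarithm of object counts against absolute magnitude. The abstract does not define the slope precisely; the sketch assumes it is the slope of log10(N) versus H. The function name and the synthetic counts are hypothetical and are not HiTS data.

```python
import math


def fit_log_slope(h_values, counts):
    """Least-squares slope of log10(count) against absolute magnitude H."""
    ys = [math.log10(c) for c in counts]
    n = len(h_values)
    mean_x = sum(h_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in h_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(h_values, ys))
    return sxy / sxx


# Synthetic counts generated from an exact power law with slope 0.9.
h = [14.0, 15.0, 16.0, 17.0, 18.0]
counts = [10 ** (0.9 * (x - 14.0) + 1.0) for x in h]

slope = fit_log_slope(h, counts)
```

On real data the counts would need to be corrected for detection efficiency before fitting, which this sketch does not attempt.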
We report on the serendipitous observations of solar system objects imaged during the High cadence Transient Survey 2014 observation campaign. Data from this high-cadence wide-field survey were originally analyzed to find variable static sources, using machine learning to select the most likely candidates. In this work, we search for moving transients consistent with solar system objects and derive their orbital parameters. We use a simple, custom motion detection algorithm to link trajectories and assume Keplerian motion to derive each asteroid's orbital parameters. We use known asteroids from the Minor Planet Center database to assess the detection efficiency of the survey and our search algorithm. Trajectories have an average of nine detections spread over two days, and our fit yields typical errors of , e ∼ 0.07 and i ∼ 0.5° in semimajor axis, eccentricity, and inclination, respectively, for known asteroids in our sample. We extract 7700 orbits from our trajectories, identifying 19 near-Earth objects, 6687 asteroids, 14 Centaurs, and 15 trans-Neptunian objects. This highlights the complementarity of supernova wide-field surveys for solar system research and the significance of machine learning to clean data of false detections. It is a good example of the data-driven science that the Large Synoptic Survey Telescope will deliver.
The advent of next-generation survey instruments, such as the Vera C. Rubin Observatory and its Legacy Survey of Space and Time (LSST), is opening a window for new research in time-domain astronomy. The Extended LSST Astronomical Time-Series Classification Challenge (ELAsTiCC) was created to test the capacity of brokers to deal with a simulated LSST stream. Our aim is to develop a next-generation model for the classification of variable astronomical objects. We describe ATAT, the Astronomical Transformer for time series And Tabular data, a classification model conceived by the ALeRCE alert broker to classify light curves from next-generation alert streams. ATAT was tested in production during the first round of the ELAsTiCC campaigns. ATAT consists of two transformer models that encode light curves and features using novel time modulation and quantile feature tokenizer mechanisms, respectively. ATAT was trained on different combinations of light curves, metadata, and features calculated over the light curves. We compare ATAT against the current ALeRCE classifier, a balanced hierarchical random forest (BHRF) trained on human-engineered features derived from light curves and metadata. When trained on light curves and metadata, ATAT achieves a macro F1 score of 82.9 ± 0.4 in 20 classes, outperforming the BHRF model trained on 429 features, which achieves a macro F1 score of 79.4. The use of transformer multimodal architectures, combining light curves and tabular data, opens new possibilities for classifying alerts from a new generation of large etendue telescopes, such as the Vera C. Rubin Observatory, in real-world brokering scenarios.
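For readers unfamiliar with the metric, the macro F1 score is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch using made-up labels; it is not the ATAT evaluation code, and the abstract's values appear to be quoted as percentages, whereas this function returns a fraction.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (returned as a fraction)."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Made-up labels for six objects in three classes.
y_true = ["SN", "SN", "AGN", "AGN", "RRL", "RRL"]
y_pred = ["SN", "AGN", "AGN", "AGN", "RRL", "SN"]

score = macro_f1(y_true, y_pred)
```

In this toy example the per-class F1 scores are 0.5 (SN), 0.8 (AGN), and 2/3 (RRL), and the macro F1 is their plain average.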
Aims.
We present a variability-, color-, and morphology-based classifier designed to identify multiple classes of transients and persistently variable and non-variable sources from the Zwicky Transient Facility (ZTF) Data Release 11 (DR11) light curves of extended and point sources. The main motivation to develop this model was to identify active galactic nuclei (AGN) at different redshift ranges to be observed by the 4MOST Chilean AGN/Galaxy Evolution Survey (ChANGES). That being said, it also serves as a more general time-domain astronomy study.
Methods.
The model uses nine colors computed from CatWISE and Pan-STARRS1 (PS1), a morphology score from PS1, and 61 single-band variability features computed from the ZTF DR11 g and r light curves. We trained two versions of the model, one for each ZTF band, since ZTF DR11 treats the light curves observed in a particular combination of field, filter, and charge-coupled device (CCD) quadrant independently. We used a hierarchical local classifier per parent node approach, where each node is composed of a balanced random forest model. We adopted a taxonomy with 17 classes: non-variable stars, non-variable galaxies, three transients (SNIa, SN-other, and CV/Nova), five classes of stochastic variables (lowz-AGN, midz-AGN, highz-AGN, Blazar, and YSO), and seven classes of periodic variables (LPV, EA, EB/EW, DSCT, RRL, CEP, and Periodic-other).
Results.
The macro-averaged precision, recall, and F1-score are 0.61, 0.75, and 0.62 for the g-band model, and 0.60, 0.74, and 0.61 for the r-band model. When grouping the four AGN classes (lowz-AGN, midz-AGN, highz-AGN, and Blazar) into a single class, its precision, recall, and F1-score are 1.00, 0.95, and 0.97, respectively, for both the g and r bands. This demonstrates the good performance of the model in classifying AGN candidates. We applied the model to all the sources in the ZTF/4MOST overlapping sky (−28 ≤ Dec ≤ 8.5), avoiding ZTF fields that cover the Galactic bulge (|gal_b| ≤ 9 and gal_l ≤ 50). This area includes 86 576 577 light curves in the g band and 140 409 824 in the r band with 20 or more observations and with an average magnitude in the corresponding band brighter than 20.5. Only 0.73% of the g-band light curves and 2.62% of the r-band light curves were classified as stochastic, periodic, or transient with high probability (P_init ≥ 0.9). Even though the metrics obtained for the two models are similar, we find that, in general, more reliable results are obtained when using the g-band model. With it, we identified 384 242 AGN candidates (including low-, mid-, and high-redshift AGN and Blazars), 287 156 of which have P_init ≥ 0.9.
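The selection cuts quoted in the Results can be written down as a simple filter. This is a sketch under several assumptions: the limits are taken literally from the text and assumed to be in degrees, the cuts are applied per source (the abstract describes avoiding whole ZTF fields), and the function and argument names are hypothetical.

```python
def passes_selection(dec_deg, gal_b_deg, gal_l_deg, n_obs, mean_mag):
    """Apply the sky, sampling, and brightness cuts quoted in the abstract."""
    in_overlap = -28.0 <= dec_deg <= 8.5
    # Bulge region as written in the text: |gal_b| <= 9 and gal_l <= 50.
    in_bulge = abs(gal_b_deg) <= 9.0 and gal_l_deg <= 50.0
    well_sampled = n_obs >= 20
    bright_enough = mean_mag < 20.5
    return in_overlap and not in_bulge and well_sampled and bright_enough


def is_high_confidence(p_init, threshold=0.9):
    """High-probability classification, using the quoted P_init >= 0.9 cut."""
    return p_init >= threshold


# Share of g-band AGN candidates above the threshold, from the quoted counts.
high_confidence_fraction = 287156 / 384242
```

With the counts quoted in the abstract, roughly three quarters of the g-band AGN candidates pass the P_init ≥ 0.9 cut.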