Abstract Recently, generative models have gradually been adopted for dataset augmentation, showcasing their advantages. However, when generating tabular data, these models often fail to satisfy the constraints of numerical columns and therefore cannot produce high-quality datasets that accurately represent real-world data and suit the intended downstream applications. In response to this challenge, we propose a tabular data generation framework guided by downstream task optimization (TDGGD). It incorporates three indicators into each time step of diffusion generation, using gradient optimization to align the generated synthetic data. Unlike the traditional strategy of separating the downstream task model from the upstream data synthesis model, TDGGD ensures that the generated data closely respects the column feasibility of the upstream real tabular data. For downstream tasks, TDGGD prioritizes the utility of tabular data over solely pursuing statistical fidelity. Through extensive experiments on real-world tables both with and without explicit column constraints, we demonstrate that TDGGD increases data volume while enhancing prediction accuracy. To the best of our knowledge, this is the first deployment of downstream information within a diffusion model framework.
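The core idea of nudging each diffusion step toward column-feasible values via a gradient can be illustrated with a minimal sketch. This is not the TDGGD algorithm itself; the denoiser, the quadratic penalty, and all names (`guided_reverse_step`, `constraint_penalty`, the `guide` weight) are illustrative assumptions, standing in for the paper's three indicators and learned model.

```python
import numpy as np

def constraint_penalty(x, lower, upper):
    # Quadratic penalty for numerical columns outside [lower, upper].
    below = np.minimum(x - lower, 0.0)
    above = np.maximum(x - upper, 0.0)
    return np.sum(below**2 + above**2)

def penalty_grad(x, lower, upper):
    # Gradient of the penalty above (piecewise-linear).
    below = np.minimum(x - lower, 0.0)
    above = np.maximum(x - upper, 0.0)
    return 2.0 * (below + above)

def guided_reverse_step(x_t, denoise, lower, upper, step=0.1, guide=0.5):
    # One reverse-diffusion step followed by a gradient nudge toward
    # the feasible region of the numerical columns.
    x0_hat = denoise(x_t)                  # model's estimate of clean data
    x_next = x_t + step * (x0_hat - x_t)   # move toward that estimate
    x_next -= guide * penalty_grad(x_next, lower, upper)
    return x_next

# Toy usage: a "denoiser" that shrinks values toward zero,
# with every column constrained to [0, 1].
rng = np.random.default_rng(0)
x = rng.normal(0.0, 3.0, size=(4, 3))
for _ in range(200):
    x = guided_reverse_step(x, lambda z: 0.5 * z, 0.0, 1.0)
```

After the guided steps, the generated samples satisfy the column bounds, which is the feasibility property the framework targets at every time step.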
Robert Tibshirani gives an overview of one of David Cox's most widely applied ideas, for which he was awarded the International Prize in Statistics in 2017.
In the context of High Energy Physics (HEP) analyses, the advent of large-scale combination fits poses an increasing computational challenge for the underlying software frameworks on which these fits rely. RooFit, the central tool for HEP statistical model creation and fitting, addresses this challenge through an efficient and versatile parallelisation framework, on top of which two parallel implementations were developed in the present research. The first, the parallelisation of the gradient, shows good scaling behaviour and is sufficiently robust to consistently minimize real large-scale fits. The second, the parallelisation of the line search, is still work in progress for some specific likelihood components but shows promising results in realistic test cases. Enabling gradient parallelisation alone speeds up the full fit of a recently published Higgs combination from the ATLAS experiment by a factor of 4.6 with sixteen workers. As the improvements presented in this research are publicly available in ROOT 6.28, we invite users to enable at least gradient parallelisation for robust, accelerated fitting with RooFit.
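The unit of work in gradient parallelisation is the partial derivative of the likelihood with respect to one parameter, which can be dispatched independently to each worker. The following sketch illustrates that decomposition with a toy quadratic "likelihood" and Python threads; it is a conceptual analogue, not RooFit's actual C++ implementation, and `nll`, `partial_derivative`, and `parallel_gradient` are invented names for this example.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def nll(params):
    # Toy negative log-likelihood: a quadratic bowl with minimum at
    # params = [0, 1, 2, 3, 4] (stand-in for a real fit likelihood).
    return np.sum((params - np.arange(len(params)))**2)

def partial_derivative(i, params, eps=1e-6):
    # Central finite difference in one parameter: the independent unit
    # of work handed to each worker in gradient parallelisation.
    up, dn = params.copy(), params.copy()
    up[i] += eps
    dn[i] -= eps
    return (nll(up) - nll(dn)) / (2.0 * eps)

def parallel_gradient(params, n_workers=4):
    # Each worker computes one component; results are gathered in order.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return np.array(list(pool.map(
            lambda i: partial_derivative(i, params), range(len(params)))))

grad = parallel_gradient(np.zeros(5))
```

Because the components are independent, the wall-clock cost of one gradient evaluation scales down with the number of workers, up to scheduling overhead; this is the mechanism behind the reported factor-4.6 speedup with sixteen workers.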
The lifetime of nonreactive ultracold bialkali gases was conjectured to be limited by sticky collisions amplifying three-body loss. We show that the sticking times were previously overestimated and do not support this hypothesis. We find that electronic excitation of NaK+NaK collision complexes by the trapping laser leads to the experimentally observed two-body loss. We calculate the excitation rate with a quasiclassical, statistical model employing ab initio potentials and transition dipole moments. Using longer laser wavelengths or repulsive box potentials may suppress the losses.
This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models, they also play an important role alongside explanatory modeling in theory building and theory testing. We describe six roles for predictive analytics: new theory generation, measurement development, comparison of competing theories, improvement of existing models, relevance assessment, and assessment of the predictability of empirical phenomena. Despite the importance of predictive analytics, we find that they are rare in the empirical IS literature. Extant IS literature relies nearly exclusively on explanatory statistical modeling, where statistical inference is used to test and evaluate the explanatory power of underlying causal models, and predictive power is assumed to follow automatically from the explanatory model. However, explanatory power does not imply predictive power and thus predictive analytics are necessary for assessing predictive power and for building empirical models that predict well. To show that predictive analytics and explanatory statistical modeling are fundamentally disparate, we show that they are different in each step of the modeling process. These differences translate into different final models, so that a pure explanatory statistical model is best tuned for testing causal hypotheses and a pure predictive model is best in terms of predictive power. We convert a well-known explanatory paper on TAM to a predictive context to illustrate these differences and show how predictive analytics can add theoretical and practical value to IS research.
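The claim that explanatory power does not imply predictive power can be demonstrated numerically: a model flexible enough to fit a sample very well in-sample can predict held-out observations poorly. The sketch below is a generic synthetic illustration (not from the essay): an over-flexible polynomial fit to noisy linear data, with fit quality compared on training versus holdout points.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 60)
y = 2.0 * x + rng.normal(0.0, 0.3, size=x.size)  # true relation is linear

# Interleaved split: even indices train the model, odd indices assess prediction.
train, test = slice(0, 60, 2), slice(1, 60, 2)

# Deliberately over-flexible degree-10 polynomial (the "explanatory" fit).
coeffs = np.polyfit(x[train], y[train], 10)

def r2(y_true, y_pred):
    # Coefficient of determination: 1 minus residual over total variation.
    ss_res = np.sum((y_true - y_pred)**2)
    ss_tot = np.sum((y_true - np.mean(y_true))**2)
    return 1.0 - ss_res / ss_tot

r2_train = r2(y[train], np.polyval(coeffs, x[train]))  # in-sample fit
r2_test = r2(y[test], np.polyval(coeffs, x[test]))     # predictive power
```

Here the in-sample fit exceeds the holdout fit because the extra polynomial terms absorb noise rather than signal, which is exactly why predictive power must be assessed on data the model has not seen.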
We use the density matrix renormalization group method to calculate several energy eigenvalues of the frustrated S=1/2 square-lattice J1−J2 Heisenberg model on 2L×L cylinders with L≤10. We identify excited-level crossings versus the coupling ratio g=J2/J1 and study their drifts with the system size L. The lowest singlet-triplet and singlet-quintuplet crossings converge rapidly (with corrections ∝L−2) to different g values, and we argue that these correspond to ground-state transitions between the Néel antiferromagnet and a gapless spin liquid, at gc1≈0.46, and between the spin liquid and a valence-bond solid at gc2≈0.52. Previous studies of order parameters were not able to positively discriminate between an extended spin liquid phase and a critical point. We expect level-crossing analysis to be a generically powerful tool in density matrix renormalization group studies of quantum phase transitions.
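The finite-size extrapolation behind the level-crossing analysis amounts to fitting the crossing coupling g(L) linearly against 1/L², so the intercept estimates the thermodynamic-limit value. The sketch below uses synthetic crossing points generated from an assumed L⁻² drift (the values 0.46 and 0.8 are illustrative, not the paper's data).

```python
import numpy as np

# Synthetic crossing couplings obeying g(L) = g_c + a / L^2,
# mimicking the corrections ∝ L^-2 reported for the level crossings.
L = np.array([4.0, 6.0, 8.0, 10.0])
g_c_true, a = 0.46, 0.8          # illustrative parameters only
g = g_c_true + a / L**2

# Linear fit of g against 1/L^2; the intercept is the L -> infinity estimate.
slope, intercept = np.polyfit(1.0 / L**2, g, 1)
g_c_est = intercept
```

With real DMRG data the fit would carry statistical and truncation errors, but the same intercept extraction gives the transition couplings quoted as gc1≈0.46 and gc2≈0.52.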