Legislative redistricting is a critical element of representative democracy. A number of political scientists have used simulation methods to sample redistricting plans under various constraints to assess their impact on partisanship and other aspects of representation. However, while many optimization algorithms have been proposed, surprisingly few simulation methods exist in the published scholarship. Furthermore, the standard algorithm has no theoretical justification, scales poorly, and cannot incorporate fundamental constraints required by real-world redistricting processes. To fill this gap, we formulate redistricting as a graph-cut problem and propose, for the first time in the literature, a new automated redistricting simulator based on Markov chain Monte Carlo. The proposed algorithm can simultaneously incorporate contiguity and equal-population constraints. We apply simulated and parallel tempering to improve the mixing of the resulting Markov chain. Through a small-scale validation study, we show that the proposed algorithm can approximate a target distribution more accurately than the standard algorithm. We also apply the proposed methodology to data from Pennsylvania to demonstrate the applicability of our algorithm to real-world redistricting problems. An open-source software package is available so that researchers and practitioners can implement the proposed methodology.
Supplementary materials for this article are available online.
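As a rough illustration of the kind of constrained Metropolis update described above (not the authors' algorithm), the sketch below flips a single boundary precinct between districts on an adjacency graph and rejects any move that breaks contiguity or pushes a district outside a population tolerance. The `pop` node attribute, the tolerance `pop_tol`, and the cut-edge energy are illustrative assumptions.

```python
# Hypothetical sketch, not the paper's sampler: one Metropolis "flip" update on a
# precinct-adjacency graph whose nodes carry a "pop" (population) attribute.
import math
import random
import networkx as nx

def flip_step(graph, assignment, n_districts, pop_tol=0.05, beta=1.0):
    """One Metropolis update of a {node: district} assignment on a precinct graph."""
    total_pop = sum(graph.nodes[v]["pop"] for v in graph)
    target = total_pop / n_districts

    # Propose: move one boundary node into the district of a neighbouring node.
    boundary = [v for v in graph
                if any(assignment[u] != assignment[v] for u in graph[v])]
    v = random.choice(boundary)
    new_label = assignment[random.choice(
        [u for u in graph[v] if assignment[u] != assignment[v]])]
    proposal = dict(assignment)
    proposal[v] = new_label

    # Hard constraints: every district stays non-empty, contiguous, and within
    # +/- pop_tol of the ideal population.
    if len(set(proposal.values())) < n_districts:
        return assignment
    for d in set(proposal.values()):
        members = [u for u in graph if proposal[u] == d]
        if not nx.is_connected(graph.subgraph(members)):
            return assignment
        pop = sum(graph.nodes[u]["pop"] for u in members)
        if abs(pop - target) > pop_tol * target:
            return assignment

    # Soft energy: penalise cut edges as a crude stand-in for compactness.
    def cut_edges(a):
        return sum(1 for x, y in graph.edges if a[x] != a[y])

    log_ratio = -beta * (cut_edges(proposal) - cut_edges(assignment))
    if random.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return assignment
```

Running many such steps (with tempering on `beta`, as the abstract mentions) would yield a chain over valid plans; the sketch omits the reversibility corrections a production sampler would need.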
2. Metropolis-Hastings via Classification
Kaji, Tetsuya; Ročková, Veronika
Journal of the American Statistical Association, 10/2023, Volume 118, Issue 544
Journal Article · Peer reviewed · Open access
This article develops a Bayesian computational platform at the interface between posterior sampling and optimization in models whose marginal likelihoods are difficult to evaluate. Inspired by contrastive learning and Generative Adversarial Networks (GANs), we reframe the likelihood function estimation problem as a classification problem. Pitting a Generator, which simulates fake data, against a Classifier, which tries to distinguish them from the real data, one obtains likelihood (ratio) estimators that can be plugged into the Metropolis-Hastings algorithm. The resulting Markov chains generate, at steady state, samples from an approximate posterior whose asymptotic properties we characterize. Drawing upon connections with empirical Bayes and Bayesian misspecification, we quantify the convergence rate in terms of the contraction speed of the actual posterior and the convergence rate of the Classifier. Asymptotic normality results are also provided, justifying the inferential potential of our approach. We illustrate the usefulness of our approach on examples that have proved challenging for existing Bayesian likelihood-free approaches.
Supplementary materials for this article are available online.
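A minimal sketch of the plug-in idea described above, assuming i.i.d. observations, a user-supplied simulator `simulate(theta, n)`, and a logistic-regression classifier standing in for whatever discriminator the authors actually use; it is an illustration, not the paper's implementation.

```python
# Hedged sketch: estimate the log-likelihood (up to a theta-independent constant)
# with a classifier that separates observed data from data simulated at theta,
# then use the estimate inside a random-walk Metropolis-Hastings step.
import numpy as np
from sklearn.linear_model import LogisticRegression

def classifier_log_lik(theta, x_obs, simulate):
    """Approximate sum_i log p(x_i | theta) up to a constant not depending on theta."""
    x_fake = simulate(theta, len(x_obs))
    X = np.vstack([x_obs, x_fake])
    y = np.concatenate([np.ones(len(x_obs)), np.zeros(len(x_fake))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    # decision_function returns the estimated log-odds log[p_true(x) / p_theta(x)],
    # so minus its sum over the observed data estimates the log-likelihood at theta.
    return -np.sum(clf.decision_function(x_obs))

def mh_via_classification(x_obs, simulate, log_prior, theta0,
                          n_iter=2000, step=0.1, rng=np.random.default_rng(0)):
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    ll, chain = classifier_log_lik(theta, x_obs, simulate), []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.shape)
        ll_prop = classifier_log_lik(prop, x_obs, simulate)
        log_acc = ll_prop - ll + log_prior(prop) - log_prior(theta)
        if np.log(rng.uniform()) < log_acc:
            theta, ll = prop, ll_prop
        chain.append(theta.copy())
    return np.array(chain)
```

The classifier is refit at every proposed parameter value, so the sketch trades computation for not having to evaluate the likelihood at all.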
•Combination of the TCN surrogate model and the KF–MH algorithm for solving GPSI.
•The KF–MH algorithm is effective in improving the accuracy and efficiency of GPSI.
•The TCN surrogate model fits the groundwater numerical model with high accuracy.
Increasing the precision of groundwater pollution source identification (GPSI) is crucial for groundwater pollution control and risk management. Bayesian inference based on the Markov chain Monte Carlo (MCMC) method is a useful strategy for solving the GPSI problem. However, because of the nonlinearity and uncertainty inherent in GPSI, the Metropolis-Hastings (MH) algorithm, one of the best-known MCMC algorithms, suffers from relatively low precision and long computation times. To address this problem, the Kalman filter (KF) algorithm was combined with the MH algorithm, yielding the Kalman filter Metropolis-Hastings (KF–MH) algorithm. The algorithm uses the prior distribution to generate a new initial distribution that is close to the true values, and this new initial distribution seeds the subsequent iterations of the calculation. The viability and superiority of the proposed KF–MH algorithm were assessed in three hypothetical GPSI cases under different conditions. In the inversion process, a surrogate model was constructed using a temporal convolutional network (TCN) to reduce the computational burden imposed by the numerical simulation model. Across the cases, the TCN surrogate model fit the groundwater numerical model with high accuracy. In all three cases, the normalized errors between the identification results and the true values of the source features obtained with the KF–MH algorithm were significantly lower than those of the MH algorithm. The results indicate that the proposed KF–MH algorithm has higher inversion accuracy than the MH algorithm.
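The two-stage idea can be sketched loosely as below, assuming a vector-valued forward model `forward(x)` (e.g. a surrogate of the groundwater model), Gaussian observation noise `sigma`, and a simple random-walk proposal; the actual KF–MH algorithm may differ in its details.

```python
# Hedged sketch: an ensemble Kalman-style analysis step shifts prior samples toward
# the observations, and the shifted samples seed an ordinary Metropolis-Hastings run.
import numpy as np

def enkf_update(prior_samples, forward, y_obs, sigma, rng=np.random.default_rng(0)):
    """One ensemble Kalman analysis step on source-parameter samples of shape (n, d)."""
    X = np.asarray(prior_samples, dtype=float)
    Y = np.array([forward(x) for x in X])                 # predicted observations, (n, m)
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    C_xy = Xc.T @ Yc / (len(X) - 1)                       # parameter-observation covariance
    C_yy = Yc.T @ Yc / (len(X) - 1) + sigma**2 * np.eye(Y.shape[1])
    K = C_xy @ np.linalg.inv(C_yy)                        # Kalman gain
    y_pert = y_obs + sigma * rng.standard_normal(Y.shape)
    return X + (y_pert - Y) @ K.T                         # updated (shifted) samples

def mh_from_seed(seed, forward, y_obs, log_prior, sigma,
                 n_iter=5000, step=0.05, rng=np.random.default_rng(1)):
    """Random-walk MH with a Gaussian data misfit, started at a Kalman-updated seed."""
    def log_post(x):
        r = y_obs - forward(x)
        return log_prior(x) - 0.5 * np.dot(r, r) / sigma**2

    x = np.asarray(seed, dtype=float)
    lp, chain = log_post(x), []
    for _ in range(n_iter):
        prop = x + step * rng.standard_normal(x.shape)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        chain.append(x.copy())
    return np.array(chain)
```

An MH chain would then be started at, for instance, the mean of the updated ensemble, which is the sense in which the Kalman step provides the "new initial distribution" described above.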
This paper presents a probabilistic framework to assess the stability of an unsaturated slope under rainfall. The effects of soil spatial variability on the probability of rainfall-induced slope failure (landslides) are investigated. Soil spatial variability is considered by modeling the saturated hydraulic conductivity of the soil (k_s) as a stationary lognormal random field. Subset simulation with a modified Metropolis–Hastings algorithm is used to estimate the probability of slope failure. It is demonstrated numerically that probabilistic analysis accounting for the spatial variability of k_s can reproduce the shallow failure mechanism widely observed in real rainfall-induced landslides. This shallow failure is attributed to positive pore-water pressures developed in layers near the ground surface. In contrast, analysis assuming a homogeneous profile cannot reproduce a shallow failure except in the extreme case of the infiltration flux being almost equal to k_s. Therefore, ignoring spatial variability leads to unconservative estimates of the failure probability. The correlation length of k_s affects the probability of slope failure significantly. The applicability of subset simulation with a modified Metropolis–Hastings algorithm for assessing the reliability of problems involving spatial variability is highlighted.
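A compact sketch of subset simulation with the component-wise ("modified") Metropolis–Hastings proposal, assuming independent standard-normal input variables and a limit-state function `g(u) <= 0` denoting failure; the sample size `n` and conditional probability `p0` are the usual illustrative defaults, not values from the paper.

```python
# Hedged sketch of subset simulation for small failure probabilities.
import numpy as np
from scipy.stats import norm

def subset_simulation(g, dim, n=1000, p0=0.1, max_levels=10,
                      rng=np.random.default_rng(0)):
    """Estimate P[g(U) <= 0] for standard-normal U using subset simulation."""
    u = rng.standard_normal((n, dim))
    gu = np.array([g(x) for x in u])
    pf = 1.0
    for _ in range(max_levels):
        order = np.argsort(gu)
        n_seed = int(p0 * n)
        thresh = gu[order[n_seed - 1]]          # intermediate threshold (p0-quantile)
        if thresh <= 0:                         # failure domain already reached
            return pf * np.mean(gu <= 0)
        pf *= p0
        seeds, g_seeds = u[order[:n_seed]], gu[order[:n_seed]]
        u_new, g_new = [], []
        for x, gx in zip(seeds, g_seeds):
            for _ in range(n // n_seed):
                cand = x.copy()
                # Component-wise proposal, accepted against the standard-normal marginals.
                for j in range(dim):
                    xi = x[j] + rng.uniform(-1.0, 1.0)
                    if rng.uniform() < norm.pdf(xi) / norm.pdf(x[j]):
                        cand[j] = xi
                gc = g(cand)
                # Accept the whole move only if it stays inside the intermediate domain.
                if gc <= thresh:
                    x, gx = cand, gc
                u_new.append(x.copy())
                g_new.append(gx)
        u, gu = np.array(u_new), np.array(g_new)
    return pf * np.mean(gu <= 0)
```

In a spatial-variability setting such as the one above, `dim` would be the number of random-field variables after discretization (e.g. Karhunen-Loève terms) and `g` would wrap the seepage and slope-stability model.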
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UL, UM, UPCLJ, UPUK
Repulsive mixture models have recently gained popularity for Bayesian cluster detection. Compared to more traditional mixture models, repulsive mixture models produce a smaller number of well-separated clusters. The most commonly used methods for posterior inference either require the number of components to be fixed a priori or are based on reversible jump MCMC computation. We present a general framework for mixture models in which the prior on the "cluster centers" is a finite repulsive point process depending on a hyperparameter and specified by a density that may involve an intractable normalizing constant. By investigating the posterior characterization of this class of mixture models, we derive an MCMC algorithm that avoids the well-known difficulties associated with reversible jump MCMC computation. In particular, we use an ancillary variable method that eliminates the problem of intractable normalizing constants in the Hastings ratio. The ancillary variable method relies on a perfect simulation algorithm, which we demonstrate is fast because the number of components is typically small. In several simulation studies and an application to sociological data, we illustrate the advantage of our new methodology over existing methods, and we compare the use of a determinantal and a repulsive Gibbs point process prior model. Supplementary files for this article are available online.
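For intuition, the sketch below shows a generic exchange-type move in which an auxiliary point pattern, drawn by perfect simulation at the proposed hyperparameter, cancels the intractable normalizing constant from the Hastings ratio. The functions `unnorm_density`, `perfect_sample`, and `log_prior` are placeholders, and the paper's ancillary-variable construction may differ from this.

```python
# Hypothetical sketch of an exchange-type update for a repulsion hyperparameter theta
# when the point-process prior on the cluster centres is known only up to a
# normalizing constant Z(theta).
import numpy as np

def exchange_step(theta, centres, unnorm_density, perfect_sample, log_prior,
                  step=0.1, rng=np.random.default_rng(0)):
    """One Metropolis update of theta; `centres` is the current set of cluster centres."""
    prop = theta + step * rng.standard_normal()
    aux = perfect_sample(prop)          # ancillary point pattern drawn at the proposal
    # Z(theta) and Z(prop) cancel between the target terms and the auxiliary terms.
    log_ratio = (np.log(unnorm_density(centres, prop))
                 - np.log(unnorm_density(centres, theta))
                 + np.log(unnorm_density(aux, theta))
                 - np.log(unnorm_density(aux, prop))
                 + log_prior(prop) - log_prior(theta))
    return prop if np.log(rng.uniform()) < log_ratio else theta
```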
The present communication develops tools for estimation and prediction of the Burr-III distribution under a unified progressive hybrid censoring scheme. The maximum likelihood estimates of the model parameters are obtained and shown to exist uniquely. Expectation-maximization and stochastic expectation-maximization methods are employed to compute the point estimates of the unknown parameters. Based on the asymptotic distribution of the maximum likelihood estimators, approximate confidence intervals are proposed. In addition, bootstrap confidence intervals are constructed. Furthermore, the Bayes estimates are derived with respect to the squared-error and LINEX loss functions. To compute the approximate Bayes estimates, the Metropolis-Hastings algorithm is adopted. The highest posterior density credible intervals are obtained. Further, maximum a posteriori estimates of the model parameters are computed. Bayesian predictive point and interval estimates are also proposed. A Monte Carlo simulation study is used to evaluate the performance of the proposed statistical procedures. Finally, two real data sets are analysed to illustrate the methodologies established in this paper.
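A minimal complete-data sketch of the Bayesian step, assuming independent gamma priors on the Burr-III shape parameters (c, k) and a random-walk Metropolis-Hastings sampler; the paper works under unified progressive hybrid censoring, which this illustration does not reproduce, and the prior hyperparameters and LINEX constant `a` are arbitrary.

```python
# Hedged sketch: MH sampling of the Burr-III posterior and Bayes estimates under
# squared-error and LINEX loss, for uncensored data x > 0.
import numpy as np

def burr3_loglik(c, k, x):
    # Burr-III density: f(x) = c * k * x**(-c-1) * (1 + x**(-c))**(-k-1), x > 0
    return (len(x) * (np.log(c) + np.log(k))
            - (c + 1) * np.sum(np.log(x))
            - (k + 1) * np.sum(np.log1p(x ** (-c))))

def mh_burr3(x, n_iter=10000, step=0.1, a0=1.0, b0=1.0, rng=np.random.default_rng(0)):
    """Random-walk MH for (c, k) with independent Gamma(a0, b0) priors."""
    def log_post(t):
        c, k = t
        if c <= 0 or k <= 0:
            return -np.inf
        log_prior = ((a0 - 1) * np.log(c) - b0 * c
                     + (a0 - 1) * np.log(k) - b0 * k)
        return burr3_loglik(c, k, x) + log_prior

    theta = np.array([1.0, 1.0])
    lp, chain = log_post(theta), []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(2)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(theta.copy())
    return np.array(chain)

def bayes_estimates(chain, a=0.5):
    post = chain[len(chain) // 2:]                              # discard burn-in
    sel = post.mean(axis=0)                                     # squared-error loss
    linex = -np.log(np.mean(np.exp(-a * post), axis=0)) / a     # LINEX loss
    return sel, linex
```

The squared-error estimate is the posterior mean, while the LINEX estimate is -(1/a) log E[exp(-a*theta)], both approximated from the retained MH draws.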
Bayesian signal detection methods, including the multi-item gamma Poisson shrinker (MGPS), assume a Poisson distribution for the number of reports. However, the database of an adverse event reporting system often has a large number of zero-count cells. A zero-inflated Poisson (ZIP) distribution can be more appropriate in this situation than a Poisson distribution. Few studies have considered ZIP-based models for Bayesian signal detection. In addition, most studies on Bayesian signal detection methods include simulation studies that assume a gamma distribution for the prior. Herein, we extend the MGPS method using the ZIP model and apply various prior distributions. We evaluated the extended methods through an extensive simulation using more varied settings for the model and prior than in existing studies. We varied the total number of reports, the number of true signals, the relative reporting rate, and the probability of observing a true zero. The results show that as the probability of observing a zero count increased, methods based on the ZIP model outperformed those based on the Poisson model in most cases. We also found that the mixture log-normal prior resulted in more conservative detection than other priors when the relative reporting rate was high. Conversely, more signals were found when using the mixture of truncated normal distributions. We applied the Bayesian signal detection methods to data from the Korea Adverse Event Reporting System from 2012 to 2016.
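As a toy illustration of why zero inflation matters here (not the MGPS mixture-prior model itself), the sketch below computes the posterior mean of a relative reporting rate lambda under a single gamma prior when the observed cell count follows a zero-inflated Poisson with mean lambda * E; the zero-inflation probability `pi0`, the prior hyperparameters, and the grid bounds are assumptions.

```python
# Hedged sketch: grid-based posterior mean of lambda with a Gamma(alpha, beta) prior
# and a ZIP(pi0, lambda * E) likelihood for one drug-event cell with expected count E.
import numpy as np
from scipy.special import gammaln

def zip_posterior_mean(n_obs, E, pi0=0.3, alpha=1.0, beta=1.0, grid_max=20.0, m=4000):
    lam = np.linspace(grid_max / m, grid_max, m)
    mu = lam * E
    if n_obs == 0:
        loglik = np.log(pi0 + (1.0 - pi0) * np.exp(-mu))      # structural or sampling zero
    else:
        loglik = np.log(1.0 - pi0) - mu + n_obs * np.log(mu) - gammaln(n_obs + 1)
    logpost = loglik + (alpha - 1.0) * np.log(lam) - beta * lam
    w = np.exp(logpost - logpost.max())
    return float(np.sum(lam * w) / np.sum(w))
```

A zero count pulls the posterior toward the prior less aggressively than under a plain Poisson model, because part of the zero's probability mass is explained by the inflation component rather than by a small lambda.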
Micromechanical parameters are essential for understanding the behavior of materials with a heterogeneous structure, which helps predict complex physical processes such as delamination, cracking, and plasticity. However, identifying these parameters is challenging due to micro-macro length-scale differences, the high resolution required, and ambiguity in the boundary conditions, among other factors. The Integrated Digital Image Correlation (IDIC) method, a state-of-the-art full-field deterministic approach to parameter identification, is widely used but suffers from high sensitivity to boundary data errors and is limited to the identification of parameters within well-posed problems. This article employs a Bayesian approach to estimate the micromechanical shear and bulk moduli of fiber-reinforced composite samples under the plane-strain assumption and to improve the handling of boundary noise. The main purpose of this article is to quantify the effect of uncertainty in the boundary conditions in a stochastic setting. To this end, the Metropolis–Hastings algorithm (MHA) is employed with IDIC to estimate probability distributions of the bulk and shear moduli and of the boundary-condition parameters for a fiber-reinforced composite sample under the plane-strain assumption. The performance and robustness of the MHA are compared to two versions of the deterministic IDIC method under artificially introduced random and systematic errors in the kinematic boundary conditions. Although the MHA is shown to be computationally more expensive and, in certain cases, less accurate than the recently introduced Boundary-Enriched IDIC, it offers significant advantages, in particular the ability to optimize a large number of parameters while obtaining a statistical characterization as well as insights into individual parameter relationships. The paper furthermore highlights the benefits of the non-normalized approach to parameter identification with the MHA (an approach which, within deterministic IDIC, leads to an ill-posed formulation), which significantly improves robustness in handling boundary noise.
•Micromechanical parameter identification using a Bayesian approach and DIC is proposed.
•Accuracy similar to deterministic methods is obtained when modes are used.
•Probabilistic data provide insights into correlation statistics of parameters.
•Ill-posed or poorly conditioned problems in parameter identification can be addressed.
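A schematic sketch of the kind of sampler the abstract describes: random-walk Metropolis-Hastings over a joint vector of material moduli and boundary-condition parameters, with a Gaussian misfit between measured and simulated displacement fields. The forward model `simulate_displacements`, the noise level `sigma`, and the box prior bounds are placeholders, not the paper's actual implementation.

```python
# Hedged sketch: jointly sample bulk/shear moduli and boundary-condition parameters
# by random-walk MH, comparing a simulated displacement field with the DIC measurement.
import numpy as np

def mh_idic(u_meas, simulate_displacements, lower, upper, sigma=1e-3,
            n_iter=20000, step=0.02, rng=np.random.default_rng(0)):
    """Return posterior samples of theta = (moduli..., boundary parameters...)."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)

    def log_post(theta):
        if np.any(theta < lower) or np.any(theta > upper):    # flat prior on a box
            return -np.inf
        r = u_meas - simulate_displacements(theta)             # displacement residual
        return -0.5 * np.sum(r * r) / sigma**2

    theta = 0.5 * (lower + upper)
    lp, chain = log_post(theta), []
    scale = step * (upper - lower)
    for _ in range(n_iter):
        prop = theta + scale * rng.standard_normal(theta.shape)
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain.append(theta.copy())
    samples = np.array(chain[n_iter // 2:])                    # discard burn-in
    return samples, np.corrcoef(samples, rowvar=False)         # parameter correlations
```

The returned correlation matrix is one way to obtain the "insights into individual parameter relationships" highlighted above, something a purely deterministic IDIC fit does not provide.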