In linear regression problems with related predictors, it is desirable to perform variable selection and estimation while maintaining the hierarchical or structural relationships among predictors. In this paper we propose non-negative garrote methods that naturally incorporate such relationships, defined through effect heredity or marginality principles. We show that the methods are easy to compute and enjoy attractive theoretical properties. We also show that the methods extend readily to more general regression problems such as generalized linear models. Simulations and real examples illustrate the merits of the proposed methods.
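The garrote idea described above can be illustrated with a small sketch: starting from ordinary least-squares estimates, non-negative shrinkage factors are fitted under a budget constraint, and a strong-heredity constraint keeps an interaction's factor below those of its parent effects. The data, budget, and solver choice here are my own illustration, not the paper's algorithm.

```python
# Non-negative garrote with a strong-heredity constraint (illustrative
# sketch, solved with SciPy's SLSQP; setup and names are assumptions).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([x1, x2, x1 * x2])       # two parents + interaction
y = 2.0 * x1 + 1.0 * x2 + 0.5 * x1 * x2 + rng.normal(scale=0.5, size=n)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
Z = X * beta_ols                             # predictors scaled by OLS estimates

def rss(c):
    r = y - Z @ c
    return r @ r

s = 2.5                                      # garrote budget on sum of factors
cons = [
    {"type": "ineq", "fun": lambda c: s - c.sum()},  # sum(c) <= s
    {"type": "ineq", "fun": lambda c: c[0] - c[2]},  # heredity: c_12 <= c_1
    {"type": "ineq", "fun": lambda c: c[1] - c[2]},  # heredity: c_12 <= c_2
]
res = minimize(rss, x0=np.full(3, 0.5), bounds=[(0, None)] * 3,
               constraints=cons, method="SLSQP")
c_hat = res.x
beta_garrote = c_hat * beta_ols              # shrunken coefficient estimates
print(c_hat, beta_garrote)
```

Because the heredity constraints are linear in the shrinkage factors, the whole problem stays a smooth quadratic program, which is one reason such methods are cheap to compute.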
In this work, we propose a novel framework for large-scale Gaussian process (GP) modeling. In contrast to the purely global and purely local approximations proposed in the literature to address the computational bottleneck of exact GP modeling, we employ a combined global-local approach in building the approximation. Our framework uses a subset-of-data approach in which the subset is a union of a set of global points, designed to capture the global trend in the data, and a set of local points specific to a given testing location, which capture the local trend around that location. The correlation function is likewise modeled as a combination of a global and a local kernel. The predictive performance of our framework, which we refer to as TwinGP, is comparable to that of state-of-the-art GP modeling methods, but at a fraction of their computational cost.
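The subset-of-data idea in this abstract can be sketched in a few lines: for each test point, a GP is conditioned on the union of a fixed set of global points and the nearest local neighbours. This is only an illustration of the general idea under simple assumptions (random global points, plain squared-exponential kernel), not the TwinGP method itself.

```python
# Combined global-local subset-of-data GP prediction (illustrative sketch).
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=2000)

def kern(A, B, ls=1.0):
    d2 = (A[:, None, 0] - B[None, :, 0]) ** 2
    return np.exp(-0.5 * d2 / ls**2)

n_global, n_local, nugget = 30, 20, 0.05**2
# Crude stand-in for a designed global set: random training points.
global_idx = rng.choice(len(X), n_global, replace=False)

def predict(x_star):
    # Union of the global set and the k nearest local neighbours.
    local_idx = np.argsort(np.abs(X[:, 0] - x_star))[:n_local]
    idx = np.union1d(global_idx, local_idx)
    Xs, ys = X[idx], y[idx]
    K = kern(Xs, Xs) + nugget * np.eye(len(idx))
    ks = kern(np.array([[x_star]]), Xs)[0]
    return ks @ np.linalg.solve(K, ys)

print(predict(1.2))   # close to sin(1.2)
```

Each prediction only factorizes a small matrix (here about 50 points instead of 2000), which is where the computational saving of the subset approach comes from.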
In this paper, we present a regression-based methodology that can estimate Less-than-Truckload (LTL) market rates with high reliability using an extensive database of historical shipments from the continental United States. Our model successfully combines quantitative data with qualitative market knowledge to produce better LTL market rate estimates, which can be used in benchmarking studies, allowing carriers and shippers to identify cost-saving opportunities. We identify the main drivers of LTL pricing and reveal the effects of certain industry practices on the final market rates.
Constrained minimum energy designs. Huang, Chaofan; Joseph, V. Roshan; Ray, Douglas M.
Statistics and Computing, 11/2021, Volume 31, Issue 6.
Journal Article. Peer reviewed. Open access.
Space-filling designs are important in computer experiments, which are critical for building a cheap surrogate model that adequately approximates an expensive computer code. Many design construction techniques in the existing literature are applicable only to rectangular bounded spaces, but in real-world applications the input space can often be non-rectangular because of constraints on the input variables. One solution for generating designs in a constrained space is to first generate uniformly distributed samples in the feasible region and then use them as the candidate set for constructing the designs. Sequentially constrained Monte Carlo (SCMC) is the state-of-the-art technique for candidate generation, but it still requires a large number of constraint evaluations, which is problematic especially when the constraints are expensive to evaluate. Thus, to reduce constraint evaluations and improve efficiency, we propose the constrained minimum energy design (CoMinED), which utilizes recent advances in deterministic sampling methods. Extensive simulation results on 15 benchmark problems with dimensions ranging from 2 to 13 demonstrate the improved performance of CoMinED over the existing methods.
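The cost the abstract targets can be made concrete with a naive baseline: rejection sampling generates uniform candidates in a non-rectangular feasible region while counting constraint evaluations, the quantity CoMinED is designed to reduce. The annulus constraint below is just an illustrative example, not from the paper.

```python
# Baseline candidate generation by rejection sampling, counting the
# (possibly expensive) constraint evaluations (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)

def constraint(x):                 # feasible iff inside a ring-shaped region
    r2 = x[0] ** 2 + x[1] ** 2
    return 0.25 <= r2 <= 1.0

n_wanted, evals, candidates = 200, 0, []
while len(candidates) < n_wanted:
    x = rng.uniform(-1, 1, size=2)
    evals += 1                     # one constraint call per proposal
    if constraint(x):
        candidates.append(x)

candidates = np.array(candidates)
print(f"{evals} constraint evaluations for {n_wanted} feasible candidates")
```

When the feasible region occupies a small fraction of the bounding box, the evaluation count blows up, which is exactly why more evaluation-efficient candidate generation matters.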
Specifying a prior distribution for the large number of parameters in the statistical model is a critical step in a Bayesian approach to the design and analysis of experiments. This article shows that the prior distribution can be induced from a functional prior on the underlying transfer function. The functional prior requires specification of only a few hyperparameters and thus can be easily implemented in practice. The usefulness of the approach is demonstrated through the analysis of some experiments. The article also proposes a new class of design criteria and establishes their connections with the minimum aberration criterion.
Three-dimensional printed medical prototypes, which use synthetic metamaterials to mimic biological tissue, are becoming increasingly important in urgent surgical applications. However, mimicking tissue mechanical properties via three-dimensional printed metamaterials can be difficult and time-consuming, owing to the functional nature of both inputs (metamaterial structure) and outputs (mechanical response curve). To deal with this, we propose a novel function-on-function kriging model for efficient emulation and tissue-mimicking optimization. For functional inputs, a key novelty of our model is the spectral-distance (SpeD) correlation function, which captures important spectral differences between two functional inputs. Dependencies for functional outputs are then modeled via a co-kriging framework. We further adopt shrinkage priors on both the input spectra and the output co-kriging covariance matrix, which allows the emulator to learn and incorporate important physics (e.g., dominant input frequencies, output curve properties). Finally, we demonstrate the effectiveness of the proposed SpeD emulator in a real-world study on mimicking human aortic tissue, and show that it can provide quicker and more accurate tissue-mimicking performance compared to existing methods in the medical literature.
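The core idea of a spectral-distance correlation for functional inputs can be sketched simply: compare two input functions through their Fourier magnitude spectra and turn the distance into a correlation. The weights and exact form below are my own illustration, not the paper's SpeD kernel.

```python
# Spectral-distance-style correlation between functional inputs
# (illustrative sketch; not the paper's kernel).
import numpy as np

t = np.linspace(0, 1, 256, endpoint=False)
f = np.sin(2 * np.pi * 5 * t)            # dominant frequency 5
g = np.sin(2 * np.pi * 5 * t + 0.3)      # same spectrum, shifted phase
h = np.sin(2 * np.pi * 20 * t)           # different dominant frequency

def spec(x):
    return np.abs(np.fft.rfft(x)) / len(x)   # magnitude spectrum

def sped_corr(x1, x2, gamma=0.5):
    d2 = np.sum((spec(x1) - spec(x2)) ** 2)  # squared spectral distance
    return np.exp(-gamma * d2)

print(sped_corr(f, g), sped_corr(f, h))
```

A phase-shifted copy of a signal has an identical magnitude spectrum, so it correlates near 1 here, while a signal with a different dominant frequency correlates lower; this is the sense in which such a kernel "sees" dominant input frequencies.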
In the quest for advanced propulsion and power-generation systems, high-fidelity simulations are too computationally expensive to survey the desired design space, and a new design methodology is needed that combines engineering physics, computer simulations, and statistical modeling. In this article, we propose a new surrogate model that provides efficient prediction and uncertainty quantification of turbulent flows in swirl injectors with varying geometries, devices commonly used in many engineering applications. The novelty of the proposed method lies in the incorporation of known physical properties of the fluid flow as simplifying assumptions for the statistical model. In view of the massive simulation data at hand, which is on the order of hundreds of gigabytes, these assumptions allow for accurate flow predictions in around an hour of computation time. By contrast, existing flow emulators that forgo such simplifications may require more computation time for training and prediction than is needed for conducting the simulation itself. Moreover, by accounting for coupling mechanisms between flow variables, the proposed model can jointly reduce prediction uncertainty and extract useful flow physics, which can then be used to guide further investigations. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
Markov chain Monte Carlo (MCMC) methods require a large number of samples to approximate a posterior distribution, which can be costly when the likelihood or prior is expensive to evaluate. The number of samples can be reduced if we can avoid repeated samples and those that are close to each other. This is the idea behind deterministic sampling methods such as quasi-Monte Carlo (QMC). However, the existing QMC methods aim at sampling from a uniform hypercube, which can miss the high-probability regions of the posterior distribution, so the approximation can be poor. Minimum energy design (MinED) is a recently proposed deterministic sampling method that makes use of the posterior evaluations to obtain a weighted space-filling design in the region of interest. However, the existing implementation of MinED is inefficient because it requires several global optimizations and thus numerous evaluations of the posterior. In this article, we develop an efficient algorithm that can generate MinED samples with few posterior evaluations. We also make several improvements to the MinED criterion to make it perform better in high dimensions. The advantages of MinED over MCMC and QMC are illustrated using an example of calibrating a friction drilling process.
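The flavour of a minimum-energy-style criterion can be shown with a greedy sketch: candidate points carry "charges" inversely tied to the target density, so mutually repelling points spread out while still concentrating in high-probability regions. This is only an illustration of the criterion on a toy posterior; the paper's algorithm and its high-dimensional refinements are different.

```python
# Greedy minimum-energy-style design on a toy posterior (illustrative).
import numpy as np

rng = np.random.default_rng(3)
p, n_design = 2, 30                        # dimension, design size

def log_post(x):                           # toy posterior: standard normal
    return -0.5 * np.sum(x ** 2, axis=-1)

cand = rng.uniform(-4, 4, size=(2000, p))  # cheap candidate pool
charge = np.exp(-log_post(cand) / (2 * p)) # q(x) ~ pi(x)^(-1/(2p))

design = [int(np.argmax(log_post(cand)))]  # start at the posterior mode
for _ in range(n_design - 1):
    d = np.linalg.norm(cand[:, None, :] - cand[design][None, :, :], axis=-1)
    d[d == 0] = np.inf                     # avoid self-distance blow-up
    # Energy a candidate would add against the current design points.
    energy = (charge[:, None] * charge[design][None, :] / d).sum(axis=1)
    energy[design] = np.inf                # never re-select a chosen point
    design.append(int(np.argmin(energy)))

pts = cand[design]
print(pts.mean(axis=0), pts.std(axis=0))
```

High-charge (low-density) candidates add large energy and are avoided, while candidates too close to existing points are penalized by the distance term: exactly the weighted space-filling trade-off the abstract describes.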
A new kriging predictor is proposed that gives better performance than the existing predictor when the constant-mean assumption in the kriging model is unreasonable. Moreover, it appears to be robust to misspecification of the correlation parameters. The advantages of the new predictor are demonstrated using some examples from the computer experiment literature.
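For context, the standard constant-mean (ordinary kriging) predictor that such proposals are compared against can be sketched in a few lines: estimate the constant mean by generalized least squares, then predict via the correlation-weighted residuals. This is the textbook baseline, not the article's new predictor.

```python
# Ordinary kriging predictor with a Gaussian correlation (baseline sketch).
import numpy as np

def corr(A, B, theta=10.0):
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-theta * d2)

X = np.linspace(0, 1, 8)                   # 1-D design points
y = np.sin(2 * np.pi * X)                  # deterministic computer-code output

R = corr(X, X) + 1e-8 * np.eye(len(X))     # small nugget for stability
one = np.ones(len(X))
mu_hat = (one @ np.linalg.solve(R, y)) / (one @ np.linalg.solve(R, one))

def predict(x):
    r = corr(np.atleast_1d(x), X)[0]
    return mu_hat + r @ np.linalg.solve(R, y - mu_hat)

print(predict(0.3))                        # close to sin(2*pi*0.3)
```

The predictor interpolates the design points exactly; its weakness, as the abstract notes, is that everything is pulled toward a single constant `mu_hat` when the true trend is far from constant.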
Engineering model development involves several simplifying assumptions made for mathematical tractability, which are often unrealistic in practice. This leads to discrepancies in the model predictions. A commonly used statistical approach to overcome this problem is to build a statistical model for the discrepancies between the engineering model and observed data. In contrast, an engineering approach would be to find the causes of discrepancy and fix the engineering model using first principles. However, the engineering approach is time-consuming, whereas the statistical approach is fast. The drawback of the statistical approach is that it treats the engineering model as a black box, and therefore the statistically adjusted models lack physical interpretability. This article proposes a new framework for model calibration and statistical adjustment. It tries to open up the black box using simple main-effects analysis and graphical plots, and introduces statistical models inside the engineering model. This approach leads to simpler adjustment models that are physically more interpretable. The approach is illustrated using a model for predicting the cutting forces in a laser-assisted mechanical micro-machining process. This article has supplementary material online.