Peer reviewed; Open access
  • From Many to One: Consensus...
    Cressie, Noel; Bertolacci, Michael; Zammit‐Mangion, Andrew

    Geophysical research letters, 28 July 2022, Volume: 49, Issue: 14
    Journal Article

    A Model Intercomparison Project (MIP) consists of teams who estimate the same underlying quantity (e.g., temperature projections to the year 2070). A simple average of the ensemble of the teams' outputs gives a consensus estimate, but it does not recognize that some outputs are more variable than others. Statistical analysis of variance (ANOVA) models offer a way to obtain a weighted frequentist consensus estimate of outputs with a variance that is the smallest possible. Modulo dependence between MIP outputs, the ANOVA approach weights a team's output inversely proportional to its variance, from which optimally weighted estimates follow. ANOVA weights can also provide a prior distribution for Bayesian Model Averaging of the MIP outputs when external evaluation data are available. We use a MIP of carbon‐dioxide‐flux inversions to illustrate the ANOVA‐based weighting and subsequent frequentist consensus inferences.

    Plain Language Summary

    There can be disagreement between different teams of scientists on the best way to model and hence estimate complex geophysical phenomena. Model Intercomparison Projects (MIPs) address this in a scientific manner, where a common protocol about data and certain basic geophysical features is agreed upon by the teams. The collection of the different teams' outputs is analyzed, often using the ensemble mean and a measure of the ensemble variability. However, the results may indicate that it is inappropriate to treat all teams' outputs equally, which can happen when some teams have superior models or better numerical approximations. It may also happen that some teams share code or their models have common features beyond those specified in the protocol. We adapt a statistical technique called the analysis of variance (ANOVA) to this complex setting, obtain optimal weights on the outputs, and then estimate those weights. This results in a statistically optimal (i.e., most precise) consensus summary of the MIP; other weights give less‐precise inferences. We call this inference framework for MIPs Statistically Unbiased Prediction and Estimation‐ANOVA, and we apply it to a MIP designed to estimate the sources and sinks of carbon dioxide.

    Key Points

      • Consensus inference is provided for Model Intercomparison Project (MIP) outputs when little or no evaluation data are available.
      • The statistical analysis of variance method quantifies the MIP outputs' variabilities to obtain optimally weighted frequentist consensus inference.
      • Variance parameters for optimal weighting of outputs and consensus inference are estimated using likelihood‐based methodology.
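
    The inverse-variance weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes the teams' outputs are unbiased, mutually independent, and have known variances, whereas the article estimates the variance parameters by likelihood-based methods and accounts for dependence between outputs. The function name `inverse_variance_consensus` and all numbers are hypothetical, chosen only for illustration.

```python
def inverse_variance_consensus(outputs, variances):
    """Combine independent, unbiased outputs with weights proportional to 1/variance.

    Returns (consensus estimate, variance of that estimate, normalized weights).
    Assumption: variances are known and outputs are independent; the article
    instead estimates variances and allows for dependence between outputs.
    """
    if not outputs or len(outputs) != len(variances):
        raise ValueError("outputs and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    precisions = [1.0 / v for v in variances]       # precision = 1 / variance
    total_precision = sum(precisions)
    weights = [p / total_precision for p in precisions]  # weights sum to 1

    estimate = sum(w * y for w, y in zip(weights, outputs))
    estimate_variance = 1.0 / total_precision       # variance of the weighted mean
    return estimate, estimate_variance, weights


# Hypothetical outputs from three teams and their (assumed known) variances.
outputs = [1.0, 2.0, 4.0]
variances = [1.0, 2.0, 4.0]

estimate, estimate_variance, weights = inverse_variance_consensus(outputs, variances)

# For comparison: the unweighted ensemble mean and its variance under independence.
simple_mean = sum(outputs) / len(outputs)
simple_mean_variance = sum(variances) / len(outputs) ** 2

print(f"weighted consensus: {estimate:.3f} (variance {estimate_variance:.3f})")
print(f"simple average:     {simple_mean:.3f} (variance {simple_mean_variance:.3f})")
```

    Under these assumptions the weighted consensus has variance 4/7, compared with 7/9 for the simple average, which illustrates why the abstract describes other weightings as giving less-precise inferences.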