A common approach to improving probabilistic forecasts is to identify and leverage the forecasts from experts in the crowd based on forecasters' performance on prior questions with known outcomes. However, such information is often unavailable to decision-makers on many forecasting problems, and thus it can be difficult to identify and leverage expertise. In the current paper, we propose a novel algorithm for aggregating probabilistic forecasts using forecasters' meta-predictions about what other forecasters will predict. We test the performance of an extremised version of our algorithm against current forecasting approaches in the literature and show that our algorithm significantly outperforms all other approaches on a large collection of 500 binary decision problems varying in five levels of difficulty. The success of our algorithm demonstrates the potential of using meta-predictions to leverage latent expertise in environments where forecasters' expertise cannot otherwise be easily identified.
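The abstract does not specify the aggregation rule or how it is extremised, so the following is only a minimal sketch of what extremising an aggregate forecast can look like. It assumes a plain mean as a stand-in for the meta-prediction-based aggregate and a common power-odds transform; the function names `extremise` and `aggregate` and the exponent `a` are hypothetical and not taken from the paper.

```python
def extremise(p: float, a: float = 2.0) -> float:
    """Push a probability away from 0.5; a > 1 extremises, a = 1 is the identity."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    num = p ** a
    return num / (num + (1.0 - p) ** a)


def aggregate(forecasts: list[float], a: float = 2.0) -> float:
    """Average the crowd's forecasts (a stand-in rule), then extremise the mean."""
    mean = sum(forecasts) / len(forecasts)
    return extremise(mean, a)


# Hypothetical forecasts for one binary question, not data from the study.
crowd = [0.6, 0.7, 0.8]
combined = aggregate(crowd)
```

With these made-up inputs the combined forecast lies further from 0.5 than the crowd mean, which is the intended effect of extremising.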
Building on previous research on the use of macroeconomic factors for conflict prediction and using data on political instability provided by the Political Instability Task Force, this article proposes two minimal forecasting models of political instability optimised to have the greatest possible predictive power for one-year and two-year event horizons, while still making predictions that are fully explainable. Both models employ logistic regression and use just three predictors: polity code (a measure of government type), infant mortality, and years of stability (i.e., years since the last instability event). These models make predictions for 176 countries on a country-year basis and achieve AUPRC values of 0.108 and 0.115 for the one-year and two-year models respectively. They use public data with ongoing availability and so are readily reproducible. They use Monte Carlo simulations to construct confidence intervals for their predictions and are validated by testing their predictions for a set of reference years separate from the set of reference years used to train them. This validation shows that the models are not overfitted but suggests that some of the previous models in the literature may have been. The models developed in this article are able to explain their predictions by showing, for a given prediction, which predictors were the most influential and by using counterfactuals to show how the predictions would have been altered had these predictors taken different values. These models are compared to models created by lasso regression and it is shown that they have at least as good predictive power but that their predictions can be more readily explained. Because policy makers are more likely to be influenced by models whose predictions can be explained, the more interpretable a model is the more likely it is to influence policy.
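For readers unfamiliar with the AUPRC metric reported above, the sketch below shows one standard way to estimate it, namely average precision over a ranked list of predictions. It is an illustration only: the function name `average_precision`, the toy labels and the toy scores are hypothetical, and the article may compute the area under the precision-recall curve differently.

```python
def average_precision(labels: list[int], scores: list[float]) -> float:
    """Average precision: mean of the precision values at each true positive.

    Assumes binary labels (1 = event, 0 = no event), at least one positive,
    and distinct scores (ties are broken by input order).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    positives_seen = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            positives_seen += 1
            precision_sum += positives_seen / rank
    if positives_seen == 0:
        raise ValueError("at least one positive label is required")
    return precision_sum / positives_seen


# Toy country-year predictions, not data from the article.
labels = [0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1, 0.05, 0.4]
ap = average_precision(labels, scores)
```

Because precision depends on how rare the positive class is, AUPRC values are best read relative to the base rate of the event rather than on an absolute scale.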
In 2016, the gambling habits of a sample of 3361 adults in the state of Victoria, Australia, were surveyed. It was found that a number of factors that were highly correlated with self-reported gambling frequency and gambling problems were not significant predictors of gambling frequency and problem gambling. The major predictors of gambling frequency were the degree to which family members and peers were perceived to gamble, self-reported approval of gambling, the frequency of discussing gambling offline, and the participant's Canadian Problem Gambling Severity Index (PGSI) score. Age was a significant predictor of gambling frequency for certain types of gambling (e.g. buying lottery tickets). Approximately 91% of the explainable variance in participants' PGSI scores could be explained by just five predictors: Positive Urgency; Frequency of playing poker machines at pubs, hotels or sporting clubs; Participation in online discussions of betting on gaming tables at casinos; Frequency of gambling on the internet; and Overestimating the chances of winning. Based on these findings, suggestions are made as to how gambling-related harm can be reduced.
The descriptive norm effect refers to findings that individuals will tend to prefer behaving certain ways when they know that other people behave similarly. An open question is whether individuals will still conform to other people's behaviour when they do not identify with these other people, such as a Democrat being biased towards following a popular behaviour amongst Republicans. Self-categorization theory makes the intuitive prediction that people will actively avoid conforming to the norms of groups with which they do not identify. We tested this by informing participants that a particular action was more popular amongst people they identified with and additionally informed some participants that this action was unpopular amongst people they did not identify with. Specifically, we presented descriptive norms of people who supported different political parties or had opposing stances on important social issues. Counter to self-categorization theory's prediction, we found that informing participants that an action was unpopular amongst people they did not identify with led participants' preferences to shift away from that action. These results suggest that a general desire to conform with others may outpower the common ingroup vs outgroup mentality.
Diagnosing certain fractures in conventional radiographs can be a difficult task, usually taking years to master. Typically, students are trained ad hoc, in a primarily rule-based fashion. Our study investigated whether students can more rapidly learn to diagnose proximal neck of femur fractures via perceptual training, without having to learn an explicit set of rules. One hundred and thirty-nine students with no prior medical or radiology training were shown a sequence of plain film X-ray images of the right hip and for each image were asked to indicate whether a fracture was present. Students were told if they were correct and the location of any fracture, if present. No other feedback was given. The more able students achieved the same level of accuracy as board-certified radiologists at identifying hip fractures in less than an hour of training. Surprisingly, perceptual learning was reduced when the training set was constructed to over-represent the types of images participants found more difficult to categorise. Conversely, repeating training images did not reduce post-training performance relative to showing an equivalent number of unique images. Perceptual training is an effective way of helping novices learn to identify hip fractures in X-ray images and should supplement the current education programme for students.
People look at what they are interested in, and their emotional expressions tend to indicate how they feel about the objects at which they look. The combination of gaze direction and emotional expression can therefore convey important information about people's evaluations of the objects in their environment, and can even influence the subsequent evaluations of those objects by a third party, a phenomenon known as the emotional gaze effect. The present study extended research into the effect of emotional gaze cues by investigating whether they affect evaluations of the most important aspect of our social environment (other people) and whether the presence of multiple gaze cues enhances this effect. Over four experiments, a factorial within-subjects design employing both null hypothesis significance testing and a Bayesian statistical analysis replicated previous work showing an emotional gaze effect for objects, but found strong evidence that emotional gaze cues do not affect evaluations of other people, and that multiple, simultaneously presented gaze cues do not enhance the emotional gaze effect for either the evaluations of objects or of people. Overall, our results suggest that emotional gaze cues have a relatively weak influence on affective evaluations, especially of those aspects of our environment that automatically elicit affectively valenced reactions, including other humans.
Breast screening is an important tool for the early detection of breast cancers. However, tumours are typically present in less than 1% of mammograms. This low prevalence could cause radiologists to detect fewer tumours than they otherwise would, an issue known as the prevalence effect. The aim of our study was to investigate a novel breast screening protocol, designed to decrease the number of tumours missed by radiologists, without increasing their workload. We ran two laboratory-based experiments to assess the degree to which the novel protocol, called the catch trial (CT) protocol, resulted in greater sensitivity (d') than the double screener protocol (DS), currently utilised in Australia. In our first experiment we found evidence that the CT protocol resulted in a criterion shift relative to the DS protocol but the evidence that sensitivity was greater in the CT protocol relative to the DS protocol was less clear. A second experiment, using more realistic stimuli that were more representative of actual tumours, also failed to find convincing evidence that sensitivity was greater in the CT protocol than in the DS protocol. This experiment instead found that both the hit rate and the false alarm rate increased in the CT protocol relative to the DS protocol. So while there was again evidence that the CT protocol induced a criterion shift, the sensitivity appeared to be approximately the same in both protocols. Our results suggest the CT protocol is unlikely to result in an improvement in sensitivity over the DS protocol, so we cannot recommend that it be trialled in a clinical setting.
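The distinction drawn above between a criterion shift and a change in sensitivity can be made concrete with the standard equal-variance Gaussian signal detection model, in which d' = z(hit rate) - z(false alarm rate) and the criterion is c = -(z(hit rate) + z(false alarm rate)) / 2. The sketch below is illustrative only: the function names and the numeric values are hypothetical, and the study may have estimated these quantities differently.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def sdt_measures(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion) under the equal-variance Gaussian model."""
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


def rates_from(d_prime: float, criterion: float) -> tuple[float, float]:
    """Return (hit_rate, false_alarm_rate) implied by a given d' and criterion."""
    hit_rate = _STD_NORMAL.cdf(d_prime / 2 - criterion)
    false_alarm_rate = _STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit_rate, false_alarm_rate


# Two hypothetical observers with the same sensitivity but different criteria.
conservative = rates_from(d_prime=1.8, criterion=0.4)
liberal = rates_from(d_prime=1.8, criterion=0.1)

d_conservative, c_conservative = sdt_measures(*conservative)
d_liberal, c_liberal = sdt_measures(*liberal)
```

In this toy case the more liberal observer has both a higher hit rate and a higher false alarm rate, yet the recovered d' is identical, which is the pattern a pure criterion shift produces.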
To understand how the visual system represents multiple moving objects and how those representations contribute to tracking, it is essential that we understand how the processes of attention and working memory interact. In the work described here we present an investigation of that interaction via a series of tracking and working memory dual-task experiments. Previously, it has been argued that tracking is resistant to disruption by a concurrent working memory task and that any apparent disruption is in fact due to observers making a response to the working memory task, rather than due to competition for shared resources. Contrary to this, in our experiments we find that when task order and response order confounds are avoided, all participants show a similar decrease in both tracking and working memory performance. However, if task and response order confounds are not adequately controlled for, we find substantial individual differences, which could explain the previous conflicting reports on this topic. Our results provide clear evidence that tracking and working memory tasks share processing resources.
Introduction
To evaluate the accuracy of deep convolutional neural networks (DCNNs) for detecting neck of femur (NoF) fractures on radiographs, in comparison with perceptual training in medically‐naïve individuals.
Methods
This study extends a previous study that conducted perceptual training in medically‐naïve individuals for the detection of NoF fractures on a variety of dataset sizes. The same anteroposterior hip radiograph dataset was used to train two DCNNs (AlexNet and GoogLeNet) to detect NoF fractures. For direct comparison with perceptual training results, deep learning was completed across a variety of dataset sizes (200, 320 and 640 images) with images split into training (80%) and validation (20%). An additional 160 images were used as the final test set. Multiple pre‐processing and augmentation techniques were utilised.
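The 80%/20% training and validation split described above can be sketched as follows. This is a minimal illustration that assumes a simple random shuffle; the abstract does not say how images were assigned to each subset, and the function name `split_train_validation` and the `seed` parameter are hypothetical.

```python
import random


def split_train_validation(items: list, seed: int = 0) -> tuple[list, list]:
    """Shuffle a copy of items and split it 80% training / 20% validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 4 // 5  # integer arithmetic avoids float rounding
    return shuffled[:n_train], shuffled[n_train:]


# Image identifiers are placeholders; the dataset sizes are those in the text.
split_sizes = {}
for n_images in (200, 320, 640):
    train, validation = split_train_validation(list(range(n_images)))
    split_sizes[n_images] = (len(train), len(validation))
```

For the three dataset sizes in the text this gives 160/40, 256/64 and 512/128 training/validation images, with the 160 test images held out separately.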
Results
NoF fracture detection accuracy for both the AlexNet and GoogLeNet DCNNs increased with larger training dataset sizes and, to a lesser degree, with augmentation. Accuracy increased from 81.9% and 88.1% to 89.4% and 94.4% for AlexNet and GoogLeNet, respectively. Similarly, the test accuracy for the perceptual training in top‐performing medically‐naïve individuals increased from 87.6% to 90.5% when trained on 640 images compared with 200 images.
Conclusions
Single detection tasks in radiology are commonly used in DCNN research with their results often used to make broader claims about machine learning being able to perform as well as subspecialty radiologists. This study suggests that as impressive as recognising fractures is for a DCNN, similar learning can be achieved by top‐performing medically‐naïve humans with less than 1 hour of perceptual training.