Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all machine learning–based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem in estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful for mitigating unwanted bias ingrained in the data.
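One way to picture the auxiliary adversarial fine-tuning is as a gradient-reversal setup: the encoder is trained so that a probe cannot recover the treated concept from its representation, while a control head preserves task-relevant information. Below is a minimal PyTorch sketch under that reading; the module and head names are illustrative stand-ins, not CausaLM's actual implementation.

# Minimal sketch of adversarial concept removal via gradient reversal (PyTorch).
# All names (encoder, concept_head, task_head) are illustrative, not CausaLM's code.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

class CounterfactualEncoder(nn.Module):
    def __init__(self, encoder, hidden_dim, n_concepts, n_labels):
        super().__init__()
        self.encoder = encoder  # any module returning (batch, hidden_dim) states
        self.concept_head = nn.Linear(hidden_dim, n_concepts)  # adversarial head
        self.task_head = nn.Linear(hidden_dim, n_labels)       # control head

    def forward(self, x):
        h = self.encoder(x)
        # Reversed gradients train the encoder to *fail* at predicting the
        # treated concept, removing it from the representation.
        concept_logits = self.concept_head(GradReverse.apply(h))
        task_logits = self.task_head(h)
        return task_logits, concept_logits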
Sports competitions are widely researched in computer and social science, with
the goal of understanding how players act under uncertainty. Although there is
an abundance of computational work on player metrics prediction based on past
performance, very few attempts to incorporate out-of-game signals have been
made. Specifically, it was previously unclear whether linguistic signals
gathered from players’ interviews can add information that does not
appear in performance metrics. To bridge that gap, we define text classification
tasks of predicting deviations from the mean in NBA players’ in-game actions,
which are associated with strategic choices, player behavior, and risk, using
their choice of language prior to the game. We collected a data set of
transcripts from key NBA players’ pre-game interviews and their in-game
performance metrics, totalling 5,226 interview-metric pairs. We design neural
models for players’ action prediction based on increasingly complex
aspects of the language signals in their open-ended interviews. Our models can
make their predictions based on the textual signal alone, or on a combination of
that signal with signals from past-performance metrics. Our text-based models
outperform strong baselines trained on performance metrics only, demonstrating
the importance of language usage for action prediction. Moreover, the models
that utilize both textual input and past-performance metrics produce the best
results. Finally, as neural networks are notoriously difficult to interpret, we
propose a method for gaining further insight into what our models have learned.
Particularly, we present a latent Dirichlet allocation–based analysis,
where we interpret model predictions in terms of correlated topics. We find
that our best-performing textual model is most associated with topics that are
intuitively related to each prediction task, and that better models yield
higher correlations with more informative topics.
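The topic-correlation analysis can be sketched as follows: fit LDA over the interview texts and correlate each document's topic proportions with the model's predictions. This is a schematic reconstruction with hypothetical names, not the paper's released code.

# Schematic reconstruction of the LDA-based interpretation step: correlate
# per-interview topic proportions with model predictions. Hypothetical names.
from scipy.stats import pearsonr
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def topic_prediction_correlations(interviews, predictions, n_topics=20):
    counts = CountVectorizer(max_features=5000,
                             stop_words="english").fit_transform(interviews)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    doc_topics = lda.fit_transform(counts)  # shape: (n_interviews, n_topics)
    # Pearson correlation between each topic's weight and the predicted labels.
    return [pearsonr(doc_topics[:, k], predictions)[0] for k in range(n_topics)]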
A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets, and clear articulations of the challenges and opportunities in the application of causal inference to the textual domain, with its unique properties. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects with text, encompassing settings where text is used as an outcome, treatment, or to address confounding. In addition, we explore potential uses of causal inference to improve the robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the NLP community.
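As a toy illustration of the text-as-confounder setting mentioned above, one can condition on a text representation when estimating a treatment effect, for example via inverse-propensity weighting. The sketch below is entirely schematic; the representation and estimator choices are assumptions, not prescriptions from the survey.

# Toy illustration of adjusting for text as a confounder via inverse-propensity
# weighting. Entirely schematic; representation and estimator are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def ate_ipw_text(text_features, treatment, outcome):
    """text_features: (n, d) document representations (e.g., embeddings);
    treatment: length-n binary array; outcome: length-n array."""
    t = np.asarray(treatment, dtype=float)
    y = np.asarray(outcome, dtype=float)
    # Propensity of treatment given the text-derived confounders.
    e = LogisticRegression(max_iter=1000).fit(
        text_features, t).predict_proba(text_features)[:, 1]
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))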
Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we record neural activity patterns in the inferior frontal gyrus (IFG) of three participants, using dense intracranial arrays, while they listened to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. Using stringent zero-shot mapping, we demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns. These common geometric patterns allow us to predict the brain embedding in the IFG of a given left-out word based solely on its geometric relationship to other, non-overlapping words in the podcast. Furthermore, we show that contextual embeddings capture the geometry of IFG embeddings better than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
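The zero-shot mapping procedure can be sketched as leave-one-out linear regression: learn a map from contextual embeddings to brain embeddings on all other words, then check how highly the held-out word's true brain embedding ranks among candidates. The ridge regressor and cosine ranking below are illustrative choices, not the study's exact pipeline.

# Schematic leave-one-out zero-shot mapping: predict a held-out word's brain
# embedding from its DLM embedding and report its retrieval rank (0 = best).
import numpy as np
from sklearn.linear_model import Ridge

def zero_shot_rank(dlm_emb, brain_emb, held_out):
    """dlm_emb, brain_emb: (n_words, d) arrays aligned by word."""
    mask = np.arange(len(dlm_emb)) != held_out
    model = Ridge(alpha=1.0).fit(dlm_emb[mask], brain_emb[mask])
    pred = model.predict(dlm_emb[held_out:held_out + 1])[0]
    # Cosine similarity of the predicted embedding to every word's true one.
    sims = brain_emb @ pred / (
        np.linalg.norm(brain_emb, axis=1) * np.linalg.norm(pred))
    return int(np.where(np.argsort(-sims) == held_out)[0][0])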
Departing from traditional linguistic models, advances in deep learning have resulted in a new type of predictive (autoregressive) deep language model (DLM). Using a self-supervised next-word prediction task, these models generate appropriate linguistic responses in a given context. In the current study, nine participants listened to a 30-min podcast while their brain responses were recorded using electrocorticography (ECoG). We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process the same natural narrative: (1) both are engaged in continuous next-word prediction before word onset; (2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise; (3) both rely on contextual embeddings to represent words in natural contexts. Together, our findings suggest that autoregressive DLMs provide a new and biologically feasible computational framework for studying the neural basis of language.
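Principle (2), post-onset surprise, corresponds to the standard surprisal quantity from an autoregressive language model. A minimal sketch using Hugging Face's GPT-2 follows; the model choice is illustrative, and these are per-token (subword) surprisals rather than the study's word-level alignment.

# Per-token surprisal from an autoregressive LM (illustrative GPT-2 pipeline).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprisals(text):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits
    logp = torch.log_softmax(logits[0, :-1], dim=-1)
    # Surprisal of token t: -log p(token_t | tokens_<t), in nats.
    return (-logp[torch.arange(ids.shape[1] - 1), ids[0, 1:]]).tolist()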
The recent introduction of thousands of cryptocurrencies in an unregulated environment has created many opportunities for unscrupulous traders to profit from price manipulation. We quantify the scope of one widespread tactic, the “pump and dump”, in which actors coordinate to bid up the price of coins before selling at a profit. We joined all relevant channels on two popular group-messaging platforms, Telegram and Discord, and identified thousands of different pumps targeting hundreds of coins. We find that pumps are modestly successful in driving short-term price rises, but that this effect has diminished over time. We also find that the most successful pumps are those that are most transparent about their intentions. Combined with evidence of concentration among a small number of channels, we conclude that regulators have an opportunity to effectively crack down on this illicit activity that threatens broader adoption of blockchain technologies.
• We gather the most comprehensive dataset on cryptocurrency pump and dump schemes.
• Pumped coins experience a modest price increase, but it is higher for less popular coins.
• Brazen pumps are more “successful” than those obscuring their corrupt intent.
• High concentration among pump channels creates opportunity for regulators.
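The short-term price effect measured in this study can be approximated with a simple event-window computation: compare the last pre-pump price to the peak price shortly after the announced pump time. The pandas sketch below is schematic; the column name and five-minute window are assumptions, not the paper's exact design.

# Schematic event-window measurement of a pump's short-term price effect.
# The 'price' column and five-minute window are assumptions.
import pandas as pd

def pump_return(prices: pd.DataFrame, pump_time: pd.Timestamp, window="5min"):
    """prices: DataFrame with a DatetimeIndex and a 'price' column."""
    before = prices.loc[:pump_time, "price"].iloc[-1]  # last pre-pump price
    peak = prices.loc[pump_time:pump_time + pd.Timedelta(window), "price"].max()
    return (peak - before) / before  # fractional short-term price rise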
• Clinical notes contain potentially identifying residual demographic traits.
• Rarity of demographic traits poses a problem for machine learning algorithms.
• An active learning process mitigates this problem, increasing positive labels.
• BERT can detect demographic traits with high accuracy through this process.
The free-form portions of clinical notes are a significant source of information for research, but before they can be used, they must be de-identified to protect patients' privacy. De-identification efforts have focused on known identifier types (names, ages, dates, addresses, IDs, etc.). However, a note can contain residual “Demographic Traits” (DTs), unique enough to re-identify the patient when combined with other such facts. Here we examine whether any residual risks remain after removing these identifiers. After manually annotating over 140,000 words' worth of medical notes, we found no remaining directly identifying information, and a low prevalence of demographic traits, such as marital status or housing type. We developed an annotation guide to the discovered DTs and used it to label MIMIC-III and i2b2-2006 clinical notes as test sets. We then designed a “bootstrapped” active learning iterative process for identifying DTs: we tentatively labeled as positive all sentences in the DT-rich note sections, used these to train a binary classifier, manually corrected acute errors, and retrained the classifier. This train-and-correct process may be iterated. Our active learning process significantly improved the classifier's accuracy. Moreover, our BERT-based model outperformed non-neural models when trained on both tentatively labeled data and manually relabeled examples. To facilitate future research and benchmarking, we also produced and made publicly available our human-annotated DT-tagged datasets. We conclude that directly identifying information is virtually non-existent in the multiple medical note types we investigated. Demographic traits are present in medical notes, but can be detected with high accuracy using a cost-effective human-in-the-loop active learning process, and redacted if desired. Annotations are available at: https://www.kaggle.com/google-health/demographic-traits-annotations
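The bootstrapped train-and-correct loop can be sketched as below. A TF-IDF plus logistic regression pipeline stands in for the paper's BERT classifier, and the manual correction step is abstracted as a callback; all names are illustrative.

# Schematic bootstrapped train-and-correct loop. TF-IDF + logistic regression
# stands in for the paper's BERT model; `human_correct` abstracts manual review.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def bootstrap_active_learning(sentences, dt_rich_mask, human_correct, n_rounds=3):
    # Seed: tentatively label every sentence from DT-rich note sections positive.
    labels = [int(rich) for rich in dt_rich_mask]
    clf = None
    for _ in range(n_rounds):
        clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        clf.fit(sentences, labels)
        # Human annotators correct acute errors before the next retraining round.
        labels = human_correct(sentences, clf.predict(sentences), labels)
    return clf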
Large language models (LLMs) have shown an impressive ability to perform tasks believed to require thought processes. When a model does not document an explicit thought process, it becomes difficult to understand the processes occurring within its hidden layers and to determine whether these processes can be referred to as reasoning. We introduce a novel and interpretable analysis of internal multi-hop reasoning processes in LLMs. We demonstrate that the prediction process for compositional reasoning questions can be modeled using a simple linear transformation between two semantic category spaces. We show that during inference, the middle layers of the network generate highly interpretable embeddings that represent a set of potential intermediate answers for the multi-hop question. We use statistical analyses to show that a corresponding subset of tokens is activated in the model's output, implying the existence of parallel reasoning paths. These observations hold true even when the model lacks the necessary knowledge to solve the task. Our findings can help uncover the strategies that LLMs use to solve reasoning tasks, offering insights into the types of thought processes that can emerge from artificial intelligence. Finally, we discuss the implications of these results for cognitive modeling.
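The linear transformation between semantic category spaces can be sketched as a least-squares map fit on paired concept embeddings (for instance, intermediate answers to final answers), decoded by nearest neighbor. The NumPy sketch below is a schematic reconstruction; the embedding source and pairing are assumptions.

# Schematic linear map between two semantic category spaces: fit W so that
# src_emb @ W approximates tgt_emb on paired concepts, then decode by nearest
# neighbor. Embedding source and concept pairing are assumptions.
import numpy as np

def fit_category_map(src_emb, tgt_emb):
    """src_emb, tgt_emb: (n_pairs, d) embeddings of paired concepts."""
    W, *_ = np.linalg.lstsq(src_emb, tgt_emb, rcond=None)
    return W  # apply as src_vec @ W to move into the target category space

def nearest_concept(pred_vec, candidate_emb):
    sims = candidate_emb @ pred_vec / (
        np.linalg.norm(candidate_emb, axis=1) * np.linalg.norm(pred_vec))
    return int(np.argmax(sims))  # index of the closest candidate concept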
The reliance of text classifiers on spurious correlations can lead to poor
generalization at deployment, raising concerns about their use in
safety-critical domains such as healthcare. In this work, we propose to use
counterfactual data augmentation, guided by knowledge of the causal structure
of the data, to simulate interventions on spurious features and to learn more
robust text classifiers. We show that this strategy is appropriate in
prediction problems where the label is spuriously correlated with an attribute.
Under the assumptions of such problems, we discuss the favorable sample
complexity of counterfactual data augmentation, compared to importance
re-weighting. Pragmatically, we match examples using auxiliary data, based on
difference-in-differences (diff-in-diff) methodology, and use a large language
model (LLM) to represent a
conditional probability of text. Through extensive experimentation on learning
caregiver-invariant predictors of clinical diagnoses from medical narratives
and on semi-synthetic data, we demonstrate that our method for simulating
interventions improves out-of-distribution (OOD) accuracy compared to baseline
invariant learning algorithms.
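At a high level, the augmentation step pairs each example with a matched counterfactual whose spurious attribute is flipped, breaking the label-attribute correlation in the training set. The sketch below abstracts the matching and LLM-based rewriting as callbacks; it is a schematic reading of the procedure, not the authors' code.

# Schematic counterfactual augmentation: pair each example with a rewrite whose
# spurious attribute is flipped, so the label-attribute correlation is broken.
# `match_fn` (diff-in-diff-style matching) and `rewrite_fn` (LLM rewriting) are
# stand-ins for the paper's procedure, injected as callbacks.
def augment_with_counterfactuals(examples, match_fn, rewrite_fn):
    """examples: iterable of (text, label, attribute) triples."""
    augmented = []
    for text, label, attr in examples:
        augmented.append((text, label))
        flipped_attr = match_fn(text, attr)        # choose the opposite attribute
        counterfactual = rewrite_fn(text, flipped_attr)
        augmented.append((counterfactual, label))  # same label, flipped attribute
    return augmented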