Current task-oriented dialog (TOD) systems mostly manage structured knowledge (e.g., databases and tables) to guide goal-oriented conversations. However, they fall short of handling dialogs that also involve unstructured knowledge (e.g., reviews and documents). In this paper, we formulate the task of modeling TOD grounded on a fusion of structured and unstructured knowledge. To address this task, we propose a TOD system with semi-structured knowledge management, SeKnow, which extends the belief state to manage knowledge with both structured and unstructured contents. Furthermore, we introduce two implementations of SeKnow, based on a non-pretrained sequence-to-sequence model and a pretrained language model, respectively. Both implementations are trained end-to-end to jointly optimize dialog modeling grounded on structured and unstructured knowledge. We conduct experiments on a modified version of the MultiWOZ 2.1 dataset, Mod-MultiWOZ 2.1, in which dialogs are processed to involve semi-structured knowledge. Experimental results show that SeKnow achieves strong performance in both end-to-end dialog and intermediate knowledge management, compared to existing TOD systems and their extensions with pipeline knowledge management schemes.
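The extended belief state at the heart of SeKnow can be thought of as tracking structured slot-value constraints alongside a pointer to relevant unstructured content. The sketch below illustrates that idea under our own assumptions; the class and field names are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sketch of a semi-structured belief state in the spirit of
# SeKnow: structured slot-value constraints plus a reference to the topic
# of a relevant unstructured document. All names here are hypothetical.
@dataclass
class SemiStructuredBeliefState:
    domain: str
    slots: dict = field(default_factory=dict)  # structured part, e.g. {"area": "centre"}
    doc_topic: Optional[str] = None            # unstructured part: topic of a relevant document

    def update(self, slot: str, value: str) -> None:
        self.slots[slot] = value

    def needs_document(self) -> bool:
        # The turn is grounded on unstructured knowledge only when a
        # document topic has been predicted for it.
        return self.doc_topic is not None

state = SemiStructuredBeliefState(domain="hotel")
state.update("area", "centre")
state.doc_topic = "cancellation policy"
```

In this reading, a turn answered from the database leaves `doc_topic` unset, while a turn about, say, a cancellation policy sets it, so a single state object covers both grounding modes.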
For machines to understand language, they must intuitively grasp the commonsense knowledge that underlies the situations they encounter in text. A simple statement such as "it is raining" immediately implies a bank of shared context for any human reader: they should bring an umbrella, roads will be slippery, increased traffic may make them late, rain boots are preferable to sandals, and many more. Language understanding systems must be able to use this commonsense knowledge robustly to make decisions or take actions. Observations of the world are always richer and more detailed than the information explicitly transmitted through language, and machines must be able to fill in the remaining details with commonsense inferences.

Recent advances in natural language processing have made considerable progress in identifying the commonsense implications of situations described in text. These methods generally involve training high-parameter language models on large language corpora and have shown marked improvement on a variety of benchmark end tasks in natural language understanding. However, these systems are brittle, often failing when presented with out-of-distribution inputs, and uninterpretable, incapable of providing insight into why these different inputs cause shifted behavior. Meanwhile, traditional approaches to natural language understanding, which focus on linking language to background knowledge from large ontologies, remain limited by their inability to scale to the situational diversity expressed through language.

In this dissertation, we argue that for natural language understanding agents to function in less controlled test environments, they must learn to reason more explicitly about the commonsense knowledge underlying textual situations. In furtherance of these goals, we draw from both traditional symbolic and modern neural approaches to natural language understanding.
We present four studies on learning commonsense representations from language, and integrating and reasoning about these representations in NLP systems to achieve more robust textual understanding.
Writing high-quality procedural texts is a challenging task for many learners. While example-based learning has shown promise as a feedback approach, a limitation arises when all learners receive the same content without consideration of their individual input or prior knowledge. Consequently, some learners struggle to grasp or relate to the feedback, finding it redundant and unhelpful. To address this issue, we present an adaptive learning system designed to enhance procedural writing through personalized example-based learning. The core of our system is a multi-step example retrieval pipeline that selects a higher-quality, contextually relevant example for each learner based on their unique input. We instantiate our system in the domain of cooking recipes. Specifically, we leverage a fine-tuned Large Language Model to predict the quality score of the learner's cooking recipe. Using this score, we retrieve higher-quality recipes from a vast database of over 180,000 recipes. Next, we select the semantically most similar recipe in real time. Finally, we use domain knowledge and regular expressions to enrich the selected example recipe with personalized instructional explanations. We evaluate the system in a 2×2 controlled study (personalized vs. non-personalized examples, reflective prompts vs. none) with 200 participants. Our results show that providing tailored examples contributes to better writing performance and user experience.
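The multi-step retrieval pipeline described above, score the learner's draft, filter for higher-quality candidates, then pick the semantically closest one, can be sketched as follows. This is a toy illustration under our own assumptions: the quality scorer is a naive stand-in for the fine-tuned LLM, and similarity is plain bag-of-words cosine rather than the (unnamed) semantic matcher used in the paper.

```python
import math
from collections import Counter

def predict_quality(recipe: str) -> float:
    # Stand-in for the fine-tuned LLM quality scorer described in the
    # paper; here a naive proxy (longer, more detailed text scores higher).
    return min(1.0, len(recipe.split()) / 50)

def cosine_similarity(a: str, b: str) -> float:
    # Bag-of-words cosine similarity as a crude semantic-similarity proxy.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_example(learner_recipe: str, database: list) -> str:
    # Step 1: score the learner's draft.
    learner_score = predict_quality(learner_recipe)
    # Step 2: keep only candidates with a strictly higher quality score.
    better = [r for r in database if predict_quality(r) > learner_score]
    # Step 3: return the semantically most similar higher-quality example.
    return max(better, key=lambda r: cosine_similarity(learner_recipe, r))

learner = "boil pasta then add tomato sauce"
database = [
    "boil pasta in salted water then drain and add tomato sauce with fresh basil on top",
    "roast the chicken with herbs in the oven until golden and rest before carving it",
]
best = retrieve_example(learner, database)
```

Filtering before the similarity step matters: it guarantees the learner always sees an example that is both relatable (similar) and aspirational (higher quality), rather than merely the nearest neighbor.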
Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs ...(CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about the quality and coverage of these resources due to the massive scale required to comprehensively encompass general commonsense knowledge.
In this work, we posit that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents. Therefore, we propose a new evaluation framework for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them.
With this new goal, we propose Atomic 2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models. We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that Atomic 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains ~12 absolute points lower than a BART-based knowledge model trained on Atomic 2020 despite using over 430x fewer parameters.
Understanding narratives requires reasoning about implicit world knowledge related to the causes, effects, and states of situations described in text. At the core of this challenge is how to access contextually relevant knowledge on demand and reason over it.
In this paper, we present initial studies toward zero-shot commonsense question answering by formulating the task as inference over dynamically generated commonsense knowledge graphs. In contrast to previous studies for knowledge integration that rely on retrieval of existing knowledge from static knowledge graphs, our study requires commonsense knowledge integration where contextually relevant knowledge is often not present in existing knowledge bases. Therefore, we present a novel approach that generates contextually-relevant symbolic knowledge structures on demand using generative neural commonsense knowledge models.
Empirical results on two datasets demonstrate the efficacy of our neuro-symbolic approach for dynamically constructing knowledge graphs for reasoning. Our approach achieves significant performance boosts over pretrained language models and vanilla knowledge models, all while providing interpretable reasoning paths for its predictions.
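The core mechanism of the approach above, generating contextually relevant knowledge on demand and expanding it into a graph to reason over, can be sketched as follows. A lookup table stands in for the generative neural knowledge model (e.g., a COMET-style model); the events, relations, and function names are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of dynamically constructing a commonsense graph from a
# context. A real system would query a neural knowledge model; here a
# small lookup table stands in for the generator.
STUB_KNOWLEDGE_MODEL = {
    ("it is raining", "xEffect"): ["gets wet", "takes an umbrella"],
    ("gets wet", "xWant"): ["to dry off"],
}

def generate_inferences(event: str, relations: list) -> dict:
    # On-demand generation: query the (stub) knowledge model per relation.
    return {rel: STUB_KNOWLEDGE_MODEL.get((event, rel), []) for rel in relations}

def build_graph(context: str, relations: list, depth: int = 2) -> set:
    # Expand breadth-first: each generated tail becomes a new node whose
    # own inferences are generated at the next step, up to `depth` hops.
    edges, frontier = set(), [context]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for rel, tails in generate_inferences(node, relations).items():
                for tail in tails:
                    edges.add((node, rel, tail))
                    next_frontier.append(tail)
        frontier = next_frontier
    return edges

graph = build_graph("it is raining", ["xEffect", "xWant"])
```

Because the graph is generated from the context rather than retrieved from a static KB, multi-hop paths such as "it is raining" → "gets wet" → "to dry off" are available even when no existing knowledge base contains the exact context, which is also what makes the resulting reasoning paths inspectable.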
Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes than conventional KBs (∼18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes.

In this paper, we present novel KB completion models that address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pretrained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method for incorporating information from both of these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions sheds light on the types of commonsense knowledge that language models capture well.
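The joint model's two ideas, semantic node representations from a language model and structural representations from graph neighborhoods, can be illustrated with a toy scoring sketch. Everything here is a simplification under our own assumptions: hand-picked 2-d vectors stand in for LM embeddings, a single neighborhood-averaging step stands in for a graph convolution layer, and the relation is omitted from the score entirely.

```python
import math

# Toy sketch of scoring a candidate link by combining a "semantic" node
# vector (stand-in for a pretrained LM embedding) with a "structural"
# vector averaged over graph neighbors (stand-in for a GCN layer).
SEM = {
    "person drinks coffee": [1.0, 0.2],
    "person feels awake":   [0.9, 0.3],
    "person feels sleepy":  [-0.8, 0.1],
}
NEIGHBORS = {
    "person drinks coffee": ["person feels awake"],
}

def node_vector(node: str) -> list:
    sem = SEM[node]
    neigh = NEIGHBORS.get(node, [])
    if not neigh:
        return sem
    # One round of neighborhood averaging: the core idea behind GCN layers,
    # and a crude form of densification for sparse commonsense graphs.
    avg = [sum(SEM[n][i] for n in neigh) / len(neigh) for i in range(len(sem))]
    return [(s + a) / 2 for s, a in zip(sem, avg)]

def score(head: str, tail: str) -> float:
    # Cosine similarity of the combined representations; plausible links
    # should score higher than implausible ones.
    h, t = node_vector(head), node_vector(tail)
    dot = sum(a * b for a, b in zip(h, t))
    return dot / (math.hypot(*h) * math.hypot(*t))

s_awake = score("person drinks coffee", "person feels awake")
s_sleepy = score("person drinks coffee", "person feels sleepy")
```

The point of the sketch is the division of labor: the semantic vectors let the model generalize across free-form node text, while the structural averaging lets well-connected neighbors sharpen a node's representation, which is exactly where sparse commonsense graphs need the most help.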
In this paper, we present kogito, an open-source tool for generating commonsense inferences about situations described in text. kogito provides an intuitive and extensible interface for interacting with natural language generation models that can be used to hypothesize commonsense knowledge inferences from a textual input. In particular, kogito offers several features for targeted, multi-granularity knowledge generation. These include a standardized API for training and evaluating knowledge models, and for generating and filtering inferences from them. We also include helper functions for converting natural language texts into a format ingestible by knowledge models, covering intermediate pipeline stages such as knowledge head extraction from text and heuristic and model-based knowledge head-relation matching, as well as the ability to define and use custom knowledge relations. We make the code for kogito available at https://github.com/epfl-nlp/kogito along with thorough documentation at https://kogito.readthedocs.io.