Abstract
A knowledge graph connects real-world entities and concepts through their relationships, linking different types of information into a relational network over which "relationship" questions can be analyzed. Constructing a knowledge graph is a continuous process: as time passes and events change, the graph must keep learning new knowledge and updating the existing knowledge in its base. However, since the accuracy of newly added knowledge cannot be guaranteed, the new knowledge must be verified. This paper studies a knowledge verification method for artificial-intelligence-based knowledge graph construction. Building on an analysis of the knowledge graph construction process, construction methods, and verification methods, knowledge verification is realized by constructing a probabilistic soft logic model. The experimental results evaluate the proposed knowledge verification model on the candidate knowledge set in terms of recall, F1, and AUC, supporting the inference that the proposed model is effective.
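Verification here rests on probabilistic soft logic. As a minimal sketch of the machinery PSL relies on (the standard Łukasiewicz relaxation, not this paper's specific model), Boolean connectives over truth values in [0, 1] and a rule's distance to satisfaction can be written as:

```python
def l_and(a, b):
    """Lukasiewicz t-norm (soft AND) on truth values in [0, 1]."""
    return max(a + b - 1.0, 0.0)

def l_or(a, b):
    """Lukasiewicz t-conorm (soft OR)."""
    return min(a + b, 1.0)

def l_not(a):
    """Soft negation."""
    return 1.0 - a

def rule_distance(body, head):
    """Distance to satisfaction of a rule body -> head: zero whenever the
    head's truth is at least the body's, positive otherwise. PSL inference
    minimizes the weighted sum of such distances."""
    return max(body - head, 0.0)

# E.g. a candidate triple extracted with confidence 0.9 while the graph
# currently assigns it truth 0.4 leaves the rule half a unit from satisfied:
d = rule_distance(0.9, 0.4)
```

A rule's weight then scales its distance to satisfaction in PSL's overall objective, so strongly weighted rules dominate the verification decision.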
Deep neural networks have emerged as a flexible framework that achieves state-of-the-art performance in many NLP applications such as machine translation, named entity recognition, sentiment analysis, and part-of-speech tagging. The main advantage of these neural models is their ability to learn useful representations without hand-engineered features. Despite this success, these models still suffer from interpretability issues. More recently, probabilistic soft logic (PSL) has emerged as a promising framework based on first-order logic that achieves interesting results in both computer vision and NLP by capturing semantic relationships between entities. Moreover, unifying knowledge-driven and data-driven modeling approaches is a promising direction that should have an exciting impact on structured prediction problems. In this paper, we develop NeuralGLogic, a generalization of the model proposed by Hu et al. (2016) that combines deep neural networks with logic rules built using either Soft Logic (SL) or Probabilistic Soft Logic (PSL). Furthermore, we evaluate our framework with different neural network architectures on two NLP tasks: sentiment classification and part-of-speech tagging. Experimental results show that we improve over the baselines and outperform all previous state-of-the-art systems, emphasizing the utility of both SL and PSL rules in improving the interpretability of neural models and thus validating our intuition.
• We present NeuralGLogic, a novel and general framework that combines deep neural networks with either Soft Logic (SL) or Probabilistic Soft Logic (PSL) rules.
• Our framework is a generalization of the system developed by Hu et al. (2016).
• We tackle two NLP tasks: part-of-speech tagging and sentiment classification.
• In the sentiment classification task, we add more logic rules and improve the system's performance.
• We apply the framework to different neural network architectures to validate our intuition.
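As a rough sketch of the rule-integration step in this family of models (following the teacher-construction idea of Hu et al. (2016); the exact formulation used by NeuralGLogic may differ), a network's predicted distribution can be projected toward rule-consistent labels:

```python
import numpy as np

def project_to_rules(p, rule_sat, C=1.0):
    """Teacher construction: reweight the network distribution p by how well
    each label satisfies the soft-logic rules (rule_sat values in [0, 1]):
        q(y) ~ p(y) * exp(-C * (1 - rule_sat(y)))
    The student network is then trained to imitate q."""
    q = np.asarray(p) * np.exp(-C * (1.0 - np.asarray(rule_sat)))
    return q / q.sum()

# Hypothetical sentiment example: the network slightly prefers "negative",
# but a but-clause rule is satisfied almost only by "positive".
p = [0.55, 0.45]            # [negative, positive]
rule_sat = [0.1, 0.9]
q = project_to_rules(p, rule_sat, C=2.0)   # mass shifts toward "positive"
```

The constant `C` trades off how strongly the rules override the data-driven prediction; the example values are illustrative only.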
Statistical relational learning (SRL) frameworks are effective at defining probabilistic models over complex relational data. They often use weighted first-order logical rules, where the rule weights govern probabilistic interactions and are usually learned from data. Existing weight learning approaches typically attempt to learn a set of weights that maximizes some function of data likelihood; however, this does not always translate to optimal performance on a desired domain metric, such as accuracy or F1 score. In this paper, we introduce a taxonomy of search-based weight learning approaches for SRL frameworks that directly optimize weights on a chosen domain performance metric. To effectively apply these search-based approaches, we introduce a novel projection, referred to as scaled space (SS), that is an accurate representation of the true weight space. We show that SS removes redundancies in the weight space and captures the semantic distance between the possible weight configurations. In order to improve the efficiency of search, we also introduce an approximation of SS which simplifies the process of sampling weight configurations. We demonstrate these approaches on two state-of-the-art SRL frameworks: Markov logic networks and probabilistic soft logic. We perform empirical evaluation on five real-world datasets and evaluate them each on two different metrics. We also compare them against four other weight learning approaches. Our experimental results show that our proposed search-based approaches outperform likelihood-based approaches and yield up to a 10% improvement across a variety of performance metrics. Further, we perform an extensive evaluation to measure the robustness of our approach to different initializations and hyperparameters. The results indicate that our approach is both accurate and robust.
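A minimal sketch of the search-based idea (not the paper's actual algorithm, and with a crude normalization standing in for the scaled-space projection): sample weight configurations, normalize them so only relative magnitudes matter, and keep whichever scores best on the chosen metric. `score_fn` is a hypothetical stand-in for evaluating a metric such as F1 on validation data.

```python
import random

def search_weights(rules, score_fn, n_samples=200, seed=0):
    """Search-based weight learning, schematically: sample rule-weight
    configurations, project each onto the simplex so that only relative
    magnitudes matter, and keep the configuration that maximizes the
    chosen domain metric."""
    rng = random.Random(seed)
    best_w, best_score = None, float("-inf")
    for _ in range(n_samples):
        raw = [rng.random() for _ in rules]
        total = sum(raw) or 1.0
        w = [x / total for x in raw]   # scale-invariant representative
        score = score_fn(w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score

# Hypothetical objective standing in for a validation metric such as F1:
best_w, best_score = search_weights(["rule1", "rule2"],
                                    lambda w: -abs(w[0] - 0.7))
```

Random search is only the simplest member of the taxonomy; Bayesian optimization or local search would slot into the same loop.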
Credit scoring is an important topic in financial activities and bankruptcy prediction that has been extensively explored using deep neural network (DNN) methods. The accuracy of DNN-based credit scoring models relies heavily on large amounts of labeled data. However, purely data-driven learning makes it difficult to encode human intent to guide the model toward the desired patterns, and it leads to low model transparency. Therefore, the Probabilistic Soft Logic Posterior Regularization (PSLPR) framework is proposed for integrating prior knowledge in the form of logic rules with a neural network. First, the PSLPR framework calculates the rule satisfaction distance for each instance using a probabilistic soft logic formula. Second, the logic rules are integrated into the posterior distribution of the DNN output to form a logic output. Finally, a novel discrepancy loss, which measures the difference between the real label and the logic output, is used to incorporate the logic rules into the parameters of the neural network. Extensive experiments were conducted on two datasets, the Australian credit dataset and the credit card customer default dataset, using several performance metrics: PCC, Recall, F1, and AUC. The results show that, compared to the standard DNN model, the four evaluation metrics increase by 7.14%, 14.29%, 8.15%, and 5.43%, respectively, on the Australian credit dataset.
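A schematic sketch of the three PSLPR steps described above (function names and the exact combination form are assumptions, not the paper's code):

```python
import numpy as np

def logic_output(p, rule_dist, C=1.0):
    """Form the 'logic output' by reweighting the DNN posterior p with each
    label's distance to rule satisfaction (schematic posterior
    regularization):  q(y) ~ p(y) * exp(-C * rule_dist(y))."""
    q = np.asarray(p) * np.exp(-C * np.asarray(rule_dist))
    return q / q.sum()

def discrepancy_loss(y_true_onehot, q):
    """Cross-entropy between the true label and the logic output; minimizing
    it pulls the network parameters toward rule-consistent predictions."""
    return -float(np.sum(np.asarray(y_true_onehot) * np.log(q + 1e-12)))

# Hypothetical credit example: the DNN slightly prefers "default" (index 0),
# but a rule (say, about income) is strongly violated by that label.
q = logic_output([0.6, 0.4], rule_dist=[0.8, 0.0], C=1.0)
loss = discrepancy_loss([0.0, 1.0], q)
```

In training, `loss` would be backpropagated through the DNN that produced `p`, which is how the rules reach the network's parameters.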
Statistical relational learning (SRL) and graph neural networks (GNNs) are two powerful approaches for learning and inference over graphs. Typically, they are evaluated in terms of simple metrics such as accuracy over individual node labels. Complex aggregate graph queries (AGQs) involving multiple nodes, edges, and labels are common in the graph mining community and are used to estimate important network properties such as social cohesion and influence. While graph mining algorithms support AGQs, they typically do not take into account uncertainty, or when they do, make simplifying assumptions and do not build full probabilistic models. In this paper, we examine the performance of SRL and GNNs on AGQs over graphs with partially observed node labels. We show that, not surprisingly, inferring the unobserved node labels as a first step and then evaluating the queries on the fully observed graph can lead to sub-optimal estimates, and that a better approach is to compute these queries as an expectation under the joint distribution. We propose a sampling framework to tractably compute the expected values of AGQs. Motivated by the analysis of subgroup cohesion in social networks, we propose a suite of AGQs that estimate the community structure in graphs. In our empirical evaluation, we show that by estimating these queries as an expectation, SRL-based approaches yield up to a 50-fold reduction in average error when compared to existing GNN-based approaches.
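The core point, computing an AGQ as an expectation over sampled label completions rather than on a single completed graph, can be sketched as follows (a toy illustration, not the paper's sampling framework):

```python
import random

def expected_agq(sample_labels, query, n_samples=5000, seed=0):
    """Estimate an aggregate graph query (AGQ) as an expectation under the
    distribution over unobserved node labels, instead of evaluating it once
    on a single completed graph:
        E[Q] ~= (1/N) * sum_i Q(labels_i),  labels_i ~ label distribution
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += query(sample_labels(rng))
    return total / n_samples

# Toy example: two nodes whose unobserved labels are "A" with (independent,
# hypothetical) marginal probabilities 0.5 and 0.8; the query is the
# fraction of nodes labeled "A", with true expectation (0.5 + 0.8) / 2.
def sample_labels(rng):
    return ["A" if rng.random() < p else "B" for p in (0.5, 0.8)]

frac_a = expected_agq(sample_labels, lambda ls: ls.count("A") / len(ls))
```

In the paper's setting the samples come from an SRL or GNN joint model rather than independent marginals, but the Monte Carlo structure is the same.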
Although Knowledge Graphs (KGs) have become a popular and powerful tool in industry, most researchers have focused mainly on adding more and more triples to the A-Boxes of the KGs. An often overlooked but important part of a KG is its T-Box. If the T-Box contains incorrect statements, or if certain correct statements are absent from it, it can lead to inconsistent knowledge in the KG or to information loss, respectively. In this paper, we propose a novel system, DOPLEX, based on Probabilistic Soft Logic (PSL), to detect disjointness between pairs of object properties present in the KG. Current approaches mainly rely on checking the absence of common triples and miss out on exploiting the semantics of property names. In the proposed system, in addition to checking common triples, PSL is used to determine whether property names imply disjointness. We particularly focus on knowledge graphs that are auto-extracted from large text corpora. Our evaluation demonstrates that the proposed approach discovers disjoint property pairs with better precision than the state-of-the-art system, without compromising much on the number of disjoint pairs discovered. Toward the end of the paper, we discuss the disjointness of properties in the context of time, propose a new notion called temporal-non-disjointness, and discuss its importance and characteristics. We also present an approach for discovering property pairs that are potentially temporally non-disjoint.
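The "common triples" signal that current approaches rely on, and that DOPLEX augments with PSL over property-name semantics, can be sketched as a simple overlap check; the function name and the Jaccard form are illustrative assumptions:

```python
def common_pair_overlap(triples, p1, p2):
    """Jaccard overlap of the (subject, object) pairs of two properties; a
    low overlap is weak evidence that the properties are disjoint. This is
    only the common-triples signal; the PSL reasoning over property names
    is omitted here."""
    pairs1 = {(s, o) for s, p, o in triples if p == p1}
    pairs2 = {(s, o) for s, p, o in triples if p == p2}
    if not pairs1 or not pairs2:
        return 0.0
    return len(pairs1 & pairs2) / len(pairs1 | pairs2)

# Toy KG fragment (hypothetical property names):
triples = [("a", "bornIn", "x"), ("b", "bornIn", "y"),
           ("a", "diedIn", "x"), ("c", "diedIn", "z")]
overlap = common_pair_overlap(triples, "bornIn", "diedIn")
```

The weakness the paper points out is visible even here: auto-extracted KGs are sparse, so zero overlap alone says little, which is why name semantics are brought in.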
Entity resolution in settings with rich relational structure often introduces complex dependencies between co-references. Exploiting these dependencies is challenging: it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided, from the perspective of different family members, and the challenge is to reconstruct a family tree from these multiple, noisy, partial views. This reconstruction is crucial for applications such as understanding genetic inheritance, tracking disease contagion, and performing census surveys. Here, we design a model that incorporates statistical signals (such as name similarity), relational information (such as sibling overlap), logical constraints (such as transitivity and bijective matching), and predictions from other algorithms (such as logistic regression and support vector machines), in a collective model. We show how to integrate these features using probabilistic soft logic, a scalable probabilistic programming framework. In experiments on real-world data, our model significantly outperforms state-of-the-art classifiers that use relational features but are incapable of collective reasoning.
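One of the logical constraints mentioned, transitivity, illustrates how PSL turns a hard rule into a differentiable penalty that collective inference can trade off against the statistical signals; a minimal sketch under Łukasiewicz semantics (illustrative, not the paper's full model):

```python
def transitivity_distance(s_ab, s_bc, s_ac):
    """Distance to satisfaction of the PSL rule
        Same(A, B) & Same(B, C) -> Same(A, C)
    under Lukasiewicz semantics: the body's truth is max(s_ab + s_bc - 1, 0),
    and the rule is violated to the extent the body exceeds the head."""
    body = max(s_ab + s_bc - 1.0, 0.0)
    return max(body - s_ac, 0.0)

# Strong evidence a~b and b~c but a weak a~c score yields a large penalty,
# which pushes collective inference to raise s_ac:
penalty = transitivity_distance(0.9, 0.8, 0.2)
```

Bijective matching is handled analogously, as rules penalizing one record matching two distinct family members.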
Human activity recognition is widely used in smart cities, public safety, and other fields, and it plays a pivotal role in smart home systems in particular. This study addresses the shortcomings of Markov logic networks for human activity recognition and proposes a human activity recognition method for smart home scenarios: an activity recognition framework based on Probabilistic Soft Logic (PSL). The framework is able to deal with logical uncertainty and, on that basis, provides expression and inference mechanisms for data uncertainty. It uses Deng entropy from evidence theory to evaluate sensor-event uncertainty, and combines this with event calculus for activity modeling. Comparing the PSL method with three other common recognition methods, Ontology, Hidden Markov Model (HMM), and Markov logic network, on a public dataset, it was found that the PSL method handles data uncertainty far better than the other three algorithms. The average recognition rates on the ADL and ADL-E sub-datasets were 82.87% and 80.33%, respectively. In experiments verifying PSL's ability to handle temporal complexity, PSL showed the smallest decrease in average recognition rate and maintained an average recognition rate of 81.02% in the presence of concurrent and alternating activities. The PSL-based human activity recognition method thus performs better in handling both data uncertainty and temporal complexity.
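The Deng-entropy component used to score sensor-event uncertainty can be sketched directly from its definition (a standalone illustration; how the framework feeds it into PSL is omitted):

```python
from math import log2

def deng_entropy(bba):
    """Deng entropy of a basic belief assignment (Dempster-Shafer mass
    function) over focal elements:
        E_d(m) = -sum_A m(A) * log2( m(A) / (2**|A| - 1) )
    bba maps focal sets (frozensets) to masses summing to 1; larger focal
    sets contribute more uncertainty than under Shannon entropy."""
    return -sum(m * log2(m / (2 ** len(A) - 1))
                for A, m in bba.items() if m > 0)

# Hypothetical sensor reading: 0.6 mass on "walking", 0.4 mass spread over
# the ambiguous set {"walking", "cooking"}.
bba = {frozenset({"walking"}): 0.6,
       frozenset({"walking", "cooking"}): 0.4}
uncertainty = deng_entropy(bba)
```

A fully certain singleton assignment yields entropy 0, while mass on larger ambiguous sets inflates the score, which is what makes it useful for flagging unreliable sensor events.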
In medical texts, temporal information describes events and changes in status, such as medical visits and discharges. According to its semantic features, it can be classified into simple time and complex time. Current research on time recognition usually focuses on coarse-grained simple time while ignoring fine-grained complex time. To address this problem, based on the semantic concept of complex time in the Clinical Time Ontology, we define seven basic features and eleven extraction rules and propose a complex medical time-extraction method that combines probabilistic soft logic and textual feature feedback. The framework consists of two parts: (a) text feature recognition, which uses probabilistic soft logic for negative-feedback adjustment; and (b) complex medical time entity recognition based on text feature feedback, which builds on the model in (a) for positive-feedback adjustment. Finally, the effectiveness of our approach is verified experimentally on both text feature recognition and complex temporal entity recognition. In the text feature recognition task, our method shows the best F1 improvement of 18.09% on the Irregular Instant Collection type corresponding to utterance l17. In the complex medical temporal entity recognition task, the F1 metric improves most significantly, by 10.42%, on the Irregular Instant Collection type.