Knowledge Graph Embedding for Link Prediction Rossi, Andrea; Barbosa, Denilson; Firmani, Donatella ...
ACM transactions on knowledge discovery from data, 04/2021, Volume 15, Issue 2
Journal Article
Peer reviewed
Open access
Knowledge Graphs (KGs) have found many applications in industrial and in academic settings, which in turn, have motivated considerable research efforts towards large-scale information extraction from a variety of sources. Despite such efforts, it is well known that even the largest KGs suffer from incompleteness; Link Prediction (LP) techniques address this issue by identifying missing facts among entities already in the KG. Among the recent LP techniques, those based on KG embeddings have achieved very promising performance in some benchmarks. Despite the fast-growing literature on the subject, insufficient attention has been paid to the effect of the design choices in those methods. Moreover, the standard practice in this area is to report accuracy by aggregating over a large number of test facts in which some entities are vastly more represented than others; this allows LP methods to exhibit good results by just attending to structural properties that include such entities, while ignoring the remaining majority of the KG. This analysis provides a comprehensive comparison of embedding-based LP methods, extending the dimensions of analysis beyond what is commonly available in the literature. We experimentally compare the effectiveness and efficiency of 18 state-of-the-art methods, consider a rule-based baseline, and report detailed analysis over the most popular benchmarks in the literature.
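The aggregation concern raised in this abstract can be illustrated with a small sketch. It is not taken from the paper: the toy ranks, the entity names, and the helper functions `mrr` and `hits_at_k` are hypothetical, and the per-entity (macro) average is just one possible way to expose the effect of over-represented entities.

```python
from collections import defaultdict

def mrr(ranks):
    """Mean reciprocal rank of a list of 1-based ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of ranks that fall within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical toy test set: (entity of the test fact, rank of the correct answer).
# One over-represented entity is predicted well; two rare entities are not.
test_ranks = [("popular", 1)] * 8 + [("rare_a", 50), ("rare_b", 100)]

# Aggregating over all test facts (micro average) is dominated by "popular".
all_ranks = [rank for _, rank in test_ranks]
micro_mrr = mrr(all_ranks)
micro_hits10 = hits_at_k(all_ranks, 10)

# Averaging per entity first (macro average) weights every entity equally.
ranks_by_entity = defaultdict(list)
for entity, rank in test_ranks:
    ranks_by_entity[entity].append(rank)
macro_mrr = sum(mrr(rs) for rs in ranks_by_entity.values()) / len(ranks_by_entity)

print(f"micro MRR={micro_mrr:.3f}  micro Hits@10={micro_hits10:.2f}  macro MRR={macro_mrr:.3f}")
```

In this toy setting the aggregated score looks strong even though two of the three entities are predicted poorly, which is the kind of gap a finer-grained analysis is meant to reveal.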
Graph Neural Networks (GNNs) have exploded onto the machine learning scene in recent years owing to their capability to model and learn from graph-structured data. Such an ability has strong implications in a wide variety of fields whose data are inherently relational, for which conventional neural networks do not perform well. Indeed, as recent reviews can attest, research in the area of GNNs has grown rapidly and has led to the development of a variety of GNN algorithm variants as well as to the exploration of ground-breaking applications in chemistry, neurology, electronics, or communication networks, among others. At the current stage of research, however, the efficient processing of GNNs is still an open challenge for several reasons. Besides their novelty, GNNs are hard to compute due to their dependence on the input graph, their combination of dense and very sparse operations, or the need to scale to huge graphs in some applications. In this context, this article aims to make two main contributions. On the one hand, a review of the field of GNNs is presented from the perspective of computing. This includes a brief tutorial on the GNN fundamentals, an overview of the evolution of the field in the last decade, and a summary of operations carried out in the multiple phases of different GNN algorithm variants. On the other hand, an in-depth analysis of current software and hardware acceleration schemes is provided, from which a hardware-software, graph-aware, and communication-centric vision for GNN accelerators is distilled.
Knowledge Graph Completion seeks to find missing elements in a Knowledge Graph, usually edges representing some relation between two concepts. One possible way to do this is to find paths between two nodes that indicate the presence of a missing edge. This can be achieved through Reinforcement Learning, by training an agent that learns how to navigate through the graph, starting at a node with a missing edge and identifying which of the available edges at each step is most promising for reaching the target of the missing edge. While some approaches have been proposed to this effect, their reward functions only take into account whether the target node was reached or not, and only apply a single Reinforcement Learning algorithm. In this regard, we present a new family of reward functions based on node embeddings and structural distance, leveraging additional information related to semantic similarity and removing the need to reach the target node to obtain a measure of the benefits of an action. Our experimental results show that these functions, as well as the novel use of more modern Reinforcement Learning techniques, are able to obtain better results than the existing strategies in the literature.
•Reinforcement Learning method to complete knowledge graphs.
•Usage of various embedding techniques combined to form a novel reward function.
•Application of Proximal Policy Optimization and Soft Actor-Critic algorithms.
•Results show that the combination of embedding rewards improves agent performance.
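The idea of rewarding an agent before it reaches the target node can be sketched as follows. This is only an illustration under stated assumptions, not the reward family proposed in the paper: the function `shaped_reward`, the weight `alpha`, the use of cosine similarity, the `1 / (1 + distance)` closeness term, and the toy graph and embeddings are all hypothetical.

```python
import math
from collections import deque

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def bfs_distance(adj, src, dst):
    """Number of hops on a shortest path from src to dst, or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour == dst:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def shaped_reward(adj, emb, current, target, alpha=0.5):
    """Assumed reward: 1.0 at the target, otherwise a weighted mix of
    embedding similarity and structural closeness to the target."""
    if current == target:
        return 1.0
    similarity = cosine(emb[current], emb[target])
    dist = bfs_distance(adj, current, target)
    closeness = 0.0 if dist is None else 1.0 / (1 + dist)
    return alpha * similarity + (1 - alpha) * closeness

# Hypothetical toy graph (a -> b -> c) and 2-dimensional node embeddings.
adj = {"a": ["b"], "b": ["c"], "c": []}
emb = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0]}

reward_far = shaped_reward(adj, emb, "a", "c")
reward_near = shaped_reward(adj, emb, "b", "c")
reward_goal = shaped_reward(adj, emb, "c", "c")
print(reward_far, reward_near, reward_goal)
```

With such a signal the agent receives graded feedback at every step, so a move from `a` to `b` is already rewarded more than staying far from the target, instead of only a binary reward at the end of an episode.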
Knowledge Graphs (KGs) have made a qualitative leap and effected a real revolution in knowledge representation. This is leveraged by the underlying structure of the KG which underpins a better comprehension, reasoning and interpretation of knowledge for both human and machine. Therefore, KGs continue to be used as the main means of tackling a plethora of real-life problems in various domains. However, there is no consensus in regard to a plausible and inclusive definition of a domain-specific KG. Further, in conjunction with several limitations and deficiencies, various domain-specific KG construction approaches are far from perfect. This survey is the first to offer a comprehensive definition of a domain-specific KG. Also, the paper presents a thorough review of the state-of-the-art approaches drawn from academic works relevant to seven domains of knowledge. An examination of current approaches reveals a range of limitations and deficiencies. At the same time, uncharted territories on the research map are highlighted to tackle extant issues in the literature and point to directions for future research.
•This is the first paper to provide an inclusive definition of a domain-specific KG.
•We conduct a thorough analysis of more than 140 papers on KG construction approaches, covering seven domains.
•The paper highlights research gaps in the area of domain-specific KG construction and suggests avenues for future research.
The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 models in the PyKEEN software package. In this paper, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all, as well as provide insight as to why this might be the case. We then performed a large-scale benchmarking on four datasets with several thousands of experiments and 24,804 GPU hours of computation time. We present insights gained as to best practices, best configurations for each model, and where improvements could be made over previously published best configurations. Our results highlight that the combination of model architecture, training approach, loss function, and the explicit modeling of inverse relations is crucial for a model's performance and is not only determined by its architecture. We provide evidence that several architectures can obtain results competitive with the state of the art when configured carefully. We have made all code, experimental configurations, results, and analyses available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking .
•entity2rec is a recommender system based on knowledge graph embeddings.
•entity2rec generates accurate and non-obvious recommendations.
•entity2rec is very effective when the dataset is sparse and has a low popularity bias.
•Learning to rank is no longer needed to generate good recommendations with entity2rec.
•entity2rec can be easily interpreted and configured in an interactive interface.
Knowledge graphs have been shown to be highly beneficial to recommender systems, providing an ideal data structure to generate hybrid recommendations using both content-based and collaborative filtering. Most knowledge-aware recommender systems are based on manually engineered features, typically relying on path counting and/or on random walks. Recently, knowledge graph embeddings have proven to be extremely effective at learning features for prediction tasks, reducing the complexity and time required to manually design effective features. In this work, we present entity2rec, which learns user-item relatedness for item recommendation through property-specific knowledge graph embeddings. A key element of entity2rec is the construction of property-specific subgraphs. Through an extensive evaluation on three datasets, we show that: (1) hybrid property-specific subgraphs consistently enhance the quality of recommendations with respect to collaborative and content-based subgraphs; (2) entity2rec generates accurate and non-obvious recommendations, compared to a set of state-of-the-art recommender systems, achieving high accuracy, serendipity and novelty; in more detail, entity2rec is particularly effective when the dataset is sparse and has a low popularity bias; (3) entity2rec is easily interpretable and can thus be configured for a particular recommendation problem.
A well-known theorem of Sabidussi shows that a simple G-arc-transitive graph can be represented as a coset graph for the group G. This pivotal result is the standard way to turn problems about simple arc-transitive graphs into questions about groups. In this paper, the Sabidussi representation is extended to arc-transitive, not necessarily simple graphs which satisfy a local-finiteness condition: namely, graphs with finite valency and finite edge-multiplicity. The construction yields a G-arc-transitive coset graph Cos(G,H,J), where H,J are stabilisers in G of a vertex and incident edge, respectively. A first major application is presented concerning arc-transitive maps on surfaces: given a group G=〈a,z〉 with |z|=2 and |a| finite, the coset graph Cos(G,〈a〉,〈z〉) is shown, under suitable finiteness assumptions, to have exactly two different arc-transitive embeddings as a G-arc-transitive map (V,E,F) (with V,E,F the sets of vertices, edges and faces, respectively), namely, a G-rotary map if |az| is finite, and a G-bi-rotary map if |zz^a| is finite. The G-rotary map can be represented as a coset geometry for G, extending the notion of a coset graph. However, the G-bi-rotary map does not have such a representation, and the face boundary cycles must be specified in addition to incidences between faces and edges. In addition, a coset geometry construction is given of a flag-regular map (V,E,F) for not necessarily simple graphs. For all of these constructions it is proved that the face boundary cycles are simple cycles precisely when the given group acts faithfully on V∪F. Illustrative examples are given for graphs related to the n-dimensional hypercubes and the Petersen graph.
Graph representations that preserve relevant topological information allow the use of a rich machine learning toolset for data-driven network analytics. Some notable graph representations in the literature are fruitful in their respective applications but they either lack interpretability or are unable to effectively encode a graph’s structure at both local and global scale. In this work, we propose the Higher-Order Structure Descriptor (HOSD): an interpretable graph descriptor that captures information about the patterns in a graph at multiple scales. Scaling is achieved using a novel graph compression technique that reveals successive higher-order structures. The proposed descriptor is invariant to node permutations due to its graph-theoretic nature. We analyze the HOSD algorithm for time complexity and also prove the NP-completeness of three interesting graph compression problems. A faster version, HOSD-Lite, is also presented to approximate HOSD on dense graphs. We showcase the interpretability of our model by discussing structural patterns found within real-world datasets using HOSD. HOSD and HOSD-Lite are evaluated on benchmark datasets for applicability to classification problems; results demonstrate that a simple random forest setup based on our representations competes well with the current state-of-the-art graph embeddings.