Online Analytical Processing (OLAP) is one of the most important technologies in data warehouse systems: it enables the multidimensional analysis of data. It is a powerful yet flexible analysis tool that lets users navigate the data at varying levels of depth. OLAP has been repeatedly improved and extended as new domains and kinds of data raised new problems, for instance multimedia, spatial data, and sequence data. OLAP was originally introduced to analyze classical structured data. The emergence of information networks, however, opens another interesting domain to explore. Extracting knowledge from large networks is a complex, non-trivial task, and such networks are too large to be examined as a whole. OLAP analysis can therefore be a good way to look at the data through more compact views. Different kinds of information networks can help users with various activities in different domains. Here we focus on bibliographic information networks built from bibliographic databases. These data make it possible to analyze not only scientific production but also collaborations between authors. Several works propose applying OLAP technologies to information networks, an approach called Graph OLAP. Many Graph OLAP techniques are based on what can be called a cube of graphs. In this thesis, we propose a new approach to Graph OLAP that we call Graphs enriched by Cubes (GreC). In a different and complementary way, our proposal enriches graphs with cubes rather than building cubes of graphs: the nodes and/or edges of the considered network are described by data cubes. This allows interesting analyses for the user, who can navigate within a graph enriched by cubes at different granularity levels using dedicated operators. GreC has four main aspects. First, GreC takes the structure of the network into account in order to support topological OLAP operations, and not only classical or informational OLAP operations. Second, GreC offers a global view of the graph together with multidimensional information. Third, the slowly changing dimension problem is handled during network exploration. Fourth, GreC supports the analysis of an evolving network, because its dynamics can be observed through the time dimension that may be present in the cubes describing the nodes and/or edges. To evaluate GreC, we implemented our approach and performed an experimental study on real bibliographic data to show the interest of our proposal. The GreC approach includes several algorithms, and we validated their relevance and performance experimentally.
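The idea of describing nodes and edges by cubes, rather than building a cube of graphs, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the publication records are invented, the cubes have only two dimensions (year and venue) with a publication count as the measure, and `build_enriched_graph` and `roll_up` are hypothetical names, not the operators defined in the thesis.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical publication records: (authors, year, venue).
PUBLICATIONS = [
    (["alice", "bob"], 2010, "VLDB"),
    (["alice", "bob", "carol"], 2011, "VLDB"),
    (["alice"], 2011, "EDBT"),
    (["bob", "carol"], 2011, "EDBT"),
]


def build_enriched_graph(publications):
    """Build a co-author graph whose nodes and edges each carry a small cube.

    A cube is a mapping (year, venue) -> number of publications.
    """
    node_cubes = defaultdict(lambda: defaultdict(int))
    edge_cubes = defaultdict(lambda: defaultdict(int))
    for authors, year, venue in publications:
        for author in authors:
            node_cubes[author][(year, venue)] += 1
        # One undirected edge per pair of co-authors, keyed in sorted order.
        for a, b in combinations(sorted(authors), 2):
            edge_cubes[(a, b)][(year, venue)] += 1
    return node_cubes, edge_cubes


def roll_up(cube, keep):
    """Aggregate a cube onto one dimension (keep=0 for year, keep=1 for venue)."""
    result = defaultdict(int)
    for cell, count in cube.items():
        result[cell[keep]] += count
    return dict(result)


node_cubes, edge_cubes = build_enriched_graph(PUBLICATIONS)
alice_by_year = roll_up(node_cubes["alice"], keep=0)
alice_bob_by_year = roll_up(edge_cubes[("alice", "bob")], keep=0)
print(alice_by_year)
print(alice_bob_by_year)
```

Rolling up a node cube summarizes one author's production, while rolling up an edge cube summarizes one collaboration; keeping the year dimension is what would let a user follow the evolution of the network over time.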
Metric trees (m-trees) are used to organize large databases and execute fast queries on them. In classical m-tree schemes, the routing information kept in a node includes a representative or a prototype that describes the sub-cluster. Several studies have applied m-trees to databases of attributed graphs; in these works, the routing elements are graphs selected from the sub-clusters. In this paper, we propose using Graph Metric Trees to improve k-nn queries. We present two types of Graph Metric Trees: the first uses a representative (the Set Median Graph) as routing information, and the second uses a graph prototype. Experimental validation shows that m-trees can improve k-nn queries when the noise between graphs of the same class is at a reasonable level.
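The role of a set median graph as routing information can be sketched as follows. This is only an illustration under stated assumptions: graphs are reduced to sets of edges, the distance is the size of the symmetric difference of the edge sets (a simple metric standing in for a graph edit distance), the tree has a single level, and the names `set_median`, `build_node`, and `knn` are hypothetical rather than taken from the paper.

```python
import heapq


def distance(g1, g2):
    """Stand-in metric: number of edges present in exactly one of the graphs."""
    return len(g1 ^ g2)


def set_median(graphs):
    """Return the member of the set with the smallest sum of distances to all members."""
    return min(graphs, key=lambda g: sum(distance(g, h) for h in graphs))


def build_node(graphs):
    """Summarize a sub-cluster by its set median and a covering radius."""
    routing = set_median(graphs)
    radius = max(distance(routing, g) for g in graphs)
    return {"routing": routing, "radius": radius, "graphs": graphs}


def knn(query, nodes, k):
    """k-nn search that skips sub-clusters ruled out by the triangle inequality."""
    best = []      # max-heap of (-distance, tie_breaker, graph)
    visited = 0    # number of sub-clusters actually scanned
    tie = 0
    for node in sorted(nodes, key=lambda n: distance(query, n["routing"])):
        d_route = distance(query, node["routing"])
        # Every graph in the node is at least d_route - radius away from the query.
        if len(best) == k and d_route - node["radius"] > -best[0][0]:
            continue
        visited += 1
        for g in node["graphs"]:
            d = distance(query, g)
            tie += 1
            if len(best) < k:
                heapq.heappush(best, (-d, tie, g))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, tie, g))
    results = sorted(((-nd, g) for nd, _, g in best), key=lambda t: t[0])
    return results, visited


a1 = frozenset({(1, 2), (2, 3)})
a2 = frozenset({(1, 2), (2, 3), (1, 3)})
a3 = frozenset({(1, 2), (1, 3)})
b1 = frozenset({(4, 5), (5, 6), (6, 7)})
b2 = frozenset({(4, 5), (5, 6), (6, 7), (7, 8)})
b3 = frozenset({(4, 5), (5, 6), (6, 7), (4, 7)})

nodes = [build_node([a1, a2, a3]), build_node([b1, b2, b3])]
query = frozenset({(1, 2), (2, 3), (3, 4)})
results, visited = knn(query, nodes, k=2)
print([d for d, _ in results], "sub-clusters scanned:", visited)
```

In this toy data the second sub-cluster is never scanned, because its routing graph is too far from the query for any of its members to beat the current neighbours. The pruning only helps when the covering radii are small, which is consistent with the abstract's remark about noise within a class.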
NoSQL databases, especially key-value stores, are employed in most search engines. As one of many kinds of NoSQL database, a graph database is tailored to store objects and bidirectional relationships with attributes as the nodes and edges of a graph, so that it can better represent real-world scenarios. The object-oriented paradigm, for its part, is a convenient approach to programming because it models reality as classes. However, both graph databases and the object-oriented paradigm are seldom addressed or used in search algorithms. This presents an opportunity to explore an alternative search engine design that uses a graph database and an object-oriented search algorithm imitating the search capability of the human brain. In this paper, we present a graph-based object-oriented search algorithm named GOOSE, designed for internet search.
Intrusion Detection using a Graphical Fingerprint Model
Nie, Chenyang; Quinan, Paulo Gustavo; Traore, Issa ...
2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
Conference Proceeding
The Activity and Event Network (AEN) graph is a new framework for modeling and detecting intrusions: it captures ongoing security-relevant activity and events occurring at a given organization in a large time-varying graph model. The graph is generated by processing various network security logs, such as network packets, system logs, and intrusion detection alerts. In this paper, we show how known attack methods can be captured generically using attack fingerprints based on the AEN graph. The fingerprints are constructed by identifying attack idiosyncrasies in the form of subgraphs that represent indicators of compromise (IOCs), which are then encoded as Property Graph Query Language (PGQL) queries. Among the many attack types, three main categories are implemented as a proof of concept in this paper: scanning, denial of service (DoS), and authentication breaches; each category covers its common variations. The experimental evaluation of the fingerprints was carried out using a combination of intrusion detection datasets and yielded very encouraging results.
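A scanning fingerprint of the kind described above can be approximated in a few lines. The sketch below is written in plain Python rather than PGQL, and everything in it is assumed for illustration: the event tuple layout, the thresholds `min_ports` and `window`, the function name `find_port_scans`, and the sample addresses are not taken from the paper or from the AEN schema.

```python
from collections import defaultdict


def find_port_scans(events, min_ports=5, window=10.0):
    """Flag (source, destination) pairs that match a simple scanning pattern.

    events: iterable of (src, dst, dst_port, timestamp) connection records.
    A pair matches when at least `min_ports` distinct destination ports are
    contacted within any time span of `window` seconds.
    """
    by_pair = defaultdict(list)
    for src, dst, port, ts in events:
        by_pair[(src, dst)].append((ts, port))

    alerts = []
    for pair, conns in by_pair.items():
        conns.sort()
        start = 0
        for end in range(len(conns)):
            # Shrink the window until it spans at most `window` seconds.
            while conns[end][0] - conns[start][0] > window:
                start += 1
            ports = {port for _, port in conns[start:end + 1]}
            if len(ports) >= min_ports:
                alerts.append(pair)
                break
    return sorted(alerts)


events = []
# Hypothetical scanner: six different ports in six seconds.
events += [("10.0.0.9", "10.0.0.1", 20 + i, float(i)) for i in range(6)]
# Hypothetical normal client: the same port contacted repeatedly.
events += [("10.0.0.5", "10.0.0.1", 443, float(i)) for i in range(6)]
# Hypothetical slow prober: six ports, but one every 100 seconds.
events += [("10.0.0.7", "10.0.0.1", 80 + i, 100.0 * i) for i in range(6)]

alerts = find_port_scans(events)
print(alerts)
```

Only the first host matches. The slow prober escapes this particular pattern, which illustrates why a fingerprint category would need several variations.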
The textual data on insulation status contain the original records of on-site partial discharge detection. To improve data utilization and mine the information that affects the insulation status of the equipment, these data need to be processed and stored uniformly. Knowledge graph technology is used to process the insulation status text data: the text format is identified, and the data are then stored in a graph database. A comparison with relational databases shows that graph databases offer more efficient query performance for complex relationships, which makes them more suitable for storing insulation status text data with complex relationships.
Most NoSQL databases are schemaless. Although this offers some flexibility, such databases have no knowledge of the database schema and so lose the benefits that a schema provides. It is generally accepted that data modelling can have an impact on performance, consistency, usability, and maintainability. We argue that NoSQL databases need data models that ensure the proper storage and relevant querying of the data. This paper presents and illustrates an MDA-based approach for reverse engineering NoSQL property graph databases into an Extended Entity-Relationship schema. The approach is applied to the Neo4j graph database. We present an illustrative scenario and evaluate the reverse engineering approach.
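To make the reverse engineering idea concrete, here is a minimal sketch that infers entity and relationship types from the contents of a small in-memory property graph. It is not the paper's MDA transformation and does not query Neo4j: the data layout, the function name `infer_schema`, and the sample nodes are all assumptions chosen for illustration.

```python
def infer_schema(nodes, relationships):
    """Infer a simple entity-relationship description from property graph data.

    nodes: list of {"id", "label", "properties"} dictionaries.
    relationships: list of (start_id, type, end_id) tuples.
    """
    counts = {}
    attributes = {}
    for node in nodes:
        label = node["label"]
        counts[label] = counts.get(label, 0) + 1
        attrs = attributes.setdefault(label, {})
        for key, value in node["properties"].items():
            info = attrs.setdefault(key, {"types": set(), "seen": 0})
            info["types"].add(type(value).__name__)
            info["seen"] += 1

    schema = {"entities": {}, "relationships": {}}
    for label, attrs in attributes.items():
        schema["entities"][label] = {
            key: {
                "types": sorted(info["types"]),
                # An attribute is mandatory if every node with this label has it.
                "mandatory": info["seen"] == counts[label],
            }
            for key, info in attrs.items()
        }

    label_of = {node["id"]: node["label"] for node in nodes}
    for start, rel_type, end in relationships:
        ends = schema["relationships"].setdefault(rel_type, set())
        ends.add((label_of[start], label_of[end]))
    return schema


nodes = [
    {"id": 1, "label": "Person", "properties": {"name": "Ann", "age": 30}},
    {"id": 2, "label": "Person", "properties": {"name": "Bo"}},
    {"id": 3, "label": "Movie", "properties": {"title": "Heat"}},
]
relationships = [(1, "ACTED_IN", 3), (2, "ACTED_IN", 3), (1, "KNOWS", 2)]

schema = infer_schema(nodes, relationships)
print(schema["entities"]["Person"])
print(schema["relationships"])
```

Each node label becomes a candidate entity type, each property key a candidate attribute (optional when some nodes lack it), and each relationship type a candidate association between the labels it connects.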
With the massive growth of data, the correlations between data are becoming more and more complex. Against this background, graph databases are developing rapidly. By analyzing the concept, model, and structure of graph databases, this paper summarizes their characteristics. The three key technologies of graph databases are explained and analyzed in detail. Current graph database products are compared, domestic and foreign research statistics are analyzed, and current application scenarios are summarized. Finally, future development trends for graph databases are proposed.
With the current massive increase in data related to social activity, such as social networks and real-time recommendations, large amounts of related data are generated every second. Traditional relational databases cannot handle this situation adequately. A graph database is suitable for representing relationships in social activity because social networks are easily represented by graphs. Graph databases allow users to store data as nodes and relationships so that the data may be queried as a graph. However, no mechanisms exist that can effectively carry out queries regarding relationships, such as queries about the properties of a relationship or about relationships of relationships. To solve these problems, we introduce the concept of relationship-based query and develop a graph database query language called RelSeeker, which is based on Datalog. In RelSeeker, a user can create query rules using the properties of relationships, nodes, and relationships of relationships. Moreover, Datalog is suitable for describing relationships of relationships in the graph database. These key features of RelSeeker enable effective relationship-based query processing on a graph database. The proposed query language RelSeeker is an extension of Datalog that can deal with data structures and manipulate a graph database through create, get, and update operations.
In this study, a graph-computing-based grid splitting detection algorithm is proposed for contingency analysis in a graph-based EMS (Energy Management System). The graph model of a power system is established by storing its bus-branch information in the corresponding vertex and edge objects of the graph database. A numerical comparison with an up-to-date serial computing algorithm is also carried out. Online tests on a real power system of China State Grid with 2752 buses and 3290 branches show that a sixfold speedup can be achieved, which lays a good foundation for advanced contingency analysis.
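Grid splitting detection amounts to checking whether the bus-branch graph stays connected once the outaged branches are removed. The sketch below does this serially with a union-find structure; it is not the graph-parallel algorithm of the study, and the function name `islands_after_outage` and the five-bus system are hypothetical.

```python
def find(parent, x):
    """Return the representative of x, compressing the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def islands_after_outage(buses, branches, outaged):
    """Return the electrical islands left after removing the outaged branches.

    buses: list of bus identifiers.
    branches: list of (from_bus, to_bus) pairs.
    outaged: set of indices into `branches` that are out of service.
    """
    parent = {bus: bus for bus in buses}
    for index, (u, v) in enumerate(branches):
        if index in outaged:
            continue
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = {}
    for bus in buses:
        groups.setdefault(find(parent, bus), []).append(bus)
    return sorted(groups.values())


buses = [1, 2, 3, 4, 5]
branches = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]

loop_outage = islands_after_outage(buses, branches, outaged={0})
bridge_outage = islands_after_outage(buses, branches, outaged={3})
print(loop_outage)
print(bridge_outage)
```

Losing branch 0 leaves a single island because buses 1, 2, and 3 form a loop, whereas losing branch 3 splits the grid into two islands. A contingency is flagged as splitting whenever more than one island is returned.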
The future regulation system places higher requirements on the storage and management of large-scale power data. Given the natural topological structure, dynamic changes, and intrinsically complex data characteristics of the power grid model, the knowledge map of the power grid model is stored in the graph database Neo4j, which promotes the evolution of automatic dispatching analysis and decision-making algorithms toward knowledge guidance. The key to building the knowledge map is to extract the knowledge units of the power grid model accurately. By studying the data storage mode of Neo4j, this paper proposes a general method for building the knowledge map of the power grid model with Python and storing it in a graph database. The feasibility of the method is verified on practical cases, and the internal relationships of the model data in the knowledge map are explored using the Cypher query language.