The growth and evolution of threats, vulnerabilities and cyber-attacks increase security incidents and generate negative impacts on organizations. We present an online analytical processing (OLAP) system for early alerts of upcoming malicious activities. This study aims to systematize the cybersecurity support provided by a Computer Security Incident Response Team (CSIRT) and to establish a mechanism for analyzing and improving the overall security level of networks and equipment by providing early warning services. To accomplish this task, a Business Intelligence solution has been developed, adapting the methodology of Ralph Kimball to support the analysis of computer security incidents. This generates a data warehouse of information collected from alerts and events recorded from a continuous stream of data from various Internet security sources that gather, trace and report malware, botnets, and electronic fraud. Furthermore, with Pentaho BI we loaded data into the dimensions, measures and facts, and built OLAP cubes, reports and dashboards. The results demonstrate the functionality of the application, which makes it possible to visualize both the early warnings and the security level of the participating institutions with respect to the registered threats and vulnerabilities.
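As a rough illustration of the kind of roll-up an OLAP cube over security incidents supports, the following minimal sketch aggregates a fact table of alert events along single dimensions. The field names (source, category, institution) and the sample events are invented for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical incident events as they might arrive from threat feeds;
# every field name and value here is illustrative only.
events = [
    {"source": "feed-a", "category": "malware", "institution": "inst-1"},
    {"source": "feed-a", "category": "botnet",  "institution": "inst-1"},
    {"source": "feed-b", "category": "malware", "institution": "inst-2"},
    {"source": "feed-b", "category": "fraud",   "institution": "inst-2"},
    {"source": "feed-a", "category": "malware", "institution": "inst-2"},
]

def roll_up(facts, dimension):
    """Aggregate the event count along one dimension of the cube."""
    return Counter(f[dimension] for f in facts)

by_category = roll_up(events, "category")
by_institution = roll_up(events, "institution")
```

A dashboard view per institution or per threat category corresponds to such a roll-up over the fact table, with real systems delegating the aggregation to the OLAP engine.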
SACHER Project Bertacchi, Silvia; Al Jawarneh, Isam Mashhour; Apollonio, Fabrizio Ivan ...
Proceedings of the 4th EAI International Conference on Smart Objects and Technologies for Social Good,
11/2018
Conference Proceeding
Open access
The SACHER project provides a distributed, open source and federated cloud platform able to support the life-cycle management of various kinds of data concerning tangible Cultural Heritage. The paper describes the SACHER platform and, in particular, among the various integrated service prototypes, the most important ones to support restoration processes and cultural asset management: (i) 3D Life Cycle Management for Cultural Heritage (SACHER 3D CH), based on 3D digital models of architecture and dedicated to the management of Cultural Heritage and to the storage of the numerous data generated by the team of professionals involved in the restoration process; (ii) Multidimensional Search Engine for Cultural Heritage (SACHER MuSE CH), an advanced multi-level search system designed to manage Heritage data from heterogeneous sources.
An ongoing process of modernization, rationalization and implementation of public administration reform, intended to increase efficiency and economy, would be infeasible without improvement and continuous investment in new information technology. The main task of public administration is to serve citizens and the legal entities within its domain. To improve their functions, public authorities must have complete information in order to increase the quality and speed of information exchange with all stakeholders in the process. This paper describes a model of data storage. It describes the process of building and organizing a data warehouse and the ETL process of extracting, cleansing and transforming data into a multidimensional data warehouse model based on a metadata model. Furthermore, it describes the concept and basic methods used in the development of business intelligence systems for the construction and use of a data warehouse. In addition, it outlines options for analyzing the data in the warehouse, illustrated with the dimensional storage concept, and gives guidelines for future development.
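The extract-cleanse-transform-load sequence described above can be sketched as a minimal pipeline. This is a generic illustration, not the paper's model: the row layout, type checks and the dict-based "warehouse" are all invented for the example.

```python
# Minimal ETL sketch: rows arrive as raw string dicts from a hypothetical
# transactional source; cleansing drops rows that fail type conversion.
raw_rows = [
    {"id": "1", "name": "  Alice ", "amount": "10.5"},
    {"id": "2", "name": "Bob",      "amount": "bad"},   # fails cleansing
    {"id": "3", "name": "Carol",    "amount": "7.25"},
]

def extract(rows):
    """Extraction phase: in reality a query against the source system."""
    yield from rows

def cleanse(rows):
    """Cleansing/transformation: trim text, enforce types, drop bad rows."""
    for r in rows:
        try:
            yield {"id": int(r["id"]),
                   "name": r["name"].strip(),
                   "amount": float(r["amount"])}
        except ValueError:
            continue  # reject rows with unparseable values

def load(rows, warehouse):
    """Load phase: here just an in-memory dict keyed by surrogate id."""
    for r in rows:
        warehouse[r["id"]] = r

warehouse = {}
load(cleanse(extract(raw_rows)), warehouse)
```

In a production warehouse each stage would be driven by the metadata model the abstract mentions, rather than hard-coded rules.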
An approach to conceptual modelling of ETL processes Dupor, Sasa; Jovanovic, Vladan
2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
Conference Proceeding
In this paper, we propose a new conceptual model based on the visualization of data flow, showing transformations of records accompanied by attribute transformations. Data warehouse development relies on the development of Extract, Transform, and Load (ETL) processes responsible for extracting data from transactional systems, performing data quality checks, transformation and conversion, and loading data into target tables or cubes in a data warehouse. During the planning and design phases of a data warehouse, the ETL conceptual model should be developed not only to give an overview of the whole process, but also to map sources, targets and necessary transformations, verify transformation logic, and ensure that requirements are met and that the needed structures and content exist. Once created, the conceptual model can be used for the development of an ETL process, for maintenance and optimization, and for redesigning a process due to new business requirements and database schema changes, all of which makes it a valuable addendum to data warehouse documentation.
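A conceptual model that maps sources to targets with per-attribute transformations can be approximated, very loosely, as an ordered flow of named steps, each carrying an attribute mapping. The step names and mappings below are a made-up toy, not the notation the paper proposes; the point is only that such a structure supports tracing an attribute's lineage through the flow.

```python
# Toy ETL conceptual model: an ordered list of (step name, attribute mapping)
# pairs, where each mapping renames source attributes to target attributes.
flow = [
    ("extract_orders", {"order_id": "order_id", "amt": "amount"}),
    ("quality_check",  {"order_id": "order_id", "amount": "amount"}),
    ("load_fact",      {"order_id": "order_key", "amount": "amount_usd"}),
]

def trace_attribute(flow, source_attr):
    """Follow one source attribute through every step of the flow,
    returning the chain of names it takes on the way to the target."""
    lineage = [source_attr]
    current = source_attr
    for _step, mapping in flow:
        if current in mapping:
            current = mapping[current]
            lineage.append(current)
    return lineage
```

Here `trace_attribute(flow, "amt")` ends at `"amount_usd"`, the attribute's name in the target fact table, which is exactly the source-to-target mapping question a conceptual model is meant to answer.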
Tiling Strategies for Distributed Point Cloud Databases Szalai-Gindl, János M.; Dobos, László; Csabai, István
Proceedings of the 29th International Conference on Scientific and Statistical Database Management,
06/2017
Conference Proceeding
Many large point clouds -- such as cosmological N-body simulations, intersections of road networks, etc. -- are strongly clustered on a hierarchy of scales. In shared-nothing distributed environments, optimized tiling of data is crucial to minimize cross-server communication and balance IO and processing load. We propose histogram-based tiling algorithms, a hierarchical tiling and a spectral clustering algorithm, that can be incorporated into the data extraction or transformation phase of a typical Extraction--Transformation--Loading (ETL) procedure. We define measures to characterize the performance of these tiling techniques with respect to typical spatial search operations, and evaluate the algorithms based on these measures using hierarchically clustered data sets.
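To give a flavour of histogram-style tiling, the following sketch balances point counts across contiguous one-dimensional tiles, a drastic simplification of the multi-dimensional, hierarchically clustered setting the paper addresses; the algorithm and data are illustrative assumptions, not the paper's method.

```python
def histogram_tiles(values, n_tiles):
    """Split values into n_tiles contiguous ranges of roughly equal
    population, so each server receives a comparable point count."""
    ordered = sorted(values)
    size, rem = divmod(len(ordered), n_tiles)
    tiles, start = [], 0
    for i in range(n_tiles):
        # spread any remainder over the first `rem` tiles
        end = start + size + (1 if i < rem else 0)
        tiles.append(ordered[start:end])
        start = end
    return tiles

# Strongly clustered 1-D sample: three clumps of three points each.
points = [0.1, 0.2, 0.25, 5.0, 5.1, 5.2, 9.7, 9.8, 9.9]
tiles = histogram_tiles(points, 3)
```

Equal-population tiles keep load balanced even when the spatial distribution is clumpy, whereas equal-width tiles would leave some servers nearly empty.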
Nowadays, any company needs to collaborate with others to improve its performance and productivity. Ontologies can provide a way to promote collaboration between companies: they contribute to reducing the syntactic and semantic conflicts that may occur during the collaboration process. Data warehouse technology is a serious candidate for the data-sharing architecture that may be employed within collaborating companies. The spectacular adoption of domain ontologies by several communities has fueled an explosion of semantic database sources (SDB), which become candidates for building semantic data warehouses (SDW). This situation motivates us to formalize the structure of semantic sources in depth in order to propose an automatic construction of a semantic data warehouse (SDW). In this paper, we first propose a generic framework for handling semantic sources. Secondly, generic ETL steps are incorporated into our framework. Our proposal is validated through a case study on Oracle SDBs, where each source references a global ontology of the Lehigh University Benchmark.
The use of Business Intelligence systems has increased over the last fifteen years. Small and medium enterprises have recognized the importance of this type of system in supporting the decision-making process. However, there is no consensus on standards that support the activities associated with the development of such systems, namely ETL design, data warehouse design, OLAP cube design, and others. Hence, this paper presents a literature review of the proposed techniques for modeling the ETL process, in order to identify trends, to provide practitioners with an overview of the field that enables them to choose among alternatives, and finally, to show researchers in the field of business intelligence in particular, and software engineering researchers in general, the need to work on the construction of standards that help reduce the ambiguity found in the design activities.
The process of migrating from one corporate product to another has its difficulties. These issues are especially noticeable if the outdated system has worked for a long period of time and stores a great amount of data that needs to be transferred to the new system. Sometimes the data formats are fully incompatible. In most cases data migration is possible; however, there is no way to move data from the old system to the new one completely. This paper describes the intermediate period, when the new system is being implemented and the old system is being retired. Two approaches to organizing the system replacement process are suggested, on the condition that the old data will be entirely preserved in the new system: parallel operation of the outdated and new systems, and preliminary data conversion. A classification of data conversion processes is provided in this work. The organization of the conversion process is analyzed from two perspectives: labour costs and possibilities for automation. The following types of conversion are described: automatic and semi-automatic. Based on the given classification, practical experience of data conversion in the information system renewal process is presented. The basic software tools for automated data conversion are described. One of the tools developed for this task is a special data conversion framework written in C# for the Microsoft .NET platform. This framework implements ETL principles (extract, transform and load) and allows developers to quickly build applications for data conversion. The classified data is the address set describing the objects of the political divisions in the Russian Federation's subjects. The classification process consists in determining the address code from the standard classifier based on the textual address description. This approach allows developers to reduce the time required for conversion of large data sets. After the description of the automatic data conversion products developed for address classification, the advantages of their use in the system renewal process are pointed out at the end of the paper.
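The semi-automatic address classification step described above can be caricatured as a lookup of a free-text address against a standard classifier. The classifier entries and the matching rule below are invented for illustration (the paper's framework is in C# and works against the real Russian address classifier); unmatched addresses fall back to manual handling, which is what makes the process semi-automatic.

```python
# Toy "standard classifier": region name fragment -> region code.
# Both the entries and the codes are illustrative assumptions.
classifier = {
    "moscow": "77",
    "saint petersburg": "78",
    "novosibirsk": "54",
}

def classify(address_text):
    """Return the classifier code whose key occurs in the textual
    address, or None to hand the row over to manual processing."""
    text = address_text.lower()
    for name, code in classifier.items():
        if name in text:
            return code
    return None

moscow_code = classify("12 Tverskaya St, Moscow")
unknown = classify("1 Main St, Springfield")
```

Even this crude rule shows the time saving the abstract claims: only the `None` cases need a human, while the bulk of well-formed addresses convert automatically.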
Measures for ETL processes models in data warehouses Muñoz, Lilia; Mazón, Jose-Norberto; Trujillo, Juan
Proceedings of the first international workshop on Model driven service engineering and data quality and security,
11/2009
Conference Proceeding
In data warehousing, ETL (Extract, Transform, and Load) processes take charge of extracting the data from the sources that will be contained in the data warehouse. Due to their relevance, the quality of these processes should be formally assessed from the early stages of development, in order to avoid making bad decisions as a result of incorrect data. In this paper, a set of measures to evaluate the structural complexity of ETL process models at the conceptual level is presented. Moreover, this study is accompanied by four experiments whose aim is the empirical validation of the proposed measures. The main advantage of this approach is the early evaluation of ETL process models, which supports designers in their maintenance tasks. This proposal is based on UML (Unified Modeling Language) activity diagrams for modeling ETL processes and on the adoption of the FMESP (Framework for the Modeling and Evaluation of Software Processes) framework.
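To suggest what structural-complexity measures over an activity-diagram-like ETL model might look like, the sketch below counts activities and transitions in a toy model represented as an adjacency mapping. These two counts are generic illustrations in the spirit of such measures, not the metric definitions from the paper.

```python
# Toy ETL process model as an activity graph: each activity maps to the
# list of activities it transitions to. Names are invented for the example.
model = {
    "extract":       ["check_nulls"],
    "check_nulls":   ["convert_types"],
    "convert_types": ["load"],
    "load":          [],
}

def n_activities(model):
    """Count of activity nodes in the model."""
    return len(model)

def n_transitions(model):
    """Count of control-flow edges between activities."""
    return sum(len(targets) for targets in model.values())
```

Measured at the conceptual level, such counts can flag overly complex ETL models before any implementation effort is spent, which is the early-evaluation benefit the abstract emphasizes.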
Nowadays, with the amount of data having increased incredibly over the years, extracting and integrating useful data from multiple information sources still faces significant challenges in semantically integrating heterogeneous sources into a data warehouse. Within the scope of this paper, a semantic coupling of a metamodel with an ontology is applied, describing, at high abstraction levels, application domains for both schema and semantic integration. Specifically, the proposed framework can improve interoperability in ETL processes by means of the schema-based semantics of CWM-compliant metamodels as well as an ontology-based foundation for better representation and management of the underlying domain semantics. Furthermore, a potential approach to mapping CWM-based model elements and ontology constructs for the definition of the metadata required for ETL processes is introduced, facilitating the extraction, transformation and loading of useful data from distributed and heterogeneous sources.