Abstract: Existing military logistics data warehouses perform poorly when handling large-scale data. To improve military logistics support capability, this paper proposes a construction scheme for a dual-computing-engine military logistics data warehouse based on Hive. The scheme adopts a dual Hadoop and Spark architecture to support efficient storage of heterogeneous data. In addition, it uses a data block placement strategy based on correlation analysis. The intersection relationship matrix and the concurrency relationship matrix are used to calculate the correlation between the data blocks already placed on a node and the data block to be placed. The access frequency of the target data block is then taken into account to select an appropriate storage node. The scheme can effectively improve Hive's query efficiency and provide commanders with more effective decision-support information.
Nowadays, the data used for decision-making come from a wide variety of sources which are difficult to manage using relational databases. To address this problem, many researchers have turned to Not only SQL (NoSQL) databases to provide scalability and flexibility for On-Line Analytical Processing (OLAP) systems. In this paper, we propose a set of formal rules to convert a multidimensional data model into a graph data model (MDM2G). These rules allow conventional star and snowflake schemas to fit into NoSQL graph databases. We apply the proposed rules to implement star-like and snowflake-like graph data warehouses. We compare their performance to that of similar relational ones, focusing on the data model, dimensionality, and size. The experimental results show large differences between relational and graph implementations of a data warehouse. A relational implementation performs better for queries on a couple of tables, whereas a graph implementation is better when queries involve many tables. Surprisingly, the performance of star-like and snowflake-like graph data warehouses is very close. Hence a snowflake schema could be used in order to easily consider new sub-dimensions in a graph data warehouse.
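The abstract above does not spell out the MDM2G rules, so the following is only a minimal sketch of one plausible star-to-graph mapping, not the authors' rule set. It assumes each dimension row becomes a node, each fact row becomes a node carrying its measures, and each foreign key becomes an edge. The names Graph, star_to_graph, and the HAS_* relation labels are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Tiny in-memory property graph (hypothetical helper, not a real graph DB API)."""
    nodes: dict = field(default_factory=dict)  # node_id -> (label, properties)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node_id, label, props):
        self.nodes[node_id] = (label, dict(props))

    def add_edge(self, src, rel, dst):
        self.edges.append((src, rel, dst))


def star_to_graph(fact_name, fact_rows, dimensions):
    """Map a star schema to a graph.

    dimensions: {dim_name: (fk_column, {key: row_dict})}
    Assumed mapping: dimension row -> node, fact row -> node, foreign key -> edge.
    """
    g = Graph()
    # One node per dimension row, identified by (dimension name, key).
    for dim_name, (_, rows) in dimensions.items():
        for key, row in rows.items():
            g.add_node((dim_name, key), dim_name, row)
    fk_columns = {fk for fk, _ in dimensions.values()}
    for i, row in enumerate(fact_rows):
        # Measures are the fact columns that are not foreign keys.
        measures = {k: v for k, v in row.items() if k not in fk_columns}
        fact_id = (fact_name, i)
        g.add_node(fact_id, fact_name, measures)
        # One edge from the fact node to each referenced dimension node.
        for dim_name, (fk, _) in dimensions.items():
            g.add_edge(fact_id, f"HAS_{dim_name.upper()}", (dim_name, row[fk]))
    return g


# Illustrative data only.
dims = {
    "product": ("product_id", {1: {"name": "pen"}, 2: {"name": "ink"}}),
    "store": ("store_id", {10: {"city": "Lyon"}}),
}
facts = [
    {"product_id": 1, "store_id": 10, "amount": 3.5},
    {"product_id": 2, "store_id": 10, "amount": 7.0},
]
graph = star_to_graph("sale", facts, dims)
```

Under this mapping a snowflake schema would only add edges between dimension nodes and their sub-dimension nodes, which is consistent with the abstract's remark that sub-dimensions are easy to add in a graph data warehouse.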
Among 1376 patients with Covid-19 admitted to a New York City hospital, 59% were treated with hydroxychloroquine. Patients selected for treatment were more severely ill. After adjustment for patients’ baseline characteristics, there was no significant association between hydroxychloroquine use and intubation or death (hazard ratio, 1.04; 95% CI, 0.82 to 1.32).
This paper takes Deppon Express Company as an example to study the optimization of store warehouse layout under a standardized operation mode. First, it analyzes the current warehouse layout of the express company and identifies its problems: the layout lacks a scientific basis, some operational areas are poorly positioned, and movement paths are unreasonable, all of which lower overall operating efficiency. It then applies SLP to solve these problems and, based on the comprehensive relationships between areas, derives a preliminary layout optimization diagram and proposes a management strategy for optimizing the company's store warehouse layout.
Multi-model DBMSs (MMDBMSs) have been recently introduced to store and seamlessly query heterogeneous data (structured, semi-structured, graph-based, etc.) in their native form, aimed at effectively preserving their variety. Unfortunately, when it comes to analyzing these data, traditional data warehouses (DWs) and OLAP systems fall short because they rely on relational DBMSs for storage and querying, thus constraining data variety into the rigidity of a structured, fixed schema. In this paper, we investigate the performances of an MMDBMS when used to store multidimensional data for OLAP analyses. A multi-model DW would store each of its elements according to its native model; among the benefits we envision for this solution, that of bridging the architectural gap between data lakes and DWs, that of reducing the cost for ETL, and that of ensuring better flexibility, extensibility, and evolvability thanks to the combined use of structured and schemaless data. To support our investigation we define a multidimensional schema for the UniBench benchmark dataset and an ad-hoc OLAP workload for it. Then we propose and compare three logical solutions implemented on the PostgreSQL multi-model DBMS: one that extends a star schema with JSON, XML, graph-based, and key–value data; one based on a classical (fully relational) star schema; and one where all data are kept in their native form (no relational data are introduced). As expected, the full-relational implementation generally performs better than the multi-model one, but this is balanced by the benefits of MMDBMSs in dealing with variety. Finally, we give our perspective view of the research on this topic.
• We investigate the performances of a multi-model DBMS to store multidimensional data for OLAP analyses.
• We define a multidimensional schema for the UniBench benchmark dataset and an ad-hoc OLAP workload for it.
• We propose and quantitatively compare three logical solutions implemented on the PostgreSQL multi-model DBMS.
• The querying performances of a multi-model solution are slightly worse than those of a full-relational solution.
• A multi-model solution brings advantages in terms of extendibility, flexibility, evolvability, and ETL.
Abstract: Monitoring the operating state of power equipment is easily affected by the external environment, which produces a large volume of operating-state data and degrades the monitoring effect. A method for monitoring the operating condition of power equipment based on a data warehouse is proposed. A data warehouse is established by analyzing the constraints of power equipment operating-condition monitoring. The transition probability and stability probability are determined with a Markov chain model, and the characteristic values of the operating-state level variables are obtained by calculating the comprehensive correlation value and sensitivity. The entropy weight coefficient method is used to calculate the weight of each performance index of the monitored object, from which the operating-state monitoring model of the power equipment is constructed and the monitoring of its operating state is realized. The experimental results show that the improved method can effectively improve monitoring efficiency and shorten monitoring time.
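The entropy weight step mentioned above can be illustrated with the textbook form of the entropy weight method. This is a sketch under stated assumptions, not the paper's exact formulation: rows are observations, columns are performance indices, all values are non-negative, and each column has a positive sum. The function name entropy_weights and the sample readings are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Return one weight per column using the standard entropy weight method.

    Assumes at least two rows, non-negative values, and a positive column sum.
    """
    m = len(matrix)      # number of observations
    n = len(matrix[0])   # number of performance indices
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [x / total for x in column]
        # Shannon entropy normalized to [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        # Clamp guards against tiny negative values from floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        # Every index is uniform across observations: fall back to equal weights.
        return [1.0 / n] * n
    return [d / total_divergence for d in divergences]


# Illustrative readings: the first index never varies, the second varies a lot.
readings = [
    [1.0, 1.0],
    [1.0, 2.0],
    [1.0, 9.0],
]
weights = entropy_weights(readings)
```

An index whose values barely differ across observations carries little information and receives a weight near zero, while a strongly varying index receives most of the weight.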
Radio frequency identification (RFID) has been widely used to support logistics management on manufacturing shopfloors, where production resources fitted with RFID devices are converted into smart manufacturing objects (SMOs) that are able to sense, interact, and reason to create a ubiquitous environment. Within such an environment, enormous amounts of data can be collected and used to support further decision-making such as logistics planning and scheduling. This paper proposes a holistic Big Data approach to mine frequent trajectories from massive RFID-enabled shopfloor logistics data, with several innovations highlighted. Firstly, RFID-Cuboids are introduced to establish a data warehouse so that the RFID-enabled logistics data can be highly integrated in terms of tuples, logic, and operations. Secondly, a Map Table is used for linking various cuboids so that information granularity can be enhanced and dataset volume can be reduced. Thirdly, spatio-temporal sequential logistics trajectories are defined and mined so that logistics operators and machines can be evaluated quantitatively. Finally, key findings from the experimental results and insights from the observations are summarized as managerial implications, which can guide end-users in making the associated decisions.
Temporal Graph Cube
Wang, Guoren; Zeng, Yue; Li, Rong-Hua ...
IEEE Transactions on Knowledge and Data Engineering, 12/2023, Volume 35, Issue 12
Journal Article
Peer reviewed
Data warehouse and OLAP (Online Analytical Processing) are effective tools for decision support on traditional relational data and static multidimensional network data. However, many real-world multidimensional networks are often modeled as temporal multidimensional networks, where the edges in the network are associated with temporal information. Such temporal multidimensional networks typically cannot be handled by traditional data warehouse and OLAP techniques. To fill this gap, we propose a novel data warehouse model, named Temporal Graph Cube, to support OLAP queries on temporal multidimensional networks. Through supporting OLAP queries in any time range, users can obtain summarized information of the network in the time range of interest, which cannot be derived by using traditional static graph OLAP techniques. We propose a segment-tree based indexing technique to speed up the OLAP queries, and also develop an index-updating technique to maintain the index when the temporal multidimensional network evolves over time. In addition, we also propose a novel concept called similarity of snapshots, which shows a strong correlation with the efficiency of the indexing technique and can provide a good reference on the necessity of building the index. The results of extensive experiments on two large real-world datasets demonstrate the effectiveness and efficiency of the proposed method.
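The abstract does not describe the internals of the segment-tree index, so the following is a generic sketch of the underlying idea rather than the authors' design. It assumes the temporal network is given as a list of per-timestamp snapshots, each mapping an edge to a positive weight, and that a time-range query sums edge weights over the range. The class name SnapshotSegmentTree and the sample data are hypothetical.

```python
from collections import Counter


class SnapshotSegmentTree:
    """Iterative segment tree whose nodes hold summed edge weights.

    Assumes positive edge weights (Counter addition drops non-positive counts).
    """

    def __init__(self, snapshots):
        self.n = len(snapshots)
        self.tree = [Counter() for _ in range(2 * self.n)]
        # Leaves hold the individual snapshots.
        for i, snap in enumerate(snapshots):
            self.tree[self.n + i] = Counter(snap)
        # Each internal node aggregates its two children.
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, t, snapshot):
        """Replace the snapshot at time index t and repair its ancestors."""
        i = self.n + t
        self.tree[i] = Counter(snapshot)
        i //= 2
        while i >= 1:
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]
            i //= 2

    def query(self, start, end):
        """Sum edge weights over time indices start .. end - 1."""
        result = Counter()
        lo, hi = start + self.n, end + self.n
        while lo < hi:
            if lo & 1:
                result += self.tree[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                result += self.tree[hi]
            lo //= 2
            hi //= 2
        return result


# Illustrative snapshots: edge -> weight at each timestamp.
snapshots = [
    {("a", "b"): 1},
    {("a", "b"): 2, ("b", "c"): 1},
    {("b", "c"): 4},
    {("a", "c"): 1},
]
index = SnapshotSegmentTree(snapshots)
middle = index.query(1, 3)
```

A range query touches a logarithmic number of precomputed nodes instead of every snapshot in the range, and an update only recomputes the ancestors of one leaf, which is the general reason a segment tree suits both time-range queries and an evolving network.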
How Big Data Is Different
Davenport, Thomas H; Barth, Paul; Bean, Randy
MIT Sloan Management Review, 09/2012, Volume 54, Issue 1
Journal Article
Peer reviewed
Many people in the information technology world believe that big data will give companies new capabilities and value. But companies have been dealing with an exponentially increasing amount of data, and much of it in forms that are impossible to manage by traditional analytics. Big data includes information such as call center voice data, social media content and video entertainment, as well as clickstream data from the web. The authors posit that organizations that are learning to take advantage of big data are beginning to understand their business environments at a more granular level, are creating new products and services, and are responding more quickly to change as it occurs. These companies stand apart from those with traditional data analysis environments in three critical ways. First, rather than looking at data to assess what happened in the past, these organizations consider data in terms of flows and processes, and make decisions and take actions quickly. In addition, organizations already involved with big data are taking a lead on hiring and training data scientists and product and process developers as opposed to data analysts. And finally, advanced organizations are moving analytics from IT into their core business and operational functions. As big data evolves, a new information ecosystem is also evolving, a network that is continuously sharing information, optimizing decisions, communicating results and generating new insights for businesses.
Digital data collection during routine clinical practice is now ubiquitous within hospitals. The data contains valuable information on the care of patients and their response to treatments, offering exciting opportunities for research. Typically, data are stored within archival systems that are not intended to support research. These systems are often inaccessible to researchers and structured for optimal storage, rather than interpretability and analysis. Here we present MIMIC-IV, a publicly available database sourced from the electronic health record of the Beth Israel Deaconess Medical Center. Information available includes patient measurements, orders, diagnoses, procedures, treatments, and deidentified free-text clinical notes. MIMIC-IV is intended to support a wide array of research studies and educational material, helping to reduce barriers to conducting clinical research.