In the era of Big Data, many NoSQL databases emerged for the storage and later processing of vast volumes of data, using data structures that can follow columnar, key-value, document or graph formats. For analytical contexts, requiring a Big Data Warehouse, Hive is used as the driving force, allowing the analysis of vast amounts of data. Data models in Hive are usually defined taking into consideration the queries that need to be answered. In this work, a set of rules is presented for the transformation of multidimensional data models into Hive tables, making data available at different levels of detail. These levels are suited for answering different queries, depending on the analytical needs. After the identification of the Hive tables, this paper summarizes a demonstration case in which the implementation of a specific Big Data architecture shows how the evolution from a traditional Data Warehouse to a Big Data Warehouse is possible.
Aspergillus fumigatus is a ubiquitously distributed filamentous fungus that has emerged as one of the most serious life-threatening pathogens in immunocompromised patients. The mechanisms for its pathogenicity are poorly understood. Here, we analyzed the proteome of dormant A. fumigatus conidia as the fungal entity having the initial contact with the host. Applying two-dimensional polyacrylamide gel electrophoresis (2-D PAGE), we established a 2-D reference map of conidial proteins. By MALDI-TOF mass spectrometry, we identified a total of 449 different proteins. We show that 57 proteins of our map are over-represented in resting conidia compared to mycelium. Enzymes involved in reactive oxygen intermediates (ROI) detoxification, pigment biosynthesis, and conidial rodlet layer formation were highly abundant in A. fumigatus spores and most probably account for their enormous stress resistance. Interestingly, pyruvate decarboxylase and alcohol dehydrogenase were detectable in dormant conidia, suggesting that alcoholic fermentation plays a role during dormancy or early germination. Moreover, we show that enzymes for rapid reactivation of protein biosynthesis and metabolic processes are preserved in resting conidia, which therefore feature the potential to immediately respond to an environmental stimulus by germination. The generated data lay the foundations for further proteomic analyses and a better understanding of fungal pathogenesis.
Most biological processes, including diseases, are multifactorial and determined by a complex interplay of various genetic and environmental factors. This chapter aims to provide a user guide to data querying, analysis, and visualization with TargetMine and the associated auxiliary toolkit. We also discuss some of the commonly used data queries for researchers who are interested in gene set analysis within a data warehouse framework. Overall, TargetMine provides a convenient web browser-based interface that enables the interactive discovery of new hypotheses, by performing analysis of omics data using complicated searches without any scripting or programming effort on the part of the user, and by providing the results in an easy-to-comprehend output format.
Maintenance of Data Warehouse (DW) systems is a critical task because any downtime or data loss can have significant consequences on business applications. Existing DW maintenance solutions mostly rely on concrete technologies and tools that are dependent on: the platform on which the DW system was created; the specific data extraction, transformation, and loading (ETL) tool; and the database language the DW uses. Different languages for different versions of DW systems make organizing DW processes difficult, as minimal changes in the structure require major changes in the application code for managing ETL processes. This article proposes a domain-specific language (DSL) for ETL process management that mitigates these problems by centralizing all program logic, making it independent from a particular platform. This approach would simplify DW system maintenance. The platform-independent language proposed in this article also provides an easier way to create a unified environment to control DW processes, regardless of the language, environment, or ETL tool the DW uses.
This study provides an AI-based detection tool for the surveillance of suspicious activities using data fusion. The system leverages time, location, and specific data pertaining to individuals, objects, and vehicles associated with the agency. The study's training data was obtained from Thailand's military institution. The study focuses on comparing the efficiency of MySQL and Apache Hive for big data processing. According to the findings, MySQL is better suited for quick data retrieval and low storage capacity, while Hive demonstrates better scalability for larger datasets. Furthermore, the study explores the practical utilization of web application interfaces, enabling real-time display, analysis, and identification of suspicious activity results. The web application, built with NuxtJS and MySQL, includes statistics charts and maps that show the status of suspicious items, cars, and people, as well as data filtering options. The system utilizes machine-learning algorithms to train the suspicious identification model, with the best-performing algorithm being the decision tree, reaching 98.867% classification accuracy.
R Bakery company is a company that produces bread every day. Its products comprise many different types of bread, made in the form of sweet bread and wheat bread, with a different taste for each type. During the production process, defects occur, and the defective products become rejects. The types of defects include burnt, sodden, and shapeless bread. To obtain information about the defects produced, a business intelligence system model was designed and applied to create a database and a data warehouse. The business intelligence system model generates useful information, such as how many defects are produced for each of the bakery products. To make such information easier to obtain, a data mining method is used to explore the data in depth, namely k-means clustering. The results of this business intelligence system model are three clusters: cluster 1 with a small number of defects, cluster 2 with a medium number of defects, and cluster 3 with a high number of defects. The OLAP cube shows that the defects generated during the 7-month period totaled 96,744 pieces.
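The clustering step described above can be illustrated with a minimal sketch. It assumes that each product is summarized by a single defect count and that three clusters are wanted; the product names, the counts, and the helper names `kmeans_1d` and `label_products` are hypothetical and are not taken from the study, which does not state its features or implementation.

```python
from statistics import mean


def kmeans_1d(values, k=3, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on one-dimensional data, for k >= 2."""
    ordered = sorted(values)
    # Deterministic start: pick k values spread evenly across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


def label_products(defects_by_product):
    """Label each product as low, medium, or high by its nearest centroid."""
    centroids, _ = kmeans_1d(list(defects_by_product.values()), k=3)
    ranked = sorted(centroids)
    names = ["low", "medium", "high"]
    return {
        product: names[min(range(3), key=lambda i: abs(count - ranked[i]))]
        for product, count in defects_by_product.items()
    }


# Hypothetical defect counts per product (illustrative only, not study data).
example_defects = {
    "sweet_a": 120,
    "sweet_b": 150,
    "wheat_a": 900,
    "wheat_b": 1000,
    "sweet_c": 4800,
    "wheat_c": 5200,
}

if __name__ == "__main__":
    for product, level in label_products(example_defects).items():
        print(product, level)
```

The deterministic initialization is a simplification chosen so the sketch is reproducible; library implementations typically use randomized seeding instead.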
Fall armyworm (Spodoptera frugiperda), a native insect species in the Americas, is rapidly becoming a major agricultural pest worldwide and is causing great damage to corn, rice, soybeans, and other crops. To control this pest, scientists have accumulated a great deal of high‐throughput data on fall armyworm, and nine versions of its genomes and transcriptomes have been published. However, easily accessing and performing integrated analysis of these omics data sets is challenging. Here, we developed the Fall Armyworm Genome Database (FAWMine, http://159.226.67.243:8080/fawmine/) to maintain genome sequences, structural and functional annotations, transcriptomes, co‐expression, protein interactions, homologs, pathways, and single‐nucleotide variations. FAWMine provides a powerful framework that helps users to perform flexible and customized searching, present integrated data sets using diverse visualization methods, output results tables in a range of file formats, analyze candidate gene lists using multiple widgets, and query data available in other InterMine systems. Additionally, stand‐alone JBrowse and BLAST services are established, allowing users to visualize RNA‐Seq data and search genome and annotated gene sequences. Altogether, FAWMine is a useful tool for querying, visualizing, and analyzing compiled data sets rapidly and efficiently. FAWMine will be continually updated to function as a community resource for fall armyworm genomics and pest control research.
Graphical abstract: Fall armyworm (Spodoptera frugiperda) is rapidly becoming a major agricultural pest worldwide. Here, we developed the Fall Armyworm Genome Database (FAWMine, http://159.226.67.243:8080/fawmine/) to maintain heterogeneous omics data, including genomes, transcriptomes, networks, and single‐nucleotide variations. FAWMine provides a powerful framework that helps users to perform flexible searching and analyze candidate gene lists using multiple widgets.
The development of Extract–Transform–Load (ETL) processes is the most complex, time-consuming and expensive phase of data warehouse development. Yet, the dynamics of modern business systems demand a more agile and flexible approach to their development. As a result, current research in this area is focused on ETL process conceptualization and the automation of ETL process development. This paper proposes a novel solution for automating ETL processes using the domain-specific modeling (DSM) approach. The proposed solution is based on the formal specification of ETL processes and the implementation of such formal specifications. Thus, in accordance with the DSM approach, several new domain-specific languages (DSLs) are introduced, each defining concepts relevant for a specific aspect of an ETL process. The focus of this paper is the actual implementation of the formal specification of an ETL process. To this end, a specific ETL platform (ETL-PL) is introduced to technologically support both the modeling of ETL processes (i.e., the creation of models in accordance with the introduced DSLs) and the automated transformation of the created models into the executable code of a specific application framework (representing ETL-PL’s execution environment). It should be emphasized that ETL-PL actually presumes the dynamic execution of ETL models or, more precisely, the executable code is generated at runtime. Thus the execution environment consists of code generator components and the components implementing the application framework. ETL-PL has been implemented as an extension of the .NET platform.
Telemedicine, a term that encompasses several applications and tasks, generally involves the remote management and treatment of patients by physicians. It is known as transversal telemedicine when practiced among health care professionals (HCPs).
We describe the experience of implementing our telemedicine Eumeda platform for HCPs over the last 10 years.
A web-based informatics platform was developed that had continuously updated hypertext created using advanced technology and the following features: security, data insertion, dedicated software for image analysis, and the ability to export data for statistical surveys. Customizable files called "modules" were designed and built for different fields of medicine, mainly in the ophthalmology subspecialty. Each module was used by HCPs with different authorization profiles.
Twelve representative modules for different projects are presented in this manuscript. These modules evolved over time, with varying degrees of interconnectivity, including the participation of a number of centers in 19 cities across Italy. The number of HCP operators involved in each single module ranged from 6 to 114 (average 21.8, SD 28.5). Data related to 2574 participants were inserted across all the modules. The average percentage of completed text/image fields in the 12 modules was 65.7%. All modules were evaluated in terms of access, acceptability, and medical efficacy. In their final evaluation, the participants judged the modules to be useful and efficient for clinical use.
Our results demonstrate the usefulness of the telemedicine platform for HCPs in terms of improved knowledge in medicine, patient care, scientific research, teaching, and the choice of therapies. It would be useful to start similar projects across various health care fields, considering that in the near future medicine as we know it will completely change.