Enabling Massive XML-Based Biological Data Management in HBase Liu, Jian; Liu, Qiuru; Zhang, Lei ...
IEEE/ACM transactions on computational biology and bioinformatics,
2020 Nov-Dec, Volume 17, Issue 6
Journal Article
Peer reviewed
Publishing biological data in XML formats is attractive for organizations that would like to provide their bioinformatics resources in an extensible and machine-readable format. In the era of big data, massive XML-based biological data management has emerged as a challenging issue. With the continuous growth of XML-based biological data sets, traditional declarative query languages often fail to provide efficient query capabilities in terms of processing speed and scale. In this study, we report a novel platform to store and query massive XML-based biological data collections. A prototype tool for constructing HBase tables from XML-based biological data collections is first developed, and then a formal approach to transform the XML query model into the MapReduce query model is proposed. Finally, an evaluation of the query performance of the proposed approach on existing XML-based biological databases is presented, showing the performance advantages of the proposed solution. The source code of the massive XML-based biological data management platform is freely available at https://github.com/lyotvincent/X2H.
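As a rough illustration of the first step named in this abstract (building table rows from XML records), the following minimal Python sketch flattens each record into (row key, column, value) cells of the kind a wide-column store such as HBase holds. The layout, the `id` row-key attribute, the function names, and the sample data are all assumptions made for illustration; this is not the X2H schema or the authors' transformation.

```python
import xml.etree.ElementTree as ET


def _leaf_values(elem, path):
    """Yield (path, text) for every leaf element below elem."""
    children = list(elem)
    if not children:
        text = (elem.text or "").strip()
        if text:
            yield path, text
    for child in children:
        yield from _leaf_values(child, f"{path}/{child.tag}")


def xml_to_cells(xml_text, row_key_attr="id"):
    """Flatten each child of the root into (row_key, column, value) cells.

    Hypothetical layout: the row key is taken from an attribute of the
    record element, and the column name is the slash-separated tag path.
    Repeated sibling tags produce repeated column names in this sketch.
    """
    root = ET.fromstring(xml_text)
    cells = []
    for record in root:
        row_key = record.get(row_key_attr)
        for path, value in _leaf_values(record, record.tag):
            cells.append((row_key, path, value))
    return cells


# Hypothetical sample data, not taken from any real database.
SAMPLE = """
<proteins>
  <protein id="P1">
    <name>Kinase</name>
    <organism><taxon>9606</taxon></organism>
  </protein>
  <protein id="P2">
    <name>Ligase</name>
  </protein>
</proteins>
"""

cells = xml_to_cells(SAMPLE)
# A path-style lookup then becomes a filter over cells, the kind of
# per-record work a map phase could perform in parallel.
names = [value for _, column, value in cells if column == "protein/name"]
```

With cells in this form, a path expression over the XML reduces to a filter on column names, which hints at why such queries might map naturally onto a MapReduce job.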
The current Special Issue of Publications is dedicated to PUBMET2022, The 9th Conference on Scholarly Communication in the Context of Open Science. The PUBMET conference aimed to provide a forum for the community involved in scholarly communication and the dissemination of knowledge, inviting researchers, information and communication specialists, librarians, editors, publishers, teachers, students, research funders, policy makers, repository managers, and other stakeholders involved in scholarly communication to discuss the current changes, developments, and advancements in scholarly communication from the perspective of open science. The PUBMET conference is open to individuals who are interested in learning more about and sharing their research results and experiences on the practices in open science. The current Special Issue contains submissions of research that reflect both practical and technical innovations, which serve the implementation of open science. The following topics are addressed in the present publication: Assessing the quality of research processes, research outputs, and publication channels; Re-designing open access: rights-retention strategies and alternatives to paid OA; FAIRness of open science; The potential of public engagement with science and environmental activism; Raising efficiency and effectiveness in scholarly communication.
Jena: a semantic Web toolkit McBride, B.
IEEE internet computing,
11/2002, Volume 6, Issue 6
Journal Article
Peer reviewed
HP Labs developed the Jena toolkit to make it easier to develop applications that use the semantic Web information model and languages. Jena is a Java application programming interface that is available as an open-source download from www.hpl.hp.com/semweb/jena-top.html.
Enabling XEP-0258 Security Labels in XMPP Jarvinen, Juha; Marttinen, Aleksi; Jarvinen, Risto ...
MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM),
2018-Oct.
Conference Proceeding
Labeling makes data easier to process and search, which is vital when handling large amounts of data. By using labeling to indicate security classes, the number of different applications can be reduced, as there would be no requirement to use a separate application for each security class. The Extensible Messaging and Presence Protocol (XMPP) is an Instant Messaging (IM) protocol for Federated Networks. Extension XEP-0258 defines security labeling in XMPP; however, not all systems support this draft standard. In this paper we present the processes and limitations of how different XMPP domains with different XMPP labeling capabilities can be joined together in a static or dynamic way. We have constructed an implementation to verify and validate the presented approach. In addition, we propose how XMPP domains that do not support XEP-0258 should be joined to mission networks, so that security labeling would work more smoothly in the future. Finally, we show that the mangling of security labels does not significantly increase the anticipated delay of the communication.
Web services work over dynamic connections among distributed systems. This technology was specifically designed to easily pass SOAP messages through firewalls using open ports. These benefits involve a number of security challenges, such as injection attacks, phishing, Denial-of-Service (DoS) attacks, and so on. The difficulty of detecting vulnerabilities before they are exploited encourages developers to use security testing, such as penetration testing, to reduce potential attacks. Taking a black-box approach, this research uses penetration testing to emulate a series of attacks: Cross-site Scripting (XSS), Fuzzing Scan, Invalid Types, Malformed XML, SQL Injection, XPath Injection, and XML Bomb. The soapUI vulnerability scanner was used to emulate these attacks and insert malicious scripts into the requests of the web services tested. Furthermore, a set of rules was developed to analyze the responses in order to reduce false positives and negatives. The results suggest that 97.1% of web services have at least one vulnerability to these attacks. We also determined a ranking of these attacks against web services.
Answering Pattern Queries Using Views Fan, Wenfei; Wang, Xin; Wu, Yinghui
IEEE transactions on knowledge and data engineering,
2016-02-01, Volume 28, Issue 2
Journal Article
Peer reviewed
Open access
Answering queries using views has proven effective for querying relational and semistructured data. This paper investigates this issue for graph pattern queries based on graph simulation. We propose a notion of pattern containment to characterize graph pattern matching using graph pattern views. We show that a pattern query can be answered using a set of views if and only if it is contained in the views. Based on this characterization, we develop efficient algorithms to answer graph pattern queries. We also study problems for determining (minimal, minimum) containment of pattern queries. We establish their complexity (from cubic-time to NP-complete) and provide efficient checking algorithms (approximation when the problem is intractable). In addition, when a pattern query is not contained in the views, we study maximally contained rewriting to find approximate answers; we show that such a rewriting can be computed in cubic time, and present a rewriting algorithm. We experimentally verify that these methods are able to efficiently answer pattern queries on large real-world graphs.
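For readers unfamiliar with graph simulation, the matching semantics this abstract builds on, the following minimal Python sketch computes the maximum simulation relation with a naive fixpoint loop. It illustrates plain graph simulation only, not the paper's view-based algorithms; the function name and the tiny example graph are hypothetical.

```python
def graph_simulation(q_labels, q_edges, g_labels, g_edges):
    """Return {pattern_node: set of matching data nodes} under simulation.

    A data node v simulates a pattern node u if their labels agree and,
    for every pattern edge (u, u2), v has a successor that simulates u2.
    If some pattern node ends up with no match, the pattern does not
    match the graph and every set is returned empty.
    """
    g_succ = {v: set() for v in g_labels}
    for a, b in g_edges:
        g_succ[a].add(b)
    q_succ = {u: set() for u in q_labels}
    for a, b in q_edges:
        q_succ[a].add(b)

    # Start from all label-compatible candidates, then prune to a fixpoint.
    sim = {
        u: {v for v, label in g_labels.items() if label == q_labels[u]}
        for u in q_labels
    }
    changed = True
    while changed:
        changed = False
        for u in q_labels:
            for v in list(sim[u]):
                # v fails if some pattern edge (u, u2) has no counterpart.
                if any(not (g_succ[v] & sim[u2]) for u2 in q_succ[u]):
                    sim[u].discard(v)
                    changed = True

    if any(not matches for matches in sim.values()):
        return {u: set() for u in q_labels}
    return sim


# Hypothetical example: pattern A -> B against a three-node data graph.
pattern_labels = {"a": "A", "b": "B"}
pattern_edges = [("a", "b")]
data_labels = {1: "A", 2: "B", 3: "A"}
data_edges = [(1, 2)]

result = graph_simulation(pattern_labels, pattern_edges, data_labels, data_edges)
```

In this example node 3 carries the right label but has no successor labeled B, so it is pruned, leaving node 1 as the only match for the pattern node `a`.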
With recent ICT technologies developed for smart energy systems, sensors are popularly used for metering energy consumption by devices in decentralized smart energy networks. Processing energy consumption data at the decentralized level will help to facilitate high-efficiency energy use. Many smart energy projects have published their data on the web for other scholars to reanalyze. However, such data tend to vary in storage format (e.g., CSV, JSON, XML) and focus more on the energy aspects. Moreover, adding information from outside data sources, such as weather, into the system is challenging because data from different sources are usually heterogeneous and exist as fragments on the web. In this paper, we propose using a semantic approach to manage the decentralized energy data. The main advantage of this work is that it builds a well-defined interoperable structure for organizing the energy data. Furthermore, a variety of cross-domain data sources can be integrated naturally by human words to enhance the reanalysis on the decentralized energy data.
Software cohesion concerns the degree to which the elements of a module belong together. Cohesive software is easier to understand, test and maintain. In the context of service-oriented development, cohesion refers to the degree to which the operations of a service interface belong together. In the state of the art, software cohesion is improved based on refactoring methods that rely on information extracted from the software implementation. This is a main limitation to using these methods in the case of web services: web services do not expose their implementation; instead, all that they export is the web service interface specification. To deal with this problem, we propose an approach that enables the cohesion-driven decomposition of service interfaces, without information on how the services are implemented. Our approach progressively decomposes a given service interface into more cohesive interfaces; the backbone of the approach is a suite of cohesion metrics that rely on information extracted solely from the specification of the service interface. We validate the approach on 22 real-world services, provided by Amazon and Yahoo. We assess the effectiveness of the proposed approach, concerning the cohesion improvement and the number of interfaces that result from the decomposition of the examined interfaces. Moreover, we show the usefulness of the approach in a user study, where developers assessed the quality of the produced interfaces.
eXtensible Markup Language (XML) has been the de facto standard for data representation and exchange over the Web. In addition, imprecise and uncertain data are inherent in the real world. Although fuzzy data have been extensively investigated in the context of the relational model, the classical relational database model and its fuzzy extension to date do not satisfy the need to model complex objects with imprecision and uncertainty on the Web. On the basis of possibility theory, this paper concentrates on fuzzy information modeling in the fuzzy XML model and the fuzzy IFO model. In particular, a formal approach to mapping a fuzzy IFO model to a fuzzy document-type definition model is developed.