Reducing energy consumption in the buildings sector requires significant changes, but technology alone may fail to guarantee efficient energy performance. Human behavior plays a pivotal role in building design, operation, management and retrofit, and is a crucial positive factor for improving the indoor environment while reducing energy use at low cost. Over the past 40 years, a substantial body of literature has explored the impacts of human behavior on building technologies and operation. Often, need-action-event cognitive theoretical frameworks were used to represent human-machine interactions. In Part I of this paper, more than 130 published behavioral studies and frameworks are reviewed. A large variety of data-driven behavioral models have been developed based on field monitoring of the human-building-system interaction. These studies are scattered geographically around the world and lack standardization and consistency, which makes them difficult to compare with one another. To address this problem, an ontology to represent energy-related occupant behavior in buildings is presented. Accordingly, the technical DNAs framework is developed based on four key components: i) the Drivers of behavior, ii) the Needs of the occupants, iii) the Actions carried out by the occupants, and iv) the building Systems acted upon by the occupants. This DNAs framework is envisioned to support the international research community in standardizing a systematic representation of energy-related occupant behavior in buildings. Part II of this paper further develops the DNAs framework as an XML (eXtensible Markup Language) schema, obXML, for the exchange of occupant information modeling and integration with building simulation tools.
• A framework represents an ontology of energy-related occupant behavior in buildings.
• More than 130 relevant papers are reviewed to develop the ontology.
• The DNAs framework has four key elements: drivers, needs, actions, and systems.
• The framework standardizes the representation of occupant behaviour in buildings.
• Researchers and practitioners can adopt the framework for their behaviour studies.
The gap between storing data in relational databases and transferring data in the form of XML has been closed, e.g., by SQL/XML queries that generate XML data out of relational data sources. However, only a few relational database systems support the evaluation of SQL/XML queries. Even in the systems that do support SQL/XML, the evaluation of such queries is quite slow compared to the evaluation of SQL queries. In this paper, we present S2CX, an approach that allows SQL/XML queries to be evaluated efficiently on any relational database system, whether or not it supports SQL/XML. As the result of an SQL/XML query, S2CX supports different output formats, ranging from plain XML to different compressed XML representations, including a succinct encoding of XML data, schema-aware compressed XML, and grammar-compressed XML. In many cases, S2CX produces compressed XML as the result of an SQL/XML query even faster than Oracle 11g and DB2 evaluate SQL/XML queries into non-compressed XML. Furthermore, our approach to query evaluation scales better, i.e., the larger the dataset, the greater the speed advantage of our approach over SQL/XML query evaluation in Oracle 11g and in DB2.
• SQL/XML query evaluation generating compressed and non-compressed XML formats.
• Supports SQL/XML on all relational databases, even those not supporting SQL/XML.
• Generates directly compressed XML formats without need to generate XML format first.
• In most cases, our approach outperforms SQL/XML query evaluation of DB2 and Oracle.
• Our approach scales better than SQL/XML query evaluation of DB2 and of Oracle.
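The basic idea of producing XML from a database that has no SQL/XML support can be illustrated with a plain SQL query whose rows are wrapped in XML on the client side. This is a minimal sketch only, not the S2CX algorithm: the `book` table, the `rows_to_xml` helper, and the element names are hypothetical, and the output is uncompressed XML.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_tag, row_tag):
    """Run a plain SQL query and wrap each result row in an XML element.

    Assumes the column names are valid XML element names.
    """
    cursor = conn.execute(query)
    columns = [desc[0] for desc in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        row_elem = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            child = ET.SubElement(row_elem, name)
            child.text = "" if value is None else str(value)
    # Special characters such as & are escaped during serialization.
    return ET.tostring(root, encoding="unicode")


# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO book (id, title) VALUES (?, ?)",
    [(1, "XML Basics"), (2, "SQL & XML")],
)
xml_text = rows_to_xml(
    conn, "SELECT id, title FROM book ORDER BY id", "books", "book"
)
print(xml_text)
```

Because the database only ever sees ordinary SQL, a sketch like this runs on any relational system; the cost is that the XML construction happens outside the query engine.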
Uncertain information is widespread in data- and knowledge-intensive applications, where fuzzy data play an important role. Fuzzy set theory has been extensively applied to extend various database models, resulting in numerous contributions. This paper concentrates on a crucial issue in fuzzy data management: fuzzy data modeling in XML. It provides an up-to-date overview of the current state of the art in fuzzy XML data modeling. Besides giving a generic overview of the approaches proposed for modeling fuzzy XML data, the paper identifies possible research opportunities in the area of fuzzy XML data management.
Extensible Markup Language (XML) technology is widely used for data exchange and data representation in both online and offline modes. This structured format can be transformed into other formats and used to share information across platforms. XML is simple, yet it is designed to accommodate change. This paper presents a study on the transformation of XML documents into relational databases. A crucial part of this process is preserving the hierarchy and the relationships between data in the document when it is stored in the database. Each approach discussed in this paper uses its own data storage technique and database design. Therefore, each algorithm is assessed with three datasets consisting of small, medium and large XML files. The efficiency of the algorithms is tested in terms of the time taken for data storage and query execution. At the end of the evaluation, we discuss factors that affect algorithm performance and present suggestions for improving the mapping scheme in future work.
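One generic way to keep the hierarchy when storing XML relationally is an edge table in which every element row points to its parent. The sketch below uses that mapping purely as an illustration; it is not claimed to be one of the algorithms evaluated in the paper, and the `node` table, its columns, and the sample document are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, xml_text):
    """Store every element as one row; parent_id preserves the hierarchy."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "position INTEGER, tag TEXT, text TEXT)"
    )
    next_id = 0

    def insert(elem, parent_id, position):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, position, elem.tag, (elem.text or "").strip()),
        )
        # position records sibling order, which plain rows would lose.
        for index, child in enumerate(elem):
            insert(child, node_id, index)

    insert(ET.fromstring(xml_text), None, 0)


# Hypothetical sample document.
doc = (
    "<library><book><title>A</title></book>"
    "<book><title>B</title></book></library>"
)
conn = sqlite3.connect(":memory:")
shred(conn, doc)

# The path library/book/title becomes a chain of self-joins on parent_id.
titles = [row[0] for row in conn.execute(
    "SELECT t.text FROM node AS t "
    "JOIN node AS b ON t.parent_id = b.id "
    "JOIN node AS l ON b.parent_id = l.id "
    "WHERE t.tag = 'title' AND b.tag = 'book' AND l.tag = 'library' "
    "ORDER BY b.position"
)]
print(titles)
```

The self-joins show why query time is a natural evaluation metric: each step of a path costs one join under this mapping, so deeper paths get more expensive.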
Modern enterprise systems can be composed of many web services (e.g., SOAP and RESTful). Users of such systems might not have direct access to those services, and instead interact with them through a single entry point that provides a GUI (e.g., a web page or a mobile app). Although the interactions with such an entry point might be secure, a hacker could trick such systems into sending malicious inputs to those internal web services. A typical example is XML injection targeting SOAP communications. Previous work has shown that it is possible to automatically generate such attacks using search-based techniques. In this paper, we improve upon previous results by providing more efficient techniques to generate such attacks. In particular, we investigate four different algorithms and two different fitness functions. A large empirical study, which also involves two industrial systems, shows that our technique is effective at automatically generating XML injection attacks.
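To make the XML injection scenario concrete, the following sketch shows a front end that builds an XML message by string concatenation, and how escaping the input keeps it as character data. The message layout, the function names, and the payload are hypothetical and hand-written; the search-based attack generation studied in the paper is not reproduced here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_unsafe(username):
    # Vulnerable: user input is concatenated into the message unescaped.
    return "<user><name>" + username + "</name><role>guest</role></user>"


def build_safe(username):
    # escape() replaces &, < and > so the input cannot introduce new tags.
    return "<user><name>" + escape(username) + "</name><role>guest</role></user>"


def roles(message):
    """Return the text of every <role> element an internal service would see."""
    return [role.text for role in ET.fromstring(message).iter("role")]


# Hypothetical malicious input that closes <name> and injects a <role>.
payload = "eve</name><role>admin</role><name>x"

unsafe_roles = roles(build_unsafe(payload))
safe_roles = roles(build_safe(payload))
print(unsafe_roles)
print(safe_roles)
```

In the unsafe variant the message stays well-formed but gains an extra `role` element, which is why such inputs can pass the entry point unnoticed and still change what the internal service receives.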
Extensible Markup Language (XML) is a widely used file format for data storage and transmission. Many XML processors support XPath, a query language that enables the extraction of elements from XML documents. These systems can be affected by logic bugs, which are bugs that cause the processor to return incorrect results. In order to tackle such bugs, we propose a new approach, which we realized as a system called XPress. As a test oracle, XPress relies on differential testing, which compares the results of multiple systems on the same test input, and identifies bugs through discrepancies in their outputs. As test inputs, XPress generates both XML documents and XPath queries. Aiming to generate meaningful queries that compute non-empty results, XPress selects a so-called targeted node to guide the XPath expression generation process. Using the targeted node, XPress generates XPath expressions that reference existing context related to the targeted node, such as its tag name and attributes, while also guaranteeing that a predicate evaluates to true before further expanding the query. We tested our approach on six mature XML processors, BaseX, eXist-DB, Saxon, PostgreSQL, libXML2, and a commercial database system. In total, we have found 27 unique bugs in these systems, of which 25 have been verified by the developers, and 20 of which have been fixed. XPress is efficient, as it finds 12 unique bugs in BaseX in 24 hours, which is 2× as fast as naive random generation. We expect that the effectiveness and simplicity of our approach will help to improve the robustness of many XML processors.
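The targeted-node idea and the differential oracle can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the XPress implementation: it uses the limited XPath subset of Python's `xml.etree.ElementTree` as the system under test, and a hand-written tree walk (`reference_eval`, a hypothetical helper) stands in for the second processor that a real differential setup would use.

```python
import random
import xml.etree.ElementTree as ET


def targeted_xpath(target):
    """Build a query from the target's own tag and attributes, so every
    predicate is true for the target and the result is non-empty.

    Assumes attribute values contain no quote characters.
    """
    path = ".//" + target.tag
    for name, value in sorted(target.attrib.items()):
        path += "[@{}='{}']".format(name, value)
    return path


def reference_eval(root, target):
    """Stand-in for a second processor: the same semantics as the
    generated query, computed by a plain document-order tree walk."""
    return [
        node for node in root.iter(target.tag)
        if node is not root
        and all(node.get(k) == v for k, v in target.attrib.items())
    ]


# Hypothetical test document.
doc = ET.fromstring(
    "<root><item id='1' kind='a'/><group><item id='2' kind='b'/>"
    "<item id='3' kind='a'/></group></root>"
)
rng = random.Random(0)
candidates = list(doc.iter())[1:]  # every node except the root
for _ in range(20):
    target = rng.choice(candidates)
    query = targeted_xpath(target)
    got = doc.findall(query)
    expected = reference_eval(doc, target)
    assert target in got  # non-empty by construction
    if got != expected:
        print("discrepancy for", query)
```

Any printed discrepancy would point to a bug in one of the two evaluators; deciding which one is wrong still requires manual inspection, as with any differential oracle.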
Programs that take highly-structured files as inputs normally process inputs in stages: syntax parsing, semantic checking, and application execution. Deep bugs are often hidden in the application execution stage, and it is non-trivial to automatically generate test inputs to trigger them. Mutation-based fuzzing generates test inputs by modifying well-formed seed inputs randomly or heuristically. Most inputs are rejected at the early syntax parsing stage. In contrast, generation-based fuzzing generates inputs from a specification (e.g., a grammar), which can quickly carry the fuzzing beyond the syntax parsing stage. However, most of these inputs fail to pass the semantic checking (e.g., by violating semantic rules), which restricts their capability of discovering deep bugs. In this paper, we propose a novel data-driven seed generation approach, named Skyfire, which leverages the knowledge in the vast amount of existing samples to generate well-distributed seed inputs for fuzzing programs that process highly-structured inputs. Skyfire takes as inputs a corpus and a grammar, and consists of two steps. The first step of Skyfire learns a probabilistic context-sensitive grammar (PCSG) to specify both syntax features and semantic rules, and the second step leverages the learned PCSG to generate seed inputs. We fed the collected samples and the inputs generated by Skyfire as seeds to AFL to fuzz several open-source XSLT and XML engines (i.e., Sablotron, libxslt, and libxml2). The results demonstrate that Skyfire can generate well-distributed inputs and thus significantly improve the code coverage (i.e., 20% for line coverage and 15% for function coverage on average) and the bug-finding capability of fuzzers. We also used the inputs generated by Skyfire to fuzz the closed-source JavaScript and rendering engine of Internet Explorer 11.
Altogether, we discovered 19 new memory corruption bugs (16 of which are new vulnerabilities, for which we received 33.5k USD in bug bounty rewards) and 32 denial-of-service bugs.
Clustering XML documents by structure is the task of grouping them by common structural components. Hitherto, this has been accomplished by looking at the occurrence of one preestablished type of structural components in the structures of the XML documents. However, the a-priori chosen structural components may not be the most appropriate for effective clustering. Moreover, it is likely that the resulting clusters exhibit a certain extent of inner structural inhomogeneity, because of uncaught differences in the structures of the XML documents, due to further neglected forms of structural components.
To overcome these limitations, a new hierarchical approach is proposed that allows multiple forms of structural components to be considered (if necessary) in order to isolate structurally homogeneous clusters of XML documents. At each level of the resulting hierarchy, clusters are divided by considering some type of structural components (unaddressed at the preceding levels) that still differentiate the structures of the XML documents. Each cluster in the hierarchy is summarized through a novel technique that provides a clear and differentiated understanding of its structural properties.
A comparative evaluation over both real and synthetic XML data proves that the devised approach outperforms established competitors in effectiveness and scalability. Cluster summarization is also shown to be very representative.
XML queries can be modeled by twig pattern queries (TPQs) specifying predicates on XML nodes and XPath relationships satisfied between them. Many TPQ types have been proposed; this paper considers a TPQ model extended by a specification of output and non-output query nodes, since it complies with the XQuery semantics and, in many cases, leads to more efficient query processing. In general, there are two types of approaches to processing a TPQ: holistic joins and binary joins. Whereas the binary join approach builds a query plan as a tree of interconnected binary operators, the holistic join approach evaluates a whole query using one operator (i.e., using one complex algorithm). Surprisingly, a thorough analytical and experimental comparison is still missing despite an enormous research effort in this area. In this paper, we try to fill this gap; we analytically and experimentally show that the binary joins used in a fully-pipelined plan (i.e., a plan where each join operation does not wait for the complete result of the previous operation and no explicit sorting is used) can often outperform the holistic joins, especially for TPQs with a higher ratio of non-output query nodes. The main contributions of this paper can be summarized as follows: (i) we introduce several improvements of existing binary join approaches that allow a fully-pipelined plan to be built for a TPQ considering non-output query nodes, (ii) we prove that for a certain class of TPQs such a plan has linear time complexity with respect to the size of the input and output as well as linear space complexity with respect to the XML document depth (i.e., the same complexity as the holistic join approaches), (iii) we show that our improved binary join approach outperforms the holistic join approaches in many situations, and (iv) we propose a simple combined approach that utilizes the advantages of both types of approaches.
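A single binary operator of the kind such plans are built from can be sketched as a stack-based ancestor-descendant join over interval-labelled nodes. This is a textbook-style illustration, not the improved algorithm of the paper: the (start, end) labelling scheme, the function names, and the sample document are assumptions. The join is written as a generator, so a consumer receives pairs without waiting for the complete result.

```python
import xml.etree.ElementTree as ET


def label(root):
    """Assign each element a (start, end) interval by depth-first numbering.

    Returns a dict mapping each tag to its intervals sorted by start.
    """
    intervals = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child)
        counter += 1
        intervals.setdefault(elem.tag, []).append((start, counter))

    visit(root)
    for nodes in intervals.values():
        nodes.sort()
    return intervals


def structural_join(ancestors, descendants):
    """Yield (ancestor, descendant) pairs; both inputs are sorted by start.

    The stack always holds a chain of nested ancestors, so its size is
    bounded by the document depth.
    """
    stack = []
    i = 0
    for d in descendants:
        # Push every ancestor candidate that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            while stack and stack[-1][1] < ancestors[i][0]:
                stack.pop()
            stack.append(ancestors[i])
            i += 1
        # Drop candidates that end before d starts.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        for a in stack:
            yield (a, d)


# Hypothetical sample document; the query is a//c.
doc = ET.fromstring("<a><b><c/></b><a><c/></a><c/></a>")
nodes = label(doc)
pairs = list(structural_join(nodes["a"], nodes["c"]))
print(pairs)
```

A node `x` is an ancestor of `y` exactly when `x.start < y.start` and `y.end < x.end`, which is what lets the join decide containment from the labels alone in one pass over both sorted inputs.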
In this work, we propose a methodology for annotating deictics and deixis in XML. The use of XML for corpus annotation has grown in recent years, with the publication of several methodologies and the refinement of existing ones. However, annotating deixis has posed problems because deixis operates at several levels. Indeed, the analysis of deixis depends on a set of linguistic elements, more or less explicit, that cannot be analyzed individually, but only in the relations they establish with one another and with the context in which texts are produced and circulated. This contingency leads to problems of overlapping levels of analysis. The methodology presented here is not only sensitive to the relations with the context of production and circulation of texts, but also allows these relations to be analyzed from several perspectives.