While Ajax programming and the plethora of JavaScript component libraries enable high-quality UIs in web applications, integrating them with page data is laborious and error-prone, as a developer has to handcode incremental modifications with trigger-based programming and manual coordination of data dependencies. The FORWARD web framework simplifies the development of Ajax applications through declarative, state-based templates. This declarative, data-centric approach is characterized by the principle of logical/physical independence, which the database community has often deployed successfully. It enables FORWARD to leverage database techniques, such as incremental view maintenance, updatable views, capability-based component wrappers and cost-based optimization, to automate efficient live visualizations. We demonstrate an end-to-end system implementation, including a web-based IDE (itself built in FORWARD), academic and commercial applications built in FORWARD and a wide variety of JavaScript components supported by the declarative templates.
Utilizing IDs to Accelerate Incremental View Maintenance Katsis, Yannis; Ong, Kian Win; Papakonstantinou, Yannis ...
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data,
05/2015
Conference Proceeding
Prior Incremental View Maintenance (IVM) algorithms specify the view tuples that need to be modified by computing diff sets, which we call tuple-based diffs since a diff set contains one diff tuple for each to-be-modified view tuple. idIVM assumes the base tables have keys and performs IVM by computing ID-based diff sets that compactly identify the to-be-modified tuples through their IDs.
This work makes the following contributions: (a) An ID-based IVM system for a large subset of SQL that includes the algebraic operators selection, join, grouping and aggregation, generalized projection involving functions, antisemijoin (and therefore negation/difference) and union. The system is based on a modular approach, allowing one to extend the supported language simply by adding one algebraic operator at-a-time, along with equations describing how ID-based changes are propagated through the operator. (b) An efficient algorithm that creates an IVM plan for a given view in four passes that are polynomial in the size of the view expression. (c) A formal analysis comparing the ID-based IVM algorithm to prior IVM approaches and analytically showing when one outperforms the other. (d) An experimental comparison of the ID-based IVM algorithm to prior IVM algorithms showing the superiority of the former in common use cases.
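The difference between tuple-based and ID-based diffs can be illustrated with a minimal Python sketch. It is not the idIVM algorithm itself: the tables `parts` and `orders`, the join view and all function names are hypothetical, chosen only to show that one base-table update may yield many tuple-based diff tuples but a single ID-based diff.

```python
from typing import Dict, List, Tuple

# Hypothetical base tables, keyed by their IDs (assumption for illustration).
parts: Dict[str, float] = {"p1": 10.0, "p2": 20.0}      # pid -> price
orders: Dict[str, Tuple[str, int]] = {                   # oid -> (pid, qty)
    "o1": ("p1", 2),
    "o2": ("p1", 5),
    "o3": ("p2", 1),
}


def materialize() -> Dict[str, dict]:
    """Materialized view: orders joined with parts, keyed by oid."""
    return {
        oid: {"pid": pid, "qty": qty, "price": parts[pid]}
        for oid, (pid, qty) in orders.items()
    }


def tuple_based_diff(pid: str, new_price: float) -> List[Tuple[str, float]]:
    """One diff tuple per affected view tuple; computing it reads `orders`."""
    return [(oid, new_price) for oid, (p, _) in orders.items() if p == pid]


def id_based_diff(pid: str, new_price: float) -> List[Tuple[str, float]]:
    """A single diff that identifies all affected view tuples by the part ID."""
    return [(pid, new_price)]


def apply_tuple_diff(view: Dict[str, dict], diff: List[Tuple[str, float]]) -> None:
    for oid, price in diff:
        view[oid]["price"] = price


def apply_id_diff(view: Dict[str, dict], diff: List[Tuple[str, float]]) -> None:
    # A real system would use an index on pid; a scan keeps the sketch short.
    for pid, price in diff:
        for row in view.values():
            if row["pid"] == pid:
                row["price"] = price
```

In this sketch, changing the price of `p1` produces two tuple-based diff tuples (one per order of `p1`) but only one ID-based diff, and both bring the view to the same state.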
NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, the many variations, and the lack of formal semantics inhibit deep understanding of the query languages, and also impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON native stores and SQL databases. The SQL++ semi-structured data model is a superset of both JSON and the SQL data model. SQL++ offers powerful computational capabilities for processing semi-structured data akin to prior non-relational query languages, notably OQL and XQuery. Yet, SQL++ is SQL backwards compatible and is generalized towards JSON by introducing only a small number of query language extensions to SQL. Recognizing that a query language standard is probably premature for the fast evolving area of NoSQL databases, SQL++ includes configuration options that formally itemize the semantics variations that language designers may choose from. The options often pertain to the treatment of semi-structuredness (missing attributes, heterogeneous types, etc.), where more than one sensible approach is possible. SQL++ is unifying: By appropriate choices of configuration options, the SQL++ semantics can morph into the semantics of existing semi-structured database query languages. The extensive experimental validation shows how SQL and four semi-structured database query languages (MongoDB, Cassandra CQL, Couchbase N1QL and AsterixDB AQL) are formally described by appropriate settings of the configuration options. Early adoption signs of SQL++ are positive: Version 4 of Couchbase's N1QL is explained as syntactic sugar over SQL++. AsterixDB will soon support the full SQL++ and Apache Drill is in the process of aligning with SQL++.
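The idea of configuration options for missing attributes can be sketched in Python as follows. This is only an illustration under assumptions: the option name `OnAbsent`, its three values and the helper functions are hypothetical and do not reproduce the actual SQL++ option syntax.

```python
from enum import Enum
from typing import Any, Dict, List


class OnAbsent(Enum):
    """Hypothetical settings for navigating into an absent attribute."""
    MISSING = "missing"   # return a special MISSING marker
    NULL = "null"         # return NULL (None here), as a SQL-like choice
    ERROR = "error"       # reject the navigation


class _Missing:
    """Singleton marker distinct from None, so MISSING and NULL stay apart."""
    def __repr__(self) -> str:
        return "MISSING"


MISSING = _Missing()


def navigate(record: Dict[str, Any], attr: str, on_absent: OnAbsent) -> Any:
    """Evaluate record.attr under the chosen configuration option."""
    if attr in record:
        return record[attr]
    if on_absent is OnAbsent.MISSING:
        return MISSING
    if on_absent is OnAbsent.NULL:
        return None
    raise KeyError(attr)


def project(records: List[Dict[str, Any]], attr: str, on_absent: OnAbsent) -> List[Any]:
    """Project one attribute over a heterogeneous collection."""
    return [navigate(r, attr, on_absent) for r in records]
```

The same projection over a collection with heterogeneous records therefore yields different results, or an error, depending only on the configured option, which is the kind of semantic variation the abstract describes.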
To facilitate queries over semi-structured data, various structural summaries have been proposed. Structural summaries are derived directly from data and serve as the indexes for evaluating path expressions. We introduce D(k)-index, an adaptive structural summary, for general graph-structured data. Building on previous 1-index and A(k)-index, D(k)-index is also based on the concept of bisimilarity. However, as a generalization of 1-index and A(k)-index, D(k)-index possesses the adaptive ability to adjust its structure to changes in query load. It also enables efficient update algorithms, which are crucial to real applications but have not been adequately addressed in previous literature. Our experiments show that D(k)-index is a more effective structural summary than previous static ones as a result of its query load sensitivity. In addition, the update operations on it can be performed more efficiently than on its predecessors.
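The bisimilarity-based grouping that underlies these summaries can be sketched in Python. The sketch below computes a uniform k-bisimilarity partition over incoming paths, in the spirit of the A(k)-index; it does not implement the per-node adaptivity of the D(k)-index, and the function name and input format are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple


def k_bisim_partition(
    labels: Dict[str, str],
    edges: List[Tuple[str, str]],
    k: int,
) -> List[List[str]]:
    """Group nodes that are k-bisimilar with respect to incoming paths.

    labels maps node -> label; edges are (parent, child) pairs.
    Two nodes are 0-bisimilar if they share a label; they are i-bisimilar if
    they are (i-1)-bisimilar and their parents fall into the same set of
    (i-1)-bisimilarity classes.
    """
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)

    # Round 0: the signature of a node is just its label.
    sig: Dict[str, Hashable] = dict(labels)

    # Each round refines the signature with the parents' previous signatures.
    for _ in range(k):
        sig = {
            n: (sig[n], frozenset(sig[p] for p in parents[n]))
            for n in labels
        }

    groups = defaultdict(list)
    for node, s in sig.items():
        groups[s].append(node)
    return sorted(sorted(g) for g in groups.values())
```

With k = 0, nodes are grouped by label alone; each additional round separates nodes whose incoming paths differ one step further back, so a larger k gives a finer and larger summary.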
Implementing even a conceptually simple web application requires an inordinate amount of time. FORWARD addresses three problems that reduce developer productivity: (a) Impedance mismatch across the multiple languages used at different tiers of the application architecture. (b) Distributed data access across the multiple data sources of the application (SQL database, user input of the browser page, session data in the application server, etc.). (c) Asynchronous, incremental modification of the pages, as performed by Ajax actions. FORWARD belongs to a novel family of web application frameworks that attack impedance mismatch by offering a single unifying language. FORWARD's language is SQL++, a minimally extended SQL. FORWARD's architecture is based on two novel cornerstones: (a) A Unified Application State (UAS), which is a virtual database over the multiple data sources. The UAS is accessed via distributed SQL++ queries, therefore resolving the distributed data access problem. (b) Declarative page specifications, which treat the data displayed by pages as rendered SQL++ page queries. The resulting pages are automatically incrementally modified by FORWARD. User input on the page becomes part of the UAS. We show that SQL++ captures the semi-structured nature of web pages and subsumes the data models of two important data sources of the UAS: SQL databases and JavaScript components. We show that simple markup is sufficient for creating Ajax displays and for modeling user input on the page as UAS data sources. Finally, we discuss the page specification syntax and semantics that are needed in order to avoid race conditions and conflicts between the user input and the automated Ajax page modifications. FORWARD has been used in the development of eight commercial and academic applications. An alpha-release web-based IDE (itself built in FORWARD) enables development in the cloud.
Canopy Kaldor, Jonathan; Mace, Jonathan; Bejda, Michał ...
Proceedings of the 26th Symposium on Operating Systems Principles,
10/2017
Conference Proceeding
Open access
This paper presents Canopy, Facebook's end-to-end performance tracing infrastructure. Canopy records causally related performance data across the end-to-end execution path of requests, including from browsers, mobile applications, and backend services. Canopy processes traces in near real-time, derives user-specified features, and outputs to performance datasets that aggregate across billions of requests. Using Canopy, Facebook engineers can query and analyze performance data in real-time. Canopy addresses three challenges we have encountered in scaling performance analysis: supporting the range of execution and performance models used by different components of the Facebook stack; supporting interactive ad-hoc analysis of performance data; and enabling deep customization by users, from sampling traces to extracting and visualizing features. Canopy currently records and processes over 1 billion traces per day. We discuss how Canopy has evolved to apply to a wide range of scenarios, and present case studies of its use in solving various performance challenges.
Existing encoding schemes and index structures proposed for XML query processing primarily target the containment relationship, specifically the parent–child and ancestor–descendant relationship. The presence of preceding-sibling and following-sibling location steps in the XPath specification, which is the de facto query language for XML, makes the horizontal navigation, besides the vertical navigation, among nodes of XML documents a necessity for efficient evaluation of XML queries. Our work enhances the existing range-based and prefix-based encoding schemes such that all structural relationships between XML nodes can be determined from their codes alone. Furthermore, an external-memory index structure based on the traditional B+-tree, XL+-tree (XML Location+-tree), is introduced to index element sets such that all defined location steps in the XPath language, vertical and horizontal, top-down and bottom-up, can be processed efficiently. The XL+-trees under the range or prefix encoding scheme actually share the same structure; but various search operations upon them may be slightly different as a result of the richer information provided by the prefix encoding scheme. Finally, experiments are conducted to validate the efficiency of the XL+-tree approach. We compare the query performance of XL+-tree with that of R-tree, which is capable of handling comprehensive XPath location steps and has been empirically shown to outperform other indexing approaches.
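How structural relationships can be decided from node codes alone can be sketched in Python with a range-based encoding. The exact enhancement used by the authors is not given in the abstract, so the sketch assumes one plausible extension: each code stores the node's (start, end) range plus the start value of its parent. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Code:
    start: int               # counter value when the node is entered
    end: int                 # counter value when the node is left
    parent: Optional[int]    # start value of the parent (assumed extension)


def encode(tree: Dict[str, List[str]], root: str) -> Dict[str, Code]:
    """Assign range codes by a depth-first traversal in document order."""
    codes: Dict[str, Code] = {}
    counter = 0

    def visit(node: str, parent_start: Optional[int]) -> None:
        nonlocal counter
        counter += 1
        start = counter
        for child in tree.get(node, []):
            visit(child, start)
        counter += 1
        codes[node] = Code(start, counter, parent_start)

    visit(root, None)
    return codes


def is_ancestor(a: Code, d: Code) -> bool:
    """Vertical step: a's range strictly contains d's range."""
    return a.start < d.start and d.end < a.end


def is_parent(p: Code, c: Code) -> bool:
    return c.parent == p.start


def is_following_sibling(x: Code, y: Code) -> bool:
    """Horizontal step: y shares x's parent and starts after x ends."""
    return x.parent is not None and x.parent == y.parent and y.start > x.end


def is_preceding_sibling(x: Code, y: Code) -> bool:
    """True if y is a preceding sibling of x."""
    return is_following_sibling(y, x)
```

For a small document such as `encode({"book": ["title", "ch1", "ch2"], "ch1": ["sec"]}, "book")`, every predicate above compares only integer fields of the two codes, without touching the document tree.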
SQL++: We Can Finally Relax Carey, Michael; Chamberlin, Don; Goo, Almann ...
2024 IEEE 40th International Conference on Data Engineering (ICDE),
2024-May-13
Conference Proceeding
SQL is five decades old and has outlasted many programming and query languages that have come and gone during its lifetime. It was born shortly after the introduction of the relational model, and was designed for querying a flat and typed tabular world. Support for modern, flexible data in the SQL standard and in relational database systems has largely been approached via the addition of new column types (e.g. XML or JSON) together with functions to operate on them. It is time for a cleaner solution that retains the benefits that have allowed SQL to be so successful for so long. We describe SQL++, a SQL extension that relaxes SQL's strictness in terms of both object structure (flat → nested) and schema (mandatory → optional), along with a multi-party effort to agree on a core definition and syntax supportable by multiple vendors. SQL++ sees relational data as a subset of a more flexible object model and it sees collections of document data (e.g., JSON) as a natural and supportable relaxation as opposed to a "bolt on" addition via a SQL column type. We describe the core features of SQL++ and explain how its definition can accommodate flexible data, while staying true to SQL in situations where the target data is tabular and strongly typed. Index Terms: semistructured data, query, JSON, SQL, NoSQL
Structural join has been established as a primitive technique for matching the binary containment pattern, specifically the parent–child and ancestor–descendant relationship, on the tree XML data. While current indexing approaches and evaluation algorithms proposed for the structural join operation assume the tree-structured data model, the presence of reference links in XML documents may render the underlying model a graph instead. In the more general category of semi-structured data, of which XML is an example, the data model is also usually supposed to be of graph structure. In this paper, we present an indexing approach and corresponding evaluation algorithms for efficiently performing the structural join operation on graph-structured data. Our approach encodes the structural containment relationship of a graph on multiple nested tree-structured layers, probably with the exception of the last one. With each tree-structured layer indexed with the inverted technique, the structural join operation on a graph can therefore be accomplished through recursively performing structural joins on nested layer trees. Our extensive experiments on both benchmark and synthetic XML data indicate that our proposed approach has good potential to perform significantly better than existing ones in terms of both the I/O and CPU cost.
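The tree-based structural join that serves as the primitive here can be sketched in Python. This is a generic stack-based ancestor–descendant join over (start, end) range codes, not the layered graph approach of the paper; the function name and input format are assumptions.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end) range code of a tree node


def structural_join(
    ancestors: List[Range],
    descendants: List[Range],
) -> List[Tuple[Range, Range]]:
    """Return all (a, d) pairs where a is a proper ancestor of d.

    Both inputs must be sorted by start and come from the same tree, so any
    two ranges are either nested or disjoint.
    """
    out: List[Tuple[Range, Range]] = []
    stack: List[Range] = []   # chain of nested candidate ancestors
    i = 0
    for d in descendants:
        # Push every candidate ancestor that starts before d.
        while i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            # Drop candidates that ended before a begins.
            while stack and stack[-1][1] < a[0]:
                stack.pop()
            stack.append(a)
            i += 1
        # Drop candidates that ended before d begins.
        while stack and stack[-1][1] < d[0]:
            stack.pop()
        # Everything left on the stack contains d.
        out.extend((a, d) for a in stack)
    return out
```

Each input list is scanned once, and the stack holds only the currently open chain of nested ancestors, which is why this style of join is a common building block for containment queries on trees.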
Ajax-based report pages as incrementally rendered views Fu, Yupeng; Kowalczykowski, Keith; Ong, Kian Win ...
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data,
06/2010
Conference Proceeding
While Ajax-based programming enables faster performance and higher interface quality over pure server-side programming, it is demanding and error-prone as each action that partially updates the page requires custom, ad-hoc code. The problem is exacerbated by distributed programming between the browser and server, where the developer uses JavaScript to access the page state and Java/SQL for the database. The FORWARD framework simplifies the development of Ajax pages by treating them as rendered views, where the developer declares a view using an extension of SQL and page units, which map to the view and render the data in the browser. Such a declarative approach leads to significantly less code, as the framework automatically solves performance optimization problems that the developer would otherwise hand-code. Since pages are fueled by views, FORWARD leverages years of database research on incremental view maintenance by creating optimization techniques appropriately extended for the needs of pages (nesting, variability, ordering), thereby achieving performance comparable to hand-coded JavaScript/Java applications.