Abstract
Purpose
Engines have been built that execute queries against XML data. The aim of this paper is to describe a novel technique that speeds up query execution by exploiting the semantics of the data in the XML document.
Design/methodology/approach
The paper formally introduces algorithms for optimizing XML queries, implements them, and demonstrates the improvement in speed through experimentation.
Findings
Three semantic query optimizations based on the values of elements are introduced. The experiments show that two of the three improve query performance and the third does not. A hypothesis is offered for why this is the case.
Research limitations/implications
A limitation is that the results depend on the query engine used and how it works. Future work includes executing the experiments on a different engine and comparing results; building a system to automatically generate the characteristics needed for the optimizations; describing the best way to represent and maintain those characteristics once they are found; and comparing the results of optimizations based on content with those of optimizations based on structure.
Practical implications
The optimizations could be incorporated into new query engines.
Originality/value
Novel algorithms for query optimization have been developed and shown experimentally to work. They are of value to people who are building database systems for XML data.
Abstract
Purpose
XML semantic query optimization (XSQO) is an important area in eXtensible Markup Language (XML) query processing. However, experiments evaluating semantic optimization methods often suffer from a lack of suitable datasets. To evaluate XSQO methods, it is necessary to be able to build datasets with specific characteristics, in particular to set the selectivity of embedded elements, the selectivity of element values, depth, fan‐out and size. The aim of this paper is to describe the requirements of such a generator and the challenges of building it.
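To make the listed characteristics concrete, the following is a minimal sketch of a synthetic document builder, not the generator described in the paper. It assumes a uniform tree in which depth and fan‐out together determine size, and it controls only the selectivity of element values; selectivity of embedded elements is not modelled. The function name `generate`, the tag names `group` and `item`, and the values `hit` and `other` are all hypothetical.

```python
import random
import xml.etree.ElementTree as ET


def generate(depth, fan_out, value_selectivity, target_value="hit", seed=0):
    """Build a uniform XML tree (hypothetical layout, for illustration only).

    depth             -- number of levels below the root (>= 1)
    fan_out           -- children per node at every level (>= 1)
    value_selectivity -- fraction of leaf elements whose text is target_value
    """
    if depth < 1 or fan_out < 1:
        raise ValueError("depth and fan_out must be at least 1")
    if not 0.0 <= value_selectivity <= 1.0:
        raise ValueError("value_selectivity must be in [0, 1]")

    rng = random.Random(seed)  # seeded so the same document can be rebuilt
    root = ET.Element("root")

    # Grow the tree level by level; the last level holds the leaf elements.
    frontier = [root]
    for level in range(depth):
        tag = "item" if level == depth - 1 else "group"
        frontier = [
            ET.SubElement(parent, tag)
            for parent in frontier
            for _ in range(fan_out)
        ]
    leaves = frontier  # fan_out ** depth elements in total

    # Pick an exact number of leaves to carry the target value, so the
    # selectivity of the value is fixed rather than merely expected.
    n_hits = round(value_selectivity * len(leaves))
    hits = set(rng.sample(range(len(leaves)), n_hits))
    for index, leaf in enumerate(leaves):
        leaf.text = target_value if index in hits else "other"

    return root
```

A document with depth 3, fan‐out 4 and value selectivity 0.25 can be built with `generate(3, 4, 0.25)` and serialized with `ET.tostring(root, encoding="unicode")`. Choosing the matching leaves by exact count rather than by independent coin flips is a design choice made here so that repeated experiments see the same selectivity.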
Design/methodology/approach
Because no existing generator offers this flexibility, the paper discusses the design and construction of such a generator.
Findings
The main characteristic of the generator is that it can adapt existing XML documents, including XML benchmarks, for experiments that evaluate XSQO methods. With the generator, users can quickly and directly modify not only the structure of XML documents but also their content.
Originality/value
The paper provides information of value to information technology professionals.