Chantola Kit, Toshiyuki Amagasa and Hiroyuki Kitagawa
The purpose of this paper is to propose efficient algorithms for structural grouping over Extensible Markup Language (XML) data, called TOPOLOGICAL ROLLUP (T‐ROLLUP), which are to…
Abstract
Purpose
The purpose of this paper is to propose efficient algorithms for structural grouping over Extensible Markup Language (XML) data, called TOPOLOGICAL ROLLUP (T‐ROLLUP), which are to compute aggregation functions based on XML data with multiple hierarchical levels. They play important roles in the online analytical processing of XML data, called XML‐OLAP, with which complex analysis over XML can be performed to discover valuable information from XML.
Design/methodology/approach
Several variations of algorithms are proposed for efficient T‐ROLLUP computation. First, two basic algorithms, top‐down algorithm (TDA) and bottom‐up algorithm (BUA), are presented in which the well‐known structural‐join algorithms are used. The paper then proposes more efficient algorithms, called single‐scan by preorder number and single‐scan by postorder number (SSC‐Pre/Post), which are also based on structural joins, but have been modified from the basic algorithms so that multiple levels of grouping are computed with a single scan over node lists. In addition, the paper attempts to adopt the algorithm for parallel execution in multi‐core environments.
Findings
Several experiments are conducted with XMark and synthetic XML data to show the effectiveness of the proposed algorithms. The experiments show that proposed algorithms perform much better than the naïve implementation. In particular, the proposed SSC‐Pre and SSC‐Post perform better than TDA and BUA for all cases. Beyond that, the experiment using the parallel single scan algorithm also shows better performance than the ordinary basic algorithm.
Research limitations/implications
This paper focuses on the T‐ROLLUP operation for XML data analysis. For this reason, other operations related to XML‐OLAP, such as CUBE, WINDOWING, and RANKING should also be investigated.
Originality/value
The paper presents an extended version of one of the award winning papers at iiWAS2008.