Path-based keyword search over XML streams
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 17 August 2015
Abstract
Purpose
In purpose of this paper is to propose a novel scheme to process XPath-based keyword search over Extensible Markup Language (XML) streams, where one can specify query keywords and XPath-based filtering conditions at the same time. Experimental results prove that our proposed scheme can efficiently and practically process XPath-based keyword search over XML streams.
Design/methodology/approach
To allow XPath-based keyword search over XML streams, it was attempted to integrate YFilter (Diao et al., 2003) with CKStream (Hummel et al., 2011). More precisely, the nondeterministic finite automation (NFA) of YFilter is extended so that keyword matching at text nodes is supported. Next, the stack data structure is modified by integrating set of NFA states in YFilter with bitmaps generated from set of keyword queries in CKStream.
Findings
Extensive experiments were conducted using both synthetic and real data set to show the effectiveness of the proposed method. The experimental results showed that the accuracy of the proposed method was better than the baseline method (CKStream), while it consumed less memory. Moreover, the proposed scheme showed good scalability with respect to the number of queries.
Originality/value
Due to the rapid diffusion of XML streams, the demand for querying such information is also growing. In such a situation, the ability to query by combining XPath and keyword search is important, because it is easy to use, but powerful means to query XML streams. However, none of existing works has addressed this issue. This work is to cope with this problem by combining an existing XPath-based YFilter and a keyword-search-based CKStream for XML streams to enable XPath-based keyword search.
Keywords
Acknowledgements
This research was partly supported by the Grant-in-Aid for Scientific Research (B) (#26280037) and the program Research and Development on Real World Big Data Integration and Analysis of the Ministry of Education, Culture, Sports, Science and Technology, Japan.
Citation
Bou, S., Amagasa, T. and Kitagawa, H. (2015), "Path-based keyword search over XML streams", International Journal of Web Information Systems, Vol. 11 No. 3, pp. 347-369. https://doi.org/10.1108/IJWIS-04-2015-0013
Publisher
:Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited