Search results
1 – 3 of 3Atsushi Keyaki, Kenji Hatano and Jun Miyzaki
Nowadays there are a large number of XML documents on the web. This means that information retrieval techniques for searching XML documents are very important and necessary for…
Abstract
Purpose
Nowadays there are a large number of XML documents on the web. This means that information retrieval techniques for searching XML documents are very important and necessary for internet users. Moreover, it is often said that users of search engines want to browse only relevant content in each document. Therefore, an effective XML element search aims to produce only the relevant elements or portions of an XML document. Based on the demand by users, the purpose of this paper is to propose and evaluate a method for obtaining more accurate search results in XML search.
Design/methodology/approach
The existing approaches generate a ranked list in descending order of each XML element's relevance to a search query; however, these approaches often extract irrelevant XML elements and overlook more relevant elements. To address these problems, the authors' approach extracts the relevant XML elements by considering the size of the elements and the relationships between the elements. Next, the authors score the XML elements to generate a refined ranked list. For scoring, the authors rank high the XML elements that are the most relevant to the user's information needs. In particular, each XML element is scored using the statistics of its descendant and ancestor XML elements.
Findings
The experimental evaluations show that the proposed method outperforms BM25E, a conventional approach, which neither reconstructs XML elements nor uses descendant and ancestor statistics. As a result, the authors found that the accuracy of an XML element search can be improved by reconstructing the XML elements and emphasizing the informative ones by applying the statistics of the descendant XML elements.
Research limitations/implications
This work focused on the effectiveness of XML element search and the authors did not consider the search efficiency in this paper. One of the authors' next challenges is to reduce search time.
Originality/value
The paper proposes a method for improving the effectiveness of XML element search.
Details
Keywords
Atsushi Keyaki, Jun Miyazaki, Kenji Hatano, Goshiro Yamamoto, Takafumi Taketomi and Hirokazu Kato
The purpose of this paper is to propose methods for fast incremental indexing with effective and efficient query processing in XML element retrieval. The effectiveness of a search…
Abstract
Purpose
The purpose of this paper is to propose methods for fast incremental indexing with effective and efficient query processing in XML element retrieval. The effectiveness of a search system becomes lower if document updates are not handled when these occur frequently on the Web. The search accuracy is also reduced if drastic changes in document statistics are not managed. However, existing studies of XML element retrieval do not consider document updates, although these studies have attained both effectiveness and efficiency in query processing. Thus, the authors add a function for handling document updates to the existing techniques for XML element retrieval.
Design/methodology/approach
Though it will be important to enable fast updates of indices, preliminary experiments have shown that a simple incremental update approach has two problems: some kinds of statistics are inaccurate, and it takes a long time to update indices. Therefore, two methods are proposed: one to approximate term weights accurately with a small number of documents, even for dynamically changing statistics; and the other to eliminate unnecessary update targets.
Findings
Experimental results show that this proposed system can update indices up to 32 per cent faster than the simple incremental updates while the search accuracy improved by 4 per cent compared with the simple approach. The proposed methods can also be fast and accurate in query processing, even if document statistics change drastically.
Originality/value
The paper shows that there could be a more practical XML element search engine, which can access the latest XML documents accurately and efficiently.
Details