Search results
1 – 4 of 4
Abstract
Purpose
This paper aims to propose a new keyword search method on graph data to improve the relevance of search results and reduce duplication of content nodes in the answer trees obtained by previous approaches based on distinct root semantics. The previous approaches are restricted to finding answer trees with different root nodes and thus often generate results consisting of answer trees with low relevance to the query or with duplicate content nodes. The proposed method allows limited redundancy in the root nodes of top-k answer trees to produce more effective query results.
Design/methodology/approach
A measure of redundancy in a set of answer trees with respect to their root nodes is defined and, based on this measure, a set of answer trees with limited root redundancy is proposed as the result of a keyword query on graph data. For efficient query processing, an index on the useful paths in the graph, built from inverted lists and a hash map, is suggested. Then, based on the path index, a top-k query processing algorithm is presented that finds the most relevant and diverse answer trees given a maximum amount of root redundancy allowed for a set of answer trees.
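The idea of limited root redundancy can be illustrated with a minimal sketch. The abstract does not give the actual measure or algorithm, so everything here is an assumption made for illustration: redundancy is taken to be the number of answer trees whose root repeats an earlier root, candidates are hypothetical `(root, score)` pairs, and selection is a simple greedy scan rather than the paper's path-index algorithm.

```python
from collections import Counter


def root_redundancy(trees):
    """Assumed measure: count of trees whose root repeats another tree's root."""
    roots = Counter(root for root, _score in trees)
    return sum(count - 1 for count in roots.values())


def top_k_limited_redundancy(candidates, k, max_redundancy):
    """Greedy sketch: scan candidates by descending score and accept a tree
    with an already-used root only while the redundancy budget lasts."""
    selected, seen_roots, used = [], set(), 0
    for root, score in sorted(candidates, key=lambda t: -t[1]):
        if len(selected) == k:
            break
        if root in seen_roots:
            if used == max_redundancy:
                continue  # budget exhausted: skip repeated root
            used += 1
        seen_roots.add(root)
        selected.append((root, score))
    return selected


# Hypothetical candidate answer trees as (root node, relevance score).
candidates = [("a", 0.9), ("a", 0.8), ("b", 0.7), ("a", 0.6), ("c", 0.5)]
distinct = top_k_limited_redundancy(candidates, k=3, max_redundancy=0)
relaxed = top_k_limited_redundancy(candidates, k=3, max_redundancy=1)
```

With a budget of zero the sketch reduces to distinct root semantics; raising the budget lets a second high-scoring tree rooted at the same node displace a weaker tree with a new root.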
Findings
The results of experiments using real graph datasets show that the proposed approach can produce effective query answers which are more diverse in the content nodes and more relevant to the query than the previous approach based on distinct root semantics.
Originality/value
This paper is the first to take redundancy in the root nodes of answer trees into account, improving the relevance and content-node redundancy of query results over the previous distinct root semantics. It can satisfy users’ various information needs on large and complex graph data using a keyword-based query.
Abstract
Purpose
This paper studies keyword search over graph-structured data used in various fields such as the semantic web, linked open data and social networks. This study aims to propose an efficient keyword search algorithm on graph data to find top-k answers that are most relevant to the query and have diverse content nodes for the input keywords.
Design/methodology/approach
Based on an aggregative measure of the diversity of an answer set, this study proposes an approach to searching for the top-k diverse answers to a query on graph data, which finds the set of most relevant answer trees whose average dissimilarity is no lower than a given threshold. This study defines a diversity constraint that must be satisfied for a subset of answer trees to be included in the solution. Then, an enumeration algorithm and a heuristic search algorithm are proposed to find an optimal solution efficiently based on the diversity constraint and an A* heuristic. This study also provides strategies for improving the performance of the heuristic search method.
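The diversity constraint described above (average dissimilarity no lower than a threshold) can be sketched as follows. The abstract does not say which dissimilarity function is used, so Jaccard dissimilarity over content-node sets is an assumption here, as are the function names and the convention that a single answer counts as fully diverse.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """1 - |a & b| / |a | b| over two sets of content nodes (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_dissimilarity(answers):
    """Mean pairwise dissimilarity over all unordered pairs of answers."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0  # assumption: zero or one answer is treated as fully diverse
    return sum(jaccard_dissimilarity(a, b) for a, b in pairs) / len(pairs)


def satisfies_diversity(answers, threshold):
    """Diversity constraint: average dissimilarity must reach the threshold."""
    return average_dissimilarity(answers) >= threshold


# Hypothetical answer trees, each reduced to its set of content node ids.
answers = [{1, 2}, {2, 3}, {4, 5}]
avg = average_dissimilarity(answers)
```

A search procedure such as the enumeration or A*-style algorithm mentioned in the abstract would call a check like `satisfies_diversity` on candidate subsets; that search itself is not reproduced here.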
Findings
The results of experiments using a real data set demonstrate that the proposed search algorithm can find top-k diverse and relevant answers to a query on large-scale graph data efficiently and outperforms the previous methods.
Originality/value
This study proposes a new keyword search method for graph data that finds an optimal solution with diverse and relevant answers to the query. It can provide users with query results that satisfy their various information needs on large graph data.
Chang-Sup Park and Sungchae Lim
Abstract
Purpose
The paper aims to propose an effective method to process keyword-based queries over graph-structured databases, which are widely used in various applications such as XML, the semantic web, and social network services. To satisfy users' information needs, it proposes an extended answer structure for keyword queries, inverted list indexes on keywords and nodes, and query processing algorithms exploiting the inverted lists. The study aims to provide more effective and relevant answers to a given query than the previous approaches in an efficient way.
Design/methodology/approach
A new measure of the relevance of nodes to a given keyword query is defined in the paper and, according to this measure, a new answer tree structure is proposed which places no constraint on the number of keyword nodes chosen for each query keyword. For efficient query processing, an inverted list-style index is suggested which pre-computes connectivity and relevance information on the nodes in the graph. Then, a query processing algorithm based on the pre-constructed inverted lists is designed, which aggregates list entries for each graph node relevant to the given keywords and identifies the top-k root nodes of answer trees most relevant to the given query. The basic search method is also enhanced by using extended inverted lists, which store additional relevance information about the related entries in the lists in order to estimate the relevance score of a node more closely and to find top-k answers more efficiently.
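The aggregation step over inverted lists might look like the sketch below. The index layout (a dict from keyword to `(node, relevance)` entries), the summation of relevance values, and the function name `top_k_roots` are assumptions for illustration; the paper's actual relevance measure and its extended inverted lists are not specified in the abstract.

```python
import heapq
from collections import defaultdict


def top_k_roots(inverted_lists, query_keywords, k):
    """Aggregate per-keyword relevance entries for each node and return the
    k nodes with the highest total, as (node, score) pairs in descending order."""
    scores = defaultdict(float)
    for keyword in query_keywords:
        # A keyword missing from the index contributes nothing (OR-like behaviour).
        for node, relevance in inverted_lists.get(keyword, []):
            scores[node] += relevance
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical pre-computed index: keyword -> [(candidate root node, relevance)].
inverted_lists = {
    "graph": [("n1", 0.5), ("n2", 0.25)],
    "search": [("n2", 0.5), ("n3", 0.125)],
}
result = top_k_roots(inverted_lists, ["graph", "search"], k=2)
```

Because a node matching only some keywords still receives a score, this sketch behaves like OR semantics; an AND variant would keep only nodes that appear in every keyword's list.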
Findings
Experiments with real datasets and various test queries were conducted to evaluate the effectiveness and performance of the proposed methods in comparison with one of the previous approaches. The experimental results show that the proposed methods with an extended answer structure produce more effective top-k results than the compared previous method for most of the queries, especially for those with OR semantics. The extended inverted list and enhanced search algorithm are shown to substantially improve execution performance compared to the basic search method.
Originality/value
This paper proposes a new extended answer structure and query processing scheme for keyword queries on graph databases which can satisfy users' information needs represented by a keyword set having various semantics.
Collins Udanor, Stephen Aneke and Blessing Ogechi Ogbuokiri
Abstract
Purpose
The purpose of this paper is to use the Twitter Search Network of the Apache NodeXL data discovery tool to extract over 5,000 data items from Twitter accounts that tweeted, retweeted or commented on the hashtag #NigeriaDecides, to gain insight into the impact of social media on the politics and administration of developing countries.
Design/methodology/approach
Several algorithms, such as the Fruchterman-Reingold, Harel-Koren Fast Multiscale and Clauset-Newman-Moore algorithms, are used to analyse social media metrics such as betweenness and closeness centralities and to visualize the sociograms.
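As an illustration of one of the metrics named above, closeness centrality on an unweighted graph can be computed with a breadth-first search. This is a sketch of the standard textbook definition (reachable nodes divided by the sum of shortest-path distances), not of how NodeXL computes or normalizes the value; the adjacency-dict representation and the toy network are assumptions.

```python
from collections import deque


def closeness_centrality(graph, source):
    """Closeness of `source`: number of other reachable nodes divided by the
    sum of shortest-path distances to them (0.0 if nothing is reachable)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:  # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical three-account network: a - b - c (undirected path).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
centre = closeness_centrality(graph, "b")
edge = closeness_centrality(graph, "a")
```

The middle account scores higher than the end accounts because it is one step from everyone, which is the intuition behind using closeness to spot well-positioned accounts in a hashtag network.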
Findings
Results from a typical application of this tool to the Nigerian general election of 2015 show social media as the major influencer, as well as the contribution of social media data analytics in predicting trends that may influence developing economies.
Practical implications
With this type of work, stakeholders can make informed decisions based on predictions that can yield a high degree of accuracy, as in this case. It is also important to stress that this work can be reproduced for any other part of the world, as it is not limited to developing countries or Nigeria in particular, nor is it limited to the field of politics.
Social implications
During the 2015 general election, citizens increasingly took over the blogosphere by writing, commenting and reporting on issues ranging from politics, society, human rights, disasters, contestants and attacks to other community-related matters. One such instance is the #NigeriaDecides network on Twitter. The effect of this showed in the opinion polls organized by the various interest groups and media houses, which were all in favour of GMB.
Originality/value
The authors' case study of Nigeria’s general election of 2015 further strengthens the claim that developing countries have joined the social media race. The major contribution of this work is that policy makers, politicians, business managers, etc. can use the methods shown here to harness and gain insights from Big Data, such as social media data.