Abstract
Purpose
The purpose of this study is to suggest suitable movies for children among the various multimedia selections available these days. Multimedia have a significant impact on the social and psychological development of children, who are often exposed to inappropriate materials, including movies that are accessible either online or through other multimedia channels. Although not all movies are bad, the offensive language, violence and sexuality exhibited in some movies have negative effects. Parents and guardians of children need all the help they can get to promote the healthy use of movies these days.
Design/methodology/approach
To offer parents appropriate movies of interest to their children, the authors have developed MovRec, a personalized movie recommender for children, which is designed to provide educational and suitably entertaining opportunities for children. MovRec determines the appeal of a movie to a particular user based on its child-appropriateness score, computed using a backpropagation model; its pre-defined category, determined using latent Dirichlet allocation; its predicted rating, computed using matrix factorization; and the sentiments expressed in its users’ reviews. These, along with the movie's like/dislike count and genres, yield the features considered by MovRec. MovRec combines these features using the CombMNZ model to rank and recommend movies.
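As a rough illustration of the fusion step, the sketch below shows a generic CombMNZ combination: each feature's scores are normalized, summed per movie, and the sum is multiplied by the number of features that gave the movie a non-zero score. The min-max normalization, the function names and the toy feature values are assumptions made for this example; the abstract does not specify how MovRec normalizes or weights its features.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one feature's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every item as equally (fully) supported.
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def comb_mnz(feature_scores):
    """Fuse several {item: score} dicts with CombMNZ.

    Fused score = (sum of normalized scores) * (number of features
    in which the item has a non-zero normalized score).
    """
    total = defaultdict(float)
    nonzero = defaultdict(int)
    for scores in feature_scores:
        if not scores:
            continue  # skip features that scored nothing
        for item, s in min_max_normalize(scores).items():
            total[item] += s
            if s > 0:
                nonzero[item] += 1
    fused = {item: total[item] * nonzero[item] for item in total}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical feature scores for three movies (illustrative values only).
features = [
    {"A": 0.9, "B": 0.4, "C": 0.1},  # e.g. appropriateness score
    {"A": 3.0, "B": 5.0},            # e.g. predicted rating
    {"A": 6, "B": 10, "C": 2},       # e.g. like count
]
ranking = comb_mnz(features)
print([movie for movie, _ in ranking])
```

Multiplying by the non-zero count rewards movies that are supported by many features at once, which is the usual reason for choosing CombMNZ over a plain sum of scores.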
Findings
The performance evaluation of MovRec demonstrates its effectiveness, and its recommended movies are highly regarded by its users.
Originality/value
Unlike Amazon and other online movie recommendation systems, such as Common Sense Media, Internet Movie Database and TasteKid, MovRec is, to the best of the authors’ knowledge, the first personalized movie recommender for children.
Abstract
Purpose
The purpose of this paper is to introduce a summarization method that enhances current web-search approaches by offering a summary of each clustered set of web-search results whose contents address the same topic, which should allow the user to quickly identify the information covered in the clustered search results. Web search engines, such as Google, Bing and Yahoo!, rank the set of documents S retrieved in response to a user query and represent each document D in S using a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as intended, i.e. they do not reliably help users quickly identify results of interest. These snippets are inadequate in providing distinct information and in capturing the main contents of the corresponding documents. Moreover, when the information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user’s intended request without additional information. Furthermore, a document title is not always a good indicator of the content of the corresponding document.
Design/methodology/approach
The authors propose a query-based summarizer, called QSum, to solve the existing problems of Web search engines, which use titles and abstracts to capture the contents of retrieved documents. QSum generates a concise and comprehensive summary for each cluster of documents retrieved in response to a user query, which saves the user’s time and effort in searching for specific information of interest by removing the need to browse through the retrieved documents one by one.
Findings
Experimental results show that QSum is effective and efficient in creating a high-quality summary for each cluster to enhance Web search.
Originality/value
The proposed query-based summarizer, QSum, is unique in its searching approach. QSum is also a significant contribution to the Web search community, as it handles the ambiguity of a search query by creating summaries in response to different interpretations of the query, which offer a “road map” that helps users quickly identify information of interest.
Maria Soledad Pera and Yiu‐Kai Ng
Abstract
Purpose
The web provides its users with abundant information. Unfortunately, when a web search is performed, both users and search engines must deal with an annoying problem: the presence of spam documents that are ranked among legitimate ones. The mixed results downgrade the performance of search engines and frustrate users, who are required to filter out useless information. To improve the quality of web searches, the number of spam documents on the web must be reduced, if they cannot be eradicated entirely. This paper aims to present a novel approach for identifying spam web documents, which have mismatched titles and bodies and/or a low percentage of hidden content in their markup data structures.
Design/methodology/approach
The paper shows that by considering the degree of similarity among the words in the title and body of a web document D, which is computed by using their word‐correlation factors; using the percentage of hidden content in the markup data structure within D; and/or considering the bigram or trigram phrase‐similarity values of D, it is possible to determine whether D is spam with high accuracy.
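The title/body comparison described above might be sketched as follows. This is only an illustration under stated assumptions: the correlation table `corr`, its values, and the way per-word degrees are combined (a fuzzy-OR over body words, then an average over title words) are hypothetical choices made for the example, not the formula reported in the paper.

```python
from math import prod


def correlation(a, b, corr):
    """Look up a pre-computed word-correlation factor (hypothetical table)."""
    if a == b:
        return 1.0  # a word is fully correlated with itself
    return corr.get(frozenset((a, b)), 0.0)


def word_to_text_degree(word, text_words, corr):
    """Degree to which `word` is related to any word of the text (fuzzy OR)."""
    return 1.0 - prod(1.0 - correlation(word, w, corr) for w in text_words)


def title_body_similarity(title_words, body_words, corr):
    """Average, over title words, of their degree of relatedness to the body."""
    if not title_words:
        return 0.0
    degrees = [word_to_text_degree(t, body_words, corr) for t in title_words]
    return sum(degrees) / len(degrees)


# Hypothetical pre-computed correlation factors (illustrative values only).
corr = {
    frozenset(("cheap", "discount")): 0.6,
    frozenset(("flight", "airline")): 0.5,
}
title = ["cheap", "flight"]
matching_body = ["discount", "airline", "ticket"]
mismatched_body = ["casino", "poker"]

print(title_body_similarity(title, matching_body, corr))
print(title_body_similarity(title, mismatched_body, corr))
```

A document whose title scores low against its own body would be a candidate for the "mismatched title and body" signal; any decision threshold would have to be tuned on labeled data.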
Findings
By considering the content and markup of web documents, this paper develops a spam‐detection tool that is: reliable, since it can accurately detect 84.5 percent of spam/legitimate web documents; and computationally inexpensive, since the word‐correlation factors used for content analysis are pre‐computed.
Research limitations/implications
Since the bigram‐correlation values employed in the spam‐detection approach are computed by using the unigram‐correlation factors, this computation adds time to the spam‐detection process and could generate a higher number of misclassified spam web documents.
Originality/value
The paper verifies that the spam‐detection approach outperforms existing anti‐spam methods by at least 3 percent in terms of F‐measure.
Maria Soledad Pera and Yiu‐Kai Ng
Abstract
Purpose
Tens of thousands of news articles are posted online each day, covering topics from politics to science to current events. To better cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must filter through many articles within the same or different RSS feeds to locate articles pertaining to their particular interests. Due to the large number of news articles in individual RSS feeds, there is a need for further organizing articles to aid users in locating non‐redundant, informative, and related articles of interest quickly. This paper aims to address these issues.
Design/methodology/approach
The paper presents a novel approach which uses the word‐correlation factors in a fuzzy set information retrieval model to: filter out redundant news articles from RSS feeds; shed less‐informative articles from the non‐redundant ones; and cluster the remaining informative articles according to the fuzzy equivalence classes on the news articles.
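The clustering step can be pictured with a small sketch of fuzzy equivalence classes. It assumes that pairwise article similarities in [0, 1] are already available (for example, derived from word‐correlation factors), takes the max-min transitive closure of that similarity relation, and then cuts it at a threshold `alpha`. The similarity values, the threshold and the function names are illustrative assumptions; the abstract does not state how the paper constructs its equivalence classes.

```python
def max_min_closure(sim):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation.

    Floyd-Warshall over the (max, min) semiring: the closure value for
    (i, j) is the strength of the strongest chain linking i and j, where
    a chain is only as strong as its weakest link.
    """
    n = len(sim)
    closure = [row[:] for row in sim]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(closure[i][k], closure[k][j])
                if via_k > closure[i][j]:
                    closure[i][j] = via_k
    return closure


def alpha_cut_classes(closure, alpha):
    """Group items whose closure similarity is at least `alpha`."""
    n = len(closure)
    assigned = [False] * n
    classes = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if closure[i][j] >= alpha]
        for j in members:
            assigned[j] = True
        classes.append(members)
    return classes


# Hypothetical pairwise similarities among four articles.
sim = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.7, 0.1],
    [0.2, 0.7, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
]
closure = max_min_closure(sim)
print(alpha_cut_classes(closure, 0.6))
print(alpha_cut_classes(closure, 0.75))
```

Because the closure is transitive, every threshold yields a proper partition of the articles, and raising `alpha` splits clusters into finer ones.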
Findings
The clustering approach requires little overhead or computational costs, and experimental results have shown that it outperforms other existing, well‐known clustering approaches.
Research limitations/implications
The clustering approach as proposed in this paper applies only to RSS news articles; however, it can be extended to other application domains.
Originality/value
The developed clustering tool is highly efficient and effective in filtering and classifying RSS news articles and does not employ any labor‐intensive user‐feedback strategy. Therefore, it can be implemented in real‐world RSS feeds to aid users in locating RSS news articles of interest.