Suganeshwari G., Syed Ibrahim S.P. and Gang Li
Abstract
Purpose
This paper addresses the scalability of memory-based collaborative filtering on dynamically growing datasets and uses temporal information to produce high-quality recommendations that best match the user’s current preferences.
Design/methodology/approach
The proposed method is formalized as a time-dependent collaborative filtering method. For each item, a set of influential neighbors is identified using a truncated version of the similarity computation based on timestamps. Then, the n most recent transactions are used to generate recommendations that reflect the recent preferences of the active user. The proposed method, lazy collaborative filtering with dynamic neighborhoods (LCFDN), is scaled up further by implementing it in Spark using the MapReduce parallel processing paradigm. Experiments on the MovieLens dataset show that LCFDN implemented on MapReduce is more efficient and performs better than existing methods.
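The abstract does not give the LCFDN formulas, so the following is only a minimal single-machine sketch of the general idea, not the authors' implementation. It assumes that "truncated" similarity means ignoring ratings older than a cutoff timestamp, that similarity is cosine similarity over co-rating users, and that the prediction is a similarity-weighted average over the active user's most recent ratings. The toy data and the names `item_vectors`, `cosine`, `predict`, `cutoff`, `n_recent` and `k` are all hypothetical.

```python
from math import sqrt

# Hypothetical toy data: (user, item, rating, timestamp).
RATINGS = [
    ("u1", "A", 5.0, 1), ("u1", "B", 4.0, 8), ("u1", "C", 1.0, 9),
    ("u2", "A", 4.0, 2), ("u2", "B", 5.0, 7), ("u2", "D", 4.0, 9),
    ("u3", "B", 4.0, 6), ("u3", "C", 2.0, 8), ("u3", "D", 5.0, 10),
]


def item_vectors(ratings, cutoff):
    """Map item -> {user: rating}, keeping only ratings at or after `cutoff`.

    Assumption: this time filter stands in for the truncated similarity
    computation mentioned in the abstract.
    """
    vectors = {}
    for user, item, rating, timestamp in ratings:
        if timestamp >= cutoff:
            vectors.setdefault(item, {})[user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity restricted to the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def predict(ratings, user, target, cutoff, n_recent, k):
    """Predict `user`'s rating for `target` from their n most recent ratings."""
    vectors = item_vectors(ratings, cutoff)
    if target not in vectors:
        return None  # no recent ratings for the target item
    # Keep only the active user's n most recent transactions.
    history = sorted(
        (r for r in ratings if r[0] == user and r[1] != target),
        key=lambda r: r[3],
        reverse=True,
    )[:n_recent]
    scored = []
    for _, item, rating, _ in history:
        if item in vectors:
            sim = cosine(vectors[target], vectors[item])
            if sim > 0:
                scored.append((sim, rating))
    # Use the k most similar items as the neighborhood.
    top = sorted(scored, reverse=True)[:k]
    if not top:
        return None
    return sum(s * r for s, r in top) / sum(s for s, _ in top)


if __name__ == "__main__":
    print(predict(RATINGS, "u1", "D", cutoff=5, n_recent=2, k=2))
```

In this sketch, ratings for item A fall before the cutoff and drop out of the neighborhood entirely, while the active user's oldest rating is excluded by `n_recent`. A distributed version would partition the per-item similarity work across workers, which this example does not attempt.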
Findings
The results of the experimental study clearly show that not all ratings provide valuable information. A recommendation system based on LCFDN increases the efficiency of predictions by selecting the most influential neighbors based on temporal information. Pruning the user’s history to recent transactions also addresses drifts in the user’s preferences and makes the method more scalable than state-of-the-art methods.
Research limitations/implications
In the proposed method, LCFDN, the neighborhood space is dynamically adjusted based on temporal information. In addition, LCFDN determines the user’s current interest from recent preference or purchase details. The method is designed to track the user’s preferences continuously as the dataset grows, which makes it suitable for the e-commerce industry. Compared with state-of-the-art methods, it provides high-quality recommendations with good efficiency.
Originality/value
LCFDN is an extension of collaborative filtering that uses temporal information as context. The proposed method addresses the dynamic nature of the data and drifts in the user’s preferences by dynamically adapting the neighbors. To improve scalability, the method is implemented in a big data environment using MapReduce. The proposed recommendation system provides greater prediction accuracy than traditional recommender systems.
Seungpeel Lee, Honggeun Ji, Jina Kim and Eunil Park
Abstract
Purpose
With the rapid increase in internet use, most people tend to purchase books through online stores. Several such stores also provide book recommendations for buyer convenience, and both collaborative and content-based filtering approaches have been widely used for building these recommendation systems. However, both approaches have significant limitations, including cold start and data sparsity. To overcome these limitations, this study aims to investigate whether user satisfaction can be predicted based on easily accessible book descriptions.
Design/methodology/approach
The authors collected a large-scale Kindle Books data set containing book descriptions and ratings, and predicted whether a specific book would receive a high rating. For this purpose, several feature representation methods (bag-of-words, term frequency–inverse document frequency [TF-IDF] and Word2vec) and machine learning classifiers (logistic regression, random forest, naive Bayes and support vector machine) were used.
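As an illustration of the TF-IDF feature representation named above, the following is a minimal sketch using only the Python standard library. It covers the feature step only, not the classifiers, and it is not the authors' pipeline: the whitespace tokenizer, the unsmoothed weighting (term frequency times log(N/df)), the function name `tf_idf` and the sample descriptions are all assumptions made for the example.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: (term count / document length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {term: (count / total) * log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return vectors


# Hypothetical book descriptions, invented for this example.
DOCS = [
    "a gripping mystery novel",
    "a tender romance novel",
    "a practical cooking guide",
]

if __name__ == "__main__":
    for vector in tf_idf(DOCS):
        print(vector)
```

A term that appears in every description gets a weight of zero, while a term unique to one description gets the largest weight. This is why TF-IDF can highlight the words that distinguish one book description from another before the vectors are passed to a classifier.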
Findings
The classifiers used showed substantial accuracy in predicting reader satisfaction. Among them, the random forest classifier combined with the TF-IDF feature representation method achieved the highest accuracy, at 96.09%.
Originality/value
This study revealed that user satisfaction can be predicted based on book descriptions and shed light on the limitations of existing recommendation systems. Further, both practical and theoretical implications have been discussed.