Predicting the future increment of review helpfulness: an empirical study based on a two-wave data set
ISSN: 0264-0473
Article publication date: 28 December 2020
Issue publication date: 18 May 2021
Abstract
Purpose
Identifying and predicting the most helpful reviews has been a focal interest in fields including information management, e-commerce and marketing. Although many factors have been found to correlate with review helpfulness, they may suffer from endogeneity problems, as the data are normally observed in the same time window. This paper aims to tackle this problem by examining the predictive power of different factors on the future increment of review helpfulness.
Design/methodology/approach
Using a longitudinal data set of 443K business reviews from Yelp.com collected at two different time points, this study extracts six groups of predictors from the first snapshot of data to predict the helpfulness increment of old and recent reviews, respectively, between the two snapshots.
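As a rough illustration of the two-wave setup, the sketch below computes a helpfulness increment per review from two snapshots and separates old from recent reviews. It is not the authors' code: the function names, the toy vote counts and the 90-day cutoff for "recent" are all assumptions made for this example, and the abstract does not state how the old/recent split was defined.

```python
from datetime import date


def helpfulness_increment(first, second):
    """Helpful votes gained between snapshots, for reviews present in both."""
    return {
        review_id: second[review_id] - votes
        for review_id, votes in first.items()
        if review_id in second
    }


def split_by_age(posted, snapshot_date, recent_days=90):
    """Split review ids into (old, recent) by age at the first snapshot.

    recent_days is a hypothetical threshold, not a value from the paper.
    """
    old, recent = set(), set()
    for review_id, posted_on in posted.items():
        age_days = (snapshot_date - posted_on).days
        (recent if age_days <= recent_days else old).add(review_id)
    return old, recent


# Toy data (invented): helpful-vote counts at each snapshot.
first_snapshot = {"r1": 3, "r2": 0, "r3": 10}
second_snapshot = {"r1": 5, "r2": 0, "r3": 14, "r4": 1}
posted = {
    "r1": date(2015, 1, 10),
    "r2": date(2016, 11, 20),
    "r3": date(2014, 6, 1),
}

increments = helpfulness_increment(first_snapshot, second_snapshot)
old_ids, recent_ids = split_by_age(posted, date(2016, 12, 31))
```

The increment is the prediction target; under this framing, predictors would be computed only from information available at the first snapshot, so they cannot be influenced by votes cast afterwards.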
Findings
These factors, in general, predict the helpfulness increment with moderate accuracy. Different groups of features show quite different predictive power. Reviewer disclosure information is the most significant factor, while review readability does not significantly improve prediction accuracy.
Originality/value
Instead of the total number of helpful votes observed in the same time window as the explanatory variables, this paper focuses on the future increment of helpful votes observed in the following time window. With such a two-wave data set, the endogeneity problem can be avoided and the explanatory factors for review helpfulness can thus be further tested in a prediction scenario.
Keywords
Acknowledgements
This work is partially supported by the startup foundation of Nanjing University of Information Science and Technology (1441182001001, 1441182001008).
Citation
Pan, X., Hou, L. and Liu, K. (2021), "Predicting the future increment of review helpfulness: an empirical study based on a two-wave data set", The Electronic Library, Vol. 39 No. 1, pp. 59-76. https://doi.org/10.1108/EL-06-2020-0130
Publisher
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited