Search results
1 – 2 of 2
Clara Martin-Duque, Juan José Fernández-Muñoz, Javier M. Moguerza and Aurora Ruiz-Rua
Abstract
Purpose
Recommendation systems are a fundamental tool for hotels seeking a differentiating competitive strategy. The main purpose of this work is to apply machine learning techniques for treating imbalanced data sets, which until now have not been applied in the tourism field. These techniques allowed the authors to analyse the influence of imbalanced data on hotel recommendation models and how this phenomenon affects client dissatisfaction.
Design/methodology/approach
An opinion survey was conducted among hotel customers of different categories in 120 countries. A total of 135,102 surveys were collected over eleven quarters, following a longitudinal design. A binary logistic model was fitted using a generalized linear model (GLM) function.
Findings
Through the analysis of a representative amount of data, the authors empirically demonstrate that the imbalance phenomenon is systematically present in hotel recommendation surveys. In addition, the authors show that the imbalance exists independently of the period in which the survey is conducted, which means that it is intrinsic to recommendation surveys on this topic. The authors demonstrate how recommendation systems can be improved, highlighting the presence of imbalanced data and its consequences for marketing strategies.
Originality/value
The main contribution of the current work is to apply to the tourism sector the framework for imbalanced data typically used in machine learning, thereby improving predictive models.
Juan José Fernández-Muñoz, Javier M. Moguerza, Clara Martin Duque and Diana Gomez Bruna
Abstract
Purpose
This paper aims to study the effect of imbalanced data in tourism quality models. It is demonstrated that this imbalance strongly affects the accuracy of tourism prediction models for hotel recommendation.
Design/methodology/approach
A questionnaire was used to survey 83,740 clients of hotels ranging from five stars to two stars or fewer, and the responses were analysed using a binary logistic model. The data correspond to a sample of 87 hotels from all around the world (120 countries from America, Africa, Asia, Europe and Australia).
Findings
The results of the study suggest that the imbalance in the data affects the prediction accuracy of the models used, especially for the recommendation provided by unsatisfied clients, who tend to be classified as satisfied customers.
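The effect described in these findings can be illustrated with a minimal sketch. The figures below (950 satisfied and 50 unsatisfied respondents) and the always-predict-satisfied model are hypothetical and chosen only for illustration; they are not taken from the study. The helper name `recall_per_class` is likewise an assumption.

```python
def recall_per_class(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    recalls = {}
    for label in set(y_true):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total
    return recalls


# Hypothetical imbalanced survey: 1 = satisfied, 0 = unsatisfied.
y_true = [1] * 950 + [0] * 50

# A degenerate model that always predicts the majority class.
y_pred = [1] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = recall_per_class(y_true, y_pred)

print(f"overall accuracy:      {accuracy:.2f}")
print(f"recall on satisfied:   {recalls[1]:.2f}")
print(f"recall on unsatisfied: {recalls[0]:.2f}")
```

Overall accuracy looks high here because it is dominated by the majority class, while every unsatisfied respondent is classified as satisfied. This is why per-class measures are more informative than overall accuracy on imbalanced survey data.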
Practical implications
Special attention should therefore be given to unsatisfied clients or, at least, the models should include safeguards against the effect of data imbalance.
Social implications
In the tourism industry, the strong imbalance between satisfied and unsatisfied customers produces misleading prediction results. This fact could have effects on the quality policy of hoteliers.
Originality/value
In this work, focusing on tourism data, it is shown that this imbalance strongly affects the prediction accuracy of the models used, especially for the recommendation provided by unsatisfied customers, who tend to be classified as satisfied customers. A methodological approach based on balancing the data set used to build the models is proposed to improve the accuracy with which traditional service quality models predict unsatisfied customers.
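The abstract does not specify which balancing technique is used. As one common possibility, the sketch below shows random undersampling of the majority class; the function name `undersample`, the row layout and the sample sizes are all hypothetical and do not represent the authors' procedure.

```python
import random


def undersample(rows, label_of, seed=0):
    """Randomly drop rows so that every class has as many rows as the rarest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Group rows by their class label.
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)

    # Size of the smallest class determines the size of every class after balancing.
    n = min(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


# Hypothetical survey rows: (respondent id, label) with 1 = satisfied, 0 = unsatisfied.
rows = [(f"guest{i}", 1) for i in range(950)] + [(f"guest{i}", 0) for i in range(950, 1000)]

balanced = undersample(rows, label_of=lambda row: row[1])
counts = {label: sum(1 for _, l in balanced if l == label) for label in (0, 1)}
print(len(balanced), counts)
```

Undersampling discards majority-class responses, so in practice it is usually applied only to the data used to build the model, with evaluation carried out on data that keeps the original class proportions.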