Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case
Journal of Systems and Information Technology
ISSN: 1328-7265
Article publication date: 3 September 2024
Issue publication date: 15 November 2024
Abstract
Purpose
Product returns are a major challenge for e-businesses because they involve large logistical and operational costs, so predicting returns in advance is crucial. This study aims to evaluate different families of classifiers for predicting the chance of a product return and to further optimize the best-performing model.
Design/methodology/approach
An e-commerce data set with categorical attributes has been used for this study. Chi-square-based feature selection yields a reduced feature set that is used as input for model building. Predictive models are built using individual classifiers, ensemble models and deep neural networks. For performance evaluation, a 75:25 train/test split and 10-fold cross-validation are used. To improve the predictability of the best-performing classifier, hyperparameter tuning is performed using different optimization methods: random search, grid search, the Bayesian approach and evolutionary models (genetic algorithm, differential evolution and particle swarm optimization).
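The chi-square feature-selection step described above can be illustrated with a minimal sketch. It assumes the Pearson chi-square test of independence between each categorical attribute and the return label; the abstract does not state which variant the authors used. The function names, feature names and toy data below are hypothetical and are not taken from the study.

```python
from collections import Counter

def chi_square_score(feature, target):
    """Pearson chi-square statistic for two categorical sequences.

    Assumption: this is the textbook test of independence, not
    necessarily the exact scoring used in the study.
    """
    n = len(feature)
    observed = Counter(zip(feature, target))   # joint counts
    feature_totals = Counter(feature)          # row totals
    target_totals = Counter(target)            # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for t_value, t_count in target_totals.items():
            expected = f_count * t_count / n
            score += (observed[(f_value, t_value)] - expected) ** 2 / expected
    return score

def rank_features(columns, target):
    """Return (name, score) pairs sorted from most to least associated."""
    scores = {name: chi_square_score(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: 1 = order returned, 0 = order kept.
returned = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "size_mismatch": ["y", "y", "y", "y", "n", "n", "n", "n"],
    "colour": ["r", "b", "r", "b", "r", "b", "r", "b"],
}
ranking = rank_features(columns, returned)
```

In this toy case the hypothetical `size_mismatch` attribute is perfectly associated with the label and ranks first, while `colour` is independent of it and scores zero. A selection step would keep the top-ranked attributes as model inputs.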
Findings
A comparison of F1-scores revealed that the Bayesian approach outperformed all other optimization approaches in terms of accuracy. The predictability of the Bayesian-optimized model is further compared with that of other classifiers using experimental analysis. The Bayesian-optimized XGBoost model achieved the best performance, with accuracies of 77.80% and 70.35% for the holdout and 10-fold cross-validation methods, respectively.
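The findings refer to both F1-scores and accuracy, which can diverge when returned orders are a minority class. The following minimal sketch uses the standard definitions of the two metrics; the function name and the toy labels are hypothetical and do not reproduce any result from the study.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and the F1-score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

# Hypothetical labels: 3 of 10 orders are returned (1), 7 are kept (0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

Here the toy classifier is right on 8 of 10 orders but finds only one of the three returns, so its F1-score is noticeably lower than its accuracy. This is one reason to report both figures when comparing tuning methods.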
Research limitations/implications
Given the anonymized data, the effects of individual attributes on outcomes could not be investigated in detail. The Bayesian-optimized predictive model may be used in decision support systems, enabling real-time prediction of returns and the implementation of preventive measures.
Originality/value
There are very few reported studies on predicting the chance of order return in e-businesses. To the best of the authors’ knowledge, this study is the first to compare different optimization methods and classifiers, demonstrating the superiority of the Bayesian-optimized XGBoost classification model for returns prediction.
Citation
Bhattacharjee, B., Unni, K. and Pratap, M. (2024), "Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case", Journal of Systems and Information Technology, Vol. 26 No. 4, pp. 495-527. https://doi.org/10.1108/JSIT-06-2020-0120
Publisher
Emerald Publishing Limited
Copyright © 2024, Emerald Publishing Limited