A text classification method combining in-domain pre-training and prompt learning for the steel e-commerce industry
International Journal of Web Information Systems
ISSN: 1744-0084
Article publication date: 9 December 2024
Abstract
Purpose
With the development of Web information systems, steel e-commerce platforms have accumulated a large number of quality objection texts. These texts reflect consumer dissatisfaction with the dimensions, appearance and performance of steel products, providing valuable insights for product improvement and consumer decision-making. Currently, mainstream solutions rely on pre-trained models, but their performance on domain-specific data sets and few-shot data sets remains unsatisfactory. This paper aims to address these challenges by proposing a more effective method for improving model performance on such specialized data sets.
Design/methodology/approach
This paper presents a method based on in-domain pre-training, Bidirectional Encoder Representations from Transformers (BERT) and prompt learning. Specifically, the BERT model is further pre-trained on a domain-specific unlabeled data set, which enables it to better capture the language patterns of the steel e-commerce industry and improves its generalization capability. Prompt learning is also incorporated into the BERT model to strengthen its attention to sentence context, improving classification performance on few-shot data sets.
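To make the prompt-learning idea concrete, the following is a minimal sketch of cloze-style prompting with a verbalizer, which is one common way to apply prompt learning to a masked language model such as BERT. The abstract does not give the paper's actual template or label words, so `TEMPLATE`, `VERBALIZER` and both function names are hypothetical, and the scores at the `[MASK]` position are assumed to come from a (domain pre-trained) BERT model that is not shown here.

```python
from typing import Dict, List

# Hypothetical cloze template; the paper's real template is not stated in the abstract.
TEMPLATE = "{text} This complaint is about [MASK]."

# Hypothetical verbalizer: maps each class to label words the model may predict at [MASK].
# The three classes mirror the objection types named in the abstract.
VERBALIZER: Dict[str, List[str]] = {
    "dimensions": ["size", "thickness"],
    "appearance": ["surface", "rust"],
    "performance": ["strength", "hardness"],
}


def build_prompt(text: str) -> str:
    """Wrap a quality objection text in the cloze template."""
    return TEMPLATE.format(text=text.strip())


def classify_from_mask_scores(mask_scores: Dict[str, float]) -> str:
    """Return the class whose label words have the highest mean score at [MASK].

    `mask_scores` maps candidate words to the model's score for that word at the
    [MASK] position; words missing from the mapping count as 0.0.
    """
    def label_score(words: List[str]) -> float:
        return sum(mask_scores.get(w, 0.0) for w in words) / len(words)

    return max(VERBALIZER, key=lambda label: label_score(VERBALIZER[label]))
```

For example, `build_prompt("The plate is 2 mm too thin.")` yields a sentence ending in `[MASK]`, and a model that assigns most of its probability at that position to "thickness" would lead `classify_from_mask_scores` to return `"dimensions"`. Framing classification as mask filling reuses the pre-training objective, which is the usual argument for why prompting helps when few labeled examples are available.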
Findings
Through experimental evaluation, this method demonstrates superior performance on the quality objection data set, achieving a Macro-F1 score of 93.32%. Additionally, ablation experiments further validate the significant advantages of in-domain pre-training and prompt learning in enhancing model performance.
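For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many samples it has. The sketch below computes it in plain Python; the function name `macro_f1` and the example labels are illustrative assumptions, not taken from the paper's evaluation code.

```python
from collections import Counter
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Compute Macro-F1 as the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    f1_scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Illustrative labels only; these are not data from the paper.
y_true = ["dimensions", "dimensions", "appearance", "performance"]
y_pred = ["dimensions", "appearance", "appearance", "performance"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally, a model that neglects a rare objection type is penalized more under Macro-F1 than under plain accuracy, which makes it a natural choice for imbalanced or few-shot class distributions.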
Originality/value
This study clearly demonstrates the value of the new method in improving the classification of quality objection texts for steel products. The findings of this study offer practical insights for product improvement in the steel industry and provide new directions for future research on few-shot learning and domain-specific models, with potential applications in other fields.
Keywords
Acknowledgements
Funding: This work was supported in part by the Beijing Natural Science Foundation under Grant L211020 and in part by the National Natural Science Foundation of China under Grants U1836106, 62271045 and 62202044.
Statements and declarations: The authors declare that there are no conflicts of interest regarding the publication of this paper. No human or animal subjects are involved in the study described in this paper. All authors have agreed to the submission of this paper.
Data availability: The data underlying this article were provided by Ouyeel Co., Ltd under license. Data will be shared on request to the corresponding author with permission of Ouyeel Co., Ltd.
Citation
Peng, Q., Luo, X., Yuan, Y., Gu, F., Shen, H. and Huang, Z. (2024), "A text classification method combining in-domain pre-training and prompt learning for the steel e-commerce industry", International Journal of Web Information Systems, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/IJWIS-09-2024-0277
Publisher
Emerald Publishing Limited
Copyright © 2024, Emerald Publishing Limited