A text classification method combining in-domain pre-training and prompt learning for the steel e-commerce industry

Qiaojuan Peng (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, China and Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, China)
Xiong Luo (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, China and Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, China)
Yuqi Yuan (School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing, China; Shunde Innovation School, University of Science and Technology Beijing, Foshan, China and Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing, China)
Fengbo Gu (Ouyeel Co., Ltd, Shanghai, China)
Hailun Shen (Ouyeel Co., Ltd, Shanghai, China)
Ziyang Huang (Ouyeel Co., Ltd, Shanghai, China)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 9 December 2024

Abstract

Purpose

With the development of Web information systems, steel e-commerce platforms have accumulated a large number of quality objection texts. These texts reflect consumer dissatisfaction with the dimensions, appearance and performance of steel products, providing valuable insights for product improvement and consumer decision-making. Currently, mainstream solutions rely on pre-trained models, but their performance on domain-specific data sets and few-shot data sets is not satisfactory. This paper aims to address these challenges by proposing more effective methods for improving model performance on these specialized data sets.

Design/methodology/approach

This paper presents a method based on in-domain pre-training, Bidirectional Encoder Representations from Transformers (BERT) and prompt learning. First, a domain-specific unsupervised data set is used for in-domain pre-training of the BERT model, which enables the model to better capture language patterns specific to the steel e-commerce industry and improves its generalization capability. Second, prompt learning is incorporated into the BERT model to strengthen its attention to sentence context, which improves classification performance on few-shot data sets.
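The two ingredients named above can be illustrated with a minimal, library-free sketch. The abstract does not give the pre-training objective, the prompt template or the label words, so everything here is an assumption made for illustration: the masking follows the masked-language-modelling recipe of the original BERT (about 15% of positions selected; of those, 80% replaced by `[MASK]`, 10% by a random token, 10% left unchanged), and `TEMPLATE`, `VERBALIZER`, `mask_tokens`, `build_prompt` and `classify_from_mask_scores` are hypothetical names, not the authors' implementation.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """BERT-style masking (assumed objective for in-domain pre-training).

    Returns (corrupted, labels); labels[i] holds the original token at every
    selected position and None elsewhere.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok  # the model would be trained to recover this token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token in place
    return corrupted, labels

# Hypothetical cloze template and verbalizer (label word -> class).
# The class names echo the complaint types mentioned in the Purpose section;
# the actual label set of the paper is not stated in the abstract.
TEMPLATE = "{text} This complaint is about [MASK]."
VERBALIZER = {"size": "dimensions", "surface": "appearance", "strength": "performance"}

def build_prompt(text):
    """Wrap a raw complaint in the cloze template so classification becomes mask filling."""
    return TEMPLATE.format(text=text)

def classify_from_mask_scores(mask_scores):
    """Pick the class whose label word scores highest at the [MASK] position.

    mask_scores maps label words to scores that a masked language model
    would produce; here they are supplied by the caller.
    """
    best_word = max(VERBALIZER, key=lambda w: mask_scores.get(w, float("-inf")))
    return VERBALIZER[best_word]
```

In a real pipeline the scores passed to `classify_from_mask_scores` would come from the masked-language-model head of the in-domain pre-trained BERT evaluated on `build_prompt(text)`; the sketch leaves that step out so that it runs without any model download.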

Findings

Through experimental evaluation, this method demonstrates superior performance on the quality objection data set, achieving a Macro-F1 score of 93.32%. Additionally, ablation experiments further validate the significant advantages of in-domain pre-training and prompt learning in enhancing model performance.
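For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below shows the standard definition only; the function name `macro_f1` and the toy labels are made up for illustration and are unrelated to the paper's data or evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy example with invented labels (not the paper's data)
y_true = ["dimensions", "dimensions", "appearance", "performance"]
y_pred = ["dimensions", "appearance", "appearance", "performance"]
score = macro_f1(y_true, y_pred)
```

Here the per-class F1 values are 2/3, 2/3 and 1, so `score` equals 7/9.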

Originality/value

This study clearly demonstrates the value of the new method in improving the classification of quality objection texts for steel products. The findings of this study offer practical insights for product improvement in the steel industry and provide new directions for future research on few-shot learning and domain-specific models, with potential applications in other fields.

Acknowledgements

Funding: This work was supported in part by the Beijing Natural Science Foundation under Grant L211020 and in part by the National Natural Science Foundation of China under Grants U1836106, 62271045 and 62202044.

Statements and declarations: The authors declare that there are no conflicts of interest regarding the publication of this paper. No human or animal subjects are involved in the study described in this paper. All authors have agreed to the submission of this paper.

Data availability: The data underlying this article were provided by Ouyeel Co., Ltd under license. Data will be shared on request to the corresponding author with permission of Ouyeel Co., Ltd.

Citation

Peng, Q., Luo, X., Yuan, Y., Gu, F., Shen, H. and Huang, Z. (2024), "A text classification method combining in-domain pre-training and prompt learning for the steel e-commerce industry", International Journal of Web Information Systems, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/IJWIS-09-2024-0277

Publisher

Emerald Publishing Limited

Copyright © 2024, Emerald Publishing Limited
