Yaotan Xie and Fei Xiang
Abstract
Purpose
This study aimed to adapt existing text-mining techniques and propose a novel topic recognition approach for textual patient reviews.
Design/methodology/approach
The authors first transformed the multilabel samples into a form suitable for model training. They then proposed an improved method, based on dynamic mixed sampling and transfer learning, to mitigate the learning problem caused by imbalanced samples. Specifically, the model was trained within a convolutional neural network (CNN) framework, using Word2Vec embeddings that the authors trained themselves on large-scale corpora.
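The abstract does not spell out how the multilabel transformation or the dynamic mixed sampling works, so the following is only a minimal sketch under stated assumptions: that each multilabel review is split into one single-label sample per topic, that "mixed sampling" combines undersampling of large classes with oversampling of small ones, and that "dynamic" means the sample is redrawn every epoch. The function names, the `target` parameter, and the topic labels are hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def split_multilabel(samples):
    """Turn each (text, [labels]) pair into one (text, label) pair per label."""
    return [(text, label) for text, labels in samples for label in labels]


def mixed_resample(pairs, target, seed=0):
    """Assumed mixed sampling: bring every class to exactly `target` samples."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in pairs:
        by_label.setdefault(label, []).append(text)

    balanced = []
    for label, texts in by_label.items():
        if len(texts) >= target:
            # Undersample a large class without replacement.
            chosen = rng.sample(texts, target)
        else:
            # Oversample a small class by drawing extra copies with replacement.
            chosen = texts + rng.choices(texts, k=target - len(texts))
        balanced.extend((text, label) for text in chosen)

    rng.shuffle(balanced)
    return balanced


# Hypothetical patient reviews with illustrative topic labels.
reviews = [
    ("The doctor explained everything clearly.", ["attitude", "skill"]),
    ("Waited two hours to be seen.", ["waiting"]),
    ("Very patient and kind.", ["attitude"]),
    ("Friendly nurse, thorough exam.", ["attitude"]),
    ("Polite staff.", ["attitude"]),
]

pairs = split_multilabel(reviews)

# Assumed "dynamic" behavior: redraw a balanced sample for every epoch.
for epoch in range(3):
    epoch_pairs = mixed_resample(pairs, target=2, seed=epoch)
    print(epoch, Counter(label for _, label in epoch_pairs))
```

In this sketch the class sizes are equalized before each epoch, so a classifier such as a CNN would see rare topics as often as common ones; the actual sampling ratios used in the study are not given in the abstract.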
Findings
Compared with the support vector machine (SVM) and other CNN-based models, the CNN + DMS + TL model proposed in this study achieved a significant improvement in F1 score.
Originality/value
The improved methods, based on dynamic mixed sampling (DMS) and transfer learning (TL), can adequately manage the learning problem caused by the skewed distribution of samples and achieve effective, automatic topic recognition for textual patient reviews.
Peer review
The peer-review history for this article is available at: https://publons.com/publon/10.1108/OIR-01-2021-0059.