Savitar: an intelligent sign language translation approach for deafness and dysphonia in the COVID-19 era
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 7 July 2023
Issue publication date: 15 April 2024
Abstract
Purpose
In the COVID-19 era, sign language (SL) translation has gained attention in online learning, where it evaluates the physical gestures of each student and bridges the communication gap between people with dysphonia and hearing people. The purpose of this paper is to align SL sequences with natural language sequences while achieving high translation performance.
Design/methodology/approach
SL can be characterized as joint/bone location information in two-dimensional space over time, forming skeleton sequences. To encode joint, bone and their motion information, we propose a multistream hierarchy network (MHN), together with a vocab prediction network (VPN) and a joint network (JN), within the recurrent neural network transducer framework. The JN concatenates the sequences encoded by the MHN and the VPN and learns their sequence alignments.
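The streams and the concatenation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the bone or motion streams, so the sketch assumes the convention common in multistream skeleton models (a bone is a joint position minus its parent joint's position; motion is the difference between consecutive frames). The function names, the toy three-joint skeleton and the `parents` list are all hypothetical, and the learned MHN and VPN encoders are replaced by hand-written feature vectors.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (x, y) location of one joint
Frame = List[Point]           # all joints in one video frame
Stream = List[Frame]          # frames over time

def bone_stream(joints: Stream, parents: List[int]) -> Stream:
    """Assumed convention: bone = joint minus its parent joint.
    The root joint is its own parent, so its bone is (0, 0)."""
    return [
        [(f[j][0] - f[p][0], f[j][1] - f[p][1]) for j, p in enumerate(parents)]
        for f in joints
    ]

def motion_stream(stream: Stream) -> Stream:
    """Assumed convention: motion = next frame minus current frame.
    The last frame has no successor and is padded with zeros."""
    out: Stream = []
    for t, frame in enumerate(stream):
        if t + 1 < len(stream):
            nxt = stream[t + 1]
            out.append([(b[0] - a[0], b[1] - a[1]) for a, b in zip(frame, nxt)])
        else:
            out.append([(0.0, 0.0)] * len(frame))
    return out

def joint_concat(enc: List[List[float]],
                 pred: List[List[float]]) -> List[List[List[float]]]:
    """Transducer-style lattice: pair every encoded input frame t with every
    prediction step u by concatenating their feature vectors."""
    return [[e + p for p in pred] for e in enc]

# Toy skeleton: joint 0 is the root, joint 1 hangs off 0, joint 2 off 1.
parents = [0, 0, 1]
joints: Stream = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],   # frame 0
    [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)],   # frame 1
]

bones = bone_stream(joints, parents)
joint_motion = motion_stream(joints)
bone_motion = motion_stream(bones)

# Stand-ins for encoder outputs (T = 3 frames) and prediction outputs (U = 2 steps).
enc_out = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pred_out = [[0.1], [0.2]]
lattice = joint_concat(enc_out, pred_out)   # shape T x U x (2 + 1)
```

In a transducer, each cell of such a T x U lattice would be passed through further learned layers to score the next output token, and the alignment between SL frames and words is learned over all paths through the lattice; those learned parts are omitted here.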
Findings
We verify the effectiveness of the proposed approach and provide experimental results on three large-scale datasets. Translation accuracy reaches 94.96, 54.52 and 92.88 per cent on the three datasets, respectively, and inference is 18 and 1.7 times faster than the listen-attend-spell network (LAS) and the visual hierarchy to lexical sequence network (H2SNet), respectively.
Originality/value
In this paper, we propose a novel framework that fuses multimodal input (i.e. the joint, bone and their motion streams) and aligns the input streams with natural language. Moreover, the framework benefits from the different properties of the MHN, VPN and JN. Experimental results on the three datasets demonstrate that our approach outperforms state-of-the-art methods in terms of translation accuracy and speed.
Acknowledgements
Ethical approval statements: All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Consent for publication: Written informed consent for publication was obtained from all participants.
Availability of supporting data: The datasets analyzed during the current study are available from the public Chinese Sign Language (CSL) dataset (Huang et al., 2018).
Competing interests: The authors have no conflicts of interest to declare that are relevant to the content of this article.
Citation
Liang, W. and Xu, X. (2024), "Savitar: an intelligent sign language translation approach for deafness and dysphonia in the COVID-19 era", Data Technologies and Applications, Vol. 58 No. 2, pp. 153-175. https://doi.org/10.1108/DTA-09-2022-0375
Publisher
Emerald Publishing Limited
Copyright © 2023, Emerald Publishing Limited