An e-healthcare system for disease prediction using hybrid data mining technique
Journal of Modelling in Management
ISSN: 1746-5664
Article publication date: 6 August 2019
Issue publication date: 18 September 2019
Abstract
Purpose
Knowledge extraction using data mining approaches forms an integral part of an e-health system, promoting patients’ health through early prediction of diseases. However, medical databases are highly imbalanced, voluminous, conflicting and complex in nature, which can lead to erroneous diagnosis of diseases (i.e. incorrect detection of the class values of diseases). Numerous standard disease decision support systems (DDSSs) have been proposed in the literature, but most of them are disease specific. They also usually suffer from several drawbacks, such as lack of understandability, inability to handle rare cases and inefficiency in making quick and correct decisions. The purpose of this study is to alleviate these issues to a great extent.
Design/methodology/approach
To address the limitations of existing systems, the present research introduces a two-step framework for designing a DDSS. The first step (data-level optimization) identifies an optimal data partition (Popt) for each disease data set and then, in parallel, the best training set for Popt. The second step explores a generic predictive model (integrating the C4.5 and PRISM learners) over the discovered information for effective diagnosis of disease. The designed model is generic (i.e. not disease specific).
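The data-level optimization step can be illustrated with a minimal sketch. The abstract does not define what a data partition is or how candidates are scored, so everything here is an assumption made for illustration: a partition is taken to be a train/test split fraction, candidate training sets are produced by different shuffle seeds, and a majority-class predictor stands in for the real learners. The names `split`, `score`, `best_training_set` and `select_partition` are hypothetical, and the thread pool only mirrors the parallel structure; it is not the authors' C implementation.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def split(records, train_fraction, seed):
    """Shuffle records reproducibly and split them into (train, test)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def score(train, test):
    """Placeholder learner (assumption): predict the majority training label
    and return the resulting accuracy on the test part."""
    labels = [label for _, label in train]
    guess = max(sorted(set(labels)), key=labels.count)
    return sum(label == guess for _, label in test) / len(test)


def best_training_set(records, train_fraction, seeds):
    """For one partition, return (score, seed) of the best candidate training set."""
    return max((score(*split(records, train_fraction, s)), s) for s in seeds)


def select_partition(records, fractions, seeds):
    """Evaluate all candidate partitions concurrently and keep the best one."""
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(lambda f: best_training_set(records, f, seeds), fractions)
        )
    best = max(range(len(fractions)), key=lambda i: results[i][0])
    best_score, best_seed = results[best]
    return fractions[best], best_seed, best_score


if __name__ == "__main__":
    # Toy data set: (features, class label) pairs.
    data = [((i,), "sick" if i % 3 == 0 else "healthy") for i in range(30)]
    print(select_partition(data, [0.5, 0.7, 0.8], [0, 1, 2]))
```

In a real system the placeholder `score` would be replaced by training and validating the actual learner, and the selection would normally use a validation set that is kept separate from the final test data.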
Findings
The empirical results (in terms of three key measures, namely, accuracy, true positive rate and false positive rate) obtained over 14 benchmark medical data sets (collected from https://archive.ics.uci.edu/ml) demonstrate that the hybrid model outperforms the base learners in almost all cases for initial diagnosis of the diseases. The proposed DDSS may therefore work as an e-doctor to detect diseases.
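For reference, the three measures named above can be computed from a confusion matrix as in the sketch below. It assumes a binary setting in which one class value is treated as "positive"; for multi-class data sets the same computation could be applied per class in a one-vs-rest fashion. The function names and the toy labels are illustrative only and are not taken from the paper.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(actual, predicted, positive):
    """Return accuracy, true positive rate and false positive rate."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # TPR = TP / (TP + FN): share of actual positives that were detected.
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
        # FPR = FP / (FP + TN): share of actual negatives flagged as positive.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


if __name__ == "__main__":
    actual = ["sick", "sick", "sick", "healthy", "healthy", "healthy", "healthy", "healthy"]
    predicted = ["sick", "sick", "healthy", "sick", "healthy", "healthy", "healthy", "healthy"]
    print(evaluate(actual, predicted, positive="sick"))
```

A good diagnostic model combines high accuracy and a high true positive rate with a low false positive rate; on imbalanced medical data, accuracy alone can look good even when few positive cases are detected.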
Originality/value
The model designed in this study is original, and the necessary parallelized methods are implemented in C on a FUJITSU HPC cluster with a total of 256 cores (under one master node).
Acknowledgements
Compliance with ethical standards: The study is not funded by any agency. It does not involve human participants and/or animals. The author declares that there is no conflict of interest regarding the publication of this paper.
Citation
Sarkar, B.K. and Sana, S.S. (2019), "An e-healthcare system for disease prediction using hybrid data mining technique", Journal of Modelling in Management, Vol. 14 No. 3, pp. 628-661. https://doi.org/10.1108/JM2-05-2018-0069
Publisher
Emerald Publishing Limited
Copyright © 2019, Emerald Publishing Limited