Machine learning approaches for prediction of ovarian cancer driver genes from mutational and network analysis
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 3 May 2023
Issue publication date: 29 January 2024
Abstract
Purpose
Ovarian cancer (OC) is the most common type of gynecologic cancer in the world, with a high rate of mortality. Because it presents with nonspecific symptoms and lacks specific biomarkers, OC is usually diagnosed at a late stage. Machine learning models can be employed to predict driver genes implicated in causative mutations.
Design/methodology/approach
In the present study, a comprehensive next-generation sequencing (NGS) analysis of whole exome sequences from 47 OC patients was carried out to identify clinically significant mutations. Nine functional features of the 708 identified mutations were used as input to an eXtreme Gradient Boosting (XGBoost) classifier to predict OC driver genes.
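As a rough illustration of the classification step described above, the following Python sketch trains a small gradient-boosted ensemble of decision stumps on synthetic data shaped like the study's input (708 rows, nine features each). It is not the authors' pipeline and does not use the XGBoost library: the random features, the labeling rule, the 500/208 split and all function names are hypothetical, and XGBoost itself adds regularization, second-order gradient information and deeper trees.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best_gain, best_stump = -1.0, None
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Maximizing this quantity is equivalent to minimizing the
            # squared error of a two-leaf fit to the residuals.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if gain > best_gain:
                best_gain = gain
                best_stump = (j, lo, left_sum / k, right_sum / (n - k))
    return best_stump


def fit_boosting(X, y, n_rounds=60, learning_rate=1.0):
    """First-order gradient boosting on log loss with stumps as weak learners."""
    scores = [0.0] * len(X)
    model = []
    for _ in range(n_rounds):
        # Negative gradient of the log loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, thr, left_val, right_val = fit_stump(X, residuals)
        left_val *= learning_rate
        right_val *= learning_rate
        model.append((j, thr, left_val, right_val))
        for i, row in enumerate(X):
            scores[i] += left_val if row[j] <= thr else right_val
    return model


def predict(model, row):
    score = sum(l if row[j] <= t else r for j, t, l, r in model)
    return 1 if score > 0 else 0


# Synthetic stand-in data (assumption): 708 "mutations" with nine numeric
# features; the label depends on the first two features only.
rng = random.Random(0)
X = [[rng.random() for _ in range(9)] for _ in range(708)]
y = [1 if row[0] + row[1] > 1.0 else 0 for row in X]

# Hypothetical hold-out split; the source does not state how data were split.
X_train, y_train = X[:500], y[:500]
X_test, y_test = X[500:], y[500:]

model = fit_boosting(X_train, y_train)
correct = sum(predict(model, row) == label for row, label in zip(X_test, y_test))
accuracy = correct / len(X_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic labeling rule and says nothing about the performance reported in the study.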
Findings
The XGBoost classifier model yielded a classification accuracy of 0.946, which was superior to that obtained by other classifiers such as decision tree, Naive Bayes, random forest and support vector machine. Further, an interaction network was generated to identify and establish correlations with cancer-associated pathways and gene ontology data.
Originality/value
The final results revealed 12 putative candidate cancer driver genes, namely LAMA3, LAMC3, COL6A1, COL5A1, COL2A1, UGT1A1, BDNF, ANK1, WNT10A, FZD4, PLEKHG5 and CYP2C9, that may have implications in clinical diagnosis.
Acknowledgements
All the authors would like to thank the Director, MIT School of Bioengineering Sciences & Research for infrastructure and support. R.W. thanks MIT-Art Design and Technology University, Pune, for awarding a PhD research fellowship.
Conflict of Interest: The authors have no conflicts of interest to declare.
The supplementary material for this article can be found at:
Citation
Wadapurkar, R., Bapat, S., Mahajan, R. and Vyas, R. (2024), "Machine learning approaches for prediction of ovarian cancer driver genes from mutational and network analysis", Data Technologies and Applications, Vol. 58 No. 1, pp. 62-80. https://doi.org/10.1108/DTA-03-2022-0096
Publisher
Emerald Publishing Limited
Copyright © 2023, Emerald Publishing Limited