
Machine learning approaches for prediction of ovarian cancer driver genes from mutational and network analysis

Rucha Wadapurkar (School of Bioengineering Sciences & Research, MIT Art, Design and Technology University, Pune, India)
Sanket Bapat (School of Bioengineering Sciences & Research, MIT Art, Design and Technology University, Pune, India)
Rupali Mahajan (School of Bioengineering Sciences & Research, MIT Art, Design and Technology University, Pune, India)
Renu Vyas (School of Bioengineering Sciences & Research, MIT Art, Design and Technology University, Pune, India)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 3 May 2023

Issue publication date: 29 January 2024


Abstract

Purpose

Ovarian cancer (OC) is the most common type of gynecologic cancer in the world and has a high rate of mortality. Because its symptoms are generic and specific biomarkers are lacking, OC is usually diagnosed at a late stage. Machine learning models can be employed to predict driver genes implicated in causative mutations.

Design/methodology/approach

In the present study, a comprehensive next-generation sequencing (NGS) analysis of whole exome sequences from 47 OC patients was carried out to identify clinically significant mutations. Nine functional features of the 708 identified mutations were used as input to an eXtreme Gradient Boosting (XGBoost) classifier for the prediction of OC driver genes.
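The classification step described above can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline and it does not use the XGBoost library: it is plain gradient boosting on one-split decision trees (stumps) with a log-loss objective, which is the basic idea that XGBoost refines with regularization and second-order updates. The data are synthetic placeholders shaped like the study's input (708 rows, nine features); the labels, the signal-carrying features, and every function name are assumptions made for illustration only.

```python
import math
import random

random.seed(0)

# Synthetic stand-in data: 708 "mutations" x 9 numeric features.
# Only the first two features carry signal; the labels are made up.
N_SAMPLES, N_FEATURES = 708, 9
X = [[random.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_SAMPLES)]
y = [1 if 1.5 * row[0] + row[1] + random.gauss(0, 0.3) > 0 else 0 for row in X]

# 80/20 train/test split (rows are already in random order).
split = int(0.8 * N_SAMPLES)
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals, n_thresholds=8):
    """Return the one-split tree (feature, threshold, left, right) that
    best fits the residuals in the least-squares sense."""
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        column = sorted(row[j] for row in rows)
        # Candidate thresholds: evenly spaced quantiles of feature j.
        for k in range(1, n_thresholds + 1):
            t = column[k * n // (n_thresholds + 1)]
            left = [r for row, r in zip(rows, residuals) if row[j] <= t]
            right = [r for row, r in zip(rows, residuals) if row[j] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]


def stump_value(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv


def fit_boosting(rows, labels, n_rounds=30, learning_rate=0.5):
    """Fit an additive model of stumps by gradient descent on the log loss."""
    scores = [0.0] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log loss w.r.t. the raw score is y - p.
        residuals = [yi - sigmoid(s) for yi, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_value(stump, row)
                  for s, row in zip(scores, rows)]
    return learning_rate, stumps


def predict(model, row):
    learning_rate, stumps = model
    score = sum(learning_rate * stump_value(s, row) for s in stumps)
    return 1 if score > 0 else 0


model = fit_boosting(X_train, y_train)
correct = sum(predict(model, row) == label for row, label in zip(X_test, y_test))
test_accuracy = correct / len(y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

The printed accuracy describes the synthetic data only and says nothing about the 0.946 reported in the study. In practice one would substitute the real feature matrix and a maintained library implementation for this hand-rolled booster.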

Findings

The XGBoost classifier model yielded a classification accuracy of 0.946, which was superior to that obtained by other classifiers such as decision tree, Naive Bayes, random forest and support vector machine. Further, an interaction network was generated to identify and establish correlations with cancer-associated pathways and gene ontology data.

Originality/value

The final results revealed 12 putative candidate cancer driver genes, namely LAMA3, LAMC3, COL6A1, COL5A1, COL2A1, UGT1A1, BDNF, ANK1, WNT10A, FZD4, PLEKHG5 and CYP2C9, that may have implications in clinical diagnosis.


Acknowledgements

The authors thank the Director, MIT School of Bioengineering Sciences & Research, for infrastructure and support. R.W. thanks MIT Art, Design and Technology University, Pune, for awarding a PhD research fellowship.

Conflict of Interest: The authors have no conflicts of interest to declare.

The supplementary material for this article can be found at:

Citation

Wadapurkar, R., Bapat, S., Mahajan, R. and Vyas, R. (2024), "Machine learning approaches for prediction of ovarian cancer driver genes from mutational and network analysis", Data Technologies and Applications, Vol. 58 No. 1, pp. 62-80. https://doi.org/10.1108/DTA-03-2022-0096

Publisher

Emerald Publishing Limited

Copyright © 2023, Emerald Publishing Limited
