A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data

This work focuses on data sampling in cancer-gene association prediction. Currently, researchers are using machine learning methods to predict genes that are more likely to produce cancer-causing mutations. To improve the performance of machine learning models, methods have been proposed, one of whi...

Bibliographic Details
Main Authors:	Xu, Mingzhe, Abdullah, Nor Aniza, Md Sabri, Aznul Qalid
Format:	Article
Published:	Elsevier Ltd 2024
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://eprints.um.edu.my/44823/
Institution:	Universiti Malaya

id	my.um.eprints.44823
record_format	eprints
spelling	my.um.eprints.448232024-07-02T05:05:51Z http://eprints.um.edu.my/44823/ A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data Xu, Mingzhe Abdullah, Nor Aniza Md Sabri, Aznul Qalid QA75 Electronic computers. Computer science This work focuses on data sampling in cancer-gene association prediction. Currently, researchers are using machine learning methods to predict genes that are more likely to produce cancer-causing mutations. To improve the performance of machine learning models, methods have been proposed, one of which is to improve the quality of the training data. Existing methods focus mainly on screening positive data, i.e. cancer driver genes. This paper proposes a low-cancer-related gene screening method based on gene network and graph theory algorithms to improve the selection of negative samples. Genetic data with low cancer correlation is used as negative training samples. Experimental verification shows that using the negative samples screened by this method to train the cancer gene classification model can improve prediction performance. The biggest advantage of this method is that it can be easily combined with other methods that focus on enhancing the quality of positive training samples. It has been demonstrated that significant improvement is achieved by combining this method with three state-of-the-art cancer gene prediction methods. © 2023 Elsevier Ltd Elsevier Ltd 2024 Article PeerReviewed Xu, Mingzhe and Abdullah, Nor Aniza and Md Sabri, Aznul Qalid (2024) A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data. Computational Biology and Chemistry, 108. ISSN 1476-9271, DOI https://doi.org/10.1016/j.compbiolchem.2023.107997 <https://doi.org/10.1016/j.compbiolchem.2023.107997>. 10.1016/j.compbiolchem.2023.107997
institution	Universiti Malaya
building	UM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaya
content_source	UM Research Repository
url_provider	http://eprints.um.edu.my/
topic	QA75 Electronic computers. Computer science
spellingShingle	QA75 Electronic computers. Computer science Xu, Mingzhe Abdullah, Nor Aniza Md Sabri, Aznul Qalid A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
description	This work focuses on data sampling in cancer-gene association prediction. Currently, researchers are using machine learning methods to predict genes that are more likely to produce cancer-causing mutations. To improve the performance of machine learning models, methods have been proposed, one of which is to improve the quality of the training data. Existing methods focus mainly on screening positive data, i.e. cancer driver genes. This paper proposes a low-cancer-related gene screening method based on gene network and graph theory algorithms to improve the selection of negative samples. Genetic data with low cancer correlation is used as negative training samples. Experimental verification shows that using the negative samples screened by this method to train the cancer gene classification model can improve prediction performance. The biggest advantage of this method is that it can be easily combined with other methods that focus on enhancing the quality of positive training samples. It has been demonstrated that significant improvement is achieved by combining this method with three state-of-the-art cancer gene prediction methods. © 2023 Elsevier Ltd
format	Article
author	Xu, Mingzhe Abdullah, Nor Aniza Md Sabri, Aznul Qalid
author_facet	Xu, Mingzhe Abdullah, Nor Aniza Md Sabri, Aznul Qalid
author_sort	Xu, Mingzhe
title	A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
title_short	A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
title_full	A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
title_fullStr	A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
title_full_unstemmed	A method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
title_sort	method to improve the prediction performance of cancer-gene association by screening negative training samples through gene network data
publisher	Elsevier Ltd
publishDate	2024
url	http://eprints.um.edu.my/44823/
_version_	1805881172035633152


Similar Items