On the significance of topological-indices based non-binary molecular similarity measures

This paper describes experiments to study how well the whole range of topological indices-based non-binary similarity values represents the physicochemical similarities between compounds. Measured log P values have been compared with the log P values predicted from compounds at different range of...

Bibliographic Details
Main Authors: Naomie Salim; Holliday, John; Willett, Peter
Format: Article
Published: Universiti Kebangsaan Malaysia 2004
Online Access: http://journalarticle.ukm.my/3896/
http://www.ukm.my/jsm/english_journals/vol33num2_2004/vol33num2_04page157-172.html
Institution: Universiti Kebangsaan Malaysia
id my-ukm.journal.3896
record_format eprints
spelling my-ukm.journal.3896 2012-05-07T03:29:53Z http://journalarticle.ukm.my/3896/ On the significance of topological-indices based non-binary molecular similarity measures Naomie Salim; Holliday, John; Willett, Peter
Universiti Kebangsaan Malaysia 2004 Article PeerReviewed Naomie Salim, Holliday, John and Willett, Peter (2004) On the significance of topological-indices based non-binary molecular similarity measures. Sains Malaysiana, 33 (2). pp. 157-172. ISSN 0126-6039 http://www.ukm.my/jsm/english_journals/vol33num2_2004/vol33num2_04page157-172.html
institution Universiti Kebangsaan Malaysia
building Perpustakaan Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
description This paper describes experiments to study how well the whole range of topological indices-based non-binary similarity values represents the physicochemical similarities between compounds. Measured log P values were compared with the log P values predicted from compounds at different ranges of similarity, calculated from various topological indices of the compounds. The analysis shows that the non-binary Cosine, Simpson and Pearson coefficients might give misleading results when certain compounds are compared. Similarity values involving the 1% most similar compounds, based on the non-binary Tanimoto or Euclidean coefficients, were found to represent the physicochemical similarities between the molecules compared. Therefore, for searches requiring around the 1% most similar compounds, rational selection methods based on the non-binary Tanimoto or Euclidean coefficients are likely to produce better results than random selection. Similarity values involving the 5% most dissimilar compounds, based on the non-binary Tanimoto coefficient, were also found to represent the physicochemical dissimilarities between the molecules compared. Therefore, for diverse selection requiring fewer than the 5% most dissimilar compounds, rational selection methods based on the non-binary Tanimoto coefficient are likely to produce better results than random selection. However, in both focused and diverse selection using these coefficients, as more and more compounds are selected, the selection increasingly resembles random selection in terms of physicochemical property similarity and dissimilarity.
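As a rough illustration of the coefficients named in the description, the following Python sketch computes commonly used real-valued forms of the Tanimoto and Cosine coefficients and the Euclidean distance, and picks the most similar fraction of a small library. It is not taken from the paper: the function names, the example vectors and the `top_fraction` helper are hypothetical, and the paper's exact definitions (for example, any normalisation of the Euclidean coefficient) may differ. The example also shows one well-known property of the Cosine coefficient, that it scores a vector and a scaled copy of it as identical, which may or may not be the behaviour the authors describe as misleading.

```python
import math


def tanimoto(x, y):
    """Real-valued Tanimoto: x.y / (|x|^2 + |y|^2 - x.y)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # The denominator is zero only when both vectors are all zeros.
    return dot / denom if denom else 1.0


def cosine(x, y):
    """Cosine coefficient: x.y / (|x| * |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def euclidean_distance(x, y):
    """Plain Euclidean distance (a dissimilarity, not a normalised similarity)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def top_fraction(query, library, fraction, coefficient=tanimoto):
    """Hypothetical helper: names of the most similar `fraction` of the library."""
    k = max(1, math.ceil(fraction * len(library)))
    ranked = sorted(library,
                    key=lambda name: coefficient(query, library[name]),
                    reverse=True)
    return ranked[:k]


# Made-up index vectors, for illustration only.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]  # same direction as a, twice the magnitude

library = {
    "c1": [1.0, 2.0, 3.0],
    "c2": [2.0, 4.0, 6.0],
    "c3": [3.0, 0.0, 0.0],
    "c4": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    print("cosine(a, b)   =", cosine(a, b))
    print("tanimoto(a, b) =", tanimoto(a, b))
    print("euclidean(a, b) =", euclidean_distance(a, b))
    print("top half by Tanimoto:", top_fraction(a, library, 0.5))
```

In this sketch the Cosine coefficient cannot distinguish `a` from `b`, while the Tanimoto coefficient and the Euclidean distance both register the difference in magnitude.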
format Article
author Naomie Salim
Holliday, John
Willett, Peter
title On the significance of topological-indices based non-binary molecular similarity measures
publisher Universiti Kebangsaan Malaysia
publishDate 2004
url http://journalarticle.ukm.my/3896/
http://www.ukm.my/jsm/english_journals/vol33num2_2004/vol33num2_04page157-172.html