Web-based retrieval system for chemical structural formulas
The drug discovery process relies heavily on chemical substructure and similarity search results for lead identification. Researchers often pool substructure and similarity search results to obtain a larger set of lead molecules for drug suitability evaluation in subsequent stages of the drug discov...
Main Author: | Neo, Lok Tuan |
---|---|
Other Authors: | Hui Siu Cheung |
Format: | Final Year Project |
Language: | English |
Published: | 2013 |
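Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |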
Online Access: | http://hdl.handle.net/10356/55016 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-55016 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-55016 2023-03-03T20:49:26Z Web-based retrieval system for chemical structural formulas Neo, Lok Tuan Hui Siu Cheung School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2013-11-29T07:06:15Z 2013-11-29T07:06:15Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/55016 en Nanyang Technological University 67 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |
description |
The drug discovery process relies heavily on chemical substructure and similarity search results for lead identification. Researchers often pool substructure and similarity search results to obtain a larger set of lead molecules for drug suitability evaluation in subsequent stages of the drug discovery process. However, existing chemical search engines require users to issue similarity and substructure chemical search queries separately and only display search results to the users when the search is complete. In this project, an efficient web-based chemical search engine is proposed and implemented to efficiently deliver both types of search results to users once a match is found. Two approaches are proposed to support efficient chemical search:
• Effective Substructure Screening - By combining substructure information with chemical functional groups and chemical bonds, the accuracy of the substructure screening process during a substructure search can be improved. Evaluation results showed that the combined chemical features improve precision, recall and F1 scores for almost all test queries.
• Publisher-Subscriber Infrastructure - Using the Publisher-Subscriber pattern in conjunction with an effective molecule filtering process, various types of chemical search can be carried out simultaneously and results can be efficiently delivered to users. Evaluation results of the proposed search engine infrastructure indicate that it is linearly scalable when used on larger chemical databases with significant speed-ups in search time when cached results are used to filter molecules for substructure search.
Both proposed approaches jointly work to enhance the efficiency and effectiveness of chemical structural formula search. In this report, the proposed substructure screening process and the proposed publisher-subscriber infrastructure will be discussed. The performance of the proposed approaches is also evaluated. |
author2 |
Hui Siu Cheung |
author_facet |
Hui Siu Cheung Neo, Lok Tuan |
format |
Final Year Project |
author |
Neo, Lok Tuan |
author_sort |
Neo, Lok Tuan |
title |
Web-based retrieval system for chemical structural formulas |
title_sort |
web-based retrieval system for chemical structural formulas |
publishDate |
2013 |
url |
http://hdl.handle.net/10356/55016 |
_version_ |
1759854210734620672 |
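
The two approaches named in the description (substructure screening on combined chemical features, and publisher-subscriber delivery of results as matches are found) can be illustrated with a minimal Python sketch. It is not the project's implementation: the feature names, the `ResultBroker` class, the `passes_screen` and `screen_and_publish` functions, and the toy database are all hypothetical, and the screen only applies the necessary condition that a candidate must contain at least as many of each feature as the query. Atom-by-atom verification of the surviving candidates is omitted.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical feature vocabulary: counts of functional groups and bond types.
Features = Counter


def passes_screen(query: Features, candidate: Features) -> bool:
    """Necessary condition for a substructure match: the candidate holds
    at least as many of every feature as the query does."""
    return all(candidate[name] >= count for name, count in query.items())


class ResultBroker:
    """Minimal in-process publisher-subscriber hub keyed by topic name."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, molecule_id: str) -> None:
        for callback in self._subscribers.get(topic, []):
            callback(molecule_id)


def screen_and_publish(
    query: Features,
    database: Iterable[Tuple[str, Features]],
    broker: ResultBroker,
    topic: str = "substructure",
) -> int:
    """Publish each molecule that survives screening as soon as it is found.

    Survivors are only candidates; full graph matching is not performed here.
    """
    published = 0
    for molecule_id, features in database:
        if passes_screen(query, features):
            broker.publish(topic, molecule_id)
            published += 1
    return published


# Toy data, invented for illustration only.
DATABASE = [
    ("mol-1", Counter({"hydroxyl": 1, "C-C": 1, "C-O": 1})),
    ("mol-2", Counter({"carboxyl": 1, "C-C": 1, "C=O": 1, "C-O": 1})),
    ("mol-3", Counter({"C-C": 5, "C=C": 1})),
]
QUERY = Counter({"C-O": 1, "C-C": 1})

received: List[str] = []
broker = ResultBroker()
broker.subscribe("substructure", received.append)
count = screen_and_publish(QUERY, DATABASE, broker)
```

In this sketch a subscriber receives each candidate identifier the moment it passes the screen instead of waiting for the whole database scan to finish. A similarity search could publish to a second topic on the same broker, which is one way the simultaneous delivery described above might be arranged.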