Handwriting recognition and retrieval for chemical structural formulas
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2015 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/63297 |
Institution: | Nanyang Technological University |
Summary: | Chemicals with similar structures often have similar chemical properties, chemical reactions and even physical properties. Many drug discovery projects therefore need to search for drug-like compounds with similar chemical structures that are worthy of further synthetic investigation. However, most current search engines work well only for text-based information and do not support chemical structural search well. Moreover, a chemical structural search requires a chemical structural formula as the query. Compared with handwriting-based input, traditional template-based input is much more complicated and less intuitive. With the growing popularity of touch-based devices, handwriting-based input has become much more important. Because of the spatial complexity of chemical structural formulas, it is challenging to recognize handwritten formulas with both precision and efficiency. This research investigates techniques to support handwriting recognition and retrieval for chemical structural formulas, and makes the following contributions:
• Handwritten Chemical Symbol Recognition. We proposed CF44, a chemical feature set of 44 chemical symbol features that model the writing process, visual appearance and contextual environment of handwritten chemical symbols. We also proposed a handwritten chemical symbol recognition approach based on a Support Vector Machine and the CF44 feature set.
• Progressive Chemical Structural Analysis. We proposed a chemical structural analysis approach to support progressive recognition of handwritten chemical structural formulas, in which a Chemical Structural Graph models the formulas. We also proposed a novel connected bond analysis method and a ring closure detection method to support the recognition of complex chemical structures such as connected bonds and cyclic ring structures.
• Chemical Structural Similarity Retrieval. We proposed two approaches for chemical structural similarity retrieval, which retrieve chemical structural formulas that are functionally similar to the query. The two approaches are based on the Vector Space Model and Formal Concept Analysis, respectively. We also proposed a web-based chemical retrieval system for efficient chemical structural similarity retrieval using the publish-subscribe model. |
---|
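The summary names the Vector Space Model as one basis for similarity retrieval but does not give the thesis's actual features or weighting. As a rough illustration only, the sketch below represents each structure as a bag of substructure counts and ranks a small collection by cosine similarity to a query. The feature names (`benzene_ring`, `hydroxyl`, and so on), the four-compound collection, and the functions `cosine_similarity` and `rank_by_similarity` are all assumptions made for this example, not the method described in the thesis.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Counter, collection: dict) -> list:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in collection.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy collection: each structure is a bag of substructure counts.
collection = {
    "phenol": Counter({"benzene_ring": 1, "hydroxyl": 1}),
    "benzoic_acid": Counter({"benzene_ring": 1, "carboxyl": 1}),
    "toluene": Counter({"benzene_ring": 1, "methyl": 1}),
    "acetic_acid": Counter({"methyl": 1, "carboxyl": 1}),
}

# Hypothetical query, e.g. the features of a recognized handwritten formula.
query = Counter({"benzene_ring": 1, "hydroxyl": 1})

ranking = rank_by_similarity(query, collection)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting, structures sharing more substructures with the query score higher, and a structure sharing none scores zero. A real system would need a principled substructure vocabulary and probably term weighting, neither of which the summary specifies.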