Automatic information extraction and text mining in medical abstracts.

This study is in the area of information extraction (IE), which seeks to extract pieces of related information from unstructured text to populate a database or an ontology. Most IE systems employ a pattern-matching technique to identify the information to be extracted. Patterns are learnt from a lar...

Bibliographic Details
Main Author: Wang, Wei.
Other Authors: Khoo Soo Guan, Christopher
Format: Theses and Dissertations
Language: English
Published: 2009
Subjects: DRNTU::Library and information science
Online Access: http://hdl.handle.net/10356/19088
Institution: Nanyang Technological University
id sg-ntu-dr.10356-19088
record_format dspace
spelling sg-ntu-dr.10356-190882019-12-10T12:07:58Z Automatic information extraction and text mining in medical abstracts. Wang, Wei. Khoo Soo Guan, Christopher Wee Kim Wee School of Communication and Information DRNTU::Library and information science This study is in the area of information extraction (IE), which seeks to extract pieces of related information from unstructured text to populate a database or an ontology. Most IE systems employ a pattern-matching technique to identify the information to be extracted. Patterns are learnt from a large annotated training set, which requires substantial human effort. This study investigates a semi-supervised learning approach to learn IE patterns. The approach uses a small number of seed patterns to automatically generate a training set and learns IE patterns from the training set by an Apriori algorithm. The study is carried out in the context of extracting information related to potential treatments of colon cancer from medical abstracts. It focuses on extracting 3 kinds of semantic relations: • Treatment relation: the disease and its potential medical treatment • Dosage relation: the treatment and its dose • Effect type relation: the treatment and its effect type. The objectives of this study are to develop a method for automatic construction of IE patterns using semi-supervised learning, to develop an IE system for extracting disease-treatment information from medical abstracts, and to develop an ontology for representing disease-treatment information found in medical abstracts. Master of Applied Science 2009-10-06T06:23:45Z 2009-10-06T06:23:45Z 2009 2009 Thesis http://hdl.handle.net/10356/19088 en Nanyang Technological University 154 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Library and information science
spellingShingle DRNTU::Library and information science
Wang, Wei.
Automatic information extraction and text mining in medical abstracts.
description This study is in the area of information extraction (IE), which seeks to extract pieces of related information from unstructured text to populate a database or an ontology. Most IE systems employ a pattern-matching technique to identify the information to be extracted. Patterns are learnt from a large annotated training set, which requires substantial human effort. This study investigates a semi-supervised learning approach to learn IE patterns. The approach uses a small number of seed patterns to automatically generate a training set and learns IE patterns from the training set by an Apriori algorithm. The study is carried out in the context of extracting information related to potential treatments of colon cancer from medical abstracts. It focuses on extracting 3 kinds of semantic relations: • Treatment relation: the disease and its potential medical treatment • Dosage relation: the treatment and its dose • Effect type relation: the treatment and its effect type. The objectives of this study are to develop a method for automatic construction of IE patterns using semi-supervised learning, to develop an IE system for extracting disease-treatment information from medical abstracts, and to develop an ontology for representing disease-treatment information found in medical abstracts.
author2 Khoo Soo Guan, Christopher
author_facet Khoo Soo Guan, Christopher
Wang, Wei.
format Theses and Dissertations
author Wang, Wei.
author_sort Wang, Wei.
title Automatic information extraction and text mining in medical abstracts.
publishDate 2009
url http://hdl.handle.net/10356/19088
_version_ 1681046275614572544
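
The description in this record says that IE patterns are learnt from an automatically generated training set by an Apriori algorithm. The record gives no implementation details, so the following is only a minimal sketch of generic Apriori frequent-itemset mining, not the thesis's actual method. The function name `apriori`, the toy `sentences` data, the placeholder tokens `DRUG` and `DISEASE`, and the `min_support` threshold are all hypothetical. Each sentence is treated as an unordered set of tokens, which is a simplification: real extraction patterns would most likely also need word order.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: keep the candidates that meet the support threshold.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy "training set": token sets from sentences in which
# entity mentions have already been replaced by placeholder tags.
sentences = [
    {"DRUG", "for", "DISEASE"},
    {"DRUG", "for", "DISEASE", "patients"},
    {"DRUG", "in", "DISEASE"},
    {"treated", "with", "DRUG"},
]

patterns = apriori(sentences, min_support=2)
for itemset, support in sorted(patterns.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In this toy run the token set {DRUG, for, DISEASE} occurs in two sentences and so survives as a candidate pattern, while one-off tokens such as "patients" are pruned at the first level.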