Discourse parsing of sociology dissertation abstracts using decision tree induction

In this study, we investigated the use of decision tree induction to parse the macro-level discourse structure of sociology dissertation abstracts. We treated discourse parsing as a sentence categorization task. The attributes used in constructing the dec...

Bibliographic Details
Main Authors: Ou, Shiyan, Heng, Hui Ying, Goh, Dion Hoe-Lian, Khoo, Christopher S. G.
Other Authors: Wee Kim Wee School of Communication and Information
Format: Conference or Workshop Item
Language: English
Published: 2011
Subjects: DRNTU::Library and information science
Online Access: https://hdl.handle.net/10356/93813
http://hdl.handle.net/10220/7296
Institution: Nanyang Technological University
id sg-ntu-dr.10356-93813
record_format dspace
spelling sg-ntu-dr.10356-93813 2019-12-06T18:45:58Z
Discourse parsing of sociology dissertation abstracts using decision tree induction
Ou, Shiyan
Heng, Hui Ying
Goh, Dion Hoe-Lian
Khoo, Christopher S. G.
Wee Kim Wee School of Communication and Information
14th ASIS SIG/CR Classification Research Workshop
DRNTU::Library and information science
Published version
2011-10-18T01:23:50Z 2019-12-06T18:45:58Z 2003
Conference Paper
Ou, S., Khoo, C. S. G., Heng, H. Y., & Goh, D. H. L. (2003). Discourse Parsing of Sociology Dissertation Abstracts Using Decision Tree Induction. In Proceedings of the 14th Annual ASIST SIG CR Workshop, Long Beach, California, USA.
https://hdl.handle.net/10356/93813
http://hdl.handle.net/10220/7296
en
© 2009 The Author(s) (ASIS SIG/CR Classification Research Workshop). This paper was published in Proceedings of the 14th ASIS SIG/CR Classification Research Workshop and is made available as an electronic reprint (preprint) with permission of The Author(s) (ASIS SIG/CR Classification Research Workshop). The published version is available at: [http://journals.lib.washington.edu/index.php/acro/article/view/14114]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.
9 p.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Library and information science
description In this study, we investigated the use of decision tree induction to parse the macro-level discourse structure of sociology dissertation abstracts. We treated discourse parsing as a sentence categorization task. The attributes used in constructing the decision tree models were stemmed words that occurred in at least 35 sentences (out of 3694 sentences in 300 sample abstracts). Sentence location information was also used. The model obtained an accuracy rate of 71.3% when applied to a test sample of 100 abstracts. Another model that made use of information regarding the presence of 31 indicator words in neighboring sentences was also developed. Although this model did not obtain better results, a comparison of the two models suggests that combining them could improve the classification of sentences in the problem statement and research method sections.
author2 Wee Kim Wee School of Communication and Information
format Conference or Workshop Item
author Ou, Shiyan
Heng, Hui Ying
Goh, Dion Hoe-Lian
Khoo, Christopher S. G.
author_sort Ou, Shiyan
title Discourse parsing of sociology dissertation abstracts using decision tree induction
publishDate 2011
url https://hdl.handle.net/10356/93813
http://hdl.handle.net/10220/7296
_version_ 1681034070514991104
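
The attribute construction described in the abstract (stemmed words kept only if they occur in a minimum number of sentences, plus sentence location) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not name the stemmer, the encoding of sentence location, or the decision tree software, so `crude_stem`, the relative `location` value, and the toy sentences are all hypothetical. The threshold is passed as a parameter; the abstract reports 35 sentences out of 3694, whereas the toy data uses 2.

```python
import re
from collections import Counter


def crude_stem(word):
    """Hypothetical stand-in stemmer: strips a few common English suffixes.

    The abstract only says words were stemmed; it does not name the stemmer.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def tokenize(sentence):
    """Lowercase a sentence, split it into alphabetic tokens, and stem each."""
    return [crude_stem(w) for w in re.findall(r"[a-z]+", sentence.lower())]


def select_attributes(abstracts, min_sentences):
    """Return the stems that occur in at least `min_sentences` sentences."""
    sentence_freq = Counter()
    for abstract in abstracts:
        for sentence in abstract:
            # set() so a stem counts once per sentence, however often it repeats
            sentence_freq.update(set(tokenize(sentence)))
    return sorted(s for s, n in sentence_freq.items() if n >= min_sentences)


def featurize(abstract, attributes):
    """Build one row per sentence: word-presence flags plus relative location."""
    rows = []
    last = max(len(abstract) - 1, 1)  # avoid division by zero for 1 sentence
    for i, sentence in enumerate(abstract):
        stems = set(tokenize(sentence))
        row = {stem: int(stem in stems) for stem in attributes}
        # Assumed encoding: 0.0 = first sentence, 1.0 = last sentence
        row["location"] = i / last
        rows.append(row)
    return rows


# Toy abstracts, invented for illustration; each is a list of sentences.
abstracts = [
    ["This study examines family roles.",
     "Data were collected using surveys.",
     "Results show changing roles."],
    ["This study explores migration.",
     "Interviews were conducted.",
     "Results suggest new patterns."],
]

attributes = select_attributes(abstracts, min_sentences=2)
rows = featurize(abstracts[0], attributes)
```

Each row in `rows` could then be paired with a manually assigned section label and passed to any decision tree learner; the induction step itself is left out here because the record does not say which algorithm or tool was used.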