Discourse parsing of sociology dissertation abstracts using decision tree induction

Bibliographic Details
Main Authors: Ou, Shiyan, Heng, Hui Ying, Goh, Dion Hoe-Lian, Khoo, Christopher S. G.
Other Authors: Wee Kim Wee School of Communication and Information
Format: Conference or Workshop Item
Language: English
Published: 2011
Online Access: https://hdl.handle.net/10356/93813
http://hdl.handle.net/10220/7296
Institution: Nanyang Technological University
Description
Summary: In this study, we investigated the use of decision tree induction to parse the macro-level discourse structure of sociology dissertation abstracts. We treated discourse parsing as a sentence categorization task. The attributes used to construct the decision tree models were stemmed words that occurred in at least 35 sentences (out of 3694 sentences in 300 sample abstracts), together with sentence location information. The model obtained an accuracy rate of 71.3% when applied to a test sample of 100 abstracts. A second model, which also used information on the presence of 31 indicator words in neighboring sentences, was developed as well. Although this second model did not obtain better results, a comparison of the two models suggests that combining them could improve the classification of sentences in the problem statement and research method sections.
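
The attribute construction described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `select_vocabulary`, `sentence_features`), the toy abstracts, and the threshold of 2 are all hypothetical, and plain lowercased tokens stand in for the stemmed words used in the study. The sketch keeps words that occur in at least a given number of sentences, then builds one attribute row per sentence from word-presence flags plus the sentence's relative location in its abstract.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens (assumption: the study stemmed words; this sketch does not)."""
    return re.findall(r"[a-z]+", sentence.lower())


def select_vocabulary(abstracts, min_sentences):
    """Keep words that occur in at least `min_sentences` sentences across all abstracts."""
    sentence_freq = Counter()
    for sentences in abstracts:
        for sentence in sentences:
            # Count each word once per sentence, not once per occurrence.
            sentence_freq.update(set(tokenize(sentence)))
    return sorted(w for w, n in sentence_freq.items() if n >= min_sentences)


def sentence_features(abstracts, vocabulary):
    """One row per sentence: binary word-presence flags plus relative position."""
    rows = []
    for sentences in abstracts:
        last = max(len(sentences) - 1, 1)  # avoid dividing by zero for one-sentence abstracts
        for i, sentence in enumerate(sentences):
            words = set(tokenize(sentence))
            row = [1 if w in words else 0 for w in vocabulary]
            row.append(i / last)  # 0.0 = first sentence, 1.0 = last sentence
            rows.append(row)
    return rows


# Hypothetical toy data: each abstract is a list of sentences.
toy_abstracts = [
    ["This study examines family roles.",
     "Data were collected by survey.",
     "Results show strong effects."],
    ["This study explores migration.",
     "Interviews were conducted.",
     "Results indicate change."],
]

vocab = select_vocabulary(toy_abstracts, min_sentences=2)
features = sentence_features(toy_abstracts, vocab)
```

With the study's settings, the threshold would be 35 sentences rather than 2, and the resulting rows, paired with a discourse-section label for each sentence, would be the training input to a decision tree learner.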