Automatic extraction of conceptual relations from children's stories

Bibliographic Details
Main Author: Samson, Briane Paul V.
Format: text
Language: English
Published: Animo Repository 2013
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/4382
Institution: De La Salle University
Description
Summary: People use storytelling as a natural and familiar means of conveying information and experience to each other. During this exchange, people understand each other because they rely on a large body of shared commonsense knowledge. Computers do not share this knowledge, which creates a barrier in human-computer interaction and in applications that require computers to generate coherent text. To support such tasks, computers must be provided with usable knowledge about the basic relationships between concepts that people rely on every day. This research used GATE, an existing tool, together with custom extraction rules to automatically extract concepts and their relations from existing children's stories and store them in a knowledge base that story generation systems such as Picture Books, and other NLP applications, can use. Sixteen (16) relation types were extracted, covering descriptions of story elements, character actions, temporal succession and causal chains of events, spatial and functional information about story objects, and world state information in a story. The evaluations found the extractor to be inaccurate in identifying relations in a story: it has an overall accuracy of 36% based on precision, recall, and F-measure. The main causes of inaccuracy were incomplete and overly general templates, insufficient indicators, the accuracy of the existing tools, and the inability to infer and detect implied relations. Furthermore, the quality and accuracy of extracted relations decrease as the complexity and length of a story increase.
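
The summary reports results in terms of precision, recall and F-measure. As a rough illustration of how these measures are commonly computed for extracted relations, the sketch below compares a set of extracted (concept, relation, concept) triples against a hand-built gold set. The relation names, the sample triples, and the evaluate function are hypothetical and chosen only for illustration; they are not taken from the thesis, which does not list its sixteen relation types or its evaluation procedure in this record.

```python
from typing import Dict, Set, Tuple

# A relation is modeled as a (concept, relation type, concept) triple.
# This representation is an assumption made for the sketch.
Relation = Tuple[str, str, str]


def evaluate(extracted: Set[Relation], gold: Set[Relation]) -> Dict[str, float]:
    """Compute precision, recall and balanced F-measure (F1) by exact match."""
    true_positives = len(extracted & gold)
    # Precision: share of extracted relations that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold relations that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical gold-standard and extracted relations for one short story.
gold = {
    ("ball", "IsA", "toy"),
    ("girl", "CapableOf", "play"),
    ("park", "LocationOf", "swing"),
    ("rain", "EffectOf", "wet"),
}
extracted = {
    ("ball", "IsA", "toy"),
    ("girl", "CapableOf", "play"),
    ("ball", "IsA", "girl"),  # a wrong extraction
}

scores = evaluate(extracted, gold)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Exact matching on whole triples is the strictest choice; an evaluation that credits partially correct relations would produce different figures.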