A corpus based analysis of -kan and -i in Indonesian

Bibliographic Details
Main Author: Choi, Hannah Yun Jung
Other Authors: Francis Bond
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/136955
Institution: Nanyang Technological University
Description
Summary: The importance of capturing patterns of shared verb behavior through verb classes was brought to attention by Fillmore (1970) in his seminal paper “The Grammar of Hitting and Breaking”. There, he recognized that English verbs can be grouped into classes based on semantic similarity as well as shared grammatical behavior and argument realization. Specifically, he showed that hit and break verbs each belong to larger classes whose members share comparable patterns of behavior, such as participation in the causative alternation and the interpretations available to their passive participles (1970:125). Later studies of English confirmed and expanded on Fillmore’s findings (Dixon, 1992; Jackendoff, 1992; Levin & Hovav, 1991), most notably Levin (1993) in her book “English Verb Classes and Alternations”. Beyond English, shared syntactic behavior among semantically related verb classes has also been identified and explored in languages such as Lhasa Tibetan (DeLancey, 1995), Kimaragang Dusun (Kroeger, 2010) and Indonesian (Voskuil, 1996). Most recently, this idea has been implemented computationally for Hebrew by Sheinfux et al. (2017), whose analysis explains argument structure phenomena in Hebrew by distinguishing between semantic and syntactic selection and stating the constraints at each level separately.
Indonesian (ISO 639-3: ind) is the national language of the multilingual Indonesian archipelago. This Austronesian language is spoken as a first language by more than 22 million speakers (Lewis, 2009). As an agglutinating language, Indonesian employs several suffixes and prefixes with verbal roots (Sneddon et al., 2010). This thesis explores the use of these affixes, together with argument role information, as a basis for identifying verb sub-classes in Indonesian.
The data used in this thesis come from the Indonesian sub-corpora of the Leipzig Corpora Collection (LCC) (Quasthoff et al., 2006), which contain over 15 million sentences from news and web articles. I searched the corpus for verbs containing the prefix meN- and the suffixes -kan and -i. I then assigned each verb to one of seven distinct groups according to its morphological behavior. I selected 50 verb roots from each group and extracted a total of 4800 sentences for further analysis. I annotated these sentences with semantic roles and arguments based on a list adapted from Sheinfux et al. (2017). I found that it is possible to use the morphological information of affixes to arrive at a coarse-grained sub-classification of verbs in Indonesian that confirms the findings of existing research. I also show that a more fine-grained classification can be achieved using semantic information from argument roles.
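
The corpus search for meN- verbs with -kan and -i might be sketched roughly as follows. This is an illustration only, not the procedure used in the thesis: the function names `split_verb` and `attested_suffixes` are hypothetical, and the matching is purely orthographic with no root lexicon. As a result it does not restore initial consonants deleted after meN- (p, t, k, s), and it will mis-split roots that themselves end in -i. The seven groups are not listed in the summary, so the sketch only records which forms are attested for each candidate stem.

```python
import re
from collections import defaultdict

# Orthographic allomorphs of meN-, longest first so "meng" is tried before "men".
# Assumption: surface-based matching only, with no root lexicon.
PREFIX = r"(?:menge|meng|meny|mem|men|me)"
VERB = re.compile(rf"^{PREFIX}(?P<stem>[a-z]+?)(?P<suffix>kan|i)?$")


def split_verb(token):
    """Return (stem, suffix) for a meN- form, or None if the token does not match."""
    match = VERB.match(token.lower())
    if match is None:
        return None
    # An absent suffix is recorded as the empty string.
    return match.group("stem"), match.group("suffix") or ""


def attested_suffixes(tokens):
    """Map each candidate stem to the set of suffixes it appears with."""
    table = defaultdict(set)
    for token in tokens:
        parts = split_verb(token)
        if parts is not None:
            stem, suffix = parts
            table[stem].add(suffix)
    return dict(table)
```

For instance, calling `attested_suffixes(["mengajar", "mengajarkan", "mengajari"])` groups all three forms under the candidate stem `ajar`, with the bare, -kan and -i forms each recorded. A real pipeline would need a lexicon of roots to filter out false matches before any grouping is attempted.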