Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text

Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies as attributes such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement. This p...

Bibliographic Details
Main Authors: Ahmad Nazri, Mohd. Zakree; Abu Bakar, Azuraliza; Shamsudin, Siti Mariyam; Abd. Ghani, Tarmizi
Format: Book Section
Published: Institute of Electrical and Electronics Engineers 2008
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/12794/
http://dx.doi.org/10.1109/ITSIM.2008.4631709
Institution: Universiti Teknologi Malaysia
id my.utm.12794
record_format eprints
spelling my.utm.12794 2017-10-04T04:35:50Z http://eprints.utm.my/id/eprint/12794/ Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text Ahmad Nazri, Mohd. Zakree Abu Bakar, Azuraliza Shamsudin, Siti Mariyam Abd. Ghani, Tarmizi QA75 Electronic computers. Computer science Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement as attributes. This paper discusses the overall process of learning a taxonomy using FCA with the same syntactic dependencies as used for English, applied to Malay texts. Malay, an Austronesian language, follows the same Subject-Verb-Object sentence structure as English but is syntactically different. The results show lower recall and precision than related work in other languages. The poor results are caused by several factors, such as the choice of smoothing technique. The experimental results indicate that the current smoothing technique with FCA does not produce good results. Therefore, in addition to the syntactic dependencies, we use linguistic patterns such as Hearst's patterns to find similarities between terms. We compare the results of our technique against the cosine measure used in the FCA-based taxonomy learning approach. The proposed technique attains both higher precision and higher recall than the previous technique. Institute of Electrical and Electronics Engineers 2008 Book Section PeerReviewed Ahmad Nazri, Mohd. Zakree and Abu Bakar, Azuraliza and Shamsudin, Siti Mariyam and Abd. Ghani, Tarmizi (2008) Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text. In: Proceedings - International Symposium on Information Technology 2008, ITSim.
Institute of Electrical and Electronics Engineers, New York, pp. 1173-1179. ISBN 978-142442328-6 http://dx.doi.org/10.1109/ITSIM.2008.4631709 doi:10.1109/ITSIM.2008.4631709
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
description Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement as attributes. This paper discusses the overall process of learning a taxonomy using FCA with the same syntactic dependencies as used for English, applied to Malay texts. Malay, an Austronesian language, follows the same Subject-Verb-Object sentence structure as English but is syntactically different. The results show lower recall and precision than related work in other languages. The poor results are caused by several factors, such as the choice of smoothing technique. The experimental results indicate that the current smoothing technique with FCA does not produce good results. Therefore, in addition to the syntactic dependencies, we use linguistic patterns such as Hearst's patterns to find similarities between terms. We compare the results of our technique against the cosine measure used in the FCA-based taxonomy learning approach. The proposed technique attains both higher precision and higher recall than the previous technique.
format Book Section
author Ahmad Nazri, Mohd. Zakree
Abu Bakar, Azuraliza
Shamsudin, Siti Mariyam
Abd. Ghani, Tarmizi
author_sort Ahmad Nazri, Mohd. Zakree
title Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text
publisher Institute of Electrical and Electronics Engineers
publishDate 2008
url http://eprints.utm.my/id/eprint/12794/
http://dx.doi.org/10.1109/ITSIM.2008.4631709
_version_ 1643646043913453568