Building online corpora of Philippine languages

This paper describes the building of online corpora of Philippine languages as part of the online repository system called Palito. The corpora have five components: the four major Philippine languages (Tagalog, Cebuano, Ilocano, and Hiligaynon) and the Filipino Sign L...

Bibliographic Details
Main Authors: Dita, Shirley N., Roxas, Rachel Edita O., Inventado, Paul Salvador B.
Format: text
Published: Animo Repository 2009
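Subjects: Corpora (Linguistics); Philippine languages; Computer Sciences; Language and Literacy Education; South and Southeast Asian Languages and Societies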
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/2952
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-3951
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-3951 2022-07-22T01:38:59Z 2009-12-01T08:00:00Z
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
topic Corpora (Linguistics)
Philippine languages
Computer Sciences
Language and Literacy Education
South and Southeast Asian Languages and Societies
description This paper describes the building of online corpora of Philippine languages as part of the online repository system called Palito. The corpora have five components: the four major Philippine languages (Tagalog, Cebuano, Ilocano, and Hiligaynon) and Filipino Sign Language (FSL). Each of the four language components consists of 250,000 words of written text, whereas the FSL component consists of seven thousand signs in video format. The written texts include creative writing (such as novels and stories) and religious texts (such as the Bible). Automated tools are provided for language analysis, such as word counts and collocates, among others. This work is part of a larger corpus-building project for Philippine languages that would cover text, speech, and video forms, along with the development of automated tools for analyzing each of these forms. © 2009 by Shirley N. Dita, Rachel Edita O. Roxas, and Paul Inventado.
format text
author Dita, Shirley N.
Roxas, Rachel Edita O.
Inventado, Paul Salvador B.
title Building online corpora of Philippine languages
publisher Animo Repository
publishDate 2009
url https://animorepository.dlsu.edu.ph/faculty_research/2952
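
The description above mentions automated tools for word counts and collocates but does not say how Palito implements them. The following is only a minimal sketch of what such tools typically compute, not the authors' implementation. The function names (`tokenize`, `word_counts`, `collocates`), the window-based definition of a collocate, and the sample sentence are all assumptions made for illustration; the sample text is invented and is not taken from the corpora.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return word tokens.

    A token is a run of letters, optionally joined by hyphens or
    apostrophes (hypothetical rule; a real corpus tool may differ).
    """
    return re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", text.lower())


def word_counts(tokens):
    """Return a Counter mapping each token to its frequency."""
    return Counter(tokens)


def collocates(tokens, node, window=2):
    """Count the words found within `window` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Invented sample sentence, used only to exercise the functions.
sample = "Ang bata ay kumain ng mangga. Ang bata ay natulog."
tokens = tokenize(sample)
frequencies = word_counts(tokens)
neighbours = collocates(tokens, "bata", window=1)
```

Here `frequencies` holds per-word totals and `neighbours` holds the words immediately adjacent to "bata". A production tool would likely add sentence-boundary handling and an association measure on top of raw co-occurrence counts.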