Building online corpora of Philippine languages
This paper describes the building of online corpora of Philippine languages as part of the online repository system called Palito. The corpora have five components: the four major Philippine languages (Tagalog, Cebuano, Ilocano, and Hiligaynon) and the Filipino Sign L...
Main Authors: | Dita, Shirley N.; Roxas, Rachel Edita O.; Inventado, Paul Salvador B. |
---|---|
Format: | text |
Published: | Animo Repository, 2009 |
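Subjects: | Corpora (Linguistics); Philippine languages; Computer Sciences; Language and Literacy Education; South and Southeast Asian Languages and Societies |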
Online Access: | https://animorepository.dlsu.edu.ph/faculty_research/2952 |
Institution: | De La Salle University |
id |
oai:animorepository.dlsu.edu.ph:faculty_research-3951 |
---|---|
record_format |
eprints |
spelling |
oai:animorepository.dlsu.edu.ph:faculty_research-3951 2022-07-22T01:38:59Z Building online corpora of Philippine languages Dita, Shirley N. Roxas, Rachel Edita O. Inventado, Paul Salvador B. 2009-12-01T08:00:00Z text https://animorepository.dlsu.edu.ph/faculty_research/2952 Faculty Research Work Animo Repository Corpora (Linguistics) Philippine languages Computer Sciences Language and Literacy Education South and Southeast Asian Languages and Societies |
institution |
De La Salle University |
building |
De La Salle University Library |
continent |
Asia |
country |
Philippines |
content_provider |
De La Salle University Library |
collection |
DLSU Institutional Repository |
topic |
Corpora (Linguistics); Philippine languages; Computer Sciences; Language and Literacy Education; South and Southeast Asian Languages and Societies |
description |
This paper describes the building of online corpora of Philippine languages as part of the online repository system called Palito. The corpora have five components: the four major Philippine languages (Tagalog, Cebuano, Ilocano, and Hiligaynon) and Filipino Sign Language (FSL). Each of the four languages is represented by 250,000 words of written text, whereas FSL is represented by seven thousand signs in video format. Categories of the written texts include creative writing (such as novels and stories) and religious texts (such as the Bible). Automated tools are provided for language analysis, such as word counts and collocates, among others. This is part of a larger corpus-building project for Philippine languages that would cover text, speech, and video forms, along with the corresponding development of automated tools for language analysis of these forms. © 2009 by Shirley N. Dita, Rachel Edita O. Roxas, and Paul Inventado. |
format |
text |
author |
Dita, Shirley N.; Roxas, Rachel Edita O.; Inventado, Paul Salvador B. |
title |
Building online corpora of Philippine languages |
publisher |
Animo Repository |
publishDate |
2009 |
url |
https://animorepository.dlsu.edu.ph/faculty_research/2952 |
_version_ |
1740844632445550592 |