Word by word labelling of Romanized Sindhi text by using online python tool
Format: | Article |
---|---|
Language: | English |
Published: | The Science and Information (SAI) Organization, 2022 |
Online Access: | http://irep.iium.edu.my/100654/1/100654_Word%20by%20word%20labelling.pdf http://irep.iium.edu.my/100654/2/100654_Word%20by%20word%20labelling_SCOPUS.pdf http://irep.iium.edu.my/100654/ https://thesai.org/Publications/ViewPaper?Volume=13&Issue=8&Code=IJACSA&SerialNo=31 |
Institution: | Universiti Islam Antarabangsa Malaysia |
Summary: | Sindhi is one of the world's most ancient languages, with its own written and spoken forms. A review of prior work found that, although considerable research exists for other languages, word-by-word labelling of Sindhi had not yet been done. In this study, word labelling was performed on 100 sentences of Romanized Sindhi text using an online Python tool. The dataset was collected from several sources, including Sindhi newspapers, blogs and social media webpages. A rule-based model was applied to this dataset for Parts-of-Speech (POS) tagging of the Romanized Sindhi sentences. A total of 624 words of Romanized Sindhi text were tested and successfully tagged by the SindhiNLP tool: 482 words were tagged as nouns and pronouns, 92 as verbs and 50 as determiners. |
---|
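
The rule-based tagging described in the summary can be illustrated with a minimal dictionary-lookup sketch. This is not the SindhiNLP tool or the paper's rule set: the lexicon entries, the Romanized spellings, the tag names, the default-to-noun fallback and the function names `tag_sentence` and `tag_counts` are all assumptions made for illustration. The final lines only re-add the word counts reported in the summary.

```python
from collections import Counter

# Hypothetical mini-lexicon. The Romanized spellings and tags below are
# illustrative assumptions, not data or rules taken from the paper.
LEXICON = {
    "maan": "PRONOUN",
    "hee": "DETERMINER",
    "kitab": "NOUN",
    "ghar": "NOUN",
    "aahe": "VERB",
}


def tag_sentence(sentence, lexicon=LEXICON, default="NOUN"):
    """Tag each whitespace-separated word by lexicon lookup.

    Unknown words fall back to `default`, a common heuristic in simple
    rule-based taggers (assumed here, not confirmed by the source).
    """
    return [(word, lexicon.get(word.lower(), default))
            for word in sentence.split()]


def tag_counts(tagged):
    """Count how many words received each tag."""
    return Counter(tag for _, tag in tagged)


tagged = tag_sentence("hee kitab aahe")
counts = tag_counts(tagged)

# Word counts as reported in the summary, with each group's share in percent.
REPORTED = {"nouns and pronouns": 482, "verbs": 92, "determiners": 50}
reported_total = sum(REPORTED.values())
shares = {tag: round(100 * n / reported_total, 1)
          for tag, n in REPORTED.items()}
```

A lookup table keeps the sketch transparent: every tagging decision can be traced to one lexicon entry or to the fallback rule, which is the main appeal of rule-based tagging on a small dataset such as the 100 sentences described here.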