What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features
This paper proposes a novel use of grammatical error detection/correction (GED/GEC) tools to document non-standard English varieties. This is motivated by the fact that both GED/GEC technology and sociolinguistics aim to identify linguistic deviation from the so-called standard, yet there has been l...
Main Authors: | Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng |
---|---|
Other Authors: | School of Humanities |
Format: | Article |
Language: | English |
Published: | 2025 |
Subjects: | Arts and Humanities; Grammatical error analysis; Dialectal features |
Online Access: | https://hdl.handle.net/10356/181982 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181982 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181982 2025-01-11T17:00:29Z What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features. Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng. School of Humanities. Arts and Humanities; Grammatical error analysis; Dialectal features. Published version. This work was funded by Cambridge University Press & Assessment. 2025-01-05T03:17:23Z 2025-01-05T03:17:23Z 2024 Journal Article. Nguyen, L., Taslimipoor, S. & Yuan, Z. (2024). What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features. Linguistics Vanguard, 10(1), 465-477. https://dx.doi.org/10.1515/lingvan-2024-0001 2199-174X https://hdl.handle.net/10356/181982 10.1515/lingvan-2024-0001 2-s2.0-85205565042 1 10 465 477 en Linguistics Vanguard © 2024 The author(s), published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 International License. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Arts and Humanities; Grammatical error analysis; Dialectal features |
spellingShingle |
Arts and Humanities; Grammatical error analysis; Dialectal features; Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng; What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features |
description |
This paper proposes a novel use of grammatical error detection/correction (GED/GEC) tools to document non-standard English varieties. This is motivated by the fact that both GED/GEC technology and sociolinguistics aim to identify linguistic deviation from the so-called standard, yet there has been little communication between the two fields. We thus investigate whether state-of-the-art GED/GEC models can be effectively repurposed to automatically detect dialectal differences and assist linguistic missions in this regard. We explore this in the context of written Singaporean English (Cambridge Write & Improve), and spoken Vietnamese English (CanVEC), representing an established non-standard variety (Case 1) and an emerging variety (Case 2), respectively. We find that our GED/GEC systems are able to successfully detect a number of both established and new features. We further highlight some of the remaining areas that the systems overlook, as well as opportunities for future developments. This research bridges the gap between GED/GEC and dialectology, emphasizing their shared theme of linguistic deviation from a socially defined standard. |
author2 |
School of Humanities |
author_facet |
School of Humanities; Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng |
format |
Article |
author |
Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng |
author_sort |
Nguyen, Li |
title |
What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features |
publishDate |
2025 |
url |
https://hdl.handle.net/10356/181982 |
_version_ |
1821237178523451392 |