What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features

This paper proposes a novel use of grammatical error detection/correction (GED/GEC) tools to document non-standard English varieties. This is motivated by the fact that both GED/GEC technology and sociolinguistics aim to identify linguistic deviation from the so-called standard, yet there has been l...

Bibliographic Details
Main Authors: Nguyen, Li; Taslimipoor, Shiva; Yuan, Zheng
Other Authors: School of Humanities
Format: Article
Language: English
Published: 2025
Subjects: Arts and Humanities; Grammatical error analysis; Dialectal features
Online Access: https://hdl.handle.net/10356/181982
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181982
record_format dspace
spelling sg-ntu-dr.10356-181982 2025-01-11T17:00:29Z
Published version
This work was funded by Cambridge University Press & Assessment.
2025-01-05T03:17:23Z 2025-01-05T03:17:23Z 2024 Journal Article
Nguyen, L., Taslimipoor, S. & Yuan, Z. (2024). What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features. Linguistics Vanguard, 10(1), 465-477. https://dx.doi.org/10.1515/lingvan-2024-0001
2199-174X https://hdl.handle.net/10356/181982 10.1515/lingvan-2024-0001 2-s2.0-85205565042 1 10 465 477 en Linguistics Vanguard
© 2024 The author(s), published by De Gruyter. This work is licensed under the Creative Commons Attribution 4.0 International License.
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Arts and Humanities
Grammatical error analysis
Dialectal features
description This paper proposes a novel use of grammatical error detection/correction (GED/GEC) tools to document non-standard English varieties. This is motivated by the fact that both GED/GEC technology and sociolinguistics aim to identify linguistic deviation from the so-called standard, yet there has been little communication between the two fields. We thus investigate whether state-of-the-art GED/GEC models can be effectively repurposed to automatically detect dialectal differences and assist linguistic missions in this regard. We explore this in the context of written Singaporean English (Cambridge Write & Improve), and spoken Vietnamese English (CanVEC), representing an established non-standard variety (Case 1) and an emerging variety (Case 2), respectively. We find that our GED/GEC systems are able to successfully detect a number of both established and new features. We further highlight some of the remaining areas that the systems overlook, as well as opportunities for future developments. This research bridges the gap between GED/GEC and dialectology, emphasizing their shared theme of linguistic deviation from a socially defined standard.
author2 School of Humanities
author_facet School of Humanities
Nguyen, Li
Taslimipoor, Shiva
Yuan, Zheng
format Article
author Nguyen, Li
Taslimipoor, Shiva
Yuan, Zheng
author_sort Nguyen, Li
title What can NLP do for linguistics? Towards using grammatical error analysis to document non-standard English features
publishDate 2025
url https://hdl.handle.net/10356/181982
_version_ 1821237178523451392