Neural machine translation in grammar error correction
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/74074 |
Institution: | Nanyang Technological University |
Summary: | Grammar Error Correction (GEC) is the task of detecting and correcting grammatical errors in text written by non-native English writers. Traditional approaches, which use separate classifiers for different error types, can achieve high precision, but they cannot correct errors based on sentence context or handle errors such as non-idiomatic phrasing and word redundancy. This project studies the use of neural machine translation (NMT) for the GEC problem and reproduces two existing NMT models: word-based machine translation and character-based machine translation. The core component of NMT is an encoder-decoder recurrent neural network with an attention mechanism. Word-based machine translation is more popular and is applied to many problems solvable by NMT, such as translation and summarization, but it can encounter out-of-vocabulary (OOV) words. Character-based NMT, by operating at the character level, can handle OOV words because its vocabulary is small. The models are evaluated on the Lang-8 development set, the JFLEG corpus, and common English grammar errors. A web prototype system is also developed to demonstrate how the model works. |
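
The out-of-vocabulary contrast described in the summary can be illustrated with a minimal sketch. The toy sentences, the `<unk>` placeholder, and the `encode` helper are assumptions made for illustration only; they are not taken from the project, which trains on Lang-8 data with full NMT models.

```python
# Toy illustration of word-level vs character-level vocabularies.
# All data and names here are hypothetical, not from the project.
train_sentences = [
    "she goes to school every day",
    "he went to the market",
]
test_sentence = "she goed to scool"  # learner text with two misspelled words

UNK = "<unk>"  # assumed placeholder for out-of-vocabulary tokens

# Word-level vocabulary: every distinct word seen in training.
word_vocab = {w for s in train_sentences for w in s.split()}
# Character-level vocabulary: every distinct character seen in training.
char_vocab = {c for s in train_sentences for c in s}


def encode(tokens, vocab):
    """Replace any token missing from the vocabulary with UNK."""
    return [t if t in vocab else UNK for t in tokens]


word_encoded = encode(test_sentence.split(), word_vocab)
char_encoded = encode(list(test_sentence), char_vocab)

print(word_encoded)             # ['she', '<unk>', 'to', '<unk>']
print(char_encoded.count(UNK))  # 0
```

In this sketch the misspelled words collapse to `<unk>` at the word level, so a word-based model would lose the information it needs to correct them, while every character of the same sentence is still covered by the much smaller character vocabulary.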