English essay scoring

Essays are structured, constructed responses that provide comprehensive insight into a person's grasp of a particular topic. It is for this reason that essays are a highly preferred mode of assessment for educational institutions and corporate organizations alike. The underlying issue is the need to hire human graders, which is time-consuming and costly. Although several automated scoring techniques exist, most rely on human judgement for verification, and computers do not always score with high accuracy because some traits and qualities of essays require human understanding to decipher. This project explores the process of automatically scoring essays. It identifies the features that differentiate high-quality from low-quality essays by analyzing scored writing from the ICAME, KAGGLE, and WECCL20 corpora: good and bad essays were found to differ in word choice, style, grammar, word count, sentence structure, punctuation, and spelling. Considering feasibility and resource constraints, a subset of these was chosen as features for machine learning. Three approaches, Multinomial Naïve Bayes, Bernoulli Naïve Bayes, and Extreme Learning Machines (ELM), were used to train classifiers for binary categorization, initially separating essays into good and bad. ELM was found to be the most accurate and was therefore used to combine the individual components, giving an overall accuracy of 75.82%. The scores obtained from the classifier's decision function were then normalized to the desired scoring range, and the difference between the actual and expected scores was analyzed. The results show that the approach works better for narrow scoring bands such as 0-4 than for wider ones such as 0-30. The report concludes with a workable scoring model that takes a range of factors into account, although additional features are required; limitations are stated and suggestions for further improvement are provided.
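
The abstract outlines a three-stage pipeline: extract surface features from an essay, classify the essay as good or bad, and normalize the classifier's score onto a target band. The sketch below illustrates that shape only; the feature set, the toy data, and the use of Multinomial Naïve Bayes (one of the three approaches named, standing in here for the ELM combiner) are assumptions for illustration, not the thesis's actual implementation.

```python
# Minimal sketch (assumptions noted above): surface features + Multinomial
# Naive Bayes for binary good/bad classification, then mapping the
# classifier's confidence onto a 0-4 scoring band.
import re
import numpy as np
from sklearn.naive_bayes import MultinomialNB


def surface_features(essay: str) -> list:
    """Illustrative surface features: word count, sentence count,
    punctuation count, and vocabulary size (stand-ins for the word-count,
    sentence-structure, punctuation, and word-choice features in the report)."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    punct = re.findall(r"[,.;:!?]", essay)
    vocab = {w.lower() for w in words}
    return [len(words), len(sentences), len(punct), len(vocab)]


# Toy corpus: (essay text, label) pairs with 1 = good, 0 = bad.
toy_corpus = [
    ("The essay presents a clear argument. Each paragraph develops one idea, "
     "and the conclusion follows from the evidence.", 1),
    ("Good structure, varied vocabulary, and correct punctuation throughout. "
     "The writer supports every claim with an example.", 1),
    ("essay is good i think becuse it have many word and stuff", 0),
    ("bad no punctuation no structure words repeated words repeated", 0),
]

X = np.array([surface_features(text) for text, _ in toy_corpus])
y = np.array([label for _, label in toy_corpus])

clf = MultinomialNB().fit(X, y)


def score_essay(essay: str, band_max: int = 4) -> float:
    """Normalize the classifier's confidence for the 'good' class to a
    0..band_max band (the report found narrow bands such as 0-4 worked
    better than wide ones such as 0-30)."""
    p_good = clf.predict_proba(np.array([surface_features(essay)]))[0, 1]
    return round(float(p_good) * band_max, 1)


print(score_essay("A short, well punctuated reply. It states its point clearly."))
```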

Bibliographic Details
Main Author: Kumar Keerthana
Other Authors: School of Computer Engineering; Kim Jung-Jae
Format: Final Year Project
Language: English
Published: 2014
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/59133
Institution: Nanyang Technological University
Degree: Bachelor of Engineering (Computer Science)
Physical Description: 91 p. (PDF)