A human-centric automated essay scoring and feedback system for the development of ethical reasoning
Although artificial Intelligence (AI) is prevalent and impacts facets of daily life, there is limited research on responsible and humanistic design, implementation, and evaluation of AI, especially in the field of education. Afterall, learning is inherently a social endeavor involving human interact...
Saved in:
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2023
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/169447 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | Although artificial Intelligence (AI) is prevalent and impacts facets of daily life, there is limited research on responsible and humanistic design, implementation, and evaluation of AI, especially in the field of education. Afterall, learning is inherently a social endeavor involving human interactions, rendering the need for AI designs to be approached from a humanistic perspective, or human-centered AI (HAI). This study focuses on the use of essays as a principal means for assessing learning outcomes, through students’ writing in subjects that require arguments and justifications, such as ethics and moral reasoning. We considered AI with a human and student-centric design for formative assessment, using an automated essay scoring (AES) and feedback system to address issues of running an online course with large enrolment and to provide efficient feedback to students with substantial time savings for the instructor. The development of the AES system occurred over four phases as part of an iterative design cycle. A mixed-method approach was used, allowing instructors to qualitatively code subsets of data for training a machine learning model based on the Random Forest algorithm. This model was subsequently used to automatically score more essays at scale. Findings show substantial agreement on inter-rater reliability before model training was conducted with acceptable training accuracy. The AES system’s performance was slightly less accurate than human raters but is improvable over multiple iterations of the iterative design cycle. This system has allowed instructors to provide formative feedback, which was not possible in previous runs of the course. |
---|