Design and development of a NLP demo system

Bibliographic Details
Main Author: Lynn Htet Aung
Other Authors: Sun Aixin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/157371
Institution: Nanyang Technological University
Description
Summary: Natural Language Processing (NLP) is a subfield combining linguistics, computer science and artificial intelligence, codifying the interactions between computers and human language. With growing use cases in the 21st century in applications such as speech recognition, big data processing, unstructured data mining and natural language generation, NLP has cemented itself as an important field of research and development for programmers. NLP techniques can be broken down into several umbrella tasks, including text/speech processing, morphological analysis, syntactic analysis, lexical semantics and relational semantics. Multiple open-source, regularly maintained packages are available for developers to build their own NLP use cases, covering a multitude of such techniques. However, the challenge in unifying these tasks and packages lies in the lack of a consolidated demo environment. In education specifically, demonstrating in real time the output of various NLP packages for a given input helps students connect theoretical concepts to real applications. Many students are visual learners, and such demo systems are highly valuable in academia and teaching. Because the many NLP package providers differ in the nuances of their functionality, showcasing their outputs in a side-by-side comparison is hard to accomplish. Demo sites are also scattered, some built by the official provider and others by third-party developers, which makes it challenging for lecturers to rely on a gold-standard demo workflow. Many of these demo sites also use wildly different visualizations, which are key to many NLP outputs, and the differing visual identities between package outputs further confuse the audience. Additionally, some of the NLP packages are not well maintained, so the public API endpoints they provide may not be robust enough for demo purposes.