Open-set pattern recognition and its application in information extraction from text

Bibliographic Details
Main Author: Ke, Yizhen
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access: https://hdl.handle.net/10356/164148
Institution: Nanyang Technological University
Description
Summary: In traditional supervised learning, the training set contains the same classes that appear in the test set. In the real world, however, a classifier may encounter previously unseen classes, and a closed-set classifier is likely to make errors because it assigns such samples to one of the known classes. An open-set classifier is designed to classify samples from known classes accurately and to reject samples from unseen classes. Open-set recognition has so far seen few applications in text classification. The goal of this paper is to apply open-set recognition to text classification tasks. The paper first reviews related work on text classification and open-set recognition. It then adopts GloVe to map documents into a vector space. Because CNN and LSTM both perform well in text classification, a preliminary experiment was conducted, and CNN, which performed better, was selected as the base model. On this basis, the SVDD and OpenMax methods are applied to the 10 domains and 20 domains of the data set, respectively, and compared with existing text classifiers. SVDD achieves results similar to those of the existing SVM-based open-set classifier. The performance of OpenMax as a text classifier does not fluctuate greatly with openness, and its accuracy is good.