Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning

A growing body of research uses pre-trained language models to address few/zero-shot text classification problems. Most of these studies neglect the semantic information implicit in the natural-language names of class labels and develop a meta-learner from the input texts alone. In t...

Bibliographic Details
Main Authors: Basabain, Seham, Cambria, Erik, Alomar, Khalid, Hussain, Amir
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Arabic Text Classification; Contextual Embeddings
Online Access: https://hdl.handle.net/10356/172293
Institution: Nanyang Technological University
id sg-ntu-dr.10356-172293
record_format dspace
spelling sg-ntu-dr.10356-172293 2023-12-08T15:35:56Z
Version: Published version
Funding: Amir Hussain acknowledges the support of the U.K. Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/M026981/1, Grant EP/T021063/1 and Grant EP/T024917/1.
Record dates: 2023-12-05T04:42:14Z
Type: Journal Article
Citation: Basabain, S., Cambria, E., Alomar, K. & Hussain, A. (2023). Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning. Expert Systems, 40(8), e13329. https://dx.doi.org/10.1111/exsy.13329
ISSN: 0266-4720
Handle: https://hdl.handle.net/10356/172293
DOI: 10.1111/exsy.13329
Scopus ID: 2-s2.0-85158087267
Journal: Expert Systems
Rights: © 2023 The Authors. Expert Systems published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Arabic Text Classification
Contextual Embeddings
description A growing body of research uses pre-trained language models to address few/zero-shot text classification problems. Most of these studies neglect the semantic information implicit in the natural-language names of class labels and develop a meta-learner from the input texts alone. In this work, we demonstrate how label information can be used to extract an enhanced feature representation of the input text from a Transformer-based pre-trained language model such as AraBERT. We also show how this approach can improve performance when data resources are scarce, as in the Arabic language, and when the input text is short and carries little semantic information, as is the case with tweets. The work further applies zero-shot text classification to predict new classes with no training examples across different domains, including sarcasm detection and sentiment analysis, using the information in the last layer of a trained classifier in a transfer learning setting. Experiments show that our approach performs better on few-shot sentiment classification than baseline models and models trained without label-information augmentation. Moreover, the zero-shot implementation achieved an accuracy of up to 0.874 in Arabic sarcasm detection from a model trained on a sentiment analysis task.
author2 School of Computer Science and Engineering
format Article
author Basabain, Seham
Cambria, Erik
Alomar, Khalid
Hussain, Amir
author_sort Basabain, Seham
title Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning
publishDate 2023
url https://hdl.handle.net/10356/172293
_version_ 1784855593751674880
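
Illustrative note: the description above says that the natural-language names of class labels can be used alongside the input text. One common way to expose label names to a BERT-style encoder is to pair the text with each candidate label as a sentence pair. The sketch below shows only that pairing step plus a zero-shot style selection over candidate labels. It is not the authors' implementation: the function names, the English example text, and the token-overlap scorer are all hypothetical stand-ins, and a real system would replace the scorer with a fine-tuned model such as AraBERT.

```python
from typing import Callable, List, Tuple

# BERT-style special tokens, written out as plain strings for illustration.
CLS, SEP = "[CLS]", "[SEP]"


def build_label_augmented_inputs(text: str, label_names: List[str]) -> List[Tuple[str, str]]:
    """Pair the input text with every candidate label name.

    Returns (label, augmented_input) tuples, where the augmented input has the
    sentence-pair layout "[CLS] text [SEP] label [SEP]".
    """
    return [(label, f"{CLS} {text} {SEP} {label} {SEP}") for label in label_names]


def predict_label(text: str, label_names: List[str], score: Callable[[str], float]) -> str:
    """Return the label whose augmented input receives the highest score."""
    scored = {
        label: score(augmented)
        for label, augmented in build_label_augmented_inputs(text, label_names)
    }
    return max(scored, key=scored.get)


def toy_score(augmented: str) -> float:
    """Hypothetical placeholder scorer (NOT a language model).

    Counts the tokens shared by the text segment and the label segment.
    It exists only so the sketch runs without any model download.
    """
    parts = augmented.split(SEP)
    text_tokens = set(parts[0].replace(CLS, "").lower().split())
    label_tokens = set(parts[1].lower().split())
    return float(len(text_tokens & label_tokens))


if __name__ == "__main__":
    labels = ["positive", "negative"]
    for label, augmented in build_label_augmented_inputs("great phone", labels):
        print(label, "->", augmented)
    print(predict_label("the tone is negative here", labels, toy_score))
```

Because the candidate labels are passed in at prediction time, the same pairing step can be reused with label names that were never seen during training, which is the property a zero-shot setting relies on.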