Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning
A growing body of research uses pre-trained language models to address few/zero-shot text classification problems. Most of these studies neglect the semantic information hidden implicitly in the natural language names of class labels and develop a meta-learner solely from the input texts. In t...
Main Authors: | Basabain, Seham; Cambria, Erik; Alomar, Khalid; Hussain, Amir |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2023 |
Subjects: | Engineering::Computer science and engineering; Arabic Text Classification; Contextual Embeddings |
Online Access: | https://hdl.handle.net/10356/172293 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-172293 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-172293 2023-12-08T15:35:56Z
Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning
Basabain, Seham; Cambria, Erik; Alomar, Khalid; Hussain, Amir
School of Computer Science and Engineering
Engineering::Computer science and engineering; Arabic Text Classification; Contextual Embeddings
A growing body of research uses pre-trained language models to address few/zero-shot text classification problems. Most of these studies neglect the semantic information hidden implicitly in the natural language names of class labels and develop a meta-learner solely from the input texts. In this work, we demonstrate how label information can be used to extract an enhanced feature representation of the input text from a Transformer-based pre-trained language model such as AraBERT. We also show how this approach can improve performance when data resources are scarce, as in the Arabic language, and when the input text is short and carries little semantic information, as is the case with tweets. The work also applies zero-shot text classification to predict new classes with no training examples across different domains, including sarcasm detection and sentiment analysis, using the information in the last layer of a trained classifier in a transfer learning setting. Experiments show that our approach performs better on few-shot sentiment classification than baseline models and models trained without label-information augmentation. Moreover, the zero-shot implementation achieved an accuracy of up to 0.874 in Arabic sarcasm detection using a model trained on a sentiment analysis task.
Published version
Amir Hussain acknowledges the support of the U.K. Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/M026981/1, Grant EP/T021063/1 and Grant EP/T024917/1.
2023-12-05T04:42:14Z 2023-12-05T04:42:14Z 2023 Journal Article
Basabain, S., Cambria, E., Alomar, K. & Hussain, A. (2023). Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning. Expert Systems, 40(8), e13329-. https://dx.doi.org/10.1111/exsy.13329
0266-4720 https://hdl.handle.net/10356/172293 10.1111/exsy.13329 2-s2.0-85158087267 8 40 e13329 en Expert Systems
© 2023 The Authors. Expert Systems published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering; Arabic Text Classification; Contextual Embeddings |
description |
A growing body of research uses pre-trained language models to address few/zero-shot text classification problems. Most of these studies neglect the semantic information hidden implicitly in the natural language names of class labels and develop a meta-learner solely from the input texts. In this work, we demonstrate how label information can be used to extract an enhanced feature representation of the input text from a Transformer-based pre-trained language model such as AraBERT. We also show how this approach can improve performance when data resources are scarce, as in the Arabic language, and when the input text is short and carries little semantic information, as is the case with tweets. The work also applies zero-shot text classification to predict new classes with no training examples across different domains, including sarcasm detection and sentiment analysis, using the information in the last layer of a trained classifier in a transfer learning setting. Experiments show that our approach performs better on few-shot sentiment classification than baseline models and models trained without label-information augmentation. Moreover, the zero-shot implementation achieved an accuracy of up to 0.874 in Arabic sarcasm detection using a model trained on a sentiment analysis task. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Basabain, Seham; Cambria, Erik; Alomar, Khalid; Hussain, Amir |
format |
Article |
author |
Basabain, Seham; Cambria, Erik; Alomar, Khalid; Hussain, Amir |
author_sort |
Basabain, Seham |
title |
Enhancing Arabic-text feature extraction utilizing label-semantic augmentation in few/zero-shot learning |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/172293 |
_version_ |
1784855593751674880 |