Question classification of CoQa (QCOC) dataset

This paper proposes a new dataset for question classification. Named QCoC (Question Classification of CoQA), the dataset is built on Stanford’s CoQA (A Conversational Question Answering Challenge) dataset. QCoC contains 116630 datapoints (the combined total of question-answer pairs...

Bibliographic Details
Main Authors:	Abbas Saliimi, Lokman; Mohamed Ariff, Ameedeen; Ngahzaifa, Ab. Ghani
Format:	Conference or Workshop Item
Language:	English
Published:	IEEE 2021
Subjects:	QA76 Computer software
Online Access:	http://umpir.ump.edu.my/id/eprint/33107/1/Question%20classification%20of%20coqa_FULL.pdf http://umpir.ump.edu.my/id/eprint/33107/2/Question%20classification%20of%20coqa.pdf http://umpir.ump.edu.my/id/eprint/33107/ http://10.1109/ICSECS52883.2021.00123
Institution:	Universiti Malaysia Pahang

id	my.ump.umpir.33107
record_format	eprints
spelling	my.ump.umpir.33107 2022-06-22T01:57:52Z http://umpir.ump.edu.my/id/eprint/33107/ PeerReviewed. Abbas Saliimi, Lokman and Mohamed Ariff, Ameedeen and Ngahzaifa, Ab. Ghani (2021) Question classification of CoQa (QCOC) dataset. In: 2021 International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM), 24-26 August 2021, Pekan. pp. 1-2. ISBN 978-166541407-4 http://10.1109/ICSECS52883.2021.00123
institution	Universiti Malaysia Pahang
building	UMP Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang
content_source	UMP Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	QA76 Computer software
spellingShingle	QA76 Computer software; Abbas Saliimi, Lokman; Mohamed Ariff, Ameedeen; Ngahzaifa, Ab. Ghani; Question classification of CoQa (QCOC) dataset
description	This paper proposes a new dataset for question classification. Named QCoC (Question Classification of CoQA), the dataset is built on Stanford’s CoQA (A Conversational Question Answering Challenge) dataset. QCoC contains 116630 datapoints, the combined total of question-answer pairs in the CoQA training and evaluation sets. Common question classification datasets classify a question by the knowledge in its paired answer (the semantics of the answer’s context). QCoC instead classifies by the answer’s features (the semantics and syntax of the answer’s type). The paper reviews question classification datasets and QA datasets, and justifies the selection of CoQA as the base for QCoC. It then specifies QCoC in subsections on class definition, classification method, and results. To the authors’ knowledge, no such dataset existed previously. The paper suggests that this type of dataset is useful for addressing the abstractive-answer issue in Question-Answering (QA) systems: while factual answers can be produced directly by a regular QA system, abstractive answers need additional components. Although the issue is recognized, the lack of a suitable dataset is perhaps the reason this direction has not been pursued. With the QCoC dataset made publicly available, the authors hope the direction is open for further exploration.
format	Conference or Workshop Item
author	Abbas Saliimi, Lokman; Mohamed Ariff, Ameedeen; Ngahzaifa, Ab. Ghani
author_facet	Abbas Saliimi, Lokman; Mohamed Ariff, Ameedeen; Ngahzaifa, Ab. Ghani
author_sort	Abbas Saliimi, Lokman
title	Question classification of CoQa (QCOC) dataset
title_short	Question classification of CoQa (QCOC) dataset
title_full	Question classification of CoQa (QCOC) dataset
title_fullStr	Question classification of CoQa (QCOC) dataset
title_full_unstemmed	Question classification of CoQa (QCOC) dataset
title_sort	question classification of coqa (qcoc) dataset
publisher	IEEE
publishDate	2021
url	http://umpir.ump.edu.my/id/eprint/33107/1/Question%20classification%20of%20coqa_FULL.pdf http://umpir.ump.edu.my/id/eprint/33107/2/Question%20classification%20of%20coqa.pdf http://umpir.ump.edu.my/id/eprint/33107/ http://10.1109/ICSECS52883.2021.00123
_version_	1736833891411427328

Similar Items