Term importance for transformer-based QA retrieval : A case study of StackExchange
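Question-answering (QA) retrieval is the task of retrieving the most relevant answer to a given question from a collection of answers. Various approaches to QA retrieval have been developed recently. One successful and popular model is Contextualized Late Interaction over BERT (ColBERT), a transformer-based approach that adopts a query-document scoring mechanism that retains the granularity of transformer matching, whilst improving on efficiency. However, one key limitation is that it requires further fine-tuning for new query or collection types. In this work, we explore and propose several non-parametric retrieval augmentation methods based on explicit signals of term importance that improve over ColBERT's baseline performance. In particular, we consider the QA retrieval task in the context of StackExchange question-answering forum, verifying the effectiveness of our methods in this setting.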
Main Authors: | TAN, Bryan Zhi Yang, LAUW, Hady W. |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2024 |
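Subjects: | Information retrieval; Retrieval models; Retrieval ranking; Question-answering; Neural information retrieval; Term importance; Weighted late interaction; Artificial Intelligence and Robotics; Numerical Analysis and Scientific Computing |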
Online Access: | https://ink.library.smu.edu.sg/sis_research/9855 https://ink.library.smu.edu.sg/context/sis_research/article/10855/viewcontent/webconf24shp__1_.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-10855 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-10855 2024-12-24T03:19:16Z Term importance for transformer-based QA retrieval : A case study of StackExchange TAN, Bryan Zhi Yang LAUW, Hady W. Question-answering (QA) retrieval is the task of retrieving the most relevant answer to a given question from a collection of answers. Various approaches to QA retrieval have been developed recently. One successful and popular model is Contextualized Late Interaction over BERT (ColBERT), a transformer-based approach that adopts a query-document scoring mechanism that retains the granularity of transformer matching, whilst improving on efficiency. However, one key limitation is that it requires further fine-tuning for new query or collection types. In this work, we explore and propose several non-parametric retrieval augmentation methods based on explicit signals of term importance that improve over ColBERT's baseline performance. In particular, we consider the QA retrieval task in the context of StackExchange question-answering forum, verifying the effectiveness of our methods in this setting. 2024-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9855 info:doi/10.1145/3589335.3651568 https://ink.library.smu.edu.sg/context/sis_research/article/10855/viewcontent/webconf24shp__1_.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Information retrieval Retrieval models Retrieval ranking Question-answering Neural information retrieval Term importance Weighted late interaction Artificial Intelligence and Robotics Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Information retrieval Retrieval models Retrieval ranking Question-answering Neural information retrieval Term importance Weighted late interaction Artificial Intelligence and Robotics Numerical Analysis and Scientific Computing |
description |
Question-answering (QA) retrieval is the task of retrieving the most relevant answer to a given question from a collection of answers. Various approaches to QA retrieval have been developed recently. One successful and popular model is Contextualized Late Interaction over BERT (ColBERT), a transformer-based approach that adopts a query-document scoring mechanism that retains the granularity of transformer matching, whilst improving on efficiency. However, one key limitation is that it requires further fine-tuning for new query or collection types. In this work, we explore and propose several non-parametric retrieval augmentation methods based on explicit signals of term importance that improve over ColBERT's baseline performance. In particular, we consider the QA retrieval task in the context of StackExchange question-answering forum, verifying the effectiveness of our methods in this setting. |
format |
text |
author |
TAN, Bryan Zhi Yang LAUW, Hady W. |
author_sort |
TAN, Bryan Zhi Yang |
title |
Term importance for transformer-based QA retrieval : A case study of StackExchange |
title_sort |
term importance for transformer-based qa retrieval : a case study of stackexchange |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2024 |
url |
https://ink.library.smu.edu.sg/sis_research/9855 https://ink.library.smu.edu.sg/context/sis_research/article/10855/viewcontent/webconf24shp__1_.pdf |
_version_ |
1821237252543479808 |