Spatial aware information retrieval for community-based Q&A (SIRCQA)
As social media grow more popular on the web, an abundance of user-generated content has become available that Information Retrieval (IR) can draw on to satisfy information needs. One particular form of social media, Community-based Question and Answering (CQA), is a possible source to explore. It...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2012 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/50820 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-50820 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-50820 2023-03-03T20:26:01Z Spatial aware information retrieval for community-based Q&A (SIRCQA) Ang, Eugene Soon Leong School of Computer Engineering Cong Gao DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2012-11-15T07:54:05Z 2012-11-15T07:54:05Z 2012 2012 Final Year Project (FYP) http://hdl.handle.net/10356/50820 en Nanyang Technological University 73 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |
description |
As social media grow more popular on the web, an abundance of user-generated content has become available that Information Retrieval (IR) can draw on to satisfy information needs. One particular form of social media, Community-based Question and Answering (CQA), is a possible source to explore. It provides an online platform where users ask questions and other users post answers. Because similar questions are often asked by multiple users, the CQA archive can be searched for previously answered questions when a later user expresses a similar query. Another increasingly popular technique for improving IR is Spatial Aware Retrieval. In addition to matching the keywords of the query terms, this method compares the spatial context found in a document with the spatial context intended by the query. This increases user satisfaction by promoting more relevant search results for queries that require spatially context-dependent information. Although much research has been done on spatial aware retrieval for web documents, very little has investigated CQA entries. This report therefore analyzes the techniques of Spatial Aware Retrieval in order to propose a suitable retrieval operation for Spatial Aware Information Retrieval for CQA (SIRCQA). Three considerations were examined. First, traditional IR is known to treat every term equally (bag of words) when performing keyword matching; this study analyzes the effect of placing different emphasis on the location terms identified in the query. Next, a proposed retrieval operation builds on the IR similarity function by combining weighted scores for each document (CQA entry) from both textual and spatial relevance.
Lastly, the study examined whether to incorporate query expansion, adding other similar or implied location terms deduced from the spatial relationship cue (term) found in the query. Based on the findings, measures were constructed and used to improve the search results. The proposed retrieval operations were evaluated on the Yahoo! Answers dataset using the Precision@K and Mean Average Precision (MAP) performance metrics. The experimental results showed that the proposed retrieval operation achieved significantly better precision (t-test: p<0.05) and greater flexibility in handling multiple locations than conventional "bag of words" retrieval. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Ang, Eugene Soon Leong |
format |
Final Year Project |
author |
Ang, Eugene Soon Leong |
title |
Spatial aware information retrieval for community-based Q&A (SIRCQA) |
publishDate |
2012 |
url |
http://hdl.handle.net/10356/50820 |
_version_ |
1759858020130488320 |
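
The abstract describes combining textual and spatial relevance into one score and evaluating the ranking with Precision@K and Mean Average Precision (MAP). The following is a minimal Python sketch of those ideas, not the report's actual implementation. The linear mix in `combined_score`, its `alpha` weight, and all function names are assumptions made for illustration; the record does not state the formula the author used. Average precision is normalized here by the number of relevant documents, which is one common convention.

```python
from typing import Dict, List, Sequence, Set


def combined_score(text_score: float, spatial_score: float, alpha: float = 0.5) -> float:
    """Mix textual and spatial relevance linearly.

    ASSUMPTION: a linear combination with weight `alpha` is only one
    plausible choice; the report's actual function is not given here.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * text_score + (1.0 - alpha) * spatial_score


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Average of per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    scores = [
        average_precision(ranked, judgments.get(query_id, set()))
        for query_id, ranked in runs.items()
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical (text_score, spatial_score) pairs for three CQA entries.
    entries = {"e1": (0.9, 0.1), "e2": (0.6, 0.8), "e3": (0.2, 0.9)}
    ranked = sorted(entries, key=lambda e: combined_score(*entries[e]), reverse=True)
    print(ranked)
    print(precision_at_k(ranked, {"e1", "e2"}, k=2))
    print(average_precision(ranked, {"e1", "e2"}))
```

Raising `alpha` favors keyword matching and lowering it favors spatial agreement, so sweeping it is one way to explore the effect of placing different emphasis on location terms.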