Spatial aware information retrieval for community-based Q&A (SIRCQA)

As social media grow in popularity on the web, there is an abundance of user-generated content that Information Retrieval (IR) could use to satisfy information needs. One particular form of social media, Community-based Question and Answering (CQA), is a possible source to explore. It...

Bibliographic Details
Main Author: Ang, Eugene Soon Leong
Other Authors: School of Computer Engineering
Format: Final Year Project
Language: English
Published: 2012
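Subjects: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval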
Online Access: http://hdl.handle.net/10356/50820
Institution: Nanyang Technological University
id sg-ntu-dr.10356-50820
record_format dspace
spelling sg-ntu-dr.10356-50820 2023-03-03T20:26:01Z Spatial aware information retrieval for community-based Q&A (SIRCQA) Ang, Eugene Soon Leong School of Computer Engineering Cong Gao DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2012-11-15T07:54:05Z 2012-11-15T07:54:05Z 2012 2012 Final Year Project (FYP) http://hdl.handle.net/10356/50820 en Nanyang Technological University 73 p. application/pdf
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
description As social media grow in popularity on the web, an abundance of user-generated content has become available that Information Retrieval (IR) can use to satisfy information needs. One particular form of social media, Community-based Question and Answering (CQA), is a possible source to explore. It provides an online platform for users to ask questions and for other users to post answers. Because similar questions may be asked by multiple users, the CQA archive can be searched for previously answered questions when a later user expresses a similar query. Another increasingly popular technique for improving IR is Spatial Aware Retrieval. In addition to keyword matching on query terms, this method compares the spatial context found in a document with the spatial context intended by the query. This further increases user satisfaction by promoting more relevant search results for queries that require spatially context-dependent information. Although much research has been performed on spatial aware retrieval for web documents, very little investigation has been done for CQA entries. This report therefore analyzes the techniques of Spatial Aware Retrieval in order to propose a suitable retrieval operation for Spatial Aware Information Retrieval for CQA (SIRCQA). Three considerations were examined. First, traditional IR is known to treat every term equally (bag of words) when performing keyword matching; this study analyzes the effect of placing different emphasis on the location terms identified in the query. Next, a retrieval operation is proposed that builds on the IR similarity function by combining the weighted scores of the documents (CQA entries) from both textual and spatial relevance. Lastly, the study examines whether to incorporate query expansion, which adds other similar or implied location terms deduced from the spatial relationship cue (term) found in queries. From the findings, measures were constructed and used to improve the search results. The proposed retrieval operations were evaluated on the Yahoo! Answers dataset using the Precision@K and Mean Average Precision (MAP) performance metrics. The experimental results showed that the proposed retrieval operation performed significantly better in precision (t-test: p<0.05) and handled multiple locations more flexibly than conventional "bag of words" retrieval.
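The scoring and evaluation ideas named in the description can be illustrated with a minimal Python sketch. It is not taken from the report. The linear interpolation in `combined_score`, the weight `alpha`, and all function and document names are assumptions made for illustration; the report's actual weighting scheme is not given in this record. `precision_at_k` and `average_precision` follow the standard textbook definitions of Precision@K and (Mean) Average Precision.

```python
from typing import Dict, List, Sequence, Set


def combined_score(text_score: float, spatial_score: float, alpha: float = 0.5) -> float:
    """Blend textual and spatial relevance by linear interpolation.

    Assumption: the linear form and the weight `alpha` are illustrative only;
    the report may combine the two scores differently.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * text_score + (1.0 - alpha) * spatial_score


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values taken at each rank holding a relevant document."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Average of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, judgments.get(query_id, set()))
        for query_id, ranked in runs.items()
    )
    return total / len(runs)


if __name__ == "__main__":
    # Hypothetical ranking of CQA entries for one query.
    ranked_entries = ["d1", "d2", "d3", "d4"]
    relevant_entries = {"d1", "d3"}
    print(precision_at_k(ranked_entries, relevant_entries, k=2))
    print(average_precision(ranked_entries, relevant_entries))
```

In such a setup, each CQA entry would be ranked by `combined_score`, and the resulting rankings would be compared against a plain keyword baseline using the two metrics.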