Extracting integrate and search healthcare knowledge from the web (III)

Currently, there is a trend where users post questions and edit questions via the use of online websites. These sites are also known as Community Question Answering (CQA) sites. CQA sites are beneficial to the web users because of the valuable knowledge accumulated from everybody around the world. H...

Bibliographic Details
Main Author:	Lim, Lionel Guan Chuan.
Other Authors:	School of Computer Engineering
Format:	Final Year Project
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Online Access:	http://hdl.handle.net/10356/51991
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-51991
record_format	dspace
spelling	sg-ntu-dr.10356-51991 2023-03-03T20:34:52Z Extracting integrate and search healthcare knowledge from the web (III) Lim, Lionel Guan Chuan. School of Computer Engineering Gao Cong DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing Currently, there is a trend where users post questions and edit questions via the use of online websites. These sites are also known as Community Question Answering (CQA) sites. CQA sites are beneficial to the web users because of the valuable knowledge accumulated from everybody around the world. However, as beneficial as CQA sites may be, it is difficult to extract only the information that is relevant to the web user. This project aims to consolidate healthcare information and allow web users to extract the information that is beneficial to them. To do so, web crawlers written in Java retrieve the URL, category, question, and answer from the health category of CQA sites. The crawled question-answer pairs are then saved in XML format. Lucene, a Java information retrieval (IR) library, is used for fast indexing of the XML documents. Another goal is to design a centralised search engine that can retrieve relevant healthcare information from CQA data. As this project is a continuation of Senior Lee Qian Hui's progress, I am tasked to utilise Information Retrieval models and to crawl data from more CQA sites that resemble WikiAnswers, which was previously implemented by Senior Lee. Bachelor of Engineering (Computer Science) 2013-04-19T02:42:20Z 2013-04-19T02:42:20Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/51991 en Nanyang Technological University 43 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
description	Currently, there is a trend where users post questions and edit questions via the use of online websites. These sites are also known as Community Question Answering (CQA) sites. CQA sites are beneficial to the web users because of the valuable knowledge accumulated from everybody around the world. However, as beneficial as CQA sites may be, it is difficult to extract only the information that is relevant to the web user. This project aims to consolidate healthcare information and allow web users to extract the information that is beneficial to them. To do so, web crawlers written in Java retrieve the URL, category, question, and answer from the health category of CQA sites. The crawled question-answer pairs are then saved in XML format. Lucene, a Java information retrieval (IR) library, is used for fast indexing of the XML documents. Another goal is to design a centralised search engine that can retrieve relevant healthcare information from CQA data. As this project is a continuation of Senior Lee Qian Hui's progress, I am tasked to utilise Information Retrieval models and to crawl data from more CQA sites that resemble WikiAnswers, which was previously implemented by Senior Lee.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Lim, Lionel Guan Chuan.
format	Final Year Project
author	Lim, Lionel Guan Chuan.
author_sort	Lim, Lionel Guan Chuan.
title	Extracting integrate and search healthcare knowledge from the web (III)
publishDate	2013
url	http://hdl.handle.net/10356/51991
_version_	1759853975998300160
