Processing and mining large-scale bibliography data

Extracting information from large collections of semi-structured data can be a considerable challenge when much of the information is hidden and requires interpretation. This project has three primary objectives. The first is to parse a large XML file from the Digital Bibliography & Library Project...

Bibliographic Details
Main Author: Kerk, Wei Yang
Other Authors: Ke Yiping, Kelly
Format: Final Year Project
Language: English
Published: 2016
Subjects: DRNTU::Engineering
Online Access: http://hdl.handle.net/10356/66745
Institution: Nanyang Technological University
id sg-ntu-dr.10356-66745
record_format dspace
spelling sg-ntu-dr.10356-66745; 2023-03-03T20:28:00Z; Processing and mining large-scale bibliography data; Kerk, Wei Yang; Ke Yiping, Kelly; School of Computer Engineering; DRNTU::Engineering; Bachelor of Engineering (Computer Science); 2016-04-25T03:01:01Z; 2016-04-25T03:01:01Z; 2016; Final Year Project (FYP); http://hdl.handle.net/10356/66745; en; Nanyang Technological University; 58 p.; application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
description Extracting information from large collections of semi-structured data can be a considerable challenge when much of the information is hidden and requires interpretation. This project has three primary objectives. The first is to parse a large XML file from the Digital Bibliography & Library Project (DBLP) into a purpose-designed database system. The second is to classify each author’s ethnicity using an external name-based classification system; the DBLP authors are then analysed to identify trends and patterns. The last is to conduct a link prediction experiment on collaboration between authors using well-known predictors. By using ethnicity as a feature for collaboration, the experiment examines whether any relationship exists between ethnicity and collaboration. This information can be shared with the community for further research or future experiments. This report consists of a detailed explanation of the implementations used to achieve the objectives, the challenges encountered and the solutions to overcome them, results, analysis and recommendations for future work. Finally, this report highlights that an author’s ethnicity is a relevant factor in collaboration for some ethnicities.
author2 Ke Yiping, Kelly
author_facet Ke Yiping, Kelly
Kerk, Wei Yang
format Final Year Project
author Kerk, Wei Yang
author_sort Kerk, Wei Yang
title Processing and mining large-scale bibliography data
publishDate 2016
url http://hdl.handle.net/10356/66745
_version_ 1759855037906944000