Wiki saga: an approach for the digitisation, processing and visualisation of historical documents

A historical document contains information about past events which can be a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty four years of reports since its fir...

Bibliographic Details
Main Author: Tan, Daniel Yong Wen
Format: Thesis
Language: English
Published: Universiti Malaysia Sarawak, (UNIMAS) 2015
Subjects: T Technology (General)
Online Access: http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf
http://ir.unimas.my/id/eprint/10769/
Institution: Universiti Malaysia Sarawak
id my.unimas.ir.10769
record_format eprints
spelling my.unimas.ir.10769 2023-08-24T02:12:42Z http://ir.unimas.my/id/eprint/10769/ Wiki saga: an approach for the digitisation, processing and visualisation of historical documents. Tan, Daniel Yong Wen. T Technology (General). Universiti Malaysia Sarawak, (UNIMAS) 2015. Thesis. NonPeerReviewed. text. en. http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf Tan, Daniel Yong Wen (2015) Wiki saga: an approach for the digitisation, processing and visualisation of historical documents. Masters thesis, Universiti Malaysia Sarawak, (UNIMAS).
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic T Technology (General)
description A historical document contains information about past events and can serve as a source of reference. In this research, the selected historical document is the Sarawak Gazette, a monthly newspaper that reported on what happened in Sarawak. With one hundred and forty-four years of reports since its first publication on Friday, August 26, 1870, the Sarawak Gazette is one of the most important historical documents for information on the history of Sarawak. Gleaning information by laboriously going through printed pages is arduous in terms of time and effort. This research focuses on enabling a semantic search on the Sarawak Gazette, as a case study, for visualising a summary of what actually happened in Sarawak during a certain period. This research proposes a pipeline that involves digitising the Sarawak Gazette, a natural language processing step that extracts named entities, and a timeline generator that displays events as reported. Due to the difficulty of the task, the current state-of-the-art approach makes use of human power as part of a mass digitisation project by Google. A prototype system, Wiki SaGa, visualises the digitised documents in conjunction with the generated timeline. Through Wiki Saga, researchers who use the Sarawak Gazette can use the timeline display to search for specific information on an event that happened in Sarawak during a certain timeframe. Because named entities are extracted and displayed within events in a timeline, researchers can obtain a summary of each event. Visualising events in a timeline also allows semantic patterns to be recognised and related events to be identified. Through this research, Wiki Saga, a new archival and retrieval system, has been produced. In the process, a semi-automated approach for digitising all the documents has also been made available to researchers.
format Thesis
author Tan, Daniel Yong Wen
title Wiki saga: an approach for the digitisation, processing and visualisation of historical documents
title_sort wiki saga: an approach for the digitisation, processing and visualisation of historical documents
publisher Universiti Malaysia Sarawak, (UNIMAS)
publishDate 2015
url http://ir.unimas.my/id/eprint/10769/1/Daniel.pdf
http://ir.unimas.my/id/eprint/10769/
_version_ 1775627176891121664