Information extraction from semi and unstructured data sources: a systematic literature review

Millions of structured, semi-structured and unstructured documents are produced around the globe every day. They come from individuals as well as from research societies such as IEEE, Elsevier, Springer and Wiley, through which scientific documents are published in enormous numbers....

Bibliographic Details
Main Authors: Zaman, Gohar; Mahdin, Hairulnizam; Hussain, Khalid; Rahman, Atta-ur-
Format: Article
Language: English
Published: ICIC-EL Office 2020
Subjects: T Technology (General)
Online Access: http://eprints.uthm.edu.my/6551/1/AJ%202020%20%28348%29.pdf
http://eprints.uthm.edu.my/6551/
https://dx.doi.org/10.24507/icicel.14.06.593
Institution: Universiti Tun Hussein Onn Malaysia
id my.uthm.eprints.6551
record_format eprints
spelling my.uthm.eprints.6551 2022-03-01T01:14:12Z http://eprints.uthm.edu.my/6551/ Zaman, Gohar and Mahdin, Hairulnizam and Hussain, Khalid and Rahman, Atta-ur- (2020) Information extraction from semi and unstructured data sources: a systematic literature review. ICIC Express Letters, 14 (6). pp. 593-603. ISSN 1881-803X. Article, PeerReviewed, text, en. http://eprints.uthm.edu.my/6551/1/AJ%202020%20%28348%29.pdf https://dx.doi.org/10.24507/icicel.14.06.593
institution Universiti Tun Hussein Onn Malaysia
building UTHM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Tun Hussein Onn Malaysia
content_source UTHM Institutional Repository
url_provider http://eprints.uthm.edu.my/
language English
topic T Technology (General)
description Millions of structured, semi-structured and unstructured documents are produced around the globe every day. They come from individuals as well as from research societies such as IEEE, Elsevier, Springer and Wiley, through which scientific documents are published in enormous numbers. These documents are a huge resource of scientific knowledge for research communities and interested users around the world. However, because of their massive volume and varying formats, search engines struggle to index them, which makes information retrieval inefficient, tedious and time-consuming. Information extraction from such documents is among the most active areas of research in data and text mining. As the number of such documents grows rapidly, more sophisticated information extraction techniques are necessary. This review summarizes existing state-of-the-art information extraction techniques and highlights their limitations. From these limitations, a research gap is formulated for researchers in the information extraction domain.
format Article
author Zaman, Gohar
Mahdin, Hairulnizam
Hussain, Khalid
Rahman, Atta-ur-
title Information extraction from semi and unstructured data sources: a systematic literature review
publisher ICIC-EL Office
publishDate 2020
url http://eprints.uthm.edu.my/6551/1/AJ%202020%20%28348%29.pdf
http://eprints.uthm.edu.my/6551/
https://dx.doi.org/10.24507/icicel.14.06.593