Time expression extraction from free text
Today, extracting time expressions from free text remains an active topic in information retrieval and in natural language processing applications such as question answering systems. Even messaging applications such as WhatsApp now extract time expressions to allow their users to add the dates i...
Main Author: | Chua, Chin Aik |
---|---|
Other Authors: | Sun Aixin |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
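Subjects: | DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |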
Online Access: | http://hdl.handle.net/10356/74079 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-74079 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-74079 2023-03-03T20:27:34Z Time expression extraction from free text Chua, Chin Aik Sun Aixin School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Bachelor of Engineering (Computer Science) 2018-04-24T05:33:27Z 2018-04-24T05:33:27Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74079 en Nanyang Technological University 29 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval |
description |
Extracting time expressions from free text remains an active topic in information retrieval and in natural language processing applications such as question answering systems. Even messaging applications such as WhatsApp now extract time expressions so that their users can add the dates to a calendar.
Extracting time expressions requires the ability to recognize them and convert them into a normalized form. Systems that do this are called temporal taggers. Several temporal taggers, built on different approaches, are available on the Internet.
In this project, existing temporal taggers are evaluated against benchmarks, which are annotated datasets. The experiment gauges which approach and which tagger available on the Web perform best. It also exposes the limitations of the taggers and motivates solutions to them.
Based on the experiment, we observe that rule-based taggers performed better than the statistical systems. We also found that most of the temporal taggers are unable to recognize colloquial words such as “nw” and “tmrw”. Such words are increasingly common in everyday use, so it is important for taggers to recognize them. In addition, taggers incorrectly normalize a few ambiguous terms, such as “a day’s march”.
A few methods for addressing these limitations are shown in the implementation part of the report, where the best temporal tagger is extended. This describes how future algorithms could overcome the limitations and improve performance. |
author2 |
Sun Aixin |
author_facet |
Sun Aixin Chua, Chin Aik |
format |
Final Year Project |
author |
Chua, Chin Aik |
title |
Time expression extraction from free text |
publishDate |
2018 |
url |
http://hdl.handle.net/10356/74079 |
_version_ |
1759857385448407040 |
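
The description in this record says that a temporal tagger must recognize a time expression and convert it into a normalized form, and that colloquial words such as “nw” and “tmrw” are commonly missed. The Python sketch below illustrates that two-step idea in a rule-based style. It is not the report's implementation: the names `COLLOQUIAL`, `RELATIVE_DAYS`, and `tag` are hypothetical, the lookup tables are tiny illustrative samples, and treating “now” as the reference day is a simplifying assumption.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup table: colloquial spellings mapped to standard words.
COLLOQUIAL = {"nw": "now", "tmrw": "tomorrow", "tmr": "tomorrow", "2day": "today"}

# Hypothetical rules: each standard word maps to a day offset from the
# reference date. Mapping "now" to offset 0 is a day-level simplification.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "now": 0, "tomorrow": 1}

# Tokens are runs of letters and digits, so "2day" stays in one piece.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def tag(text, reference):
    """Return (surface form, ISO date) pairs for relative day words in text."""
    results = []
    for match in TOKEN.finditer(text):
        surface = match.group(0)
        # Step 1 (recognition): expand colloquial spellings before matching rules.
        word = COLLOQUIAL.get(surface.lower(), surface.lower())
        if word in RELATIVE_DAYS:
            # Step 2 (normalization): resolve the word against the reference date.
            value = reference + timedelta(days=RELATIVE_DAYS[word])
            results.append((surface, value.isoformat()))
    return results


print(tag("Meet me tmrw, I am busy nw", date(2018, 4, 24)))
# prints [('tmrw', '2018-04-25'), ('nw', '2018-04-24')]
```

Keeping the colloquial table separate from the normalization rules means new spellings can be added without touching the rules. The sketch only handles single-word relative expressions, so it does not attempt the ambiguity problem the description mentions for phrases such as “a day’s march”.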