Time expression extraction from free text
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/74079 |
Institution: | Nanyang Technological University |
Summary: | Today, extracting time expressions from free text remains an active topic in Information Retrieval and Natural Language Processing applications such as question answering systems. Even messaging applications such as WhatsApp now extract time expressions so that their users can add dates to a calendar.
Extracting time expressions requires the ability to recognize them and convert them into a normalized form. Systems that do this are called temporal taggers. Several temporal taggers, built on different approaches, are available on the Internet.
In this project, existing temporal taggers are evaluated against benchmarks, which are annotated datasets. The experiment gauges which approach and which tagger available on the Web perform best. It also exposes the taggers’ limitations and proposes ways to address them.
Based on the experiment, we observe that rule-based taggers performed better than statistical systems. We also found that most temporal taggers are unable to recognize colloquial words such as “nw” and “tmrw”. Such words are increasingly common in everyday use, so it is important for taggers to recognize them. In addition, taggers incorrectly normalize a few ambiguous terms such as “a day’s march”.
A few methods for addressing these limitations are shown in the implementation part of the report, where the best-performing temporal tagger is extended. This describes how future algorithms could overcome the limitations and improve performance. |
---|