Information extraction for maritime security
With the rise of the Internet and the high speed of information dissemination, new media has become an unstoppable trend. Electronic news has gradually replaced traditional paper media and has become an important channel for people to obtain information and knowledge. Therefore, extracting k...
Main Author: | Li, Wenhan |
---|---|
Other Authors: | Mao Kezhi |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University 2022 |
Subjects: | Engineering::Electrical and electronic engineering |
Online Access: | https://hdl.handle.net/10356/158303 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-158303 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-158303; 2023-07-07T18:57:23Z; Information extraction for maritime security; Li, Wenhan; Mao Kezhi; School of Electrical and Electronic Engineering; ekzmao@ntu.edu.sg; Engineering::Electrical and electronic engineering; Bachelor of Engineering (Electrical and Electronic Engineering); 2022-05-31T06:59:14Z; 2022-05-31T06:59:14Z; 2022; Final Year Project (FYP); Li, W. (2022). Information extraction for maritime security. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158303; en; A1094-211; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Electrical and electronic engineering |
description |
With the rise of the Internet and the high speed of information dissemination, new media has become an unstoppable trend. Electronic news has gradually replaced traditional paper media and has become an important channel for people to obtain information and knowledge. Extracting key information from a large volume of news and classifying it effectively is therefore a prerequisite for further decision-making. This dissertation aims to categorize a large number of news articles on topics related to rampant piracy, such as unemployment, rising oil prices and illegal fishing, so that decision-makers can predict the need to enhance maritime security. Against this background, the dissertation proposes a text classification system that automatically classifies short and long texts on a variety of topics. Text classification is an important module in text processing and has a wide range of applications, such as spam filtering, news classification and lexical annotation. The general process of text classification consists of text preprocessing, feature learning and classifier construction. This dissertation performs text classification with both machine learning and deep learning. First, a web crawler is used to collect the data and build a corpus. Second, text preprocessing, such as stopword removal and lemmatization, is applied because it is necessary for building the models and for test accuracy. Third, feature learning methods such as bag-of-words (BOW), TF-IDF and semantic analysis are applied in the machine learning models to extract text features and build a reduced-dimensional latent semantic space. Fourth, classification models such as Support Vector Machine and Random Forest, and neural networks such as Convolutional Neural Network, Long Short-Term Memory and the pretrained model BERT, are used to complete the final classification.
The method proposed in this dissertation achieves good results in multi-topic text classification and offers a reference approach for processing raw web page text. |
author2 |
Mao Kezhi |
author_facet |
Mao Kezhi Li, Wenhan |
format |
Final Year Project |
author |
Li, Wenhan |
title |
Information extraction for maritime security |
publisher |
Nanyang Technological University |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/158303 |
_version_ |
1772825587449069568 |
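
The classical part of the pipeline summarized in the description (preprocessing, TF-IDF features, then a classifier) can be illustrated with a minimal, standard-library-only sketch. This is not the project's implementation: the stopword list, the toy headlines and the topic labels are invented for illustration, lemmatization is omitted, and a nearest-centroid cosine classifier stands in for the Support Vector Machine and Random Forest models named in the description.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the project's list is not given).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "for", "is", "are", "as"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def fit_idf(token_docs):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n_docs = len(token_docs)
    doc_freq = Counter(term for doc in token_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """L2-normalised TF-IDF vector as a dict; terms unseen in training are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def train_centroids(texts, labels):
    """Sum the TF-IDF vectors of each class into one centroid per label."""
    token_docs = [preprocess(t) for t in texts]
    idf = fit_idf(token_docs)
    centroids = {}
    for tokens, label in zip(token_docs, labels):
        centroid = centroids.setdefault(label, Counter())
        for term, weight in tfidf(tokens, idf).items():
            centroid[term] += weight
    return idf, centroids


def predict(text, idf, centroids):
    """Return the label whose centroid has the highest cosine similarity."""
    vec = tfidf(preprocess(text), idf)

    def score(label):
        centroid = centroids[label]
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        dot = sum(w * centroid.get(t, 0.0) for t, w in vec.items())
        return dot / norm if norm else 0.0

    return max(centroids, key=score)


# Hypothetical toy corpus; real input would be the crawled news articles.
TEXTS = [
    "Oil prices rise sharply as crude supply tightens",
    "Rising fuel and oil prices hit shipping costs",
    "Illegal fishing vessels detained near the coast",
    "Patrol finds illegal fishing nets and trawlers",
]
LABELS = ["oil_prices", "oil_prices", "illegal_fishing", "illegal_fishing"]

idf, centroids = train_centroids(TEXTS, LABELS)
print(predict("Crude oil prices climb again", idf, centroids))
print(predict("Trawlers caught fishing illegally", idf, centroids))
```

A centroid classifier is used here only because it needs no external library; swapping in a margin-based or tree-based model would leave the preprocessing and feature steps unchanged.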