Korean jamo-level byte-pair encoding for neural machine translation

Tokenization is the first step in most natural language processing tasks and is essential for addressing the fundamental out-of-vocabulary problem, as well as for changing the linguistic understanding. To exploit the characteristics of the Korean language for a more parameter-efficient tokenizat...

Bibliographic Details
Main Author:	Lee, Junyoung
Other Authors:	Wang, Lipo
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Online Access:	https://hdl.handle.net/10356/172737
Institution:	Nanyang Technological University

Similar Items

Machine learning for new friends recommendation in NTU
by: Niu, Jianan
Published: (2021)

Building generalizable models for discourse phenomena evaluation and machine translation
by: Jwalapuram, Prathyusha
Published: (2023)

Deep metric based feature engineering to improve document-level representation for document clustering
by: Xu, Liwen
Published: (2022)

Natural language translation with graph convolutional neural network
by: Zhu, Yimin
Published: (2018)

Comparison of character recognition performance - Bayes classifier and neural network methods
by: Low, Siew Eng
Published: (2008)

Mining social media data
by: Yak, Kenneth Yong Seng
Published: (2020)

Personality detection from text, based on the MBTI model
by: Christienne Grace Regodon, Visco
Published: (2020)

TFIDF meets deep document representation : a re-visit of co-training for text classification
by: Chen, Zhiwei
Published: (2020)

Named entity recognition and linking with knowledge base
by: Phan, Cong Minh
Published: (2020)

Automated abuse detection of privacy policy
by: Tan, Soo Yong
Published: (2021)

Building SenticNet 7
by: Perh, Zhi Hao
Published: (2021)

Keyword and named entity recognition on emergency call hotline data
by: Mohamed Fahadh Jahir Hussain
Published: (2021)

Sentiment analysis on yelp reviews
by: Wong, Hong Yong
Published: (2021)

Named entity recognition for information extraction
by: Wei, Mark Zi Yun
Published: (2021)

Knowledge graph construction from text
by: Yong, Shan Jie
Published: (2021)

Automated source code summarization via transformer
by: Viswen Kumar Mariammalle
Published: (2021)

Topic extraction and sentiment analysis of a subreddit (r/coronavirus)
by: Chong, You Min
Published: (2021)

Event detection from social media on COVID-19
by: Ho, Yin Wee
Published: (2022)

Event detection for biomedical text
by: Pham, Nguyen Minh Thu
Published: (2022)

BERT named entity recognition on emergency response system
by: Chua, Clarita Wyn Kay
Published: (2022)

Time expression and named entity recognition for sentiment analysis
by: Tan, Jordan Rei Yao
Published: (2022)

Correlation analysis between Reddit sentiments and Ether (ETH) price action
by: Leah, Castillo
Published: (2022)

AI for Finance
by: Phoe, Chuan Bin
Published: (2022)

Generalized AutoNLP model for name entity recognition task
by: Wong, Yung Shen
Published: (2022)

Solving aspect-based sentiment analysis task with GNN models and tree reconstruction methods
by: Peng, Cheng
Published: (2022)

Reddit post summarization based subreddit analysis
by: Ng, Jesline
Published: (2023)

Social media chat data visualization
by: Tan, Royce Chun Wei
Published: (2023)

Event detection for cyber security news articles
by: Huang, Jovan Tian Chun
Published: (2023)

Natural language processing as autoregressive generation
by: Lin, Xiang
Published: (2023)

Context based patent classification and search : part A
by: Yoong, Jia Hui
Published: (2020)

Adaptation of language models via text augmentation
by: Prachaseree, Chaiyasait
Published: (2023)

Deep learning techniques for hate speech detection
by: Sam, Jared Mun Kit
Published: (2023)

Question classification via machine learning techniques
by: Ho, Mun Kit
Published: (2020)

Application of machine learning in the forecast of stock index
by: Sugianto, Jason Jonathan
Published: (2022)

Prototype user interface development for automatic categorizing search results
by: Didik Ariyawan Hartono
Published: (2008)

Automatic sentiment classification of movie reviews
by: Chan, Kok Hong
Published: (2009)

Feature vector generation tool for sentiment classification of product reviews using SVM
by: Chan, Saw Nyein Aung
Published: (2008)

Selecting training samples from large and noisy corpora for efficient text classification
by: Wong, Daji
Published: (2011)

Automatic sentiment classification of product reviews
by: Zhou, Yun Yun
Published: (2008)

Automatic sentiment classification of political news articles
by: Nourbakhsh, Armineh
Published: (2009)