The Company They Keep: Extracting Japanese Neologisms Using Language Patterns

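We describe an investigation into the identification and extraction of unrecorded potential lexical items in Japanese text by detecting text passages containing selected language patterns typically associated with such items. We identified a set of suitable patterns, then tested them with two large collections of text drawn from the WWW and Twitter. Samples of the extracted items were evaluated, and it was demonstrated that the approach has considerable potential for identifying terms for later lexicographic analysis.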

Bibliographic Details
Main Authors: Breen, James, Baldwin, Timothy, Bond, Francis
Other Authors: School of Humanities and Social Sciences
Format: Conference or Workshop Item
Language: English
Published: 2018
Subjects: Japanese Text; Lexicons
Online Access: https://hdl.handle.net/10356/88494
http://hdl.handle.net/10220/44912
http://compling.hss.ntu.edu.sg/events/2018-gwc/pdfs/GWC2018_paper_20.pdf
Institution: Nanyang Technological University
id sg-ntu-dr.10356-88494
record_format dspace
spelling sg-ntu-dr.10356-88494
2019-12-06T17:04:28Z
The Company They Keep: Extracting Japanese Neologisms Using Language Patterns
Breen, James
Baldwin, Timothy
Bond, Francis
School of Humanities and Social Sciences
The 9th Global WordNet Conference (GWC 2018)
Japanese Text
Lexicons
We describe an investigation into the identification and extraction of unrecorded potential lexical items in Japanese text by detecting text passages containing selected language patterns typically associated with such items. We identified a set of suitable patterns, then tested them with two large collections of text drawn from the WWW and Twitter. Samples of the extracted items were evaluated, and it was demonstrated that the approach has considerable potential for identifying terms for later lexicographic analysis.
Accepted version
2018-05-30T08:02:59Z
2019-12-06T17:04:28Z
2018-05-30T08:02:59Z
2019-12-06T17:04:28Z
2018-01-01
2018
Conference Paper
Breen, J., Baldwin, T., & Bond, F. (2018). The Company They Keep: Extracting Japanese Neologisms Using Language Patterns. The 9th Global WordNet Conference (GWC 2018).
https://hdl.handle.net/10356/88494
http://hdl.handle.net/10220/44912
http://compling.hss.ntu.edu.sg/events/2018-gwc/pdfs/GWC2018_paper_20.pdf
204438
en
© 2018 The author(s). This is the author created version of a work that has been peer reviewed and accepted for publication by The 9th Global WordNet Conference (GWC 2018). It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The full-text is available at: [http://compling.hss.ntu.edu.sg/events/2018-gwc/pdfs/GWC2018_paper_20.pdf].
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Japanese Text
Lexicons
description We describe an investigation into the identification and extraction of unrecorded potential lexical items in Japanese text by detecting text passages containing selected language patterns typically associated with such items. We identified a set of suitable patterns, then tested them with two large collections of text drawn from the WWW and Twitter. Samples of the extracted items were evaluated, and it was demonstrated that the approach has considerable potential for identifying terms for later lexicographic analysis.
author2 School of Humanities and Social Sciences
format Conference or Workshop Item
author Breen, James
Baldwin, Timothy
Bond, Francis
author_sort Breen, James
title The Company They Keep: Extracting Japanese Neologisms Using Language Patterns
title_sort company they keep: extracting japanese neologisms using language patterns
publishDate 2018
url https://hdl.handle.net/10356/88494
http://hdl.handle.net/10220/44912
http://compling.hss.ntu.edu.sg/events/2018-gwc/pdfs/GWC2018_paper_20.pdf