Mobile phone name extraction from internet forums: a semi-supervised approach
Collecting users’ feedback on products from Internet forums is challenging because users often mention a product with informal abbreviations or nicknames. In this paper, we propose a method named Gren to recognize and normalize mobile phone names from domain-specific Internet forums. Instead of dire...
Main Authors: | Yao, Yangjie; Sun, Aixin |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Article |
Language: | English |
Published: | 2016 |
Subjects: | Mobile phone; Name recognition and normalization |
Online Access: | https://hdl.handle.net/10356/83415 http://hdl.handle.net/10220/41432 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-83415 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-834152020-05-28T07:18:12Z Mobile phone name extraction from internet forums: a semi-supervised approach Yao, Yangjie Sun, Aixin School of Computer Engineering Mobile phone Name recognition and normalization Collecting users’ feedback on products from Internet forums is challenging because users often mention a product with informal abbreviations or nicknames. In this paper, we propose a method named Gren to recognize and normalize mobile phone names from domain-specific Internet forums. Instead of directly recognizing phone names from sentences as in most named entity recognition tasks, we propose an approach to generating candidate names as the first step. The candidate names capture short forms, spelling variations, and nicknames of products, but are not noise free. To predict whether a candidate name mention in a sentence indeed refers to a specific phone model, a Conditional Random Field (CRF)-based name recognizer is developed. The CRF model is trained by using a large set of sentences obtained in a semi-automatic manner with minimal manual labeling effort. Lastly, a rule-based name normalization component maps a recognized name to its formal form. Evaluated on more than 4000 manually labeled sentences with about 1000 phone name mentions, Gren outperforms all baseline methods. Specifically, it achieves precision and recall of 0.918 and 0.875 respectively, with the best feature setting. We also provide detailed analysis of the intermediate results obtained by each of the three components in Gren. MOE (Min. of Education, S’pore) Accepted version 2016-09-07T08:32:39Z 2019-12-06T15:22:00Z 2016-09-07T08:32:39Z 2019-12-06T15:22:00Z 2015 Journal Article Yao, Y., & Sun, A. (2016). Mobile phone name extraction from internet forums: a semi-supervised approach. World Wide Web, 19(5), 783-805. 
1386-145X https://hdl.handle.net/10356/83415 http://hdl.handle.net/10220/41432 10.1007/s11280-015-0361-1 en World Wide Web © 2015 Springer Science+Business Media New York. This is the author created version of a work that has been peer reviewed and accepted for publication by World Wide Web, Springer Science+Business Media New York. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [http://dx.doi.org/10.1007/s11280-015-0361-1]. 14 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Mobile phone Name recognition and normalization |
spellingShingle |
Mobile phone Name recognition and normalization Yao, Yangjie Sun, Aixin Mobile phone name extraction from internet forums: a semi-supervised approach |
description |
Collecting users’ feedback on products from Internet forums is challenging because users often mention a product with informal abbreviations or nicknames. In this paper, we propose a method named Gren to recognize and normalize mobile phone names from domain-specific Internet forums. Instead of directly recognizing phone names from sentences as in most named entity recognition tasks, we propose an approach to generating candidate names as the first step. The candidate names capture short forms, spelling variations, and nicknames of products, but are not noise free. To predict whether a candidate name mention in a sentence indeed refers to a specific phone model, a Conditional Random Field (CRF)-based name recognizer is developed. The CRF model is trained by using a large set of sentences obtained in a semi-automatic manner with minimal manual labeling effort. Lastly, a rule-based name normalization component maps a recognized name to its formal form. Evaluated on more than 4000 manually labeled sentences with about 1000 phone name mentions, Gren outperforms all baseline methods. Specifically, it achieves precision and recall of 0.918 and 0.875 respectively, with the best feature setting. We also provide detailed analysis of the intermediate results obtained by each of the three components in Gren. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Yao, Yangjie Sun, Aixin |
format |
Article |
author |
Yao, Yangjie Sun, Aixin |
author_sort |
Yao, Yangjie |
title |
Mobile phone name extraction from internet forums: a semi-supervised approach |
publishDate |
2016 |
url |
https://hdl.handle.net/10356/83415 http://hdl.handle.net/10220/41432 |
_version_ |
1681059237733597184 |
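
The description field reports precision and recall (0.918 and 0.875) but no combined score. As a rough aid for comparing against other systems, the sketch below computes the F1 score (the harmonic mean of precision and recall) from those two values. The function name `f1_score` is illustrative, and the resulting figure is derived here by arithmetic; it is not a number stated in this record.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values taken from the description field of this record.
precision, recall = 0.918, 0.875

# Derived value (not reported in the record itself).
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.896"
```

Because the harmonic mean is pulled toward the lower of its two inputs, the derived F1 sits slightly below the midpoint of the reported precision and recall.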