Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus

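In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves an existing method by up to 58%.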

Bibliographic Details
Main Authors: WANG, Xiaoyin; LO, David; JIANG, Jing; ZHANG, Lu; MEI, Hong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2009
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/472
https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1471
record_format dspace
spelling sg-smu-ink.sis_research-1471 2011-11-02T09:34:49Z Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus WANG, Xiaoyin LO, David JIANG, Jing ZHANG, LU Mei, Hong In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves an existing method by up to 58%. 2009-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/472 https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software Engineering
spellingShingle Software Engineering
WANG, Xiaoyin
LO, David
JIANG, Jing
ZHANG, LU
Mei, Hong
Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus
description In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves an existing method by up to 58%.
format text
author WANG, Xiaoyin
LO, David
JIANG, Jing
ZHANG, LU
Mei, Hong
author_facet WANG, Xiaoyin
LO, David
JIANG, Jing
ZHANG, LU
Mei, Hong
author_sort WANG, Xiaoyin
title Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus
title_sort extracting paraphrases of technical terms from noisy parallel software corpus
publisher Institutional Knowledge at Singapore Management University
publishDate 2009
url https://ink.library.smu.edu.sg/sis_research/472
https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf
_version_ 1770570448319283200