Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus
In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods ar...
Main Authors: | WANG, Xiaoyin; LO, David; JIANG, Jing; ZHANG, Lu; MEI, Hong |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2009 |
Subjects: | Software Engineering |
Online Access: | https://ink.library.smu.edu.sg/sis_research/472 https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-1471 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-1471 2011-11-02T09:34:49Z Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus WANG, Xiaoyin LO, David JIANG, Jing ZHANG, LU Mei, Hong In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves an existing method by up to 58%. 2009-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/472 https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Software Engineering |
description |
In this paper, we study the problem of extracting technical paraphrases from a parallel software corpus, namely, a collection of duplicate bug reports. Paraphrase acquisition is a fundamental task in the emerging area of text mining for software engineering. Existing paraphrase extraction methods are not entirely suitable here due to the noisy nature of bug reports. We propose a number of techniques to address the noisy data problem. The empirical evaluation shows that our method significantly improves an existing method by up to 58%. |
format |
text |
author |
WANG, Xiaoyin; LO, David; JIANG, Jing; ZHANG, Lu; MEI, Hong |
author_sort |
WANG, Xiaoyin |
title |
Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpus |
title_sort |
extracting paraphrases of technical terms from noisy parallel software corpus |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2009 |
url |
https://ink.library.smu.edu.sg/sis_research/472 https://ink.library.smu.edu.sg/context/sis_research/article/1471/viewcontent/acl09.pdf |