On domain knowledge organization and extraction in software engineering

Developers' social information seeking on the Web is unable to benefit from the recent significant advances of semantics-oriented applications, such as knowledge graph and direct answers. This is largely because existing approaches to analyzing software engineering social content, such as the...

Bibliographic Details
Main Author: Ye, Deheng
Other Authors: Lin Shang-Wei
Format: Theses and Dissertations
Language: English
Published: 2017
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/69477
Institution: Nanyang Technological University
id sg-ntu-dr.10356-69477
record_format dspace
spelling sg-ntu-dr.10356-69477 2023-03-04T00:52:44Z On domain knowledge organization and extraction in software engineering Ye, Deheng Lin Shang-Wei School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Doctor of Philosophy (SCE) 2017-01-25T09:20:31Z 2017-01-25T09:20:31Z 2017 Thesis Ye, D. (2017). On domain knowledge organization and extraction in software engineering. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/69477 10.32657/10356/69477 en 188 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Ye, Deheng
On domain knowledge organization and extraction in software engineering
description Developers' social information seeking on the Web cannot yet benefit from recent advances in semantics-oriented applications, such as knowledge graphs and direct answers. This is largely because existing approaches to analyzing software engineering social content, such as the discussions on Stack Overflow, 1) treat software-specific entities in the same way as other textual content, and 2) fail to consider the semantic linkages between pieces of software knowledge. In this thesis, we perform a pioneering study towards the long-term goal of enabling a domain-specific knowledge graph and semantic search in software engineering. Using the developer-generated content on Stack Overflow, we formulate a series of research problems that are key steps towards this goal: 1) We investigate online knowledge connection in software engineering by analyzing the knowledge network formed by Stack Overflow users' URL sharing activities. Through this study, we obtain an overall understanding of how domain knowledge is organized, how it correlates and how it evolves, which motivates further research on extracting and linking software engineering knowledge. 2) We propose semi-supervised methods for extracting software-specific named entities, such as API mentions, from informal natural language text. 3) We develop automated techniques to link semantically linkable knowledge at the document level, and to link a recognized API mention to its fully qualified form, as it appears in the API documentation, at the entity level. We investigate how NLP and IR techniques can be developed and enhanced to address the design challenges that the socio-technical nature of software engineering social content brings to these research problems. Extensive experiments show the effectiveness of our proposed approaches for analyzing and solving these problems.
author2 Lin Shang-Wei
author_facet Lin Shang-Wei
Ye, Deheng
format Theses and Dissertations
author Ye, Deheng
author_sort Ye, Deheng
title On domain knowledge organization and extraction in software engineering
title_short On domain knowledge organization and extraction in software engineering
title_full On domain knowledge organization and extraction in software engineering
title_fullStr On domain knowledge organization and extraction in software engineering
title_full_unstemmed On domain knowledge organization and extraction in software engineering
title_sort on domain knowledge organization and extraction in software engineering
publishDate 2017
url http://hdl.handle.net/10356/69477
_version_ 1759858164959805440