Automatically categorizing software technologies

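Informal language and the absence of a standard taxonomy for software technologies make it difficult to reliably analyze technology trends on discussion forums and other on-line venues. We propose an automated approach called Witt for the categorization of software technologies (an expanded version of the hypernym discovery problem). Witt takes as input a phrase describing a software technology or concept and returns a general category that describes it (e.g., integrated development environment), along with attributes that further qualify it (commercial, php, etc.). By extension, the approach enables the dynamic creation of lists of all technologies of a given type (e.g., web application frameworks). Our approach relies on Stack Overflow and Wikipedia, and involves numerous original domain adaptations and a new solution to the problem of normalizing automatically-detected hypernyms. We compared Witt with six independent taxonomy tools and found that, when applied to software terms, Witt demonstrated better coverage than all evaluated alternative solutions, without a corresponding degradation in false positive rate.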

Bibliographic Details
Main Authors: NASSIF, Mathieu, TREUDE, Christoph, ROBILLARD, Martin P.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: Software; Encyclopedias; Electronic publishing; Internet; Taxonomy; Tools; Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/8784
https://ink.library.smu.edu.sg/context/sis_research/article/9787/viewcontent/tse18.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-9787
record_format dspace
spelling sg-smu-ink.sis_research-9787
2024-05-30T08:57:07Z
2020-01-01T08:00:00Z
application/pdf
info:doi/10.1109/TSE.2018.2836450
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Software
Encyclopedias
Electronic publishing
Internet
Taxonomy
Tools
Software Engineering
description Informal language and the absence of a standard taxonomy for software technologies make it difficult to reliably analyze technology trends on discussion forums and other on-line venues. We propose an automated approach called Witt for the categorization of software technologies (an expanded version of the hypernym discovery problem). Witt takes as input a phrase describing a software technology or concept and returns a general category that describes it (e.g., integrated development environment), along with attributes that further qualify it (commercial, php, etc.). By extension, the approach enables the dynamic creation of lists of all technologies of a given type (e.g., web application frameworks). Our approach relies on Stack Overflow and Wikipedia, and involves numerous original domain adaptations and a new solution to the problem of normalizing automatically-detected hypernyms. We compared Witt with six independent taxonomy tools and found that, when applied to software terms, Witt demonstrated better coverage than all evaluated alternative solutions, without a corresponding degradation in false positive rate.
format text
author NASSIF, Mathieu
TREUDE, Christoph
ROBILLARD, Martin P.
title Automatically categorizing software technologies
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/8784
https://ink.library.smu.edu.sg/context/sis_research/article/9787/viewcontent/tse18.pdf