Software visualization using topic models

Document Type

Conference Proceeding

Publication Date

1-1-2018

Abstract

Latent Dirichlet Allocation (LDA) is a statistical topic modeling approach that has been used to support several software engineering activities. The main assumption is that LDA offers a unique insight into the semantic content of software systems, thus revealing otherwise unseen relations between software artifacts. However, a main problem when dealing with LDA is the complexity of its output. In particular, the numerical probability distributions that LDA produces to represent topics and documents are not intuitive to understand and rationalize. To address this problem, in this paper we present a topic-modeling-based approach to visualizing software systems using LDA. We also present several visualizations to represent the basic elements of LDA, including words, topics, and documents. These different basic views are combined through a set of integration links to enable users to effectively explore software systems by supporting knowledge discovery at different levels of abstraction. We also demonstrate how the topic-modeling-based visualization approach can support several software engineering activities such as program comprehension, software clustering, and code evolution analysis.

Publication Title

Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE

Volume

2018-July

First Page

409

Last Page

414

Digital Object Identifier (DOI)

10.18293/SEKE2018-194

ISSN

2325-9000

E-ISSN

2325-9086

ISBN

1891706446
