Proceedings Article
Using latent semantic analysis to identify similarities in source code to support program understanding
Jonathan I. Maletic, Andrian Marcus +1 more
- pp 46-53
TLDR
The paper describes the results of applying Latent Semantic Analysis (LSA), an advanced information retrieval method, to program source code and associated documentation to assist in the understanding of a nontrivial software system, namely a version of Mosaic.
Abstract:
The paper describes the results of applying Latent Semantic Analysis (LSA), an advanced information retrieval method, to program source code and associated documentation. Latent semantic analysis is a corpus-based statistical method for inducing and representing aspects of the meanings of words and passages (of natural language) as reflected in their usage. This methodology is assessed for application to the domain of software components (i.e., source code and its accompanying documentation). Here LSA is used as the basis to cluster software components. This clustering is used to assist in the understanding of a nontrivial software system, namely a version of Mosaic. Applying latent semantic analysis to the domain of source code and internal documentation for the support of program understanding is a new application of this method and a departure from the normal application domain of natural language.
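The pipeline the abstract outlines (build a term-by-document matrix from components, reduce it with a truncated SVD, then group components by similarity in the reduced space) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the four toy components, the stop list, the choice of k = 2, the 0.5 threshold, and the greedy grouping step are all hypothetical stand-ins.

```python
import re
import numpy as np

# Hypothetical toy components (comments plus identifiers); not taken from Mosaic.
components = {
    "http_get.c": "/* fetch url over http */ int http_get(char *url, char *buffer)",
    "http_post.c": "/* send form data to url over http */ int http_post(char *url, char *data)",
    "img_decode.c": "/* decode gif image into pixel buffer */ int gif_decode(char *image, int *pixel)",
    "img_scale.c": "/* scale image pixel data */ int image_scale(int *pixel, int width, int height)",
}

# Assumed stop list: C type keywords that carry no domain meaning.
STOP_WORDS = {"int", "char"}


def tokenize(text):
    """Split on non-letters (http_get -> http, get), lowercase, drop short and stop tokens."""
    tokens = (t.lower() for t in re.split(r"[^A-Za-z]+", text))
    return [t for t in tokens if len(t) > 2 and t not in STOP_WORDS]


def lsa_document_vectors(docs, k):
    """Return one k-dimensional vector per document via a truncated SVD."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    row = {term: i for i, term in enumerate(vocab)}
    # Term-by-document matrix of raw counts (no weighting; an assumption).
    A = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[row[t], j] += 1.0
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep the k largest singular values; each document is a row of V_k * S_k.
    return (s[:k, None] * Vt[:k]).T


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    return unit @ unit.T


def greedy_clusters(names, sim, threshold=0.5):
    """Put each item in the first cluster holding a sufficiently similar member."""
    clusters = []
    for i in range(len(names)):
        for members in clusters:
            if any(sim[i, j] >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return [[names[i] for i in members] for members in clusters]


names = list(components)
vectors = lsa_document_vectors(list(components.values()), k=2)
sim = cosine_similarity_matrix(vectors)
clusters = greedy_clusters(names, sim)
print(clusters)
```

On this toy input the two HTTP files share vocabulary such as `url` and `http`, and the two image files share `image` and `pixel`, so each pair should land in its own cluster. A real system would need a much larger corpus, a considered choice of k, and a more robust clustering method than this greedy pass.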
Citations
Proceedings Article
Recovering documentation-to-source-code traceability links using latent semantic indexing
TL;DR: The presented method gives good results by comparison and is also a low-cost, highly flexible method to apply with regard to preprocessing and/or parsing of the source code and documentation.
Book
Operating Systems: Design and Implementation
TL;DR: The author discusses the history and present situation of operating systems, as well as some of the techniques used to design and implement these systems.
Journal Article
Semantic clustering: Identifying topics in source code
TL;DR: Semantic Clustering is introduced, a technique based on Latent Semantic Indexing and clustering that groups source artifacts using similar vocabulary; the resulting groups are interpreted as linguistic topics that reveal the intention of the code.
Proceedings Article
Identification of high-level concept clones in source code
TL;DR: The intention of the approach is to enhance and augment existing clone detection methods that are based on structural analysis and improve the quality of clone detection.
Proceedings Article
Supporting program comprehension using semantic and structural information
TL;DR: Focuses on investigating the combined use of semantic and structural information of programs to support the comprehension tasks involved in the maintenance and reengineering of software systems.
References
Book
Principal Component Analysis
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
Journal Article
Pattern Classification and Scene Analysis.
Book
Pattern classification and scene analysis
Richard O. Duda, Peter E. Hart +1 more
TL;DR: In this article, a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition is provided, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Book
Numerical Recipes in C: The Art of Scientific Computing
TL;DR: Numerical Recipes: The Art of Scientific Computing is a complete text and reference book on scientific computing, with over 100 new routines (now well over 300 in all), upgraded versions of many of the original routines, and many new topics presented at the same accessible level.
Journal Article
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.