Showing papers in "International Journal of Software Engineering and Knowledge Engineering in 2015"

PDF

Open Access

Journal Article•DOI•

Human Resource Allocation in Software Project with Practical Considerations

[...]

Jihun Park¹, Dongwon Seo¹, Gwangui Hong¹, Donghwan Shin¹, Jimin Hwa¹, Doo-Hwan Bae¹ - Show less +2 more•Institutions (1)

KAIST¹

18 May 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Software planning is very important for the success of a software project, even if the same developers work on the same project, the time span of the project and the quality of software may change.

...read moreread less

Abstract: Software planning is very important for the success of a software project. Even if the same developers work on the same project, the time span of the project and the quality of software may change ...

...read moreread less

21 citations

Journal Article•DOI•

A software reliability model for cloud-based software rejuvenation using dynamic fault trees

[...]

Jean Rahme¹, Haiping Xu¹•Institutions (1)

University of Massachusetts Dartmouth¹

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The dynamic fault tree (DFT) formalism is adopted and it is shown how cost-effective software rejuvenation schedules can be created to keep the system reliability consistently staying above a predefined critical level.

...read moreread less

Abstract: Correctly measuring the reliability and availability of a cloud-based system is critical for evaluating its system performance. Due to the promised high reliability of physical facilities provided for cloud services, software faults have become one of the major factors for the failures of cloud-based systems. In this paper, we focus on the software aging phenomenon where system performance may be progressively degraded due to exhaustion of system resources, fragmentation and accumulation of errors. We use a proactive technique, called software rejuvenation, to counteract the software aging problem. The dynamic fault tree (DFT) formalism is adopted to model the system reliability before and during a software rejuvenation process in an aging cloud-based system. A novel analytical approach is presented to derive the reliability function of a cloud-based Hot SPare (HSP) gate, which is further verified using Continuous Time Markov Chains (CTMC) for its correctness. We use a case study of a cloud-based system to illustrate the validity of our approach. Based on the reliability analytical results, we show how cost-effective software rejuvenation schedules can be created to keep the system reliability consistently staying above a predefined critical level.

...read moreread less

17 citations

Journal Article•DOI•

EFSM-Based Test Case Generation: Sequence, Data, and Oracle

[...]

Rui Yang¹, Zhenyu Chen¹, Zhiyi Zhang¹, Baowen Xu¹•Institutions (1)

Nanjing University¹

17 Sep 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: A survey on EFSM-based test case generation techniques in the last two decades is provided and several possible research areas in the future are presented.

...read moreread less

Abstract: Model-based testing has been intensively and extensively studied in the past decades. Extended Finite State Machine (EFSM) is a widely used model of software testing in both academy and industry. This paper provides a survey on EFSM-based test case generation techniques in the last two decades. All techniques in EFSM-based test case generation are mainly classified into three parts: test sequence generation, test data generation, and test oracle construction. The key challenges, such as coverage criterion and feasibility analysis in EFSM-based test case generation are discussed. Finally, we summarize the research work and present several possible research areas in the future.

...read moreread less

16 citations

Journal Article•DOI•

A Tool Supporting End-User Development of Access Control in Web Applications

[...]

Loredana Caruccio¹, Vincenzo Deufemia¹, Christopher D'Souza², Christopher D'Souza³, Athula Ginige², Giuseppe Polese¹ - Show less +2 more•Institutions (3)

University of Salerno¹, University of Western Sydney², Australian Catholic University³

21 Jun 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: End-user development (EUD) is drawing an increasing attention due to the necessity of users to frequently extend and personalize their applications.

...read moreread less

Abstract: End-user development (EUD) is drawing an increasing attention due to the necessity of users to frequently extend and personalize their applications. In particular, EUD in the context of Web (EUDWeb...

...read moreread less

15 citations

Journal Article•DOI•

Using Learning Styles of Software Professionals to Improve Their Inspection Team Performance

[...]

Anurag Goswami¹, Gursimran S. Walia¹, Abhinav Singh²•Institutions (2)

North Dakota State University¹, Indiana University²

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Results showed inspection ability does not depend on educational background and technical knowledge, and LS’s can aid software managers to create high performance inspection team(s) and manage software quality.

...read moreread less

Abstract: Inspections of software artifacts during early software development aids managers to detect early faults that may be hard to find and fix later. Results showed inspection ability does not depend on educational background and technical knowledge. This paper presents the results from an industrial empirical study, wherein the Learning Styles (i.e. ability to perceive and process information) of individual inspectors were manipulated to measure its impact on the fault detection effectiveness of inspection teams. Using inspection data from professional developers, we developed virtual teams with varying LS’s of individual inspectors and analyzed the team performance. The results from the current study show that teams of inspectors with diverse LS’s are significantly more effective at detecting faults as compared to teams of inspectors with similar LS’s. Therefore, LS’s can aid software managers to create high performance inspection team(s) and manage software quality.

...read moreread less

13 citations

Journal Article•DOI•

Linear Software Models: Decoupled Modules from Modularity Matrix Eigenvectors

[...]

Iaakov Exman¹•Institutions (1)

Jerusalem College of Engineering, Chennai¹

01 Oct 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The idea that outliers not only indicate the need for system redesign, but explicitly point out to problematic design spots is illustrated, which extends the applicability of linear algebra spectral methods to Modularity Matrices, at higher software abstraction levels than previously shown.

...read moreread less

Abstract: Modularity Matrices for software systems can be put in block-diagonal form, where blocks are higher-level software modules, in a hierarchy of modules. But the exact module boundaries are often blurred by the uncertainty whether given matrix elements are module members or outliers. This paper provides an algorithm to determine module sizes. As a consequence the algorithm also decides which matrix elements are outliers. Matrix elements are weighted by their Affinity — an exponential function of the off-diagonality. The module size is given by the positive consecutive elements of the eigenvectors corresponding to the largest eigenvalues of this weighted symmetrized Modularity Matrix. By means of case studies, we illustrate the idea that outliers not only indicate the need for system redesign, but explicitly point out to problematic design spots. This work extends the applicability of linear algebra spectral methods to Modularity Matrices, at higher software abstraction levels than previously shown.

...read moreread less

12 citations

Journal Article•DOI•

A Study of Cross-National Differences in Happiness Factors Using Machine Learning Approach

[...]

Theresia Ratih Dewi Saputri¹, Seok-Won Lee¹•Institutions (1)

Ajou University¹

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: National happiness has been actively studied throughout the past years and the factors used in this work include both physical needs an...

...read moreread less

Abstract: National happiness has been actively studied throughout the past years. The happiness factor varies due to different human perspectives. The factors used in this work include both physical needs an...

...read moreread less

9 citations

Journal Article•DOI•

Working and Playing with Scrum

[...]

Danilo B. Medeiros¹, Pedro de Alcântara dos Santos Neto¹, Erick Baptista Passos², Wandresson De Souza Araújo•Institutions (2)

Federal University of Piauí¹, International Federation of the Phonographic Industry²

30 Oct 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This work shows a suggestion of Scrum gamification together with an evaluation of the proposed approach in a case study of a software house, in order to change its use to a more amusing task, by taking advantage of the gamification trend.

...read moreread less

Abstract: Software development is sometimes considered a boring task. To avoid this fact we propose an approach based on the incorporation of game mechanics into Scrum framework, in order to change its use to a more amusing task, by taking advantage of the gamification trend. Gamification is applied to non-game applications and processes, trying to encourage people to adopt them. This work shows a suggestion of Scrum gamification together with an evaluation of the proposed approach in a case study of a software house. The use of this concept can help the software industry to increase the team productivity in a natural way.

...read moreread less

9 citations

Journal Article•DOI•

Comparison of Design Models: A Systematic Mapping Study

[...]

Lucian José Gonçales¹, Kleinner Farias¹, Murilo Scholl¹, Mauricio Roberto Veronez¹, Toacy C. Oliveira² - Show less +1 more•Institutions (2)

Universidade do Vale do Rio dos Sinos¹, Federal University of Rio de Janeiro²

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Fine-grained techniques are still required to support ever-present and complex model comparison tasks during the evolution of design models.

...read moreread less

Abstract: Context: Model comparison plays a central role in many software engineering activities. However, a comprehensive understanding about the state-of-the-art is still required. Goal: This paper aims at classifying and performing a thematic analysis of the current literature. Method: For this, we have followed well-established empirical guidelines to define and perform a systematic mapping study. Results: Some studies (14 out of 40) provide generic model comparison techniques, rather than specific ones for UML diagrams. Conclusion: Fine-grained techniques are still required to support ever-present and complex model comparison tasks during the evolution of design models.

...read moreread less

9 citations

Journal Article•DOI•

Efficient Mining of Data Streams Using Associative Classification Approach

[...]

K. Prasanna Lakshmi¹, Ramesh Kumar Cherku²•Institutions (2)

Gokaraju Rangaraju Institute of Engineering and Technology¹, Mahatma Gandhi Institute of Technology²

05 Aug 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This paper proposes PSTMiner, which considers the nature of data streams and provides an efficient classifier for predicting the class label of real data streams, and proposes a compact novel tree structure called PSTree (Prefix Streaming Tree) for storing data.

...read moreread less

Abstract: Data stream associative classification poses many challenges to the data mining community. In this paper, we address four major challenges posed, namely, infinite length, extraction of knowledge with single scan, processing time, and accuracy. Since data streams are infinite in length, it is impractical to store and use all the historical data for training. Mining such streaming data for knowledge acquisition is a unique opportunity and even a tough task. A streaming algorithm must scan data once and extract knowledge. While mining data streams, processing time, and accuracy have become two important aspects. In this paper, we propose PSTMiner which considers the nature of data streams and provides an efficient classifier for predicting the class label of real data streams. It has greater potential when compared with many existing classification techniques. Additionally, we propose a compact novel tree structure called PSTree (Prefix Streaming Tree) for storing data. Extensive experiments conducted on 24 real datasets from UCI repository and synthetic datasets from MOA (Massive Online Analysis) show that PSTMiner is consistent. Empirical results show that performance of PSTMiner is highly competitive in terms of accuracy and performance time when compared with other approaches under windowed streaming model.

...read moreread less

8 citations

Journal Article•DOI•

On the Stability of Feature Selection Methods in Software Quality Prediction: An Empirical Investigation

[...]

Huanjing Wang¹, Taghi M. Khoshgoftaar², Naeem Seliya•Institutions (2)

Western Kentucky University¹, Florida Atlantic University²

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Results demonstrate that ReliefF (RF) is the most stable feature selection method and wrapper based feature subset selection shows least stability, and as the overlap of partitions increased, the stability of the feature selection strategies increased.

...read moreread less

Abstract: Software quality modeling is the process of using software metrics from previous iterations of development to locate potentially faulty modules in current under-development code. This has become an important part of the software development process, allowing practitioners to focus development efforts where they are most needed. One difficulty encountered in software quality modeling is the problem of high dimensionality, where the number of available software metrics is too large for a classifier to work well. In this case, many of the metrics may be redundant or irrelevant to defect prediction results, thereby selecting a subset of software metrics that are the best predictors becomes important. This process is called feature (metric) selection. There are three major forms of feature selection: filter-based feature rankers, which uses statistical measures to assign a score to each feature and present the user with a ranked list; filter-based feature subset evaluation, which uses statistical measures on feature subsets to find the best feature subset; and wrapper-based subset selection, which builds classification models using different subsets to find the one which maximizes performance. Software practitioners are interested in which feature selection methods are best at providing the most stable feature subset in the face of changes to the data (here, the addition or removal of instances). In this study we select feature subsets using fifteen feature selection methods and then use our newly proposed Average Pairwise Tanimoto Index (APTI) to evaluate the stability of the feature selection methods. We evaluate the stability of feature selection methods on a pair of subsamples generated by our fixed-overlap partitions algorithm. Four different levels of overlap are considered in this study. 13 software metric datasets from two real-world software projects are used in this study. Results demonstrate that ReliefF (RF) is the most stable feature selection method and wrapper based feature subset selection shows least stability. In addition, as the overlap of partitions increased, the stability of the feature selection strategies increased.

...read moreread less

Journal Article•DOI•

A Proposal for the Improvement of Project's Cost Predictability Using Earned Value Management and Historical Data of Cost — An Empirical Study

[...]

Adler Diniz de Souza¹, Ana Regina Rocha¹, Djenane Cristina Silveira dos Santos²•Institutions (2)

Federal University of Rio de Janeiro¹, Universidade Federal de Itajubá²

18 May 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The proposed EVM technique through the integration of historical cost performance data of processes as a means to improve the project's cost predictability was more accurate and more precise than the traditional technique for calculating the Cost Performance Index (CPI) and Estimates at Completion (EAC).

...read moreread less

Abstract: Although the Earned Value Management (EVM) technique has been used by several companies in various industrial sectors (software development, construction, aerospace, aeronautics, among others) for over 35 years to predict time and cost outcomes, many studies have found vulnerabilities, including: (i) cost performance data do not always have normal distribution, which makes reliable projections difficult; (ii) instability of cost performance indexes during the execution of projects, (iii) there is a worsening tendency in cost performance indexes when project approaches termination. This paper proposes an extension of the EVM technique through the integration of historical cost performance data of processes as a means to improve the project's cost predictability. The proposed technique was evaluated through an empirical study, which evaluated the implementation of the proposed technique in 22 software development projects. The proposed technique has been applied in real projects with the aim of evaluating the accuracy and variation compared to the traditional technique. Hypotheses tests with 95% significance level were performed, and the proposed technique was more accurate and more precise than the traditional technique for calculating the Cost Performance Index (CPI) and Estimates at Completion (EAC).

...read moreread less

Journal Article•DOI•

Reconstructing Software High-Level Architecture by Clustering Weighted Directed Class Graph

[...]

Dehong Qiu¹, Qifeng Zhang¹, Shaohong Fang¹•Institutions (1)

Huazhong University of Science and Technology¹

17 Sep 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: WDCG-CA makes full use of the structural and quantitative information of WDCG, and avoids wrong compositions and arbitrary partitions successfully in the process of reconstructing software architecture in most cases in terms of the four metrics.

...read moreread less

Abstract: Software architecture reconstruction plays an important role in software reuse, evolution and maintenance. Clustering is a promising technique for software architecture reconstruction. However, the representation of software, which serves as clustering input, and the clustering algorithm need to be improved in real applications. The representation should contain appropriate and adequate information of software. Furthermore, the clustering algorithm should be adapted to the particular demands of software architecture reconstruction well. In this paper, we first extract Weighted Directed Class Graph (WDCG) to represent object-oriented software. WDCG is a structural and quantitative representation of software, which contains not only the static information of software source code but also the dynamic information of software execution. Then we propose a WDCG-based Clustering Algorithm (WDCG-CA) to reconstruct high-level software architecture. WDCG-CA makes full use of the structural and quantitative information of WDCG, and avoids wrong compositions and arbitrary partitions successfully in the process of reconstructing software architecture. We introduce four metrics to evaluate the performance of WDCG-CA. The results of the comparative experiments show that WDCG-CA outperforms the comparative approaches in most cases in terms of the four metrics.

...read moreread less

Journal Article•DOI•

Evaluating the Feasibility of MAX: A Method Using Cards and a Board for Assessing the Post-Use UX

[...]

Emanuelle Cavalcante¹, Luis Rivero¹, Tayana Conte¹•Institutions (1)

Federal University of Amazonas¹

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The method for the Assessment of eXperience (MAX), which through cards and a board assists software engineers in gathering UX data while motivating users to report their experience, showed that the method is useful for evaluating the UX of finished/prototyped applications from the point of view of users and software engineers.

...read moreread less

Abstract: User Experience (UX) is an important attribute for the success and quality of a software application. UX explores how an application is used and the emotional and behavioral consequences of such use. Although several UX evaluation methods allow understanding the reasons for a poor UX, some of them are tedious or too intrusive, making the evaluation unpleasant. This paper presents the Method for the Assessment of eXperience (MAX), which through cards and a board assists software engineers in gathering UX data while motivating users to report their experience. We conducted two pilot studies to verify the feasibility of MAX, which showed that the method is useful for evaluating the UX of finished/prototyped applications from the point of view of users and software engineers.

...read moreread less

Journal Article•DOI•

Aggregating Data Sampling with Feature Subset Selection to Address Skewed Software Defect Data

[...]

Kehan Gao¹, Taghi M. Khoshgoftaar², Amri Napolitano²•Institutions (2)

Eastern Connecticut State University¹, Florida Atlantic University²

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Three data preprocessing approaches, in which feature selection is combined with data sampling, to overcome high dimensionality and class imbalance in the context of software quality estimation are investigated.

...read moreread less

Abstract: Defect prediction is an important process activity frequently used for improving the quality and reliability of software products. Defect prediction results provide a list of fault-prone modules which are necessary in helping project managers better utilize valuable project resources. In the software quality modeling process, high dimensionality and class imbalance are the two potential problems that may exist in data repositories. In this study, we investigate three data preprocessing approaches, in which feature selection is combined with data sampling, to overcome these problems in the context of software quality estimation. These three approaches are: Approach 1 — sampling performed prior to feature selection, but retaining the unsampled data instances; Approach 2 — sampling performed prior to feature selection, retaining the sampled data instances; and Approach 3 — sampling performed after feature selection. A comparative investigation is presented for evaluating the three approaches. In the experiments, we employed three sampling methods (random undersampling, random oversampling, and synthetic minority oversampling), each combined with a filter-based feature subset selection technique called correlation-based feature selection. We built the defect prediction models using five common classification algorithms. The case study was based on software metrics and defect data collected from multiple releases of a real-world software system. The results demonstrated that the type of sampling methods used in data preprocessing significantly affected the performance of the combination approaches. It was found that when the random undersampling technique was used, Approach 1 performed better than the other two approaches. However, when the feature selection technique was used in conjunction with an oversampling method (random oversampling or synthetic minority oversampling), we strongly recommended Approach 3.

...read moreread less

Journal Article•DOI•

An Empirical Investigation on Wrapper-Based Feature Selection for Predicting Software Quality

[...]

Huanjing Wang¹, Taghi M. Khoshgoftaar², Amri Napolitano²•Institutions (2)

Western Kentucky University¹, Florida Atlantic University²

18 May 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This paper investigated thirty wrapper-based feature selection methods to remove irrelevant and redundant software metrics used for building defect predictors and demonstrated that Best Arithmetic Mean is the best performance metric used within the wrapper.

...read moreread less

Abstract: The basic measurements for software quality control and management are the various project and software metrics collected at various states of a software development life cycle. The software metrics may not all be relevant for predicting the fault proneness of software components, modules, or releases. Thus creating the need for the use of feature (software metric) selection. The goal of feature selection is to find a minimum subset of attributes that can characterize the underlying data with results as well as, or even better than the original data when all available features are considered. As an example of inter-disciplinary research (between data science and software engineering), this study is unique in presenting a large comparative study of wrapper-based feature (or attribute) selection techniques for building defect predictors. In this paper, we investigated thirty wrapper-based feature selection methods to remove irrelevant and redundant software metrics used for building defect predictors. In this study, these thirty wrappers vary based on the choice of search method (Best First or Greedy Stepwise), leaner (Naive Bayes, Support Vector Machine, and Logistic Regression), and performance metric (Overall Accuracy, Area Under ROC (Receiver Operating Characteristic) Curve, Area Under the Precision-Recall Curve, Best Geometric Mean, and Best Arithmetic Mean) used in the defect prediction model evaluation process. The models are trained using the three learners and evaluated using the five performance metrics. The case study is based on software metrics and defect data collected from a real world software project. The results demonstrate that Best Arithmetic Mean is the best performance metric used within the wrapper. Naive Bayes performed significantly better than Logistic Regression and Support Vector Machine as a wrapper learner on slightly and less imbalanced datasets. We also recommend Greedy Stepwise as a search method for wrappers. Moreover, comparing to models built with full datasets, the performances of defect prediction models can be improved when metric subsets are selected through a wrapper subset selector.

...read moreread less

Journal Article•DOI•

Investigating Two Approaches for Adding Feature Ranking to Sampled Ensemble Learning for Software Quality Estimation

[...]

Kehan Gao¹, Taghi M. Khoshgoftaar², Amri Napolitano²•Institutions (2)

Eastern Connecticut State University¹, Florida Atlantic University²

18 May 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The experimental results demonstrate that feature selection is important and needed prior to the learning process, and the ensemble feature ranking method generally has better or similar performance than the average of the base ranking techniques, and more importantly, the ensemble method exhibits better robustness than most baseranking techniques.

...read moreread less

Abstract: Defect prediction is very challenging in software development practice. Classification models are useful tools that can help for such prediction. Classification models can classify program modules into quality-based classes, e.g. fault-prone (fp) or not-fault-prone (nfp). This facilitates the allocation of limited project resources. For example, more resources are assigned to program modules that are of poor quality or likely to have a high number of faults based on the classification. However, two main problems, high dimensionality and class imbalance, affect the quality of training datasets and therefore classification models. Feature selection and data sampling are often used to overcome these problems. Feature selection is a process of choosing the most important attributes from the original dataset. Data sampling alters the dataset to change its balance level. Another technique, called boosting (building multiple models, with each model tuned to work better on instances misclassified by previous models), is found to also be effective for resolving the class imbalance problem. In this study, we investigate an approach for combining feature selection with this ensemble learning (boosting) process. We focus on two different scenarios: feature selection performed prior to the boosting process and feature selection performed inside the boosting process. Ten individual base feature ranking techniques, as well as an ensemble ranker based on the ten, are examined and compared over the two scenarios. We also employ the boosting algorithm to construct classification models without performing feature selection and use the results as the baseline for further comparison. The experimental results demonstrate that feature selection is important and needed prior to the learning process. In addition, the ensemble feature ranking method generally has better or similar performance than the average of the base ranking techniques, and more importantly, the ensemble method exhibits better robustness than most base ranking techniques. As for the two scenarios, the results show that applying feature selection inside boosting performs better than using feature selection prior to boosting.

...read moreread less

Journal Article•DOI•

Using neural networks to forecast available system resources: an approach and empirical investigation

[...]

Yun-Fei Jia¹, Zhi Quan Zhou², Ke-Xian Xue, Lei Zhao, Kai-Yuan Cai³ - Show less +1 more•Institutions (3)

Civil Aviation University of China¹, Information Technology University², Beihang University³

17 Sep 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: A neural network approach is proposed to first investigate the relationship between available system resources and system workload and then to forecast future available system Resources under the real-world situation where workload changes dynamically over time.

...read moreread less

Abstract: Software aging refers to the phenomenon that software systems show progressive performance degradation or a sudden crash after longtime execution. It has been reported that this phenomenon is closely related to the exhaustion of system resources. This paper quantitatively studies available system resources under the real-world situation where workload changes dynamically over time. We propose a neural network approach to first investigate the relationship between available system resources and system workload and then to forecast future available system resources. Experimental results on data sets collected from real-world computer systems demonstrate that the proposed approach is effective.

...read moreread less

Journal Article•DOI•

Feature-Level Change Impact Analysis Using Formal Concept Analysis

[...]

Hamzeh Eyal Salman¹, Abdelhak-Djamel Seriai, Christophe Dony•Institutions (1)

University of Montpellier¹

18 May 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: A feature-level CIA approach using Formal Concept Analysis (FCA) applied to SPL evolution is proposed and the effectiveness of the technique is shown in terms of the most commonly used metrics on the subject.

...read moreread less

Abstract: Software Product Line Engineering (SPLE) is a systematic reuse approach to develop a short time-to-market and quality products, called Software Product Line (SPL). Usually, a SPL is not developed from scratch but it is developed by reusing features (resp. their implementing source code elements) of existing similar systems previously developed by ad-hoc reuse techniques. The features implementations that are reused may be changed for developing new products (SPL) using SPLE. Any code element can be a part of (shared by) different features implementations; modifying one feature's implementation can thus impact others. Therefore, feature-level Change Impact Analysis (CIA) is important to predict affected features for change management purpose. In this paper, we propose a feature-level CIA approach using Formal Concept Analysis (FCA) applied to SPL evolution. In our experimental evaluation using three case studies of different domains and sizes, we show the effectiveness of our technique in terms of the most commonly used metrics on the subject.

...read moreread less

Journal Article•DOI•

The CoCoME Platform: A Research Note on Empirical Studies in Information System Evolution

[...]

Robert Heinrich¹, Stefan Gärtner², Tom-Michael Hesse³, Thomas Ruhroth⁴, Ralf Reussner¹, Kurt Schneider², Barbara Paech³, Jan Jürjens⁴ - Show less +4 more•Institutions (4)

Karlsruhe Institute of Technology¹, Leibniz University of Hanover², Heidelberg University³, Technical University of Dortmund⁴

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: CoMEP is designed, a platform for supporting collaboration in empirical research on software evolution by shared knowledge, and lessons learned from the application of the platform in a large research programme are reported.

...read moreread less

Abstract: Methods for supporting evolution of software-intensive systems are a competitive edge in software engineering as software is often operated over decades. Empirical research is useful to validate the effectiveness of these methods. However, empirical studies on software evolution are rarely comprehensive and hardly replicable. Collaboration may prevent these shortcomings. We designed CoCoMEP — a platform for supporting collaboration in empirical research on software evolution by shared knowledge. We report lessons learned from the application of the platform in a large research programme.

...read moreread less

Journal Article•DOI•

Reusable Solutions for Implementing Usability Functionalities

[...]

Francy D. Rodríguez¹, Silvia T. Acuña², Natalia Juristo¹•Institutions (2)

Technical University of Madrid¹, Autonomous University of Madrid²

17 Sep 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This paper presents three reusable solutions at detailed design and programming level in order to effectively implement the Abort Operation, Progress Feedback and Preferences usability functionalities in web applications.

...read moreread less

Abstract: Usability is a software system quality attribute. Although software engineers originally considered usability to be related exclusively to the user interface, it was later found to affect the core functionality of software applications. As of then, proposals for addressing usability at different stages of the software development cycle were researched. The objective of this paper is to present three reusable solutions at detailed design and programming level in order to effectively implement the Abort Operation, Progress Feedback and Preferences usability functionalities in web applications. To do this, an inductive research method was applied. We developed three web applications including the above usability functionalities as case studies. We looked for commonalities across the implementations in order to induce a general solution. The elements common to all three developed applications include: application scenarios, functionalities, responsibilities, classes, methods, attributes and code snippets. The findings were specified as an implementation-oriented design pattern and as programming patterns in three languages. Additional case studies were conducted in order to validate the proposed solution. The independent developers used the patterns to implement different applications for each case study. As a result, we found that solutions specified as patterns can be reused to develop web applications.

...read moreread less

Journal Article•DOI•

Web Service Clustering Using Relational Database Approach

[...]

Jianxiao Liu¹, Jianxiao Liu², Feng Liu², Xiaoxia Li², Keqing He³, Yutao Ma³, Jian Wang³ - Show less +3 more•Institutions (3)

Nanjing University of Information Science and Technology¹, Huazhong Agricultural University², Wuhan University³

01 Oct 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The self-join operation in relational database (RDB) to realize Web service clustering uses the semantic reasoning relationship between concepts and the concept status path to do the calculation, which can improve the calculation accuracy.

...read moreread less

Abstract: In the era of service-oriented software engineering (SOSE), service clustering is used to organize Web services, and it can help to enhance the efficiency and accuracy of service discovery. In order to improve the efficiency and accuracy of service clustering, this paper uses the self-join operation in relational database (RDB) to realize Web service clustering. Based on storing service information, it does the self-join operation towards the Input, Output, Precondition, Effect (IOPE) tables of Web services, which can enhance the efficiency of computing services similarity. The semantic reasoning relationship between concepts and the concept status path are used to do the calculation, which can improve the calculation accuracy. Finally, we use experiments to validate the effectiveness of the proposed methods.

...read moreread less

Journal Article•DOI•

Cloud Computing Research Analysis Using Bibliometric Method

[...]

Yuanyuan Cai¹, Wei Lu¹, Liqiang Wang², Weiwei Xing¹•Institutions (2)

Beijing Jiaotong University¹, University of Wyoming²

05 Aug 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: A bibliometric-based approach is presented and implemented to quantitatively review the progress in global cloud computing research with the related literature during 2007–2013 from the databases of Science Citation Index Expanded (SCI-E), Conference Proceedings Citation Index–Science (CPCI-S), and IEEEXplore.

...read moreread less

Abstract: Cloud computing has been a mainstream solution for the processing and storage of mass data, as well as an exciting area for research. As a novel business model, cloud computing has dramatically changed the provision of services and IT capacity by means of the advanced techniques. In recent years, with the increasing research interests and rapid growth of publications, some review papers provide detailed analysis on cloud computing area. In this paper, a bibliometric-based approach is presented and implemented to quantitatively review the progress in global cloud computing research with the related literature during 2007–2013 from the databases of Science Citation Index Expanded (SCI-E), Conference Proceedings Citation Index–Science (CPCI-S), and IEEEXplore. Our work is motivated by the purpose of tracing global advancement in terms of research content, geographic distribution and issue time of the related publications, rather than a specific technological area in cloud computing research. By investigating the characteristics of publications such as keywords, output, geographic distribution and affiliation, we draw some valuable conclusions to guide the further research. The experimental results show that the top 5 active research points of cloud computing concentrate on virtualization, security, mobile cloud, distributed computing, and scheduling. From the location-time aspect, China, USA, and India have published most of the papers, dominate cloud computing research and keep a high level on the international research cooperation. And there is a great increase in publication outputs especially in China and USA. Meanwhile, the analysis results demonstrate the top 3 high-cited research institutes of the University of Melbourne, University of California. Berkeley and University of Vienna in cloud computing research. The mobile cloud will be a future research hotspot and promising application field.

...read moreread less

Journal Article•DOI•

Reliable and Secure Distributed Cloud Data Storage Using Reed-Solomon Codes

[...]

Haiping Xu¹, Deepti Bhalerao¹•Institutions (1)

University of Massachusetts Dartmouth¹

01 Nov 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This paper proposes a reliable and secure distributed cloud data storage schema using Reed-Solomon codes that relies on multiple cloud service providers (CSP), and protects users’ cloud data from the client side, and demonstrates the feasibility of the approach.

...read moreread less

Abstract: Despite the popularity and many advantages of using cloud data storage, there are still major concerns about the data stored in the cloud, such as security, reliability and confidentiality. In this paper, we propose a reliable and secure distributed cloud data storage schema using Reed-Solomon codes. Different from existing approaches to achieving data reliability with redundancy at the server side, our proposed mechanism relies on multiple cloud service providers (CSP), and protects users’ cloud data from the client side. In our approach, we view multiple cloud-based storage services as virtual independent disks for storing redundant data encoded with erasure codes. Since each CSP has no access to a user’s complete data, the data stored in the cloud would not be easily compromised. Furthermore, the failure or disconnection of a CSP will not result in the loss of a user’s data as the missing data pieces can be readily recovered. To demonstrate the feasibility of our approach, we developed a prototype distributed cloud data storage application using three major CSPs. The experimental results show that, besides the reliability and security related benefits of our approach, the application outperforms each individual CSP for uploading and downloading files.

...read moreread less

Journal Article•DOI•

[...]

Dehong Qiu¹, Jialin Sun¹, Hao Li¹•Institutions (1)

Huazhong University of Science and Technology¹

23 Dec 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: Experiments demonstrate that the CFG-Match approach outperforms the comparative approaches in the detection of Java program plagiarism and is more accurate and robust against semantics-preserving transformations.

...read moreread less

Abstract: Measuring program similarity plays an important role in solving many problems in software engineering. However, because programs are instruction sequences with complex structures and semantic functions and furthermore, programs may be obfuscated deliberately through semantics-preserving transformations, measuring program similarity is a difficult task that has not been adequately addressed. In this paper, we propose a new approach to measuring Java program similarity. The approach first measures the low-level similarity between basic blocks according to the bytecode instruction sequences and the structural property of the basic blocks. Then, an error-tolerant graph matching algorithm that can combat structure transformations is used to match the Control Flow Graphs (CFG) based on the basic block similarity. The high-level similarity between Java programs is subsequently calculated on the matched pairs of the independent paths extracted from the optimal CFG matching. The proposed CFG-Match approach is compared with a string-based approach, a tree-based approach and a graph-based approach. Experimental results show that the CFG-Match approach is more accurate and robust against semantics-preserving transformations. The CFG-Match approach is used to detect Java program plagiarism. Experiments on the collection of benchmark program pairs collected from the students’ submission of project assignments demonstrate that the CFG-Match approach outperforms the comparative approaches in the detection of Java program plagiarism.

...read moreread less

Journal Article•DOI•

Reliability and Performance Analysis of Architecture-Based Software Implementing Restarts and Retries Subject to Correlated Component Failures

[...]

Xiao-Dan Li¹, Yongfeng Yin¹, Lance Fiondella²•Institutions (2)

Beihang University¹, University of Massachusetts Amherst²

01 Oct 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: This paper presents an architecture-based model of software reliability and performance that explicitly considers a two-stage fault recovery mechanism implementing component restarts and application-level retries and suggests that the model can be used to quantify the impact of software fault recovery and correlated component failures on application reliability andperformance.

...read moreread less

Abstract: High reliability and performance are essential attributes of software systems designed for critical real-time applications. To improve the reliability and performance of software, many systems incorporate some form of fault recovery mechanism. However, contemporary models of software reliability and performance rarely consider these fault recovery mechanisms. Another notable shortcoming of many software models is that they make the simplifying assumption that component failures are statistically independent, which disagrees with several experimental studies that have shown that the failures of software components can exhibit correlation. This paper presents an architecture-based model of software reliability and performance that explicitly considers a two-stage fault recovery mechanism implementing component restarts and application-level retries. The application architecture is characterized by a Discrete Time Markov Chain (DTMC) to represent the dynamic branching behavior of control between the components of the application. Correlations between the component failures are computed with an efficient numerical algorithm for a multivariate Bernoulli (MVB) distribution. We illustrate the utility of the model through a case study of an embedded software application. The results suggest that the model can be used to quantify the impact of software fault recovery and correlated component failures on application reliability and performance.

...read moreread less

Journal Article•DOI•

Specification Patterns: Formal and Easy

[...]

Fernando Asteasuain, Victor Braberman

17 Sep 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The Featherweight Visual Scenarios (FVS) as mentioned in this paper is a declarative and graphical language based on scenarios, which is a possible alternative to specify behavioral properties and is better suited for validation tasks.

...read moreread less

Abstract: Property specification is still one of the most challenging tasks for transference of software verification technology. The use of patterns has been proposed in order to hide the complicated handling of formal languages from the developer. However, this goal is not entirely satisfied. When validating the desired property the developer may have to deal with the pattern representation in some particular formalism. For this reason, we identify four desirable quality attributes for the underlying specification language: succinctness, comparability, complementariness, and modifiability. We show that typical formalisms such as temporal logics or automata fail at some extent to support these features. Given this context we introduce Featherweight Visual Scenarios (FVS), a declarative and graphical language based on scenarios, as a possible alternative to specify behavioral properties. We illustrate FVS applicability by modeling all the specification patterns and we thoroughly compare FVS to other known approaches, showing that FVS specifications are better suited for validation tasks. In addition, we augment pattern specification by introducing the concept of violating behavior. Finally we characterize the type of properties that can be written in FVS and we formally introduce its syntax and semantics.

...read moreread less

Journal Article•DOI•

A Methodology to Analyze Multi-Agent Systems Modeled in High Level Petri Nets

[...]

Lily Chang¹, Xudong He²•Institutions (2)

University of Wisconsin–Platteville¹, Florida International University²

23 Dec 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The translation rules that translate the multi-agent model to an executable PROMELA model are formally defined, and the translation with an example is demonstrated.

...read moreread less

Abstract: This paper presents a methodology for analyzing multi-agent systems modeled in nested predicate transition nets. The objective is to automate the model analysis for complex systems, and provide a foundation for tool development. We formally define the translation rules that translate the multi-agent model to an executable PROMELA model, and demonstrate the translation with an example.

...read moreread less

Journal Article•DOI•

IPR Centered Institutional Service and Tools for Content and Metadata Management

[...]

Pierfrancesco Bellini¹, Ivan Bruno¹, Paolo Nesi¹, Michela Paolucci¹•Institutions (1)

University of Florence¹

01 Oct 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: The main contributions of this paper consist of the formalization of IPR Model that enable the shortening of the activities for the IPR resolution, and avoid the assignment of conflicting rights/permissions during IPR model formalization, and thus of licensing.

...read moreread less

Abstract: Multimedia services of cultural institutions need to be supported by content, metadata and workflow management systems to efficiently manage huge amount of content items and metadata production. Online digital libraries and cultural heritage institutions, as well as portals of publishers need an integrated multimedia back office in order to aggregate content collection and provide them to national and international aggregators, with respect to Intellectual Property Rights, IPR. The aim of this paper is to formalize and discuss requirements, modeling, design and validation of an institutional aggregator for metadata and content, coping with IPR Models for conditional access and providing content towards Europeana, the European international aggregator. This paper presents the identification of the Content Aggregator requirements for content management and IPR, and thus the definition and realization of a corresponding distributed architecture and workflow solution satisfying them. The main contributions of this paper consist of the formalization of IPR Model that enable the shortening of the activities for the IPR resolution, and avoid the assignment of conflicting rights/permissions during IPR model formalization, and thus of licensing. The proposed solution, models and tools have been validated in the case of the ECLAP service and results are reported in the paper. ECLAP Content Aggregator has been established by the European Commission to serve Europeana for the thematic area of Performing Arts institutions.

...read moreread less

Journal Article•DOI•

Mining Class Temporal Specification Dynamically Based on Extended Markov Model

[...]

Deng Chen¹, Deng Chen², Rubing Huang³, Binbin Qu¹, Sheng Jiang¹, Jianping Ju⁴ - Show less +2 more•Institutions (4)

Huazhong University of Science and Technology¹, Wuhan Institute of Technology², Jiangsu University³, Hubei University of Technology⁴

05 Aug 2015-International Journal of Software Engineering and Knowledge Engineering

TL;DR: To the best of the knowledge, this is the first work of learning specifications from object-oriented programs dynamically based on probabilistic models and it learns specifications in an online mode, which can refine existing models continuously.

...read moreread less

Abstract: Class temporal specification is a kind of important program specifications especially for object-oriented programs, which specifies that interface methods of a class should be called in a particular sequence. Currently, most existing approaches mine this kind of specifications based on finite state automaton. Observed that finite state automaton is a kind of deterministic models with inability to tolerate noise. In this paper, we propose to mine class temporal specifications relying on a probabilistic model extending from Markov chain. To the best of our knowledge, this is the first work of learning specifications from object-oriented programs dynamically based on probabilistic models. Different from similar works, our technique does not require annotating programs. Additionally, it learns specifications in an online mode, which can refine existing models continuously. Above all, we talk about problems regarding noise and connectivity of mined models and a strategy of computing thresholds is proposed to resolve them. To investigate our technique's feasibility and effectiveness, we implemented our technique in a prototype tool ISpecMiner and used it to conduct several experiments. Results of the experiments show that our technique can deal with noise effectively and useful specifications can be learned. Furthermore, our method of computing thresholds provides a strong assurance for mined models to be connected.

...read moreread less