scispace - formally typeset
Search or ask a question
Author

SchmidAngelika

Bio: SchmidAngelika is an academic researcher from IBM. The author has contributed to research in topics: Software & Open-source software development. The author has co-authored 1 publications.

Papers
More filters
Journal ArticleDOI
TL;DR: Many open-source software projects depend on a few core developers, who take over both the bulk of coordination and programming tasks as mentioned in this paper, and they are supported by peripheral developers who contribute e
Abstract: Many open-source software projects depend on a few core developers, who take over both the bulk of coordination and programming tasks. They are supported by peripheral developers, who contribute ei...

5 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: It is shown how a set of heterogeneous networks composed of social and technical entities can be jointly embedded in a single vector space enabling mathematically sound comparisons between distinct software projects.
Abstract: Software development is at the intersection of the social realm, involving people who develop the software, and the technical realm, involving artifacts (code, docs, etc.) that are being produced. It has been shown that a socio-technical perspective provides rich information about the state of a software project. In particular, we are interested in socio-technical factors that are associated with project success. For this purpose, we frame the task as a network classification problem. We show how a set of heterogeneous networks composed of social and technical entities can be jointly embedded in a single vector space enabling mathematically sound comparisons between distinct software projects. Our approach is specifically designed using intuitive metrics stemming from network analysis and statistics to ease the interpretation of results in the context of software engineering wisdom. Based on a selection of 32 open source projects, we perform an empirical study to validate our approach considering three prediction scenarios to test the classification model’s ability generalizing to (1) randomly held-out project snapshots, (2) future project states, and (3) entirely new projects. Our results provide evidence that a socio-technical perspective is superior to a pure social or technical perspective when it comes to early indicators of future project success. To our surprise, the methodology proposed here even shows evidence of being able to generalize to entirely novel (project hold-out set) software projects reaching predication accuracies of 80%, which is a further testament to the efficacy of our approach and beyond what has been possible so far. In addition, we identify key features that are strongly associated with project success. Our results indicate that even relatively simple socio-technical networks capture highly relevant and interpretable information about the early indicators of future project success.

5 citations

Proceedings ArticleDOI
07 Nov 2022
TL;DR: Zhang et al. as discussed by the authors proposed four entropy-based indices to measure the degree of community split, shrink, merge, expand, and emerge, respectively, which can provide a comprehensive and quantitative measure of community evolution in developer social networks.
Abstract: Understanding the evolution of communities in developer social networks (DSNs) around open source software (OSS) projects can provide valuable insights about the socio-technical process of OSS development. Existing studies show the evolutionary behaviors of social communities can effectively be described using patterns including split, shrink, merge, expand, emerge, and extinct. However, existing pattern-based approaches are limited in supporting quantitative analysis, and are potentially problematic for using the patterns in a mutually exclusive manner when describing community evolution. In this work, we propose that different patterns can occur simultaneously between every pair of communities during the evolution, just in different degrees. Four entropy-based indices are devised to measure the degree of community split, shrink, merge, and expand, respectively, which can provide a comprehensive and quantitative measure of community evolution in DSNs. The indices have properties desirable to quantify community evolution including monotonicity, and bounded maximum and minimum values that correspond to meaningful cases. They can also be combined to describe more patterns such as community emerge and extinct. We conduct studies with real-world OSS projects to evaluate the validity of the proposed indices. The results suggest the proposed indices can effectively capture community evolution, and are consistent with existing approaches in detecting evolution patterns in DSNs with an accuracy of 94.1%. The results also show that the indices are useful in predicting OSS team productivity with an accuracy of 0.718. In summary, the proposed approach is among the first to quantify the degree of community evolution with respect to different patterns, which is promising in supporting future research and applications about DSNs and OSS development.

1 citations

Journal ArticleDOI
TL;DR: In this article , the authors conducted a longitudinal study of 20 popular OSS projects, modeling their organizational structures as networks of developers connected by communication ties and characterizing developers' positions in terms of hierarchical (sub)structures in these networks.
Abstract: Despite the absence of a formal process and a central command-and-control structure, developer organization in open-source software (OSS) projects are far from being a purely random process. Prior work indicates that, over time, highly successful OSS projects develop a hybrid organizational structure that comprises a hierarchical part and a non-hierarchical part. This suggests that hierarchical organization is not necessarily a global organizing principle and that a fundamentally different principle is at play below the lowest positions in the hierarchy. Given the vast proportion of developers are in the non-hierarchical part, we seek to understand the interplay between these two fundamentally differently organized groups, how this hybrid structure evolves, and the trajectory individual developers take through these structures over the course of their participation. We conducted a longitudinal study of the full histories of 20 popular OSS projects, modeling their organizational structures as networks of developers connected by communication ties and characterizing developers’ positions in terms of hierarchical (sub)structures in these networks. We observed a number of notable trends and patterns in the subject projects: (1) hierarchy is a pervasive structural feature of developer networks of OSS projects; (2) OSS projects tend to form hybrid organizational structures, consisting of a hierarchical and a non-hierarchical part; and (3) the positional trajectory of a developer starts loosely connected in the non-hierarchical part and then tightly integrate into the hierarchical part, which is associated with the acquisition of experience (tenure), in addition to coordination and coding activities. Our study (a) provides a methodological basis for further investigations of hierarchy formation, (b) suggests a number of hypotheses on prevalent organizational patterns and trends in OSS projects to be addressed in further work, and (c) may ultimately guide the governance of organizational structures.

1 citations

Journal ArticleDOI
TL;DR: Zhang et al. as mentioned in this paper proposed a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues and pull requests, and validated the set of automatically identified core developers with a sample of project-reported developer lists.
Abstract: Many open-source software (OSS) projects are self-organized and do not maintain official lists with information on developer roles. So, knowing which developers take core and maintainer roles is, despite being relevant, often tacit knowledge. We propose a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues and pull requests. In an empirical study on 25 GitHub projects, (1) we validate the set of automatically identified core developers with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which we extracted from privileged issue events, is sound and the accuracy of state-of-the-art unsupervised classification methods depends mainly on the data source (commit data vs. issue data) rather than the network-construction method (directed vs. undirected, etc.). In perspective, our results shall guide research and practice to choose appropriate unsupervised classification methods, and our method can help create reliable ground-truth data for training supervised classification methods.
Journal ArticleDOI
TL;DR: The proposed approach is among the first to quantify the degree of community evolution with respect to different patterns, and is consistent with existing approaches in detecting evolution patterns in DSNs with an accuracy of 94.1%.
Abstract: Understanding the evolution of communities in developer social networks (DSNs) around open source software (OSS) projects can provide valuable insights about the socio-technical process of OSS development. Existing studies show the evolutionary behaviors of social communities can effectively be described using patterns including split, shrink, merge, expand, emerge, and extinct. However, existing pattern-based approaches are limited in supporting quantitative analysis, and are potentially problematic for using the patterns in a mutually exclusive manner when describing community evolution. In this work, we propose that different patterns can occur simultaneously between every pair of communities during the evolution, just in different degrees. Four entropy-based indices are devised to measure the degree of community split, shrink, merge, and expand, respectively, which can provide a comprehensive and quantitative measure of community evolution in DSNs. The indices have properties desirable to quantify community evolution including monotonicity, and bounded maximum and minimum values that correspond to meaningful cases. They can also be combined to describe more patterns such as community emerge and extinct. We conduct experiments with real-world OSS projects to evaluate the validity of the proposed indices. The results suggest the proposed indices can effectively capture community evolution, and are consistent with existing approaches in detecting evolution patterns in DSNs with an accuracy of 94.1\%. The results also show that the indices are useful in predicting OSS team productivity with an accuracy of 0.718. In summary, the proposed approach is among the first to quantify the degree of community evolution with respect to different patterns, which is promising in supporting future research and applications about DSNs and OSS development.