To perform their daily tasks, developers make intensive use of existing resources by consulting open source software (OSS) repositories. Such platforms contain rich data sources, e.g., code snippets, documentation, and user discussions, that can be useful for supporting development activities. Over the last decades, several techniques and tools have been promoted to provide developers with innovative features, aiming to bring improvements in terms of development effort, cost savings, and productivity. In the context of the EU H2020 CROSSMINER project, a set of recommendation systems has been conceived to assist software programmers in different phases of the development process. The systems provide developers with various artifacts, such as third-party libraries, documentation about how to use the APIs being adopted, or relevant API function calls. To build these systems, various technical choices have been made to overcome issues related to several aspects, including the lack of baselines, limited data availability, decisions about the performance measures, and evaluation approaches. This paper is an experience report that presents the knowledge pertinent to the set of recommendation systems developed through the CROSSMINER project. We explain in detail the challenges we had to deal with, together with the related lessons learned when developing and evaluating these systems. Our aim is to provide the research community with concrete takeaway messages that are expected to be useful for those who want to develop or customize their own recommendation systems. The reported experiences can facilitate interesting discussions and research work, which in the end contribute to the advancement of recommendation systems applied to solve different issues in Software Engineering.
For software engineering research to increase its impact and steer our community toward a more successful future, it must foster context-driven research. Such research focuses on problems defined in collaboration with industrial partners and is driven by concrete needs in specific domains and development projects.
Building on concepts drawn from control theory, self-adaptive software handles environmental and internal uncertainties by dynamically adjusting its architecture and parameters in response to events such as workload changes and component failures. Self-adaptive software is increasingly expected to meet strict functional and non-functional requirements in applications from areas as diverse as manufacturing, healthcare and finance. To address this need, we introduce a methodology for the systematic ENgineering of TRUstworthy Self-adaptive sofTware (ENTRUST). ENTRUST uses a combination of (1) design-time and runtime modelling and verification, and (2) industry-adopted assurance processes to develop trustworthy self-adaptive software and assurance cases arguing the suitability of the software for its intended application. To evaluate the effectiveness of our methodology, we present a tool-supported instance of ENTRUST and its use to develop proof-of-concept self-adaptive software for embedded and service-based systems from the oceanic monitoring and e-finance domains, respectively. The experimental results show that ENTRUST can be used to engineer self-adaptive software systems in different application domains and to generate dynamic assurance cases for these systems.
Representative sampling appears rare in empirical software engineering research. Not all studies need representative samples, but a general lack of representative sampling undermines a scientific field. This article therefore reports a critical review of the state of sampling in recent, high-quality software engineering research. The key findings are: (1) random sampling is rare; (2) sophisticated sampling strategies are very rare; (3) sampling, representativeness and randomness often appear misunderstood. These findings suggest that software engineering research has a generalizability crisis. To address these problems, this paper synthesizes existing knowledge of sampling into a succinct primer and proposes extensive guidelines for improving the conduct, presentation and evaluation of sampling in software engineering research. It is further recommended that while researchers should strive for more representative samples, disparaging non-probability sampling is generally capricious and particularly misguided for predominately qualitative research.
Estimating and understanding software development productivity are crucial tasks for researchers and practitioners. Although various studies have evaluated the impact of human factors on productivity, few have explored the influence of cultural and geographical diversity in software development communities. In particular, previous work treats cultural aspects as abstract concepts without providing a quantitative representation. Improved knowledge of these matters might help project managers to assemble more productive teams and tool vendors to design software analytics toolkits that may better estimate productivity. This paper aims to enlarge the existing body of knowledge on the factors affecting productivity by focusing on the cultural and geographical dispersion of a development community, namely how diverse a community is in terms of cultural attitudes and geographical collocation of the members who belong to it. To reach this goal, we performed a mixed-method empirical study. First, we built a statistical model relating dispersion metrics with the productivity of 25 open-source communities on GitHub. Then, we performed a confirmatory survey with 140 practitioners. The key results of our study indicate that cultural and geographical dispersion considerably impact productivity, thus encouraging managers and practitioners to consider such aspects during all the phases of the software development lifecycle. We conclude our paper by elaborating on the main insights from our analyses and outlining implications that may drive further research.
• We use Hofstede dimensions to represent culture and cultural dispersion.
• Cultural and geographical dispersion impact the productivity of a community.
• Power Distance and Long Term Orientation Dispersion negatively impact productivity.
• Individualism and Indulgence Dispersion positively impact productivity.
• Geographical Dispersion positively impacts productivity.
Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit the abundance of patterns of code. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.
Code flaws or vulnerabilities are prevalent in software systems and can potentially cause a variety of problems including deadlock, hacking, information loss and system failure. A variety of approaches have been developed to detect the most likely locations of such code vulnerabilities in large code bases. Most of them rely on manually designing code features (e.g., complexity metrics or frequencies of code tokens) that represent the characteristics of the potentially problematic code to locate. However, all suffer from challenges in sufficiently capturing both semantic and syntactic representation of source code, an important capability for building accurate prediction models. In this paper, we describe a new approach, built upon the powerful deep learning Long Short Term Memory model, to automatically learn both semantic and syntactic features of code. Our evaluation on 18 Android applications and the Firefox application demonstrates that the prediction power obtained from our learned features is better than what is achieved by state-of-the-art vulnerability prediction models, for both within-project prediction and cross-project prediction.
What kinds of contracts do ML APIs need?
Khairunnesa, Samantha Syeda; Ahmed, Shibbir; Imtiaz, Sayem Mohammad ...
Empirical software engineering : an international journal, 11/2023, Volume: 28, Issue: 6
Journal Article, Peer reviewed, Open access
Recent work has shown that Machine Learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.
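As a rough illustration of the two contract categories named in this abstract (a constraint on a single argument, and a constraint on the order of API calls), the following minimal Python sketch may help. It is not taken from the paper: `ToyModel`, `requires`, and `ContractViolation` are hypothetical names invented for this example and do not belong to TensorFlow, Scikit-learn, Keras, or PyTorch.

```python
from functools import wraps


class ContractViolation(Exception):
    """Raised when a caller breaks an API contract (hypothetical type)."""


def requires(predicate, message):
    """Behavioral contract: check a constraint on the first argument."""
    def decorator(func):
        @wraps(func)
        def wrapper(self, value, *args, **kwargs):
            if not predicate(value):
                raise ContractViolation(f"{func.__name__}: {message}")
            return func(self, value, *args, **kwargs)
        return wrapper
    return decorator


class ToyModel:
    """Hypothetical stand-in for an ML estimator; not a real library class."""

    def __init__(self):
        self._fitted = False
        self._mean = 0.0

    @requires(lambda xs: len(xs) > 0, "training data must be non-empty")
    def fit(self, xs):
        # "Training" here is just storing the mean of the inputs.
        self._mean = sum(xs) / len(xs)
        self._fitted = True
        return self

    def predict(self, n):
        # Temporal contract: fit() must have been called before predict().
        if not self._fitted:
            raise ContractViolation("predict: call fit() before predict()")
        return [self._mean] * n
```

Calling `ToyModel().predict(2)` before `fit` raises `ContractViolation` at the API boundary, which is the kind of early detection the abstract asks about; the decorator covers the single-argument case and the `_fitted` flag covers the call-order case.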