Context
Code review is a widely used technique of systematic examination of code changes which aims at increasing software quality. Code reviews provide several benefits for the project, including ...finding bugs, knowledge transfer, and assurance of adherence to project guidelines and coding style. However, code reviews have a major cost: they can delay the merge of the code change, and thus, impact the overall development process. This cost can be even higher if developers do not understand something, i.e., when developers face
confusion
during the code review.
Objective
This paper studies the phenomenon of
confusion
in code reviews. Understanding confusion is an important starting point to help reducing the cost of code reviews and enhance the effectiveness of this practice, and hence, improve the development process.
Method
We conducted two complementary studies. The first one aimed at identifying the reasons for confusion in code reviews, its impacts, and the coping strategies developers use to deal with it. Then, we surveyed developers to identify the most frequently experienced reasons for confusion, and conducted a systematic mapping study of solutions proposed for those reasons in the scientific literature.
Results
From the first study, we build a framework with 30 reasons for confusion, 14 impacts, and 13 coping strategies. The results of the systematic mapping study shows 38 articles addressing the most frequent reasons for confusion. From those articles, we found 13 different solutions for confusion proposed in the literature, and five impacts were established related to the most frequent reasons for confusion.
Conclusions
Based on the solutions identified in the mapping study, or the lack of them, we propose an actionable guideline for developers on how to cope with confusion during code reviews; we also make several suggestions how tool builders can support code reviews. Additionally, we propose a research agenda for researchers studying code reviews.
Sentiment analysis methods have become popular for investigating human communication, including discussions related to software projects. Since general-purpose sentiment analysis tools do not fit ...well with the information exchanged by software developers, new tools, specific for software engineering (SE), have been developed. We investigate to what extent off-the-shelf SE-specific tools for sentiment analysis mitigate the threats to conclusion validity of empirical studies in software engineering, highlighted by previous research. First, we replicate two studies addressing the role of sentiment in security discussions on GitHub and in question-writing on Stack Overflow. Then, we extend the previous studies by assessing to what extent the tools agree with each other and with the manual annotation on a gold standard of 600 documents. We find that different SE-specific sentiment analysis tools might lead to contradictory results at a fine-grain level, when used
off-the-shelf
. Conversely, platform-specific tuning or retraining might be needed to take into account differences in platform conventions, jargon, or document lengths.
Sustained participation by contributors in opensource software is critical to the survival of open-source projects and can provide career advancement benefits to individual contributors. However, not ...all contributors reap the benefits of open-source participation fully, with prior work showing that women are particularly underrepresented and at higher risk of disengagement. While many barriers to participation in open-source have been documented in the literature, relatively little is known about how the social networks that open-source contributors form impact their chances of long-term engagement. In this paper we report on a mixed-methods empirical study of the role of social capital (i.e., the resources people can gain from their social connections) for sustained participation by women and men in open-source GitHub projects. After combining survival analysis on a large, longitudinal data set with insights derived from a user survey, we confirm that while social capital is beneficial for prolonged engagement for both genders, women are at disadvantage in teams lacking diversity in expertise.
Self-admitted technical debt (SATD) consists of annotations, left by developers as comments in the source code or elsewhere, as a reminder about pieces of software manifesting technical debt (TD), ...i.e., “not being ready yet”. While previous studies have investigated SATD management and its relationship with software quality, there is little understanding of the extent and circumstances to which developers admit TD. This paper reports the results of a study in which we asked developers from industry and open-source about their practices in annotating source code and other artifacts for self-admitting TD. The study consists of two phases. First, we conducted 10 interviews to gather a first understanding of the phenomenon and to prepare a survey questionnaire. Then, we surveyed 52 industrial developers as well as 49 contributors to open-source projects. Results of the study show how the TD annotation practices, as well as the typical content of SATD comments, are very similar between open-source and industry. At the same time, our results highlight how, while open-source code is spread of comments admitting the need for improvements, SATD in industry may be dictated by organizational guidelines but, at the same time, implicitly discouraged by the fear of admitting responsibilities. Results also highlight the need for tools helping developers to achieve a better TD awareness.
When identifying the origin of software bugs, many studies assume that “a bug was introduced by the lines of code that were modified to fix it”. However, this assumption does not always hold and at ...least in some cases, these modified lines are not responsible for introducing the bug. For example, when the bug was caused by a change in an external API. The lack of empirical evidence makes it impossible to assess how important these cases are and therefore, to which extent the assumption is valid. To advance in this direction, and better understand how bugs “are born”, we propose a model for defining criteria to identify the first snapshot of an evolving software system that exhibits a bug. This model, based on the
perfect test
idea, decides whether a bug is observed after a change to the software. Furthermore, we studied the model’s criteria by carefully analyzing how 116 bugs were introduced in two different open source software projects. The manual analysis helped classify the root cause of those bugs and created manually curated datasets with bug-introducing changes and with bugs that were not introduced by any change in the source code. Finally, we used these datasets to evaluate the performance of four existing SZZ-based algorithms for detecting bug-introducing changes. We found that SZZ-based algorithms are not very accurate, especially when multiple commits are found; the F-Score varies from 0.44 to 0.77, while the percentage of true positives does not exceed 63%. Our results show empirical evidence that the prevalent assumption, “a bug was introduced by the lines of code that were modified to fix it”, is just one case of how bugs are introduced in a software system. Finding what introduced a bug is not trivial: bugs can be introduced by the developers and be in the code, or be created irrespective of the code. Thus, further research towards a better understanding of the origin of bugs in software projects could help to improve design integration tests and to design other procedures to make software development more robust.
Code reading is one of the most frequent activities in software maintenance. Such an activity aims at acquiring information from the code and, thus, it is a prerequisite for program comprehension: ...developers need to read the source code they are going to modify before implementing changes. As the code changes, so does its readability; however, it is not clear yet
how
code readability changes during software evolution. To understand how code readability changes when software evolves, we studied the history of 25 open source systems. We modeled code readability evolution by defining four states in which a file can be at a certain point of time (
non-existing
,
other-name
,
readable
, and
unreadable
). We used the data gathered to infer the probability of transitioning from one state to another one. In addition, we also manually checked a significant sample of transitions to compute the performance of the state-of-the-art readability prediction model we used to calculate the transition probabilities. With this manual analysis, we found that the tool correctly classifies all the transitions in the majority of the cases, even if there is a loss of accuracy compared to the single-version readability estimation. Our results show that most of the source code files are created readable. Moreover, we observed that only a minority of the commits change the readability state. Finally, we manually carried out qualitative analysis to understand what makes code unreadable and what developers do to prevent this. Using our results we propose some guidelines (i) to reduce the risk of code readability erosion and (ii) to promote best practices that make code readable.
The development of systems following model-driven engineering can include models from different domains. For example, to develop a mechatronic component one might need to combine expertise about ...mechanics, electronics, and software. Although these models belong to different domains, the changes in one model can affect other models causing inconsistencies in the entire system. Only few tools, however, support management of models from different domains. Indeed, these models are created using different modeling notations and it is not plausible to use a multitude of parsers geared toward each and every modeling notation. Therefore, to ensure maintenance of multi-domain systems, we need a uniform approach that would be independent from the peculiarities of the notation. Notation independence implies that such a uniform approach can only be based on elements commonly present in models of different domains, i.e., text, boxes, and lines. In this study, we investigate the suitability of optical character recognition (OCR) as a basis for such a uniformed approach. We select graphical models from various domains that typically combine textual and graphical elements. We start by analyzing the performance of Google Cloud Vision and Microsoft Cognitive Services, two off-the-shelf OCR services. Google Cloud Vision performed better than Microsoft Cognitive Services being able to detect text of 70% of model elements. Errors made by Google Cloud Vision are due to absence of support for text common in engineering formulas, e.g., Greek letters, equations, and subscripts. We identified the
multi-line text
error as one of the main issues of using OCR to recognize textual elements in models from different domains. This error happens when OCR misinterprets one textual element as two separate elements. To address the
multi-line text
error, we build
Xamã
on top of Google Cloud Vision.
Xamã
includes two approaches to identify whether the elements are positioned on a single line or multiple lines, and merge those identified as positioned on multiples lines. With and without shape detection,
Xamã
correctly identified 956 and 905 elements, respectively, out of 1171. Additionally, we compared the accuracy of
Xamã
and state-of-the-art tool img2UML, and we observe that
Xamã
outperformed img2UML in both precision and recall, being able to recognize 433 out of 614 textual elements as opposed to 171 by img2UML.
Objective
The goal of this study is to identify gaps and challenges related to cross-domain model management focusing on consistency checking.
Method
We conducted a systematic literature review. We ...used the keyword-based search on Google Scholar, and we identified 618 potentially relevant studies; after applying inclusion and exclusion criteria, 96 papers were selected for further analysis.
Results
The main findings/contributions are: (i) a list of available tools used to support model management; (ii) 40% of the tools can provide consistency checking on models of different domains and 25% on models of the same domain, and 35% do not provide any consistency checking; (iii) available strategies to keep the consistency between models of different domains are not mature enough; (iv) most of the tools that provide consistency checking on models of different domains can only capture up to two inconsistency types; (v) the main challenges associated with tools that manage models on different domains are related to
interoperability
between tools and the
consistency maintenance
.
Conclusion
The results presented in this study can be used to guide new research on maintaining the consistency between models of different domains. Example of further research is to investigate how to capture the
Behavioral
and
Refinement
inconsistency types. This study also indicates that the tools should be improved in order to address, for example, more kinds of consistency check.
Context
Models, as the main artifact in model-driven engineering, have been extensively used in the area of embedded systems for code generation and verification. One of the most popular behavioral ...modeling techniques is the state machine. Many state machine modeling guidelines recommend that a state machine should have more than one state in order to be meaningful. However, single-state state machines (SSSMs) violating this recommendation have been used in modeling cases reported in the literature.
Objective
We aim for understanding the phenomenon of using SSSMs in practice as understanding why developers violate the modeling guidelines is the first step towards improvement of modeling tools and practice.
Method
To study the phenomenon, we conducted an exploratory study which consists of two complementary studies. The first study investigated the prevalence and role of SSSMs in the domain of embedded systems, as well as the reasons why developers use them and their perceived advantages and disadvantages. We employed the sequential explanatory strategy, including repository mining and interview, to study 1500 state machines from 26 components at ASML, a leading company in manufacturing lithography machines from the semiconductor industry. In the second study, we investigated the evolutionary aspects of SSSMs, exploring when SSSMs are introduced to the systems and how developers modify them by mining the largest state-machine-based component from the company.
Results
We observe that 25 out of 26 components contain SSSMs. Our interviews suggest that SSSMs are used to interface with the existing code, to deal with tool limitations, to facilitate maintenance and to ease verification. Our study on the evolutionary aspects of SSSMs reveals that the need for SSSMs to deal with tool limitations grew continuously over the years. Moreover, only a minority of SSSMs have been changed between SSSM and multiple-state state machine (MSSM) during their evolution. The most frequent modifications developers made to SSSMs is inserting events with constraints on the execution of the events.
Conclusions
Based on our results, we provide implications for developers and tool builders. Furthermore, we formulate hypotheses about the effectiveness of SSSMs, the impacts of SSSMs on development, maintenance and verification as well as the evolution of SSSMs.