The fast-paced evolution of Android APIs poses a challenging task for Android app developers. To leverage Android's frequently released APIs, developers must often spend considerable effort on API migrations. Prior research and the official Android documentation typically provide enough information to guide developers in identifying the API calls that must be migrated and the corresponding API calls in an updated version of Android (what to migrate). However, API migration remains a challenging task since developers lack the knowledge of how to migrate the API calls. There exist code examples, such as Google Samples, that illustrate the usage of APIs. We posit that by analyzing the changes of API usage in code examples, we can learn API migration patterns to assist developers with API migrations. In this paper, we propose an approach that learns API migration patterns from code examples, applies these patterns to the source code of Android apps for API migration, and presents the results to users as potential migration solutions. To evaluate our approach, we migrate API calls in open source Android apps by learning API migration patterns from code examples. We find that our approach can successfully learn API migration patterns and provide API migration assistance in 71 out of 80 cases. Our approach can either migrate API calls with little to no extra modification needed or provide guidance to assist with the migrations. Through a user study, we find that adopting our approach reduces the time spent on migrating APIs by 29 percent on average. Moreover, our interviews with app developers highlight the benefits of our approach when seeking API migrations. Our approach demonstrates the value of leveraging the knowledge contained in software repositories to facilitate API migrations.
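The abstract does not show how learned migration patterns are applied. As a minimal illustrative sketch (not the paper's actual tooling), one can imagine patterns represented as (deprecated call, replacement template) pairs mined from diffs of code examples; the pattern below reflects the real deprecation of `Resources.getColor(int)` in favor of `ContextCompat.getColor(Context, int)`, but the dictionary and `migrate` helper are hypothetical:

```python
import re

# Hypothetical migration patterns, as could be mined from diffs of code
# examples (e.g., Google Samples) between two Android API levels.
PATTERNS = {
    # getColor(int) was deprecated in API level 23 in favor of
    # ContextCompat.getColor(context, int).
    r"getResources\(\)\.getColor\(([\w.]+)\)":
        r"ContextCompat.getColor(this, \1)",
}

def migrate(source: str) -> str:
    """Apply each learned pattern to the app source, producing a
    candidate migration for the developer to review."""
    for old, new in PATTERNS.items():
        source = re.sub(old, new, source)
    return source

print(migrate("int c = getResources().getColor(R.color.primary);"))
# -> int c = ContextCompat.getColor(this, R.color.primary);
```

A full approach would also handle non-textual edits, such as adding the `ContextCompat` import and flagging call sites it could only partially migrate.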
Studying software logging using topic models
Li, Heng; Chen, Tse-Hsun (Peter); Shang, Weiyi
Empirical Software Engineering: An International Journal, 10/2018, Volume 23, Issue 5
Journal Article, Peer reviewed
Software developers insert logging statements in their source code to record important runtime information; such logged information is valuable for understanding system usage in production and debugging system failures. However, providing proper logging statements remains a manual and challenging task. Missing an important logging statement may increase the difficulty of debugging a system failure, while too much logging can increase system overhead and mask the truly important information. Intuitively, the actual functionality of a software component is one of the major drivers behind logging decisions. For instance, a method maintaining network communications is more likely to be logged than getters and setters. In this paper, we use automatically-computed topics of a code snippet to approximate the functionality of the code snippet. We study the relationship between the topics of a code snippet and the likelihood of the code snippet being logged (i.e., containing a logging statement). Our driving intuition is that certain topics in the source code are more likely to be logged than others. To validate our intuition, we conducted a case study on six open source systems, and we found that i) there exists a small number of “log-intensive” topics that are more likely to be logged than other topics; ii) each pair of the studied systems shares 12% to 62% common topics, and the likelihood of logging such common topics has a statistically significant correlation of 0.35 to 0.62 among all the studied systems; and iii) our topic-based metrics help explain the likelihood of a code snippet being logged, providing an improvement of 3% to 13% on AUC and 6% to 16% on balanced accuracy over a set of baseline metrics that capture the structural information of a code snippet. Our findings highlight that topics contain valuable information that can help guide and drive developers’ logging decisions.
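The paper's pipeline uses automatically-computed topics (e.g., from a topic model) rather than anything shown in the abstract. As a deliberately simplified toy sketch of the intuition, the snippet below approximates a code snippet's topic by word overlap with hand-picked word lists and looks up a hypothetical per-topic logging likelihood; the topic sets and likelihood values are invented for illustration:

```python
from collections import Counter

# Toy stand-in for learned topics: each "topic" is a set of indicative words.
TOPICS = {
    "network": {"socket", "connect", "send", "receive", "timeout"},
    "accessor": {"get", "set", "return", "field", "value"},
}
# Hypothetical likelihoods, as could be estimated from the fraction of
# training snippets of each topic that contain a logging statement.
LOG_LIKELIHOOD = {"network": 0.62, "accessor": 0.04}

def dominant_topic(snippet_tokens):
    """Pick the topic with the largest word overlap with the snippet."""
    overlap = Counter()
    for topic, words in TOPICS.items():
        overlap[topic] = len(set(snippet_tokens) & words)
    return overlap.most_common(1)[0][0]

tokens = ["socket", "connect", "timeout", "retry"]
topic = dominant_topic(tokens)
print(topic, LOG_LIKELIHOOD[topic])  # network 0.62
```

A real topic model (e.g., LDA) would infer topic distributions from the corpus instead of using fixed word lists, and the likelihoods would come from the studied systems' actual logging behavior.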
Continuous integration is widely adopted in software projects to reduce the time it takes to deliver changes to the market. To ensure software quality, developers also run regression test cases in a continuous fashion. The CI practice generates commit-by-commit software evolution data that provides great opportunities for future testing research. However, such data is often unavailable due to space limitations (e.g., developers only keep the data for a certain period) and the significant effort involved in re-running the test cases on a per-commit basis. In this paper, we present T-Evos, a dataset on test result and coverage evolution, covering 8,093 commits across 12 open-source Java projects. Our dataset includes the evolution of statement-level code coverage for every test case (whether passed or failed), test results, all build information, code changes, and the corresponding bug reports. We conduct an initial analysis to demonstrate the overall dataset. In addition, we conduct an empirical study using T-Evos to study the characteristics of test failures in CI settings. We find that test failures are frequent, and while most failures are resolved within a day, some failures require several weeks to resolve. We highlight the relationship between code changes and test failures, and provide insights for future automated testing research. Our dataset may be used for future testing research and benchmarking in CI. Our findings provide an important first step in understanding code coverage evolution and test failures in a continuous environment.
Logs contain valuable information about the runtime behaviors of software systems. Thus, practitioners rely on logs for various tasks such as debugging, system comprehension, and anomaly detection. However, logs are difficult to analyze due to their unstructured nature and large size. In this paper, we propose a novel approach called LogAssist that assists practitioners with log analysis. LogAssist provides an organized and concise view of logs by first grouping logs into event sequences (i.e., workflows), which better illustrate the system's runtime execution paths. Then, LogAssist compresses the log events in workflows by hiding consecutive events and applying n-gram modeling to identify common event sequences. We evaluated LogAssist on logs generated by one enterprise and two open source systems. We find that LogAssist can reduce the number of log events that practitioners need to investigate by up to 99 percent. Through a user study with 19 participants, we find that LogAssist can assist practitioners by reducing the time required for log analysis tasks by an average of 40 percent. The participants also rated LogAssist an average of 4.53 out of 5 for improving their experience of performing log analysis. Finally, we document our experiences and lessons learned from developing and adopting LogAssist in practice. We believe that LogAssist and our reported experiences may lay the groundwork for future analysis and interactive exploration of logs.
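The abstract mentions n-gram modeling to identify common event sequences but does not show it. A minimal sketch of that idea, with hypothetical event names and parameter defaults chosen for illustration:

```python
from collections import Counter

def common_ngrams(events, n=3, min_count=2):
    """Count length-n event sequences in a log; sequences seen at least
    min_count times are candidates for collapsing into a single
    workflow step when presenting the log."""
    grams = Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))
    return {g: c for g, c in grams.items() if c >= min_count}

log = ["open", "auth", "read", "open", "auth", "read", "close"]
print(common_ngrams(log))
# -> {('open', 'auth', 'read'): 2}
```

A tool like LogAssist would first abstract raw log lines into event templates before any such sequence mining, and would likely consider several values of n rather than a fixed one.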
Logging is a common practice in software engineering. Prior research has investigated the characteristics of logging practices in system software (e.g., web servers or databases) as well as desktop applications. However, despite the popularity of mobile apps, little is known about their logging practices. In this paper, we sought to study logging practices in mobile apps. In particular, we conduct a case study on 1,444 open source Android apps in the F-Droid repository. Through a quantitative study, we find that although mobile app logging is less pervasive than in server and desktop applications, logging is leveraged in almost all studied apps. However, we find that there exist considerable differences between the logging practices of mobile apps and the logging practices in server and desktop applications observed by prior studies. In order to further understand such differences, we conduct a firehouse email interview and a qualitative annotation on the rationale of using logs in mobile app development. By comparing the logging level of each logging statement with developers’ rationale of using the logs, we find that all too often (35.4%), the chosen logging level and the rationale are inconsistent. Such inconsistency may prevent useful runtime information from being recorded or may generate unnecessary logs that cause performance overhead. Finally, to understand the magnitude of such performance overhead, we conduct a performance evaluation between generating all the logs and not generating any logs in eight mobile apps. In general, we observe a statistically significant performance overhead based on various performance metrics (response time, CPU, and battery consumption). In addition, we find that when a significant logging overhead is observed in an app, disabling the unnecessary logs indeed provides a statistically significant performance improvement. Our results show the need for systematic guidance and automated tool support to assist with mobile logging practices.
In recent years, researchers and practitioners have been studying the impact of test smells on test maintenance. However, there is still limited empirical evidence on why developers remove test smells in software maintenance and the mechanisms employed for addressing test smells. In this paper, we conduct an empirical study on 12 real-world open-source systems to study the evolution and maintenance of test smells, and how test smells are related to software quality. Our results show that: 1) Although the number of test smell instances increases, test smell density decreases as systems evolve. 2) However, our qualitative analysis of the removed test smells reveals that most test smell removal (83%) is a by-product of feature maintenance activities. 45% of the removed test smells relocate to other test cases due to refactoring, while developers deliberately address only 17% of the test smell instances, consisting largely of Exception Catch/Throw and Sleepy Test. 3) Our statistical model shows that test smell metrics provide additional explanatory power on post-release defects over traditional baseline metrics (an average increase of 8.25% in AUC). However, most types of test smells have a minimal effect on post-release defects. Our study provides insight into how developers resolve test smells and into current test maintenance practices. Future studies on test smells may focus on the specific types of test smells that have a higher correlation with defect-proneness when helping developers with test code maintenance.
Logs in bug reports provide important debugging information for developers. During the debugging process, developers need to study the bug report and examine user-provided logs to understand the system executions that lead to the problem. Intuitively, user-provided logs illustrate the problems that users encounter and may help developers with the debugging process. However, some logs may be incomplete or inaccurate, which can make it difficult for developers to diagnose the bug and, thus, delay the bug fixing process. In this paper, we conduct an empirical study on the challenges that developers may encounter when analyzing user-provided logs, as well as their benefits. In particular, we study both log snippets and exception stack traces in bug reports. We conduct our study on 10 large-scale open-source systems with a total of 1,561 bug reports with logs (BRWL) and 7,287 bug reports without logs (BRNL). Our findings show that: 1) BRWL takes longer to resolve (median ranging from 3 to 91 days) than BRNL (median ranging from 1 to 25 days). We also find that reporters may not attach accurate or sufficient logs (i.e., developers often ask for additional logs in the Comments section of a bug report), which extends the bug resolution time. 2) Logs often provide a good indication of where a bug is located. Most bug reports (73%) have overlaps between the classes that generate the logs and their corresponding fixed classes. However, there are still many bug reports with no overlap between the logged and fixed classes. 3) Our manual study finds that system execution information is often missing from the logs. Many logs only show the point of failure (e.g., an exception) and do not provide a direct hint at the actual root cause. In fact, through call graph analysis, we find that in 28% of the studied bug reports the fixed classes are reachable from the logged classes, even though they are not visible in the logs attached to the bug reports. In addition, some logging statements are removed from the source code as the system evolves, which may cause further challenges in analyzing the logs. In short, our findings highlight possible future research directions to better help practitioners attach or analyze logs in bug reports.
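The call graph analysis above asks whether a fixed class is reachable from the classes that produced the log. A minimal sketch of that reachability check, using a breadth-first search over a toy class-level call graph (all class names here are hypothetical):

```python
from collections import deque

def reachable(call_graph, start_classes):
    """BFS over a class-level call graph: return every class reachable
    from the classes that emitted the log. If a fixed class is in this
    set but absent from the attached log, the log showed the failure
    point without revealing the root cause."""
    seen, queue = set(start_classes), deque(start_classes)
    while queue:
        cls = queue.popleft()
        for callee in call_graph.get(cls, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen

# Hypothetical graph: logged class A calls B, which calls the fixed class C.
graph = {"A": ["B"], "B": ["C"], "C": []}
logged, fixed = {"A"}, {"C"}
print(fixed <= reachable(graph, logged))  # -> True: root cause reachable but unlogged
```

In practice, the call graph would be extracted statically from the system's source code, and edges may be imprecise for dynamic dispatch.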
A readme file plays an important role in a GitHub repository by providing a starting point for developers to reuse the repository and make contributions. A good readme can provide sufficient information for users to learn about and start using a GitHub repository, and may be correlated with the popularity of a repository. Given the important role that a readme file plays, we aim to understand the correlation between the readme files of GitHub repositories and their popularity. We analyze the readme files of 5,000 GitHub repositories across more than 20 languages. We study the relationship between readme-file-related factors and the popularity of GitHub repositories. We observe that: (1) Most of the studied readme-file-related factors (e.g., the number of lists, the number and frequency of updates to the readme file) are statistically significantly different between popular and non-popular repositories with non-negligible effect size. (2) After controlling for repository-specific factors (e.g., repository topics and license information), the number of lists and the frequency of updates are the most significantly important factors that discriminate between popular and non-popular repositories. (3) In popular repositories, most updates were made to update references, while in non-popular repositories most updates concern the content describing how to use the repository.
Editor’s note: Open Science material was validated by the Journal of Systems and Software Open Science Board.
• Popular and non-popular repositories are different in various aspects.
• The number of lists is an important factor correlating with popularity.
• The frequency of updates is an important factor correlating with popularity.
• A large portion of updates were made to update references and “how-to” content.
• Developers should organize and maintain the readme file well.
Developers write logging statements to generate logs that provide valuable runtime information for debugging and maintenance of software systems. Log level is an important component of a logging statement, which enables developers to control the information to be generated at system runtime. However, due to the complexity of software systems and their runtime behaviors, deciding on a proper log level for a logging statement is a challenging task. For example, choosing a higher level (e.g., error) for a trivial event may confuse end users and increase system maintenance overhead, while choosing a lower level (e.g., trace) for a critical event may prevent important execution information from being conveyed in a timely manner. In this paper, we tackle the challenge by first conducting a preliminary manual study on the characteristics of log levels. We find that the syntactic context of the logging statement and the message to be logged may be related to the choice of log level, and that log levels that are further apart in order (e.g., trace and error) tend to have more differences in their characteristics. Based on this, we then propose a deep-learning based approach that leverages the ordinal nature of log levels to suggest log levels, using the syntactic context and message features of the logging statements extracted from the source code. Through an evaluation on nine large-scale open source projects, we find that: 1) our approach outperforms the state-of-the-art baseline approaches; 2) we can further improve the performance of our approach by enlarging the training data with data from other systems; 3) our approach also achieves promising results on cross-system suggestions that are even better than the baseline approaches' within-system suggestions. Our study highlights the potential of suggesting log levels to help developers make informed logging decisions.
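The abstract says the approach leverages the ordinal nature of log levels but does not show how. One standard way to encode an ordered label for learning (not necessarily the paper's exact scheme) is Frank & Hall-style cumulative binary encoding, where adjacent levels differ in one bit and distant levels in many, matching the observation that levels further apart in order differ more:

```python
# Ordered log levels, from least to most severe.
LEVELS = ["trace", "debug", "info", "warn", "error"]

def ordinal_target(level):
    """Encode an ordered log level as K-1 cumulative binary targets:
    bit i is 1 iff the level is strictly above LEVELS[i]. A model then
    learns K-1 'is the level above X?' decisions instead of one flat
    K-way classification, preserving the ordering."""
    idx = LEVELS.index(level)
    return [1 if idx > i else 0 for i in range(len(LEVELS) - 1)]

print(ordinal_target("info"))   # -> [1, 1, 0, 0]
print(ordinal_target("error"))  # -> [1, 1, 1, 1]
```

At prediction time, the suggested level is recovered from the number of leading 1s in the model's thresholded outputs.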