TODO comments play an important role in helping developers manage their tasks and communicate with other team members. TODO comments are often introduced by developers as a type of technical debt, such as a reminder to add or remove features or a request to optimize the code implementation. These can all be considered notifications for developers to revisit current suboptimal solutions. TODO comments often bring short-term benefits - higher productivity or lower development cost - and indicate that attention must be paid to long-term software quality. Unfortunately, due to a lack of knowledge or experience and/or time constraints, developers sometimes forget, or are not even aware of, suboptimal implementations. The absence of TODO comments for these suboptimal solutions may hurt software quality and reliability in the long term. Therefore, it is beneficial to remind developers of suboptimal solutions whenever they change the code. In this work, we refer to this problem as the task of detecting TODO-missed commits, and we propose a novel approach named TDReminder (TODO comment Reminder) to address it. With the help of TDReminder, developers can identify possible TODO-missed commits just in time when submitting a commit. Our approach has two phases: offline training and online inference. We first embed the code change and the commit message into contextual vector representations using two neural encoders, respectively. The association between these representations is learned by our model automatically. In the online inference phase, TDReminder leverages the trained model to compute the likelihood of a commit being a TODO-missed commit. We evaluate TDReminder on datasets crawled from 10k popular Python and Java repositories on GitHub, respectively. Our experimental results show that TDReminder outperforms a set of baselines by a large margin in detecting TODO-missed commits.
Moreover, to better help developers use TDReminder in practice, we have incorporated Large Language Models (LLMs) into our approach to provide explainable recommendations. A user study shows that our tool can effectively inform developers not only "when" to add TODOs, but also "where" and "what" TODOs should be added, verifying the value of our tool in practical applications.
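The TDReminder pipeline described above (encode the code change and the commit message separately, then score their association) can be illustrated with a minimal sketch. The hashing bag-of-words encoder and cue-based scoring below are illustrative stand-ins for the paper's learned neural encoders and trained model; `todo_missed_likelihood` and its heuristic are hypothetical.

```python
import math
import re

def encode(text, dim=64):
    """Toy hashing bag-of-words encoder standing in for TDReminder's
    neural encoders (the real model learns these representations)."""
    vec = [0.0] * dim
    for tok in re.findall(r"\w+", text.lower()):
        vec[hash(tok) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def todo_missed_likelihood(code_change, commit_message):
    """Hypothetical scorer: embed the diff and the message, then map their
    association to a likelihood that a TODO comment should have been added.
    Cosine overlap with TODO-indicative cue words stands in for the learned
    association (illustrative heuristic only, not the paper's model)."""
    cues = encode("todo fixme workaround hack temporary optimize later")
    v_diff = encode(code_change)
    v_msg = encode(commit_message)
    assoc = sum(d * m for d, m in zip(v_diff, v_msg))
    cue_hit = sum(c * d for c, d in zip(cues, v_diff))
    # A diff full of TODO-like cues with no matching note in the message
    # suggests a TODO-missed commit.
    return max(0.0, min(1.0, cue_hit * (1.0 - assoc)))

print(todo_missed_likelihood(
    "+ # temporary workaround, optimize later\n+ result = slow_path(data)",
    "fix crash on empty input"))
```

In the real system the association is learned from commit history; here a positive score merely flags that the diff contains reminder-like language the commit message does not echo.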
There is burgeoning interest in designing AI-based systems to assist humans in designing computing systems, including tools that automatically generate computer code. The most notable of these comes in the form of the first self-described 'AI pair programmer', GitHub Copilot, a language model trained over open-source GitHub code. However, code often contains bugs, and so, given the vast quantity of unvetted code that Copilot has processed, it is certain that the language model will have learned from exploitable, buggy code. This raises concerns about the security of Copilot's code contributions. In this work, we systematically investigate the prevalence and conditions that can cause GitHub Copilot to recommend insecure code. To perform this analysis, we prompt Copilot to generate code in scenarios relevant to high-risk cybersecurity weaknesses, e.g. those from MITRE's "Top 25" Common Weakness Enumeration (CWE) list. We explore Copilot's performance on three distinct code-generation axes, examining how it performs given diversity of weaknesses, diversity of prompts, and diversity of domains. In total, we produce 89 different scenarios for Copilot to complete, yielding 1,689 programs. Of these, we found approximately 40% to be vulnerable.
The practice of Continuous Integration (CI) allows developers to quickly integrate and verify project modifications. Thus, CI acceleration products are a boon to developers seeking rapid feedback. However, if outcomes vary between accelerated and non-accelerated settings, the trustworthiness of the acceleration is called into question.
In this paper, we study the trustworthiness of two CI acceleration products, one based on program analysis (PA) and the other on machine learning (ML). We re-execute 50 failing builds from ten open-source projects in non-accelerated (baseline), PA-accelerated, and ML-accelerated settings. We find that when applied to known failing builds, PA-accelerated builds align with the non-accelerated build results more often (a 43.83 percentage point difference across the ten projects). We conclude that while there is still room for improvement in both CI acceleration products, the selected PA product currently provides a more trustworthy signal of build outcomes than the ML product.
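The comparison above reduces to measuring how often each accelerated setting reproduces the baseline outcome and taking the percentage-point difference. A minimal sketch, with hypothetical build outcomes (the study's actual figure is 43.83 percentage points over 50 builds):

```python
def agreement_rate(baseline, accelerated):
    """Fraction of builds whose accelerated outcome matches the
    non-accelerated (baseline) outcome."""
    assert len(baseline) == len(accelerated)
    return sum(b == a for b, a in zip(baseline, accelerated)) / len(baseline)

# Hypothetical outcomes for ten re-executed known-failing builds.
baseline = ["fail"] * 10
pa = ["fail"] * 9 + ["pass"]          # PA acceleration flips one outcome
ml = ["fail"] * 5 + ["pass"] * 5      # ML acceleration flips five

pp_diff = (agreement_rate(baseline, pa) - agreement_rate(baseline, ml)) * 100
print(f"{pp_diff:.1f} percentage points")  # 40.0 with these made-up outcomes
```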
High-level guidelines and tools for managing artificial intelligence (AI) ethics have been introduced to help industry organizations make more ethical AI systems. The results of a survey of 211 software companies provide insights into the current state of industrial practice.
Agile Management and VUCA-RR provides cutting-edge, multidisciplinary research and expert insight into the advancing technologies and new strategies being used in business settings, as well as for administrative and leadership roles in organizations.
GitOps: The Evolution of DevOps?
Beetz, Florian; Harrer, Simon
IEEE Software, July-Aug. 2022, Volume 39, Issue 4
Journal Article, Peer reviewed
GitOps is a new concept that is supposed to be the evolution of DevOps, and we analyze whether that is so. Since neither DevOps nor GitOps has a standard definition, we synthesize one for each and compare them by their key elements.
Reusing code snippets from online programming Q&A communities has become a common development practice, in which developers often need to adapt code snippets to their code contexts to satisfy their own programming needs. However, how developers make these code adaptations based on contexts is still unclear. To bridge this gap, we first conduct a semi-structured interview of 21 developers to investigate their adaptation practices and perceived challenges during this process. The results suggest that code snippet adaptation is a challenging and exhausting task for developers, as they must tailor the snippets with laborious work to guarantee their correctness and quality. We also note that developers all resort to their intra-file context to complete adaptations, which motivates us to further study how developers perform context-based adaptations (CAs) in real scenarios. To this end, we conduct a quantitative study on an adaptation dataset comprising 300 code snippet reuse cases with 1,384 adaptations from Stack Overflow to GitHub. For each adaptation, we manually annotate its intention and its relationship with the context. Based on our annotated data, we employ frequent itemset mining to obtain four CA patterns from our dataset: Fortification, Code Wiring, Attribute-ization, and Parameterization. Our main findings reveal that: (1) more than half of the code snippet reuse cases include CAs, and 23.3% of the adaptations are CAs; (2) more than half of the CAs are corrective adaptations, and variable is the primary adapted language construct; (3) attribute is the most frequently utilized context, and 88% of the local contexts are within the nearest 10 LOC; and (4) CAs towards different intentions are repetitive, which is useful for automatic adaptation. Overall, our study provides valuable insights into code snippet adaptation and has important implications for research, practice, and tool design.
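The frequent itemset mining step described above can be sketched with a tiny Apriori-style counter over annotated reuse cases. The annotation labels below are hypothetical examples, not entries from the paper's dataset:

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Count co-occurring adaptation annotations; itemsets meeting
    min_support become candidate CA patterns (a simplified stand-in for
    the paper's mining step)."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

# Hypothetical per-adaptation annotations (intention, adapted construct).
reuse_cases = [
    {"corrective", "variable"},
    {"corrective", "variable"},
    {"adaptive", "attribute"},
    {"corrective", "attribute"},
]
for itemset, support in sorted(frequent_itemsets(reuse_cases).items()):
    print(itemset, support)
```

With these toy transactions, the pair ("corrective", "variable") survives the support threshold, mirroring the paper's finding that corrective adaptations of variables co-occur frequently.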
Dashboards, which comprise multiple views on a single display, help analyze and communicate multiple perspectives of data simultaneously. However, creating effective and elegant dashboards is challenging, since it requires careful and logical arrangement and coordination of multiple visualizations. To address this problem, we propose a data-driven approach, DMiner, for mining design rules from dashboards and automating dashboard organization. Specifically, we focus on two prominent aspects of the organization: arrangement, which describes the position, size, and layout of each view in the display space; and coordination, which indicates the interaction between pairwise views. We build a new dataset containing 854 dashboards crawled online, and develop feature engineering methods for describing the single views and view-wise relationships in terms of data, encoding, layout, and interactions. Further, we identify design rules among those features and develop a recommender for dashboard design. We demonstrate the usefulness of DMiner through an expert study and a user study. The expert study shows that our extracted design rules are reasonable and conform to the design practice of experts. Moreover, a comparative user study shows that our recommender can help automate dashboard organization and reach human-level performance. In summary, our work offers a promising starting point for design mining of visualizations to build recommenders.
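A mined coordination rule of the kind described above might take the form "views encoding a shared data field should be linked." The rule, the view representation, and `recommend_coordination` below are hypothetical illustrations; DMiner's actual rules are learned from its 854-dashboard corpus:

```python
def recommend_coordination(view_a, view_b):
    """Toy pairwise-coordination rule in the spirit of DMiner's mined
    design rules: if two views encode a shared data field, suggest
    linked highlighting between them; otherwise suggest nothing."""
    shared = set(view_a["fields"]) & set(view_b["fields"])
    if shared:
        return ("linked-highlighting", sorted(shared))
    return ("none", [])

# Two hypothetical dashboard views sharing the "sales" field.
bar = {"type": "bar", "fields": ["region", "sales"]}
line = {"type": "line", "fields": ["month", "sales"]}
print(recommend_coordination(bar, line))  # ('linked-highlighting', ['sales'])
```

A full recommender would score many such rules (over data, encoding, layout, and interaction features) and rank candidate arrangements and coordinations.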
Protect your user
Floyd, Raymond E.
IEEE Potentials, Sept.-Oct. 2023, Volume 42, Issue 5
Journal Article, Peer reviewed
In the electronic environment of today, the sophistication of a user cannot be assumed to be at any particular level. In fact, users range from first graders, to individuals with advanced knowledge, to elders with little or no technical background. As a result, a much greater burden is placed on software developers to ensure that the product they release operates correctly under any conceivable key entry from a user. Anything less can be a disaster for a product, or set of products, due to customer dissatisfaction and complaints.