Path Complexity of Recursive Functions Pregerson, Eli
2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
05/2023
Conference Proceeding
Path coverage is of critical importance in software testing and verification, and further, path explosion is a well-known challenge for automatic software analysis techniques like symbolic execution ...7. Asymptotic Path Complexity (APC), a code complexity metric developed in my research lab, formalizes the quantitative measurement of path explosion.
Fault localization is a practical research topic that helps developers identify code locations that might cause bugs in a program. Most existing fault localization techniques are designed for ...imperative programs (e.g., C and Java) and rely on analyzing correct and incorrect executions of the program to identify suspicious statements. In this work, we introduce a fault localization approach for models written in a declarative language, where the models are not "executed," but rather converted into a logical formula and solved using backend constraint solvers. We present FLACK, a tool that takes as input an Alloy model consisting of some violated assertion and returns a ranked list of suspicious expressions contributing to the assertion violation. The key idea is to analyze the differences between counterexamples, i.e., instances of the model that do not satisfy the assertion, and instances that do satisfy the assertion to find suspicious expressions in the input model. The experimental results show that FLACK is efficient (can handle complex, real-world Alloy models with thousand lines of code within 5 seconds), accurate (can consistently rank buggy expressions in the top 1.9% of the suspicious list), and useful (can often narrow down the error to the exact location within the suspicious expressions).
Assurance Case Development as Data: A Manifesto Menghi, Claudio; Viger, Torin; Di Sandro, Alessio ...
2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER),
05/2023
Conference Proceeding
Safety problems can be costly and catastrophic. Engineers typically rely on assurance cases to ensure their systems are adequately safe. Building safe software systems requires engineers to ...iteratively design, analyze and refine assurance cases until sufficient safety evidence is identified. The assurance case development is typically manual, time-consuming, and far from being straightforward. This paper presents a manifesto for our forward-looking idea: using assurance cases as data. We argue that engineers produce a lot of data during the assurance case development process, and such data can be collected and used to effectively improve this process. Therefore, in this manifesto, we propose to monitor the assurance case development activities, treat assurance cases as data, and learn suggestions that help safety engineers in designing safer systems.
Recent years have witnessed a wide array of results in software testing, exploring different approaches and methodologies ranging from fuzzers to symbolic engines, with a full spectrum of instances ...in between such as concolic execution and hybrid fuzzing. A key ingredient of many of these tools is Satisfiability Modulo Theories (SMT) solvers, which are used to reason over symbolic expressions collected during the analysis. In this paper, we investigate whether techniques borrowed from the fuzzing domain can be applied to check whether symbolic formulas are satisfiable in the context of concolic and hybrid fuzzing engines, providing a viable alternative to classic SMT solving techniques. We devise a new approximate solver, Fuzzy-Sat, and show that it is both competitive with and complementary to state-of-the-art solvers such as Z3 with respect to handling queries generated by hybrid fuzzers.
Semi-automated test-case propagation in fork ecosystems Mukelabai, Mukelabai; Berger, Thorsten; Borba, Paulo
2021 IEEE/ACM 43rd International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER),
05/2021
Conference Proceeding
Forking provides a flexible and low-cost strategy for developers to adapt an existing project to new requirements, for instance, when addressing different market segments, hardware constraints, or ...runtime environments. Then, small ecosystems of forked projects are formed, with each project in the ecosystem maintained by a separate team or organization. The software quality of projects in fork ecosystems varies with the resources available as well as team experience, and expertise, especially when the forked projects are maintained independently by teams that are unaware of the evolution of other's forks. Consequently, the quality of forked projects could be improved by reusing test cases as well as code, thereby leveraging community expertise and experience, and commonalities between the projects. We propose a novel technique for recommending and propagating test cases across forked projects. We motivate our idea with a pre-study we conducted to investigate the extent to which test cases are shared or can potentially be reused in a fork ecosystem. We also present the theoretical and practical implications underpinning the proposed idea, together with a research agenda.
The Go programming language offers a wide range of primitives to coordinate lightweight threads, e.g., channels, waitgroups, and mutexes --- all of which may cause concurrency bugs. Static checkers ...that guarantee the absence of bugs are essential to help programmers avoid these costly errors before their code is executed. However existing tools either miss too many bugs or cannot handle large programs. To address these limitations, we propose a static checker for Go programs which relies on performing bounded model checking of their concurrent behaviours. In contrast to previous works, our approach deals with large codebases, supports programs that have statically unknown parameters, and is extensible to additional concurrency primitives. Our work includes a detailed presentation of the extraction algorithm from Go programs to models, an algorithm to automatically check programs with statically unknown parameters, and a large scale evaluation of our approach. The latter shows that our approach outperforms the state-of-the-art.
Reasoning-Based Software Testing Giamattei, Luca; Pietrantuono, Roberto; Russo, Stefano
2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER),
05/2023
Conference Proceeding
With software systems becoming increasingly pervasive and autonomous, our ability to test for their quality is severely challenged. Many systems are called to operate in uncertain and highly-changing ...environment, not rarely required to make intelligent decisions by themselves. This easily results in an intractable state space to explore at testing time. The state-of-the-art techniques try to keep the pace, e.g., by augmenting the tester's intuition with some form of (explicit or implicit) learning from observations to search this space efficiently. For instance, they exploit historical data to drive the search (e.g., ML-driven testing) or the tests execution data itself (e.g., adaptive or search-based testing). Despite the indubitable advances, the need for smartening the search in such a huge space keeps to be pressing.
We introduce Reasoning-Based Software Testing (RBST), a new way of thinking at the testing problem as a causal reasoning task. Compared to mere intuition-based or state-of-the-art learning-based strategies, we claim that causal reasoning more naturally emulates the process that a human would do to "smartly" search the space. RBST aims to mimic and amplify, with the power of computation, this ability. The conceptual leap can pave the ground to a new trend of techniques, which can be variously instantiated from the proposed framework, by exploiting the numerous tools for causal discovery and inference. Preliminary results reported in this paper are promising.
Explaining the prediction results of software bug prediction models is a challenging task, which can provide useful information for developers to understand and fix the predicted bugs. Recently, ...Jirayus et al.'s proposed to use two model-agnostic techniques (i.e., LIME and iBreakDown) to explain the prediction results of bug prediction models. Although their experiments on file-level bug prediction show promising results, the performance of these techniques on explaining the results of just-in-time (i.e., change-level) bug prediction is unknown. This paper conducts the first empirical study to explore the explainability of these model-agnostic techniques on just-in-time bug prediction models. Specifically, this study takes a three-step approach, 1) replicating previously widely used just-in-time bug prediction models, 2) applying Local Interpretability Model-agnostic Explanation Technique (LIME) and iBreakDown on the prediction results, and 3) manually evaluating the explanations for buggy instances (i.e. positive predictions) against the root cause of the bugs. The results of our experiment show that LIME and iBreakDown fail to explain defect prediction explanations for just-in-time bug prediction models, unlike file-level. This paper urges for new approaches for explaining the results of just-in-time bug prediction models.
Many program verification tools can be customized via run-time configuration options that trade off performance, precision, and soundness. However, in practice, users often run tools under their ...default configurations, because understanding these tradeoffs requires significant expertise. In this paper, we ask how well a single, default configuration can work in general, and we propose SATUNE, a novel tool for automatically configuring program verification tools for given target programs. To answer our question, we gathered a dataset that runs four well-known program verification tools against a range of C and Java benchmarks, with results labeled as correct, incorrect, or inconclusive (e.g., timeout). Examining the dataset, we find there is generally no one-size-fits-all best configuration. Moreover, a statistical analysis shows that many individual configuration options do not have simple tradeoffs: they can be better or worse depending on the program.
Motivated by these results, we developed SATUNE, which constructs configurations using a meta-heuristic search. The search is guided by a surrogate fitness function trained on our dataset. We compare the performance of SATUNE to three baselines: a single configuration with the most correct results in our dataset; the most precise configuration followed by the most correct configuration (if needed); and the most precise configuration followed by random search (also if needed). We find that SATUNE outperforms these approaches by completing more correct tasks with high precision. In summary, our work shows that good configurations for verification tools are not simple to find, and SATUNE takes an important step towards automating the process of finding them.
Having the expected behavior of software specified in a formal language can greatly improve the automation of software verification activities, since these need to contrast the intended behavior with ...the actual software implementation. Unfortunately, software many times lacks such specifications, and thus providing tools and techniques that can assist developers in the construction of software specifications are relevant in software engineering. As an aid in this context, we present EvoSpex, a tool that given a Java method, automatically produces a specification of the method's current behavior, in the form of postcondition assertions. EvoSpex is based on generating software runs from the implementation (valid runs), making modifications to the runs to build divergent behaviors (invalid runs), and executing a genetic algorithm that tries to evolve a specification to satisfy the valid runs, and leave out the invalid ones. Our tool supports a rich JML-like assertion language, that can capture complex specifications, including sophisticated object structural properties.