Heap-based vulnerabilities such as buffer overflow and use after free are severe flaws in various software systems. Detecting heap-based vulnerabilities and demonstrating their severity via ...generating exploits for them are of critical importance. Existing symbolic execution-based approaches have shown their potential in the above tasks. However, they still have some fundamental limitations in path exploration, memory modeling, and environment modeling, which significantly impede existing symbolic execution engines from efficiently and effectively detecting and exploiting heap-based vulnerabilities. The objective of this thesis is to design and implement a boosted symbolic execution engine named HeapX to facilitate the automatic detection and exploitation of heap-based vulnerabilities. Specifically, a new path exploration strategy, a new memory model, and a new environment modeling solution are expected to be designed in HeapX, so that the new boosted symbolic execution engine can detect heap-based vulnerabilities and generate working exploits for them more efficiently and effectively.
Case Study Research in Software Engineering Runeson, Per; Host, Martin; Rainer, Austen ...
ELLIIT: the Linköping-Lund initiative on IT and mobile communication,
2012, 2012., 2012-04-10T00:00:00, 2012-03-07, 2012-02-14
eBook, Book
Based on their own experiences of in-depth case studies of software projects in international corporations, in this bookthe authors present detailed practical guidelines on the preparation, conduct, ...design and reporting of case studies of software engineering. This is the first software engineering specific book on thecase study research method.
DeepLocalize Wardat, Mohammad; Le, Wei; Rajan, Hridesh
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
Odprti dostop
Deep neural networks (DNNs) are becoming an integral part of most software systems. Previous work has shown that DNNs have bugs. Unfortunately, existing debugging techniques don't support localizing ...DNN bugs because of the lack of understanding of model behaviors. The entire DNN model appears as a black box. To address these problems, we propose an approach and a tool that automatically determines whether the model is buggy or not, and identifies the root causes for DNN errors. Our key insight is that historic trends in values propagated between layers can be analyzed to identify faults, and also localize faults. To that end, we first enable dynamic analysis of deep learning applications: by converting it into an imperative representation and alternatively using a callback mechanism. Both mechanisms allows us to insert probes that enable dynamic analysis over the traces produced by the DNN while it is being trained on the training data. We then conduct dynamic analysis over the traces to identify the faulty layer or hyperparameter that causes the error. We propose an algorithm for identifying root causes by capturing any numerical error and monitoring the model during training and finding the relevance of every layer/parameter on the DNN outcome. We have collected a benchmark containing 40 buggy models and patches that contain real errors in deep learning applications from Stack Overflow and GitHub. Our benchmark can be used to evaluate automated debugging tools and repair techniques. We have evaluated our approach using this DNN bug-and-patch benchmark, and the results showed that our approach is much more effective than the existing debugging approach used in the state-of-the-practice Keras library. For 34/40 cases, our approach was able to detect faults whereas the best debugging approach provided by Keras detected 32/40 faults. Our approach was able to localize 21/40 bugs whereas Keras did not localize any faults.
Automatic Solution Summarization for Crash Bugs Wang, Haoye; Xia, Xin; Lo, David ...
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
Odprti dostop
The causes of software crashes can be hidden anywhere in the source code and development environment. When encountering software crashes, recurring bugs that are discussed on Q&A sites could provide ...developers with solutions to their crashing problems. However, it is difficult for developers to accurately search for relevant content on search engines, and developers have to spend a lot of manual effort to find the right solution from the returned results. In this paper, we present CraSolver, an approach that takes into account both the structural information of crash traces and the knowledge of crash-causing bugs to automatically summarize solutions from crash traces. Given a crash trace, CraSolver retrieves relevant questions from Q&A sites by combining a proposed position dependent similarity - based on the structural information of the crash trace - with an extra knowledge similarity, based on the knowledge from official documentation sites. After obtaining the answers to these questions from the Q&A site, CraSolver summarizes the final solution based on a multi-factor scoring mechanism. To evaluate our approach, we built two repositories of Java and Android exception-related questions from Stack Overflow with size of 69,478 and 33,566 questions respectively. Our user study results using 50 selected Java crash traces and 50 selected Android crash traces show that our approach significantly outperforms four baselines in terms of relevance, usefulness, and diversity. The evaluation also confirms the effectiveness of the relevant question retrieval component in our approach for crash traces.
Testing cyber-physical system (CPS) development tools such as MathWorks' Simulink is very important as they are widely used in design, simulation, and verification of CPS data-flow models. Existing ...randomized differential testing frameworks such as SLforge leverages semi-formal Simulink specifications to guide random model generation which requires significant research and engineering investment along with the need to manually update the tool, whenever MathWorks updates model validity rules. To address the limitations, we propose to learn validity rules automatically by learning a language model using our framework DeepFuzzSL from existing corpus of Simulink models. In our experiments, DeepFuzzSL consistently generate over 90% valid Simulink models and also found 2 confirmed bugs by MathWorks Support.
Forking structure is widespread in the open-source repositories and that causes a significant number of merge conflicts. In this paper, we study the problem of textual merge conflicts from the ...perspective of Microsoft Edge, a large, highly collaborative fork of the main Chromium branch with significant merge conflicts. Broadly, this study is divided into two sections. First, we empirically evaluate textual merge conflicts in Microsoft Edge and classify them based on the type of files, location of conflicts in a file, and the size of conflicts. We found that ~28% of the merge conflicts are 1-2 line changes, and many resolutions have frequent patterns. Second, driven by these findings, we explore Program Synthesis (for the first time) to learn patterns and resolve structural merge conflicts. We propose a novel domain-specific language (DSL) that captures many of the repetitive merge conflict resolution patterns and learn resolution strategies as programs in this DSL from example resolutions. We found that the learned strategies can resolve 11.4% of the conflicts (~41% of 1-2 line changes) that arise in the C++ files with 93.2% accuracy.
White-Box Analysis over Machine Learning Velez, Miguel; Jamshidi, Pooyan; Siegmund, Norbert ...
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
Odprti dostop
Performance-influence models can help stakeholders understand how and where configuration options and their interactions influence the performance of a system. With this understanding, stakeholders ...can debug performance behavior and make deliberate configuration decisions. Current black-box techniques to build such models combine various sampling and learning strategies, resulting in tradeoffs between measurement effort, accuracy, and interpretability. We present Comprex, a white-box approach to build performance-influence models for configurable systems, combining insights of local measurements, dynamic taint analysis to track options in the implementation, compositionality, and compression of the configuration space, without relying on machine learning to extrapolate incomplete samples. Our evaluation on 4 widely-used, open-source projects demonstrates that Comprex builds similarly accurate performance-influence models to the most accurate and expensive black-box approach, but at a reduced cost and with additional benefits from interpretable and local models.
Path Complexity of Recursive Functions Pregerson, Eli
2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
05/2023
Conference Proceeding
Path coverage is of critical importance in software testing and verification, and further, path explosion is a well-known challenge for automatic software analysis techniques like symbolic execution ...7. Asymptotic Path Complexity (APC), a code complexity metric developed in my research lab, formalizes the quantitative measurement of path explosion.
Fault localization is a practical research topic that helps developers identify code locations that might cause bugs in a program. Most existing fault localization techniques are designed for ...imperative programs (e.g., C and Java) and rely on analyzing correct and incorrect executions of the program to identify suspicious statements. In this work, we introduce a fault localization approach for models written in a declarative language, where the models are not "executed," but rather converted into a logical formula and solved using backend constraint solvers. We present FLACK, a tool that takes as input an Alloy model consisting of some violated assertion and returns a ranked list of suspicious expressions contributing to the assertion violation. The key idea is to analyze the differences between counterexamples, i.e., instances of the model that do not satisfy the assertion, and instances that do satisfy the assertion to find suspicious expressions in the input model. The experimental results show that FLACK is efficient (can handle complex, real-world Alloy models with thousand lines of code within 5 seconds), accurate (can consistently rank buggy expressions in the top 1.9% of the suspicious list), and useful (can often narrow down the error to the exact location within the suspicious expressions).
A comprehensive study of autonomous vehicle bugs Garcia, Joshua; Feng, Yang; Shen, Junjie ...
2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE),
06/2020
Conference Proceeding
Odprti dostop
Self-driving cars, or Autonomous Vehicles (AVs), are increasingly becoming an integral part of our daily life. About 50 corporations are actively working on AVs, including large companies such as ...Google, Ford, and Intel. Some AVs are already operating on public roads, with at least one unfortunate fatality recently on record. As a result, understanding bugs in AVs is critical for ensuring their security, safety, robustness, and correctness. While previous studies have focused on a variety of domains (e.g., numerical software; machine learning; and error-handling, concurrency, and performance bugs) to investigate bug characteristics, AVs have not been studied in a similar manner. Recently, two software systems for AVs, Baidu Apollo and Autoware, have emerged as frontrunners in the open-source community and have been used by large companies and governments (e.g., Lincoln, Volvo, Ford, Intel, Hitachi, LG, and the US Department of Transportation). From these two leading AV software systems, this paper describes our investigation of 16,851 commits and 499 AV bugs and introduces our classification of those bugs into 13 root causes, 20 bug symptoms, and 18 categories of software components those bugs often affect. We identify 16 major findings from our study and draw broader lessons from them to guide the research community towards future directions in software bug detection, localization, and repair.