Interface Compliance of Inline Assembly Recoules, Frédéric; Bardin, Sébastien; Bonichon, Richard ...
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
Odprti dostop
Inline assembly is still a common practice in low-level C programming, typically for efficiency reasons or for accessing specific hardware resources. Such embedded assembly codes in the GNU syntax ...(supported by major compilers such as GCC, Clang and ICC) have an interface specifying how the assembly codes interact with the C environment. For simplicity reasons, the compiler treats GNU inline assembly codes as blackboxes and relies only on their interface to correctly glue them into the compiled C code. Therefore, the adequacy between the assembly chunk and its interface (named compliance) is of primary importance, as such compliance issues can lead to subtle and hard-to-find bugs. We propose RUSTInA, the first automated technique for formally checking inline assembly compliance, with the extra ability to propose (proven) patches and (optimization) refinements in certain cases. RUSTInA is based on an original formalization of the inline assembly compliance problem together with novel dedicated algorithms. Our prototype has been evaluated on 202 Debian packages with inline assembly (2656 chunks), finding 2183 issues in 85 packages - 986 significant issues in 54 packages (including major projects such as ffmpeg or ALSA), and proposing patches for 92% of them. Currently, 38 patches have already been accepted (solving 156 significant issues), with positive feedback from development teams.
DaMAT: A Data-Driven Mutation Analysis Tool Viganò, Enrico; Cornejo, Oscar; Pastore, Fabrizio ...
2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
05/2023
Conference Proceeding
Odprti dostop
We present DaMAT, a tool that implements data-driven mutation analysis. In contrast to traditional code-driven mutation analysis tools it mutates (i.e., modifies) the data exchanged by components ...instead of the source of the software under test. Such an approach helps ensure that test suites appropriately exercise components interoperability --- essential for safety-critical cyber-physical systems. A user-provided fault model drives the mutation process. We have successfully evaluated DaMAT on software controlling a microsatellite and a set of libraries used in deployed CubeSats. A demo video of DaMAT is available at https://youtu.be/s5M52xWCj84
Randomized Differential Testing of RDF Stores Yang, Rui; Zheng, Yingying; Tang, Lei ...
2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
05/2023
Conference Proceeding
As a special kind of graph database systems, RDF stores have been widely used in many applications, e.g., knowledge graphs and semantic web. RDF stores utilize SPARQL as their standardized query ...language to store and retrieve RDF graphs. Incorrect implementations of RDF stores can introduce logic bugs that cause RDF stores to return incorrect query results. These logic bugs can lead to severe consequences and are likely to go unnoticed by developers. However, no available tools can detect logic bugs in RDF stores.
In this paper, we propose RD2, a Randomized Differential testing approach of RDF stores, to reveal discrepancies among RDF stores, which indicate potential logic bugs in RDF stores. The core idea of RD2 is to build an equivalent RDF graph for multiple RDF stores, and verify whether they can return the same query result for a given SPARQL query. Guided by the SPARQL syntax and the generated RDF graph, we automatically generate syntactically valid SPARQL queries, which can return non-empty query results with high probability. We further unify the formats of SPARQL query results from different RDF stores and find discrepancies among them. We evaluate RD2 on three popular and widely-used RDF stores. In total, we have detected 5 logic bugs in them. A video demonstration of RD2 is available at https://youtu.be/da7XlsdbRR4.
ActionsRemaker: Reproducing GitHub Actions Zhu, Hao-Nan; Guan, Kevin Z.; Furth, Robert M. ...
2023 IEEE/ACM 45th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
05/2023
Conference Proceeding
Mining Continuous Integration and Continuous Delivery (CI/CD) has enabled new research opportunities for the software engineering (SE) research community. However, it remains a challenge to reproduce ...CI/CD build processes, which is crucial for several areas of research within SE such as fault localization and repair. In this paper, we present ActionsRemaker, a reproducer for GitHub Actions builds. We describe the challenges on reproducing GitHub Actions builds and the design of ActionsRemaker. Evaluation of ActionsRemaker demonstrates its ability to reproduce fail-pass pairs: of 180 pairs from 67 repositories, 130 (72.2%) from 43 repositories are reproducible. We also discuss reasons for unreproducibility. ActionsRemaker is publicly available at https://github.com/bugswarm/actions-remaker, and a demo of the tool can be found at https://youtu.be/flblSqoxeAk.
AutoTrainer Zhang, Xiaoyu; Zhai, Juan; Ma, Shiqing ...
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
With machine learning models especially Deep Neural Network (DNN) models becoming an integral part of the new intelligent software, new tools to support their engineering process are in high demand. ...Existing DNN debugging tools are either post-training which wastes a lot of time training a buggy model and requires expertises, or limited on collecting training logs without analyzing the problem not even fixing them. In this paper, we propose AutoTrainer, a DNN training monitoring and automatic repairing tool which supports detecting and auto-repairing five commonly seen training problems. During training, it periodically checks the training status and detects potential problems. Once a problem is found, AutoTrainer tries to fix it by using built-in state-of-the-art solutions. It supports various model structures and input data types, such as Convolutional Neural Networks (CNNs) for image and Recurrent Neural Networks (RNNs) for texts. Our evaluation on 6 datasets, 495 models show that AutoTrainer can effectively detect all potential problems with 100% detection rate and no false positives. Among all models with problems, it can fix 97.33% of them, increasing the accuracy by 47.08% on average.
The influence of organizational structure on software quality Nagappan, Nachiappan; Murphy, Brendan; Basili, Victor
International Conference on Software Engineering 2008,
01/2008, Letnik:
2008, Številka:
24
Conference Proceeding, Journal Article
Often software systems are developed by organizations consisting of many teams of individuals working together. Brooks states in the Mythical Man Month book that product quality is strongly affected ...by organization structure. Unfortunately there has been little empirical evidence to date to substantiate this assertion. In this paper we present a metric scheme to quantify organizational complexity, in relation to the product development process to identify if the metrics impact failure-proneness. In our case study, the organizational metrics when applied to data from Windows Vista were statistically significant predictors of failure-proneness. The precision and recall measures for identifying failure-prone binaries, using the organizational metrics, was significantly higher than using traditional metrics like churn, complexity, coverage, dependencies, and pre-release bug measures that have been used to date to predict failure-proneness. Our results provide empirical evidence that the organizational metrics are related to, and are effective predictors of failure-proneness.
DepOwl Jia, Zhouyang; Li, Shanshan; Yu, Tingting ...
2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE),
05/2021
Conference Proceeding
Odprti dostop
Applications depend on libraries to avoid reinventing the wheel. Libraries may have incompatible changes during evolving. As a result, applications will suffer from compatibility failures. There has ...been much research on addressing detecting incompatible changes in libraries, or helping applications co-evolve with the libraries. The existing solution helps the latest application version work well against the latest library version as an afterthought. However, end users have already been suffering from the failures and have to wait for new versions. In this paper, we propose DepOwl, a practical tool helping users prevent compatibility failures. The key idea is to avoid using incompatible versions from the very beginning. We evaluated DepOwl on 38 known compatibility failures from StackOverflow, and DepOwl can prevent 35 of them. We also evaluated DepOwl using the software repository shipped with Ubuntu-19.10. DepOwl detected 77 unknown dependency bugs, which may lead to compatibility failures.
SMARTIAN Choi, Jaeseung; Kim, Doyeon; Kim, Soomin ...
2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE),
11/2021
Conference Proceeding
Unlike traditional software, smart contracts have the unique organization in which a sequence of transactions shares persistent states. Unfortunately, such a characteristic makes it difficult for ...existing fuzzers to find out critical transaction sequences. To tackle this challenge, we employ both static and dynamic analyses for fuzzing smart contracts. First, we statically analyze smart contract bytecodes to predict which transaction sequences will lead to effective testing, and figure out if there is a certain constraint that each transaction should satisfy. Such information is then passed to the fuzzing phase and used to construct an initial seed corpus. During a fuzzing campaign, we perform a lightweight dynamic data-flow analysis to collect data-flow-based feedback to effectively guide fuzzing. We implement our ideas on a practical open-source fuzzer, named Smartian. Smartian can discover bugs in real-world smart contracts without the need for the source code. Our experimental results show that Smartian is more effective than existing state-of-the-art tools in finding known CVEs from real-world contracts. Smartian also outperforms other tools in terms of code coverage.
Due to the difficulty of repairing defect, many research efforts have been devoted into automatic defect repair. Given a buggy program that fails some test cases, a typical automatic repair technique ...tries to modify the program to make all tests pass. However, since the test suites in real world projects are usually insufficient, aiming at passing the test suites often leads to incorrect patches. This problem is known as weak test suites or overfitting. In this paper we aim to produce precise patches, that is, any patch we produce has a relatively high probability to be correct. More concretely, we focus on condition synthesis, which was shown to be able to repair more than half of the defects in existing approaches. Our key insight is threefold. First, it is important to know what variables in a local context should be used in an "if" condition, and we propose a sorting method based on the dependency relations between variables. Second, we observe that the API document can be used to guide the repair process, and propose document analysis technique to further filter the variables. Third, it is important to know what predicates should be performed on the set of variables, and we propose to mine a set of frequently used predicates in similar contexts from existing projects. Based on the insight, we develop a novel program repair system, ACS, that could generate precise conditions at faulty locations. Furthermore, given the generated conditions are very precise, we can perform a repair operation that is previously deemed to be too overfitting: directly returning the test oracle to repair the defect. Using our approach, we successfully repaired 18 defects on four projects of Defects4J, which is the largest number of fully automatically repaired defects reported on the dataset so far. More importantly, the precision of our approach in the evaluation is 78.3%, which is significantly higher than previous approaches, which are usually less than 40%.