Online judges are systems designed for the reliable evaluation of algorithm source code submitted by users, which is next compiled and tested in a homogeneous environment. Online judges are becoming ...popular in various applications. Thus, we would like to review the state of the art for these systems. We classify them according to their principal objectives into systems supporting organization of competitive programming contests, enhancing education and recruitment processes, facilitating the solving of data mining challenges, online compilers and development platforms integrated as components of other custom systems. Moreover, we introduce a formal definition of an online judge system and summarize the common evaluation methodology supported by such systems. Finally, we briefly discuss an Optil.io platform as an example of an online judge system, which has been proposed for the solving of complex optimization problems. We also analyze the competition results conducted using this platform. The competition proved that online judge systems, strengthened by crowdsourcing concepts, can be successfully applied to accurately and efficiently solve complex industrial- and science-driven challenges.
Full text
Available for:
IZUM, KILJ, NUK, PILJ, SAZU, UL, UM, UPUK
Compilers should not crash and they should not miscompile applications. Random testing is an effective method for finding compiler bugs that have escaped other kinds of testing. This paper presents ...Yet Another Random Program Generator (YARPGen), a random test-case generator for C and C++ that we used to find and report more than 220 bugs in GCC, LLVM, and the Intel® C++ Compiler. Our research contributions include a method for generating expressive programs that avoid undefined behavior without using dynamic checks, and generation policies, a mechanism for increasing diversity of generated code and for triggering more optimizations. Generation policies decrease the testing time to find hard-to-trigger compiler bugs and, for the kinds of scalar optimizations YARPGen was designed to stress-test, increase the number of times these optimizations are applied by the compiler by an average of 20% for LLVM and 40% for GCC. We also created tools for automating most of the common tasks related to compiler fuzzing; these tools are also useful for fuzzers other than ours.
Optimization-Aware Compiler-Level Event Profiling Basso, Matteo; Prokopec, Aleksandar; Rosà, Andrea ...
ACM transactions on programming languages and systems,
06/2023, Volume:
45, Issue:
2
Journal Article
Peer reviewed
Open access
Tracking specific events in a program’s execution, such as object allocation or lock acquisition, is at the heart of dynamic analysis. Despite the apparent simplicity of this task, quantifying these ...events is challenging due to the presence of compiler optimizations. Profiling perturbs the optimizations that the compiler would normally do—a profiled program usually behaves differently than the original one.In this article, we propose a novel technique for quantifying compiler-internal events in the optimized code, reducing the profiling perturbation on compiler optimizations. Our technique achieves this by instrumenting the program from within the compiler, and by delaying the instrumentation until the point in the compilation pipeline after which no subsequent optimizations can remove the events. We propose two different implementation strategies of our technique based on path-profiling, and a modification to the standard path-profiling algorithm that facilitates the use of the proposed strategies in a modern just-in-time (JIT) compiler. We use our technique to analyze the behaviour of the optimizations in Graal, a state-of-the-art compiler for the Java Virtual Machine, identifying the reasons behind a performance improvement of a specific optimization, and the causes behind an unexpected slowdown of another. Finally, our evaluation results show that the two proposed implementations result in a significantly lower execution-time overhead w.r.t. a naive implementation.
Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization is a popular ...approach to adapting programs to a new architecture automatically using feedback-directed compilation. However, the large number of evaluations required for each program has prevented iterative compilation from widespread take-up in production compilers. Machine learning has been proposed to tune optimizations across programs systematically but is currently limited to a few transformations, long training phases and critically lacks publicly released, stable tools. Our approach is to develop a modular, extensible, self-tuning optimization infrastructure to automatically learn the best optimizations across multiple programs and architectures based on the correlation between program features, run-time behavior and optimizations. In this paper we describe Milepost GCC, the first publicly-available open-source machine learning-based compiler. It consists of an Interactive Compilation Interface (ICI) and plugins to extract program features and exchange optimization data with the cTuning.org open public repository. It automatically adapts the internal optimization heuristic at function-level granularity to improve execution time, code size and compilation time of a new program on a given architecture. Part of the MILEPOST technology together with low-level ICI-inspired plugin framework is now included in the mainline GCC. We developed machine learning plugins based on probabilistic and transductive approaches to predict good combinations of optimizations. Our preliminary experimental results show that it is possible to automatically reduce the execution time of individual MiBench programs, some by more than a factor of 2, while also improving compilation time and code size. On average we are able to reduce the execution time of the MiBench benchmark suite by 11% for the ARC reconfigurable processor. We also present a realistic multi-objective optimization scenario for Berkeley DB library using Milepost GCC and improve execution time by approximately 17%, while reducing compilation time and code size by 12% and 7% respectively on Intel Xeon processor.
Full text
Available for:
CEKLJ, DOBA, EMUNI, FIS, FZAB, GEOZS, GIS, IJS, IMTLJ, IZUM, KILJ, KISLJ, MFDPS, NLZOH, NUK, OILJ, PILJ, PNG, SAZU, SBCE, SBJE, SBMB, SBNM, SIK, UILJ, UKNU, UL, UM, UPUK, VKSCE, ZAGLJ
In this on going research paper, we present our work on the compiler support for an AI-oriented SIMD Extension, called SPARROW. The SPARROW hardware design has been developed during a recently ...defended, awardwinning Master Thesis and is targeting Cobham Gaisler's space processors Leon3 and NOEL-V. We present the compiler support we have included in two compiler toolchains, gcc and llvm as well as a SIMD intrinsics library for easy programmability. Compiler modifications are kept to minimum in order to enable incremental qualification of the toolchains. We present our experience working with the two compilers and performance results for the two compilers on top an FPGA implementation of the target space processor.
Many programmers live in happy ignorance of their compilers’ internal workings. Others may want to take a look at what is going on inside a compiler in much the same way that they use a debugger to ...watch their compiled programs execute. While conventional compilers are black boxes whose internals are hidden from the user, the Jaccie tool set helps to open up the box and have a look at what is going on inside.
Technically speaking, Jaccie is a compiler–compiler that, from suitable formal descriptions, generates the scanner, parser, and attribute evaluator components of a compiler and presents them in a visual debugging environment. It offers a number of alternative parser generators producing both top-down (LL) and bottom-up parsers of the LR variety, including SLR(1) and LALR(1) parsers, thus allowing users to experiment with different parsing strategies and to get a “feel” for their relative pros and cons. When designing Jaccie, the main emphasis was on two ergonomic goals that we considered important for educational software. Firstly, give user control over the program and not vice versa, e.g., our parsers (and other components) can be directed to go step-by-step forwards or backwards or to leap to some point in the input indicated by the users. Secondly, overcome the sometimes severe size limitations of computer displays by offering the same information in multiple representations that complement each other and by dividing information in smaller chunks that can be traversed in a meaningful way.
In this paper, after outlining the architecture of Jaccie, we discuss some of its technical and ergonomic aspects in detail, give a brief introduction into the use of Jaccie and its documentation, show an example application done with Jaccie, and finally discuss related work and future plans.
► JACCIE is an educational tool for visualizing compiling techniques. ► It generates the scanner, parser and attribute evaluator components of a compiler. ► It can produce both top-down (LL) and bottom-up parsers (LR, SLR, and LALR). ► In the design of JACCIE, the main emphasis was on ergonomic goals.
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
Loop tiling and fusion are two essential transformations in optimizing compilers to enhance the data locality of programs. Existing heuristics either perform loop tiling and fusion in a particular ...order, missing some of their profitable compositions, or execute ad-hoc implementations for domain-specific applications, calling for a generalized and systematic solution in optimizing compilers. In this article, we present a so-called basteln (an abbreviation for backward slicing of tiled loop nests) strategy in polyhedral compilation to better model the interplay between loop tiling and fusion. The basteln strategy first groups loop nests by preserving their parallelism/tilability and next performs rectangular/parallelogram tiling to the output groups that produce data consumed outside the considered program fragment. The memory footprints required by each tile are then computed, from which the upward exposed data are extracted to determine the tile shapes of the remaining fusion groups. Such a tiling mechanism can construct complex tile shapes imposed by the dependences between these groups, which are further merged by a post-tiling fusion algorithm for enhancing data locality without losing the parallelism/tilability of the output groups. The basteln strategy also takes into account the amount of redundant computations and the fusion of independent groups, exhibiting a general applicability. We integrate the basteln strategy into two optimizing compilers, with one a general-purpose optimizer and the other a domain-specific compiler for deploying deep learning models. The experiments are conducted on CPU, GPU, and a deep learning accelerator to demonstrate the effectiveness of the approach for a wide class of application domains, including deep learning, image processing, sparse matrix computation, and linear algebra. In particular, the basteln strategy achieves a mean speedup of 1.8× over cuBLAS/cuDNN and 1.1× over TVM on GPU when used to optimize deep learning models; it also outperforms PPCG and TVM by 11% and 20%, respectively, when generating code for the deep learning accelerator.
The convolutional neural network (CNN) has become a state-of-the-art method for several artificial intelligence domains in recent years. The increasingly complex CNN models are both computation-bound ...and I/O-bound. Field-programmable gate array-based accelerators driven by custom instruction set architecture (ISA) achieve a balance between generality and efficiency, but there is much on them left to be optimized. We propose the full-stack compiler deep neural network virtual machine (DNNVM), which is an integration of optimizers for graphs, loops and data layouts, an assembler, a runtime supporter, and a validation environment. The DNNVM works in the context of deep learning frameworks and transforms CNN models into the directed acyclic graph: XGraph. Based on XGraph, we transform the optimization challenges for both data layout and pipeline into graph-level problems. DNNVM enumerates all potentially profitable fusion opportunities by a heuristic subgraph isomorphism algorithm to leverage pipeline and data layout optimizations, and searches for the best choice of execution strategies of the whole computing graph. On the Xilinx ZU2@330 MHz and ZU9@330 MHz, we achieve equivalently state-of-the-art performance on our benchmarks by naïve implementations without optimizations, and the throughput is further improved up to <inline-formula> <tex-math notation="LaTeX">1.26\times </tex-math></inline-formula> by leveraging heterogeneous optimizations in DNNVM. Finally, with ZU9@330 MHz, we achieve state-of-the-art performance for VGG and ResNet50. We achieve a throughput of 2.82 TOPs/s and an energy efficiency of 123.7 GOPs/s/W for VGG. Additionally, we achieve 1.38 TOPs/s for ResNet50 and 1.41 TOPs/s for GoogleNet.
Finding a good compiler autotuning methodology, particularly for selecting the right set of optimisations and finding the best ordering of these optimisations for a given code fragment has been a ...long-standing problem. As the rapid development of machine learning techniques, tackling the problem of compiler autotuning using machine learning or deep learning has become increasingly common in recent years. There have been many deep learning models proposed to solve problems, such as predicting the optimal heterogeneous mapping or thread-coarsening factor; however, very few have revisited the problem of optimisation phase tuning. In this paper, we intend to revisit and tackle the problem using deep learning techniques. Unfortunately, the problem is too complex to be addressed in its full scope. We present a new problem, called reduced O3 subsequence labelling - a reduced O3 subsequence is defined as a subsequence of O3 optimisation passes that contains no useless passes, which is a simplified problem and expected to be a stepping stone towards solving the optimisation phase tuning problem. We formulated the problem and attempted to solve the problem by using genetic algorithms. We believe that with mature deep learning techniques, a machine learning model that predicts the reduced O3 subsequences or even the best O3 subsequence for a given code fragment could be developed and used to prune the search space of the problem of the optimisation phase tuning, thereby shortening the tuning process and also providing more effective tuning results.
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
Compiler diagnostic warnings help developers identify potential programming mistakes during program compilation. However, these warnings could be erroneous due to the defects of compiler warning ...diagnostics. Although the existing technique (i.e., Epiphron) can automatically generate test programs for compiler warning defect detection, the effectiveness of Epiphron on defect-finding is still limited, due to the limitation for generating warning-sensitive test program structures. Therefore, in this paper, we propose a DIversity-guided PROgram Mutation approach, called DIPROM, to construct diverse warning-sensitive programs for effective compiler warning defect detection. Given a seed test program, DIPROM first removes its dead code to reduce false positive warning defects. Then, the abstract syntax tree (AST) of the test program is constructed; DIPROM iteratively mutates the structures of the AST to generate warning-sensitive program variants. To effectively construct diverse warning-sensitive structures, DIPROM applies a novel diversity-guided strategy to generate program variants in each iteration. With the generated program variants, differential testing is conducted to detect warning defects in different compilers. In the experiments, we evaluate DIPROM with two popular C compilers (i.e., GCC and Clang). Experimental results show that DIPROM significantly outperforms three state-of-the-art approaches (i.e., HiCOND, Epiphron, and Hermes) by up to 18.93%<inline-formula><tex-math notation="LaTeX">\sim</tex-math> <mml:math><mml:mo>∼</mml:mo></mml:math><inline-graphic xlink:href="jiang-ieq1-3119186.gif"/> </inline-formula>76.74% in terms of the bug-finding capability on average. Meanwhile, DIPROM is efficient, which spends less time on finding the same average number of warning defects. We at last applied DIPROM to the latest development versions of GCC and Clang. After two months' running, we reported 8 new warning defects; 5 of them have been confirmed/fixed by developers.