Research in the domain of quantum computation is mainly driven by their promising applications e.g., for factorization or database search. At the same time, physical developments for this emerging ...technology constantly lead to new constraints to be addressed by logic designers. The limited interaction distance between qubits, the elementary information storage in quantum circuits, is one of the most common restrictions, leading to the fact that, for many quantum architectures, computations can only be performed on adjacent (i.e., nearest neighbor) qubits. Motivated by that, optimization of quantum circuits with respect to this restriction has become an intensely considered research topic. In this paper, we briefly review existing approaches that have been proposed in the past for this purpose. We particularly consider that almost all existing solutions are of heuristic nature, i.e., do not guarantee an optimal solution. In order to address this, exact alternatives are introduced which make use of the deductive power of constraint solvers. By this, we are able to perform a qualitative evaluation of the performance of existing (heuristic) solutions for linear nearest neighbor quantum circuit optimization.
Quantum systems provide a new way of conducting computations based on the so-called qubits. Due to the potential for significant speed-ups, this field received significant research attention in ...recent years. The Clifford+T library is a very promising and popular gate library for these kinds of computations. Unlike other libraries considered so far, it consists of only a small number of gates for all of which robust, fault-tolerant realizations are known for many technologies that seem to be promising for large-scale quantum computing. As a consequence, (logic) synthesis of Clifford+T quantum circuits became an important research problem. However, previous work in this area has several drawbacks: Corresponding approaches are either only applicable to very small quantum systems or lead to circuits that are far from being optimal. The latter is mainly caused by the fact that current synthesis realizes the desired circuit by a local, i.e., column-wise, consideration of the underlying unitary transformation matrix to be synthesized. In this paper, we analyze the conceptual drawbacks of this approach and propose to overcome them by taking a global view of the matrices and perform a separation of concerns regarding individual synthesis steps. We precisely describe a corresponding algorithm as well as its efficient implementation on top of decision diagrams. Experimental results confirm the resulting benefits and show improvements of up to several orders of magnitudes in costs compared to previous work.
Due to the ever increasing complexity of hardware systems, designers strive for higher levels of abstractions in the early stages of the design process. Modeling hardware at the electronic system ...level (ESL) is one way to address this demand, with the C++-based system modeling framework SystemC and its abstract communication library transaction level modeling (TLM) having become de-facto standards for ESL system design. While the C++ compiler is sufficient to compile and simulate a given ESL design, for tasks of design understanding, debugging, or validation (where access to the details of design's structure and behavior is necessarily required), design needs to be processed by an appropriate tool. This problem is often solved by adding instrumentation code to either the design or the library, usually resulting in incomplete logs, work overhead and/or incompatibilities. This paper introduces an approach that automatically extracts information about both, structure and behavior of SystemC designs and TLM transactions, nonintrusively. The information is retrieved from a given design by running it in debug mode while being connected to a preprogrammed debugger, thus leaving the existing sources and workflows untouched while collecting a vast amount of data without user intervention. Illustrating use cases, value change dump files of the SystemC models' behavior and unified modeling language activity diagrams of transaction protocols are created automatically from simulation runs.
While a couple of impressive quantum technologies have been proposed, they have several intrinsic limitations which must be considered by circuit designers to produce realizable circuits. Limited ...interaction distance between gate qubits is one of the most common limitations. In this paper, we suggest extensions of the existing synthesis flow aimed to realize circuits for quantum architectures with linear nearest neighbor interaction. To this end, a template matching optimization, an exact synthesis approach, and two reordering strategies are introduced. The proposed methods are combined as an integrated synthesis flow. Experiments show that by using the suggested flow, quantum cost can be improved by more than 50% on average.
Reversible logic is the basis for several emerging technologies such as quantum computing, optical computing, or DNA computing and has further applications in domains like low-power design and ...nanotechnologies. However, current methods for the synthesis of reversible logic are limited, i.e. they are applicable to relatively small functions only. In this paper, we propose a synthesis approach, that can cope with Boolean functions containing more than a hundred of variables. We present a technique to derive reversible circuits for a function given by a Binary Decision Diagram (BDD). The circuit is obtained using an algorithm with linear worst case behavior regarding run-time and space requirements. Furthermore, the size of the resulting circuit is bounded by the BDD size. This allows to transfer theoretical results known from BDDs to reversible circuits. Experiments show better results (with respect to the circuit cost) and a significantly better scalability in comparison to previous synthesis approaches.
Recently, there has been a growing concern regarding the dependability of NAND flash cells, notably as the scale of their features reduces. To address this issue, implementing error correction codes ...(ECC) proves to be an effective solution. Among the various methods, BCH coding has gained significant interest because of its exceptional error correction capabilities. Over the last decades, there has been much research on BCH decoder design to meet the demand for reduced hardware complexities, minimized delay performance, and lower power dissipation to enable BCH decoders and their VLSI implementations to facilitate different code lengths and rates of code. This paper surveys the trends and challenges associated with BCH decoder in NAND flash memory devices, the possible solutions for overcoming of time and area overhead in architecture of BCH decoder block and an examination of the extent to which present architectures will respond to the escalating requirements on data transfer rate, bit error rate (BER) performance, power consumption, and silicon area that will be essential for the extensive acceptance of BCH code in applications that will emerge in the near future. To demonstrate the need for such solutions, we present rigorous experimental data on BCH error correction codes on various types of flash memory errors, to motivate the need for such techniques. Based on the understanding developed by the experimental characterization, we describe several area-delay efficient techniques, including three low-latency decoding strategies for implementing the BCH decoder: pipeline method, re-encoding scheme, and parallelization method, and various hardware optimization strategies for the BCH decoder, such as three area-efficient syndrome block architectures, four error locator polynomial detection algorithms, and four error position identification algorithms using the Chien search method. We investigate the increase in reliability that each of these methods brings. We also briefly address future directions that these methods and flash memory techniques could evolve into the future.
The complexity of hardware designs is still increasing according to Moore's law. With embedded systems being more and more intertwined and working together not only with each other, but also with ...their environments as cyber physical systems (CPSs), more streamlined development workflows are employed to handle the increasing complexity during a system's design phase. SystemC is a C++ library for the design of hardware/software systems, enabling the designer to quickly prototype, e.g., a distributed CPS without having to decide about particular implementation details (such as whether to implement a feature in hardware or in software) early in the design process. Thereby, this approach reduces the initial implementation's complexity by offering an abstract layer with which to build a working prototype. However, as SystemC is based on C++, analyzing designs becomes a difficult task due to the complex language features that are available to the designer. Several fundamentally different approaches for analyzing SystemC designs have been suggested. This work illustrates several different SystemC analysis approaches, including their specific advantages and shortcomings, allowing designers to pick the right tools to assist them with a specific problem during the design of a system using SystemC.
In-memory computing (IMC) has attracted significant interest in recent years as it aims to bridge the memory bottleneck in the Von Neumann architectures. IMC also improves the energy efficiency in ...these architectures. Another technique that has been explored to reduce the energy consumption is the use of approximate circuits, targeted toward error resilient applications. These applications have addition as one of their most frequently used operations. In literature, CMOS-based approximate adder libraries have been implemented to help designers choose from a variety of designs depending on the output quality requirements. However, the same is not true for memristor-based approximate adders targeted for IMC architectures. Hence, in this work, we developed a framework to generate approximate adder designs with varying output errors for the 8-, 12-, and 16-bit adders. We implemented a state-of-the-art scheduling algorithm to obtain the best mapping of these approximate adder designs for IMC. We performed an exhaustive design space exploration to obtain the pareto-optimal approximate adder designs for various design and error metrics. We then proposed IMAGIN, a library of approximate adders compatible with the memristor-based IMC architecture, which are based on the IMPLY and MAGIC design styles. We also performed mean filtering on the Kodak image dataset using the approximate adders from the IMAGIN library. IMAGIN can help designers select from a wide variety of approximate adders depending on the output quality requirements and serve as benchmarks for future research in this direction. All pareto-optimal designs will be made available at https://github.com/agra-uni-bremen/JxCDC2022-imagin-add .