The petaflops supercomputer “Zhores”, recently launched at the Center for Computational and Data-Intensive Science and Engineering (CDISE) of the Skolkovo Institute of Science and Technology (Skoltech), opens up exciting new opportunities for scientific discovery at the institute, especially in the areas of data-driven modeling, machine learning and artificial intelligence. The supercomputer utilizes the latest generation of Intel and NVIDIA processors to provide resources for the most compute-intensive tasks of Skoltech scientists working in digital pharma, predictive analytics, photonics, material science, image processing, plasma physics and many more. It currently places 7th in the Russian and CIS TOP-50 (2019) supercomputer list. In this article we summarize the cluster's properties and discuss the measured performance and usage modes of this new scientific instrument at Skoltech.
•A FOSS photogrammetric workflow to quickly process several UAV images for hydro-geomorphological analysis was proposed.
•Approaches based on high-performance computing clusters to optimize the workflow processing time were described.
•Processing times were greatly improved through a “pssh” approach on the HTC cluster and parallel execution on a single HPC server.
Photogrammetry is one of the most reliable techniques for generating high-resolution topographic data, and it is key to territorial mapping and change-detection analysis of landforms in hydro-geomorphological high-risk areas. Specifically, Structure from Motion (SfM) is an emerging topographic survey technique that addresses the problem of determining the 3D positions of image descriptors to estimate three-dimensional structures. Thanks to the potential of the SfM algorithm and the development of Unmanned Aerial Vehicles (UAVs), which allow the on-demand acquisition of high-resolution aerial images, it is possible to survey extended areas of the Earth's surface and monitor active phenomena through multi-temporal surveys. However, the ability to cover remote and wide areas at very high resolution is countered by the need to capture large datasets, which can limit the photogrammetric process due to the need for high-performance hardware. This paper presents a photogrammetric workflow based on Free and Open-Source Software (FOSS) that is able to return different outputs and to manage a large amount of data in a reasonable time by distributing the most computationally expensive steps across computing clusters hosted by the ReCaS-Bari data center for scientific research. The results are given as performance evaluations for different computing configurations of the clusters and setups of the workflow steps. The HTC cluster test with a parallel SSH approach reduced the processing time for thousands of UAV images by several hours, especially compared to the classic photogrammetric process on a single workstation with commercial software.
A parallel test, aimed at validating the performance of a single server of the new HPC cluster, yielded very good results, halving the processing time with respect to the HTC cluster test.
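As a rough illustration of the parallel-SSH pattern used to distribute the most expensive steps, the sketch below fans image chunks out to worker nodes and waits for them all, which is essentially what a single “pssh” invocation does. Hostnames, chunk paths, and the per-chunk script are hypothetical placeholders, not the workflow's actual components:

```python
# Minimal sketch of distributing photogrammetric jobs over SSH.
# Hostnames, chunk paths, and process_chunk.sh are hypothetical.
import subprocess

WORKERS = ["node01", "node02", "node03", "node04"]   # hypothetical HTC nodes
CHUNKS = [f"/data/uav/chunk_{i:03d}" for i in range(len(WORKERS))]

procs = []
for host, chunk in zip(WORKERS, CHUNKS):
    # Each node runs the compute-heavy SfM step on its own image chunk.
    procs.append(subprocess.Popen(["ssh", host, f"process_chunk.sh {chunk}"]))

# Block until every node has finished, as pssh does in one command.
for p in procs:
    p.wait()
```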
Reducing Coflow Completion Time (CCT) has a significant impact on application performance in data-parallel frameworks. Most existing works assume that the endpoints of the constituent flows in each coflow are predetermined. We argue that CCT can be further optimized by treating flows' destinations as an additional optimization dimension via reducer placement. In this article, we propose and implement RPC, a joint online Reducer Placement and Coflow bandwidth scheduling framework, to minimize the average CCT in cloud clusters. We first develop a 2-approximation algorithm to minimize the CCT of a single coflow, and then schedule all coflows following the Shortest Remaining Time First (SRTF) principle. We use real testbed experiments and extensive large-scale simulations to demonstrate that RPC can reduce the average CCT by 64.98% compared with state-of-the-art technologies.
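To make the SRTF principle at the coflow level concrete, the toy scheduler below always runs the coflow with the least remaining data, simplified to a single bottleneck link of unit bandwidth. Arrival times and sizes are illustrative; this is not RPC's actual scheduler, which also places reducers and shares bandwidth across links:

```python
# Toy SRTF scheduling of coflows on one unit-bandwidth bottleneck link.
# Each coflow is reduced to its remaining bytes; the shortest runs first.
import heapq

def srtf_avg_cct(coflows):
    """coflows: list of (arrival_time, size_bytes). Returns average CCT."""
    events = sorted(coflows)                 # by arrival time
    ready, t, i, ccts = [], 0.0, 0, []
    while ready or i < len(events):
        if not ready:                        # link idle: jump to next arrival
            t = max(t, events[i][0])
        while i < len(events) and events[i][0] <= t:
            heapq.heappush(ready, (events[i][1], events[i][0]))
            i += 1
        rem, arr = heapq.heappop(ready)
        # Run the shortest coflow until it finishes or a new one arrives.
        horizon = events[i][0] if i < len(events) else float("inf")
        run = min(rem, horizon - t)
        t += run
        if rem - run > 1e-12:
            heapq.heappush(ready, (rem - run, arr))
        else:
            ccts.append(t - arr)
    return sum(ccts) / len(ccts)

print(srtf_avg_cct([(0.0, 5.0), (1.0, 2.0), (2.0, 8.0)]))  # -> ~7.33
```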
An open‐source program named VeloxChem has been developed for the calculation of electronic real and complex linear response functions at the levels of Hartree–Fock and Kohn–Sham density functional theory. With an object‐oriented program structure written in a Python/C++ layered fashion, VeloxChem enables time‐efficient prototyping of novel scientific approaches without sacrificing computational efficiency, so that molecular systems involving up to and beyond 500 second‐row atoms (or some 10,000 contracted and in part diffuse Gaussian basis functions) can be routinely addressed. In addition, VeloxChem is equipped with a polarizable embedding scheme for the treatment of classical electrostatic interactions with an environment that is in turn modeled by atomic site charges and polarizabilities. The underlying hybrid message passing interface (MPI)/open multiprocessing (OpenMP) parallelization scheme makes VeloxChem suitable for execution in high‐performance computing cluster environments, showing even slightly beyond-linear scaling for the Fock matrix construction with use of up to 16,384 central processing unit (CPU) cores. A multifrequency/gradient complex linear response equation solver, efficient in both convergence rate and overall computational cost, enables calculations not only of conventional spectra, such as visible/ultraviolet/X‐ray electronic absorption and circular dichroism spectra, but also of time‐resolved linear response signals due to ultra‐short weak laser pulses. VeloxChem is distributed under the GNU Lesser General Public License version 2.1 (LGPLv2.1) and made available for download from the homepage https://veloxchem.org.
This article is categorized under:
Software > Quantum Chemistry
Electronic Structure Theory > Density Functional Theory
Theoretical and Physical Chemistry > Spectroscopy
With a high degree of code vectorization and parallelization, the VeloxChem program provides a powerful tool to calculate absorptive and dispersive parts of real and complex linear response functions at the level of Kohn–Sham density functional theory, also allowing for a treatment of ultra‐short light pulses.
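As a schematic of what a complex (damped) linear response solver computes, the sketch below solves (E − (ω + iγ)S)X = −V on a small arbitrary symmetric model and evaluates the response function, whose imaginary part traces the absorption profile. This is a toy of the underlying equations with made-up matrices, not VeloxChem's actual solver or API:

```python
# Toy damped linear response: alpha(w) = V^T (E - (w + i*gamma) S)^{-1} V.
# E, S, V form a small arbitrary model, not a real molecular Hessian.
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n))
E = A @ A.T + n * np.eye(n)        # symmetric positive-definite "Hessian"
S = np.eye(n)                      # metric
V = rng.standard_normal(n)         # property gradient

gamma = 0.1                        # damping (lifetime broadening)
for w in np.linspace(0.0, 40.0, 5):
    X = np.linalg.solve(E - (w + 1j * gamma) * S, -V)
    alpha = -V @ X                 # complex response function
    # Im(alpha) peaks near resonances and is proportional to absorption.
    print(f"w = {w:5.1f}   Re = {alpha.real: .4f}   Im = {alpha.imag: .4f}")
```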
Edge computing is emerging as a new paradigm that allows processing data near the edge of the network, where the data is typically generated and collected. This enables critical computations at the edge in applications such as the Internet of Things (IoT), in which an increasing number of devices (sensors, cameras, health monitoring devices, etc.) collect data that needs to be processed through computationally intensive algorithms with stringent reliability, security and latency constraints. Our key tool is the theory of coded computation, which advocates mixing data in computationally intensive tasks by employing erasure codes and offloading these tasks to other devices for computation. Coded computation has recently been gaining interest thanks to its higher reliability, smaller delay, and lower communication costs. In this paper, we develop a private and rateless adaptive coded computation (PRAC) algorithm for distributed matrix-vector multiplication that takes into account (1) the privacy requirements of IoT applications and devices, and (2) the heterogeneous and time-varying resources of edge devices. We show that PRAC outperforms known secure coded computing methods when resources are heterogeneous. We provide theoretical guarantees on the performance of PRAC and its comparison to baselines. Moreover, we confirm our theoretical results through simulations and implementations on Android-based smartphones.
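The core idea behind coded matrix-vector multiplication can be sketched as follows: the row blocks of A are mixed into n > k coded tasks, and y = Ax is decoded from any k workers' answers, so stragglers can simply be ignored. The sketch uses a plain random linear code over the reals for illustration; PRAC's private, rateless construction is more elaborate:

```python
# Toy coded matrix-vector multiply: k data blocks, n coded tasks,
# and y = A @ x recovered from any k returned results.
import numpy as np

rng = np.random.default_rng(1)
k, n = 4, 6                            # 4 blocks, 2 redundant tasks
A = rng.standard_normal((8, 5))        # 8 rows split into k blocks of 2
x = rng.standard_normal(5)

blocks = np.split(A, k)                # row blocks A_1 .. A_k
G = rng.standard_normal((n, k))        # encoding matrix; any k rows invertible a.s.
coded = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

# Worker i computes coded[i] @ x; suppose only workers {0, 2, 3, 5} reply.
replies = {i: coded[i] @ x for i in (0, 2, 3, 5)}

idx = sorted(replies)
parts = np.linalg.inv(G[idx, :]) @ np.stack([replies[i] for i in idx])
y = np.concatenate(list(parts))        # stitch decoded block results together
assert np.allclose(y, A @ x)
print("decoded y matches A @ x despite two stragglers")
```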
•We describe resilient operation of cyber-physical application platforms.
•We describe implicit design-time encoding of the reconfiguration.
•We describe design-time analysis and validation tools for these systems.
Improvements in mobile networking, combined with the ubiquitous availability and adoption of low-cost development boards, have enabled the vision of mobile platforms of Cyber-Physical Systems (CPS), such as fractionated spacecraft and UAV swarms. These systems are characterized by computation and communication resources, sensors, and actuators that are shared among different applications. Their cyber-physical nature means that the physical environment can affect both resource availability and the software applications that depend on it. While many application development and management challenges associated with such systems have been described in the existing literature, resilient operation and execution have received less attention. This paper describes our work on improving runtime support for resilience in mobile CPS, with a special focus on our runtime infrastructure that provides autonomous resilience via self-reconfiguration. We also describe the interplay between this runtime infrastructure and our design-time tools, as the latter is used to statically determine the resilience properties of the former. Finally, we present a use case study to demonstrate and evaluate our design-time resilience analysis and runtime self-reconfiguration infrastructure.
The Gator program has been developed for computational spectroscopy and calculations of molecular properties using real and complex propagators at the correlated level of wave function theory. Currently, the focus lies on methods based on the algebraic diagrammatic construction (ADC) scheme up to the third order of perturbation theory. An auxiliary Fock matrix‐driven implementation of the second‐order ADC method for excitation energies has been realized with an underlying hybrid MPI/OpenMP parallelization scheme suitable for execution in high‐performance computing cluster environments. With a modular and object‐oriented program structure written in a Python/C++ layered fashion, Gator additionally enables time‐efficient prototyping of novel scientific approaches, as well as interactive notebook‐driven training of students in quantum chemistry.
This article is categorized under:
Computer and Information Science > Computer Algorithms and Programming
Electronic Structure Theory > Ab Initio Electronic Structure Methods
Software > Quantum Chemistry
The Gator program is an easy‐to‐use yet powerful tool to calculate molecular properties and spectroscopies with real and complex response functions using the second‐ and third‐order algebraic diagrammatic construction schemes.
This work presents an efficient approach for the generation of distributed Sparse Approximate Inverse preconditioners based on near-field coupling information for the analysis of electromagnetic problems on large computing clusters. This scheme combines the Message Passing Interface and Open Multi-Processing paradigms in order to minimise the CPU time and memory footprint of the preconditioner, making use of specific algorithms tailored to balance the load and reduce the amount of information shared between nodes. Some representative examples provide insight into the scalability and performance of the described approach when addressing large and realistic scenarios.
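A sparse approximate inverse is built one column at a time: column j of M is constrained to a prescribed sparsity pattern (in this work, the near-field coupling pattern) and obtained from a small least-squares problem, which is what makes the construction embarrassingly parallel across MPI ranks and OpenMP threads. A minimal serial sketch, taking the pattern of A itself as a stand-in for the near-field pattern:

```python
# Toy SAI: M ~ A^{-1} built column by column on a fixed sparsity pattern.
# Here the pattern of A stands in for the near-field coupling pattern.
import numpy as np
import scipy.sparse as sp

n = 50
A = sp.random(n, n, density=0.1, random_state=2, format="csc") + sp.eye(n, format="csc")
Ad = A.toarray()

M = np.zeros((n, n))
for j in range(n):                          # columns are independent -> parallel
    pattern = A[:, j].nonzero()[0]          # allowed nonzeros of column j
    e = np.zeros(n)
    e[j] = 1.0
    # min over m_j supported on 'pattern' of || A m_j - e_j ||_2
    m, *_ = np.linalg.lstsq(Ad[:, pattern], e, rcond=None)
    M[pattern, j] = m

print("||A M - I||_F =", np.linalg.norm(Ad @ M - np.eye(n)))
```

In a production code each least-squares problem would also be restricted to the relevant rows, keeping every subproblem small regardless of the global problem size.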
Mobile Cloud Computing enables the migration of services to the edge of the Internet. Therefore, high-performance computing clusters are widely deployed to improve the computational capabilities of such environments. However, they are prone to failures and need analytical models that predict their behavior in order to deliver the desired quality of service and quality of experience to mobile users. This article proposes a 3D analytical model and a problem-solving approach for the sustainability evaluation of high-performance computing clusters. The proposed solution uses an iterative approach to obtain performance measurements and overcome the state-space explosion problem. The availability modeling and evaluation of master and computing nodes are performed using a multi-repairman approach. The optimum number of repairmen is also obtained to get realistic results and reduce the overall cost. The proposed model is validated using discrete event simulation. The analytical approach is much faster than, and in good agreement with, the simulations. The analysis focuses on mean queue length, throughput, and mean response time outputs. The maximum differences between analytical and simulation results in the considered scenarios of up to a billion states are less than 1.149, 3.82, and 3.76 percent, respectively. These differences are well within the 5 percent confidence interval of the simulation and the proposed model.
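The multi-repairman availability component corresponds to the classical machine-repair birth-death chain: with N identical nodes failing at rate λ each and r repairmen each repairing at rate μ, the steady-state distribution over the number of failed nodes follows from the balance equations. A minimal sketch with illustrative rates (not the article's full 3D model):

```python
# Toy machine-repair model: N nodes fail at rate lam each; r repairmen
# each repair at rate mu. State k = number of failed nodes.
import numpy as np

def repairman_steady_state(N, r, lam, mu):
    w = [1.0]                                # unnormalized pi_0
    for k in range(N):                       # balance across the cut k -> k+1
        w.append(w[-1] * (N - k) * lam / (min(k + 1, r) * mu))
    return np.array(w) / sum(w)

N, lam, mu = 10, 0.01, 0.5
for r in (1, 2, 3):
    pi = repairman_steady_state(N, r, lam, mu)
    avail = sum(p * (N - k) for k, p in enumerate(pi)) / N
    print(f"r = {r}: mean per-node availability = {avail:.4f}")
```

Sweeping r this way is one simple route to the cost/availability trade-off behind choosing the optimum number of repairmen.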
High-performance computing clusters are increasingly operating under a shared/buy-in paradigm. Under this paradigm, users choose between two tiers of services: shared services and buy-in services. Shared services provide users with free access to shared resources, while buy-in services allow users to purchase additional buy-in resources in order to shorten job completion times. An important feature of shared/buy-in computing systems is that unused buy-in resources are made available to all other users of the system. This feature has been shown to enhance the utilization of resources. At the same time, it creates strategic interactions among users, giving rise to a non-cooperative game at the system level. Specifically, each user faces the questions of whether to purchase buy-in resources and, if so, how much to pay for them. Under quite general conditions, we establish that a shared/buy-in computing game yields a unique Nash equilibrium, which can be computed in polynomial time. We provide an algorithm for this purpose, which can be implemented in a distributed manner. Moreover, by establishing a connection to the theory of aggregative games, we prove that the game converges to the Nash equilibrium through best response dynamics from any initial state. We justify the underlying game-theoretic assumptions of our model using real data from a computing cluster, and conduct numerical simulations to further explore convergence properties and the influence of system parameters on the Nash equilibrium. In particular, we point out potential unfairness and abuse issues and discuss solution avenues.
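Best response dynamics in an aggregative game can be sketched on a toy model: each user i picks a spend x_i to maximize b_i·log(1 + Σx) − x_i, so the best response depends only on the aggregate of the others' actions. The utilities and valuations below are illustrative stand-ins, not the paper's shared/buy-in cost model:

```python
# Toy best-response dynamics in an aggregative game. Each user maximizes
# b_i * log(1 + x_i + others) - x_i, whose best response is
# x_i = max(0, b_i - 1 - others). Sequential updates converge here.

b = [4.0, 3.0, 2.0]                        # hypothetical user valuations
x = [0.0] * len(b)                         # arbitrary initial state

for it in range(100):
    changed = False
    for i in range(len(b)):
        others = sum(x) - x[i]             # aggregate of the other players
        best = max(0.0, b[i] - 1.0 - others)
        if abs(best - x[i]) > 1e-12:
            x[i], changed = best, True
    if not changed:                        # fixed point = Nash equilibrium
        break

print("Nash equilibrium spends:", x)       # -> [3.0, 0.0, 0.0]
```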