Peer reviewed
  • Simulation and reconstructi...
    Cai, Wei; Zhu, Peimin; Li, Ziang

    Computers & Geosciences, August 2024, Volume 190
    Journal Article

    3D finite-difference time-domain numerical simulation and reconstruction based on domain decomposition are essential parts of high-performance computation for reverse-time migration and full-waveform inversion. However, low GPU utilization for small models and very large memory consumption for large models can lead to low computational efficiency and high memory costs. This paper proposes a contiguous memory management (CMM) method and a variable-order wavefield reconstruction (VWR) method. CMM places the many small arrays used for MPI communication in one larger contiguous memory block, which aims to reduce the number of MPI communications between subdomains and improve communication bandwidth, thereby reducing MPI time overhead and improving GPU utilization. VWR, in turn, lets the number of boundary wavefield layers stored for source wavefield reconstruction be set flexibly according to host memory capacity and accuracy requirements. Because VWR can store as few as one layer of the boundary wavefield, host memory consumption is significantly reduced. Numerical experiments show that CMM raises GPU utilization for a model of size 121³ from 25% to 90%, and that VWR reduces memory consumption by about 86% while maintaining good accuracy in wavefield reconstruction. The paper also discusses how to obtain a domain decomposition scheme with optimal performance.

    Highlights
    • A contiguous memory management strategy is proposed for improving the efficiency of MPI communication.
    • A variable-order wavefield reconstruction strategy is developed to reduce memory consumption.
    • The issue of how to obtain a domain decomposition scheme with optimal performance is discussed.
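
The contiguous-memory idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `allocate_halo_block`, the halo sizes, and the use of Python's standard `array` and `memoryview` types in place of GPU or MPI buffers are all assumptions made for illustration. The sketch only shows how several small exchange arrays can be carved out of one contiguous block, so that a single transfer of the block could replace one transfer per array.

```python
from array import array


def allocate_halo_block(sizes, typecode="f"):
    """Allocate one contiguous block and return it with per-array views.

    `sizes` lists the element count of each small halo array (hypothetical
    values). Each returned view aliases a slice of the block without copying.
    """
    total = sum(sizes)
    block = array(typecode, [0.0]) * total   # single contiguous allocation
    whole = memoryview(block)
    views, offset = [], 0
    for n in sizes:
        views.append(whole[offset:offset + n])  # zero-copy sub-view
        offset += n
    return block, views


# Hypothetical halo sizes for the six faces of one subdomain.
halo_sizes = [4, 4, 6, 6, 8, 8]
block, halos = allocate_halo_block(halo_sizes)

# Writing through a view updates the shared block directly.
for face_id, halo in enumerate(halos):
    for i in range(len(halo)):
        halo[i] = float(face_id)

# With separate arrays, each halo would need its own message;
# with the packed block, one message can carry all of them.
messages_separate = len(halos)
messages_packed = 1
```

In an actual MPI code the packed block would be passed to a single send or receive call. Whether that reproduces the gains reported in the abstract depends on details the abstract does not give.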