  • Jain, Akshat; Kazi, Owais; Joshi, Raviraj; Basantwani, Shraddha; Bang, Yogita; Khengare, Rahul

    2019 IEEE 5th International Conference for Convergence in Technology (I2CT), March 2019
    Conference Proceeding

    The GPU usually handles homogeneous data-parallel work by taking advantage of its massive number of cores, and most applications use CUDA programming to harness that power. In data-intensive, computationally demanding applications such as neural networks, relying on the GPU of a single machine is time consuming; if multiple GPUs in a network are used instead, the required time is significantly reduced. Traditionally, to enable a set of programs to run in a distributed environment, the programmer has to design components that make the system dynamic and resilient to changes in the number of machines in the cluster, which is a daunting task. Such hand-crafted work distribution can also be a poor solution, as it may underutilize the GPUs. In our approach, we developed a framework that transparently distributes data-parallel kernels across multiple GPUs in a distributed network. The programmer is responsible only for developing a single data-parallel kernel in CUDA, while the framework automatically distributes the workload across an arbitrary set of CUDA-enabled GPUs. The distribution is optimized according to the current workload on each GPU and the amount of data to be processed. The goal is to maximally utilize the available resources with minimal programming complexity. Systems not compatible with CUDA can also utilize our solution. We expect our framework to reduce processing time while simplifying the programmer's task.
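
    The abstract is prose only, but a minimal sketch can illustrate the kind of multi-GPU boilerplate the framework is said to automate: one data-parallel CUDA kernel whose input is split across every visible device. This is a single-node sketch under assumed names (the kernel scale and the problem size N are hypothetical); the paper's framework would additionally span GPUs across a network and size each chunk by the device's current load rather than splitting evenly.

    #include <cstdio>
    #include <vector>
    #include <cuda_runtime.h>

    // Hypothetical data-parallel kernel: each thread scales one element.
    __global__ void scale(float *data, int n, float factor) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    int main() {
        const int N = 1 << 20;                 // assumed problem size
        std::vector<float> host(N, 1.0f);

        int devices = 0;
        cudaGetDeviceCount(&devices);
        if (devices == 0) { std::fprintf(stderr, "no CUDA devices\n"); return 1; }

        // Even split; the paper's framework would instead size each
        // chunk by the GPU's current workload.
        int chunk = (N + devices - 1) / devices;
        std::vector<float*> dev(devices, nullptr);

        for (int d = 0; d < devices; ++d) {
            int off = d * chunk;
            int cnt = (off < N) ? ((N - off < chunk) ? N - off : chunk) : 0;
            if (cnt == 0) continue;
            cudaSetDevice(d);                  // target one GPU
            cudaMalloc(&dev[d], cnt * sizeof(float));
            cudaMemcpy(dev[d], host.data() + off, cnt * sizeof(float),
                       cudaMemcpyHostToDevice);
            scale<<<(cnt + 255) / 256, 256>>>(dev[d], cnt, 2.0f);
        }

        for (int d = 0; d < devices; ++d) {    // gather partial results
            int off = d * chunk;
            int cnt = (off < N) ? ((N - off < chunk) ? N - off : chunk) : 0;
            if (cnt == 0) continue;
            cudaSetDevice(d);
            cudaMemcpy(host.data() + off, dev[d], cnt * sizeof(float),
                       cudaMemcpyDeviceToHost);  // default stream: syncs after kernel
            cudaFree(dev[d]);
        }
        std::printf("host[0] = %f\n", host[0]);  // expect 2.0
        return 0;
    }

    The even split is the simplest possible policy; the load- and data-aware distribution described in the abstract would replace the fixed chunk computation with per-device chunk sizes, and the explicit per-device loops are exactly the boilerplate the framework hides behind a single kernel definition.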