Search results

Results: 57
1.
  • Handling the problems and opportunities posed by multiple on-chip memory controllers
    Awasthi, Manu; Nellans, David W.; Sudan, Kshitij ... 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), 09/2010
    Conference Proceeding
    Open access

    Modern processors such as Tilera's Tile64, Intel's Nehalem, and AMD's Opteron are migrating memory controllers (MCs) on-chip, while maintaining a large, flat memory address space. This trend to ...
2.
  • Beyond block I/O: Rethinking traditional storage primitives
    Xiangyong Ouyang; Nellans, D; Wipfel, R ... 2011 IEEE 17th International Symposium on High Performance Computer Architecture, 02/2011
    Conference Proceeding

    Over the last twenty years the interfaces for accessing persistent storage within a computer system have remained essentially unchanged. Simply put, seek, read and write have defined the fundamental ...
3.
  • NVBit
    Villa, Oreste; Stephenson, Mark; Nellans, David ... Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 10/2019
    Conference Proceeding

    Binary instrumentation frameworks are widely used to implement profilers, performance evaluation, error checking, and bug detection tools. While dynamic binary instrumentation tools such as PIN and ...
4.
  • Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
    Pal, Saptadeep; Ebrahimi, Eiman; Zulfiqar, Arslan ... IEEE MICRO, Sept.-Oct. 2019, Volume: 39, Issue: 5
    Journal Article
    Peer reviewed
    Open access

    Deploying deep learning (DL) models across multiple compute devices to train large and complex models continues to grow in importance because of the demand for faster and more frequent training. Data ...
5.
  • Translation ranger
    Yan, Zi; Lustig, Daniel; Nellans, David ... 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA), 06/2019
    Conference Proceeding
    Open access

    Virtual memory (VM) eases programming effort but can suffer from high address translation overheads. Architects have traditionally coped by increasing Translation Lookaside Buffer (TLB) capacity; ...
6.
  • FinePack: Transparently Improving the Efficiency of Fine-Grained Transfers in Multi-GPU Systems
    Muthukrishnan, Harini; Lustig, Daniel; Villa, Oreste ... 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 02/2023
    Conference Proceeding

    Recent studies have shown that using fine-grained peer-to-peer (P2P) stores to communicate among devices in multi-GPU systems is a promising path to achieve strong performance scaling. In many ...
7.
  • Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows
    Kandiah, Vijay; Lustig, Daniel; Villa, Oreste ... Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization, 02/2023
    Conference Proceeding

    Achieving peak throughput on modern CPUs requires maximizing the use of single-instruction, multiple-data (SIMD) or vector compute units. Single-program, multiple-data (SPMD) programming models are ...
8.
  • GPS: A Global Publish-Subscribe Model for Multi-GPU Memory Management
    Muthukrishnan, Harini; Lustig, Daniel; Nellans, David ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 10/2021
    Conference Proceeding

    Suboptimal management of memory and bandwidth is one of the primary causes of low performance on systems comprising multiple GPUs. Existing memory management solutions like Unified Memory (UM) offer ...
9.
  • Buddy compression
    Choukse, Esha; Sullivan, Michael B.; O'Connor, Mike ... 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 05/2020
    Conference Proceeding
    Open access

    GPUs accelerate high-throughput applications, which require orders-of-magnitude higher memory bandwidth than traditional CPU-only systems. However, the capacity of such high-bandwidth memory tends to ...
10.
  • MCM-GPU: Multi-chip-module GPUs for continued performance scalability
    Arunkumar, Akhil; Bolotin, Evgeny; Cho, Benjamin ... 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), 06/2017
    Conference Proceeding

    Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number of transistors per die no longer grows at ...