Search results

Results: 44
41.
  • MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
    Jiang, Ziheng; Lin, Haibin; Zhong, Yinmin ... arXiv (Cornell University), 02/2024
    Paper, Journal Article
    Open access

    We present the design, implementation and engineering experience in building and deploying MegaScale, a production system for training large language models (LLMs) at the scale of more than 10,000 ...
42.
  • DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
    Liu, Aixin; Feng, Bei; Wang, Bin ... arXiv (Cornell University), 06/2024
    Paper, Journal Article
    Open access

    We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated ...
43.
  • DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
    DeepSeek-AI; Bi, Xiao ... arXiv (Cornell University), 01/2024
    Journal Article
    Open access

    The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark ...
44.