
Search results



Results: 23
1.
  • X-Risk Analysis for AI Research
    Hendrycks, Dan; Mazeika, Mantas. arXiv (Cornell University), 09/2022
    Paper, Journal Article
    Open access

    Artificial intelligence (AI) has the potential to greatly improve society, but as with any powerful technology, it comes with heightened risks and responsibilities. Current AI research lacks a ...
Full text
Available to: NUK, UL, UM, UPUK
2.
  • An Overview of Catastrophic AI Risks
    Hendrycks, Dan; Mazeika, Mantas; Woodside, Thomas. arXiv.org, 10/2023
    Paper, Journal Article
    Open access

    Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose ...
Full text
Available to: NUK, UL, UM, UPUK
3.
  • How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
    Mazeika, Mantas; Li, Bo; Forsyth, David. arXiv (Cornell University), 06/2022
    Paper, Journal Article
    Open access

    Model stealing attacks present a dilemma for public machine learning APIs. To protect financial investments, companies may be forced to withhold important information about their models that could ...
Full text
Available to: NUK, UL, UM, UPUK
4.
  • PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
    Hendrycks, Dan; Zou, Andy; Mazeika, Mantas; et al. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 06/2022
    Conference Proceeding
    Open access

    In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard test set accuracy. These other goals include out-of-distribution (OOD) ...
Full text
Available to: IJS, NUK, UL, UM
5.
  • Testing Robustness Against Unforeseen Adversaries
    Kaufmann, Max; Kang, Daniel; Sun, Yi; et al. arXiv.org, 10/2023
    Paper, Journal Article
    Open access

    Adversarial robustness research primarily focuses on L_p perturbations, and most defenses are developed with identical training-time and test-time adversaries. However, in real-world applications ...
Full text
Available to: NUK, UL, UM, UPUK
6.
  • PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
    Hendrycks, Dan; Zou, Andy; Mazeika, Mantas; et al. arXiv (Cornell University), 12/2021
    Paper, Journal Article
    Open access

    In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard test set accuracy. These other goals include out-of-distribution (OOD) ...
Full text
Available to: NUK, UL, UM, UPUK
7.
  • Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty
    Hendrycks, Dan; Mazeika, Mantas; Kadavath, Saurav; et al. arXiv.org, 10/2019
    Paper, Journal Article
    Open access

    Self-supervision provides effective representations for downstream tasks without requiring labels. However, existing approaches lag behind fully supervised training and are often not thought ...
Full text
Available to: NUK, UL, UM, UPUK
8.
  • Using Pre-Training Can Improve Model Robustness and Uncertainty
    Hendrycks, Dan; Lee, Kimin; Mazeika, Mantas. arXiv.org, 10/2019
    Paper, Journal Article
    Open access

    He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training ...
Full text
Available to: NUK, UL, UM, UPUK
9.
  • Scaling Out-of-Distribution Detection for Real-World Settings
    Hendrycks, Dan; Basart, Steven; Mazeika, Mantas; et al. arXiv (Cornell University), 05/2022
    Paper, Journal Article
    Open access

    Detecting out-of-distribution examples is important for safety-critical machine learning applications such as detecting novel biological phenomena and self-driving cars. However, existing research ...
Full text
Available to: NUK, UL, UM, UPUK
10.
  • HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
    Mazeika, Mantas; Phan, Long; Yin, Xuwang; et al. arXiv.org, 02/2024
    Paper, Journal Article
    Open access

    Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized ...
Full text
Available to: NUK, UL, UM, UPUK
