NUK

Search results

Results: 19 (page 1 of 2)
1.
  • Natural Adversarial Examples
    Hendrycks, Dan; Zhao, Kevin; Basart, Steven ... 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021-June
    Conference Proceeding
    Open access

    We introduce two challenging datasets that reliably cause machine learning model performance to substantially degrade. The datasets are collected with a simple adversarial filtration technique to ...
Full text

PDF
2.
  • The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
    Hendrycks, Dan; Basart, Steven; Mu, Norman ... 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021-Oct.
    Conference Proceeding

    We introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more. With our new datasets, we take ...
Full text

PDF
3.
  • Towards Robustness of Neural Networks
    Basart, Steven 01/2021
    Dissertation

    We introduce several new datasets, namely ImageNet-A/O and ImageNet-R, as well as a synthetic environment and testing suite we call CAOS. ImageNet-A/O allows researchers to focus on the blind spots ...
Full text
4.
  • Towards Robustness of Neural Networks
    Basart, Steven arXiv (Cornell University), 12/2021
    Paper, Journal Article
    Open access

    We introduce several new datasets, namely ImageNet-A/O and ImageNet-R, as well as a synthetic environment and testing suite we call CAOS. ImageNet-A/O allows researchers to focus on the blind spots ...
Full text
5.
  • Aligning AI With Shared Human Values
    Hendrycks, Dan; Burns, Collin; Basart, Steven ... arXiv (Cornell University), 02/2023
    Paper, Journal Article
    Open access

    We show how to assess a language model's knowledge of basic concepts of morality. We introduce the ETHICS dataset, a new benchmark that spans concepts in justice, well-being, duties, virtues, and ...
Full text
6.
  • Natural Adversarial Examples
    Hendrycks, Dan; Zhao, Kevin; Basart, Steven ... arXiv (Cornell University), 03/2021
    Paper, Journal Article
    Open access

    We introduce two challenging datasets that reliably cause machine learning model performance to substantially degrade. The datasets are collected with a simple adversarial filtration technique to ...
Full text
7.
  • Testing Robustness Against Unforeseen Adversaries
    Kaufmann, Max; Kang, Daniel; Sun, Yi ... arXiv (Cornell University), 10/2023
    Paper, Journal Article
    Open access

    Adversarial robustness research primarily focuses on L_p perturbations, and most defenses are developed with identical training-time and test-time adversaries. However, in real-world applications ...
Full text
8.
  • Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
    Ren, Richard; Basart, Steven; Khoja, Adam ... arXiv (Cornell University), 07/2024
    Paper, Journal Article
    Open access

    As artificial intelligence systems grow more powerful, there has been increasing interest in "AI safety" research to address emerging and future risks. However, the field of AI safety remains poorly ...
Full text
9.
  • Measuring Mathematical Problem Solving With the MATH Dataset
    Hendrycks, Dan; Burns, Collin; Kadavath, Saurav ... arXiv (Cornell University), 11/2021
    Paper, Journal Article
    Open access

    Many intellectual endeavors require mathematical problem solving, but this skill remains beyond the capabilities of computers. To measure this ability in machine learning models, we introduce MATH, a ...
Full text
10.
  • Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
    Pan, Alexander; Chan, Jun Shern; Zou, Andy ... arXiv (Cornell University), 06/2023
    Paper, Journal Article
    Open access

    Artificial agents have traditionally been trained to maximize reward, which may incentivize power-seeking and deception, analogous to how next-token prediction in language models (LMs) may ...
Full text
