E-viri
Recenzirano
Odprti dostop
-
Reymond, Mathieu; Hayes, Conor F.; Steckelmacher, Denis; Roijers, Diederik M.; Nowé, Ann
Autonomous agents and multi-agent systems, 10/2023, Letnik: 37, Številka: 2Journal Article
We propose a novel multi-objective reinforcement learning algorithm that successfully learns the optimal policy even for non-linear utility functions. Non-linear utility functions pose a challenge for SOTA approaches, both in terms of learning efficiency as well as the solution concept. A key insight is that, by proposing a critic that learns a multi-variate distribution over the returns, which is then combined with accumulated rewards, we can directly optimize on the utility function, even if it is non-linear. This allows us to vastly increase the range of problems that can be solved compared to those which can be handled by single-objective methods or multi-objective methods requiring linear utility functions, yet avoiding the need to learn the full Pareto front. We demonstrate our method on multiple multi-objective benchmarks, and show that it learns effectively where baseline approaches fail.
Vnos na polico
Trajna povezava
- URL:
Faktor vpliva
Dostop do baze podatkov JCR je dovoljen samo uporabnikom iz Slovenije. Vaš trenutni IP-naslov ni na seznamu dovoljenih za dostop, zato je potrebna avtentikacija z ustreznim računom AAI.
Leto | Faktor vpliva | Izdaja | Kategorija | Razvrstitev | ||||
---|---|---|---|---|---|---|---|---|
JCR | SNIP | JCR | SNIP | JCR | SNIP | JCR | SNIP |
Baze podatkov, v katerih je revija indeksirana
Ime baze podatkov | Področje | Leto |
---|
Povezave do osebnih bibliografij avtorjev | Povezave do podatkov o raziskovalcih v sistemu SICRIS |
---|
Vir: Osebne bibliografije
in: SICRIS
To gradivo vam je dostopno v celotnem besedilu. Če kljub temu želite naročiti gradivo, kliknite gumb Nadaljuj.