OM-TCN: A dynamic and agile opponent modeling approach for competitive games


Peer-reviewed


Ma, Yuxi; Shen, Meng; Zhang, Nan; Tong, Xiaoyao; Li, Yuanzhang

Information Sciences, November 2022, Volume 615

Journal Article

The non-stationarity of the environment is a crucial challenge for competitive Multi-Agent Reinforcement Learning (MARL) because the opponent’s policy changes constantly. Existing schemes struggle to produce a protagonist agent that responds agilely to the opponent’s changes and the resulting non-stationarity, which may limit their applicability. To address the dynamic opponent policy and adapt continuously to the non-stationary environment, we propose OM-TCN, a Temporal Convolutional Network (TCN) model for modeling and predicting opponent behaviors, and apply it to the widely used Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm for competitive MARL. In this work, we collect the opponent’s behavior data observed by the protagonist agent and serialize it at episode granularity. We then feed the resulting time-series data into OM-TCN for sequence modeling. OM-TCN learns the opponent’s historical behaviors instead of overfitting to a specific opponent policy, and can predict the opponent’s future actions. Finally, we use the predicted opponent actions in place of the history sampled from the replay buffer, and apply the OM-TCN model to the MADDPG framework for decentralized training. We evaluate the proposed method in the competitive scenario of the Multi-agent Particle Environment (MPE). Simulation results show that the protagonist agent learns a more efficient and stable policy and converges more easily than the baselines.
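The two ideas the abstract relies on, serializing observed opponent actions per episode and modeling them with causal (optionally dilated) convolutions, can be illustrated with a minimal pure-Python sketch. This is not the authors' OM-TCN architecture: the function names, the scalar action encoding, and the fixed weights are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def causal_dilated_conv(
    x: Sequence[float], weights: Sequence[float], dilation: int = 1
) -> List[float]:
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # implicit left zero-padding keeps the output causal
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Time steps (including the current one) visible to a stack of causal
    layers, assuming one convolution per layer (an illustrative simplification)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def make_windows(
    episode_actions: Sequence[float], history: int
) -> List[Tuple[List[float], float]]:
    """Split one episode of observed opponent actions into
    (history window, next action) training pairs."""
    pairs = []
    for t in range(history, len(episode_actions)):
        window = list(episode_actions[t - history:t])
        pairs.append((window, episode_actions[t]))
    return pairs


# Hypothetical episode of scalar-encoded opponent actions.
episode = [0.0, 1.0, 2.0, 3.0, 4.0]
hidden = causal_dilated_conv(episode, weights=[0.5, 0.5], dilation=2)
pairs = make_windows(episode, history=3)
```

Because each output depends only on current and past inputs, a predictor built from such layers cannot peek at the opponent's future actions during training. Stacking layers with growing dilations (for example 1, 2, 4) widens the visible history without increasing the kernel size.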
