Full text
Peer reviewed
  • A reinforcement learning-ba...
    Wang, Xianjia; Yang, Zhipeng; Liu, Yanli; Chen, Guici

    Physica A, 05/2023, Volume: 618
    Journal Article

    The emergence of cooperation between competing agents is commonly studied through evolutionary games, but such cooperation often requires a mechanism or a third party to activate and sustain it. To investigate how a mechanism affects the evolution of cooperation, this paper proposes a reinforcement learning-based strategy updating model (RLSUM). The model consists of two symmetrical sets of convolutional neural networks. The agents’ strategy-updating rule has three steps: first, the agents learn and predict the environment and the behaviors of neighboring agents; then they estimate their future payoffs from this information; finally, they choose their strategies based on these estimated payoffs. The paper investigates the behavioral characteristics and the stable states of the network for highly intelligent agents with memory learning and prediction ability in the evolution of the prisoner’s dilemma game (PDG). The results show that game initiators who adopt the mixed optimal payoff approach can increase the number of cooperators and facilitate “global cooperation” and “repaying kindness with kindness”. Although the temptation factor has little effect on the population, increasing the discount factor can enlarge the cooperative cluster and even achieve dynamic stability. Additionally, a smaller minibatch benefits the evolution of cooperation when the experience replay pool is small, whereas a larger minibatch becomes more conducive to cooperation as the capacity of the experience replay pool increases. This research provides a novel perspective from reinforcement learning for understanding the evolution of cooperation.
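The quantities named in the abstract (temptation factor, discount factor, experience replay pool, minibatch size) can be illustrated with a minimal Python sketch. It is not the authors' model: the payoff matrix is assumed to be the common weak prisoner's dilemma (reward 1, temptation b, sucker's and punishment payoffs 0), the convolutional networks are omitted, and the names `pd_payoff`, `discounted_payoff`, and `ReplayPool` are hypothetical.

```python
import random
from collections import deque


def pd_payoff(my_action, other_action, temptation=1.5):
    """Payoff to the focal agent in one round.

    ASSUMPTION: weak prisoner's dilemma with R=1, T=temptation, S=P=0.
    The paper's actual payoff matrix is not given in the abstract.
    """
    if my_action == "C":
        return 1.0 if other_action == "C" else 0.0
    return temptation if other_action == "C" else 0.0


def discounted_payoff(payoffs, discount=0.9):
    """Discounted sum p0 + g*p1 + g^2*p2 + ... of a payoff sequence."""
    total = 0.0
    for p in reversed(payoffs):
        total = p + discount * total
    return total


class ReplayPool:
    """Fixed-capacity experience pool; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, experience):
        self.buffer.append(experience)

    def sample(self, minibatch_size):
        # Draw without replacement; never ask for more than is stored.
        k = min(minibatch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)
```

A larger `discount` weights later rounds more heavily in `discounted_payoff`, which is the sense in which the discount factor makes agents value future cooperation. The `capacity` and `minibatch_size` arguments correspond to the replay pool capacity and minibatch size whose interaction the abstract reports.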
• An innovative RLSUM is proposed to investigate the PDG.
• The behavioral characteristics and the steady state of the network are studied.
• The RLSUM can facilitate “global cooperation” and “repaying kindness with kindness”.
• The research provides a novel perspective from RL to understand the cooperative evolution.