Search results

hits: 42
13.
  • Recursive Least-Squares Temporal Difference With Gradient Correction
    Song, Tianheng; Li, Dazi; Yang, Weimin ... IEEE transactions on cybernetics, 08/2021, Volume: 51, Issue: 8
    Journal Article
    Peer reviewed

    Since the late 1980s, temporal difference (TD) learning has dominated the research area of policy evaluation algorithms. However, the demand for the avoidance of TD defects, such as low ...
14.
  • Actor-Critic Learning Control Based on ℓ2-Regularized Temporal-Difference Prediction With Gradient Correction
    Li, Luntong; Li, Dazi; Song, Tianheng ... IEEE transactions on neural networks and learning systems, 12/2018, Volume: 29, Issue: 12
    Journal Article

    Actor-critic based on the policy gradient (PG-based AC) methods have been widely studied to solve learning control problems. In order to increase the data efficiency of learning prediction in the ...
15.
  • Actor-Critic Learning Control Based on ℓ2-Regularized Temporal-Difference Prediction With Gradient Correction
    Li, Luntong; Li, Dazi; Song, Tianheng ... IEEE transactions on neural networks and learning systems, 12/2018, Volume: 29, Issue: 12
    Journal Article

    Actor-critic based on the policy gradient (PG-based AC) methods have been widely studied to solve learning control problems. In order to increase the data efficiency of learning prediction in the ...
16.
  • Feature selection in deterministic policy gradient
    Li, Luntong; Li, Dazi; Song, Tianheng Journal of engineering (Stevenage, England), 07/2020, Volume: 2020, Issue: 13
    Journal Article
    Peer reviewed
    Open access

    The authors consider the task of learning control problem in reinforcement learning (RL) with continuous action space. Policy gradient, and in particular the deterministic policy gradient (DPG) ...
17.
  • Sparse Proximal Reinforcement Learning via Nested Optimization
    Song, Tianheng; Li, Dazi; Jin, Qibing ... IEEE transactions on systems, man, and cybernetics. Systems, 11/2020, Volume: 50, Issue: 11
    Journal Article
    Peer reviewed

    We consider the tasks of feature selection and policy evaluation based on linear value function approximation in reinforcement learning problems. High-dimension feature vectors and limited number of ...
18.
  • Online ℓ2-regularized reinforcement learning via RBF neural network
    Tianheng Song; Dazi Li 2016 Chinese Control and Decision Conference (CCDC)
    Conference Proceeding

    An ℓ2-regularized policy evaluation algorithm, termed RRC (Regularized RC), is proposed for applying in the reinforcement learning problems. RBF network is used to construct VFA, and its weight ...
19.
  • Sustainable ℓ2-regularized actor-critic based on recursive least-squares temporal difference learning
    Luntong Li; Dazi Li; Tianheng Song 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2017-Oct.
    Conference Proceeding

    Least-squares temporal difference learning (LSTD) has been used mainly for improving the data efficiency of the critic in actor-critic (AC). However, convergence analysis of the resulted algorithms ...
20.
  • Improving convolutional neural network using accelerated proximal gradient method for epilepsy diagnosis
    Dazi Li; Guifang Wang; Tianheng Song ... 2016 UKACC 11th International Conference on Control (CONTROL), 2016-Aug.
    Conference Proceeding

    The task of epilepsy diagnosing in medicine by classification of electroencephalograph (EEG) signals is considered. Since an EEG signal has a large number of dimensions as an input sample vector, ...