Peer reviewed
  • Real-time data-driven dynam...
    Li, Yuxin; Gu, Wenbin; Yuan, Minghai; Tang, Yaming

    Robotics and Computer-Integrated Manufacturing, April 2022, Volume 74
    Journal Article

    • The multiobjective optimization model of the dynamic flexible job shop scheduling problem with insufficient transportation resources (DFJSP-ITR) is established to minimize the makespan and total energy consumption.
    • A hybrid deep Q network (HDQN) is developed for DFJSP-ITR so that the agent learns to select the appropriate rule according to the production state at each decision point. It adds three extensions to the deep Q network: double Q-learning, prioritized replay and a soft target network update policy.
    • To implement the DRL-based scheduling, the shop floor state model is established first, and then the decision point, 26 generic state features, genetic-programming-based action space and reward function are designed. Based on these, the training method using HDQN and the strategy for handling new job insertions and machine breakdowns are proposed.
    • Experimental results show that HDQN outperforms and generalizes better than current optimization-based approaches, and can effectively deal with disturbance events and unseen situations through learning.

    With the extensive application of automated guided vehicles in manufacturing systems, production scheduling under limited transportation resources has become a difficult problem. At the same time, real manufacturing systems are prone to various disturbance events, which increase the complexity and uncertainty of the shop floor. To this end, this paper addresses the dynamic flexible job shop scheduling problem with insufficient transportation resources (DFJSP-ITR) to minimize the makespan and total energy consumption. As a sequential decision-making problem, DFJSP-ITR can be modeled as a Markov decision process in which the agent determines the scheduling object and the allocation of resources at each decision point, so this paper adopts deep reinforcement learning to solve it. First, the multiobjective optimization model of DFJSP-ITR is established. Then, to enable the agent to learn to choose the appropriate rule based on the production state at each decision point, a hybrid deep Q network (HDQN) is developed, which combines the deep Q network with three extensions. Moreover, the shop floor state model is established, and then the decision point, generic state features, genetic-programming-based action space and reward function are designed. Based on these, the training method using HDQN and the strategy for handling new job insertions and machine breakdowns are proposed. Finally, comprehensive experiments are conducted, and the results show that HDQN outperforms and generalizes better than current optimization-based approaches, and can effectively deal with disturbance events and unseen situations through learning.
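The three extensions named in the abstract (double Q-learning, prioritized replay and a soft target network update) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the list-based stand-ins for network weights and Q-values, and the values of `gamma`, `tau`, `alpha` and `eps` are all assumptions chosen for illustration. Each action index is assumed to stand for one candidate scheduling rule.

```python
import random


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target.

    The online network chooses the next action (here: a rule index) and the
    target network evaluates it, which reduces overestimation of Q-values.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def soft_update(target_weights, online_weights, tau=0.01):
    """Soft target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for w, t in zip(online_weights, target_weights)]


def priority_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |TD error_i| + eps (assumed priority definition)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, rng=random):
    """Draw replay-buffer indices, favoring transitions with large TD error."""
    probs = priority_probabilities(td_errors)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)


if __name__ == "__main__":
    # Hypothetical Q-values for three candidate rules at the next decision point.
    print(double_q_target(1.0, [0.2, 0.9, 0.1], [5.0, 2.0, 7.0], gamma=0.5))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    print(sample_indices([0.1, 2.0, 0.5], batch_size=4))
```

In a full agent these pieces would wrap a neural network and a replay buffer; plain lists are used here only so that the arithmetic of each extension is visible.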