E-resources
Peer reviewed
-
Li, Yuxin; Gu, Wenbin; Yuan, Minghai; Tang, Yaming
Robotics and computer-integrated manufacturing, April 2022, 2022-04-00, 20220401, Volume: 74Journal Article
•The multiobjective optimization model of dynamic flexible job shop scheduling problem with insufficient transportation resources (DFJSP-ITR) is established to minimize the makespan and total energy consumption.•A hybrid deep Q network (HDQN) is developed for DFJSP-ITR to make agent learn to select the appropriate rule according to the production state at each decision point, which has three extensions to deep Q network: double Q-learning, prioritized replay and soft target network update policy.•To implement the DRL-based scheduling, the shop floor state model is established at first, and then the decision point, 26 generic state features, genetic-programming-based action space and reward function are designed. Based on the above contents, the training method using HDQN and the strategy for facing new job insertions and machine breakdowns are proposed.•Experimental results show that HDQN has superiority and generality compared with current optimization-based approaches, and can effectively deal with disturbance events and unseen situations through learning. With the extensive application of automated guided vehicles in manufacturing system, production scheduling considering limited transportation resources becomes a difficult problem. At the same time, the real manufacturing system is prone to various disturbance events, which increase the complexity and uncertainty of shop floor. To this end, this paper addresses the dynamic flexible job shop scheduling problem with insufficient transportation resources (DFJSP-ITR) to minimize the makespan and total energy consumption. As a sequential decision-making problem, DFJSP-ITR can be modeled as a Markov decision process where the agent should determine the scheduling object and allocation of resources at each decision point. So this paper adopts deep reinforcement learning to solve DFJSP-ITR. In this paper, the multiobjective optimization model of DFJSP-ITR is established. Then, in order to make agent learn to choose the appropriate rule based on the production state at each decision point, a hybrid deep Q network (HDQN) is developed for this problem, which combines deep Q network with three extensions. Moreover, the shop floor state model is established at first, and then the decision point, generic state features, genetic-programming-based action space and reward function are designed. Based on these contents, the training method using HDQN and the strategy for facing new job insertions and machine breakdowns are proposed. Finally, comprehensive experiments are conducted, and the results show that HDQN has superiority and generality compared with current optimization-based approaches, and can effectively deal with disturbance events and unseen situations through learning.
![loading ... loading ...](themes/default/img/ajax-loading.gif)
Shelf entry
Permalink
- URL:
Impact factor
Access to the JCR database is permitted only to users from Slovenia. Your current IP address is not on the list of IP addresses with access permission, and authentication with the relevant AAI accout is required.
Year | Impact factor | Edition | Category | Classification | ||||
---|---|---|---|---|---|---|---|---|
JCR | SNIP | JCR | SNIP | JCR | SNIP | JCR | SNIP |
Select the library membership card:
If the library membership card is not in the list,
add a new one.
DRS, in which the journal is indexed
Database name | Field | Year |
---|
Links to authors' personal bibliographies | Links to information on researchers in the SICRIS system |
---|
Source: Personal bibliographies
and: SICRIS
The material is available in full text. If you wish to order the material anyway, click the Continue button.