DEEP REINFORCEMENT LEARNING FOR AUTONOMOUS CONTROL OF MANUFACTURING SYSTEMS IN VOCATIONAL EDUCATION: A COMPARATIVE ANALYSIS

doi:10.2316/J.2024.206-1106

Lei Liu, Yimeng Li, Haoran Li, and Dongmei Wang

References

[1] Y. Umeda, J. Ota, F. Kojima, M. Saito, H. Matsuzawa, T. Sukekawa, A. Takeuchi, K. Makida, and S. Shirafuji, Development of an education program for digital manufacturing system engineers based on ‘Digital Triplet’ concept, Procedia Manufacturing, 3(1), 2019, 363–369.
[2] D. Mourtzis, Simulation in the design and operation of manufacturing systems: State of the art and new trends, International Journal of Production Research, 58(7), 2020, 1927–1949.
[3] A. Polenghi, L. Fumagalli, and I. Roda, Role of simulation in industrial engineering: Focus on manufacturing systems, IFAC-PapersOnLine, 51(11), 2018, 496–501.
[4] Y. Lu, X. Xu, and L. Wang, Smart manufacturing process and system automation – A critical review of the standards and envisioned scenarios, Journal of Manufacturing Systems, 56(4), 2020, 312–325.
[5] K. Ding, F.T. Chan, X. Zhang, G. Zhou, and F. Zhang, Defining a digital twin-based cyber-physical production system for autonomous manufacturing in smart shop floors, International Journal of Production Research, 57(20), 2019, 6315–6334.
[6] I. Gonzalez and A. Jose Calderon, Development of final projects in engineering degrees around an industry 4.0-oriented flexible manufacturing system: Preliminary outcomes and some initial considerations, Education Sciences, 8(4), 2018, 214–215.
[7] R. Khajuria, A. Quyoom, and A. Sarwar, A comparison of deep reinforcement learning and deep learning for complex image analysis, Journal of Multimedia Information System, 7(1), 2020, 1–10.
[8] X. Xiang and S. Foo, Recent advances in deep reinforcement learning applications for solving partially observable Markov decision processes (POMDP) problems: Part 1 – Fundamentals and applications in games, robotics and natural language processing, Machine Learning and Knowledge Extraction, 3(3), 2021, 554–581.
[9] Y. Zhang, H. Zhu, D. Tang, T. Zhou, and Y. Gui, Dynamic job shop scheduling based on deep reinforcement learning for multi-agent manufacturing systems, Robotics and Computer-Integrated Manufacturing, 7(8), 2022, 102412–102413.
[10] D. Scrimieri, S.M. Afazov, and S.M. Ratchev, Design of a self-learning multi-agent framework for the adaptation of modular production systems, The International Journal of Advanced Manufacturing Technology, 115(5–6), 2021, 1745–1761.
[11] H. Zhang, J. Peng, H. Tan, H. Dong, and F. Ding, A deep reinforcement learning-based energy management framework with Lagrangian relaxation for plug-in hybrid electric vehicle, IEEE Transactions on Transportation Electrification, 7(3), 2020, 1146–1160.
[12] G. Li, J. Wu, C. Deng, X. Xu, and X. Shao, Deep reinforcement learning-based online domain adaptation method for fault diagnosis of rotating machinery, IEEE/ASME Transactions on Mechatronics, 27(5), 2021, 2796–2805.
[13] F. Li, B. Shen, J. Guo, K.Y. Lam, G. Wei, and L. Wang, Dynamic spectrum access for Internet-of-Things based on federated deep reinforcement learning, IEEE Transactions on Vehicular Technology, 71(7), 2022, 7952–7956.
[14] J. Cao, D. Harrold, Z. Fan, T. Morstyn, D. Healey, and K. Li, Deep reinforcement learning-based energy storage arbitrage with accurate lithium-ion battery degradation model, IEEE Transactions on Smart Grid, 11(5), 2020, 4513–4521.
[15] C. Li and T. Cong, Relying on the research on the systematic training of equipment manufacturing talents of vocational education group-taking the “three-two-stage” talent training of middle and higher vocational convergence as an example, Liaoning Journal of Higher Vocational Education, 21(11), 2019, 9–14.
[16] H. He, Y. Hansheng, and X. Yongjun, Exploration on the transformation of vocational education talent training under the new vocational needs of intelligent manufacturing engineering technology, Education and Occupation, 1004(4), 2022, 106–111.
[17] G. Gao, Y. Wenhui, and W. Feng, [Equipment health management] Intelligent diagnosis of the health status of manufacturing systems based on CPS method and vulnerability assessment, China Mechanical Engineering, 30(02), 2019, 212–214.
[18] H.R. Baghaee, M. Mirsalim, G.B. Gharehpetian, and H.A. Talebi, Unbalanced harmonic power sharing and voltage compensation of microgrids using radial basis function neural network-based harmonic power-flow calculations for distributed and decentralised control structures, IET Generation, Transmission & Distribution, 12(7), 2018, 1518–1530.
[19] C. Zhang and H. Wang, Swing vibration control of suspended structures using the active rotary inertia driver system: Theoretical modeling and experimental verification, Structural Control and Health Monitoring, 27(6), 2020, e2543–e2544.
[20] S. Sui, C.L. Philip Chen, and S. Tong, Neural network filtering control design for nontriangular structure switched nonlinear systems in finite time, IEEE Transactions on Neural Networks and Learning Systems, 30(7), 2018, 2153–2162.
[21] R. Pansare, G. Yadav, M.R. Nagare, and S. Jani, Mapping the competencies of reconfigurable manufacturing system with the requirements of industry 4.0, Journal of Remanufacturing, 12(3), 2022, 385–409.
[22] V. Charpenay, D. Schraudner, T. Seidelmann, T. Spieldenner, J. Weise, R. Schubotz, S. Mostaghim, and A. Harth, MOSAIK: A formal model for self-organizing manufacturing systems, IEEE Pervasive Computing, 20(1), 2020, 9–18.
[23] C. Yang, X. Wang, and S. Mao, RFID-pose: Vision-aided three-dimensional human pose estimation with radio-frequency identification, IEEE Transactions on Reliability, 70(3), 2020, 1218–1231.
[24] M.S. Novelan, Application of attendance monitoring system using RFID (Radio Frequency Identification) and interface, Jurnal Mantik, 4(3), 2020, 1837–1842.
[25] Y. Rouchdi, K. El Yassini, and K. Oufaska, Resolving security and privacy issues in radio frequency identification middleware, International Journal of Innovative Science, Engineering & Technology (IJISET), 5(2), 2018, 2348–7968.
[26] E.S. Soegoto, Radio frequency identification (RFID) smart card on parking system as e-business prospect, Journal of Engineering Science and Technology (JESTEC), 13(6), 2018, 1690–1699.
[27] L. Li, J. Zhang, Y. Wang, and B. Ran, Missing value imputation for traffic-related time series data based on a multi-view learning method, IEEE Transactions on Intelligent Transportation Systems, 20(8), 2018, 2933–2943.
[28] J. Ibarz, J. Tan, C. Finn, M. Kalakrishnan, P. Pastor, and S. Levine, How to train your robot with deep reinforcement learning: Lessons we have learned, The International Journal of Robotics Research, 40(4–5), 2021, 698–721.
[29] T. Li, Z. Wang, G. Yang, Y. Cui, Y. Chen, and X. Yu, Semi-selfish mining based on hidden Markov decision process, International Journal of Intelligent Systems, 36(7), 2021, 3596–3612.
[30] S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K.K. Leung, Dynamic service migration in mobile edge computing based on Markov decision process, IEEE/ACM Transactions on Networking, 27(3), 2019, 1272–1288.

Important Links:

DOI: 10.2316/J.2024.206-1106
From Journal (206) International Journal of Robotics and Automation - 2025