DDETR-SLAM: A TRANSFORMER-BASED APPROACH TO POSE OPTIMISATION IN DYNAMIC ENVIRONMENTS

Feng Li, Yuanyuan Liu, Kelong Zhang, Zhengpeng Hu, and Guozheng Zhang

References

  1. [1] J. Fuentes-Pacheco, J. Ruiz-Ascencio, and J.M. Rendón-Mancha, Visual simultaneous localization and mapping: A survey, Artificial Intelligence Review, 43(1), 2015, 55–81.
  2. [2] A.J. Davison, I.D. Reid, N.D. Molton, and O. Stasse, MonoSLAM: Real-time single camera SLAM, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 2007, 1052–1067.
  3. [3] G. Klein and D. Murray, Parallel tracking and mapping for small AR workspaces, Proc. 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, Nara, 2007, 225–234.
  4. [4] R. Mur-Artal, J.M.M. Montiel, and J.D. Tardós, ORB-SLAM: A versatile and accurate monocular SLAM system, IEEE Transactions on Robotics, 31(5), 2015, 1147–1163.
  5. [5] R. Mur-Artal and J.D. Tardós, ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras, IEEE Transactions on Robotics, 33(5), 2017, 1255–1262.
  6. [6] T. Taketomi, H. Uchiyama, and S. Ikeda, Visual SLAM algorithms: A survey from 2010 to 2016, IPSJ Transactions on Computer Vision and Applications, 9(1), 2017, 1–11.
  7. [7] Z. Chang, H. Wu, Y. Sun, and C. Li, RGB-D visual SLAM based on Yolov4-tiny in indoor dynamic environment, Micromachines, 13(2), 2022, 230.
  8. [8] X. Gao, X. Shi, Q. Ge, and K. Chen, An overview of visual SLAM for dynamic object scenes, Robotics, 2021.
  9. [9] T. Diwan, G. Anirudh, and J.V. Tembhurne, Object detection using YOLO: Challenges, architectural successors, datasets and applications, Multimedia Tools and Applications, 82, 2022, 9243–9275.
  10. [10] S. Minaee, Y. Boykov, F. Porikli, A. Plaza, N. Kehtarnavaz, and D. Terzopoulos, Image segmentation using deep learning: A survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(7), 2022, 3523–3542.
  11. [11] L. Kenye and R. Kala, Improving RGB-D SLAM in dynamic environments using semantic aided segmentation, Robotica, 40(6), 2022, 2065–2090.
  12. [12] L. Cui and C. Ma, SOF-SLAM: A semantic visual SLAM for dynamic environments, IEEE Access, 7, 2019, 166528–166539.
  13. [13] B. Bescos, J.M. Facil, J. Civera, and J. Neira, DynaSLAM: Tracking, mapping, and inpainting in dynamic scenes, IEEE Robotics and Automation Letters, 3(4), 2018, 4076–4083.
  14. [14] F. Zhong, S. Wang, Z. Zhang, C. Chen, and Y. Wang, Detect-SLAM: Making object detection and SLAM mutually beneficial, Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), Lake Tahoe, NV, 2018, 1001–1010.
  15. [15] L. Xiao, J. Wang, X. Qiu, Z. Rong, and X. Zou, Dynamic-SLAM: Semantic monocular visual localization and mapping based on deep learning in dynamic environment, Robotics and Autonomous Systems, 117, 2019, 1–16.
  16. [16] W. Chen, M. Fang, Y.-H. Liu, and L. Li, Monocular semantic SLAM in dynamic street scene based on multiple object tracking, Proc. IEEE International Conf. on Cybernetics and Intelligent Systems (CIS) and IEEE Conference on Robotics, Automation and Mechatronics (RAM), Ningbo, 2017, 599–604.
  17. [17] Y. Hu, S. Ma, B. Li, M. Wang, and Y. Wang, Dynamic modelling of reconfigurable robots with independent locomotion and manipulation ability, International Journal of Robotics and Automation, 32(3), 2017, 206–4381.
  18. [18] G. Yang, Z. Chen, Y. Li, and Z. Su, Rapid relocation method for mobile robot based on improved ORB-SLAM2 algorithm, Remote Sensing, 11(2), 2019, 149.
  19. [19] C. Yu, Z. Liu, X.-J. Liu, F. Xie, Y. Yang, Q. Wei, and Q. Fei, DS-SLAM: A semantic visual SLAM towards dynamic environments, Proc. IEEE/RSJ International Conf. on Intelligent Robots and Systems (IROS), Madrid, Oct. 2018, 1168–1174.
  20. [20] K. He, G. Gkioxari, P. Dollár, and R. Girshick, Mask R-CNN, Proc. of the IEEE International Conf. on Computer Vision, Venice, 2017, 2961–2969.
  21. [21] V. Badrinarayanan, A. Kendall, and R. Cipolla, SegNet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2017, 2481–2495.
  22. [22] B. Triggs, P.F. McLauchlan, R.I. Hartley, and A.W. Fitzgibbon, Bundle adjustment—A modern synthesis, Proc. International Workshop on Vision Algorithms, Corfu, 1999, 298–372.
  23. [23] C.-F. Tsai, Bag-of-words representation in image annotation: A review, ISRN Artificial Intelligence, 2012, 2012, 1–19.
  24. [24] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, 2017, arXiv:1706.03762.
  25. [25] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, End-to-end object detection with transformers, 2020, arXiv:2005.12872.
  26. [26] X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, Deformable DETR: Deformable transformers for end-to-end object detection, 2021, arXiv:2010.04159.
  27. [27] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, A benchmark for the evaluation of RGB-D SLAM systems, Proc. IEEE/RSJ International Conf. on Intelligent Robots and Systems, Vilamoura-Algarve, 2012, 573–580.
  28. [28] Y. Sun, M. Liu, and M.Q.-H. Meng, Improving RGB-D SLAM in dynamic environments: A motion removal approach, Robotics and Autonomous Systems, 89, 2017, 110–122.
