A NOVEL ARTIFICIAL POTENTIAL FIELD-BASED REINFORCEMENT LEARNING FOR MOBILE ROBOTICS IN AMBIENT INTELLIGENCE

H. Chen∗ and L. Xie∗∗

