An adaptive actor-critic algorithm with multi-step simulated experiences for controlling nonholonomic mobile robots

Rafiuddin Syam, Keigo Watanabe, Kiyotaka Izumi

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

In this paper, we propose a new adaptive actor-critic algorithm with multi-step simulated experiences, as a kind of temporal difference (TD) method. In our approach, the TD error is composed of two value functions and m utility functions, where m denotes the number of steps over which the experience is simulated. The value function is constructed from a critic formulated as a radial basis function neural network (RBFNN), whose input is a simulated experience generated from a predictive model based on a kinematic model. Since our approach assumes that the model is available both to simulate the m-step experiences and to design a controller, the same kinematic model is also used to construct the actor; the resulting model-based actor (MBA) is likewise regarded as a network, i.e., it is viewed as a resolved velocity control network. We apply this approach to the control of a nonholonomic mobile robot, specifically to a trajectory tracking control problem for the position coordinates and azimuth. Simulations show the effectiveness of the proposed method for controlling a mobile robot with two independent driving wheels.
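As a rough illustration of the structure described in the abstract, the sketch below builds an m-step TD error from two critic evaluations and m utility terms, using states predicted by a kinematic model. It is not the authors' implementation: the abstract does not give the formulas, so the standard unicycle kinematics, the Euler discretization, the Gaussian RBF critic, the squared-tracking-error utility, the discount factor `gamma`, the sign convention, and every function and parameter name here are assumptions chosen for illustration.

```python
import math


def wheels_to_body(v_right, v_left, wheel_base):
    """Standard differential-drive relation (assumed): wheel speeds -> (v, omega)."""
    return 0.5 * (v_right + v_left), (v_right - v_left) / wheel_base


def kinematic_step(pose, v, omega, dt):
    """One Euler step of the unicycle kinematic model; pose = (x, y, azimuth)."""
    x, y, theta = pose
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)


def rbf_critic(pose, centers, widths, weights):
    """Critic as a weighted sum of Gaussian radial basis functions (assumed form)."""
    value = 0.0
    for center, width, weight in zip(centers, widths, weights):
        dist_sq = sum((p - c) ** 2 for p, c in zip(pose, center))
        value += weight * math.exp(-dist_sq / (2.0 * width ** 2))
    return value


def utility(pose, reference):
    """Hypothetical utility: squared tracking error in position and azimuth."""
    return sum((p - r) ** 2 for p, r in zip(pose, reference))


def multi_step_td_error(pose, controls, references, critic, gamma, dt):
    """Assumed m-step TD error: m discounted utilities plus two value terms.

    delta = sum_{k=0}^{m-1} gamma^k * U(x_{k+1}) + gamma^m * V(x_m) - V(x_0),
    where x_1..x_m are simulated with the kinematic model, not measured.
    """
    value_now = critic(pose)          # first value function: current state
    predicted = pose
    discounted_utility = 0.0
    m = 0
    for (v, omega), reference in zip(controls, references):
        predicted = kinematic_step(predicted, v, omega, dt)  # simulated experience
        discounted_utility += gamma ** m * utility(predicted, reference)
        m += 1
    value_ahead = critic(predicted)   # second value function: m steps ahead
    return discounted_utility + gamma ** m * value_ahead - value_now


# Example call with made-up numbers: a 2-step lookahead along a straight line.
critic = lambda p: rbf_critic(p, [(0.0, 0.0, 0.0)], [1.0], [0.5])
delta = multi_step_td_error(
    pose=(0.0, 0.0, 0.0),
    controls=[(1.0, 0.0), (1.0, 0.0)],
    references=[(0.1, 0.0, 0.0), (0.2, 0.0, 0.0)],
    critic=critic, gamma=0.9, dt=0.1,
)
```

In a scheme of this kind, the resulting error would typically drive the update of the critic weights, while the controls passed in would come from the actor; how the paper performs either step is not stated in the abstract.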

Original language: English
Pages (from-to): 81-89
Number of pages: 9
Journal: Soft Computing
Volume: 11
Issue number: 1
Publication status: Published - Jan 1 2007
Externally published: Yes

Keywords

  • Actor-critic algorithms
  • Kinematic model
  • Multi-step prediction
  • Nonholonomic mobile robot
  • Nonlinear predictive model
  • Simulated experience

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Geometry and Topology
