Abstract
In this paper, we propose two adaptive actor-critic architectures for solving control problems of nonlinear systems. The first method uses two actual states, at time k and time k+1, to update the learning algorithm; the underlying idea is that the agent can directly incorporate information from the environment to improve its knowledge. The second method uses only the state at time k to update the algorithm, and is called learning from prediction (or from simulated experience). Both methods include one or two predictive models, which are used to construct predictive states and a model-based actor (MBA). The MBA can be viewed as a network whose connection weights are the elements of the feedback gain matrix. In the critic part, two value functions are realized as pure static mappings, which can be reduced to nonlinear current estimators using radial basis function neural networks (RBFNNs). Simulation results for a dynamical model of a nonholonomic mobile robot with two independent driving wheels demonstrate the effectiveness of the proposed approaches for the trajectory tracking control problem.
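The first method described above can be sketched in code. The fragment below is a minimal illustration, not the paper's implementation: an RBFNN critic approximates the value function, a fixed linear feedback gain plays the role of the model-based actor, and a toy double-integrator stands in for the robot model. The temporal-difference update uses the two actual states x_k and x_{k+1}, as in the first method. All numerical values (centers, widths, gains, dynamics) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# RBFNN critic: V(x) ~= w . phi(x), a static mapping as in the abstract.
centers = rng.uniform(-1.0, 1.0, size=(25, 2))  # RBF centers (illustrative)
width = 0.5

def phi(x):
    d = centers - x  # distances from state to each center
    return np.exp(-np.sum(d * d, axis=1) / (2.0 * width ** 2))

# Model-based actor: a "network" whose weights are a feedback gain matrix K.
K = np.array([[1.2, 1.6]])  # assumed stabilizing gain (illustrative)

def step(x, u):
    """Toy double-integrator stand-in for the robot dynamics (assumption)."""
    dt = 0.05
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.0], [dt]])
    return A @ x + B @ u

# TD(0) critic update using the two actual states x_k and x_{k+1}.
w = np.zeros(len(centers))          # critic weights
alpha, gamma = 0.1, 0.95            # learning rate, discount factor
x = np.array([1.0, 0.0])            # initial tracking-error state
for k in range(200):
    u = -K @ x                                       # actor: state feedback
    x_next = step(x, u)                              # actual next state
    r = -float(x @ x)                                # penalize tracking error
    delta = r + gamma * w @ phi(x_next) - w @ phi(x) # TD error
    w += alpha * delta * phi(x)                      # critic gradient step
    x = x_next
```

Under the assumed stabilizing gain, the tracking-error state contracts toward the origin while the critic accumulates value estimates along the visited trajectory; the second method in the paper would replace x_next with a state produced by the predictive model rather than the environment.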
| Original language | English |
|---|---|
| Pages (from-to) | 835-845 |
| Number of pages | 11 |
| Journal | Soft Computing |
| Volume | 9 |
| Issue number | 11 |
| DOIs | |
| Publication status | Published - Nov 1 2005 |
| Externally published | Yes |
Keywords
- Actor-critic algorithms
- Nonholonomic mobile robot
- Predictive model
- Temporal difference learning
- Tracking control problem
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- Geometry and Topology