e-journal
A Novel Iterative θ-Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems
This paper is concerned with a new iterative θ-adaptive dynamic programming (ADP) technique for solving optimal control problems of infinite-horizon discrete-time nonlinear systems. The idea is to use an iterative ADP algorithm to obtain the iterative control law that optimizes the iterative performance index function. The present iterative θ-ADP algorithm avoids the initial admissible control required by policy iteration algorithms. It is proved that all the iterative controls obtained in the iterative θ-ADP algorithm stabilize the nonlinear system, which means that the algorithm is feasible for both online and offline implementation. Convergence analysis of the performance index function is presented to guarantee that the iterative performance index function converges to the optimum monotonically. Neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, to facilitate the implementation of the iterative θ-ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the established method.
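The abstract does not give the algorithm's equations, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method or its neural-network implementation. It assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with utility q*x^2 + r*u^2, and it assumes the initial performance index function has the form V_0(x) = theta*x^2 with a large theta. The function name `theta_adp_scalar` and all parameter values are hypothetical. Under these assumptions each iterate stays quadratic, V_i(x) = p_i*x^2, so the iteration reduces to a scalar recursion on p_i.

```python
import math


def theta_adp_scalar(a, b, q, r, theta, iters=50):
    """Value-iteration sketch for x_{k+1} = a*x + b*u with cost q*x^2 + r*u^2.

    Assumption: the initial function is V_0(x) = theta * x^2, so every
    iterate is V_i(x) = p_i * x^2 and only the scalar p_i is tracked.
    """
    p = theta
    history = [p]   # p_0, p_1, ..., p_iters
    gains = []      # feedback gains k_i of the iterative controls u = -k_i * x
    for _ in range(iters):
        # Minimizer of q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u is u = -k*x.
        k = a * b * p / (r + b * b * p)
        gains.append(k)
        # Resulting minimum value defines the next iterate V_{i+1}(x) = p_next * x^2.
        p = q + a * a * p * r / (r + b * b * p)
        history.append(p)
    return history, gains


# Hypothetical example: an open-loop unstable plant (|a| > 1), no admissible
# initial control supplied, only a large theta.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
history, gains = theta_adp_scalar(a, b, q, r, theta=100.0)

# Closed-loop pole a - b*k_i under each iterative control.
closed_loop = [a - b * k for k in gains]

# Fixed point of the recursion for these values: p^2 - 1.44*p - 1 = 0.
p_star = (1.44 + math.sqrt(1.44 ** 2 + 4.0)) / 2.0

print("monotone nonincreasing:", all(y <= x for x, y in zip(history, history[1:])))
print("all iterative controls stabilizing:", all(abs(c) < 1.0 for c in closed_loop))
print("final p:", history[-1], "fixed point:", p_star)
```

In this toy setting the sequence p_i decreases toward the fixed point and every intermediate gain keeps the closed-loop pole inside the unit circle, which mirrors the two properties the abstract claims: monotone convergence and stabilizing iterative controls without an initial admissible control.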
Note to Practitioners—Optimal control of nonlinear systems has been a key focus of the control field for the past several decades. Dynamic programming is a very useful tool for solving optimal control problems. However, due to the "curse of dimensionality," it is often computationally untenable to run dynamic programming to obtain the optimal solution. Among iterative approximation methods, most results require rigorous initial conditions (such as an initial admissible control), which limits their application. In addition, the iterative controls are not guaranteed to stabilize the system, which makes these methods very difficult to apply to real-world problems. Therefore, in this paper, the iterative θ-ADP algorithm is developed to solve the optimal control problem of discrete-time nonlinear systems while overcoming these difficulties. It is shown that each of the iterative controls stabilizes the nonlinear system and that the initial admissible control condition is avoided. This makes the present iterative θ-ADP algorithm convenient to apply to real-world nonlinear systems. Moreover, a detailed implementation of the iterative θ-ADP algorithm using neural networks is also presented.
Index Terms—Adaptive critic designs, adaptive dynamic programming, approximate dynamic programming, neural networks, neuro-dynamic programming, nonlinear systems, optimal control, policy iteration, reinforcement learning, value iteration.