UR robot manipulator path planning via reinforcement learning
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/152896 |
Institution: | Nanyang Technological University |
Summary: | Growing demand for intelligent applications in everyday life has driven new developments in robotics. Combined with neural networks, traditional reinforcement learning algorithms have been improved to handle high-dimensional continuous spaces. As the most basic step of manipulator motion control, path planning has also become more intelligent with the development of reinforcement learning.
First, a path planned in Cartesian space needs to be converted into a path of joint angles, which plays an important role in subsequent motion control. We use the D-H parameter method to model the kinematics of the manipulator, derive its forward and inverse kinematics, and verify their correctness.
Second, we discuss the obstacle avoidance performance and limitations of a traditional method, the artificial potential field method, in a known environment, and use the RRT algorithm to improve it when it falls into a local optimum.
Third, we use the DDPG reinforcement learning algorithm, which is suited to continuous state and action spaces, to plan the trajectory. By training the RL model, simple path planning between two given points of the robot arm is achieved. We analyze the problem of sparse rewards and propose some possible solutions. |
---|
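
The forward-kinematics step mentioned in the summary can be illustrated with a minimal sketch. It assumes the standard (distal) D-H convention, which the record does not specify, and it uses a hypothetical two-link planar arm rather than actual UR parameters. The function names `dh_transform`, `forward_kinematics`, and `tool_position` are illustrative and do not come from the thesis.

```python
import math

def dh_transform(theta, d, a, alpha):
    """4x4 homogeneous transform of one link (standard D-H convention, assumed)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_kinematics(joint_angles, dh_rows):
    """Chain the per-link transforms; dh_rows holds (d, a, alpha) per joint."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, (d, a, alpha) in zip(joint_angles, dh_rows):
        T = matmul(T, dh_transform(theta, d, a, alpha))
    return T

# Hypothetical planar 2R arm with link lengths 1.0 and 0.5 (NOT UR parameters).
PLANAR_2R = [(0.0, 1.0, 0.0), (0.0, 0.5, 0.0)]

def tool_position(joint_angles, dh_rows=PLANAR_2R):
    """Cartesian position of the tool frame origin for the given joint angles."""
    T = forward_kinematics(joint_angles, dh_rows)
    return T[0][3], T[1][3], T[2][3]
```

For a six-joint manipulator, the same chain would take six `(d, a, alpha)` rows from the manufacturer's D-H table. Inverse kinematics, which converts a Cartesian path into joint angles, is not covered by this sketch.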