Sebastian Castro demonstrates how to control humanoid robot locomotion using deep reinforcement learning, specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. The robot is simulated in Simscape Multibody™, and the control policy is trained with Reinforcement Learning Toolbox™.
In this video, Sebastian outlines how to set up, train, and evaluate a reinforcement learning agent with Simulink® models. First, he explains how to choose states, actions, and a reward function for the reinforcement learning problem. Then he describes the neural network structure and the training algorithm parameters. Finally, he shows some training results and discusses the benefits and drawbacks of reinforcement learning.