Reinforcement Learning for a Bipedal Walking Robot

This repository contains a Gazebo-based simulation architecture for applying the reinforcement learning algorithm DDPG to generate bipedal walking patterns for the robot.

Planar bipedal walking robot in the Gazebo environment using Deep Deterministic Policy Gradient (DDPG).

Autonomous walking of the bipedal robot is achieved using a reinforcement learning algorithm called <a href="https://github.com/nav74neet/ddpg_biped#references">Deep Deterministic Policy Gradient (DDPG) [1]</a>. DDPG uses the actor-critic framework to learn control policies in continuous action spaces. The project details and the results of the experiment are documented in the research manuscript <a href="https://arxiv.org/abs/1807.05924v2">Bipedal walking robot using Deep Deterministic Policy Gradient</a>. This project was developed at the <a href="https://sites.google.com/site/compintellab/home">Computational Intelligence Laboratory, IISc, Bangalore</a>.
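As a rough illustration of the actor-critic idea, the sketch below shows two building blocks described in the DDPG paper [1]: the critic's bootstrapped regression target and the soft update of the target networks. It is not taken from this repository's code. The function names are hypothetical, the weights are plain lists of floats rather than TensorFlow variables, and the default values of tau and gamma are only illustrative.

```python
def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the next state, evaluated at the
    action chosen by the target actor. No bootstrapping once the episode ends.
    """
    if done:
        return reward
    return reward + gamma * next_q


def soft_update(target_params, source_params, tau=0.001):
    """Move target weights slowly toward the learned weights:
    target <- tau * source + (1 - tau) * target.
    """
    return [tau * s + (1.0 - tau) * t
            for s, t in zip(source_params, target_params)]
```

A small tau keeps the target networks changing slowly, which is the mechanism the DDPG paper uses to stabilise the critic's learning targets.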

What you need before starting (Dependencies & Packages):

<a href="http://releases.ubuntu.com/16.04/">Ubuntu 16.04</a>
<a href="http://wiki.ros.org/kinetic">ROS Kinetic</a>
<a href="http://gazebosim.org/">Gazebo 7</a>
<a href="https://www.tensorflow.org/">TensorFlow: 1.1.0 [with GPU support]</a>
<a href="https://gym.openai.com/docs/">gym: 0.9.3</a>
Python 2.7
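To confirm that the Python packages listed above are importable before training, a small check like the following may help. This is a minimal sketch rather than part of the repository: the helper name report_versions is hypothetical, and it only reports whatever `__version__` string each module exposes.

```python
import importlib


def report_versions(module_names):
    """Return {name: version string}, with None for modules that cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module defines __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

For example, report_versions(["tensorflow", "gym"]) should return a dictionary whose values can be compared against the versions listed above; a value of None means the package is not installed in the active environment.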

File setup:

walker_gazebo contains the robot model (both the .stl files and the .urdf file) and the Gazebo launch file.
walker_controller contains the DDPG reinforcement learning implementation used to control the bipedal walking robot.

Learning to walk, initial baby steps

Stable bipedal walking

[<a href="https://goo.gl/1hwqJy*">Project video</a>]

Note: Stable bipedal walking was achieved after training the model for over 41 hours on a system with an Nvidia GeForce GTX 1050 Ti GPU. The visualization of the horizontal boom (attached to the waist) is turned off.

Sources:

<ol> <li>Lillicrap, Timothy P., et al. <a href="https://arxiv.org/abs/1509.02971">Continuous control with deep reinforcement learning.</a> arXiv preprint arXiv:1509.02971 (2015).</li> <li>Silver, David, et al. <a href="http://proceedings.mlr.press/v32/silver14.pdf">Deterministic Policy Gradient Algorithms.</a> ICML (2014).</li> </ol>

Project Collaborator(s):

<a href="https://github.com/ioarun">Arun Kumar</a> (arunkumar12@iisc.ac.in) & <a href="http://www.aero.iisc.ernet.in/people/s-n-omkar/">Dr. S N Omkar</a> (omkar@iisc.ac.in)