This project investigates the Sim-to-Real Transfer problem in robotics, focusing on Reinforcement Learning (RL) in the Hopper-v0 environment. The key challenge addressed is the reality gap: policies trained in simulation often degrade in the real world because of discrepancies between simulated and real dynamics.
We explore Proximal Policy Optimization (PPO) as the RL algorithm and employ Domain Randomization (DR) to improve the generalization of policies trained in a simulated environment with altered dynamics. Our results demonstrate the impact of various randomization strategies on transfer performance.
- Proximal Policy Optimization (PPO) for policy training.
- Reality gap modeling by modifying torso mass in simulation.
- Domain Randomization (DR) to enhance robustness:
  - Uniform Domain Randomization (UDR): link masses sampled from a uniform distribution.
  - Gaussian Domain Randomization (GDR): link masses sampled from a Gaussian distribution.
- Comparative analysis of randomization strategies to identify optimal transfer techniques.
- Extended study on Walker2D-v4 to evaluate scalability to more complex robots.
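The two randomization schemes above can be sketched as simple sampling routines. A minimal sketch with NumPy; the nominal masses below are illustrative placeholders, not the actual Hopper parameters:

```python
import numpy as np

def sample_udr(nominal, half_range, rng):
    """Uniform DR: each mass ~ U(nominal - half_range, nominal + half_range)."""
    return rng.uniform(nominal - half_range, nominal + half_range)

def sample_gdr(nominal, std, rng):
    """Gaussian DR: each mass ~ N(nominal, std^2), clipped to stay positive.

    Samples concentrate near the nominal values while still
    exposing the agent to occasional large deviations.
    """
    return np.clip(rng.normal(nominal, std), 1e-3, None)

rng = np.random.default_rng(0)
nominal = np.array([3.9, 2.7, 5.1])  # illustrative thigh/leg/foot masses
udr_masses = sample_udr(nominal, half_range=0.5 * nominal, rng=rng)
gdr_masses = sample_gdr(nominal, std=0.25 * nominal, rng=rng)
```

A fresh mass vector would typically be drawn at every episode reset, so the policy never trains against a single fixed dynamics model.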
This project uses OpenAI Gym environments, in particular Hopper-v0 and Walker2D-v4.
- Environment Creation: Instantiate either the source (modified torso mass) or target (default torso mass) Hopper environment.
- PPO Training: Train policies in the source environment with and without domain randomization.
- Evaluation:
  - Evaluate trained policies in both source and target environments.
  - Compare source-to-source (S2S) and source-to-target (S2T) performance.
- Extended Analysis:
  - Tune individual mass parameters to analyze their effect on transferability.
  - Compare uniform vs. Gaussian domain randomization.
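The S2S/S2T comparison amounts to running the same trained policy in both environments and comparing average returns. A minimal sketch of that evaluation loop, assuming an old-style Gym env (`step` returns a 4-tuple) and a policy given as a plain callable; `pi`, `source_env`, and `target_env` in the usage comment are placeholders:

```python
import numpy as np

def evaluate(policy, env, n_episodes=50):
    """Average undiscounted return of `policy` over `n_episodes`."""
    returns = []
    for _ in range(n_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return float(np.mean(returns))

# S2S: evaluate in the training (source) env; S2T: same policy in the target env.
# transfer_gap = evaluate(pi, source_env) - evaluate(pi, target_env)
```

A small transfer gap (S2S close to S2T) indicates a policy that generalizes across the dynamics mismatch.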
- Baseline PPO training shows a significant performance drop when transferring from the source (lower torso mass) to the target (correct torso mass).
- UDR improves transfer performance, but the level of improvement depends on the randomization range.
- GDR provides better transfer results than UDR, as it samples more frequently near nominal values while still exposing the agent to variations.
- Selective mass tuning enhances transferability, especially for the thigh mass, which plays a critical role in stability.
- Walker2D-v4 tests confirm that domain randomization is beneficial but requires careful tuning for complex robots.
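The "careful tuning" noted above mostly concerns the width of the sampling range. In practice, DR is commonly implemented as a wrapper that resamples the randomized parameters at every reset, with the range width exposed as a single knob. A hedged sketch: the `get_masses()`/`set_masses()` hooks are hypothetical and would need to be adapted to the simulator's real API (for MuJoCo envs, something like `env.sim.model.body_mass`):

```python
import numpy as np

class UDRWrapper:
    """Resamples link masses uniformly around their nominal values at each reset.

    Assumes the wrapped env exposes hypothetical get_masses()/set_masses(m)
    hooks; replace them with the simulator's actual mass accessors.
    """

    def __init__(self, env, half_range_frac=0.5, seed=None):
        self.env = env
        self.nominal = np.asarray(env.get_masses(), dtype=float)
        self.half = half_range_frac * self.nominal  # tuning knob for DR width
        self.rng = np.random.default_rng(seed)

    def reset(self, **kwargs):
        masses = self.rng.uniform(self.nominal - self.half,
                                  self.nominal + self.half)
        self.env.set_masses(masses)
        return self.env.reset(**kwargs)

    def step(self, action):
        return self.env.step(action)
```

Sweeping `half_range_frac` (e.g. 0.1 to 1.0) is one way to study how the randomization range trades off S2S performance against S2T robustness.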
- Implement adaptive domain randomization that dynamically adjusts sampling ranges.
- Explore correlated randomization for physically interdependent parameters.
- Extend experiments to real-world robots for validation beyond simulation.
This project was developed for a robot learning course at Politecnico di Torino. Special thanks to instructors and peers for their support.