Actor-Sharer-Learner (ASL): An Efficient Training Framework for Off-policy Deep Reinforcement Learning


Introduction

The Actor-Sharer-Learner (ASL) is a highly efficient training framework for off-policy DRL algorithms that simultaneously enhances sample efficiency, shortens training time, and improves final performance. Specifically, the ASL framework employs a Vectorized Data Collection (VDC) mode to speed up data acquisition, decouples data collection from model optimization via multithreading, and partially couples the two procedures through a Time Feedback Mechanism (TFM) to avoid data underuse or overuse.
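
To make the idea of partial coupling concrete, here is a minimal sketch of how two threads could pace each other through time feedback. It is not the repository's implementation: `feedback_sleep`, `SharedClock`, `worker`, and `run_demo` are hypothetical names, the `time.sleep(work_time)` calls stand in for environment steps and gradient updates, and the rule "the faster side sleeps for the difference in loop time" is only one plausible reading of the TFM.

```python
import threading
import time


def feedback_sleep(own_period: float, other_period: float) -> float:
    """Seconds the caller should pause so its loop period matches the slower side."""
    return max(0.0, other_period - own_period)


class SharedClock:
    """Thread-safe holder of each side's latest loop period (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._period = {"actor": 0.0, "learner": 0.0}

    def report(self, who, period):
        with self._lock:
            self._period[who] = period

    def get(self, who):
        with self._lock:
            return self._period[who]


def worker(name, other, work_time, clock, stop, counts):
    while not stop.is_set():
        start = time.perf_counter()
        time.sleep(work_time)  # stand-in for env steps (actor) or an update (learner)
        period = time.perf_counter() - start
        clock.report(name, period)
        # Time feedback: the faster side waits, so data is neither underused nor overused.
        time.sleep(feedback_sleep(period, clock.get(other)))
        counts[name] += 1


def run_demo(duration=0.3):
    clock, stop = SharedClock(), threading.Event()
    counts = {"actor": 0, "learner": 0}
    threads = [
        threading.Thread(target=worker, args=("actor", "learner", 0.002, clock, stop, counts)),
        threading.Thread(target=worker, args=("learner", "actor", 0.010, clock, stop, counts)),
    ]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print(run_demo())
```

In this toy setup the actor's raw loop is about five times faster than the learner's, yet both finish a similar number of iterations because the actor sleeps for the measured difference. The two loops still run in separate threads, which is the sense in which they are decoupled but partially connected.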

Dependencies

envpool >= 0.6.6  (https://envpool.readthedocs.io/en/latest/)
torch >= 1.13.0  (https://pytorch.org/)
numpy >= 1.23.4  (https://numpy.org/)
tensorboard >= 2.11.0  (https://pytorch.org/docs/stable/tensorboard.html)
python >= 3.8.0 
ubuntu >= 18.04.1 
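
As a convenience, the minimum versions above can be checked with a short script. This is a sketch rather than part of the repository: `REQUIRED`, `as_tuple`, and `check` are hypothetical names, and the version comparison only looks at the first three numeric components, which is an assumption that holds for typical release strings such as `1.13.0+cu117`.

```python
import re
from importlib import metadata

# Minimum versions copied from the dependency list above.
REQUIRED = {
    "envpool": "0.6.6",
    "torch": "1.13.0",
    "numpy": "1.23.4",
    "tensorboard": "2.11.0",
}


def as_tuple(version):
    """Return the leading numeric components, dropping any local suffix after '+'."""
    return tuple(int(p) for p in re.findall(r"\d+", version.split("+")[0])[:3])


def check(required=REQUIRED):
    """Map each package name to 'ok', 'missing', or 'too old (<installed version>)'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if as_tuple(installed) >= as_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check().items():
        print(f"{name}: {status}")
```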

Quick Start

After installation, you can use the ASL framework to train an Atari agent via:

python main.py

where the default environment is Alien and the underlying DRL algorithm is DDQN. For more details on the experiment setup, see main.py. The training curves of 57 Atari games are listed as follows.

Citing the Project

To cite this repository in publications:

@article{Color2025XJH,
title = {Train a real-world local path planner in one hour via partially decoupled reinforcement learning and vectorized diversity},
journal = {Engineering Applications of Artificial Intelligence},
volume = {141},
pages = {109726},
year = {2025},
issn = {0952-1976},
doi = {https://doi.org/10.1016/j.engappai.2024.109726},
}

Maintenance History

  • 2023/6/20
    • sample_core() in Sharer.py is optimized:
      • entries of ind equal to self.ptr-1 are now removed in a more idiomatic PyTorch way
      • for Sharer.shared_data_cuda(), ind and env_ind are generated directly on self.B_dvc to run faster
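
The two changes above can be illustrated with a small sketch. It does not reproduce the code in Sharer.py: `sample_indices` and its parameters are hypothetical, the wrap-around `(ptr - 1) % size` is an assumption about a circular buffer, and boolean masking is one common way to drop a value from an index tensor in PyTorch.

```python
import torch


def sample_indices(size, ptr, batch_size, num_envs, device="cpu"):
    """Sample (time index, env index) pairs while skipping the newest slot."""
    # Create the indices directly on the target device to avoid a host-to-device copy.
    ind = torch.randint(0, size, (batch_size,), device=device)
    # Assumed circular buffer: the slot written last is ptr - 1, wrapping at 0.
    last = (ptr - 1) % size
    # Boolean mask keeps every index except the newest slot.
    ind = ind[ind != last]
    # One environment index per surviving time index, on the same device.
    env_ind = torch.randint(0, num_envs, (ind.shape[0],), device=device)
    return ind, env_ind


if __name__ == "__main__":
    ind, env_ind = sample_indices(size=100, ptr=10, batch_size=256, num_envs=8)
    print(ind.shape, env_ind.shape)
```

Because the mask can remove a few entries, the returned batch may be slightly smaller than `batch_size`.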