
A problem in PPOAgent #6

Open
sailxjx opened this issue Apr 28, 2021 · 0 comments

sailxjx commented Apr 28, 2021

Hi, Rokas:

First of all, thanks for your great tutorial on reinforcement learning. I went through the whole series and learned a lot.

In the PPOAgent, I think there may be something wrong with this line. When I vstack discounted_r (shape (n, 1)) and then subtract the predicted values (shape (n,)), NumPy broadcasting makes the advantages shape (n, n). So I think we should not vstack discounted_r itself, but vstack the difference instead: advantages = np.vstack(discounted_r - values). The advantages then have shape (n, 1), which is the expected result.
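
Here is a minimal NumPy sketch of the shape problem. The arrays are made-up stand-ins (not the agent's real rewards or critic outputs), and n = 5 is arbitrary; only the shapes matter.

```python
import numpy as np

n = 5  # arbitrary batch length, chosen for illustration only

# Stand-ins for the real data: both start out as 1-D arrays of shape (n,)
discounted_r = np.arange(n, dtype=np.float32)  # hypothetical discounted rewards
values = np.ones(n, dtype=np.float32)          # hypothetical critic predictions

# vstack first, subtract second: (n, 1) - (n,) broadcasts to (n, n)
broadcast_adv = np.vstack(discounted_r) - values

# subtract first, vstack second: (n,) - (n,) stays (n,), then becomes (n, 1)
column_adv = np.vstack(discounted_r - values)

print(broadcast_adv.shape)
print(column_adv.shape)
```

In the broadcast version, entry [i, j] is discounted_r[i] - values[j], so only the diagonal holds the per-step advantages I actually want.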

Thanks.
