Shouldn't manager_vf be function of x_t? #2

Open
imbalu007 opened this issue Sep 11, 2017 · 8 comments

Comments

@imbalu007

Right after eq. (7) in the paper, the authors describe V_t as a function of x_t. However, in the code it is a function of g_hat (feudal_policy.py -> _build_manager()):
self.manager_vf = self._build_value(g_hat)
Shouldn't it be a function of x_t?
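
To make the question concrete, here is a minimal sketch of the two alternatives. It is not the repository's code: `build_value` is a hypothetical linear value head, and the vectors and weights are made-up stand-ins for x_t and g_hat. It only illustrates that the choice being asked about is which tensor is fed to the value head.

```python
def build_value(features, weights, bias=0.0):
    """Hypothetical linear value head: V = w . features + b."""
    return sum(w * f for w, f in zip(weights, features)) + bias

# Made-up stand-ins; the real tensors come from the network.
x_t = [0.5, -1.0, 2.0]    # stand-in for x_t
g_hat = [0.0, 0.6, 0.8]   # stand-in for the goal tensor g_hat
weights = [0.1, 0.2, 0.3]

# What the issue reports the code does: value head on g_hat.
vf_from_goal = build_value(g_hat, weights)

# What the issue reads the paper as specifying: value head on x_t.
vf_from_state = build_value(x_t, weights)

print(vf_from_goal, vf_from_state)
```

The same head gives different estimates depending on its input, so the two variants are not interchangeable during training.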

@davidhershey
Owner

davidhershey commented Sep 11, 2017 via email

@shanlior

Hi,
I didn't understand your answer. Are you still trying to implement it, or have you abandoned this repository?

Thanks

@davidhershey
Owner

davidhershey commented Sep 13, 2017 via email

@shanlior

shanlior commented Sep 13, 2017

Thanks for the quick and detailed response.

If I can offer any help: I haven't seen an implementation of the "dilated LSTM" in your code (or maybe I missed it?).
I think it's a core idea in this paper for two reasons:

  1. If the model had worked without it, I don't believe it would have been in the paper.
  2. Without it, there is no mechanism that controls the time interval c, so the goal objective is ill-defined. I believe it is supposed to act as some kind of finite-state machine.
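
For reference, a minimal sketch of the mechanism under discussion. It rests on assumptions about how the paper's dilated LSTM works: the state is split into r groups, only group t % r is updated at step t, and the output is pooled over the last c steps. The function names, the sum pooling, and `toy_cell` (a running sum standing in for a real LSTM cell) are all hypothetical and only show the routing.

```python
def toy_cell(x, h):
    """Stand-in for an LSTM cell (NOT a real LSTM): a running sum."""
    h_new = h + x
    return h_new, h_new  # (new state, output)

def dilated_step(cell, states, x, t):
    """Update only group t % r; every other group keeps its state."""
    k = t % len(states)
    new_state, out = cell(x, states[k])
    states = list(states)  # copy so the caller's list is not mutated
    states[k] = new_state
    return states, out

def run_dilated(inputs, r, c):
    """Run the dilated recurrence and sum-pool the last c outputs."""
    states = [0.0] * r
    outputs, pooled = [], []
    for t, x in enumerate(inputs):
        states, out = dilated_step(toy_cell, states, x, t)
        outputs.append(out)
        pooled.append(sum(outputs[-c:]))
    return states, pooled

states, pooled = run_dilated([1, 2, 3, 4, 5, 6], r=3, c=3)
print(states)  # each group only saw every 3rd input
print(pooled)
```

Under these assumptions each group effectively runs at 1/r of the input rate, which is what ties the recurrence to a fixed horizon.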

@davidhershey
Owner

davidhershey commented Sep 13, 2017 via email

@shanlior

Oh, I didn't check the other branches. Anyhow, that's truly frustrating, because this model sounds really tempting (and I guess you put a lot of work into it).
Good luck

@KadeG

KadeG commented Feb 26, 2018

@dmakian You mentioned a formal post about convergence problems and the idea that feudal networks may not converge as described in the paper. Any update there?

@davidhershey
Owner

Sorry this has been off my mind for a while. I doubt a post is coming.

In short: I believe that feudal networks can work; I just think that the implementation is fragile. If DeepMind released their code I'm sure it would function, but as with a lot of deep RL, small differences in code can lead to wildly different performance.

4 participants