Shouldn't manager_vf be a function of x_t? #2

imbalu007 · 2017-09-11T06:17:15Z

Right after eq. (7) in the paper (https://arxiv.org/pdf/1703.01161.pdf), the authors define V_t as a function of x_t. However, in the code it is a function of g_hat (feudal_policy.py, in _build_manager()):
self.manager_vf = self._build_value(g_hat)
Shouldn't it be a function of x_t?


davidhershey · 2017-09-11T10:51:07Z

That's correct, we caught that late in this development cycle. We still couldn't get anything to converge with that value function, but you're correct that it should be built from the visual input.
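For readers following along, here is a minimal NumPy sketch of the wiring difference being discussed. It is not the repository's code: `linear`, `build_value`, the layer sizes, and the single-layer "goal" projection are all hypothetical stand-ins (in the paper the goal comes out of the manager's recurrent network). The only point is which tensor feeds the value head.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Hypothetical helper: random weights and zero bias for one dense layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def build_value(features, params):
    """Scalar value head: a single linear layer (a stand-in for _build_value)."""
    w, b = params
    return (features @ w + b).squeeze(-1)

# Assumed sizes, chosen only for illustration.
batch, obs_dim, z_dim, g_dim = 4, 16, 8, 8
x_t = rng.normal(size=(batch, obs_dim))      # observation

# Perception: z_t is computed from the observation x_t.
w_p, b_p = linear(obs_dim, z_dim)
z_t = np.maximum(x_t @ w_p + b_p, 0.0)       # ReLU embedding of x_t

# Simplified goal head (the real manager uses a recurrent network here).
w_g, b_g = linear(z_dim, g_dim)
g_hat = z_t @ w_g + b_g

# What the issue describes: value head fed by the goal output.
vf_from_goal = build_value(g_hat, linear(g_dim, 1))

# What the paper's V_t(x_t) suggests: value head fed by features of x_t.
manager_vf = build_value(z_t, linear(z_dim, 1))
```

Both heads return one scalar per batch element; they differ only in their input, which is the change the issue asks about.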


shanlior · 2017-09-13T13:25:02Z

Hi,
I didn't understand your answer. Are you still trying to implement it, or have you abandoned this repository?

Thanks

davidhershey · 2017-09-13T13:51:07Z

I'll make a formal post about it as I haven't looked at this in a bit (been busy elsewhere), but as of now I've talked with some researchers and we have reason to believe that FeUdal networks will not converge as described in the original paper. In the coming weeks I'll try to consolidate the code-base and clean it up as well as possible in case (1) new details are published on how to actually train these networks or (2) someone else can figure out a magic bullet.


shanlior · 2017-09-13T18:28:16Z

Thanks for the quick and detailed response.

If I can offer any help: I haven't seen an implementation of the "dilated LSTM" in your code (or maybe I missed it?).
I think it's a core idea in this paper for two reasons:

1. If the model had worked without it, I don't believe it would have been in the paper.
2. Without it, there is no mechanism that controls the time interval c, so the goal objective is ill-defined. I believe it is supposed to act as some kind of finite-state machine.
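To make the dilation mechanism concrete, here is a rough NumPy sketch of a dilated LSTM as the paper describes it: the state is split into r cores, only core t % r is updated at step t, and the output pools over the cores. This is an illustration under assumptions, not the code in this repository or any branch of it; the class name, the gate layout, and the choice of summation for pooling are mine.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class DilatedLSTM:
    """Hypothetical sketch: r independent LSTM sub-states sharing one set of weights."""

    def __init__(self, in_dim, hid_dim, r, seed=0):
        rng = np.random.default_rng(seed)
        self.r = r
        # One weight matrix producing all four gates (input, forget, output, candidate).
        self.W = rng.normal(scale=0.1, size=(in_dim + hid_dim, 4 * hid_dim))
        self.b = np.zeros(4 * hid_dim)
        self.h = np.zeros((r, hid_dim))   # hidden state of each core
        self.c = np.zeros((r, hid_dim))   # cell state of each core
        self.t = 0

    def step(self, s_t):
        k = self.t % self.r               # only this core is touched at step t
        gates = np.concatenate([s_t, self.h[k]]) @ self.W + self.b
        i, f, o, g = np.split(gates, 4)
        self.c[k] = sigmoid(f) * self.c[k] + sigmoid(i) * np.tanh(g)
        self.h[k] = sigmoid(o) * np.tanh(self.c[k])
        self.t += 1
        # Pool over the most recent output of every core (summation is an assumption).
        return self.h.sum(axis=0)

cell = DilatedLSTM(in_dim=5, hid_dim=4, r=3)
out = cell.step(np.ones(5))
```

Each core sees an input only every r steps, so its memory spans r times as many environment steps as a plain LSTM with the same state size, while the pooled output still changes at every step.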

davidhershey · 2017-09-13T20:03:57Z

I do have a dilated LSTM implemented; it's in a branch (see dlstm_fix). Hence the need for consolidating! You can find it in models/models.py.


shanlior · 2017-09-14T07:34:03Z

Ohh. I didn't check other branches. Anyhow, that's truly frustrating because this model sounds really tempting (and I guess you put a lot of work into it).
Good luck

KadeG · 2018-02-26T15:47:08Z

@dmakian You mentioned a formal post about convergence problems and the idea feudal networks may not converge as described in the paper. Any update there?

davidhershey · 2018-02-26T15:56:09Z

Sorry this has been off my mind for a while. I doubt a post is coming.

In short: I believe that feudal networks can work, I just think that the implementation is fragile. If DeepMind released their code I'm sure it would function, but as with a lot of deep RL, small differences in code can lead to wildly different performance.
