About choose_action #3

Open
xlyue92 opened this issue Dec 23, 2020 · 5 comments

Comments

@xlyue92

xlyue92 commented Dec 23, 2020

Hello, I have a question. Under `for i in range(episodes):`, i.e. at line 52, the model has not started training yet, so how can choose_action run predict on it?

@AKIRAsamadesu

I ran it for more than 300 episodes and there is still no sign of convergence. Could something be wrong with the memory?

@weslythisway

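> Hello, I have a question. Under `for i in range(episodes):`, i.e. at line 52, the model has not started training yet, so how can choose_action run predict on it?

Before any training, the model by default gives every action roughly the same probability, so the actions at the start are effectively random.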
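
To make this concrete, here is a minimal sketch of an untrained softmax policy. It is not the repository's code: `untrained_policy`, the single linear layer, the weight range, and the example state are all assumptions chosen for illustration. With small random initial weights the logits are close to zero, so the softmax output is close to uniform and sampling from it behaves like a random choice.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def untrained_policy(state, weights):
    """Hypothetical one-layer policy: logits[a] = dot(weights[a], state)."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def choose_action(state, weights, rng):
    """Sample an action index according to the predicted probabilities."""
    probs = untrained_policy(state, weights)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
n_actions, state_dim = 2, 4

# Assumption: small random initial weights, similar in spirit to common initializers.
weights = [[rng.uniform(-0.05, 0.05) for _ in range(state_dim)]
           for _ in range(n_actions)]

state = [0.02, -0.01, 0.03, 0.04]  # made-up observation
probs = untrained_policy(state, weights)
actions = [choose_action(state, weights, rng) for _ in range(1000)]

print("probabilities:", probs)
print("times action 0 was chosen:", actions.count(0))
```

Calling predict on an untrained model is therefore valid; it simply returns an uninformative distribution until training shifts the weights.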

@weslythisway

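> I ran it for more than 300 episodes and there is still no sign of convergence. Could something be wrong with the memory?

It comes down to luck. With a bad starting point the training ends up in a local minimum; rerun the code a few more times and you have a chance of seeing it converge.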
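
The effect of the starting point can be illustrated with a toy example. The sketch below is an assumption-laden stand-in, not the project's training loop: it runs plain gradient descent on a hand-picked one-dimensional function, `loss(x) = x**4 - 3*x**2 + x`, which has a shallow local minimum and a deeper global one. Which minimum a run reaches depends only on where it starts, so restarting from different points and keeping the best result can escape the poor one.

```python
def loss(x):
    """Toy non-convex loss with two minima (illustrative only)."""
    return x**4 - 3 * x**2 + x


def grad(x):
    """Derivative of the toy loss."""
    return 4 * x**3 - 6 * x + 1


def descend(x, lr=0.01, steps=500):
    """Plain gradient descent from starting point x."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


# Hypothetical starting points standing in for different random initializations.
starts = [1.5, 0.5, -1.5]
finals = [descend(x0) for x0 in starts]

# Keep the restart that reached the lowest loss.
best = min(finals, key=loss)

for x0, xf in zip(starts, finals):
    print(f"start {x0:+.1f} -> end {xf:+.4f}, loss {loss(xf):+.4f}")
print("best end point:", best)
```

A real reinforcement learning run adds sampling noise on top of this, so the analogy is loose, but it shows why several reruns can give different outcomes.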

@weslythisway

May I ask why your train step only trains the model on the current episode and does not train on past episodes?

@weslythisway

Isn't the bigger problem that, when the model trains, the code only uses the data from a single game and does not include the data from previously played games?
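
For reference, one common way to train on data from earlier games is a replay memory that persists across episodes. The sketch below is only an assumption about what that could look like; `ReplayMemory`, its methods, and the dummy transitions are hypothetical names and data, not taken from the repository.

```python
import random
from collections import deque


class ReplayMemory:
    """Hypothetical buffer that keeps transitions from many episodes."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a random batch drawn from all stored episodes."""
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=1000, seed=0)

# Dummy data: 3 episodes of 5 steps each; states are (episode, step) tuples.
for episode in range(3):
    for step in range(5):
        memory.store((episode, step), step % 2, 1.0,
                     (episode, step + 1), step == 4)
    # The memory is NOT cleared here, so later batches mix old and new episodes.

batch = memory.sample(8)
episodes_in_batch = {s[0] for s, *_ in batch}
print("transitions stored:", len(memory))
print("episodes represented in one batch:", sorted(episodes_in_batch))
```

Whether this is appropriate depends on the algorithm. Off-policy methods such as DQN routinely reuse old transitions, while on-policy policy-gradient methods usually train only on freshly collected episodes, which may be why the code trains on one episode at a time.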

3 participants