-
It is really just a hack to avoid the situation where the car starts driving in the wrong direction in TrackMania. Without it, since we don't provide any signal telling the car that it is going in the wrong direction, the model wouldn't understand why its reward is suddenly bad, especially in the "lidar" environment.
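
To make the idea concrete, here is a minimal sketch of this kind of early termination. It is not the project's actual code: the `ProgressTracker` class, the `track_index` argument, and the `max_stalled_steps` threshold are all hypothetical names chosen for illustration. It assumes the car's position can be mapped to an index along the track, rewards new progress, and ends the episode once the car has gone too many consecutive steps without advancing.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ProgressTracker:
    """Hypothetical helper: ends an episode when the car stops advancing."""

    max_stalled_steps: int = 10  # assumed threshold, not taken from the project
    best_index: int = 0          # furthest track index reached so far
    stalled_steps: int = 0       # consecutive steps without new progress

    def step(self, track_index: int) -> Tuple[float, bool]:
        """Return (reward, terminated) for the car's current track index."""
        gained = track_index - self.best_index
        if gained > 0:
            # New progress: reward it and reset the stall counter.
            self.best_index = track_index
            self.stalled_steps = 0
            return float(gained), False
        # Stalled or reversing: zero reward, terminate after too many such steps.
        self.stalled_steps += 1
        return 0.0, self.stalled_steps >= self.max_stalled_steps


tracker = ProgressTracker(max_stalled_steps=3)
results = [tracker.step(i) for i in [1, 2, 2, 1, 0]]
```

In this sketch the car advances twice, then stalls and reverses, so the last call is the one that reports termination. The observation never tells the agent which way it is facing, which is why cutting the episode short is used here in place of an explicit wrong-direction signal.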
-
I just understood the meaning of this:
Question: Wouldn't bad experience still be useful experience?
Is there a good reason to terminate early?
If the reward were 0 (or negative, as in my case), wouldn't the NN still be able to learn something useful from it?