>>> utunga
[June 13, 2018, 1:01am]
Hey y'all,
Any help understanding this would be much appreciated.
There is a TensorFlow variable called 'loss' which is already defined in
the train() method (of DeepSpeech.py). Not surprisingly, really, since it
is what is passed to the gradient optimizer.
I added it to my TensorBoard so I could see its progress while training
DeepSpeech on a new language.
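(For context, this is roughly the pattern I mean by "adding it to TensorBoard" - a `tf.summary.scalar` on the loss plus a `FileWriter` - sketched here against a toy stand-in loss, not the actual DeepSpeech.py code; the 'logdir' path and the fake linear-regression data are just placeholders.)

```python
# Minimal TF 1.x sketch of logging a per-batch scalar to TensorBoard.
# Illustrative only: the toy linear-regression loss, the 'logdir' path and the
# random batches are placeholders standing in for the real DeepSpeech graph.
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(tf.zeros([1, 1]))
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))  # stand-in for the CTC loss
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

tf.summary.scalar('loss', loss)          # record the raw per-batch loss
merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter('logdir', sess.graph)
    for step in range(200):              # one point per batch in TensorBoard
        xb = np.random.rand(16, 1).astype(np.float32)
        summary, _ = sess.run([merged, train_op],
                              feed_dict={x: xb, y: 3.0 * xb})
        writer.add_summary(summary, step)
```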
Over the first two-and-a-bit epochs of training, the loss curve looks
like this:
[image: TensorBoard plot of the per-batch loss over the first couple of epochs]
As you can see, it appears to consistently go up during the epoch, not
down as I would've expected.
Over many epochs it does what you would want it to do - track down:
[image: TensorBoard plot of the loss trending down over many epochs]
But I'm wondering if anyone can help explain why the loss goes up from
batch to batch.
It isn't just a question of needing to divide the loss by the batch
count to get the 'average' loss per batch, because you can see from the
numbers involved: it starts at around 100 and climbs to only about 300
over many batches (I think ~50 batches per epoch in this case), so it
can't simply be a running sum of the loss over all batches.
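To spell out that arithmetic with made-up round numbers (the ~50 batches and per-batch loss of ~100 are just my rough guesses, not measured values):

```python
# Rough sanity check with made-up round numbers: ~50 batches per epoch,
# per-batch loss around 100. A running *sum* would end the epoch near 5000,
# nowhere near the ~300 I actually see; a plain running mean would stay ~100.
per_batch = [100.0] * 50
print(sum(per_batch))                   # ~5000 if it were a running sum
print(sum(per_batch) / len(per_batch))  # ~100 if it were a running mean
```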
I assume I am missing something super obvious here but would love to
know what the story is from anyone who does know.
Thanks!
[This is an archived TTS discussion thread from discourse.mozilla.org/t/loss-function-appears-to-slowly-climb-over-batches-during-epoch-reset-every-epoch]