This repository has been archived by the owner on Oct 18, 2021. It is now read-only.
After the discussion in #52, we ended up needing on-the-fly model selection based on evaluation metrics such as BLEU, which lets us save disk space and merge the evaluation process into the training script.
I'm looking into non-blocking computation of evaluation scores using Timer, where an evaluation script is executed regularly at a fixed time interval. Here is the sample output using a toy script.
starting...
At epoch 0, 0/10...
At epoch 0, 1/10...
At epoch 0, 2/10...
At epoch 0, 3/10...
At epoch 0, 4/10...
At epoch 0, 5/10...
At epoch 0, 6/10...
At epoch 0, 7/10...
Do validation !!
At epoch 0, 8/10...
At epoch 0, 9/10...
At epoch 1, 0/10...
At epoch 1, 1/10...
At epoch 1, 2/10...
At epoch 1, 3/10...
Validation is over
At epoch 1, 4/10...
At epoch 1, 5/10...
Do validation !!
At epoch 1, 6/10...
At epoch 1, 7/10...
At epoch 1, 8/10...
At epoch 1, 9/10...
At epoch 2, 0/10...
At epoch 2, 1/10...
Validation is over
At epoch 2, 2/10...
At epoch 2, 3/10...
Do validation !!
At epoch 2, 4/10...
At epoch 2, 5/10...
At epoch 2, 6/10...
At epoch 2, 7/10...
At epoch 2, 8/10...
At epoch 2, 9/10...
Validation is over
At epoch 3, 0/10...
At epoch 3, 1/10...
Do validation !!
At epoch 3, 2/10...
At epoch 3, 3/10...
At epoch 3, 4/10...
At epoch 3, 5/10...
At epoch 3, 6/10...
At epoch 3, 7/10...
Validation is over
At epoch 3, 8/10...
At epoch 3, 9/10...
Do validation !!
At epoch 4, 0/10...
At epoch 4, 1/10...
At epoch 4, 2/10...
^CTraceback (most recent call last):
File "timer_sample.py", line 52, in
train(max_epoch, rt)
File "timer_sample.py", line 43, in train
time.sleep(5) # your long-running job goes here...
KeyboardInterrupt
Validation is over
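The non-blocking setup above can be sketched with `threading.Timer`, where the validation job re-arms itself at a fixed interval while training runs in the main thread. This is only a minimal sketch; `timer_sample.py` itself isn't shown here, so the function names, interval, and step counts are all assumptions.

```python
import threading
import time

# Hypothetical interval between validation runs (assumption).
VALIDATION_INTERVAL = 2.0  # seconds

def validate():
    """Placeholder for the evaluation script (e.g. BLEU computation)."""
    print("Do validation !!")
    time.sleep(0.5)  # stand-in for the actual decoding/scoring work
    print("Validation is over")

def schedule_validation(stop_event):
    """Run validation, then re-arm the timer so it recurs periodically."""
    if stop_event.is_set():
        return
    validate()
    t = threading.Timer(VALIDATION_INTERVAL, schedule_validation,
                        args=(stop_event,))
    t.daemon = True  # don't keep the process alive after training ends
    t.start()

def train(max_epoch, steps_per_epoch=10):
    stop_event = threading.Event()
    t = threading.Timer(VALIDATION_INTERVAL, schedule_validation,
                        args=(stop_event,))
    t.daemon = True
    t.start()
    for epoch in range(max_epoch):
        for step in range(steps_per_epoch):
            print("At epoch %d, %d/%d..." % (epoch, step, steps_per_epoch))
            time.sleep(0.1)  # your long-running job goes here...
    stop_event.set()

if __name__ == "__main__":
    print("starting...")
    train(max_epoch=2)
```

Because the validation runs on a timer thread, its prints interleave with the training loop's output, which matches the log above.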
Also, I'm going to implement unknown-word replacement using alignment weights during translation.
Let me know if you have a better idea or an easier way to achieve these things.
I ran some experiments to check if computing BLEU on the fly (#61) and naive unknown word replacement (#60) work fine or not.
It seems like the BLEU score calculation works as intended though there's a possibility of bugs.
The unknown word replacement seems to be fine as well.
If we translate the following German sentence from newstest2013 (our validation set) into English,
Es ist auch ein Risikofaktor für mehrere andere Krebsarten .
whose reference translation is
It is also a risk factor for a number of others .
the output of the model is
It is also a Risikofaktor of several other Krebsarten .
You may see that the translated sentence contains two German words, i.e., Risikofaktor and Krebsarten, which were copied directly from the source sentence using alignment scores.
Due to the limited vocabulary size, both German words are out of vocabulary, meaning that the input sentence is fed into the model as follows.
Es ist auch ein &lt;unk&gt; für mehrere andere &lt;unk&gt; .
Note that the original output from the model before UNK replacement would be the following.
It is also a &lt;unk&gt; of several other &lt;unk&gt; .
As you can see, this sort of naive UNK replacement may work pretty well when the UNK tokens are proper nouns with the same surface form in both source and target languages.
However, that's not always the case.
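The replacement step described above can be sketched as follows. This assumes the decoder exposes per-target-position attention weights over source positions; the function name and data layout are hypothetical, not the actual implementation in this repository.

```python
UNK = "<unk>"

def replace_unks(source_words, target_words, attention):
    """Replace each generated <unk> by the source word with the highest
    attention (alignment) weight at that decoding step.

    attention[t][s] is the weight on source position s when emitting
    target position t.
    """
    output = []
    for t, word in enumerate(target_words):
        if word == UNK:
            # Copy the most strongly aligned *original* source word verbatim.
            s = max(range(len(source_words)), key=lambda i: attention[t][i])
            output.append(source_words[s])
        else:
            output.append(word)
    return output

# The original (pre-<unk>) source sentence from the example above.
source = "Es ist auch ein Risikofaktor für mehrere andere Krebsarten .".split()
target = "It is also a <unk> of several other <unk> .".split()
# Toy alignment for illustration: target position t attends to source position t.
attn = [[1.0 if s == t else 0.0 for s in range(len(source))]
        for t in range(len(target))]
print(" ".join(replace_unks(source, target, attn)))
# → It is also a Risikofaktor of several other Krebsarten .
```

Note that the attention is computed on the `<unk>`-substituted input, but the replacement copies from the original source words, which is what makes the copied proper nouns come out intact.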