This paper is the only one I'm aware of that has looked in depth at the differences between NMT and SMT. It reaches some interesting conclusions:
> [...] we found that the majority of the gains [after reranking] were related to improvements in the accuracy of transfer of correct grammatical structure to the target sentence [but] neural MT reranking had an overall negative effect on choice of terminology [...] the neural MT model tended to prefer more common words, mistaking "radiant heat" as "radiation heat" or "slipring" as "ring." While these tendencies will be affected by many factors such as the size of the vocabulary or the number and size of hidden layers of the net, we feel it is safe to say that neural MT reranking can be expected to have a large positive effect on syntactic correctness of output, while results for lexical choice are less conclusive.
These results were for reranking of SMT sentences using an NMT model, so it's hard to say whether these problems are more/less present when doing direct translation (my guess would be more).
A simple way of trying to address this problem could be to weight the target words using something like TF-IDF (which is basically TF if we take each sentence to be a document). This would force the model to pay more attention to rare words, rather than getting away with optimising for common ones.
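A minimal sketch of what this might look like, assuming a PyTorch-style training loop; the names `idf_weights` and `weighted_nll`, and the tensor shapes, are my own illustration rather than anything from the paper:

```python
import math
from collections import Counter

import torch
import torch.nn.functional as F

def idf_weights(corpus):
    """Per-word IDF over the target-side sentences, each treated as a
    'document'. TF within a single sentence is almost always 1, so the
    IDF term dominates: rare words get large weights, common words small."""
    doc_freq = Counter()
    for sentence in corpus:
        doc_freq.update(set(sentence.split()))
    n = len(corpus)
    return {w: math.log(n / df) for w, df in doc_freq.items()}

def weighted_nll(logits, targets, weight_table):
    """Per-token cross-entropy rescaled by the weight of each target word.
    logits: [batch, seq, vocab]; targets: [batch, seq];
    weight_table: [vocab] tensor mapping word id -> IDF weight."""
    flat_logits = logits.view(-1, logits.size(-1))
    flat_targets = targets.view(-1)
    loss = F.cross_entropy(flat_logits, flat_targets, reduction='none')
    weights = weight_table[flat_targets]
    # Normalise by the total weight so the loss scale stays comparable
    # to the unweighted objective.
    return (loss * weights).sum() / weights.sum()
```

Normalising by the summed weights keeps the effective learning rate roughly stable, so the only change is how the gradient budget is distributed across rare vs. common words.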
Evaluating whether this helps could be tricky; standard perplexity and BLEU scores won't easily show an improvement in these rarer failure cases, so it could be worthwhile to also evaluate with e.g. NIST, which weights n-gram matches by their informativeness.
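NLTK ships a NIST implementation that makes this cheap to try; the snippet below is just a toy illustration (the sentences and the n-gram order are made up, reusing the "radiant heat" error from the quote above):

```python
from nltk.translate.nist_score import sentence_nist

# A correct lexical choice vs. the "more common word" failure mode.
reference = "the panel emits radiant heat".split()
hypothesis = "the panel emits radiation heat".split()

# NIST rewards matches on informative (i.e. rare) n-grams, so getting
# "radiant" wrong costs more here than it would under BLEU.
print(sentence_nist([reference], hypothesis, n=2))
```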