Hi there! I hope all is well. I noticed in the code that the embeddings seem to be initialized from a centered normal distribution (I originally thought torch.empty was being used), which naturally produces different results on each call, both in magnitude and orientation. We're seeing that embeddings trained in two separate runs (with the data held fixed) differ noticeably, probably due to a difference in rotation/translation, and we're wondering whether the initialization might be the cause. I've put a small alignment check below that we've been using to test that hypothesis.
Could it also be caused by the negative sampling not producing the same negatives? I did not see a generator/random seed in the negative sampling function. Any thoughts are appreciated!
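For reference, this is roughly the check we've been running. It is not from this repo; `emb_a` and `emb_b` are hypothetical `(n, d)` arrays holding the trained embeddings from two runs, with rows in the same entity order. If the residual is near zero, the two runs differ only by a translation plus an orthogonal transform.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def alignment_error(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    # Remove the translation component by centering both tables.
    a = emb_a - emb_a.mean(axis=0)
    b = emb_b - emb_b.mean(axis=0)
    # Find the orthogonal matrix R minimizing ||a @ R - b||_F.
    R, _ = orthogonal_procrustes(a, b)
    # Relative residual after the best rotation/reflection; near zero means
    # the two runs agree up to a rigid transform.
    return float(np.linalg.norm(a @ R - b) / np.linalg.norm(b))
```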
There are a few random operations in the training process (including parameter initialization and negative sampling) that contribute to different trained embeddings.
Unfortunately, setting a random seed is currently not supported.
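To illustrate the two sources of randomness mentioned above, here is a generic PyTorch sketch (not this library's actual code). Without an explicit generator, both the centered-normal initialization and the negative sampling draw from the global RNG, so repeated runs produce different values; the dimensions and sampling scheme below are made up for the example.

```python
import torch

num_entities, dim, num_negatives = 1000, 128, 50

# Centered normal initialization: different on every call.
embeddings = torch.empty(num_entities, dim).normal_(mean=0.0, std=1.0)

# Uniform negative sampling over entity ids: also different on every call.
negatives = torch.randint(0, num_entities, (num_negatives,))

# With an explicit generator the same draws become reproducible, but the
# library exposes no such seed, and multi-process training would add
# further nondeterminism even if it did.
gen = torch.Generator().manual_seed(0)
seeded_emb = torch.empty(num_entities, dim).normal_(mean=0.0, std=1.0, generator=gen)
seeded_neg = torch.randint(0, num_entities, (num_negatives,), generator=gen)
```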
Thank you @parsa-saadatpanah! Could you elaborate on some of the other random operations (aside from parameter initialization and negative sampling) you see that could contribute to different trained embeddings? It would help me figure out what we can do on our end to adjust. Thanks!