
[Question] Stability of embeddings on consecutive runs? #264

Open
ml7 opened this issue Sep 1, 2022 · 3 comments

ml7 commented Sep 1, 2022

Hi there! I hope all is well. I noticed in the code that the embeddings seem to be initialized from a centered normal distribution (I originally thought torch.empty was being used), which naturally produces different results on each call, both in magnitude and orientation. We're noticing that embeddings trained on two separate runs (holding the data fixed) differ noticeably. I imagine it's probably due to a difference in rotation/translation, and we're wondering whether the initialization might be the cause.
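
For illustration, here's a minimal plain-PyTorch sketch (not the project's actual init code) of why unseeded normal initialization differs across runs, and how drawing through a seeded generator would make it repeatable:

```python
import torch

dim, n = 16, 100

# Two unseeded draws from a centered normal differ on every run.
a = torch.randn(n, dim)
b = torch.randn(n, dim)
print(torch.allclose(a, b))  # False

# Drawing through a seeded generator makes the init reproducible.
g1 = torch.Generator().manual_seed(0)
g2 = torch.Generator().manual_seed(0)
a = torch.randn(n, dim, generator=g1)
b = torch.randn(n, dim, generator=g2)
print(torch.allclose(a, b))  # True
```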

Would it also potentially be caused by the negative sampling not producing the same negatives? I did not see a generator/random seed in the negative sampling function. Any thoughts are appreciated!
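
To make that concrete, a hypothetical uniform negative sampler (illustrative only, not the repo's actual function) shows the issue: without an explicit generator it draws from torch's global RNG, so consecutive runs sample different negatives.

```python
import torch

num_entities, num_negatives = 1000, 5

# Hypothetical uniform negative sampler: with no explicit generator it
# uses torch's global RNG, so two training runs draw different negatives.
def sample_negatives(batch_size, generator=None):
    return torch.randint(
        num_entities, (batch_size, num_negatives), generator=generator
    )

print(sample_negatives(2))               # differs run to run
g = torch.Generator().manual_seed(42)
print(sample_negatives(2, generator=g))  # repeatable with a fixed seed
```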

@parsa-saadatpanah (Contributor)

There are a few random operations involved in the training process (including parameter initialization and negative sampling) that contribute to differences between trained embeddings.
Unfortunately, setting a random seed is currently not supported.
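
As a generic best-effort workaround (not something the library wires through; with multiple training processes or workers this alone may not yield determinism), one can seed the global RNGs before training:

```python
import random
import numpy as np
import torch

# Best-effort seeding of the global RNGs. In multi-process or
# multi-worker training this may not be sufficient on its own.
def seed_everything(seed: int) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

seed_everything(0)
```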


ml7 (Author) commented Sep 12, 2022

Thank you @parsa-saadatpanah! Could you elaborate on the other random operations (aside from parameter initialization and negative sampling) that could contribute to different trained embeddings? That would help me figure out what we can adjust on our end. Thanks!

@parsa-saadatpanah (Contributor)

I believe the bucket scheduling and how each bucket is split into sub-buckets can also be random.
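
As a toy illustration of that point (not the project's actual scheduler): with partitioned entities, training iterates over (lhs, rhs) partition buckets, and an unseeded shuffle visits them in a different order each run.

```python
import random

# Toy illustration: enumerate the (lhs, rhs) partition buckets, then
# shuffle the visiting order, as a partitioned trainer might.
num_partitions = 4
buckets = [(i, j) for i in range(num_partitions) for j in range(num_partitions)]

random.shuffle(buckets)   # unseeded: order differs between runs
print(buckets[:4])

rng = random.Random(0)    # seeded: the same order every run
rng.shuffle(buckets)
print(buckets[:4])
```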
