This repository has been archived by the owner on Mar 14, 2024. It is now read-only.

Add to existing embeddings and retrain #240

Open
ssharpe42 opened this issue Oct 21, 2021 · 1 comment

Comments

@ssharpe42

Is there an easy way, or an existing framework, to:

  1. Load existing embeddings
  2. Add to node vocabulary
  3. Retrain with data that contains new nodes
@lw
Contributor

lw commented Oct 27, 2021

I don't think there's anything out-of-the-box for that, though you should be able to build it yourself. The init_path argument in the config could be a good place to start: it takes a checkpoint from a previous run, which is used to "warmstart" the new run. That old checkpoint must contain exactly the same entities as the new run, though. However, you could try manually altering a previous checkpoint to insert the entities that weren't there, giving each a random embedding.
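
For illustration, here is a minimal sketch of the "insert missing entities with a random embedding" idea. It works only on plain in-memory lists; reading the old checkpoint and writing the altered one back is deliberately left out, because the on-disk layout is not described in this thread. The function name `extend_embeddings`, its parameters, and the small Gaussian initialisation are assumptions made for this example and are not part of the library.

```python
import random


def extend_embeddings(old_names, old_vectors, new_names, seed=0, scale=0.001):
    """Append entities missing from old_names, each with a random embedding.

    Hypothetical helper: old_names[i] is assumed to be the entity whose
    embedding is old_vectors[i]. Existing rows are copied unchanged.
    """
    if len(old_names) != len(old_vectors):
        raise ValueError("old_names and old_vectors must have the same length")
    if not old_vectors:
        raise ValueError("need at least one existing embedding to infer the dimension")

    dim = len(old_vectors[0])
    rng = random.Random(seed)  # seeded so the result is reproducible

    names = list(old_names)
    vectors = [list(v) for v in old_vectors]
    known = set(old_names)

    for name in new_names:
        if name in known:
            continue  # already present (or a duplicate in new_names)
        known.add(name)
        names.append(name)
        # Assumption: a small Gaussian init; match whatever your training run uses.
        vectors.append([rng.gauss(0.0, scale) for _ in range(dim)])

    return names, vectors
```

Called as `extend_embeddings(["a", "b"], [[1.0, 2.0], [3.0, 4.0]], ["b", "c"])`, this keeps the rows for `a` and `b` and appends one new random row for `c`. The extended list of names would then have to match the entity list of the new run exactly.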

Even better would be to do this as part of your importing/exporting scripts. The scripts we provide don't do it AFAIK, but you could modify them or write your own so that the importer looks up, for each entity, whether the previous exporter stored an embedding for it and, if so, includes it in a "fake" checkpoint that can be passed to init_path.
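
The importer-side lookup could be sketched roughly as follows, assuming the previous export is available as a dictionary from entity name to vector and the new import yields entities in the order the new run will index them. `build_init_embeddings` and all of its parameters are hypothetical names for this example; turning the resulting rows into a checkpoint that init_path accepts is not shown.

```python
import random


def build_init_embeddings(new_entity_order, previous, dim, seed=0, scale=0.001):
    """Build initial embedding rows in the new run's entity order.

    Hypothetical helper: `previous` maps entity name -> old embedding.
    Entities found there reuse their old vector; the rest get a random one.
    Returns (rows, reused_count).
    """
    rng = random.Random(seed)
    rows = []
    reused = 0

    for name in new_entity_order:
        old = previous.get(name)
        if old is not None:
            if len(old) != dim:
                raise ValueError(f"embedding for {name!r} has wrong dimension")
            rows.append(list(old))
            reused += 1
        else:
            # Assumption: small Gaussian init for entities never seen before.
            rows.append([rng.gauss(0.0, scale) for _ in range(dim)])

    return rows, reused
```

Entities that existed in the previous run but are absent from the new one are simply never looked up, so they drop out of the result. The returned count of reused rows is a cheap sanity check that the name matching between the old export and the new import actually worked.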

Hope this helps.
