Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated model.py #7

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

srijan-mishra
Copy link

The disambiguate task was working on a the individual letters of the words of the whole sentence. I have changed it now and it would be running the word2id on the words as a whole rather than the individual letters of the words.

@lopuhin
Copy link
Owner

lopuhin commented May 9, 2017

@srijan-mishra ah, thanks for the PR! Actually, I should have made it more clear that context should be already tokenized. I would like to leave tokenization outside of "disambiguate", as it can be different for different languages, and it's important for tokenization to be the same in training and inference.

So instead of doing tokenization via split, I would like to document that context should already be tokenized.

@srijan-mishra
Copy link
Author

Awesome!

Great work here :)

@lopuhin
Copy link
Owner

lopuhin commented May 9, 2017

@srijan-mishra great work with figuring it out!

Btw, just though that it would be also possible to raise an error if a string is passed instead of a list, so if someone misses the docs, they'll get a clear error message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants