Integrated UrduHack and IndicNLP Resources directly into the module #68

Open
wants to merge 13 commits into
base: master

VarunGumma

Integrated UrduHack and indic_nlp_resources directly into the module. This removes the need to install the TensorFlow-based Urdu library, which was causing some conflicts. The resources now ship with the module, so they no longer have to be cloned separately and located by setting a path. This should make installation and packaging easier, especially for the IT2 HF tokenizer.
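As a rough illustration of what shipping the resources with the module can look like, here is a minimal sketch of a path lookup that prefers a bundled directory and only falls back to an externally configured one. The function name `resolve_resources_path`, the `resources` subdirectory name, and the use of the `INDIC_RESOURCES_PATH` environment variable as the fallback are assumptions made for this example and are not taken from the PR's code.

```python
import os
from pathlib import Path


def resolve_resources_path(package_dir: Path,
                           env_var: str = "INDIC_RESOURCES_PATH") -> Path:
    """Return the directory holding the language resources.

    Hypothetical helper: looks for a `resources` folder shipped inside
    the package first, then falls back to a path given via `env_var`.
    """
    # Case 1: resources are bundled next to the package code.
    bundled = package_dir / "resources"
    if bundled.is_dir():
        return bundled

    # Case 2: resources were cloned separately and the path was exported.
    external = os.environ.get(env_var)
    if external and Path(external).is_dir():
        return Path(external)

    raise FileNotFoundError(
        f"No bundled 'resources' directory in {package_dir} "
        f"and {env_var} is not set to an existing directory"
    )
```

With a layout like this, a caller inside the package could pass `Path(__file__).parent` as `package_dir`, so users would not need to clone anything or export a path in the common case.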

@VarunGumma
Author

Hi @anoopkunchukuttan, as discussed, I have opened a PR for the indicnlp version we have been using for IT2 and its tokenizer. This repo integrates UrduHack and indic_nlp_resources, and is debloated to support the primary requirements of IT2. I hope this can be added directly as another branch of the original repo.

@anoopkunchukuttan
Owner

Thanks @VarunGumma, will review and get back in a couple of days.

2 participants