Spacy and Huggingface integrations #84

Merged 18 commits on Nov 4, 2024


adamkells
Contributor

@adamkells adamkells commented Oct 13, 2024

Description

This PR adds convenient integrations for using pipelines from spaCy, Hugging Face, or LangChain inside HealthChain pipelines.

Each of these NLP pipelines constitutes a single component of the HealthChain pipeline.

Related Issue

This addresses issue #78.

Changes Made

  • Modifies the Document container object to be less specific to spaCy and to allow information from spaCy, Hugging Face, or LangChain to be added.
  • Adds integrations to instantiate components from each of the three libraries.
  • Adds documentation with examples for each integration, as well as a full example using all three in one pipeline.
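
As a rough illustration of the first two changes, the sketch below shows one way a library-agnostic container and a thin component wrapper could fit together. It is not HealthChain's actual code: the fields on `Document`, `CallableComponent`, and `run_pipeline` are hypothetical names, and plain Python callables stand in for real spaCy, Hugging Face, or LangChain pipelines so the example runs without those packages.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Document:
    """Hypothetical container: raw text plus one slot per NLP library."""

    text: str
    spacy_doc: Optional[Any] = None  # would hold a spaCy Doc if spaCy were used
    hf_outputs: Dict[str, Any] = field(default_factory=dict)  # keyed by task name
    langchain_outputs: Dict[str, Any] = field(default_factory=dict)


class CallableComponent:
    """Hypothetical wrapper: runs a text -> result callable and stores the result."""

    def __init__(self, fn: Callable[[str], Any], store: str, key: str) -> None:
        self.fn = fn
        self.store = store  # name of the dict attribute on Document to write to
        self.key = key      # key under which the result is stored

    def __call__(self, doc: Document) -> Document:
        getattr(doc, self.store)[self.key] = self.fn(doc.text)
        return doc


def run_pipeline(
    doc: Document, components: List[Callable[[Document], Document]]
) -> Document:
    """Apply each component in order, passing the same Document along."""
    for component in components:
        doc = component(doc)
    return doc


# Stand-in callables: str.split plays the role of a Hugging Face pipeline,
# len plays the role of a LangChain chain.
doc = run_pipeline(
    Document(text="Patient has a cough"),
    [
        CallableComponent(str.split, store="hf_outputs", key="tokens"),
        CallableComponent(len, store="langchain_outputs", key="length"),
    ],
)
print(doc.hf_outputs)
print(doc.langchain_outputs)
```

Keeping each library's output in its own slot means a downstream component can read one library's results without importing the others.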

Testing

Unit tests were added for all new functionality. As transformers is an optional dependency, the huggingface tests are skipped in CI but were run locally.
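
The PR description does not show how the skip is implemented. One common pattern, sketched here with only the standard library, is to check whether the optional package is importable and skip the test class when it is not. `HAS_TRANSFORMERS` and `TestHuggingFaceIntegration` are illustrative names, not taken from the HealthChain test suite.

```python
import importlib.util
import unittest

# True only when the optional `transformers` package can be imported.
HAS_TRANSFORMERS = importlib.util.find_spec("transformers") is not None


@unittest.skipUnless(HAS_TRANSFORMERS, "transformers is not installed")
class TestHuggingFaceIntegration(unittest.TestCase):
    def test_pipeline_factory_is_available(self):
        # Import inside the test so collecting this module never fails
        # on machines without the optional dependency.
        import transformers

        self.assertTrue(callable(transformers.pipeline))
```

In a pytest-based suite, `pytest.importorskip("transformers")` achieves the same effect in one line.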

Checklist

  • I have read the contributing guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Notes

@adamkells adamkells force-pushed the improv/pipeline_integrations branch from aa232d0 to 1e057c9 Compare October 18, 2024 10:24
@adamkells adamkells marked this pull request as ready for review October 23, 2024 23:16
Member

@jenniferjiangkells jenniferjiangkells left a comment

Looks good overall, just left some comments on some stylistic decisions and documentation improvements. Also note updated usage of add_node() and importing from .base. You good to make changes or want me to take over?

docs/reference/pipeline/integrations.md (outdated)
docs/reference/pipeline/integrations.md (outdated)
docs/reference/pipeline/integrations.md
docs/reference/pipeline/integrations.md
docs/reference/pipeline/integrations.md
healthchain/io/containers.py (outdated)
healthchain/io/containers.py (outdated)
healthchain/pipeline/components/integrations.py (outdated)
pyproject.toml (outdated)
docs/reference/pipeline/integrations.md (outdated)
Member

@jenniferjiangkells jenniferjiangkells left a comment

I think you accidentally pasted a URL in the middle of a code snippet in the documentation, but otherwise LGTM.


# Access spaCy annotations
spacy_doc = processed_doc.get_spacy_doc()
for token in spacy_doc:https://github.com/dotimplement/HealthChain
Member

?

@jenniferjiangkells jenniferjiangkells merged commit 0bd1fd2 into main Nov 4, 2024
5 checks passed
@jenniferjiangkells jenniferjiangkells deleted the improv/pipeline_integrations branch December 5, 2024 15:22
Development

Successfully merging this pull request may close these issues.

Pipeline integrations for common NLP/LLM Packages
2 participants