Merge pull request #1 from haggqvist/main
add pre-commit configuration
ksachdeva authored Aug 16, 2024
2 parents a82f14d + 83f67e9 commit 602c013
Showing 31 changed files with 130 additions and 78 deletions.
4 changes: 2 additions & 2 deletions .devcontainer/Dockerfile
@@ -5,9 +5,9 @@ ARG USERNAME=vscode
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update \
&& apt-get upgrade -y \
-&& apt-get -y install --no-install-recommends build-essential libmagic-dev iputils-ping \
+&& apt-get -y install --no-install-recommends build-essential libmagic-dev iputils-ping \
&& apt-get autoremove -y \
&& apt-get clean -y \
&& rm -rf /var/lib/apt/lists/*

-ENV SHELL /bin/zsh
+ENV SHELL /bin/zsh
20 changes: 11 additions & 9 deletions .devcontainer/devcontainer.json
@@ -29,15 +29,17 @@
}
}
},
-"extensions": [
-"ms-python.python",
-"charliermarsh.ruff",
-"ms-python.vscode-pylance",
-"ms-toolsai.jupyter",
-"visualstudioexptteam.vscodeintellicode",
-"ms-python.mypy-type-checker",
-"github.vscode-github-actions"
-]
+"vscode": {
+"extensions": [
+"ms-python.python",
+"charliermarsh.ruff",
+"ms-python.vscode-pylance",
+"ms-toolsai.jupyter",
+"visualstudioexptteam.vscodeintellicode",
+"ms-python.mypy-type-checker",
+"github.vscode-github-actions"
+]
+}
},
"features": {
"ghcr.io/devcontainers/features/common-utils:2": {
2 changes: 1 addition & 1 deletion .devcontainer/rye/install.sh
@@ -12,4 +12,4 @@ echo 'source "$HOME/.rye/env"' >> /home/vscode/.zshrc

chown -R vscode $RYE_HOME

-echo "Done!"
+echo "Done!"
12 changes: 6 additions & 6 deletions .github/workflows/publish.yml
@@ -1,5 +1,5 @@
-name: 'Publish on pypi'
-on:
+name: 'Publish on pypi'
+on:
release:
types: [created]
push:
@@ -24,15 +24,15 @@ jobs:
uses: actions/checkout@v3

- name: Login to GitHub Container Registry
-uses: docker/login-action@v2
+uses: docker/login-action@v2
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Build and run dev container task
-uses: devcontainers/[email protected]
-with:
+uses: devcontainers/[email protected]
+with:
imageName: ghcr.io/ksachdeva/langchain-graphrag-devcontainer
cacheFrom: ghcr.io/ksachdeva/langchain-graphrag-devcontainer
runCmd: rye build
@@ -42,4 +42,4 @@ jobs:
with:
packages-dir: dist
skip-existing: true
-verbose: true
+verbose: true
2 changes: 1 addition & 1 deletion .gitignore
@@ -168,4 +168,4 @@ outputs
.env
test-data
scratch
-.DS_Store
+.DS_Store
20 changes: 20 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,20 @@
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: check-ast
+      - id: check-toml
+      - id: check-yaml
+      - id: end-of-file-fixer
+        exclude: ^(\.devcontainer|\.vscode).+json
+      - id: trailing-whitespace
+      - id: mixed-line-ending
+  - repo: https://github.com/pycqa/isort
+    rev: 5.13.2
+    hooks:
+      - id: isort
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.6.0
+    hooks:
+      - id: ruff
+      - id: ruff-format
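
Most of the one-line changes elsewhere in this commit are whitespace-only, which is what the `trailing-whitespace` and `end-of-file-fixer` hooks produce on their first run. As a rough illustration (a simplified sketch, not the hooks' actual implementation; the function names are made up for this example), the normalization amounts to something like:

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line."""
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def fix_end_of_file(text: str) -> str:
    """Make non-empty text end with exactly one newline."""
    if not text:
        return text
    return text.rstrip("\n") + "\n"


def normalize(text: str) -> str:
    """Apply both fixes, roughly as the two hooks would in sequence."""
    return fix_end_of_file(strip_trailing_whitespace(text))


# Hypothetical input resembling the old mkdocs.yml: trailing spaces on the
# first line and extra blank lines at the end of the file.
sample = "site_name: GraphRAG   \nnav:\n  - Home: index.md\n\n\n"
print(repr(normalize(sample)))
```

In practice the hooks are installed once with `pre-commit install` and can be applied to the whole repository with `pre-commit run --all-files`.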
2 changes: 1 addition & 1 deletion .readthedocs.yaml
@@ -13,4 +13,4 @@ python:
path: .

mkdocs:
-configuration: mkdocs.yml
+configuration: mkdocs.yml
21 changes: 11 additions & 10 deletions README.md
@@ -2,11 +2,12 @@

[![Documentation build status](https://readthedocs.org/projects/langchain-graphrag/badge/?version=latest
)](https://langchain-graphrag.readthedocs.io/en/latest/)
+[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)


** WORK IN PROGRESS **

-This is an implementation of GraphRAG as described in
+This is an implementation of GraphRAG as described in

https://arxiv.org/pdf/2404.16130

@@ -29,7 +30,7 @@ The primary reasons for re-implementing:
## Install (Not Recommended yet!)

Note - this is work in progress so installing the package is not recommended yet.
-It would be better to clone the repo and try out current state of the code.
+It would be better to clone the repo and try out current state of the code.
See below for more details.

I published the package so as to reserve the name. Clone the repo and install the package locally.
@@ -38,15 +39,15 @@ I published the package so as to reserve the name. Clone the repo and install th
pip install langchain-graphrag
```

-## Projects
+## Projects

There are 2 projects in the repo:

-### `langchain_graphrag`
+### `langchain_graphrag`

This is the core library that implements the GraphRAG paper. It is built on top of the `langchain` library.

-The concepts described in GraphRAG paper are implemented in a modular fashion with easy extensibility and replacement in mind.
+The concepts described in GraphRAG paper are implemented in a modular fashion with easy extensibility and replacement in mind.

To use the development version (Recommended as it is under active development):

@@ -81,8 +82,8 @@ the classes as long as they implement the required interface.
```bash
# To generate the index
# default set azure_openai/gpt4-o/text-embedding-3-small
-# you can change the model and other parameters from command line
-rye run simple-app-indexer
+# you can change the model and other parameters from command line
+rye run simple-app-indexer
```

```bash
@@ -93,15 +94,15 @@ rye run simple-app-indexer --help
```bash
# To do global search/query
# defaults are azure_openai/gpt4-o/text-embedding-3-small
-# you can change the model and other parameters from command line
+# you can change the model and other parameters from command line
rye run simple-app-global-search --query "What are the top themes in this story?"
```

```bash
# To do local search/query
# defaults are azure_openai/gpt4-o/text-embedding-3-small
-# you can change the model and other parameters from command line
+# you can change the model and other parameters from command line
rye run simple-app-local-search --query "Who is Scrooge, and what are his main relationships?"
```

-See `examples/simple-app/README.md` for more details.
+See `examples/simple-app/README.md` for more details.
2 changes: 1 addition & 1 deletion examples/simple-app/README.md
@@ -37,4 +37,4 @@ rye run simple-app-global-search --query "What are the top themes in this story?
# Step 3 - Local Search
# make sure to run this from the root of the repository
rye run simple-app-local-search --query "Who is Scrooge, and what are his main relationships?"
-```
+```
3 changes: 2 additions & 1 deletion examples/simple-app/app/common.py
@@ -10,7 +10,6 @@
from langchain_community.storage import SQLStore
from langchain_core.embeddings import Embeddings
from langchain_core.language_models import BaseLLM
-from langchain_graphrag.indexing.artifacts import IndexerArtifacts
from langchain_ollama import OllamaEmbeddings, OllamaLLM
from langchain_openai import (
AzureChatOpenAI,
@@ -19,6 +18,8 @@
OpenAIEmbeddings,
)

+from langchain_graphrag.indexing.artifacts import IndexerArtifacts
+

class LLMType(str, Enum):
openai: str = "openai"
3 changes: 2 additions & 1 deletion examples/simple-app/app/indexer.py
@@ -25,6 +25,8 @@
)
from langchain_chroma.vectorstores import Chroma as ChromaVectorStore
from langchain_community.document_loaders.directory import DirectoryLoader
+from langchain_text_splitters import TokenTextSplitter
+
from langchain_graphrag.indexing.artifacts import IndexerArtifacts
from langchain_graphrag.indexing.embedding_generation.graph import (
Node2VectorGraphEmbeddingGenerator,
@@ -51,7 +53,6 @@
TextUnitsTableGenerator,
)
from langchain_graphrag.indexing.text_unit_extractor import TextUnitExtractor
-from langchain_text_splitters import TokenTextSplitter

app = Typer()

6 changes: 2 additions & 4 deletions examples/simple-app/app/query.py
@@ -24,6 +24,7 @@
)
from langchain_chroma.vectorstores import Chroma as ChromaVectorStore
from langchain_core.output_parsers.string import StrOutputParser
+
from langchain_graphrag.query.global_search import GlobalSearch
from langchain_graphrag.query.global_search.community_weight_calculator import (
CommunityWeightCalculator,
@@ -34,10 +35,7 @@
from langchain_graphrag.query.global_search.key_points_generator import (
KeyPointsGenerator,
)
-from langchain_graphrag.query.local_search import (
-LocalSearch,
-LocalSearchPromptBuilder,
-)
+from langchain_graphrag.query.local_search import LocalSearch, LocalSearchPromptBuilder
from langchain_graphrag.query.local_search.context_builders import (
CommunitiesReportsContextBuilder,
ContextBuilder,
10 changes: 4 additions & 6 deletions mkdocs.yml
@@ -1,11 +1,11 @@
-site_name: GraphRAG
+site_name: GraphRAG
theme:
name: readthedocs
highlightjs: true
plugins:
- search
- mkdocstrings:
-handlers:
+handlers:
python:
options:
docstring_style: sphinx
@@ -16,7 +16,5 @@ markdown_extensions:
base_path: .
- admonition
nav:
-- Home: index.md
-- Overview: overview.md
-
-
+- Home: index.md
+- Overview: overview.md
2 changes: 1 addition & 1 deletion mypy.ini
@@ -1,2 +1,2 @@
[mypy]
-enable_incomplete_feature=Unpack
+enable_incomplete_feature=Unpack
4 changes: 4 additions & 0 deletions pyproject.toml
@@ -28,6 +28,7 @@ dev-dependencies = [
"mkdocs>=1.6.0",
"mkdocstrings[python]>=0.25.2",
"markdown-include>=0.8.1",
+"pre-commit>=3.8.0",
]

[tool.hatch.metadata]
@@ -44,3 +45,6 @@ simple-app-indexer = "python examples/simple-app/app/main.py indexer index --inp
simple-app-report = "python examples/simple-app/app/main.py indexer report --artifacts-dir tmp/artifacts"
simple-app-local-search = "python examples/simple-app/app/main.py query local-search --output-dir tmp --cache-dir tmp/cache"
simple-app-global-search = "python examples/simple-app/app/main.py query global-search --output-dir tmp --cache-dir tmp/cache"
+
+[tool.isort]
+profile = "black"
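
The `black` profile makes isort use Black-compatible settings, including an 88-character line length. That is presumably why the parenthesised `LocalSearch` import in `examples/simple-app/app/query.py` was collapsed onto a single line in this commit. A quick check, as a sketch (the constant and helper names are illustrative and not part of isort):

```python
# Line length used by Black, and by isort's "black" profile.
BLACK_LINE_LENGTH = 88


def fits_on_one_line(statement: str, limit: int = BLACK_LINE_LENGTH) -> bool:
    """Return True if the statement fits within the line-length limit."""
    return len(statement) <= limit


# The import as it appears after this commit, written as one logical line.
collapsed = (
    "from langchain_graphrag.query.local_search "
    "import LocalSearch, LocalSearchPromptBuilder"
)
print(len(collapsed), fits_on_one_line(collapsed))
```

Under isort's default profile the limit is 79 characters, so the same import would stay wrapped.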
14 changes: 14 additions & 0 deletions requirements-dev.lock
@@ -58,6 +58,8 @@ certifi==2024.7.4
# via kubernetes
# via requests
# via unstructured-client
+cfgv==3.4.0
+# via pre-commit
chardet==5.2.0
# via unstructured
charset-normalizer==3.3.2
@@ -99,6 +101,8 @@ deprecated==1.2.14
# via opentelemetry-api
# via opentelemetry-exporter-otlp-proto-grpc
# via opentelemetry-semantic-conventions
+distlib==0.3.8
+# via virtualenv
distro==1.9.0
# via openai
emoji==2.12.1
@@ -115,6 +119,7 @@ fastapi==0.112.0
fastparquet==2024.5.0
filelock==3.15.4
# via huggingface-hub
+# via virtualenv
filetype==1.2.0
# via unstructured
flatbuffers==24.3.25
@@ -167,6 +172,8 @@ humanfriendly==10.0
# via coloredlogs
hyppo==0.4.0
# via graspologic
+identify==2.6.0
+# via pre-commit
idna==3.7
# via anyio
# via httpx
@@ -294,6 +301,8 @@ networkx==3.3
# via langchain-graphrag
nltk==3.8.1
# via unstructured
+nodeenv==1.9.1
+# via pre-commit
numba==0.60.0
# via hyppo
# via pynndescent
@@ -404,12 +413,14 @@ platformdirs==4.2.2
# via jupyter-core
# via mkdocs-get-deps
# via mkdocstrings
+# via virtualenv
pluggy==1.5.0
# via pytest
posthog==3.5.0
# via chromadb
pot==0.9.4
# via graspologic
+pre-commit==3.8.0
prompt-toolkit==3.0.47
# via ipython
protobuf==4.25.4
@@ -479,6 +490,7 @@ pyyaml==6.0.1
# via langchain-core
# via mkdocs
# via mkdocs-get-deps
+# via pre-commit
# via pymdown-extensions
# via pyyaml-env-tag
# via uvicorn
@@ -638,6 +650,8 @@ uvicorn==0.30.5
# via chromadb
uvloop==0.19.0
# via uvicorn
+virtualenv==20.26.3
+# via pre-commit
watchdog==4.0.2
# via mkdocs
watchfiles==0.22.0
2 changes: 1 addition & 1 deletion requirements-docs.txt
@@ -1,3 +1,3 @@
mkdocs
mkdocstrings[python]
-markdown-include
+markdown-include