## Make DocQA a one-clickable app implementation (#151)

### What does this PR do?

This PR adds a one-click app implementation of the DocQA example, where the user can choose a local provider like Ollama or a cloud provider like Together or Fireworks to do RAG. It allows users to install the `DocQA` app into the Applications folder from the `DocQA.dmg` file. The `DocQA` app leverages llama-stack's new `LlamaStackAsLibraryClient` feature and makes every agentic component run in-line. This PR also explains the steps for building the app. The DocQA app will replace the existing docker solution, as the docker solution is not easy to use and maintain.

### Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/meta-llama/llama-stack-apps/blob/main/CONTRIBUTING.md#pull-requests), Pull Request section?
- [ ] Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes?
- [x] Did you write any new necessary tests?

Thanks for contributing 🎉!

---

Co-authored-by: Hamid Shojanazeri <[email protected]>
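The `LlamaStackAsLibraryClient` approach mentioned above means the stack runs inside the app process rather than against a separate server, which is what makes a self-contained one-click app possible. As a rough, hypothetical sketch of that pattern (the class and names below are illustrative only, not the llama-stack API):

```python
class InProcessClient:
    """Illustrative stand-in for a library-mode client: it exposes the
    same call surface a remote client would, but dispatches each request
    directly to in-process provider objects instead of over HTTP."""

    def __init__(self, providers):
        # e.g. {"inference": <callable>, "memory": <callable>}
        self.providers = providers

    def call(self, component, request):
        # Direct in-process dispatch; no server, no sockets.
        return self.providers[component](request)


# Toy "inference" provider standing in for a real model backend.
client = InProcessClient({"inference": lambda prompt: "echo: " + prompt})
print(client.call("inference", "hello"))  # → echo: hello
```

The trade-off is that every dependency the server would normally carry must now be bundled into the app binary, which is why the PyInstaller spec below collects so many packages.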
1 parent 6eabaa2 · commit 994d8b4 · 28 changed files with 606 additions and 2,164 deletions.
New Git LFS attributes entry (tracking the installer as an LFS object):

```
DocQA.dmg filter=lfs diff=lfs merge=lfs -text
```

(The `DocQA.dmg` binary itself is stored in Git LFS and not shown.)
`DocQA.spec` (new file), the PyInstaller spec used to build the app:

```python
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_data_files
from PyInstaller.utils.hooks import collect_submodules

# Force-include submodules PyInstaller's static analysis would miss.
hidden_imports = []
hidden_imports += collect_submodules('llama_stack')
hidden_imports += collect_submodules('llama_stack_client')
hidden_imports += collect_submodules('llama_models')

# Bundle non-code data files shipped inside these packages.
datas = []
datas += collect_data_files('llama_stack')
datas += collect_data_files('llama_stack', subdir='providers', include_py_files=True)
datas += collect_data_files('llama_models')
datas += collect_data_files('llama_stack_client')
datas += collect_data_files('blobfile')
# NOTE: machine-specific absolute path; adjust to your own Python environment.
datas += [('/opt/homebrew/anaconda3/envs/blank/lib/python3.10/site-packages/customtkinter', 'customtkinter/')]

a = Analysis(
    ['app.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=hidden_imports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    [],
    exclude_binaries=True,
    name='DocQA',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    strip=False,
    upx=True,
    upx_exclude=[],
    name='DocQA',
)
app = BUNDLE(
    coll,
    name='DocQA.app',
    icon='DocQA.icns',
    bundle_identifier=None,
)
```
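For context on why the spec needs `collect_submodules`: PyInstaller's static analysis only follows imports it can see in the source, so dynamically imported provider modules would otherwise be left out of the bundle. A rough stdlib-only sketch of what the hook does, demonstrated on the stdlib `email` package rather than llama-stack:

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Roughly what PyInstaller's collect_submodules does: import the
    package, walk it recursively, and return fully qualified names of
    every submodule found."""
    pkg = importlib.import_module(package_name)
    names = [package_name]
    for info in pkgutil.walk_packages(pkg.__path__, prefix=package_name + "."):
        names.append(info.name)
    return names


print("email.message" in list_submodules("email"))  # → True
```

Every name returned this way gets listed in `hiddenimports`, so the frozen app can import it at runtime even though no static `import` statement mentions it.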
`examples/DocQA/README.md` (updated):
# DocQA

This is an end-to-end Retrieval Augmented Generation (RAG) app example, leveraging llama-stack, that handles the logic for ingesting documents, storing them in a vector database, and providing an inference interface.

The `DocQA` app is built only for the macOS arm64 platform; you can install it into your Applications folder from the dmg file.
### Prerequisites

You can either do local inference with Ollama or choose a cloud provider:

**Local Inference**:

If you want to use Ollama to run inference, please follow [Ollama's download instructions](https://ollama.com/download) to install Ollama. Before running the app, open the Ollama application and download the model you want to use, e.g. run `ollama pull llama3.2:1b-instruct-fp16` in a terminal. Only 1B, 3B and 8B models are supported, as most machines cannot run models bigger than 8B locally.

**Cloud Provider**:

Register an account with [TogetherAI](https://www.together.ai/) or [FireworksAI](https://fireworks.ai/) to get an API key.
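The provider choice boils down to one rule: Ollama runs locally and needs no key, while TogetherAI and FireworksAI require an API key. A hypothetical helper capturing that rule (illustrative only, not the app's actual code; provider names are the lowercase identifiers assumed here):

```python
def check_provider(provider, model, api_key=None):
    """Return a config dict for a valid provider/model/key combination.

    'ollama' is local and keyless; 'together' and 'fireworks' are
    cloud providers and require an API key.
    """
    if provider == "ollama":
        return {"provider": provider, "model": model}
    if provider in ("together", "fireworks"):
        if not api_key:
            raise ValueError(provider + " requires an API key")
        return {"provider": provider, "model": model, "api_key": api_key}
    raise ValueError("unknown provider: " + provider)


print(check_provider("ollama", "llama3.2:1b-instruct-fp16"))
```

If you pick a cloud provider in the app without entering a key, setup cannot proceed, which is exactly the failure mode this kind of check guards against.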
### How to run the DocQA app

1. To get the dmg file, either download the raw file from [here](https://github.com/meta-llama/llama-stack-apps/blob/docqav2/examples/DocQA/DocQA.dmg), or clone the repo after enabling Git LFS (follow the instructions [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage)) and run another `git pull`.

2. Open `DocQA.dmg` and move `DocQA.app` to the Applications folder to install it.
   If a warning like this pops up and stops you from installing the app:

   

   you need to open `System Settings` -> `Privacy & Security` and choose `Open Anyway`, shown here:

   

   Please check [this documentation](https://support.apple.com/en-us/102445) for more details on how to bypass this warning and install.

3. Double-click `DocQA.app` in the Applications folder.
4. Choose your data folder, then select the model and provider. Enter your API key if you choose TogetherAI or FireworksAI, as shown below:

   

5. Wait for the setup to be ready, then click the `Chat` tab to start chatting with the app, as shown below:

   

6. Click the `exit` button to quit the app.
### How to build the DocQA app

1. Create a new python venv, e.g. `conda create -n build_app python=3.10`, then `conda activate build_app` to use it.
2. Run `pip install -r build_app_env.txt` to install the required PyPI packages.
3. Run `python app.py` to make sure everything works.
4. UPX is an executable packer that reduces the size of the app. Download the UPX release matching your machine's platform from the [UPX website](https://github.com/upx/upx/releases/) into this folder and unzip it.
5. We use PyInstaller to build the app from the `app.py` file. Run it with the correct UPX path, e.g. `pyinstaller --upx-dir ./upx-4.2.4-arm64_linux DocQA.spec`; the one-click app will be in `./dist/DocQA.app` (this step may take ~10 minutes).
6. Optionally, move `DocQA.app` to the Applications folder now to have it installed locally.
7. Alternatively, if you want to create a `.dmg` file for easier distribution, follow these steps:
   - Copy `./dist/DocQA.app` to a new folder.
   - On your Mac, open Disk Utility -> File -> New Image -> Image From Folder.
   - Select the folder where you placed the app, give the DMG a name, and save. This creates a distributable image for you.

The current `DocQA` app is built for the macOS arm64 platform. To build the app for another platform, follow the [PyInstaller documentation](https://pyinstaller.org/en/stable/usage.html#) to modify `DocQA.spec` and rebuild.
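The Disk Utility clicks in step 7 can also be done from the command line with macOS's `hdiutil`. The sketch below just assembles that invocation as an argv list (the `dmg-staging` folder name is an assumption, and `hdiutil` itself only exists on macOS):

```python
import shlex


def dmg_command(src_folder="dmg-staging", name="DocQA"):
    """Build the hdiutil argv that images a folder into a compressed
    (UDZO) dmg -- the CLI equivalent of Image From Folder."""
    return ["hdiutil", "create",
            "-volname", name,          # volume name shown when mounted
            "-srcfolder", src_folder,  # folder containing DocQA.app
            "-ov",                     # overwrite an existing dmg
            "-format", "UDZO",         # zlib-compressed read-only image
            name + ".dmg"]


print(shlex.join(dmg_command()))
# → hdiutil create -volname DocQA -srcfolder dmg-staging -ov -format UDZO DocQA.dmg
```

Scripting this step is handy if you rebuild the dmg often, since the Disk Utility route requires manual clicks every time.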