Make DocQA a one-clickable app implementation (#151)
# What does this PR do?
This PR adds a one-clickable app implementation of the DocQA example, where users can choose a local provider like Ollama or a cloud provider like Together or Fireworks to do RAG. It allows users to install the `DocQA` app into the Applications folder from the `DocQA.dmg` file. The `DocQA` app leverages llama-stack's new `LlamaStackAsLibraryClient` feature and makes every agentic component run in-line. This PR also explains the steps for building the app.
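For context, here is a minimal sketch of what in-line use of llama-stack as a library can look like. This is not code from this PR: the import path, the `ollama` template name, and the model id are assumptions based on llama-stack releases from around this time.

```python
# Hypothetical sketch: run a llama-stack distribution in-process rather than
# as a separate server. Import path and template name are assumptions.
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("ollama")  # build the "ollama" distribution in-line
client.initialize()  # start every provider inside this process

# The library client exposes the same surface as the remote llama-stack client.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Summarize this document."}],
)
print(response.completion_message.content)
```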

This DocQA app will replace the existing docker solution, as the docker solution is not easy to use and maintain.

## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Did you read the [contributor
guideline](https://github.com/meta-llama/llama-stack-apps/blob/main/CONTRIBUTING.md#pull-requests),
      Pull Request section?
- [ ] Was this discussed/approved via a Github issue? Please add a link
      to it if that's the case.
- [x] Did you make sure to update the documentation with your changes?
- [x] Did you write any new necessary tests?

Thanks for contributing 🎉!

---------

Co-authored-by: Hamid Shojanazeri <[email protected]>
wukaixingxp and HamidShojanazeri authored Feb 27, 2025
1 parent 6eabaa2 commit 994d8b4
Showing 28 changed files with 606 additions and 2,164 deletions.
1 change: 1 addition & 0 deletions examples/DocQA/.gitattributes
@@ -0,0 +1 @@
DocQA.dmg filter=lfs diff=lfs merge=lfs -text
3 changes: 3 additions & 0 deletions examples/DocQA/DocQA.dmg
Git LFS file not shown
Binary file added examples/DocQA/DocQA.icns
Binary file not shown.
64 changes: 64 additions & 0 deletions examples/DocQA/DocQA.spec
@@ -0,0 +1,64 @@
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_data_files
from PyInstaller.utils.hooks import collect_submodules

hidden_imports=[]
hidden_imports+= collect_submodules('llama_stack')
hidden_imports+= collect_submodules('llama_stack_client')
hidden_imports+= collect_submodules('llama_models')
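# The collect_submodules calls above pull in llama-stack packages that are imported
# dynamically at runtime and would be missed by PyInstaller's static analysis.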

datas = []
datas += collect_data_files('llama_stack')
datas += collect_data_files('llama_stack',subdir='providers',include_py_files=True)
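# include_py_files=True bundles the provider .py sources as data files
# (llama-stack resolves providers dynamically at runtime).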
datas += collect_data_files('llama_models')
datas += collect_data_files('llama_stack_client')
datas += collect_data_files('blobfile')
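# NOTE: the absolute customtkinter path below is machine-specific (a conda env on the
# build machine); adjust it to your own environment's site-packages location.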
datas += [('/opt/homebrew/anaconda3/envs/blank/lib/python3.10/site-packages/customtkinter', 'customtkinter/')]



a = Analysis(
['app.py'],
pathex=[],
binaries=[],
datas=datas,
hiddenimports=hidden_imports,
hookspath=[],
hooksconfig={},
runtime_hooks=[],
excludes=[],
noarchive=False,
optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
pyz,
a.scripts,
[],
exclude_binaries=True,
name='DocQA',
debug=False,
bootloader_ignore_signals=False,
strip=False,
upx=True,
console=True,
disable_windowed_traceback=False,
argv_emulation=False,
target_arch=None,
codesign_identity=None,
entitlements_file=None,
)
coll = COLLECT(
exe,
a.binaries,
a.datas,
strip=False,
upx=True,
upx_exclude=[],
name='DocQA',
)
app = BUNDLE(coll,
name='DocQA.app',
icon='DocQA.icns',
bundle_identifier=None)
74 changes: 41 additions & 33 deletions examples/DocQA/README.md
@@ -1,51 +1,59 @@
-## DocQA
+# DocQA

-This is an end-to-end Retrieval Augmented Generation (RAG) App leveraging llama-stack that handles the logic for ingesting documents, storing them in a vector database and providing an inference interface.
+This is an end-to-end Retrieval Augmented Generation (RAG) app example leveraging llama-stack, which handles the logic for ingesting documents, storing them in a vector database, and providing an inference interface.
+
+The `DocQA` app is built only for the macOS arm64 platform; you can install it into your Applications folder from the dmg file.

We share the details of how to run the app first, and then an outline of how it works:

### Prerequisite:

-Install docker: Check [this doc for Mac](https://docs.docker.com/desktop/setup/install/mac-install/), [this doc for Windows](https://docs.docker.com/desktop/setup/install/windows-install/) and this [instruction for Linux](https://docs.docker.com/engine/install/).
+You can either run inference locally with Ollama or choose a cloud provider:
+
+**Local Inference**:
+
+If you want to use Ollama to run inference, please follow [Ollama's download instructions](https://ollama.com/download) to install it. Before running the app, open the Ollama app and download the model you want to use, e.g. by running `ollama pull llama3.2:1b-instruct-fp16` in a terminal. Only the 1B, 3B, and 8B models are supported, as most machines cannot run models bigger than 8B locally.
+
+**Cloud Provider**:
+
+Register an account with [TogetherAI](https://www.together.ai/) or [FireworksAI](https://fireworks.ai/) to get an API key.

-For Mac and Windows users, you need to start the Docker app manually after installation.
-
-### How to run the pipeline:
-
-![RAG_workflow](./data/assets/DocQA.png)
-
-The above is the workflow diagram for this RAG app. To run the app, please read the following instructions:
+### How to run the DocQA app:
+
+1. To get the dmg file, you can either download the raw file from [here](https://github.com/meta-llama/llama-stack-apps/blob/docqav2/examples/DocQA/DocQA.dmg), or first follow the instructions [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage) to enable git lfs and then do another `git pull` in your clone.
+
+2. Open `DocQA.dmg` and move `DocQA.app` to the Applications folder to install it.
+(If this warning pops up and stops you from installing the app:
+
+![open-app-anyway](./assets/warning.png)
+
+you need to open `System Settings` -> `Privacy & Security` and choose `Open Anyway`, as shown here: ![open-app-anyway](./assets/open-app-anyway.png)
+
+Please check [this documentation](https://support.apple.com/en-us/102445) for more details on how to bypass this warning and install.)
+
+3. Double click `DocQA.app` in the Applications folder.

-1. Copy the template configuration file `docqa_env_template` to create your own `docqv_env` inside the docker folder:
-
-```bash
-cd docker
-cp docqa_env_template docqv_env
-```
-
-2. Then update `model_name` and `document_path` accordingly in your `docqv_env`, for example:
-
-```
-DOC_PATH=/path/to/your/llama-stack-apps/examples/DocQA/example_data
-MODEL_NAME=llama3.2:1b-instruct-fp16
-HOST=localhost
-LLAMA_STACK_PORT=5000
-CHROMA_PORT=6000
-GRADIO_SERVER_PORT=7860
-USE_GPU_FOR_DOC_INGESTION=false
-```
-
-3. In the `docker` folder, run the following code:
-
-```bash
-bash run_RAG.sh
-```
-
-4. Once the service is ready, open the link http://localhost:7861/ in your browser to chat with your documents.
+4. Choose your data folder, then select the models and providers. Enter your API key if you choose to use TogetherAI or FireworksAI, as shown below:
+
+![Setup](./assets/DocQA_setup.png)
+
+5. Wait for the setup to be ready, then click the `Chat` tab to start chatting with the app, as shown below:
+
+![Chat](./assets/DocQA_chat.png)
+
+6. Click the `exit` button to quit the app.
+
+### How to build the DocQA app:
+
+1. Create a new python venv, e.g. `conda create -n build_app python=3.10`, and then `conda activate build_app` to use it.
+2. Run `pip install -r build_app_env.txt` to install the required PyPI packages.
+3. Run `python app.py` to make sure everything works.
+4. UPX is an executable packer used to reduce the size of the app. Download the UPX zip that matches your machine platform from the [UPX website](https://github.com/upx/upx/releases/) into this folder and unzip it.
+5. We use PyInstaller to build the app from the `app.py` file. Run it with the correct UPX path, e.g. `pyinstaller --upx-dir ./upx-4.2.4-arm64_linux DocQA.spec`; the one-clickable app will be in `./dist/DocQA.app`. This step may take ~10 mins; a programmatic sketch of this invocation is shown after this list.
+6. Optionally, you can move `DocQA.app` to the Applications folder now to have it installed locally.
+7. Alternatively, if you want to create a .dmg file for easier distribution, follow these steps:
+
+- Copy `./dist/DocQA.app` to a new folder.
+- On your Mac, search for and open Disk Utility -> File -> New Image -> Image From Folder.
+- Select the folder where you placed the app, give the DMG a name, and save. This creates a distributable image for you.
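For reference, here is a sketch of step 5's build invocation expressed through PyInstaller's programmatic entry point; the UPX directory name is copied from the step above and depends on which UPX build you downloaded.

```python
# Sketch: programmatic equivalent of
#   pyinstaller --upx-dir ./upx-4.2.4-arm64_linux DocQA.spec
import PyInstaller.__main__

PyInstaller.__main__.run([
    "DocQA.spec",                            # build according to the spec file
    "--upx-dir", "./upx-4.2.4-arm64_linux",  # folder containing the unzipped UPX binary
])
```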

-### Overview of how the RAG app works:
-
-1. We use the [docling](https://github.com/DS4SD/docling) framework to handle multiple input file formats (PDF, PPTX, DOCX).
-2. If you are using a GPU, we have an option to use `Llama-3.2-11B-Vision` to caption images in the documents. On a CPU-only machine this step is skipped.
-3. Once ingested, we use a llama-stack distribution running chroma-db and `Llama-3.2-3B-Instruct` to ingest chunks into a memory_bank.
-4. Once the vectordb is created, we then use llama-stack with `Llama-3.2-3B-Instruct` to chat with the model.
+The current `DocQA` app is built for the macOS arm64 platform. To build the app for another platform, follow the [PyInstaller documentation](https://pyinstaller.org/en/stable/usage.html#) to modify `DocQA.spec` and rebuild.
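As one example of such a modification, PyInstaller's `EXE` accepts a `target_arch` option on macOS. Below is a sketch of the relevant fragment of `DocQA.spec`; building `universal2` assumes a universal Python installation, and PyInstaller still has to run on the target OS.

```python
# Sketch: fragment of DocQA.spec targeting a universal macOS binary instead of arm64-only.
exe = EXE(
    pyz,
    a.scripts,
    [],
    exclude_binaries=True,
    name='DocQA',
    target_arch='universal2',  # 'x86_64', 'arm64', or 'universal2' on macOS
    # ... remaining options unchanged from the original spec ...
)
```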
