## Make DocQA a one-clickable app implementation (#151)

### What does this PR do?

This PR adds a one-click app implementation of the DocQA example, where the user can choose a local provider like Ollama or a cloud provider like Together or Fireworks to do RAG. It allows users to install the `DocQA` app into the Applications folder from the `DocQA.dmg` file. The `DocQA` app leverages llama-stack's new `LlamaStackAsLibraryClient` feature and makes every agentic component run in-line. This PR also explains the steps for building the app. The DocQA app will replace the existing docker solution, as the docker solution is not easy to use and maintain.

### Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/meta-llama/llama-stack-apps/blob/main/CONTRIBUTING.md#pull-requests), Pull Request section?
- [ ] Was this discussed/approved via a Github issue? Please add a link to it if that's the case.
- [x] Did you make sure to update the documentation with your changes?
- [x] Did you write any new necessary tests?

Thanks for contributing 🎉!

---

Co-authored-by: Hamid Shojanazeri <[email protected]>
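The `LlamaStackAsLibraryClient` approach mentioned above means the stack runs inside the app process rather than against a separate server, which is what makes a self-contained one-click app possible. As a rough, hypothetical sketch of that pattern (the class and names below are illustrative only, not the llama-stack API):

```python
class InProcessClient:
    """Illustrative stand-in for a library-mode client: it exposes the
    same call surface a remote client would, but dispatches each request
    directly to in-process provider objects instead of over HTTP."""

    def __init__(self, providers):
        # e.g. {"inference": <callable>, "memory": <callable>}
        self.providers = providers

    def call(self, component, request):
        # Direct in-process dispatch; no server, no sockets.
        return self.providers[component](request)


# Toy "inference" provider standing in for a real model backend.
client = InProcessClient({"inference": lambda prompt: "echo: " + prompt})
print(client.call("inference", "hello"))  # → echo: hello
```

The trade-off is that every dependency the server would normally carry must now be bundled into the app binary, which is why the PyInstaller spec below collects so many packages.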
1 parent 6eabaa2 · commit 994d8b4 · 28 changed files with 606 additions and 2,164 deletions.
New Git LFS attributes entry (tracking the installer as an LFS object):

```
DocQA.dmg filter=lfs diff=lfs merge=lfs -text
```

(The `DocQA.dmg` binary itself is stored in Git LFS and not shown.)
`DocQA.spec` (new file), the PyInstaller spec used to build the app:

```python
# -*- mode: python ; coding: utf-8 -*-
from PyInstaller.utils.hooks import collect_data_files
from PyInstaller.utils.hooks import collect_submodules

# Force-include submodules PyInstaller's static analysis would miss.
hidden_imports = []
hidden_imports += collect_submodules('llama_stack')
hidden_imports += collect_submodules('llama_stack_client')
hidden_imports += collect_submodules('llama_models')

# Bundle non-code data files shipped inside these packages.
datas = []
datas += collect_data_files('llama_stack')
datas += collect_data_files('llama_stack', subdir='providers', include_py_files=True)
datas += collect_data_files('llama_models')
datas += collect_data_files('llama_stack_client')
datas += collect_data_files('blobfile')
# NOTE: machine-specific absolute path; adjust to your own Python environment.
datas += [('/opt/homebrew/anaconda3/envs/blank/lib/python3.10/site-packages/customtkinter', 'customtkinter/')]

a = Analysis(
    ['app.py'],
    pathex=[],
    binaries=[],
    datas=datas,
    hiddenimports=hidden_imports,
    hookspath=[],
    hooksconfig={},
    runtime_hooks=[],
    excludes=[],
    noarchive=False,
    optimize=0,
)
pyz = PYZ(a.pure)

exe = EXE(
    pyz,
    a.scripts,
    [],
    exclude_binaries=True,
    name='DocQA',
    debug=False,
    bootloader_ignore_signals=False,
    strip=False,
    upx=True,
    console=True,
    disable_windowed_traceback=False,
    argv_emulation=False,
    target_arch=None,
    codesign_identity=None,
    entitlements_file=None,
)
coll = COLLECT(
    exe,
    a.binaries,
    a.datas,
    strip=False,
    upx=True,
    upx_exclude=[],
    name='DocQA',
)
app = BUNDLE(
    coll,
    name='DocQA.app',
    icon='DocQA.icns',
    bundle_identifier=None,
)
```
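For context on why the spec needs `collect_submodules`: PyInstaller's static analysis only follows imports it can see in the source, so dynamically imported provider modules would otherwise be left out of the bundle. A rough stdlib-only sketch of what the hook does, demonstrated on the stdlib `email` package rather than llama-stack:

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Roughly what PyInstaller's collect_submodules does: import the
    package, walk it recursively, and return fully qualified names of
    every submodule found."""
    pkg = importlib.import_module(package_name)
    names = [package_name]
    for info in pkgutil.walk_packages(pkg.__path__, prefix=package_name + "."):
        names.append(info.name)
    return names


print("email.message" in list_submodules("email"))  # → True
```

Every name returned this way gets listed in `hiddenimports`, so the frozen app can import it at runtime even though no static `import` statement mentions it.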
`examples/DocQA/README.md` (updated):
# DocQA

This is an end-to-end Retrieval Augmented Generation (RAG) app example, leveraging llama-stack, that handles the logic for ingesting documents, storing them in a vector database, and providing an inference interface.

The `DocQA` app is built only for the macOS arm64 platform; you can install it into your Applications folder from the dmg file.
### Prerequisites

You can either do local inference with Ollama or choose a cloud provider:

**Local Inference**:

If you want to use Ollama to run inference, please follow [Ollama's download instructions](https://ollama.com/download) to install Ollama. Before running the app, open the Ollama application and download the model you want to use, e.g. run `ollama pull llama3.2:1b-instruct-fp16` in a terminal. Only 1B, 3B and 8B models are supported, as most machines cannot run models bigger than 8B locally.

**Cloud Provider**:

Register an account with [TogetherAI](https://www.together.ai/) or [FireworksAI](https://fireworks.ai/) to get an API key.
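The provider choice boils down to one rule: Ollama runs locally and needs no key, while TogetherAI and FireworksAI require an API key. A hypothetical helper capturing that rule (illustrative only, not the app's actual code; provider names are the lowercase identifiers assumed here):

```python
def check_provider(provider, model, api_key=None):
    """Return a config dict for a valid provider/model/key combination.

    'ollama' is local and keyless; 'together' and 'fireworks' are
    cloud providers and require an API key.
    """
    if provider == "ollama":
        return {"provider": provider, "model": model}
    if provider in ("together", "fireworks"):
        if not api_key:
            raise ValueError(provider + " requires an API key")
        return {"provider": provider, "model": model, "api_key": api_key}
    raise ValueError("unknown provider: " + provider)


print(check_provider("ollama", "llama3.2:1b-instruct-fp16"))
```

If you pick a cloud provider in the app without entering a key, setup cannot proceed, which is exactly the failure mode this kind of check guards against.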
### How to run the DocQA app

1. To get the dmg file, either download the raw file from [here](https://github.com/meta-llama/llama-stack-apps/blob/docqav2/examples/DocQA/DocQA.dmg), or clone the repo after enabling Git LFS (follow the instructions [here](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage)) and run another `git pull`.

2. Open `DocQA.dmg` and move `DocQA.app` to the Applications folder to install it.
   If a warning like this pops up and stops you from installing the app:

   

   you need to open `System Settings` -> `Privacy & Security` and choose `Open Anyway`, shown here:

   

   Please check [this documentation](https://support.apple.com/en-us/102445) for more details on how to bypass this warning and install.

3. Double-click `DocQA.app` in the Applications folder.
4. Choose your data folder, then select the model and provider. Enter your API key if you choose TogetherAI or FireworksAI, as shown below:

   

5. Wait for the setup to be ready, then click the `Chat` tab to start chatting with the app, as shown below:

   

6. Click the `exit` button to quit the app.
### How to build the DocQA app

1. Create a new python venv, e.g. `conda create -n build_app python=3.10`, then `conda activate build_app` to use it.
2. Run `pip install -r build_app_env.txt` to install the required PyPI packages.
3. Run `python app.py` to make sure everything works.
4. UPX is an executable packer that reduces the size of the app. Download the UPX release matching your machine's platform from the [UPX website](https://github.com/upx/upx/releases/) into this folder and unzip it.
5. We use PyInstaller to build the app from the `app.py` file. Run it with the correct UPX path, e.g. `pyinstaller --upx-dir ./upx-4.2.4-arm64_linux DocQA.spec`; the one-click app will be in `./dist/DocQA.app` (this step may take ~10 minutes).
6. Optionally, move `DocQA.app` to the Applications folder now to have it installed locally.
7. Alternatively, if you want to create a `.dmg` file for easier distribution, follow these steps:
   - Copy `./dist/DocQA.app` to a new folder.
   - On your Mac, open Disk Utility -> File -> New Image -> Image From Folder.
   - Select the folder where you placed the app, give the DMG a name, and save. This creates a distributable image for you.

The current `DocQA` app is built for the macOS arm64 platform. To build the app for another platform, follow the [PyInstaller documentation](https://pyinstaller.org/en/stable/usage.html#) to modify `DocQA.spec` and rebuild.
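The Disk Utility clicks in step 7 can also be done from the command line with macOS's `hdiutil`. The sketch below just assembles that invocation as an argv list (the `dmg-staging` folder name is an assumption, and `hdiutil` itself only exists on macOS):

```python
import shlex


def dmg_command(src_folder="dmg-staging", name="DocQA"):
    """Build the hdiutil argv that images a folder into a compressed
    (UDZO) dmg -- the CLI equivalent of Image From Folder."""
    return ["hdiutil", "create",
            "-volname", name,          # volume name shown when mounted
            "-srcfolder", src_folder,  # folder containing DocQA.app
            "-ov",                     # overwrite an existing dmg
            "-format", "UDZO",         # zlib-compressed read-only image
            name + ".dmg"]


print(shlex.join(dmg_command()))
# → hdiutil create -volname DocQA -srcfolder dmg-staging -ov -format UDZO DocQA.dmg
```

Scripting this step is handy if you rebuild the dmg often, since the Disk Utility route requires manual clicks every time.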