meta-llama · leseb · Feb 18, 2025 · ashwinb · Feb 20, 2025 · leseb
@@ -31,42 +31,6 @@ Llama Stack standardizes the core building blocks that simplify AI application d
 
 By reducing friction and complexity, Llama Stack empowers developers to focus on what they do best: building transformative generative AI applications.
 
-### API Providers
-Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack. 
-
-| **API Provider Builder** |    **Environments**    | **Agents** | **Inference** | **Memory** | **Safety** | **Telemetry** |
-|:------------------------:|:----------------------:|:----------:|:-------------:|:----------:|:----------:|:-------------:|
-|      Meta Reference      |      Single Node       |     ✅      |       ✅       |     ✅      |     ✅      |       ✅       |
-|        SambaNova         |         Hosted         |            |       ✅       |            |            |               |
-|         Cerebras         |         Hosted         |            |       ✅       |            |            |               |
-|        Fireworks         |         Hosted         |     ✅      |       ✅       |     ✅      |            |               |
-|       AWS Bedrock        |         Hosted         |            |       ✅       |            |     ✅      |               |
-|         Together         |         Hosted         |     ✅      |       ✅       |            |     ✅      |               |
-|           Groq           |         Hosted         |            |       ✅       |            |            |               |
-|          Ollama          |      Single Node       |            |       ✅       |            |            |               |
-|           TGI            | Hosted and Single Node |            |       ✅       |            |            |               |
-|        NVIDIA NIM        | Hosted and Single Node |            |       ✅       |            |            |               |
-|          Chroma          |      Single Node       |            |               |     ✅      |            |               |
-|        PG Vector         |      Single Node       |            |               |     ✅      |            |               |
-|    PyTorch ExecuTorch    |     On-device iOS      |     ✅      |       ✅       |            |            |               |
-|           vLLM           | Hosted and Single Node |            |       ✅       |            |            |               |
-
-### Distributions
-
-A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations for each API component. Distributions make it easy to get started with a specific deployment scenario - you can begin with a local development setup (eg. ollama) and seamlessly transition to production (eg. Fireworks) without changing your application code. Here are some of the distributions we support:
-
-|               **Distribution**                |                                                                    **Llama Stack Docker**                                                                     |                                                 Start This Distribution                                                  |
-|:---------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------:|
-|                Meta Reference                 |           [llamastack/distribution-meta-reference-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-gpu/general)           |      [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/meta-reference-gpu.html)      |
-|           Meta Reference Quantized            | [llamastack/distribution-meta-reference-quantized-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-quantized-gpu/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/meta-reference-quantized-gpu.html) |
-|                   SambaNova                   |                     [llamastack/distribution-sambanova](https://hub.docker.com/repository/docker/llamastack/distribution-sambanova/general)                     |   [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/sambanova.html)   |
-|                   Cerebras                    |                     [llamastack/distribution-cerebras](https://hub.docker.com/repository/docker/llamastack/distribution-cerebras/general)                     |   [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/cerebras.html)   |
-|                    Ollama                     |                       [llamastack/distribution-ollama](https://hub.docker.com/repository/docker/llamastack/distribution-ollama/general)                       |            [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/ollama.html)            |
-|                      TGI                      |                          [llamastack/distribution-tgi](https://hub.docker.com/repository/docker/llamastack/distribution-tgi/general)                          |             [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/tgi.html)              |
-|                   Together                    |                     [llamastack/distribution-together](https://hub.docker.com/repository/docker/llamastack/distribution-together/general)                     |           [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/together.html)           |
-|                   Fireworks                   |                    [llamastack/distribution-fireworks](https://hub.docker.com/repository/docker/llamastack/distribution-fireworks/general)                    |          [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/fireworks.html)           |
-| vLLM |                  [llamastack/distribution-remote-vllm](https://hub.docker.com/repository/docker/llamastack/distribution-remote-vllm/general)                  |         [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/remote-vllm.html)          |
-
 ### Installation
 
 You have two ways to install this repository:

@@ -13,8 +13,11 @@ Which templates / distributions to choose depends on the hardware you have for r
 - **Do you have access to machines with GPUs?** If you wish to run Llama Stack locally or on a cloud instance and host your own Llama Stack endpoint, we suggest:
   - {dockerhub}`distribution-remote-vllm` ([Guide](self_hosted_distro/remote-vllm))
   - {dockerhub}`distribution-meta-reference-gpu` ([Guide](self_hosted_distro/meta-reference-gpu))
+  - {dockerhub}`distribution-meta-reference-quantized-gpu` ([Guide](self_hosted_distro/meta-reference-quantized-gpu))
   - {dockerhub}`distribution-tgi` ([Guide](self_hosted_distro/tgi))
   - {dockerhub}`distribution-nvidia` ([Guide](self_hosted_distro/nvidia))
+  - {dockerhub}`distribution-sambanova` ([Guide](self_hosted_distro/sambanova))
+  - {dockerhub}`distribution-cerebras` ([Guide](self_hosted_distro/cerebras))
 
 - **Are you running on a "regular" desktop or laptop?** We suggest using the ollama template for quick prototyping, so you can get started without needing GPUs.
   - {dockerhub}`distribution-ollama` ([Guide](self_hosted_distro/ollama))
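
As a rough illustration of how the entries above relate to actual images: the table removed from the README lists images under the `llamastack` Docker Hub organization (for example `llamastack/distribution-ollama`), and the guides live under `self_hosted_distro/`. The sketch below assumes that naming pattern holds for every distribution. The helper names `image_name` and `guide_path` are hypothetical and are not part of Llama Stack.

```python
# Minimal sketch, assuming every distribution follows the
# "llamastack/distribution-<name>" image pattern seen in the README table.
# These helpers are illustrative only; they are not Llama Stack APIs.

DOCKER_ORG = "llamastack"          # organization seen in the listed image names
PREFIX = "distribution-"           # common prefix of the distribution names
GUIDE_DIR = "self_hosted_distro"   # directory used by the guide links above


def _check(distribution: str) -> None:
    """Reject names that do not look like 'distribution-<name>'."""
    if not distribution.startswith(PREFIX) or len(distribution) == len(PREFIX):
        raise ValueError(f"expected a name like '{PREFIX}ollama', got {distribution!r}")


def image_name(distribution: str) -> str:
    """Build the Docker image name, e.g. 'llamastack/distribution-ollama'."""
    _check(distribution)
    return f"{DOCKER_ORG}/{distribution}"


def guide_path(distribution: str) -> str:
    """Build the relative guide path, e.g. 'self_hosted_distro/ollama'."""
    _check(distribution)
    return f"{GUIDE_DIR}/{distribution[len(PREFIX):]}"


if __name__ == "__main__":
    for name in ("distribution-ollama", "distribution-remote-vllm"):
        print(image_name(name), "->", guide_path(name))
```

The resulting image name is what you would pass to `docker pull`. Check the linked guide for the run-time options each distribution expects.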

@@ -47,38 +47,30 @@ We have a number of client-side SDKs available for different languages.
 
 A number of "adapters" are available for some popular Inference and Vector Store providers. For other APIs (particularly Safety and Agents), we provide *reference implementations* you can use to get started. We expect this list to grow over time. We are slowly onboarding more providers to the ecosystem as we get more confidence in the APIs.
 
-**Inference API**
-|  **Provider** |  **Environments** |
-| :----: | :----: |
-|  Meta Reference  |  Single Node |
-|  Ollama  | Single Node   |
-|  Fireworks  |  Hosted  |
-|  Together  |  Hosted  |
-|  NVIDIA NIM  |  Hosted and Single Node  |
-|  vLLM  | Hosted and Single Node |
-|  TGI  |  Hosted and Single Node  |
-|  AWS Bedrock  |  Hosted  |
-|  Cerebras  |  Hosted  |
-|  Groq  |  Hosted  |
-|  SambaNova  |  Hosted  |
-| PyTorch ExecuTorch | On-device iOS, Android |
-
-**Vector IO API**
-|  **Provider** |  **Environments** |
-| :----: | :----: |
-|  FAISS | Single Node |
-|  SQLite-Vec| Single Node |
-|  Chroma | Hosted and Single Node |
-|  Postgres (PGVector) | Hosted and Single Node |
-|  Weaviate | Hosted |
-
-**Safety API**
-|  **Provider** |  **Environments** |
-| :----: | :----: |
-|  Llama Guard | Depends on Inference Provider |
-|  Prompt Guard | Single Node |
-|  Code Scanner | Single Node |
-|  AWS Bedrock | Hosted |
+Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
+
+| **API Provider Builder** |    **Environments**           | **Agents** | **Inference** | **Vector IO** | **Safety** | **Telemetry** |
+|:------------------------:|:-----------------------------:|:----------:|:-------------:|:-------------:|:----------:|:-------------:|
+|      Meta Reference      |      Single Node              |     ✅     |       ✅      |     ✅        |     ✅     |       ✅      |
+|        SambaNova         |         Hosted                |            |       ✅      |               |            |               |
+|         Cerebras         |         Hosted                |            |       ✅      |               |            |               |
+|        Fireworks         |         Hosted                |     ✅     |       ✅      |     ✅        |            |               |
+|       AWS Bedrock        |         Hosted                |            |       ✅      |               |     ✅     |               |
+|         Together         |         Hosted                |     ✅     |       ✅      |               |     ✅     |               |
+|           Groq           |         Hosted                |            |       ✅      |               |            |               |
+|          Ollama          |      Single Node              |            |       ✅      |               |            |               |
+|           TGI            | Hosted and Single Node        |            |       ✅      |               |            |               |
+|        NVIDIA NIM        | Hosted and Single Node        |            |       ✅      |               |            |               |
+|          Chroma          | Hosted and Single Node        |            |               |     ✅        |            |               |
+|        PG Vector         |      Single Node              |            |               |     ✅        |            |               |
+|    PyTorch ExecuTorch    |     On-device iOS             |     ✅     |       ✅      |               |            |               |
+|           vLLM           | Hosted and Single Node        |            |       ✅      |               |            |               |
+|          Weaviate        |         Hosted                |            |               |     ✅        |            |               |
+|          FAISS           |      Single Node              |            |               |     ✅        |            |               |
+|    Postgres (PGVector)   | Hosted and Single Node        |            |               |     ✅        |            |               |
+|    Code Scanner          |      Single Node              |            |               |               |     ✅     |               |
+|    Prompt Guard          |      Single Node              |            |               |               |     ✅     |               |
+|    Llama Guard           | Depends on Inference Provider |            |               |               |     ✅     |               |
 
 
 ```{toctree}