diff --git a/README.md b/README.md
index 749db8a690..47dbd5ac3f 100644
--- a/README.md
+++ b/README.md
@@ -26,15 +26,16 @@ potential of cutting-edge AI models.
 
 ## 🔥 Hot Topics
 
 ### Framework Enhancements
+- Incorporate vLLM: [#445](https://github.com/xorbitsai/inference/pull/445)
 - Embedding model support: [#418](https://github.com/xorbitsai/inference/pull/418)
-- Custom model support: [#325](https://github.com/xorbitsai/inference/pull/325)
 - LoRA support: [#271](https://github.com/xorbitsai/inference/issues/271)
 - Multi-GPU support for PyTorch models: [#226](https://github.com/xorbitsai/inference/issues/226)
 - Xinference dashboard: [#93](https://github.com/xorbitsai/inference/issues/93)
 ### New Models
 - Built-in support for [CodeLLama](https://github.com/facebookresearch/codellama): [#414](https://github.com/xorbitsai/inference/pull/414) [#402](https://github.com/xorbitsai/inference/pull/402)
-### Tools
-- LlamaIndex plugin: [#7151](https://github.com/jerryjliu/llama_index/pull/7151)
+### Integrations
+- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
+- [Chatbox](https://chatboxai.app/): a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.
 
 ## Key Features
@@ -57,8 +58,7 @@ for seamless management and monitoring.
 allowing the seamless distribution of model inference across multiple devices or machines.
 
 🔌 **Built-in Integration with Third-Party Libraries**: Xorbits Inference seamlessly integrates
-with popular third-party libraries like [LangChain](https://python.langchain.com/docs/integrations/providers/xinference)
-and [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window).
+with popular third-party libraries including [LangChain](https://python.langchain.com/docs/integrations/providers/xinference), [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window), [Dify](https://docs.dify.ai/advanced/model-configuration/xinference), and [Chatbox](https://chatboxai.app/).
 
 ## Getting Started
 Xinference can be installed via pip from PyPI. It is highly recommended to create a new virtual
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 274e88028a..20f09c8007 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -23,14 +23,17 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 
 ## 🔥 近期热点
 
 ### 框架增强
-- 自定义模型: [#325](https://github.com/xorbitsai/inference/pull/325)
+- 引入 vLLM: [#445](https://github.com/xorbitsai/inference/pull/445)
+- Embedding 模型支持: [#418](https://github.com/xorbitsai/inference/pull/418)
 - LoRA 支持: [#271](https://github.com/xorbitsai/inference/issues/271)
 - PyTorch 模型多 GPU 支持: [#226](https://github.com/xorbitsai/inference/issues/226)
 - Xinference 仪表盘: [#93](https://github.com/xorbitsai/inference/issues/93)
 
 ### 新模型
 - 内置 [CodeLLama](https://github.com/facebookresearch/codellama): [#414](https://github.com/xorbitsai/inference/pull/414) [#402](https://github.com/xorbitsai/inference/pull/402)
 
-### 工具
-- LlamaIndex 插件: [#7151](https://github.com/jerryjliu/llama_index/pull/7151)
+### 集成
+- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
+- [Chatbox](https://chatboxai.app/): 一个支持前沿大语言模型的桌面客户端,支持 Windows,Mac,以及 Linux。
+
@@ -39,16 +42,13 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
 
 ⚡️ **前沿模型,应有尽有**:框架内置众多中英文的前沿大语言模型,包括 baichuan,chatglm2 等,一键即可体验!内置模型列表还在快速更新中!
-
 🖥 **异构硬件,快如闪电**:通过 [ggml](https://github.com/ggerganov/ggml),同时使用你的 GPU 与 CPU 进行推理,降低延迟,提高吞吐!
 
 ⚙️ **接口调用,灵活多样**:提供多种使用模型的接口,包括 RPC,RESTful API,命令行,web UI 等等。方便模型的管理与监控。
 
 🌐 **集群计算,分布协同**: 支持分布式部署,通过内置的资源调度器,让不同大小的模型按需调度到不同机器,充分使用集群资源。
 
-🔌 **开放生态,无缝对接**: 与流行的三方库无缝对接,包括 [LangChain](https://python.langchain.com/docs/integrations/providers/xinference)
-and [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window)。
-让开发者能够快速构建基于 AI 的应用。
+🔌 **开放生态,无缝对接**: 与流行的三方库无缝对接,包括 [LangChain](https://python.langchain.com/docs/integrations/providers/xinference),[LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window),[Dify](https://docs.dify.ai/advanced/model-configuration/xinference),以及 [Chatbox](https://chatboxai.app/)。
 
 ## 快速入门
 
 Xinference 可以通过 pip 从 PyPI 安装。我们非常推荐在安装前创建一个新的虚拟环境以避免依赖冲突。
diff --git a/doc/source/index.rst b/doc/source/index.rst
index f18cdf5eec..56177bae04 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -33,7 +33,9 @@ allowing the seamless distribution of model inference across multiple devices
 or machines.
 
 🔌 **Built-in Integration with Third-Party Libraries**: Xorbits Inference seamlessly integrates
 with popular third-party libraries like `LangChain <https://python.langchain.com/docs/integrations/providers/xinference>`_
-and `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_.
+, `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_
+, `Dify <https://docs.dify.ai/advanced/model-configuration/xinference>`_
+, and `Chatbox <https://chatboxai.app/>`_.
 
 🔥 Hot Topics
@@ -41,20 +43,21 @@ and `LlamaIndex <https://gpt-index.readthedocs.io/en/stable/examples/llm/XinferenceLocalDeployment.html#i-run-pip-install-xinference-all-in-a-terminal-window>`_
 
+- Incorporate vLLM: `#445 <https://github.com/xorbitsai/inference/pull/445>`_
+- Embedding model support: `#418 <https://github.com/xorbitsai/inference/pull/418>`_
 - LoRA support: `#271 <https://github.com/xorbitsai/inference/issues/271>`_
 - Multi-GPU support for PyTorch models: `#226 <https://github.com/xorbitsai/inference/issues/226>`_
 - Xinference dashboard: `#93 <https://github.com/xorbitsai/inference/issues/93>`_
 
 New Models
 ~~~~~~~~~~
-- Built-in support for `Starcoder` in GGML: `#289 <https://github.com/xorbitsai/inference/pull/289>`_
-- Built-in support for `MusicGen <https://github.com/facebookresearch/audiocraft>`_: `#313 <https://github.com/xorbitsai/inference/pull/313>`_
-- Built-in support for `SD-XL <https://github.com/Stability-AI/generative-models>`_: `318 <https://github.com/xorbitsai/inference/pull/318>`_
+- Built-in support for `CodeLLama <https://github.com/facebookresearch/codellama>`_: `#414 <https://github.com/xorbitsai/inference/pull/414>`_ `#402 <https://github.com/xorbitsai/inference/pull/402>`_
 
-
-Tools
-~~~~~
-- LlamaIndex plugin: `7151 <https://github.com/jerryjliu/llama_index/pull/7151>`_
+
+Integrations
+~~~~~~~~~~~~
+- `Dify <https://docs.dify.ai/advanced/model-configuration/xinference>`_: an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
+- `Chatbox <https://chatboxai.app/>`_: a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.
 
 
 
 License
 
diff --git a/doc/source/models/builtin/llama-2-chat.rst b/doc/source/models/builtin/llama-2-chat.rst
index 2eca8d85ef..ded31d4a41 100644
--- a/doc/source/models/builtin/llama-2-chat.rst
+++ b/doc/source/models/builtin/llama-2-chat.rst
@@ -66,7 +66,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 
 Model Spec 5 (pytorch, 13 Billion)
@@ -84,7 +84,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 6 (pytorch, 70 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -101,4 +101,4 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
\ No newline at end of file
+   4-bit quantization is not supported on macOS.
\ No newline at end of file
diff --git a/doc/source/models/builtin/llama-2.rst b/doc/source/models/builtin/llama-2.rst
index 19614bbba4..a42090890a 100644
--- a/doc/source/models/builtin/llama-2.rst
+++ b/doc/source/models/builtin/llama-2.rst
@@ -65,7 +65,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 5 (pytorch, 13 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -82,7 +82,7 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
 
 Model Spec 6 (pytorch, 70 Billion)
 ++++++++++++++++++++++++++++++++++
@@ -99,4 +99,4 @@ chosen quantization method from the options listed above::
 
 .. note::
 
-4-bit quantization is not supported on macOS.
+   4-bit quantization is not supported on macOS.
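The "Getting Started" context lines touched by this patch recommend installing Xinference from PyPI inside a fresh virtual environment to avoid dependency conflicts. A minimal sketch of that flow (the `xinference[all]` extra is taken from the LlamaIndex docs URL referenced in the diff; the environment name and the `xinference` CLI entry point are assumptions):

```shell
# Create an isolated virtual environment so Xinference's dependencies
# don't clash with other packages (the name "xinference-env" is arbitrary).
python3 -m venv xinference-env
source xinference-env/bin/activate   # on Windows: xinference-env\Scripts\activate

# Install Xinference from PyPI; the [all] extra matches the variant
# referenced in the LlamaIndex deployment guide linked above.
pip install "xinference[all]"

# Sanity check: the package is assumed to install an `xinference` console script.
xinference --help
```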