two text embedding models (#148)
philipkiely-baseten authored Jan 12, 2024
1 parent 1655512 commit d36f038
Show file tree
Hide file tree
Showing 10 changed files with 530 additions and 0 deletions.
79 changes: 79 additions & 0 deletions all-minilm-l6-v2/README.md
@@ -0,0 +1,79 @@
# All MiniLM L6 V2 Truss

This repository packages [All MiniLM L6 V2](https://huggingface.co/sentence-transformers/all-miniLM-L6-v2) as a [Truss](https://truss.baseten.co).


## Deploying All MiniLM L6 V2

First, clone this repository:

```sh
git clone https://github.com/basetenlabs/truss-examples/
cd all-minilm-l6-v2
```

Before deployment:

1. Make sure you have a [Baseten account](https://app.baseten.co/signup) and [API key](https://app.baseten.co/settings/account/api_keys).
2. Install the latest version of Truss: `pip install --upgrade truss`

With `all-minilm-l6-v2` as your working directory, you can deploy the model with:

```sh
truss push
```

Paste your Baseten API key if prompted.

For more information, see [Truss documentation](https://truss.baseten.co).

## Invoking All MiniLM L6 V2

The model takes a dictionary with:

* `text`: A list of strings. Each string is encoded into a text embedding, and the embeddings are returned in the same order as the input.

Example invocation:

```sh
truss predict -d '{"text": ["I want to eat pasta", "I want to eat pizza"]}'
```

Expected response:

```python
[
  [
    0.2593194842338562,
    ...
    -1.4059709310531616
  ],
  [
    0.11028853803873062,
    ...
    -0.9492666125297546
  ]
]
```
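Each inner list in the response is one embedding vector; this model produces 384-dimensional embeddings. To compare two embeddings, cosine similarity is the usual metric. Here is a minimal sketch in plain Python, using short made-up vectors in place of real model output:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in 3-d vectors; real output from this model has 384 dimensions
pasta = [0.25, -0.10, 0.90]
pizza = [0.30, -0.05, 0.85]
print(cosine_similarity(pasta, pizza))
```

With real model output, texts about similar topics (like the two food-related prompts above) should score closer to 1.0 than unrelated texts.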

We also prepared a sample input file containing all 154 of Shakespeare's sonnets. You can create embeddings for every sonnet with:

```sh
truss predict -f sample.json > output.json
```
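The contents of `sample.json` are not shown in this diff, but assuming it follows the same `{"text": [...]}` input format, you could build a similar input file for your own corpus with Python's standard library:

```python
import json

# Hypothetical corpus; sample.json in this repo contains Shakespeare's sonnets
texts = [
    "Shall I compare thee to a summer's day?",
    "Thou art more lovely and more temperate",
]

# Write the corpus in the {"text": [...]} shape the model expects
with open("my_input.json", "w") as f:
    json.dump({"text": texts}, f, indent=2)
```

You could then run `truss predict -f my_input.json > output.json` against this file.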

## Hardware notes

For creating a few embeddings from relatively short chunks of text, a CPU-only instance with 4 cores and 16 GiB of RAM is more than sufficient. If you need to quickly create embeddings for a large corpus of text, you may want to upgrade to a larger instance type or add a small GPU like an NVIDIA T4.

Default config:

```yaml
...
resources:
cpu: "4"
memory: 16Gi
use_gpu: false
accelerator: null
...
```
15 changes: 15 additions & 0 deletions all-minilm-l6-v2/config.yaml
@@ -0,0 +1,15 @@
environment_variables: {}
external_package_dirs: []
model_metadata:
  example_model_input: {"text": ["I want to eat pasta", "I want to eat pizza"]}
model_framework: custom
model_name: all-miniLM-L6-v2
python_version: py39
requirements:
- sentence-transformers==2.2.2
resources:
  cpu: '3'
  memory: 14Gi
  use_gpu: false
secrets: {}
system_packages: []
Empty file.
12 changes: 12 additions & 0 deletions all-minilm-l6-v2/model/model.py
@@ -0,0 +1,12 @@
from sentence_transformers import SentenceTransformer


class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Downloads the model weights on first run, then initializes the model
        self._model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    def predict(self, model_input):
        # Returns one embedding vector per input string
        return self._model.encode(model_input["text"])
159 changes: 159 additions & 0 deletions all-minilm-l6-v2/sample.json


79 changes: 79 additions & 0 deletions all-mpnet-base-v2/README.md
@@ -0,0 +1,79 @@
# All MPNet Base V2 Truss

This repository packages [All MPNet Base V2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as a [Truss](https://truss.baseten.co).


## Deploying All MPNet Base V2

First, clone this repository:

```sh
git clone https://github.com/basetenlabs/truss-examples/
cd all-mpnet-base-v2
```

Before deployment:

1. Make sure you have a [Baseten account](https://app.baseten.co/signup) and [API key](https://app.baseten.co/settings/account/api_keys).
2. Install the latest version of Truss: `pip install --upgrade truss`

With `all-mpnet-base-v2` as your working directory, you can deploy the model with:

```sh
truss push
```

Paste your Baseten API key if prompted.

For more information, see [Truss documentation](https://truss.baseten.co).

## Invoking All MPNet Base V2

The model takes a dictionary with:

* `text`: A list of strings. Each string is encoded into a text embedding, and the embeddings are returned in the same order as the input.

Example invocation:

```sh
truss predict -d '{"text": ["I want to eat pasta", "I want to eat pizza"]}'
```

Expected response:

```python
[
  [
    0.2593194842338562,
    ...
    -1.4059709310531616
  ],
  [
    0.11028853803873062,
    ...
    -0.9492666125297546
  ]
]
```
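Embedding a large corpus in a single request may be slow or exceed payload limits. One common client-side pattern (an assumption here, not a documented Truss feature) is to split the input into batches and make one predict call per batch:

```python
def chunk(texts, size):
    # Yield successive batches of at most `size` strings
    for i in range(0, len(texts), size):
        yield texts[i:i + size]

corpus = [f"document {i}" for i in range(10)]
for batch in chunk(corpus, 4):
    # Each batch would be sent as {"text": batch} in its own predict call
    print(len(batch))
```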

We also prepared a sample input file containing all 154 of Shakespeare's sonnets. You can create embeddings for every sonnet with:

```sh
truss predict -f sample.json > output.json
```
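Once you have per-sonnet embeddings, you can use them for semantic search by ranking sonnets against a query embedding by cosine similarity. A toy sketch with 3-dimensional stand-in vectors (real all-mpnet-base-v2 embeddings have 768 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in vectors; in practice these would come from the model's output
sonnet_embeddings = {
    "Sonnet 18": [0.9, 0.1, 0.0],
    "Sonnet 130": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # embedding of a hypothetical search query

# Rank sonnets from most to least similar to the query
ranked = sorted(
    sonnet_embeddings,
    key=lambda name: cosine_similarity(query, sonnet_embeddings[name]),
    reverse=True,
)
print(ranked[0])
```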

## Hardware notes

For creating a few embeddings from relatively short chunks of text, a CPU-only instance with 4 cores and 16 GiB of RAM is more than sufficient. If you need to quickly create embeddings for a large corpus of text, you may want to upgrade to a larger instance type or add a small GPU like an NVIDIA T4.

Default config:

```yaml
...
resources:
cpu: "4"
memory: 16Gi
use_gpu: false
accelerator: null
...
```
15 changes: 15 additions & 0 deletions all-mpnet-base-v2/config.yaml
@@ -0,0 +1,15 @@
environment_variables: {}
external_package_dirs: []
model_metadata:
  example_model_input: {"text": ["I want to eat pasta", "I want to eat pizza"]}
model_framework: custom
model_name: all-mpnet-base-v2
python_version: py39
requirements:
- sentence-transformers==2.2.2
resources:
  cpu: '3'
  memory: 14Gi
  use_gpu: false
secrets: {}
system_packages: []
Empty file.
12 changes: 12 additions & 0 deletions all-mpnet-base-v2/model/model.py
@@ -0,0 +1,12 @@
from sentence_transformers import SentenceTransformer


class Model:
    def __init__(self, **kwargs):
        self._model = None

    def load(self):
        # Downloads the model weights on first run, then initializes the model
        self._model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

    def predict(self, model_input):
        # Returns one embedding vector per input string
        return self._model.encode(model_input["text"])