Metaphor develops several knowledge base connectors that retrieve documents and generate vector embeddings for use with Metaphor AI.
Embedding models are configured via the EmbeddingModelConfig class. Enter required and optional settings in the embedding_model dictionary in the crawler YAML file. See below for formatting examples.
The following Azure OpenAI models are known to work:
text-embedding-ada-002
text-embedding-3-small
embedding_model:
  azure_openai:
    key: <key>  # Required
    endpoint: <endpoint>  # Required
    version: <version>  # Defaults to "2024-03-01-preview" if not specified
    deployment_name: <deployment_name>  # Defaults to "Embedding_3_small" if not specified
    model: <azure_openAI_model>  # Defaults to "text-embedding-3-small" if not specified
  chunk_size: 512  # Defaults to 512
  chunk_overlap: 50  # Defaults to 50
The following OpenAI models are known to work:
text-embedding-ada-002
text-embedding-3-small
embedding_model:
  openai:
    key: <openAI_key>  # Required
    model: <openAI_model>  # Defaults to "text-embedding-3-small" if not specified
  chunk_size: 512  # Defaults to 512
  chunk_overlap: 50  # Defaults to 50
The chunk_size and chunk_overlap settings can also be configured, both to accommodate models with smaller context windows and to tune how fine-grained search results are. If omitted, they default to 512 and 50, respectively.
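To show how the two settings interact, here is a minimal sliding-window sketch. It is not Metaphor's chunking code: the function name chunk_tokens is hypothetical, and since this section does not say whether the values are measured in tokens or characters, the sketch assumes a pre-tokenized list. Each chunk holds at most chunk_size items and repeats the last chunk_overlap items of the previous chunk.

```python
from typing import List


def chunk_tokens(
    tokens: List[str], chunk_size: int = 512, chunk_overlap: int = 50
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


words = "a b c d e f g h i j".split()
print(chunk_tokens(words, chunk_size=4, chunk_overlap=1))
# [['a', 'b', 'c', 'd'], ['d', 'e', 'f', 'g'], ['g', 'h', 'i', 'j']]
```

A smaller chunk_size keeps each chunk within a model's context window and gives more fine-grained search hits, while a larger chunk_overlap reduces the chance that a passage is split across a chunk boundary, at the cost of embedding more chunks.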