FastAPI implementation of BAAI/bge-m3 encoder, containerized for scalable Kubernetes deployment.
This project provides a high-performance API for generating sentence embeddings using the BAAI/bge-m3 model. It's built with FastAPI for efficient handling of requests and containerized for easy deployment and scaling in Kubernetes environments.
- Fast and efficient sentence encoding using BAAI/bge-m3 model
- RESTful API built with FastAPI
- Docker containerization for consistent environments
- Kubernetes deployment ready
- Scalable architecture suitable for high-load environments
- Health check endpoints for Kubernetes probes
- Python 3.9+
- Docker
- Kubernetes cluster (for production deployment)
- Clone the repository:

  ```bash
  git clone git@github.com:jeroenherczeg/sentence-encoder-bge-m3.git
  cd sentence-encoder-bge-m3
  ```

- Create a virtual environment and install the dependencies:

  ```bash
  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt
  ```

- Run the FastAPI server (a sketch of what `main.py` might contain follows these steps):

  ```bash
  uvicorn main:app --reload
  ```

- Access the API at http://localhost:8000 and the interactive docs at http://localhost:8000/docs.
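For orientation, here is a minimal sketch of what `main.py` could look like. This is a hypothetical reconstruction, not the project's actual code: it assumes the model is loaded via `sentence-transformers` and that the request and response fields match the `/encode` contract documented below.

```python
# Hypothetical sketch of main.py, not the project's actual implementation.
from fastapi import FastAPI
from pydantic import BaseModel
from sentence_transformers import SentenceTransformer

app = FastAPI()
# Load the model once at startup so all requests share a single instance.
model = SentenceTransformer("BAAI/bge-m3")

class EncodeRequest(BaseModel):
    sentences: list[str]

@app.post("/encode")
def encode(request: EncodeRequest):
    # encode() returns a NumPy array; convert to lists for JSON serialization.
    embeddings = model.encode(request.sentences)
    return {"encodings": embeddings.tolist()}

@app.get("/liveness")
def liveness():
    return {"status": "alive"}

@app.get("/readiness")
def readiness():
    # Reaching this handler implies the model loaded at startup,
    # so the encoder is ready to serve traffic.
    return {"status": "ready"}
```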
- Build the Docker image:

  ```bash
  docker build -t sentence-encoder-bge-m3:latest .
  ```

- Run the container:

  ```bash
  docker run -p 8000:8000 sentence-encoder-bge-m3:latest
  ```
- Apply the Kubernetes manifests:

  ```bash
  kubectl apply -f kubernetes/
  ```

- Access the service (the method depends on your Kubernetes setup).
Endpoint: `POST /encode`

```bash
curl -X POST "http://localhost:8000/encode" \
  -H "Content-Type: application/json" \
  -d '{"sentences": ["Hello, world!", "This is a test sentence."]}'
```

Request Body:

```json
{
  "sentences": ["Hello, world!", "Another sentence to encode."]
}
```

Response:

```json
{
  "encodings": [
    [0.1, 0.2, 0.3, ...],
    [0.4, 0.5, 0.6, ...]
  ]
}
```
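The endpoint can also be called from Python. Below is a small sketch using the `requests` library; the dimension noted in the comment assumes bge-m3's default 1024-dimensional dense vectors.

```python
import requests

response = requests.post(
    "http://localhost:8000/encode",
    json={"sentences": ["Hello, world!", "Another sentence to encode."]},
)
response.raise_for_status()

encodings = response.json()["encodings"]
print(len(encodings))     # 2, one vector per input sentence
print(len(encodings[0]))  # embedding dimension (1024 for bge-m3 dense vectors)
```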
Endpoint: `GET /readiness`

```bash
curl http://localhost:8000/readiness
```

Endpoint: `GET /liveness`

```bash
curl http://localhost:8000/liveness
```
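These probe endpoints are also handy outside Kubernetes, for example in smoke tests. Here is a hypothetical helper that polls `/readiness` until the service reports ready, assuming it returns HTTP 200 once the model is loaded:

```python
import time
import requests

def wait_until_ready(base_url="http://localhost:8000", timeout=120.0):
    """Poll /readiness until the encoder responds with HTTP 200."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if requests.get(f"{base_url}/readiness", timeout=5).status_code == 200:
                return
        except requests.ConnectionError:
            pass  # server not accepting connections yet
        time.sleep(2)
    raise TimeoutError("encoder did not become ready within the timeout")
```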
This API uses the BAAI/bge-m3 model, which is a state-of-the-art sentence embedding model. It's designed to generate high-quality vector representations of sentences that capture semantic meaning, making it ideal for various natural language processing tasks such as semantic search, text classification, and similarity comparison.
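To illustrate the similarity-comparison use case, the sketch below computes cosine similarity between two embeddings returned by `/encode`. The example sentences are hypothetical, and the snippet assumes the response format shown above.

```python
import numpy as np
import requests

response = requests.post(
    "http://localhost:8000/encode",
    json={"sentences": [
        "How do I reset my password?",
        "Steps to recover a forgotten password",
    ]},
)
a, b = (np.asarray(v) for v in response.json()["encodings"])

# Cosine similarity: values near 1.0 indicate semantically similar sentences.
similarity = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {similarity:.3f}")
```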
The BAAI/bge-m3 model offers a good balance between performance and accuracy. In our testing, it processes approximately 457 sentences per second on a standard CPU. For production environments, we recommend GPU acceleration for higher throughput.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See `LICENSE` for more information.
- BAAI for the bge-m3 model
- FastAPI for the web framework
- Sentence Transformers for the embedding framework
- Docker for containerization
- Kubernetes for orchestration