An SDK written in Python for the Inference Gateway.
pip install inference-gateway
from inference_gateway.client import InferenceGatewayClient, Provider
client = InferenceGatewayClient("http://localhost:8080")
# With an authentication token (optional)
client = InferenceGatewayClient("http://localhost:8080", token="your-token")
To list all available models from all providers, use the list_models method:
models = client.list_models()
print("Available models: ", models)
To list available models for a specific provider, use the list_provider_models method:
models = client.list_provider_models(Provider.OPENAI)
print("Available OpenAI models: ", models)
To generate content using a model, use the generate_content method:
from inference_gateway.client import Provider, Role, Message
messages = [
    Message(
        Role.SYSTEM,
        "You are a helpful assistant"
    ),
    Message(
        Role.USER,
        "Hello!"
    ),
]
response = client.generate_content(
    provider=Provider.OPENAI,
    model="gpt-4",
    messages=messages
)
print("Assistant: ", response["response"]["content"])
To stream content from a model, use the generate_content_stream method:
from inference_gateway.client import Provider, Role, Message
messages = [
    Message(
        Role.SYSTEM,
        "You are a helpful assistant"
    ),
    Message(
        Role.USER,
        "Hello!"
    ),
]
# Use SSE for streaming
for response in client.generate_content_stream(
    provider=Provider.OLLAMA,
    model="llama2",
    messages=messages,
    use_sse=True
):
    print("Event: ", response["event"])
    print("Assistant: ", response["data"]["content"])
# Or raw JSON response
for response in client.generate_content_stream(
    provider=Provider.GROQ,
    model="deepseek-r1",
    messages=messages,
    use_sse=False
):
    print("Assistant: ", response.content)
To check the health of the API, use the health_check method:
is_healthy = client.health_check()
print("API Status: ", "Healthy" if is_healthy else "Unhealthy")
This SDK is distributed under the MIT License; see LICENSE for more information.