ritellm.completion(model, messages, temperature=None, max_tokens=None, base_url=None, stream=False, additional_params=None)
Clean Python wrapper around the completion_gateway function.
This function provides a convenient interface to call various LLM providers' chat completion APIs through the Rust-backed completion_gateway binding. The model string should include a provider prefix.
Example
from ritellm import completion

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Non-streaming
response = completion(model="openai/gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])

# Streaming
response = completion(model="openai/gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk["choices"][0]["delta"].get("content", ""), end="")
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | str | The model to use, with provider prefix (e.g., "openai/gpt-4", "openai/gpt-3.5-turbo") | required |
messages | list | A list of message dictionaries with "role" and "content" keys | required |
temperature | float | Sampling temperature (0.0 to 2.0) | None |
max_tokens | int | Maximum tokens to generate | None |
base_url | str | Base URL for the API endpoint | None |
stream | bool | Enable streaming responses | False |
additional_params | str | Additional parameters as a JSON string | None |
Returns:
Type | Description |
---|---|
dict[str, Any] \| Iterator[dict[str, Any]] | A dictionary containing the API response, or an iterator of response chunks if stream=True |
Raises:
Type | Description |
---|---|
ValueError | If the provider prefix is not supported |
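For example, an unsupported prefix can be caught at the call site. A minimal sketch, assuming only the documented ValueError behavior (the provider name below is deliberately bogus and the printed message is illustrative):

from ritellm import completion

try:
    completion(
        model="not-a-provider/some-model",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except ValueError as exc:
    # The prefix before "/" selects the provider; unsupported prefixes raise ValueError
    print(f"Unsupported provider: {exc}")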
Environment Variables
OPENAI_API_KEY: Required for OpenAI models
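Putting the pieces together, the sketch below sets the API key from the environment and passes extra request fields through additional_params as a JSON string. Only the JSON-string format of additional_params is documented above; which fields the gateway forwards (top_p and stop are standard OpenAI chat-completion fields) is an assumption here.

import json
import os

from ritellm import completion

# Required for openai/ models; "sk-..." is a placeholder
os.environ.setdefault("OPENAI_API_KEY", "sk-...")

response = completion(
    model="openai/gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize RFC 2119 in one sentence."}],
    temperature=0.2,
    max_tokens=64,
    # Assumed: extra OpenAI chat-completion fields, serialized as a JSON string
    additional_params=json.dumps({"top_p": 0.9, "stop": ["\n\n"]}),
)
print(response["choices"][0]["message"]["content"])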
ritellm.acompletion(model, messages, temperature=None, max_tokens=None, base_url=None, stream=False, additional_params=None)
async
Async Python wrapper around the async_completion_gateway function.
This function provides an async interface to call various LLM providers' chat completion APIs through the Rust-backed async_completion_gateway binding. The model string should include a provider prefix.
Example
import asyncio

from ritellm import acompletion

async def main():
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]

    # Non-streaming
    response = await acompletion(model="openai/gpt-3.5-turbo", messages=messages)
    print(response["choices"][0]["message"]["content"])

    # Streaming
    response = await acompletion(model="openai/gpt-3.5-turbo", messages=messages, stream=True)
    for chunk in response:
        print(chunk["choices"][0]["delta"].get("content", ""), end="")

asyncio.run(main())
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | str | The model to use, with provider prefix (e.g., "openai/gpt-4", "openai/gpt-3.5-turbo") | required |
messages | list | A list of message dictionaries with "role" and "content" keys | required |
temperature | float | Sampling temperature (0.0 to 2.0) | None |
max_tokens | int | Maximum tokens to generate | None |
base_url | str | Base URL for the API endpoint | None |
stream | bool | Enable streaming responses | False |
additional_params | str | Additional parameters as a JSON string | None |
Returns:
Type | Description |
---|---|
dict[str, Any] \| Iterator[dict[str, Any]] | A dictionary containing the API response, or an iterator of response chunks if stream=True |
Raises:
Type | Description |
---|---|
ValueError | If the provider prefix is not supported |
Environment Variables
OPENAI_API_KEY: Required for OpenAI models
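Because acompletion is a coroutine, several requests can run concurrently with standard asyncio tooling. A minimal sketch using asyncio.gather, assuming nothing beyond the documented signature:

import asyncio

from ritellm import acompletion

async def main():
    prompts = ["What is Rust?", "What is Python?"]
    tasks = [
        acompletion(
            model="openai/gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            max_tokens=64,
        )
        for prompt in prompts
    ]
    # Await both requests concurrently and print each answer
    responses = await asyncio.gather(*tasks)
    for response in responses:
        print(response["choices"][0]["message"]["content"])

asyncio.run(main())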