Streaming Responses
Warning: Streaming is currently under development and will be available soon.
Specification
Request Format
# Using requests
import requests

response = requests.post(
    "https://mintii-router-500540193826.us-central1.run.app/route/mintiiv0",
    headers=headers,  # your usual request headers (API key, content type)
    json={
        "prompt": "Hi!",
        "stream": True,
        "max_tokens": 1000  # Optional
    },
    stream=True
)
# Using OpenAI library
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hi!"}],
    model="mintiiv0",
    stream=True
)
Expected Response Format
{
  "id": "resp_123",
  "chunk": "Hello",
  "model": "gemma-7b-it",
  "finish_reason": null
}
{
  "id": "resp_123",
  "chunk": " there",
  "model": "gemma-7b-it",
  "finish_reason": null
}
{
  "id": "resp_123",
  "chunk": "!",
  "model": "gemma-7b-it",
  "finish_reason": "stop"
}
Error Response Format
{
  "error": {
    "message": "Stream connection error",
    "type": "stream_error",
    "code": 500
  }
}
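If the stream cannot be established, the error arrives as a regular JSON body rather than an SSE event. A short sketch, assuming a non-2xx status carries the error object above:

# Sketch: surface a stream-setup failure before reading any events.
# Assumes a non-2xx response body carries the error object shown above.
if response.status_code != 200:
    err = response.json()["error"]
    raise RuntimeError(f"{err['type']} ({err['code']}): {err['message']}")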
Implementation Notes
- Responses will be delivered as Server-Sent Events (SSE)
- Each chunk contains a portion of the complete response
- The final chunk includes finish_reason: "stop"
- Error handling remains consistent with non-streaming responses (a full consumer sketch follows below)
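Putting these notes together, a minimal end-to-end consumer might look like the sketch below. It assumes standard SSE framing (each event delivered as a data: <json> line) and the chunk and error payloads documented above; the exact framing is unverified until streaming is released. Chunk text is accumulated until the event carrying finish_reason: "stop" arrives.

# Minimal SSE consumer sketch. Assumes standard "data: <json>" framing
# and the chunk/error payloads documented above.
import json
import requests

def stream_completion(url: str, headers: dict, prompt: str) -> str:
    response = requests.post(
        url,
        headers=headers,
        json={"prompt": prompt, "stream": True},
        stream=True,
    )
    response.raise_for_status()

    parts = []
    for line in response.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip blank keep-alive lines and SSE comments
        event = json.loads(line[len("data: "):])
        if "error" in event:
            raise RuntimeError(event["error"]["message"])
        parts.append(event["chunk"])
        if event.get("finish_reason") == "stop":
            break  # final chunk received
    return "".join(parts)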