Streaming Responses

warning

Streaming is currently under development and will be available soon.

Specification

Request Format

# Using requests
import os
import requests

# Assumed bearer-token auth header; adjust to match the authentication docs
headers = {"Authorization": f"Bearer {os.environ['MINTII_API_KEY']}"}

response = requests.post(
    "https://mintii-router-500540193826.us-central1.run.app/route/mintiiv0",
    headers=headers,
    json={
        "prompt": "Hi!",
        "stream": True,  # Python boolean, serialized to JSON true
        "max_tokens": 1000  # Optional
    },
    stream=True  # tell requests to keep the connection open and stream the body
)

# Using the OpenAI library (assumes an already-configured client)
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hi!"}],
    model="mintiiv0",
    stream=True
)

Expected Response Format

{
  "id": "resp_123",
  "chunk": "Hello",
  "model": "gemma-7b-it",
  "finish_reason": null
}
{
  "id": "resp_123",
  "chunk": " there",
  "model": "gemma-7b-it",
  "finish_reason": null
}
{
  "id": "resp_123",
  "chunk": "!",
  "model": "gemma-7b-it",
  "finish_reason": "stop"
}
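
Concatenating the chunk fields in arrival order reconstructs the full completion; finish_reason stays null until the terminal event. A minimal sketch over parsed events like those above (assemble is an illustrative helper, not part of the API):

def assemble(events):
    """Join streamed events into the complete response text."""
    parts = []
    for event in events:
        parts.append(event["chunk"])
        if event["finish_reason"] == "stop":
            break  # terminal chunk reached
    return "".join(parts)

events = [
    {"id": "resp_123", "chunk": "Hello", "model": "gemma-7b-it", "finish_reason": None},
    {"id": "resp_123", "chunk": " there", "model": "gemma-7b-it", "finish_reason": None},
    {"id": "resp_123", "chunk": "!", "model": "gemma-7b-it", "finish_reason": "stop"},
]
assert assemble(events) == "Hello there!"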

Error Response Format

{
  "error": {
    "message": "Stream connection error",
    "type": "stream_error",
    "code": 500
  }
}
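
Because error events share the non-streaming error shape, a consumer can branch on the presence of an error key before reading chunk fields. A hedged sketch (StreamError and handle_event are illustrative names, and raising is just one possible policy):

import json

class StreamError(Exception):
    """Raised when the stream delivers an error event (illustrative)."""

def handle_event(payload: str) -> dict:
    """Parse one event; raise if it is an error rather than a chunk."""
    event = json.loads(payload)
    if "error" in event:
        err = event["error"]
        raise StreamError(f"{err['type']} ({err['code']}): {err['message']}")
    return event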

Implementation Notes

  • Responses will be delivered as Server-Sent Events (SSE); a consumption sketch follows this list
  • Each chunk contains a portion of the complete response
  • The final chunk includes finish_reason: "stop"
  • Error handling remains consistent with non-streaming responses
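
Putting the notes together: with the streaming requests call from the Request Format section, the SSE stream can be consumed line by line. A minimal end-to-end sketch, assuming events arrive with the conventional data: prefix and the chunk and error shapes shown above:

import json

# response is the streaming requests.post(...) call from the Request Format section
text_parts = []
for line in response.iter_lines(decode_unicode=True):
    if not line:
        continue  # SSE separates events with blank lines
    event = json.loads(line.removeprefix("data: "))  # unprefixed lines pass through unchanged
    if "error" in event:
        raise RuntimeError(event["error"]["message"])  # same shape as non-streaming errors
    text_parts.append(event["chunk"])
    if event["finish_reason"] == "stop":
        break
print("".join(text_parts))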