Streaming allows you to receive partial responses from the API as they are generated, rather than waiting for the entire response to be completed. This can significantly reduce perceived latency and improve the user experience, especially for longer generations.

How Streaming Works

The Nusantara AI API implements streaming using Server-Sent Events (SSE) when you set the stream: true parameter in your request. When a streaming request is made:
  1. Content-Type Header: The API sets the Content-Type header to text/event-stream.
  2. Partial Data Chunks: The server sends data in small chunks. Each chunk is an event with a specific type and data payload.
  3. End of Stream Signal: The stream is terminated by a final event, such as response.completed, or by a data: [DONE] message, indicating that no further data will be sent.
This allows your client application to display or process the generated content incrementally.
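
To make the mechanics concrete, here is a minimal sketch of reading the raw SSE stream without an SDK, using Python's requests library. The data payloads and the [DONE] sentinel follow the descriptions on this page; the Authorization: Bearer header is an assumption based on how the OpenAI-compatible SDK configuration shown later sends the API key, so treat this as an illustration of the wire format rather than a drop-in client.

import json
import requests

API_KEY = "<YOUR_NUSANTARA_API_KEY>"

response = requests.post(
    "https://api.neosantara.xyz/v1/responses",
    headers={
        # Bearer auth is assumed here; it matches how the OpenAI SDK sends api_key.
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
        "Accept": "text/event-stream",
    },
    json={"model": "nusantara-base", "input": "Explain AI streaming.", "stream": True},
    stream=True,  # keep the connection open and read the body incrementally
)
response.raise_for_status()

for line in response.iter_lines(decode_unicode=True):
    # SSE frames arrive as "event: ..." and "data: ..." lines separated by blank lines.
    if not line or not line.startswith("data:"):
        continue
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    event = json.loads(payload)
    if event.get("type") == "response.output_text.delta":
        print(event["delta"], end="", flush=True)

In practice you will usually let an SDK handle this parsing for you, as shown in the examples below.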

Usage with SDKs

You can enable streaming easily by setting stream: true in your request body. Below are examples for both of our main endpoints.
  • Responses API
  • Chat Completions API
The /v1/responses endpoint uses a modern, event-driven stream. Each chunk is a typed event, allowing you to easily handle different parts of the response, such as text deltas or function call arguments.

Request Body Example

{
  "model": "nusantara-base",
  "input": "Explain the concept of AI streaming in detail.",
  "stream": true
}

Example Streaming Events

event: response.created
data: {"type":"response.created","response":{"id":"resp_abc123",...}}

event: response.output_item.added
data: {"type":"response.output_item.added","item":{"id":"msg_def456",...}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"AI "}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"streaming "}

... more delta events ...

event: response.completed
data: {"type":"response.completed","response":{...}}

Python & JavaScript Examples

from openai import OpenAI

# Initialize the client for Nusantara AI
client = OpenAI(
    api_key="<YOUR_NUSANTARA_API_KEY>", 
    base_url="https://api.neosantara.xyz/v1"
)

def stream_response(prompt: str, model: str = "nusantara-base"):
    """Makes a streaming request to the Responses API."""
    print(f"Streaming from /v1/responses for model: {model}")
    print(f"Prompt: {prompt}\n")

    try:
        stream = client.responses.create(
            model=model,
            input=prompt,
            stream=True,
            max_output_tokens=2045  # the Responses API names this max_output_tokens, not max_tokens
        )

        full_response_content = ""
        print("AI Response: ", end="", flush=True)
        for event in stream:
            # Check for the event type containing text chunks
            if event.type == 'response.output_text.delta':
                content = event.delta
                full_response_content += content
                print(content, end="", flush=True)

        print("\n\n--- Stream finished ---")
        return full_response_content

    except Exception as e:
        print(f"\nAn error occurred: {e}")
        return None

if __name__ == "__main__":
    user_prompt = "Tell me a short story about a mythical creature from Indonesian folklore."
    streamed_text = stream_response(user_prompt)
    if streamed_text:
        print(f"\nTotal streamed content length: {len(streamed_text)} characters")
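
The example above only handles text deltas. Because every chunk is a typed event, you can branch on other event types in the same loop, for example to capture streamed function call arguments or to react to the end of the stream. In the sketch below, response.output_text.delta and response.completed come from the example events shown earlier, while response.function_call_arguments.delta is an assumption based on OpenAI-style event naming; verify the exact names against the events your requests actually produce.

def stream_with_event_handling(prompt: str, model: str = "nusantara-base"):
    """Streams a response and branches on the type of each event."""
    # Reuses the `client` configured at the top of the Python example above.
    stream = client.responses.create(model=model, input=prompt, stream=True)

    for event in stream:
        if event.type == "response.output_text.delta":
            # Incremental text for the current output item.
            print(event.delta, end="", flush=True)
        elif event.type == "response.function_call_arguments.delta":
            # Streamed arguments for a tool/function call, if the model makes one
            # (event name assumed from OpenAI-style streams; not shown above).
            print(f"[tool-call args chunk] {event.delta}")
        elif event.type == "response.completed":
            # Final event; the complete response object is attached to it.
            print("\n--- response completed ---")

Handling response.completed explicitly is useful when you need the finished response object (for example, to read usage information) in addition to the incrementally printed text.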