
API documentation

Call your deployed models with an OpenAI-compatible POST /v1/chat/completions endpoint. Use an API key from API Keys in the console — each key routes to one instance.

Quick start

1. Deploy a model under Templates and wait until the instance is ready.
2. Use the API key created automatically when the instance became ready. If you did not save it, open API Keys, select that instance, and generate a new key (shown once).
3. Send a POST to the chat completions URL below with Authorization: Bearer … and a JSON body including model and messages.
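
The steps above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_chat_request and send are hypothetical, the key is whatever you copied from API Keys, and the https:// scheme is an assumption based on the proxy host given under Endpoint.

```python
import json
import urllib.request

# Assumption: the proxy is served over HTTPS at the host listed under Endpoint.
CHAT_URL = "https://openllmbuddy-proxy.botbuddytech.workers.dev/v1/chat/completions"


def build_chat_request(api_key, model, messages):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling send(build_chat_request(key, "qwen3.6:27b", [{"role": "user", "content": "Hello!"}])) performs the actual network call; building the request separately makes it easy to inspect the headers and body first.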

API keys & instances

Keys tie your apps to a single running instance. Manage them under API Keys or view live instances on My Instances.

Auto-generated on deploy — When an instance reaches ready, a default API key is created automatically in the background. The secret is shown only once, at creation. If you did not copy it, generate a new key under API Keys.
One key, one instance — Each API key is bound to exactly one deployment. Requests authenticated with that key are routed only to that instance’s model endpoint.
Instance ended → key revoked — When an instance stops, fails, or its pack ends, all keys for that deployment are revoked. They cannot be used again. A new deployment needs a new API key.
Extend pack → same key — Extending uptime on the same instance keeps the same deployment id. Your existing API keys continue to work—no rotation required.
Create another key — Open API Keys, choose the ready instance, name the key, and click Generate Key. You can hold multiple active keys per instance if needed.
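
Because each key is bound to exactly one instance, an application that talks to several deployments has to pick the key that matches the model it wants. One possible way to organise this on the client side is sketched below; the environment variable names and the api_key_for helper are hypothetical and not part of the service.

```python
import os

# Hypothetical mapping from model id to the environment variable that holds
# the key of the instance serving it. The variable names are illustrative only.
KEY_ENV_BY_MODEL = {
    "gemma4:26b": "OPENLLM_KEY_GEMMA",
    "qwen3.6:27b": "OPENLLM_KEY_QWEN",
}


def api_key_for(model, env=None):
    """Return the API key bound to the instance that serves `model`."""
    env = os.environ if env is None else env
    var = KEY_ENV_BY_MODEL.get(model)
    if var is None:
        raise KeyError(f"no deployment configured for model {model!r}")
    key = env.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; generate a key for that instance")
    return key
```

When an instance ends and its keys are revoked, only the corresponding environment variable needs to be updated with the key of the new deployment.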


Endpoint

Use this URL for all chat requests through the OpenLLM Buddy proxy. Your API key selects which deployment receives the traffic.

POST https://openllmbuddy-proxy.botbuddytech.workers.dev/v1/chat/completions

OpenAI SDK users can set baseURL to the same host with path /v1 (omit /chat/completions).
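
As a sketch of the SDK setup, the snippet below derives the base URL from the chat completions URL and passes it to the official OpenAI Python package, where the option is spelled base_url (the JavaScript SDK calls it baseURL). The https:// scheme and the helper names sdk_base_url and make_client are assumptions; the openai package must be installed separately.

```python
# Assumption: HTTPS scheme; host and path are taken from the endpoint above.
CHAT_URL = "https://openllmbuddy-proxy.botbuddytech.workers.dev/v1/chat/completions"


def sdk_base_url(chat_url=CHAT_URL):
    """Strip the /chat/completions suffix to get the base URL an OpenAI SDK expects."""
    suffix = "/chat/completions"
    if not chat_url.endswith(suffix):
        raise ValueError("expected a chat completions URL")
    return chat_url[: -len(suffix)]


def make_client(api_key):
    """Create an OpenAI SDK client pointed at the proxy (needs `pip install openai`)."""
    from openai import OpenAI  # third-party package, imported lazily

    return OpenAI(base_url=sdk_base_url(), api_key=api_key)
```

With a client in hand, make_client(key).chat.completions.create(model="qwen3.6:27b", messages=[{"role": "user", "content": "Hello!"}]) sends the same request as the raw HTTP call.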

Authentication

Send your secret key in the Authorization header using the Bearer scheme. Keys look like ob_sk_….

Header

http

Authorization: Bearer ob_sk_000000000000000000000000000000000000000000000001


Headers

Header	Required	Value	Notes
Authorization	Yes	Bearer <YOUR_API_KEY>	API key from Console → API Keys. Each key is tied to one deployment.
Content-Type	Yes	application/json	Request body must be JSON.

Request body

JSON object. The model field must match your deployment's model id.


Field	Type	Required	Description
model	string	Yes	Model id for your deployment (see table below). Must match the model running on that instance.
messages	array	Yes	Chat turns: [{ "role": "user" | "assistant" | "system", "content": "..." }, ...].
temperature	number	No	Sampling temperature (0–2). Default depends on the upstream runtime.
top_p	number	No	Nucleus sampling (0–1). Often used with temperature.
max_tokens	integer	No	Cap completion length when supported by the upstream server.
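
A client can check a body against the table above before sending it, which catches mistakes that would otherwise come back as a 400. The following is a sketch only: validate_body is a hypothetical helper, and it is deliberately stricter than the server may be (it assumes messages is non-empty, content is a plain string, and max_tokens is positive).

```python
ALLOWED_ROLES = {"user", "assistant", "system"}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_body(body):
    """Return a list of problems found in a request body (empty list means OK)."""
    problems = []
    model = body.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")

    messages = body.get("messages")
    if not isinstance(messages, list) or not messages:
        # Assumption: an empty messages array is treated as an error.
        problems.append("messages must be a non-empty array")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or msg.get("role") not in ALLOWED_ROLES:
                problems.append(f"messages[{i}].role must be user, assistant, or system")
            elif not isinstance(msg.get("content"), str):
                problems.append(f"messages[{i}].content must be a string")

    temperature = body.get("temperature")
    if temperature is not None and not (_is_number(temperature) and 0 <= temperature <= 2):
        problems.append("temperature must be a number between 0 and 2")

    top_p = body.get("top_p")
    if top_p is not None and not (_is_number(top_p) and 0 <= top_p <= 1):
        problems.append("top_p must be a number between 0 and 1")

    max_tokens = body.get("max_tokens")
    if max_tokens is not None and not (
        isinstance(max_tokens, int) and not isinstance(max_tokens, bool) and max_tokens > 0
    ):
        problems.append("max_tokens must be a positive integer")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.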

Example body

json

{
  "model": "qwen3.6:27b",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 1,
  "top_p": 0.95
}

Model ids

Use the model value from your deployment's model page.

Model	API model id
Gemma 4 26B A4B	gemma4:26b
Qwen3.6 27B A4B	qwen3.6:27b

Response

On success you receive a standard OpenAI-style chat completion. Read the assistant text from choices[0].message.content. Token usage is in usage when the upstream provides it.

Example response

json

{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "created": 1779715432,
  "model": "qwen3.6:27b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The assistant reply is in choices[0].message.content."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 48,
    "total_tokens": 60
  }
}
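
Reading the reply out of a decoded response can be sketched as follows. The extract_reply helper is hypothetical; it assumes the response has already been parsed from JSON into a dict, and it treats usage as optional because the upstream may not report it.

```python
def extract_reply(response):
    """Return (text, usage) from an OpenAI-style chat completion dict.

    `usage` is None when the upstream did not report token counts.
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    text = choices[0]["message"]["content"]
    return text, response.get("usage")
```

For the example response above, the first element of the returned tuple is the assistant text and the second is the usage object with its three token counts.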

Code examples

cURL

bash

curl -s https://openllmbuddy-proxy.botbuddytech.workers.dev/v1/chat/completions \
  -H 'Authorization: Bearer ob_sk_000000000000000000000000000000000000000000000001' \
  -H 'Content-Type: application/json' \
  -d '{"model":"qwen3.6:27b","messages":[{"role":"user","content":"Hello!"}]}'

Errors

HTTP	Meaning
400	Missing API key, invalid JSON body, or malformed request.
401	Invalid or revoked API key.
404	Deployment not found for this key.
409	Deployment not ready, stopped, terminated, or has no endpoint yet.
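
A client might translate these status codes into messages and decide whether waiting is worthwhile. The sketch below uses hypothetical helpers describe_error and should_retry. Treating 409 as retryable is an assumption that only holds while an instance is still starting; a stopped or terminated deployment will not recover, so any retry loop should be bounded.

```python
# Meanings copied from the error table above.
ERROR_MEANINGS = {
    400: "Missing API key, invalid JSON body, or malformed request.",
    401: "Invalid or revoked API key.",
    404: "Deployment not found for this key.",
    409: "Deployment not ready, stopped, terminated, or has no endpoint yet.",
}


def describe_error(status):
    """Return the documented meaning of an HTTP status, or a generic fallback."""
    return ERROR_MEANINGS.get(status, f"Unexpected HTTP status {status}.")


def should_retry(status):
    """Assumption: only 409 can clear up by itself (instance still starting)."""
    return status == 409
```

With urllib.request, the status is available as the code attribute of the urllib.error.HTTPError raised by a failed call.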
