OpenAI-Compatible API
ClawDesk exposes OpenAI-compatible endpoints, allowing it to serve as a drop-in replacement for the OpenAI API. Any client library or tool that supports the OpenAI API can connect to ClawDesk.
Base URL
http://localhost:1420/v1
Point any OpenAI SDK at ClawDesk by setting the base URL:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1420/v1",
    api_key="any-value",  # ClawDesk uses its own auth
)
```
Chat Completions
POST /v1/chat/completions
Create a chat completion. Supports both synchronous and streaming responses.
Request Body:
```json
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Explain quantum computing in simple terms." }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
```
Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (mapped to a provider) |
| messages | array | Yes | Conversation messages |
| temperature | float | No | Sampling temperature (0.0–2.0) |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Enable SSE streaming |
| top_p | float | No | Nucleus sampling |
| stop | string/array | No | Stop sequences |
| tools | array | No | Function-calling tool definitions |
Non-Streaming Response
Response 200 OK:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1708167000,
  "model": "claude-sonnet-4-20250514",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits (qubits)..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
```
Streaming Response
When `stream` is set to `true`, the response is delivered as Server-Sent Events (SSE):

```text
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708167000,"model":"claude-sonnet-4-20250514","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708167000,"model":"claude-sonnet-4-20250514","choices":[{"index":0,"delta":{"content":"Quantum"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1708167000,"model":"claude-sonnet-4-20250514","choices":[{"index":0,"delta":{"content":" computing"},"finish_reason":null}]}

data: [DONE]
```
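On the client side, the streamed chunks are reassembled by concatenating the `delta.content` fields until the `[DONE]` sentinel. A minimal sketch, independent of any SDK:

```python
import json


def accumulate_sse(lines):
    """Reassemble assistant text from chat.completion.chunk SSE lines.

    Skips non-data lines, stops at the [DONE] sentinel, and ignores
    deltas that carry no content (e.g. the initial role-only delta).
    """
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)
```

Fed the example stream above, this yields the string `Quantum computing`.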
Function Calling
```json
{
  "model": "claude-sonnet-4-20250514",
  "messages": [
    { "role": "user", "content": "What's the weather in London?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```
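When the model decides to call a tool, the assistant message carries a `tool_calls` array in the standard OpenAI shape, and the client is responsible for executing the function and sending the result back as a `tool`-role message. A minimal dispatch sketch (the `get_weather` implementation and its return value here are purely illustrative):

```python
import json


def run_tool_calls(tool_calls, registry):
    """Execute each requested tool and build the follow-up 'tool' messages.

    `registry` maps function names to local Python callables; arguments
    arrive as a JSON string and results are sent back JSON-encoded.
    """
    messages = []
    for call in tool_calls:
        fn = registry[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

The returned messages are appended to the conversation and sent in a second `/v1/chat/completions` request so the model can produce its final answer.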
Models
GET /v1/models
List all available models in OpenAI-compatible format.
Response 200 OK:
```json
{
  "object": "list",
  "data": [
    {
      "id": "claude-sonnet-4-20250514",
      "object": "model",
      "created": 1708167000,
      "owned_by": "anthropic"
    },
    {
      "id": "gpt-4o",
      "object": "model",
      "created": 1708167000,
      "owned_by": "openai"
    },
    {
      "id": "llama3.1:70b",
      "object": "model",
      "created": 1708167000,
      "owned_by": "ollama"
    }
  ]
}
```
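Since each entry names its owning provider in `owned_by`, a client can group the available model ids by provider. A small sketch:

```python
def models_by_provider(payload):
    """Group model ids from a /v1/models response by their owning provider."""
    grouped = {}
    for model in payload["data"]:
        grouped.setdefault(model["owned_by"], []).append(model["id"])
    return grouped
```

Applied to the response above, this returns one list per provider, e.g. `{"anthropic": [...], "openai": [...], "ollama": [...]}`.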
Responses API
POST /v1/responses
Create a response using the Responses API format.
Request Body:
```json
{
  "model": "claude-sonnet-4-20250514",
  "input": "Write a haiku about Rust programming.",
  "instructions": "You are a creative poet.",
  "temperature": 0.9
}
```
Response 200 OK:
```json
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": "2026-02-17T10:30:00Z",
  "model": "claude-sonnet-4-20250514",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Ownership borrowed\nLifetimes guard the memory\nSafe concurrency"
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 18,
    "output_tokens": 15
  }
}
```
GET /v1/responses/:id
Retrieve a previously created response.
Response 200 OK: Same format as creation response.
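Because `output` is an array of typed items rather than a single string, extracting the generated text takes a short walk over the structure. A minimal sketch:

```python
def response_text(resp):
    """Concatenate all output_text parts from a /v1/responses payload."""
    parts = []
    for item in resp.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part["text"])
    return "".join(parts)
```

Applied to the creation response above, this returns the haiku text.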
Model Mapping
ClawDesk maps OpenAI model names to the configured providers:
| Requested Model | Provider | Actual Model |
|---|---|---|
| claude-* | Anthropic | Direct pass-through |
| gpt-* | OpenAI | Direct pass-through |
| gemini-* | Google | Direct pass-through |
| llama*, mixtral* | Ollama | Local inference |
| Any unknown | Fallback chain | Best available match |
If a requested model is not available, ClawDesk's fallback system automatically routes to the next capable provider in the chain.
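The mapping above amounts to prefix matching with a fallback. As a hypothetical sketch (not ClawDesk's actual routing code; the provider names are taken from the table, with Google inferred from the gemini-* prefix):

```python
# Illustrative routing table: (accepted model-id prefixes, provider name).
ROUTES = [
    (("claude-",), "anthropic"),
    (("gpt-",), "openai"),
    (("gemini-",), "google"),
    (("llama", "mixtral"), "ollama"),
]


def route_model(model_id):
    """Map a requested model id to a provider, or to the fallback chain."""
    for prefixes, provider in ROUTES:
        # str.startswith accepts a tuple, so multi-prefix rows work directly.
        if model_id.startswith(prefixes):
            return provider
    return "fallback"
```

Order matters in such a table: more specific prefixes should be listed before broader ones, and anything unmatched falls through to the fallback chain.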
Client Examples
Python (openai SDK)
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1420/v1", api_key="unused")

response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
curl
```bash
curl http://localhost:1420/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
TypeScript (openai SDK)
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:1420/v1",
  apiKey: "unused",
});

const completion = await client.chat.completions.create({
  model: "claude-sonnet-4-20250514",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(completion.choices[0].message.content);
```