
# Chat Completions

Create a chat completion by sending a conversation (a list of messages) to a model. The API is fully compatible with the OpenAI Chat Completions format, so you can use existing OpenAI SDKs and tools by pointing them at the DOS AI base URL and supplying a DOS AI API key.

## Endpoint

```
POST https://api.dos.ai/v1/chat/completions
```

## Authentication

Include your API key in the `Authorization` header using the Bearer scheme:

```
Authorization: Bearer dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

API keys can be created and managed from the [dashboard](https://app.dos.ai).

## Request Headers

| Header          | Required | Description           |
| --------------- | -------- | --------------------- |
| `Authorization` | Yes      | `Bearer YOUR_API_KEY` |
| `Content-Type`  | Yes      | `application/json`    |
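
The two required headers can be assembled in one place so that the key never appears in source code. The following is a minimal sketch, not an official client: `build_headers` is a hypothetical helper and `DOS_API_KEY` is an assumed environment variable name that this documentation does not define.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return the two required request headers for the given API key."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",  # Bearer scheme, as described above
        "Content-Type": "application/json",
    }


# DOS_API_KEY is an assumed variable name; the fallback is a placeholder key.
headers = build_headers(os.environ.get("DOS_API_KEY", "dos_sk_xxxxxxxx"))
print(sorted(headers))
```

Reading the key from the environment is a design choice rather than an API requirement; any secret store works as long as the header value has the form `Bearer <key>`.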

## Request Body

| Parameter           | Type             | Required | Default       | Description                                                                                                        |
| ------------------- | ---------------- | -------- | ------------- | ------------------------------------------------------------------------------------------------------------------ |
| `model`             | string           | Yes      | -             | The model ID to use. See [Available Models](/models/available-models.md).                                          |
| `messages`          | array            | Yes      | -             | A list of messages comprising the conversation. See [Message Format](#message-format).                             |
| `temperature`       | number           | No       | 1.0           | Sampling temperature between 0 and 2. Lower values produce more focused output; higher values increase randomness. |
| `max_tokens`        | integer          | No       | Model default | Maximum number of tokens to generate in the response.                                                              |
| `top_p`             | number           | No       | 1.0           | Nucleus sampling parameter. Only tokens within the top `top_p` probability mass are considered.                    |
| `stream`            | boolean          | No       | false         | If `true`, the response is streamed back as Server-Sent Events (SSE).                                              |
| `stop`              | string or array  | No       | null          | Up to 4 sequences where the model will stop generating further tokens.                                             |
| `tools`             | array            | No       | null          | A list of tool (function) definitions the model may call. See [Tool Calling](#tool-calling).                       |
| `tool_choice`       | string or object | No       | "auto"        | Controls which tool the model calls: `"auto"`, `"none"`, or a specific function.                                   |
| `response_format`   | object           | No       | null          | Force a specific output format. Use `{"type": "json_object"}` for JSON mode.                                       |
| `frequency_penalty` | number           | No       | 0             | Penalizes new tokens based on their frequency in the text so far (-2.0 to 2.0).                                    |
| `presence_penalty`  | number           | No       | 0             | Penalizes new tokens based on whether they appear in the text so far (-2.0 to 2.0).                                |
| `n`                 | integer          | No       | 1             | Number of completions to generate for each prompt.                                                                 |
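
Checking a request body locally against the table above can catch mistakes before they come back as a `400`. The sketch below is illustrative only: `validate_params` is a hypothetical helper, and it assumes the documented ranges are inclusive. The server remains the authority on what is accepted.

```python
def validate_params(params: dict) -> list:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    for required in ("model", "messages"):
        if required not in params:
            problems.append(f"missing required parameter: {required}")

    # Numeric ranges taken from the table above (assumed inclusive).
    ranges = {
        "temperature": (0.0, 2.0),
        "frequency_penalty": (-2.0, 2.0),
        "presence_penalty": (-2.0, 2.0),
    }
    for name, (low, high) in ranges.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")

    # `stop` is a string or an array of up to 4 sequences.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")
    return problems


body = {
    "model": "dos-ai",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
}
print(validate_params(body))
```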

### Message Format

Each message in the `messages` array is an object with the following fields:

| Field          | Type           | Required | Description                                                                                     |
| -------------- | -------------- | -------- | ----------------------------------------------------------------------------------------------- |
| `role`         | string         | Yes      | One of `system`, `user`, `assistant`, or `tool`.                                                |
| `content`      | string or null | Yes      | The text content of the message. May be `null` on `assistant` messages that carry `tool_calls`. |
| `name`         | string         | No       | An optional name for the participant.                                                           |
| `tool_calls`   | array          | No       | Tool calls generated by the model (for `assistant` messages).                                   |
| `tool_call_id` | string         | No       | The ID of the tool call this message responds to (for `tool` messages).                         |

## Response Body

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711000000,
  "model": "dos-ai",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```

### Response Fields

| Field                     | Type    | Description                                                                         |
| ------------------------- | ------- | ----------------------------------------------------------------------------------- |
| `id`                      | string  | Unique identifier for the completion.                                               |
| `object`                  | string  | Always `"chat.completion"`.                                                         |
| `created`                 | integer | Unix timestamp of when the completion was created.                                  |
| `model`                   | string  | The model used for the completion.                                                  |
| `choices`                 | array   | A list of completion choices.                                                       |
| `choices[].index`         | integer | The index of this choice in the list.                                               |
| `choices[].message`       | object  | The generated message.                                                              |
| `choices[].finish_reason` | string  | Why the model stopped: `"stop"`, `"length"`, `"tool_calls"`, or `"content_filter"`. |
| `usage`                   | object  | Token usage statistics for the request.                                             |
| `usage.prompt_tokens`     | integer | Number of tokens in the input prompt.                                               |
| `usage.completion_tokens` | integer | Number of tokens in the generated response.                                         |
| `usage.total_tokens`      | integer | Total tokens (prompt + completion).                                                 |
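
To make the field layout concrete, the sketch below parses the sample response from the Response Body section and pulls out the reply text, the finish reason, and the token counts. `first_reply` is a hypothetical helper name. Treating `"length"` as a sign of truncated output is an assumption carried over from the OpenAI format.

```python
import json
from typing import Optional

# Sample response copied from the Response Body section.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711000000,
  "model": "dos-ai",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
}
"""


def first_reply(response: dict) -> tuple:
    """Return (content, finish_reason) for the first choice."""
    choice = response["choices"][0]
    content: Optional[str] = choice["message"].get("content")
    return content, choice["finish_reason"]


response = json.loads(raw)
text, reason = first_reply(response)
usage = response["usage"]

print(text)
if reason == "length":
    # Assumption: "length" indicates the output hit a token limit.
    print("warning: the reply may be truncated")
print(f"tokens used: {usage['total_tokens']}")
```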

## Streaming

When `stream: true` is set, the response is delivered as **Server-Sent Events (SSE)**. Each event contains a JSON chunk:

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":9,"total_tokens":21}}

data: [DONE]
```

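* Each `data:` line, except the closing `[DONE]` sentinel, carries a JSON chunk whose choices contain a `delta` field instead of `message`.
* Each `delta` holds only the newly generated content; concatenate the `delta.content` values in order to rebuild the full message.
* The final JSON chunk has an empty `delta`, a non-null `finish_reason`, and the `usage` statistics.
* The stream ends with `data: [DONE]`.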
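
The rules above can be sketched as a small parser that works on any iterable of text lines, such as the decoded lines of an HTTP response body. This is an illustrative sketch rather than an official client: `iter_chunks` and `collect_stream` are hypothetical names, and the code assumes a single choice per chunk (`n` = 1).

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank lines that separate events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_stream(lines: Iterable[str]) -> tuple:
    """Return (full_text, finish_reason, usage) for a single-choice stream."""
    parts = []
    finish_reason = None
    usage = None
    for chunk in iter_chunks(lines):
        usage = chunk.get("usage") or usage
        for choice in chunk.get("choices", []):
            # The last chunk has an empty delta, so content may be absent.
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason, usage


# The sample stream shown above, one string per line.
sample = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":12,"completion_tokens":9,"total_tokens":21}}',
    "",
    "data: [DONE]",
]

text, finish_reason, usage = collect_stream(sample)
print(text, finish_reason, usage["total_tokens"])
```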

## Tool Calling

You can provide tool (function) definitions that the model can choose to call. This enables agentic workflows where the model can request external actions.

### Defining Tools

```json
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "City name, e.g. San Francisco, CA"
            }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```

### Tool Call Response

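When the model decides to call a tool, the response has `finish_reason: "tool_calls"`, `content` is `null`, and the message carries the call details in `tool_calls`. Note that `function.arguments` is a JSON-encoded string, not an object: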

```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"San Francisco, CA\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```
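
Handling such a response means decoding each call's `arguments` string, running the matching local function, and answering with a `tool` message that echoes the call's `id` in `tool_call_id`. The sketch below shows one way to do that. `get_weather` here is a hypothetical local stub that returns canned data, and `TOOLS` and `run_tool_calls` are illustrative names. Sending the results back in a follow-up request is assumed to work as in the OpenAI format.

```python
import json


def get_weather(location: str) -> dict:
    """Hypothetical local implementation; returns canned data for illustration."""
    return {"location": location, "forecast": "sunny"}


# Map tool names from the request's `tools` list to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool call and build the matching `tool` messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = TOOLS[call["function"]["name"]]
        # `arguments` is a JSON-encoded string and must be decoded first.
        arguments = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(function(**arguments)),
        })
    return results


# The assistant message from the sample response above.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": '{"location": "San Francisco, CA"}',
            },
        }
    ],
}

tool_messages = run_tool_calls(assistant_message)
print(tool_messages)
```

Under that assumption, the next request would append `assistant_message` and then `tool_messages` to the original `messages` list so the model can use the tool output in its final answer.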

## Error Responses

| Status Code | Error                 | Description                                                      |
| ----------- | --------------------- | ---------------------------------------------------------------- |
| 400         | Bad Request           | Invalid request parameters.                                      |
| 401         | Unauthorized          | Invalid or missing API key.                                      |
| 402         | Payment Required      | Insufficient credits.                                            |
| 429         | Too Many Requests     | Rate limit exceeded. See [Rate Limits](/support/rate-limits.md). |
| 500         | Internal Server Error | Unexpected server error.                                         |
| 503         | Service Unavailable   | Model temporarily unavailable.                                   |

See [Error Codes](/support/error-codes.md) for detailed troubleshooting.
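
A client usually needs to decide which of these statuses are worth retrying. The sketch below is one possible policy, not documented behavior: it assumes that `429`, `500`, and `503` are transient and that `400`, `401`, and `402` need a change on the caller's side first. `should_retry` and `backoff_seconds` are hypothetical helper names.

```python
# Assumption: these statuses describe transient conditions worth retrying.
RETRYABLE_STATUSES = {429, 500, 503}


def should_retry(status_code: int, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if a request that failed on `attempt` (1-based) should be retried."""
    return status_code in RETRYABLE_STATUSES and attempt < max_attempts


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff without jitter: base * 2**(attempt - 1), capped at `cap`."""
    return min(cap, base * 2 ** (attempt - 1))


for attempt in (1, 2, 3):
    print(attempt, should_retry(429, attempt), backoff_seconds(attempt))
```

Production clients often add random jitter to the delay so that many clients do not retry at the same moment. It is left out here to keep the sketch deterministic.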

## Examples

### Basic Chat Completion (cURL)

```bash
curl https://api.dos.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "model": "dos-ai",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
```

### Streaming (cURL)

```bash
curl https://api.dos.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -N \
  -d '{
    "model": "dos-ai",
    "messages": [
      {"role": "user", "content": "Write a haiku about programming."}
    ],
    "stream": true
  }'
```

### Using the OpenAI Python SDK

Because DOS AI is OpenAI-compatible, you can use the official OpenAI SDK by setting `base_url` to the DOS AI endpoint and passing your DOS AI API key:

```python
from openai import OpenAI

client = OpenAI(
    api_key="dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    base_url="https://api.dos.ai/v1"
)

response = client.chat.completions.create(
    model="dos-ai",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    temperature=0.7,
    max_tokens=512
)

print(response.choices[0].message.content)
```

### Using the OpenAI Node.js SDK

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  baseURL: "https://api.dos.ai/v1",
});

const response = await client.chat.completions.create({
  model: "dos-ai",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in simple terms." },
  ],
  temperature: 0.7,
  max_tokens: 512,
});

console.log(response.choices[0].message.content);
```

### JSON Mode

```bash
curl https://api.dos.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "model": "dos-ai",
    "messages": [
      {"role": "system", "content": "Respond in JSON format."},
      {"role": "user", "content": "List 3 programming languages with their year of creation."}
    ],
    "response_format": {"type": "json_object"}
  }'
```
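
In JSON mode the reply still arrives as a string in `message.content`, so the caller has to decode it. The following sketch shows that step on a hypothetical response; the actual reply text depends on the model. `parse_json_reply` is an illustrative helper, and the code assumes that the content can be `null` or cut off, for example when generation stops early.

```python
import json


def parse_json_reply(response: dict) -> dict:
    """Decode the first choice's content as JSON, raising ValueError on failure."""
    content = response["choices"][0]["message"]["content"]
    try:
        return json.loads(content)
    except (TypeError, json.JSONDecodeError) as exc:
        # TypeError covers content being None; JSONDecodeError covers malformed text.
        raise ValueError(f"reply was not valid JSON: {exc}") from exc


# Hypothetical response, for illustration only.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": '{"languages": [{"name": "Python", "year": 1991}]}',
            },
            "finish_reason": "stop",
        }
    ]
}

data = parse_json_reply(sample)
print(data["languages"][0]["name"])
```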

### Tool Calling Example

```bash
curl https://api.dos.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dos_sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -d '{
    "model": "dos-ai",
    "messages": [
      {"role": "user", "content": "What is the weather in Tokyo?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
```

