API Reference

Complete API documentation for integrating with your Thox.ai device.

REST API

Base URL

http://thox.local:8080/v1

Chat Completions

Generate a chat completion response.

POST /chat/completions
{
  "model": "thox-coder",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
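The request above can be sent with plain HTTP. A minimal sketch using only the Python standard library (the base URL is taken from this guide; the helper names are illustrative, not part of the API):

```python
import json
import urllib.request

BASE_URL = "http://thox.local:8080/v1"  # device address from this guide

def build_chat_request(messages, model="thox-coder", temperature=0.7,
                       max_tokens=1024, stream=False):
    """Assemble the JSON body shown above."""
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }

def chat(messages, **kwargs):
    """POST the request and return the assistant's reply text."""
    body = json.dumps(build_chat_request(messages, **kwargs)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```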

Response Format

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703721600,
  "model": "thox-coder",
  "backend": "ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
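A small helper can pull the reply text, serving backend, and token usage out of a body of this shape (field names exactly as in the example above):

```python
import json

def parse_completion(body: str):
    """Return (content, backend, total_tokens) from a chat.completion body."""
    data = json.loads(body)
    choice = data["choices"][0]
    return (choice["message"]["content"],
            data.get("backend"),
            data["usage"]["total_tokens"])
```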

Available Endpoints

POST /chat/completions     Generate chat response
GET  /models               List available models
GET  /models/{model_id}    Get model details
POST /completions          Text completion (legacy)
POST /embeddings           Generate embeddings
GET  /health               API and backend health status
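For example, the installed models can be listed with the stdlib. This sketch assumes the response follows the OpenAI-style GET /models shape with a "data" array (consistent with the OpenAI compatibility described below):

```python
import json
import urllib.request

BASE_URL = "http://thox.local:8080/v1"  # device address from this guide

def model_ids(body: str):
    """Extract model ids from a GET /models response body."""
    return [m["id"] for m in json.loads(body)["data"]]

def list_models():
    """Fetch and return the ids of the models available on the device."""
    with urllib.request.urlopen(f"{BASE_URL}/models", timeout=10) as resp:
        return model_ids(resp.read().decode())
```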

Router Endpoints

The hybrid API includes smart routing between Ollama and TensorRT-LLM backends:

GET  /router/status        Router configuration and backend health

Hybrid Routing: Requests are automatically routed to the optimal backend. Models of 10B parameters or larger are served by TensorRT-LLM for 60-100% faster inference; smaller models are served by Ollama. The response includes a "backend" field indicating which engine processed the request.
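The routing rule described above can be illustrated as a one-liner (the 10B threshold comes from this section; the function name is hypothetical, not part of the API):

```python
def pick_backend(model_params_b: float) -> str:
    """Routing rule: models of 10B+ parameters go to TensorRT-LLM, else Ollama."""
    return "tensorrt" if model_params_b >= 10 else "ollama"
```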

WebSocket Streaming

For real-time streaming responses, connect via WebSocket for lower latency and token-by-token output.

Connection

ws://thox.local:8080/v1/stream

JavaScript Example

const ws = new WebSocket('ws://thox.local:8080/v1/stream');

ws.onopen = () => {
  ws.send(JSON.stringify({
    model: 'thox-coder',
    messages: [
      { role: 'user', content: 'Explain async/await' }
    ]
  }));
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  if (data.choices?.[0]?.delta?.content) {
    process.stdout.write(data.choices[0].delta.content);
  }
};

ws.onerror = (error) => console.error('WebSocket error:', error);
ws.onclose = () => console.log('Connection closed');

Stream Events

delta    Incremental content chunk
done     Stream complete; includes usage stats
error    Error occurred during generation
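The delta events can be folded back into the full reply, mirroring the JavaScript handler above. A sketch in Python, assuming each event is a JSON string with the delta shape shown earlier and that the final "done" event carries the usage object:

```python
import json

def accumulate_stream(events):
    """Fold delta events into the full reply; return (text, usage from done)."""
    parts, usage = [], None
    for raw in events:
        evt = json.loads(raw)
        choices = evt.get("choices") or [{}]
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
        if evt.get("usage"):  # assumed to arrive on the final "done" event
            usage = evt["usage"]
    return "".join(parts), usage
```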

OpenAI Compatibility

Thox.ai provides an OpenAI-compatible API, allowing you to use existing OpenAI SDKs and tools with minimal changes.

Python (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="http://thox.local:8080/v1",
    api_key="not-required"  # Local device doesn't need auth
)

response = client.chat.completions.create(
    model="thox-coder",
    messages=[
        {"role": "user", "content": "Write a quicksort in Python"}
    ]
)

print(response.choices[0].message.content)

Node.js (OpenAI SDK)

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://thox.local:8080/v1',
  apiKey: 'not-required'
});

const response = await client.chat.completions.create({
  model: 'thox-coder',
  messages: [
    { role: 'user', content: 'Write a quicksort in TypeScript' }
  ]
});

console.log(response.choices[0].message.content);

Compatibility Note: Most OpenAI API features are supported including function calling, JSON mode, and vision (with supported models). Check the model catalog for specific capabilities.

MCP Protocol

Thox.ai supports the Model Context Protocol (MCP) for seamless integration with development tools and IDEs.

MCP Server Configuration

Add Thox.ai as an MCP server in your configuration:

{
  "mcpServers": {
    "thox": {
      "url": "http://thox.local:8080/mcp",
      "transport": "http"
    }
  }
}

Available Tools

thox_complete    Generate code completions with context awareness
thox_explain     Explain code snippets or error messages
thox_refactor    Suggest refactoring improvements
thox_test        Generate unit tests for functions
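Under MCP, tools like those above are invoked with a JSON-RPC 2.0 tools/call request. A sketch of building one; the argument names passed to the tool are assumptions for illustration, not documented parameters:

```python
import json

def mcp_tool_call(name: str, arguments: dict, req_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 tools/call request body for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })
```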

CONFIDENTIAL AND PROPRIETARY INFORMATION

This documentation is provided for informational and operational purposes only. The specifications and technical details herein are subject to change without notice. Thox.ai LLC reserves all rights in the technologies, methods, and implementations described.

Nothing in this documentation shall be construed as granting any license or right to use any patent, trademark, trade secret, or other intellectual property right of Thox.ai LLC, except as expressly provided in a written agreement.

Patent Protection

The MagStack™ magnetic stacking interface technology, including the magnetic alignment system, automatic cluster formation, NFC-based device discovery, and distributed inference me...

Reverse Engineering Prohibited

You may not reverse engineer, disassemble, decompile, decode, or otherwise attempt to derive the source code, algorithms, data structures, or underlying ideas of any Thox.ai hardwa...

Thox.ai™, Thox OS™, MagStack™, and the Thox.ai logo are trademarks or registered trademarks of Thox.ai LLC in the United States and other countries.

NVIDIA, Jetson, TensorRT, and related marks are trademarks of NVIDIA Corporation. Ollama is a trademark of Ollama, Inc. All other trademarks are the property of their respective owners.

© 2026 Thox.ai LLC. All Rights Reserved.