API Reference

Complete REST API documentation for Thox.ai devices

Thox.ai devices expose an OpenAI-compatible REST API for inference, plus additional endpoints for cluster management and device control. All endpoints support JSON request/response bodies.

Inference API: Port 8080
Cluster API: Port 5381

Authentication

All API requests require authentication via an API key passed in the Authorization header.

| Header | Value | Required | Description |
| --- | --- | --- | --- |
| Authorization | Bearer <api-key> | Yes | API key for authentication |
| Content-Type | application/json | Yes | Request body format |
| X-Request-ID | <uuid> | No | Optional request tracking ID |

# Example request with authentication
curl -X POST http://thox.local:8080/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b", "messages": [...]}'
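
For clients that do not shell out to curl, the same request can be assembled with the Python standard library. This is a minimal sketch: the helper name `build_request` is hypothetical, the host `thox.local` is taken from the curl example, and generating a UUID for `X-Request-ID` when none is supplied is a choice made here, not documented behavior.

```python
import json
import urllib.request
import uuid

# Base URL taken from the curl example; adjust for your device.
INFERENCE_BASE = "http://thox.local:8080"


def build_request(path, api_key, body=None, request_id=None):
    """Build (but do not send) an authenticated request. Hypothetical helper."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # X-Request-ID is optional; generate one if the caller gives none.
        "X-Request-ID": request_id or str(uuid.uuid4()),
    }
    data = json.dumps(body).encode("utf-8") if body is not None else None
    method = "POST" if body is not None else "GET"
    return urllib.request.Request(
        INFERENCE_BASE + path, data=data, headers=headers, method=method
    )


req = build_request(
    "/v1/chat/completions",
    "your-api-key",
    {"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]},
)
# To send it against a real device: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it keeps the header logic easy to inspect and test without a device on the network.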

Inference API (Port 8080)

OpenAI-compatible endpoints for running AI inference. Use these endpoints to generate completions, chat responses, and embeddings.

POST /v1/chat/completions

Create a chat completion with conversation history

Request body:

{
  "model": "llama-3.1-8b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}
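
The request body above can be assembled and the reply extracted as in the sketch below. This page does not show the response format, so `first_reply` assumes the OpenAI layout (`choices[0].message.content`) on the strength of the "OpenAI-compatible" description. Both helper names are hypothetical, and the sample response is hand-written rather than captured from a device.

```python
def chat_payload(model, user_text, system_text=None,
                 temperature=0.7, max_tokens=1024):
    """Assemble a request body in the shape shown above."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def first_reply(response_body):
    """Extract the assistant text, ASSUMING the OpenAI response layout."""
    return response_body["choices"][0]["message"]["content"]


payload = chat_payload("llama-3.1-8b", "Hello!", "You are a helpful assistant.")

# Hand-written sample in the OpenAI layout, not real device output.
sample = {"choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]}
reply = first_reply(sample)
```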

POST /v1/completions

Create a text completion from a prompt

Request body:

{
  "model": "llama-3.1-8b",
  "prompt": "The quick brown fox",
  "max_tokens": 100
}

POST /v1/embeddings

Generate vector embeddings for text

Request body:

{
  "model": "nomic-embed-text",
  "input": "Hello, world!"
}
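
A common use of embeddings is comparing two texts by cosine similarity. The sketch below assumes the response follows the OpenAI layout (`data[0].embedding`), which this page does not confirm. The helper names are hypothetical and the vectors are tiny hand-made stand-ins for real embeddings.

```python
import math


def extract_embedding(response_body):
    """ASSUMES the OpenAI layout: {"data": [{"embedding": [...]}]}."""
    return response_body["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


# Hand-made three-dimensional vectors; real embeddings are much longer.
resp_a = {"data": [{"embedding": [1.0, 0.0, 1.0]}]}
resp_b = {"data": [{"embedding": [1.0, 0.0, 0.0]}]}
score = cosine_similarity(extract_embedding(resp_a), extract_embedding(resp_b))
```

Scores near 1 indicate vectors pointing in nearly the same direction; scores near 0 indicate unrelated directions.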

GET /v1/models

List available models on the device

Cluster API (Port 5381)

MagStack™ cluster management endpoints for monitoring topology, health, and power contracts.

GET /v1/topology

Get the current cluster topology and node connections

Example response:

{
  "cluster_id": "ms-cluster-abc123",
  "node_count": 4,
  "leader": "node-001",
  "nodes": [
    {"id": "node-001", "role": "leader", "ram_gb": 16},
    {"id": "node-002", "role": "follower", "ram_gb": 16}
  ]
}
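
A monitoring script might sanity-check a topology response before acting on it. The sketch below parses the example response above, using only the fields it shows; `summarize_topology` is a hypothetical helper. Note that the example reports `node_count` 4 but lists two nodes, presumably abbreviated, so the check for a complete node list fails on it.

```python
import json

# The example response from above.
TOPOLOGY_JSON = """
{
  "cluster_id": "ms-cluster-abc123",
  "node_count": 4,
  "leader": "node-001",
  "nodes": [
    {"id": "node-001", "role": "leader", "ram_gb": 16},
    {"id": "node-002", "role": "follower", "ram_gb": 16}
  ]
}
"""


def summarize_topology(topology):
    """Cross-check the fields of a topology response against each other."""
    nodes = topology["nodes"]
    leaders = [n["id"] for n in nodes if n["role"] == "leader"]
    return {
        # Exactly one listed leader, matching the top-level "leader" field.
        "leader_ok": leaders == [topology["leader"]],
        "listed_nodes": len(nodes),
        "listed_ram_gb": sum(n["ram_gb"] for n in nodes),
        # False when the node list is shorter than node_count claims.
        "all_nodes_listed": len(nodes) == topology["node_count"],
    }


summary = summarize_topology(json.loads(TOPOLOGY_JSON))
```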

GET /v1/cluster

Get overall cluster status and capabilities

Example response:

{
  "status": "healthy",
  "total_ram_gb": 64,
  "total_tops": 400,
  "active_model": "thox-cluster-nano"
}

GET /v1/nodes

List all nodes with detailed status

POST /v1/power/contract

Negotiate a power contract with connected modules

GET /v1/health

Get health status of the cluster

Example response:

{
  "status": "ok",
  "uptime_seconds": 86400,
  "temperature_c": 42
}
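
A health response like the one above can be turned into a pass/fail check. In this sketch the 80 °C ceiling is an arbitrary illustrative threshold, not a Thox.ai specification, and both function names are hypothetical.

```python
def evaluate_health(health, max_temp_c=80):
    """Return a list of problems; an empty list means healthy.

    max_temp_c is an illustrative threshold, not a vendor limit.
    """
    problems = []
    if health.get("status") != "ok":
        problems.append(f"status is {health.get('status')!r}")
    temp = health.get("temperature_c")
    if temp is not None and temp > max_temp_c:
        problems.append(f"temperature {temp} C exceeds {max_temp_c} C")
    return problems


def format_uptime(seconds):
    """Render uptime_seconds as days, hours, and minutes."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    return f"{days}d {hours}h {rem // 60}m"


example = {"status": "ok", "uptime_seconds": 86400, "temperature_c": 42}
problems = evaluate_health(example)
uptime = format_uptime(example["uptime_seconds"])
```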

Model Management API

Endpoints for loading, unloading, and managing AI models on your Thox.ai device or cluster.

POST /v1/models/load

Load a model into cluster memory

Request body:

{
  "model": "thox-cluster-nano",
  "options": {
    "context_length": 32768
  }
}

POST /v1/models/unload

Unload a model from cluster memory

Request body:

{
  "model": "thox-cluster-nano"
}

POST /v1/models/pull

Download a model from the Thox.ai model registry

Request body:

{
  "model": "thox-pro-medical-32b"
}
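
Deploying a model that is not yet on the device plausibly means pulling it and then loading it. The sketch below only builds the two request bodies in that order. The ordering is an assumption (this page does not say whether a load pulls implicitly), and `deploy_steps` is a hypothetical helper. The port that serves these endpoints is not stated here, so the sketch returns paths only.

```python
def deploy_steps(model, context_length=None):
    """Return (path, body) pairs: pull the model, then load it."""
    load_body = {"model": model}
    if context_length is not None:
        # "options.context_length" mirrors the load example above.
        load_body["options"] = {"context_length": context_length}
    return [
        ("/v1/models/pull", {"model": model}),
        ("/v1/models/load", load_body),
    ]


steps = deploy_steps("thox-cluster-nano", context_length=32768)
for path, body in steps:
    print("POST", path, body)
```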

Error Codes

| Code | Name | Description |
| --- | --- | --- |
| 400 | Bad Request | Invalid request parameters or body |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | API key lacks required permissions |
| 404 | Not Found | Endpoint or model not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Error | Server error; contact support |
| 503 | Service Unavailable | Cluster temporarily unavailable |

Error response format:

{
  "error": {
    "code": 401,
    "message": "Invalid API key",
    "type": "authentication_error"
  }
}
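
Client code can map this envelope onto an exception type. The sketch below is one way to do it: `ThoxAPIError` and `parse_error` are hypothetical names, and treating 429 and 503 as retryable is a client-side judgment based on their descriptions in the table, not documented server guidance.

```python
import json


class ThoxAPIError(Exception):
    """Hypothetical exception wrapping the documented error envelope."""

    def __init__(self, code, message, error_type):
        super().__init__(f"{code} {error_type}: {message}")
        self.code = code
        self.message = message
        self.error_type = error_type

    @property
    def retryable(self):
        # Rate limiting (429) and temporary unavailability (503) may
        # succeed later; this classification is an assumption.
        return self.code in (429, 503)


def parse_error(body_text):
    """Convert an error response body into a ThoxAPIError."""
    err = json.loads(body_text)["error"]
    return ThoxAPIError(err["code"], err["message"], err["type"])


body = '{"error": {"code": 401, "message": "Invalid API key", "type": "authentication_error"}}'
error = parse_error(body)
```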

SDK Libraries

Official SDK libraries are available for easier integration.

Rate Limits

Default rate limits are 100 requests per minute for inference endpoints and 1000 requests per minute for cluster management endpoints. Contact support if you need higher limits for your deployment.
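
A client can stay under these limits by throttling itself rather than waiting for a 429. The sketch below is a sliding-window limiter. The class name is hypothetical, and whether the server counts requests in a sliding or a fixed window is not stated here, so this is a conservative client-side approximation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds (client-side only)."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._sent = deque()  # timestamps of recent requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Call once for every request actually sent."""
        self._sent.append(self.clock())


# Default limits from the paragraph above.
inference_limiter = SlidingWindowLimiter(100)
cluster_limiter = SlidingWindowLimiter(1000)
```

Before each call, sleep for `wait_time()` seconds and then call `record()`; a separate limiter per API keeps the two quotas independent.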

Thox.ai™, Thox OS™, MagStack™, and the Thox.ai logo are trademarks or registered trademarks of Thox.ai LLC in the United States and other countries.

NVIDIA, Jetson, TensorRT, and related marks are trademarks of NVIDIA Corporation. Ollama is a trademark of Ollama, Inc. All other trademarks are the property of their respective owners.

© 2026 Thox.ai LLC. All Rights Reserved.