REST API Reference
OpenAI-compatible API for Thox.ai devices
Thox.ai devices expose an OpenAI-compatible REST API for inference, plus additional endpoints for cluster management and device control. All endpoints support JSON request/response bodies.
Port 8080 (Inference API), Port 5381 (Cluster API)
Authentication
All API requests require authentication via an API key passed in the Authorization header.
| Header | Value | Required | Description |
|---|---|---|---|
| Authorization | Bearer <api-key> | Yes | API key for authentication |
| Content-Type | application/json | Yes | Request body format |
| X-Request-ID | <uuid> | No | Optional request tracking ID |
# Example request with authentication
curl -X POST http://thox.local:8080/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b", "messages": [...]}'
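For readers scripting against the API, the same authenticated request can be assembled in Python. This is a minimal sketch using only the standard library. The helper names `build_headers` and `build_request` are illustrative and not part of any Thox.ai SDK, and the single-message payload is a placeholder of our own. The request is only constructed here, not sent.

```python
import json
import uuid
import urllib.request


def build_headers(api_key, request_id=None):
    """Return the headers from the table above (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # X-Request-ID is optional; a random UUID is generated if none is given.
        "X-Request-ID": request_id or str(uuid.uuid4()),
    }


def build_request(url, payload, api_key, request_id=None):
    """Build (but do not send) an authenticated JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers=build_headers(api_key, request_id),
        method="POST",
    )


request = build_request(
    "http://thox.local:8080/v1/chat/completions",
    {"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]},
    api_key="your-api-key",
)
# To send it, pass the request to urllib.request.urlopen() and decode the
# JSON response body with json.load().
```

Keeping construction separate from sending makes the headers easy to inspect or log before any traffic reaches the device.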
Inference API (Port 8080)
OpenAI-compatible endpoints for running AI inference. Use these endpoints to generate completions, chat responses, and embeddings.
/v1/chat/completions
Create a chat completion with conversation history
Request body:
{
  "model": "llama-3.1-8b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024
}

/v1/completions
Create a text completion from a prompt
Request body:
{
  "model": "llama-3.1-8b",
  "prompt": "The quick brown fox",
  "max_tokens": 100
}

/v1/embeddings
Generate vector embeddings for text
Request body:
{
  "model": "nomic-embed-text",
  "input": "Hello, world!"
}

/v1/models
List available models on the device
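The request bodies in this section can also be built programmatically. The sketch below is a minimal illustration, not an official client. The function names are hypothetical, and the response layout read by `extract_reply` (`choices[0].message.content`) is an assumption based on the API being described as OpenAI-compatible, since this page does not show an inference response.

```python
def chat_body(model, user_text, system_text=None, temperature=0.7, max_tokens=1024):
    """Build a /v1/chat/completions request body (hypothetical helper)."""
    messages = []
    if system_text is not None:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def embedding_body(model, text):
    """Build a /v1/embeddings request body (hypothetical helper)."""
    return {"model": model, "input": text}


def extract_reply(response):
    """Pull the assistant text out of a chat completion response.

    Assumes the OpenAI response layout; verify against a real device response.
    """
    return response["choices"][0]["message"]["content"]


body = chat_body("llama-3.1-8b", "Hello!", system_text="You are a helpful assistant.")

# Hand-written stand-in for a device response, following the OpenAI layout.
sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]
}
print(extract_reply(sample_response))
```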
Cluster API (Port 5381)
MagStack™ cluster management endpoints for monitoring topology, health, and power contracts.
/v1/topology
Get the current cluster topology and node connections
Example response:
{
  "cluster_id": "ms-cluster-abc123",
  "node_count": 4,
  "leader": "node-001",
  "nodes": [
    {"id": "node-001", "role": "leader", "ram_gb": 16},
    {"id": "node-002", "role": "follower", "ram_gb": 16}
  ]
}

/v1/cluster
Get overall cluster status and capabilities
Example response:
{
  "status": "healthy",
  "total_ram_gb": 64,
  "total_tops": 400,
  "active_model": "thox-cluster-nano"
}

/v1/nodes
List all nodes with detailed status
/v1/power/contract
Negotiate power contract with connected modules
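As an illustration of consuming the cluster endpoints, the sketch below summarizes the /v1/topology example response shown earlier in this section. It relies only on the fields that appear in that example. The function name `summarize_topology` and the shape of its result are our own choices, not part of the API.

```python
import json

# Example /v1/topology response copied from this page.
TOPOLOGY_JSON = """
{
  "cluster_id": "ms-cluster-abc123",
  "node_count": 4,
  "leader": "node-001",
  "nodes": [
    {"id": "node-001", "role": "leader", "ram_gb": 16},
    {"id": "node-002", "role": "follower", "ram_gb": 16}
  ]
}
"""


def summarize_topology(topology):
    """Summarize the nodes actually listed in a topology response."""
    nodes = topology.get("nodes", [])
    followers = [n["id"] for n in nodes if n.get("role") == "follower"]
    return {
        "leader": topology.get("leader"),
        "followers": followers,
        "listed_ram_gb": sum(n.get("ram_gb", 0) for n in nodes),
        # node_count may exceed len(nodes) when the list is abbreviated,
        # as it is in the documentation example.
        "unlisted_nodes": topology.get("node_count", len(nodes)) - len(nodes),
    }


summary = summarize_topology(json.loads(TOPOLOGY_JSON))
print(summary)
```

Using `.get()` with defaults keeps the summary tolerant of fields that a given firmware version might omit.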
/v1/health
Get health status of the cluster
Example response:
{
  "status": "ok",
  "uptime_seconds": 86400,
  "temperature_c": 42
}

Model Management API
Endpoints for loading, unloading, and managing AI models on your Thox.ai device or cluster.
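The endpoints listed below could be chained to swap the active model. The following is a rough sketch under stated assumptions: the pull, unload, load ordering is our own guess at a sensible workflow, the function name `switch_model_plan` is hypothetical, and this page does not state the HTTP method or port for these endpoints, so the sketch only prepares paths and request bodies.

```python
def switch_model_plan(current_model, new_model, context_length=None):
    """Return an ordered list of (path, body) steps for swapping models.

    The ordering is an assumption, not documented behavior.
    """
    steps = [("/v1/models/pull", {"model": new_model})]
    if current_model is not None:
        steps.append(("/v1/models/unload", {"model": current_model}))
    load_body = {"model": new_model}
    if context_length is not None:
        # "options" is included only when a context length is requested.
        load_body["options"] = {"context_length": context_length}
    steps.append(("/v1/models/load", load_body))
    return steps


for path, body in switch_model_plan(
    "thox-cluster-nano", "thox-pro-medical-32b", context_length=32768
):
    print(path, body)
```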
/v1/models/load
Load a model into cluster memory
Request body:
{
  "model": "thox-cluster-nano",
  "options": {
    "context_length": 32768
  }
}

/v1/models/unload
Unload a model from cluster memory
Request body:
{
  "model": "thox-cluster-nano"
}

/v1/models/pull
Download a model from the Thox.ai model registry
Request body:
{
  "model": "thox-pro-medical-32b"
}

Error Codes
| Code | Name | Description |
|---|---|---|
| 400 | Bad Request | Invalid request parameters or body |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Forbidden | API key lacks required permissions |
| 404 | Not Found | Endpoint or model not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Error | Server error - contact support |
| 503 | Service Unavailable | Cluster temporarily unavailable |
Error response format:
{
  "error": {
    "code": 401,
    "message": "Invalid API key",
    "type": "authentication_error"
  }
}

SDK Libraries
Official SDK libraries are available for easier integration.
Rate Limits
Default rate limits are 100 requests per minute for inference endpoints and 1000 requests per minute for cluster management endpoints. Contact support if you need higher limits for your deployment.
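A client that may hit these limits can back off and retry. The sketch below is one possible approach, not documented device behavior: retrying on 429 and 503 is a common client-side convention, the helper names are hypothetical, and `send` stands for any caller-supplied function that performs the HTTP request. It also shows how the documented error response format might be unpacked.

```python
import random
import time

RETRYABLE_CODES = {429, 503}  # rate limited / cluster temporarily unavailable


def parse_error(body):
    """Extract (code, type, message) from the documented error format."""
    err = body.get("error", {})
    return err.get("code"), err.get("type"), err.get("message")


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or a non-retryable error occurs.

    send is any zero-argument callable returning (status_code, json_body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        code, err_type, message = parse_error(body)
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_CODES or last_attempt:
            raise RuntimeError(f"API error {status} ({err_type}): {message}")
        # Exponential backoff with a little jitter: roughly 1s, 2s, 4s, ...
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Passing `sleep` as a parameter lets tests substitute a recorder instead of actually waiting.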
© 2026 Thox.ai LLC. All Rights Reserved.