AI Runtime
Learn about the AI runtime, APIs, models, and integrations available on your Thox.ai device.
OpenAI-Compatible API
API Compatibility
Thox.ai provides a fully OpenAI-compatible API endpoint, so existing OpenAI SDKs and tools work with your local device. Point your API calls at http://thox.local:8080/v1 instead of https://api.openai.com/v1.
Supported Endpoints
The following endpoints are fully supported: /v1/chat/completions, /v1/completions, /v1/models, and /v1/embeddings. The two completion endpoints support streaming responses for real-time output.
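For example, a minimal streaming request with the OpenAI Python SDK (the prompt and the placeholder api_key are illustrative; local access ignores the key):

from openai import OpenAI

client = OpenAI(base_url="http://thox.local:8080/v1", api_key="not-needed")

# Request a streamed chat completion and print tokens as they arrive.
stream = client.chat.completions.create(
    model="thox-coder",
    messages=[{"role": "user", "content": "Explain Python list comprehensions."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)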
Authentication
By default, no API key is required for local access. For remote access, generate an API key using: thox api-key generate. Use the key in your Authorization header as "Bearer YOUR_API_KEY".
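When you pass the key to the OpenAI SDK, it sets the Bearer header for you; a minimal sketch, where the hostname is a placeholder for however you reach the device remotely:

from openai import OpenAI

# The SDK sends the key as "Authorization: Bearer YOUR_API_KEY" on every request.
client = OpenAI(
    base_url="http://your-thox-host:8080/v1",  # placeholder remote address
    api_key="YOUR_API_KEY",                    # from `thox api-key generate`
)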
Example Usage
Using the OpenAI Python SDK: client = OpenAI(base_url="http://thox.local:8080/v1", api_key="optional"). Then use client.chat.completions.create() as you normally would with OpenAI.
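Putting it together, a complete non-streaming request (the prompt is illustrative):

from openai import OpenAI

client = OpenAI(base_url="http://thox.local:8080/v1", api_key="optional")

# Send a single chat request and print the model's reply.
response = client.chat.completions.create(
    model="thox-coder",
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
)
print(response.choices[0].message.content)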
Rate Limits
Local requests have no rate limits. For remote access, the default limit is 100 requests per minute, configurable via thox config set api.rate_limit <value>.
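If the device reports an exceeded limit the way OpenAI does, requests over the limit return HTTP 429 and the Python SDK raises RateLimitError; both of those details are assumptions here, as is the placeholder remote hostname. A minimal backoff sketch:

import time
from openai import OpenAI, RateLimitError

client = OpenAI(base_url="http://your-thox-host:8080/v1", api_key="YOUR_API_KEY")

def create_with_retry(messages, retries=3):
    # Retry with exponential backoff when the rate limit is hit.
    for attempt in range(retries):
        try:
            return client.chat.completions.create(model="thox-coder", messages=messages)
        except RateLimitError:
            time.sleep(2 ** attempt)
    raise RuntimeError("rate limit retries exhausted")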
Coder Models
Pre-installed Models
Your Thox.ai device ships with thox-coder pre-installed, a 7B-parameter model optimized for code completion, explanation, and refactoring. It supports 50+ programming languages.
Available Models
Additional models include: thox-coder-large (13B, more capable), thox-coder-fast (3B, faster responses), and specialized models for Python, JavaScript, Rust, and Go.
Downloading Models
List available models: thox models list --remote. Download a model: thox models pull thox-coder-large. Check download progress: thox models status.
Switching Models
Set the default model: thox config set default_model thox-coder-large. Or specify per-request using the model parameter in your API calls.
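For example, overriding the configured default for a single request:

from openai import OpenAI

client = OpenAI(base_url="http://thox.local:8080/v1", api_key="optional")

# The model parameter overrides the configured default for this request only.
response = client.chat.completions.create(
    model="thox-coder-large",
    messages=[{"role": "user", "content": "Refactor this function for readability: ..."}],
)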
Model Performance
thox-coder-fast: ~50 tokens/sec, best for quick completions.
thox-coder: ~30 tokens/sec, balanced performance.
thox-coder-large: ~15 tokens/sec, highest quality.
Custom Fine-tuning
Fine-tune models on your codebase: thox finetune --model thox-coder --data ./my-code --epochs 3. This creates a personalized model that understands your coding patterns.
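The identifier the fine-tuned model is registered under isn't specified here; thox-coder-custom below is a hypothetical placeholder, so query the /v1/models endpoint for the name your device actually assigned:

from openai import OpenAI

client = OpenAI(base_url="http://thox.local:8080/v1", api_key="optional")

# "thox-coder-custom" is a hypothetical name for the fine-tuned model;
# list /v1/models to find the identifier your device registered.
response = client.chat.completions.create(
    model="thox-coder-custom",
    messages=[{"role": "user", "content": "Add a config loader in our usual style."}],
)
print(response.choices[0].message.content)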
MCP (Model Context Protocol) Support
What is MCP?
Model Context Protocol (MCP) is an open standard for connecting AI models to external tools and data sources. Thox.ai natively supports MCP, enabling rich integrations with your development workflow.
Built-in MCP Servers
Pre-configured MCP servers include: filesystem (read/write local files), git (repository operations), shell (execute commands), and browser (web fetching). Enable them via thox mcp enable <server>.
Connecting MCP Clients
Your Thox.ai device acts as an MCP server. Connect from Claude Desktop, VS Code, or any MCP-compatible client using the endpoint: http://thox.local:8080/mcp.
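A minimal client sketch using the official mcp Python package; the SSE transport is an assumption, since the transport behind /mcp isn't specified above:

import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect over SSE (transport assumed) and list the tools the device exposes.
    async with sse_client("http://thox.local:8080/mcp") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())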
Custom MCP Servers
Add custom MCP servers by placing configuration in ~/.thox/mcp/servers.json. Thox.ai will automatically discover and load compatible servers on startup.
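The schema of servers.json isn't documented here; the sketch below follows the mcpServers convention used by other MCP clients, and every field name should be treated as an assumption until verified against your device:

{
  "mcpServers": {
    "my-database": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    }
  }
}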
Security Considerations
MCP servers can execute code and access files. Review permissions carefully: thox mcp permissions list. Restrict access: thox mcp permissions set filesystem --read-only.
Debugging MCP
Enable MCP debug logging: thox config set mcp.debug true. View logs: thox logs --filter mcp. Test server connectivity: thox mcp test <server-name>.