MagStack™ Cluster Models

Enterprise AI models optimized for distributed inference across MagStack clusters

Thox.ai provides specialized AI models designed for distributed inference across MagStack™ device clusters. These models leverage tensor parallelism and MoE (Mixture of Experts) architectures to run frontier-class AI on your desk with complete privacy.
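The MoE efficiency claim can be made concrete with a quick back-of-envelope calculation. The sketch below uses the common ~2 FLOPs per parameter per generated token rule of thumb for transformer decoding; the function and constants are illustrative, not part of any Thox.ai API.

```python
# Rough per-token decode compute: dense model vs. MoE model.
# Uses the ~2 FLOPs per parameter per token rule of thumb; real costs
# depend on architecture, batching, and quantization.

def decode_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per generated token."""
    return 2.0 * active_params

dense_30b = decode_flops_per_token(30e9)   # all 30B parameters fire
moe_30b = decode_flops_per_token(3.5e9)    # only the 3.5B active experts fire

print(f"dense 30B: {dense_30b:.1e} FLOPs/token")
print(f"MoE 30B:   {moe_30b:.1e} FLOPs/token")
print(f"compute reduction: {dense_30b / moe_30b:.1f}x")
```

This is why a 30B-parameter MoE model like thox-cluster-nano can decode at speeds closer to a ~3.5B dense model while retaining the capacity of the full parameter count.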

  • Healthcare: HIPAA Compliant
  • Legal: Attorney-Client Privilege
  • Enterprise: SOC2 & GDPR
  • Research: FERPA Compliant

Recommended for Most Users

Start with thox-cluster-nano: it offers a 1-million-token context window in just 24GB, making it ideal for processing entire documents, codebases, and datasets on a 2-device cluster.
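As a sketch of how a model's memory footprint maps to a minimum cluster size, the helper below simply divides the footprint across devices. The 12GB usable-per-device figure is an assumption for illustration, not a published MagStack spec, and real sizing also depends on KV-cache and runtime overhead.

```python
import math

# Hypothetical sizing helper: minimum devices needed to hold a model's
# weights in aggregate cluster memory. The usable-memory-per-device
# figure is an assumed example, not a MagStack specification.

USABLE_GB_PER_DEVICE = 12  # assumption for illustration

def min_devices(model_memory_gb: float) -> int:
    return math.ceil(model_memory_gb / USABLE_GB_PER_DEVICE)

print(min_devices(24))  # thox-cluster-nano's 24GB footprint -> 2 devices
```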

Core Cluster Models

Cluster Nano

Recommended · Entry

Thox-ai/thox-cluster-nano

View on Ollama

Long-context model with 1 million token window for processing entire documents, datasets, and complex analyses. MoE architecture with 128 experts.

Parameters: 30B total / 3.5B active (MoE)
Context: 1M tokens
Memory: 24GB
Min Devices: 2x
Speed: 80-120 tok/s
Base Model: Nemotron-3-Nano

Benchmarks:

MMLU-Pro: 78.3% · GPQA: 73.0% · AIME25: 99.2% · LiveCodeBench: 68.3%

Key Features:

  • 1M token context - process entire documents
  • MoE architecture (3.5B active of 30B)
  • 5-15 concurrent professional users
  • HIPAA, GDPR, SOC2, FERPA compliant
HIPAA · GDPR · SOC2 · FERPA

Cluster Scout

Professional

Thox-ai/thox-cluster-scout

View on Ollama

Professional multimodal model with vision capabilities and industry-leading 10M token context. Native image understanding for healthcare, legal, and finance.

Parameters: 109B total / 17B active (MoE)
Context: 10M tokens
Memory: 67GB
Min Devices: 4x
Speed: 60-90 tok/s
Base Model: Llama 4 Scout

Benchmarks:

MMLU-Pro: 74.3% · GPQA-Diamond: 57.2% · MMMU: 69.4% · ChartQA: 88.8% · DocVQA: 94.4%

Key Features:

  • 10M token context - industry-leading
  • Native vision & image understanding
  • Multilingual (12 languages)
  • Medical imaging, chart analysis, OCR
HIPAA · GDPR · SOC2 · FERPA · CCPA

Cluster Maverick

Enterprise

Thox-ai/thox-cluster-maverick

View on Ollama

Enterprise flagship model with frontier multimodal intelligence. For Fortune 500, hospitals, universities, and government.

Parameters: 400B total / 17B active (MoE)
Context: 1M tokens
Memory: 245GB
Min Devices: 12x
Speed: 30-50 tok/s
Base Model: Llama 4 Maverick

Benchmarks:

MMLU-Pro: 80.5% · GPQA-Diamond: 69.8% · MMMU: 73.4% · LiveCodeBench: 43.4% · MGSM: 92.3%

Key Features:

  • Frontier-class multimodal AI
  • 200+ concurrent enterprise users
  • All major compliance frameworks
  • Fortune 500, hospitals, government
HIPAA · GDPR · SOC2 · FERPA · CCPA · FISMA · ITAR

Specialized Cluster Variants

Purpose-built models optimized for specific professional workflows: software engineering, high-speed operations, frontier reasoning, and government/defense security requirements.

Cluster Code

Professional

Thox-ai/thox-cluster-code

View on Ollama

Elite software engineering model with GPT-4o competitive performance. Supports 92 programming languages with repository-level analysis, code generation, debugging, and collaborative code review.

Size: 32B
Context: 128K tokens
Devices: 4+

Key Features:

  • 92 programming languages support
  • GPT-4o competitive on code generation
  • Repository-level analysis (128K context)
  • 73.7 Aider score - elite code repair

Benchmarks:

Aider: 73.7 · HumanEval+: 92.7% · MBPP+: 88.1% · LiveCodeBench: 73.5%
HIPAA · GDPR · SOC2 · FERPA · FISMA

Cluster Swift

Professional

Thox-ai/thox-cluster-swift

View on Ollama

Speed-optimized model for high-volume, real-time applications. Handles 30-50+ concurrent users with <100ms latency. Ideal for customer support, call centers, and interactive applications.

Size: 8B
Context: 32K tokens
Devices: 2+

Key Features:

  • 50+ tokens/sec ultra-fast responses
  • <100ms first token latency
  • 30-50+ concurrent users
  • Real-time chat & customer support
HIPAA · GDPR · SOC2 · FERPA

Cluster Deep

Enterprise

Thox-ai/thox-cluster-deep

View on Ollama

Frontier reasoning model with state-of-the-art capabilities. Largest openly available model for research institutions, strategic consulting, financial modeling, legal research, and complex quantitative analysis.

Size: 405B
Context: 128K tokens
Devices: 12+

Key Features:

  • 405B parameters - largest open model
  • Frontier-class reasoning capabilities
  • Research-grade deep analysis
  • Strategic consulting & financial modeling

Benchmarks:

MMLU-Pro: 84.7% · GPQA-Diamond: 59.1% · MathVista: 67.8% · IFEval: 88.6%
HIPAA · GDPR · SOC2 · FERPA · FISMA

Cluster Secure

Enterprise

Thox-ai/thox-cluster-secure

View on Ollama

Government/defense-grade model with maximum security. Supports UNCLASSIFIED through SECRET workloads with N+2 redundancy, air-gap deployment, ITAR compliance, and FedRAMP High authorization.

Size: 72B
Context: 128K tokens
Devices: 6+

Key Features:

  • ITAR, FedRAMP High, FISMA compliant
  • Air-gapped deployment ready
  • UNCLASSIFIED to SECRET workloads
  • N+2 redundancy for mission assurance
HIPAA · GDPR · SOC2 · FERPA · FISMA · ITAR · FedRAMP · CMMC

Detailed Use Cases & ROI

Cluster Code - Software Engineering Teams

Use Cases:

  • Software Teams (25 engineers): Repository analysis, code reviews, architecture design, testing
  • Startups (15-20 engineers): Rapid prototyping, technical debt reduction, performance optimization
  • DevOps (20 engineers): IaC, CI/CD pipelines, monitoring, incident response
  • QA Engineering (15 engineers): Test automation, coverage analysis, security testing

Engineering Team ROI:

  • Development Velocity: 40-50% faster feature development
  • Code Quality: 80%+ test coverage, reduced vulnerabilities
  • Onboarding: 60% faster new engineer ramp-up
  • Cost Savings: $50,000 - $150,000/year vs cloud alternatives

3-Year TCO (6 devices, 30 engineers):

$29,500 (~$985/user)

vs GitHub Copilot: $54,000 - $67,500

Pays for itself in 12-18 months

IDE Integration:

VS Code · IntelliJ / JetBrains · Vim / Neovim · Emacs

Cluster Swift - High-Speed Operations

Use Cases:

  • Customer Support (40-50 agents): Real-time query assistance, knowledge base search
  • Call Centers (30-40 agents): Live transcription, agent assistance, sentiment analysis
  • Interactive Apps (50+ users): Real-time chatbots, collaboration, dynamic content
  • Healthcare Admissions (20-30 staff): Patient intake, scheduling, HIPAA-compliant support

Enterprise ROI:

  • Customer Support: 50% faster response times, higher satisfaction
  • Call Center: $80,000/year operational efficiency gains
  • Healthcare: 40% faster patient processing
  • Cost Savings: $40,000 - $130,000/year vs cloud APIs

3-Year TCO (3 devices, 50 users):

$13,000 (~$260/user)

vs Cloud AI APIs: $54,000 - $144,000

Pays for itself in 3-9 months
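The payback window quoted above follows from simple arithmetic: treat the cloud's 3-year spend as a monthly run rate and ask how many months of avoided spend cover the local TCO. A sketch using the figures from this section:

```python
# Payback arithmetic behind the "3-9 months" claim, using the figures
# quoted in this section (local 3-year TCO vs. cloud API spend).

LOCAL_TCO = 13_000                             # 3-year cost, 3-device cluster
CLOUD_3YR_LOW, CLOUD_3YR_HIGH = 54_000, 144_000  # quoted cloud range

def payback_months(local_cost: float, cloud_3yr_cost: float) -> float:
    monthly_cloud = cloud_3yr_cost / 36  # cloud spend per month
    return local_cost / monthly_cloud

fast = payback_months(LOCAL_TCO, CLOUD_3YR_HIGH)  # heavy cloud usage
slow = payback_months(LOCAL_TCO, CLOUD_3YR_LOW)   # light cloud usage
print(f"payback: {fast:.1f} to {slow:.1f} months")
```

The same arithmetic applies to the other TCO comparisons in this section; only the inputs change.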

Real-Time Performance:

First Token: <100ms
Throughput: 50+ tok/s
Response Time: <1 second
Concurrent: 50+ users
Uptime: 99.9%
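One way to sanity-check the concurrency figures above is to divide aggregate decode throughput by sustained per-user demand. The 1 tok/s per-user figure below is an assumed average for reading-speed chat (users spend most of a session reading and typing), not a measured MagStack value.

```python
# Back-of-envelope concurrency check: how many chat users a given
# aggregate decode throughput can serve. Per-user demand is an
# illustrative assumption, not a measured figure.

AGGREGATE_TOK_S = 50    # cluster-wide throughput from the table above
PER_USER_TOK_S = 1.0    # assumed sustained demand per chat user

concurrent_users = AGGREGATE_TOK_S / PER_USER_TOK_S
print(int(concurrent_users))  # -> 50
```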

Cluster Deep - Frontier Reasoning

Use Cases:

  • R1 Universities (30-40 faculty): Grant proposals, literature reviews, collaboration
  • Consulting Firms (20-30 consultants): Strategic analysis, market research, forecasting
  • Financial Analysis (25-35 analysts): Quantitative modeling, risk assessment, compliance
  • Legal Research (20-30 attorneys): Case law analysis, litigation strategy, compliance

Research Institution ROI:

  • Grant Success: 4-5x improvement in grant funding
  • Publications: 3-5x increase in peer-reviewed output
  • Research Speed: 60% faster literature review and analysis
  • Cost Savings: $100,000 - $300,000/year vs cloud frontiers

3-Year TCO (14 devices, 40 researchers):

$79,000 (~$1,975/user)

vs Claude Opus/GPT-4: $108,000 - $288,000

Pays for itself in 9-24 months

Frontier Capabilities:

Advanced Chain-of-Thought · Literature Review · Hypothesis Generation · Mathematical Modeling · Strategic Planning

Cluster Secure - Government/Defense

Use Cases:

  • DOD (20-30 analysts): Intelligence analysis, mission planning, threat assessment
  • Intelligence Community (15-25 analysts): All-source analysis, counterintel, OSINT
  • Defense Contractors (20-30 personnel): ITAR-controlled analysis, export control
  • Federal Law Enforcement (15-25 agents): Case investigation, counterterrorism

Government/Defense ROI:

  • Mission Effectiveness: $200,000+/year improvement
  • Compliance: Avoid classification spills and violations
  • Security: Zero external data exposure
  • Availability: 99.99% mission assurance with N+2 redundancy
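The N+2 redundancy figure can be sketched with a standard binomial availability model: the cluster is up whenever at most two devices are down at once. The 99% per-device availability used below is an assumed input for illustration, not a MagStack measurement.

```python
from math import comb

# Availability under N+2 redundancy: the cluster survives as long as
# no more than `spares` devices are simultaneously down. Per-device
# availability is an assumed figure for illustration.

def cluster_availability(n_devices: int, spares: int, p_device: float) -> float:
    """P(at most `spares` of `n_devices` are down), devices independent."""
    q = 1.0 - p_device
    return sum(
        comb(n_devices, k) * (q ** k) * (p_device ** (n_devices - k))
        for k in range(spares + 1)
    )

# 10-device cluster needing 8 devices (N=8 plus 2 spares), 99% per device:
print(f"{cluster_availability(10, 2, 0.99):.6f}")
```

With independent 99%-available devices, tolerating two simultaneous failures pushes the cluster into roughly four-nines territory, consistent with the 99.99% figure above.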

3-Year TCO (10 devices, 30 cleared personnel):

$124,000 (~$4,133/user)

Cloud AI: NOT AUTHORIZED for SECRET

Only authorized solution for classified AI

Security Features:

CAC/PIV Auth · Air-Gap Ready · TEMPEST · Zero Trust · SIEM Integration · Insider Threat Detection

Compliance & Certification:

ITAR · FedRAMP High · FISMA · CMMC Level 3-5 · NIST 800-53 · NIST 800-171 · RMF · DISA STIGs

Additional Cluster Models

Cluster 70B

Professional

Thox-ai/thox-cluster-70b

Enterprise-grade model for complex reasoning, analysis, and professional workflows.

Size: 72B
Context: 64K tokens
Devices: 2x

Cluster 100B

Professional

Thox-ai/thox-cluster-100b

Expert-level model for enterprise, research, healthcare, and legal workloads.

Size: 110B
Context: 96K tokens
Devices: 4x

Cluster 200B

Enterprise

Thox-ai/thox-cluster-200b

Frontier-class model matching cloud AI capabilities for any industry application.

Size: 405B
Context: 128K tokens
Devices: 8x

Cluster Coordinator

Utility

Lightweight cluster orchestration and management model.

Size: 4B
Context: 16K tokens
Memory: 3GB
Latency: <100ms

Quick Start

# Pull and run thox-cluster-nano (recommended)
$ ollama pull Thox-ai/thox-cluster-nano
$ ollama run Thox-ai/thox-cluster-nano

# For software engineering teams (4+ devices)
$ ollama pull Thox-ai/thox-cluster-code
$ ollama run Thox-ai/thox-cluster-code

# For high-volume/real-time apps (2+ devices)
$ ollama pull Thox-ai/thox-cluster-swift
$ ollama run Thox-ai/thox-cluster-swift

# For frontier reasoning research (12+ devices)
$ ollama pull Thox-ai/thox-cluster-deep
$ ollama run Thox-ai/thox-cluster-deep

# For government/defense (6+ devices)
$ ollama pull Thox-ai/thox-cluster-secure
$ ollama run Thox-ai/thox-cluster-secure

# For vision capabilities (4+ devices)
$ ollama pull Thox-ai/thox-cluster-scout
$ ollama run Thox-ai/thox-cluster-scout

# Enterprise flagship (12+ devices)
$ ollama pull Thox-ai/thox-cluster-maverick
$ ollama run Thox-ai/thox-cluster-maverick

# Use clusterctl for automatic model selection
$ clusterctl recommend
# Output: Recommended: thox-cluster-nano (2 devices, 32GB RAM)

Model Selection Guide

| Use Case | Recommended | Devices | Why |
|---|---|---|---|
| Full document/codebase analysis | thox-cluster-nano | 2+ | 1M token context handles entire repos |
| Software engineering teams | thox-cluster-code | 4+ | 92 languages, GPT-4o competitive coding |
| High-volume customer support | thox-cluster-swift | 2+ | 50+ concurrent users, <100ms latency |
| Advanced research & analysis | thox-cluster-deep | 12+ | 405B frontier reasoning model |
| Government/defense classified | thox-cluster-secure | 6+ | ITAR, FedRAMP, air-gap ready |
| Medical imaging & chart analysis | thox-cluster-scout | 4+ | Native vision, 10M context, HIPAA |
| Fortune 500 enterprise AI | thox-cluster-maverick | 12+ | Frontier-class, 200+ concurrent users |
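clusterctl recommend handles model selection automatically; purely as an illustration, the selection guide can be encoded as a small lookup like the one below. The use-case keys and the recommend helper are hypothetical, not part of clusterctl.

```python
# Illustrative selector mirroring the model selection guide.
# `clusterctl recommend` is the real tool; this just encodes the mapping.

RECOMMENDATIONS = {
    "long-context analysis": ("thox-cluster-nano", 2),
    "software engineering": ("thox-cluster-code", 4),
    "customer support": ("thox-cluster-swift", 2),
    "frontier research": ("thox-cluster-deep", 12),
    "classified workloads": ("thox-cluster-secure", 6),
    "vision / document AI": ("thox-cluster-scout", 4),
    "enterprise flagship": ("thox-cluster-maverick", 12),
}

def recommend(use_case: str, devices_available: int):
    model, min_devices = RECOMMENDATIONS[use_case]
    if devices_available < min_devices:
        return None  # cluster too small for this workload
    return model

print(recommend("software engineering", 6))  # -> thox-cluster-code
print(recommend("frontier research", 6))     # -> None (needs 12+)
```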

Complete Privacy & Compliance

All cluster models run entirely on your local MagStack configuration. No data is transmitted to external servers. Your code, documents, medical records, and conversations never leave your devices.

HIPAA · GDPR · SOC2 · FERPA · CCPA

CONFIDENTIAL AND PROPRIETARY INFORMATION

This documentation is provided for informational and operational purposes only. The specifications and technical details herein are subject to change without notice. Thox.ai LLC reserves all rights in the technologies, methods, and implementations described.

Nothing in this documentation shall be construed as granting any license or right to use any patent, trademark, trade secret, or other intellectual property right of Thox.ai LLC, except as expressly provided in a written agreement.

Patent Protection

The MagStack™ magnetic stacking interface technology, including the magnetic alignment system, automatic cluster formation, NFC-based device discovery, and distributed inference me...

Reverse Engineering Prohibited

You may not reverse engineer, disassemble, decompile, decode, or otherwise attempt to derive the source code, algorithms, data structures, or underlying ideas of any Thox.ai hardwa...

Thox.ai™, Thox OS™, MagStack™, and the Thox.ai logo are trademarks or registered trademarks of Thox.ai LLC in the United States and other countries.

NVIDIA, Jetson, TensorRT, and related marks are trademarks of NVIDIA Corporation. Ollama is a trademark of Ollama, Inc. All other trademarks are the property of their respective owners.

© 2026 Thox.ai LLC. All Rights Reserved.