Model Compatibility
A complete guide to Ollama models compatible with Thox.ai devices: the latest 2024-2025 models with vision, multilingual, and professional capabilities.
Latest Ollama Models (2024-2025)
This catalog features the newest generation of Ollama models optimized for professional use. All models have been evaluated for compatibility with Thox.ai's 16GB RAM architecture and NVIDIA Jetson Orin NX hardware.
Vision Models: native image understanding, medical imaging, and OCR in 32+ languages
Multilingual: support for 12-32+ languages, including English, Spanish, Chinese, and Arabic
Performance: optimized with Q4_K_M/INT4_AWQ quantization for Jetson hardware
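Q4_K_M stores weights at roughly 4.5 bits per parameter, which is what makes 8-14B models practical on a 16GB device. A back-of-the-envelope sketch of the resulting footprint (the constants here are illustrative assumptions, not official Thox.ai sizing; the catalog's Memory figures below also budget for long-context KV caches and runtime buffers):

```python
def approx_model_memory_gb(params_billion: float,
                           bits_per_weight: float = 4.5,
                           overhead_gb: float = 1.5) -> float:
    """Rough footprint of a quantized model: weights at ~4.5 bits/param
    (typical for Q4_K_M), plus a flat allowance for KV cache and buffers.
    Both constants are rule-of-thumb assumptions."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb

# An 8B model at Q4_K_M fits comfortably in a 16GB device:
print(approx_model_memory_gb(8))    # ~6.0 GB
print(approx_model_memory_gb(14))   # ~9.4 GB
```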
Updated: December 28, 2025 - Catalog refreshed with Llama 4 Scout, Ministral-3, Gemma3, and other frontier models
Ministral-3 8B (ministral-3:8b)
Edge-optimized vision model with 32+ languages. Well suited to single devices with vision needs.
Size: 8B parameters | Speed: 40-60 tok/s | Memory: 10GB | Context: 256K tokens | Backend: Ollama | Min Devices: 1x

Gemma 3 8B (gemma3:8b)
Google's efficient vision model, optimized for a single GPU. An excellent balance of performance and capability.
Size: 8B parameters | Speed: 38-55 tok/s | Memory: 10GB | Context: 128K tokens | Backend: Ollama | Min Devices: 1x

Qwen 3 14B (qwen3:14b)
Advanced reasoning model with vision and multilingual support. Excellent for complex professional tasks.
Size: 14B parameters | Speed: 30-45 tok/s | Memory: 14GB | Context: 128K tokens | Backend: TensorRT-LLM | Min Devices: 1x

Phi-4 Mini 3.8B (phi4:mini)
Microsoft's compact model with exceptional performance. Multilingual, with function calling.
Size: 3.8B parameters | Speed: 70-95 tok/s | Memory: 4GB | Context: 128K tokens | Backend: Ollama | Min Devices: 1x

Llama 3.2 8B (llama3.2:8b)
Meta's reliable foundation model. Excellent for general professional use.
Size: 8B parameters | Speed: 42-65 tok/s | Memory: 10GB | Context: 128K tokens | Backend: Ollama | Min Devices: 1x

Qwen 2.5 Coder 14B (qwen2.5-coder:14b)
State-of-the-art coding model with reasoning improvements and a 128K context.
Size: 14B parameters | Speed: 28-42 tok/s | Memory: 14GB | Context: 128K tokens | Backend: TensorRT-LLM | Min Devices: 1x

DeepSeek-Coder-V2 16B (deepseek-coder-v2:16b)
Advanced coding model with an MoE architecture. Excellent for software engineering.
Size: 16B parameters | Speed: 25-38 tok/s | Memory: 16GB | Context: 64K tokens | Backend: TensorRT-LLM | Min Devices: 2x

Compatibility Guide
Single Device (16GB RAM)
Best for 3-14B parameter models with Q4_K_M or INT4_AWQ quantization:
- Ministral-3 8B (Vision, 32+ languages)
- Phi-4 Mini 3.8B (Ultra-fast, multilingual)
- Qwen 3 14B (Vision, thinking mode)
- Gemma 3 8B (Vision, single GPU optimized)
MagStack 2x (32GB RAM)
Unlocks frontier models with 10M context and multimodal capabilities:
- Llama 4 Scout (109B with MoE, 10M context, 12 languages)
- Qwen 3 32B (Vision, advanced reasoning)
- DeepSeek-Coder-V2 16B (Code specialist)
MagStack 4x+ (64GB+ RAM)
Enterprise-grade frontier models for professional workflows:
- Custom 70B+ models for healthcare, legal, finance
- Llama 4 Maverick (400B with MoE)
- Enterprise-specific fine-tuned models
Pro Tip: For vision tasks, we recommend Ministral-3 8B (single device) or Llama 4 Scout (2x stack). For coding, try Qwen 2.5 Coder 14B with TensorRT acceleration.
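The tiers above follow directly from pooled RAM: a model fits the smallest stack whose combined memory covers its footprint. A rough sizing sketch (the 2GB-per-device reserve for the OS and runtime is an illustrative assumption, not an official Thox.ai figure):

```python
import math

def min_devices(model_memory_gb: float,
                per_device_ram_gb: int = 16,
                os_overhead_gb: int = 2) -> int:
    """Smallest MagStack size whose pooled usable RAM covers the model.
    Illustrative only: real sizing also depends on how the runtime
    shards the model across devices."""
    usable = per_device_ram_gb - os_overhead_gb  # RAM left after OS/runtime
    return math.ceil(model_memory_gb / usable)

print(min_devices(10))   # Ministral-3 8B -> 1 device
print(min_devices(16))   # DeepSeek-Coder-V2 16B -> 2x stack
```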
Quick Start Guide
# Pull a model from Ollama
ollama pull ministral-3:8b
# Run the model
ollama run ministral-3:8b
# For vision tasks, include the image path in the prompt
ollama run ministral-3:8b "Analyze this medical image: /path/to/image.jpg"
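The same vision request can be made programmatically through Ollama's REST API, which the server exposes on port 11434 by default. A minimal Python sketch (the model tag and image path are the ones used above; the server address is Ollama's default and may differ on your device):

```python
import base64
import json
import urllib.request

def build_vision_payload(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.
    Images are sent as a list of base64-encoded strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one JSON object instead of a token stream
    }

def generate(model: str, prompt: str, image_path: str,
             host: str = "http://localhost:11434") -> str:
    """POST the request to a local Ollama server and return the text reply."""
    with open(image_path, "rb") as f:
        body = build_vision_payload(model, prompt, f.read())
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Example call: `generate("ministral-3:8b", "Analyze this medical image", "/path/to/image.jpg")`.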
CONFIDENTIAL AND PROPRIETARY INFORMATION
This documentation is provided for informational and operational purposes only. The specifications and technical details herein are subject to change without notice. Thox.ai LLC reserves all rights in the technologies, methods, and implementations described.
Nothing in this documentation shall be construed as granting any license or right to use any patent, trademark, trade secret, or other intellectual property right of Thox.ai LLC, except as expressly provided in a written agreement.
Patent Protection
The MagStack™ magnetic stacking interface technology, including the magnetic alignment system, automatic cluster formation, NFC-based device discovery, and distributed inference me...
Reverse Engineering Prohibited
You may not reverse engineer, disassemble, decompile, decode, or otherwise attempt to derive the source code, algorithms, data structures, or underlying ideas of any Thox.ai hardwa...
Thox.ai™, Thox OS™, MagStack™, and the Thox.ai logo are trademarks or registered trademarks of Thox.ai LLC in the United States and other countries.
NVIDIA, Jetson, TensorRT, and related marks are trademarks of NVIDIA Corporation. Ollama is a trademark of Ollama, Inc. All other trademarks are the property of their respective owners.
© 2026 Thox.ai LLC. All Rights Reserved.