

Nemotron-3
#26 in Frontier LLMs · nvidia · v3 · since June 4, 2026 · 45× · most recently June 29, 2026
Momentum: 42
NVIDIA Nemotron-3 Ultra (550B-A55B) is a fully open frontier-scale language model with 550 billion total parameters and 55 billion active parameters per forward pass. It uses a hybrid LatentMoE architecture (interleaved Mamba-2, MoE, and selective Attention layers) with Multi-Token Prediction layers, pretrained on approximately 20 trillion tokens using NVFP4 precision. The model targets long-running agentic workflows (multi-step reasoning, tool use, code, math, science) and was released on June 4, 2026 under the permissive OpenMDW-1.1 open-source license, including weights, training data, and recipes.
[Chart: Momentum trend]
Features
| Key Benchmark | Artificial Analysis Intelligence Index: 48 (47.7 before rounding); highest score among US open-weight models as of June 2026 |
| Context Window (Tokens) | up to 1,000,000 |
| License | OpenMDW License Agreement v1.1 (open weights, data & training recipes) |
| Multimodality | Text-only (input/output); no native image, audio, or video understanding |
| Platform | NVIDIA GPUs (Hopper, Blackwell, Ampere); deployment via vLLM, SGLang, TensorRT-LLM, NIM Microservices, Hugging Face |
| Price per 1M Tokens | approx. $0.50 input / $2.20–$2.50 output (varies by hosting provider) |
| Release Date | June 4, 2026 (announced June 1, 2026, Computex Taipei) |
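
The per-token prices in the table can be turned into a rough cost estimate for a given workload. The sketch below is illustrative only: the names `PriceRange` and `estimate_cost` are hypothetical, the prices are the approximate figures listed above, and actual billing depends on the hosting provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRange:
    """Low and high price in USD per 1M tokens."""
    low: float
    high: float


# Approximate prices taken from the table above (assumption: USD per 1M tokens).
INPUT_PRICE = PriceRange(low=0.50, high=0.50)
OUTPUT_PRICE = PriceRange(low=2.20, high=2.50)

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return a (low, high) cost estimate in USD for one workload."""
    low = (input_tokens * INPUT_PRICE.low
           + output_tokens * OUTPUT_PRICE.low) / TOKENS_PER_MILLION
    high = (input_tokens * INPUT_PRICE.high
            + output_tokens * OUTPUT_PRICE.high) / TOKENS_PER_MILLION
    return low, high


if __name__ == "__main__":
    # Example: an agentic run that reads 2M tokens and writes 400k tokens.
    low, high = estimate_cost(input_tokens=2_000_000, output_tokens=400_000)
    print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```

Under these assumed prices, the example workload comes to roughly $1.88 to $2.00. Long-context agentic runs tend to be input-heavy, so the input price usually dominates the total.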