

LLaDA
#38 in Open-Source LLMs · unknown · since 2025-02-14 · 2× · last: 30 June 2026
13
Momentum
LLaDA is a diffusion-based language model for text generation. Its distinguishing result is showing that diffusion-based approaches can scale to full-size large language models.
Momentum trend (chart; date axis 04.04. to 03.07.)
Features
| Feature | Details |
| --- | --- |
| Benchmark Score (MMLU/Similar) | MMLU 5-shot: 65.9 (LLaDA 8B Base), slightly above LLaMA3 8B Base (65.4); LLaDA was pre-trained on 2.3T tokens, versus roughly 15T reported for LLaMA3; GSM8K: 70.7; MATH: 27.3; HumanEval: 33.5 |
| Context Window | 8,192 tokens (8k) native; LongBench evaluation was run at 4k and 8k, and input beyond 8k is truncated |
| Model Size (Parameters) | 8 billion parameters (8B), trained from scratch |
| Price Tier | Free / open source (Apache 2.0 license; weights publicly available on Hugging Face as GSAI-ML/LLaDA-8B-Base and GSAI-ML/LLaDA-8B-Instruct) |
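
As a rough illustration of how the table entries might be used in practice, the sketch below records the two Hugging Face repository IDs and the listed 8,192-token window, and shows one way to cut an over-long token sequence down to that window. The helper names `truncate_to_context` and `load_llada` are hypothetical, and keeping the first tokens (rather than the last) is an assumption. The loader also assumes the `transformers` package is installed and that the repositories ship custom model code requiring `trust_remote_code=True`.

```python
# Values taken from the feature table above.
BASE_REPO = "GSAI-ML/LLaDA-8B-Base"
INSTRUCT_REPO = "GSAI-ML/LLaDA-8B-Instruct"
MAX_CONTEXT_TOKENS = 8192  # listed native context window


def truncate_to_context(token_ids, max_tokens=MAX_CONTEXT_TOKENS):
    """Return at most `max_tokens` token IDs, dropping everything after that.

    Assumption: truncation keeps the head of the sequence.
    """
    return list(token_ids[:max_tokens])


def load_llada(repo_id=INSTRUCT_REPO):
    """Hypothetical loader: downloads the weights (several GB) on first call.

    Assumes `transformers` is installed and the repo needs custom model code.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```

Calling `truncate_to_context` on tokenizer output before passing it to the model keeps the input within the listed window; `load_llada` is left uncalled here because it triggers a large download.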