TINY MODELS WITH BIG INTELLIGENCE - a urroxyz Collection

updated 10 days ago

Tiny (<30B) models that tend to outperform their same-parameter counterparts.

prism-ml/Bonsai-8B-gguf

Text Generation • 8B • Updated 4 days ago • 71.7k • 558
Qwen/Qwen3.5-27B

Image-Text-to-Text • 28B • Updated Feb 25 • 3.14M • 905
Qwen/Qwen3.5-9B

Image-Text-to-Text • 10B • Updated Mar 2 • 5.24M • 1.23k
cerebras/GLM-4.7-Flash-REAP-23B-A3B

Text Generation • 23B • Updated Jan 23 • 138k • 69

Note Scores 30 on the Artificial Analysis Intelligence Index (Jan '26), beating GPT-OSS 20B and landing only 3 points behind the larger 120B variant. More than HALF as intelligent as its big sibling, GLM 4.7 (Reasoning). Only 23B after pruning "unused" experts. Uniquely good for its size, and it's MoE: only 3B params active per token. https://artificialanalysis.ai/models/glm-4-7-flash
GGUF: unsloth/GLM-4.7-Flash-REAP-23B-A3B-GGUF https://huggingface.co/unsloth/GLM-4.7-Flash-REAP-23B-A3B-GGUF
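The arithmetic behind that note can be sketched in a few lines. This is only a rough illustration using the figures quoted in the note (23B total, 3B active, a score of 30, a 3-point gap); the variable and function names are made up for the example, and nothing here is measured.

```python
# Figures copied from the note above; these are quoted values, not measurements.
TOTAL_PARAMS_B = 23   # total parameters after pruning, in billions
ACTIVE_PARAMS_B = 3   # parameters active per token, in billions
INDEX_SCORE = 30      # Artificial Analysis Intelligence Index (Jan '26)
GAP_TO_120B = 3       # points behind GPT-OSS 120B, per the note


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's weights that take part in each token (hypothetical helper)."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
implied_120b_score = INDEX_SCORE + GAP_TO_120B  # score the note implies for the 120B model

print(f"Active per token: {fraction:.1%}")
print(f"Implied GPT-OSS 120B score: {implied_120b_score}")
```

In other words, roughly one eighth of the weights do the work for any given token, which is why a 23B MoE can run closer to the speed of a 3B dense model while still needing memory for all 23B.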
janhq/Jan-v3-4B-base-instruct

Text Generation • 4B • Updated Feb 2 • 276 • 60

Note Beats Qwen3 4B Thinking... But it's not a thinking model. Just instruct! Same param count.
ServiceNow-AI/Apriel-1.6-15b-Thinker

Image-Text-to-Text • Updated Dec 22, 2025 • 926 • 297

Note Doesn't usually overthink; a massive improvement over the previous 1.5 model. Outstanding intelligence for a 15B model.
Nanbeige/Nanbeige4.1-3B

Text Generation • 4B • Updated 18 days ago • 402k • 1.03k

Note Outperforms Qwen 30B models at almost 1/10 the size.
Alibaba-Apsara/DASD-4B-Thinking

Text Generation • Updated Jan 15 • 394 • 217

Note Born from a great paper. Visibly outperforms all models of similar size.
Nanbeige/Nanbeige4-3B-Thinking-2511

Text Generation • 4B • Updated Dec 17, 2025 • 1.89k • 204

Note Outperforms Qwen3 4B Thinking at a slightly smaller size.
ByteDance/Ouro-1.4B-Thinking

Text Generation • Updated Feb 26 • 1.62k • 35

Note On par with 3-4B models.
ByteDance/Ouro-2.6B-Thinking

Text Generation • Updated Feb 26 • 4.36k • 109

Note On par with 4-8B models.
tiiuae/Falcon-H1R-7B

Text Generation • Updated Jan 21 • 2.36k • 218

Note Overthinks, but a good proof of concept. Similar in intelligence to Apriel 1.5 Thinker (a 15B model), but not as good at agentic tasks. A bit benchmaxxed, and not so great at general knowledge. Better with RAG.
tiiuae/Falcon-H1-Tiny-R-0.6B

Text Generation • 0.6B • Updated Feb 4 • 190 • 16

Note Competitive with Qwen3 4B Instruct but at just 600M params. Check out the blog post, it's pretty cool. IMPORTANT: Has no general knowledge whatsoever. Only trained for logic/reasoning (math/coding).
AiAsistent/Gemma3-4B-Dark-Chain-of-Thought-CoT

Text Generation • 4B • Updated Jan 3 • 15 • 16

Note Experimental model.
AngelSlim/HY-1.8B-2Bit-GGUF

2B • Updated 24 days ago • 641 • 39
Qwen/Qwen3.5-4B

Image-Text-to-Text • 5B • Updated Mar 2 • 2.94M • 447
LiquidAI/LFM2.5-350M

Text Generation • 0.4B • Updated 10 days ago • 30.6k • 265