Nanbeige/Nanbeige4-3B-Thinking-2511
Text Generation • 4B • Updated • 6.68k • 169
Note Outperforms Qwen3 4B Thinking at a slightly smaller size.
Note Doesn't usually overthink; a massive improvement over the previous 1.5 model. Outstanding intelligence for a 15B model.
Note On par with 3-4B models.
Note On par with 4-8B models.
Note Overthinks, but good proof-of-concept. Similar in intelligence to Apriel 1.5 Thinker (a 15B model), but not as good at agentic tasks. A bit benchmaxxed, and not so great at general knowledge. Better with RAG.
Note Experimental model.