nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Text Generation • 124B • Updated 13 days ago • 270k • 320
Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published Mar 2 • 33
Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day Dec 8, 2025 • 57
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 307