The ATOM Report: Measuring the Open Language Model Ecosystem Paper • 2604.07190 • Published 19 days ago • 5
CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production Paper • 2603.01973 • Published Mar 2 • 7
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 39
When Judgment Becomes Noise: How Design Failures in LLM Judge Benchmarks Silently Undermine Validity Paper • 2509.20293 • Published Sep 24, 2025 • 8
Story2Board: A Training-Free Approach for Expressive Storyboard Generation Paper • 2508.09983 • Published Aug 13, 2025 • 70
When Do Neural Nets Outperform Boosted Trees on Tabular Data? Paper • 2305.02997 • Published May 4, 2023
MARVIS: Modality Adaptive Reasoning over VISualizations Paper • 2507.01544 • Published Jul 2, 2025 • 13
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 78
LiveBench: A Challenging, Contamination-Free LLM Benchmark Paper • 2406.19314 • Published Jun 27, 2024 • 23
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks Paper • 2402.11137 • Published Feb 17, 2024
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models Paper • 2310.18208 • Published Oct 27, 2023
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 158
Perpetuating Misogyny with Generative AI: How Model Personalization Normalizes Gendered Harm Paper • 2505.04600 • Published May 7, 2025
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 207