PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents Paper • 2605.10341 • Published 12 days ago • 32
Efficient Training on Multiple Consumer GPUs with RoundPipe Paper • 2604.27085 • Published 24 days ago • 40
Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling Paper • 2604.28075 • Published 23 days ago • 20
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale Paper • 2604.04771 • Published Apr 6 • 123
Welcome Gemma 4: Frontier multimodal intelligence on device Article • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 898
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook Paper • 2604.02029 • Published Apr 2 • 151
Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline Article • nvidia • Mar 13 • 40
Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs Paper • 2603.16932 • Published Mar 14 • 89
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 155
Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model Article • nvidia • Feb 4 • 28
Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Article • thomwolf, matthieu-lapeyre • Jul 9, 2025 • 800
GutenOCR: A Grounded Vision-Language Front-End for Documents Paper • 2601.14490 • Published Jan 20 • 37
Typhoon OCR: Open Vision-Language Model For Thai Document Extraction Paper • 2601.14722 • Published Jan 21 • 15
PP-OCRv5 Collection PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese • 13 items • Updated 10 days ago • 56
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 268