Frank Sommers PRO

fsommers

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents

Paper • 2605.10341 • Published 12 days ago • 32

upvoted a collection 15 days ago

Gemma 4

12 items • Updated 17 days ago • 840

upvoted a paper 15 days ago

Efficient Training on Multiple Consumer GPUs with RoundPipe

Paper • 2604.27085 • Published 24 days ago • 40

upvoted a paper 17 days ago

Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling

Paper • 2604.28075 • Published 23 days ago • 20

upvoted a collection 25 days ago

DeepSeek-V4

4 items • Updated 29 days ago • 652

upvoted a paper about 2 months ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published Apr 6 • 123

upvoted an article about 2 months ago

Welcome Gemma 4: Frontier multimodal intelligence on device

Article • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift +5 • Apr 2 • 898

upvoted 2 papers about 2 months ago

BHDD: A Burmese Handwritten Digit Dataset

Paper • 2603.21966 • Published Mar 23 • 1

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Paper • 2604.02029 • Published Apr 2 • 151

upvoted an article about 2 months ago

Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline

Article • nvidia • Mar 13 • 40

upvoted a paper about 2 months ago

Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs

Paper • 2603.16932 • Published Mar 14 • 89

upvoted a paper 2 months ago

Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published Mar 11 • 155

upvoted a collection 3 months ago

Qwen3.5

21 items • Updated Mar 9 • 1.64k

upvoted an article 3 months ago

Nemotron ColEmbed V2: Raising the Bar for Multimodal Retrieval with ViDoRe V3’s Top Model

Article • nvidia • Feb 4 • 28

upvoted an article 4 months ago

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

Article • thomwolf, matthieu-lapeyre • Jul 9, 2025 • 800

upvoted 3 papers 4 months ago

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 70

GutenOCR: A Grounded Vision-Language Front-End for Documents

Paper • 2601.14490 • Published Jan 20 • 37

Typhoon OCR: Open Vision-Language Model For Thai Document Extraction

Paper • 2601.14722 • Published Jan 21 • 15

upvoted a collection 4 months ago

PP-OCRv5

PP-OCRv5 is the latest text recognition solution, supporting Simplified Chinese, Chinese Pinyin, Traditional Chinese, English, and Japanese

13 items • Updated 10 days ago • 56

upvoted a paper 6 months ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 268