Article • Streaming datasets: 100x More Efficient • andito, lhoestq, burtenshaw, pcuenq, merve • Oct 27, 2025 • 86
Article • Introducing Storage Buckets on the Hugging Face Hub • Wauplin, coyotte508, XciD, victor, julien-c, lhoestq, pierric, Sylvestre, hlarcher, rajatarya, seanses, assafvayner • Mar 10 • 194
The infrastructure powering IBM's Gen AI model development Paper • 2407.05467 • Published Jul 7, 2024 • 3
Article • From Files to Chunks: Improving HF Storage Efficiency • jsulz, erinys • Nov 20, 2024 • 73
Article • From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub • jsulz, yuchenglow, znation, saba9 • Feb 12, 2025 • 80
Article • Xet is on the Hub • assafvayner, brianronan, seanses, jgodlewski, sirahd, jsulz • Mar 18, 2025 • 80
Hugging Face Changelog • Introducing HF Jobs: Run scalable compute jobs on Hugging Face • Jul 30, 2025 • 203
Article • From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels • drbh, danieldk • Aug 18, 2025 • 98
Article • Announcing the Synthetic Online Conversations Dataset (SOC) • marcodsn • Aug 12, 2025 • 13
Article • LLMGameHub: How We Won the Gradio Agents & MCP Hackathon 2025 • kikikita • Jul 28, 2025 • 20
Article • How to Train Your LLM Web Agent: A Statistical Diagnosis • ppEmiliano • Jul 8, 2025 • 15
Article • Efficient Request Queueing – Optimizing LLM Performance • tngtech • Apr 2, 2025 • 26
Article • Prefill and Decode for Concurrent Requests - Optimizing LLM Performance • tngtech • Apr 16, 2025 • 78
Article • How Long Prompts Block Other Requests - Optimizing LLM Performance • tngtech • Jun 12, 2025 • 13
Article • What's Software 3.0? (Spoiler: You're Already Using It) • fdaudens • Jun 19, 2025 • 3