Article PaddleOCR 3.5: Running OCR and Document Parsing Tasks with a Transformers Backend PaddlePaddle • 2 days ago • 29
Scaling Laws for Mixture Pretraining Under Data Constraints Paper • 2605.12715 • Published 9 days ago • 4
Article How to Comply with SOC 2 and ISO 27001 with Hugging Face: A Practical Guide to AI Model Supply Chain Governance jeffboudier • 6 days ago • 5
Article Two Years of Local AI on a Laptop: When Open Models Outpaced Moore's Law mishig • 10 days ago • 22
Article EMO: Pretraining mixture of experts for emergent modularity allenai • 12 days ago • 37
Forecasting Open-Weight AI Model Growth on Hugging Face Paper • 2502.15987 • Published Feb 21, 2025 • 12
Zero-To-CAD Collection Datasets (1M & 100K) and model for synthesizing executable CAD programs from an LLM in a CadQuery environment. No real data used. • 3 items • Updated 25 days ago • 18
Article Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents nvidia • 22 days ago • 56
Article DeepSeek-V4: a million-token context that agents can actually use burtenshaw • 27 days ago • 46
Article AI and the Future of Cybersecurity: Why Openness Matters +1 meg, yjernite, clem • about 1 month ago • 38
MediaTech Collection Collection of public datasets from the French administration, chunked, vectorized and ready to use in AI projects. • 9 items • Updated Feb 4 • 9
The ATOM Report: Measuring the Open Language Model Ecosystem Paper • 2604.07190 • Published Apr 8 • 5