view article Article A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality +2 Mar 4 • 78
view article Article 🐺🐦⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark Jan 2 • 41
Adding Conditional Control to Text-to-Image Diffusion Models Paper • 2302.05543 • Published Feb 10, 2023 • 57
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published Dec 9, 2024 • 92
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 159
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 45
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 Jul 5, 2024 • 306
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 71
view article Article Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages +8 May 24, 2024 • 27
view article Article PaliGemma – Google's Cutting-Edge Open Vision Language Model +1 May 14, 2024 • 278
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 24
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 875