smolagents and tools gallery: Browse tools and agents to use in smolagents
NVIDIA just dropped a gigantic multimodal model called NVLM 72B.
Model: nvidia/NVLM-D-72B
Paper page: NVLM: Open Frontier-Class Multimodal LLMs (2409.11402)

The paper contains many ablation studies on various ways to use the LLM backbone:
- Flamingo-like cross-attention (NVLM-X)
- LLaVA-like concatenation of image and text embeddings to a decoder-only model (NVLM-D)
- a hybrid architecture (NVLM-H)

Checking evaluations, NVLM-D and NVLM-H are best or second best compared to other models.

The released model is NVLM-D based on Qwen-2 Instruct, aligned with InternViT-6B using a huge mixture of different datasets.

You can easily use this model by loading it through transformers' AutoModel.
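For readers who want to try the AutoModel route mentioned in the post, here is a minimal sketch. The repo id comes from the post; everything else is an assumption rather than something the post states: the helper names `build_load_kwargs` and `load_nvlm` are hypothetical, bfloat16 is just a common choice for large checkpoints, and `trust_remote_code=True` assumes the repository ships its own modeling code. Check the model card before relying on any of these.

```python
# Sketch: loading the released NVLM-D checkpoint via transformers' AutoModel.
MODEL_ID = "nvidia/NVLM-D-72B"  # repo id named in the post


def build_load_kwargs(dtype_name: str = "bfloat16") -> dict:
    """Collect from_pretrained keyword arguments (assumed values, not from the post)."""
    return {
        "torch_dtype": dtype_name,   # resolved to a real torch dtype at load time
        "low_cpu_mem_usage": True,   # avoid holding a second full copy of the weights on CPU
        "trust_remote_code": True,   # assumption: the repo provides custom model code
    }


def load_nvlm(model_id: str = MODEL_ID):
    """Download and load the model; needs torch, transformers, and a lot of memory."""
    # Imported lazily so the helpers above can be inspected without heavy dependencies.
    import torch
    from transformers import AutoModel

    kwargs = build_load_kwargs()
    kwargs["torch_dtype"] = getattr(torch, kwargs["torch_dtype"])
    return AutoModel.from_pretrained(model_id, **kwargs).eval()
```

Calling `load_nvlm()` triggers the actual download, so it is best run on a machine with enough GPU or CPU memory for a 72B-parameter model.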
If you feel like you missed out on ECCV 2024, there's an app to browse the papers, rank them by popularity, and filter for open models, datasets and demos. Get started at https://huggingface.co/spaces/ECCV/ECCV2024-papers
Open NotebookLM: Personalised Podcasts For All - Available in 13 Languages