Blog-explorers


AI & ML interests

None defined yet.

Recent Activity

ccocks-deca new activity 20 days ago

blog-explorers/README:Card forensics using VLM localized prompt approach

kargaranamir authored a paper 23 days ago

Insights from the ICLR Peer Review and Rebuttal Process

ccocks-deca new activity about 2 months ago

blog-explorers/README:Pending Blog-Explorers Access Request


elismasilva

posted an update 15 days ago

Post

283

Hey everyone,

I've built and deployed Panorama FLUX, a Gradio app for creating ultra-wide panoramic images from three different text prompts using the FLUX.1-schnell model.

It uses a custom "Mixture of Diffusers" pipeline to generate and seamlessly blend each section of the image.

Key Features:
- Multi-Prompt Input: Control the left, center, and right of the scene with unique prompts.
- Seamless Blending: Choose between Cosine and Gaussian blending methods to eliminate seams between tiles.
- Optimized for FLUX.1-schnell: Designed for fast, 4-step generation with embedded guidance.
- Multi-Language Support: On-the-fly translation for prompts written in Korean and Chinese.
- Memory Efficient: Supports both custom (mmgp) and standard diffusers offloading for use on consumer GPUs or in Spaces.
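The Cosine and Gaussian blending options listed above can be illustrated with a small sketch. This is not the app's pipeline code: it assumes 1-D tiles of plain floats instead of image latents, and the names `cosine_weights`, `gaussian_weights`, `blend_tiles`, and the `sigma_frac` parameter are hypothetical. It only shows the general idea of weighting each overlapping tile by a window and normalizing by the summed weights.

```python
import math


def cosine_weights(width: int) -> list[float]:
    """Raised-cosine window: small at the tile edges, largest near the centre."""
    # The half-sample shift keeps every weight strictly positive.
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / width))
            for i in range(width)]


def gaussian_weights(width: int, sigma_frac: float = 0.25) -> list[float]:
    """Gaussian window centred on the tile; sigma_frac is an illustrative choice."""
    centre = (width - 1) / 2.0
    sigma = sigma_frac * width
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(width)]


def blend_tiles(tiles, offsets, total_width, weights):
    """Weighted average of overlapping tiles.

    Assumes every output position is covered by at least one tile,
    otherwise the normalization would divide by zero.
    """
    acc = [0.0] * total_width
    norm = [0.0] * total_width
    for tile, start in zip(tiles, offsets):
        for i, value in enumerate(tile):
            acc[start + i] += weights[i] * value
            norm[start + i] += weights[i]
    return [a / n for a, n in zip(acc, norm)]


# Three tiles (left, centre, right) of width 8, each overlapping its neighbour by 4.
tiles = [[0.0] * 8, [1.0] * 8, [2.0] * 8]
offsets = [0, 4, 8]

cosine_result = blend_tiles(tiles, offsets, 16, cosine_weights(8))
gaussian_result = blend_tiles(tiles, offsets, 16, gaussian_weights(8))
```

In both results the values ramp gradually from one tile's value to the next across each overlap instead of jumping at a hard boundary, which is the effect the blending step is meant to achieve.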

This was a fun project that involved deep-diving into the FLUX architecture to get the tiling, guidance, and positional embeddings right.

Try it out!
🚀 Live Demo on Hugging Face Spaces:
elismasilva/flux-1-panorama

Jofthomas

posted an update 15 days ago

Post

3437

The new Mistral 3 models are here!

Today, we announce Mistral 3, the next generation of Mistral models. Mistral 3 includes three state-of-the-art small, dense models (14B, 8B, and 3B) and Mistral Large 3 – our most capable model to date – a sparse mixture-of-experts trained with 41B active and 675B total parameters.

All models are released under the Apache 2.0 license.

Ministrals:
https://huggingface.co/collections/mistralai/ministral-3

Mistral Large 3:
https://huggingface.co/collections/mistralai/mistral-large-3

2 replies

reach-vb

authored a paper 20 days ago

Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation

Paper • 2510.06961 • Published Oct 8 • 10

ccocks-deca

in blog-explorers/README 20 days ago

Card forensics using VLM localized prompt approach

#11 opened 21 days ago by

vijayksagi

kargaranamir

authored a paper 23 days ago

Insights from the ICLR Peer Review and Rebuttal Process

Paper • 2511.15462 • Published 28 days ago • 6

elismasilva

posted an update about 1 month ago

Post

2562

I'm thrilled to launch SUP Toolbox! An AI tool for image restoration & upscaling using SUPIR, FaithDiff & ControlUnion. Built with Diffusers and a Gradio UI featuring 14 custom components I developed.

Space Demo: elismasilva/sup-toolbox-app
App repository: https://github.com/DEVAIEXP/sup-toolbox-app
CLI repository: https://github.com/DEVAIEXP/sup-toolbox

Feedback is welcome!

2 replies

jasoncorkill

posted an update about 1 month ago

Post

4688

Do you remember https://thispersondoesnotexist.com/ ? It was one of the first cases where the future of generative media really hit us. Humans are incredibly good at recognizing and analyzing faces, so they are a very good litmus test for any generative image model.

But none of the current benchmarks measure the ability of models to generate humans independently. So we built our own. We measure each model's ability to generate a diverse set of human faces, and using over 20,000 human annotations, we ranked all of the major models on that ability. Find the full ranking here:
https://app.rapidata.ai/mri/benchmarks/68af24ae74482280b62f7596

We have released the full underlying data publicly here on Hugging Face: Rapidata/Face_Generation_Benchmark

ccocks-deca

in blog-explorers/README about 2 months ago

Pending Blog-Explorers Access Request

#10 opened about 2 months ago by

KarthikAvinash

pagezyhf

posted an update about 2 months ago

Post

2803

🚀 Big news for AI builders!

We’re thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.

We bring open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.

🔍 Highlights:

- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning — vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes

👉 Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.

1 reply

meg

posted an update about 2 months ago

Post

3788

🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle, we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details and code here: https://huggingface.co/blog/voice-consent-gate

3 replies

kargaranamir

authored a paper 2 months ago

CoBia: Constructed Conversations Can Trigger Otherwise Concealed Societal Biases in LLMs

Paper • 2510.09871 • Published Oct 10 • 2

pagezyhf

posted an update 3 months ago

Post

847

What’s your biggest headache deploying Hugging Face models to the cloud—and how can we fix it for you?

8 replies

KnutJaegersberg

posted an update 3 months ago

Post

525

The Formative Mind: Theories of Consciousness as Practice

Instead of treating consciousness as a passive byproduct of a powerful unconscious engine, think of it as the engine itself: a process that builds rich representations (self-organizing), predicts and models its own processing (metarepresentation), and thereby brings an agent and its world into being (individuation). A brief synthesis.

https://huggingface.co/blog/KnutJaegersberg/formative-mind

mariagrandury

authored 4 papers 3 months ago

Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings

Paper • 2509.14405 • Published Sep 17 • 2

Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans

Paper • 2506.22439 • Published May 29 • 3

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17 • 14

La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America

Paper • 2507.00999 • Published Jul 1 • 1

meg

posted an update 3 months ago

Post

2903

🤖 As AI-generated content is shared in movies/TV/across the web, there's one simple low-hanging fruit 🍇 to help know what's real: Visible watermarks. With the Gradio team, I've made sure it's trivially easy to add this disclosure to images, video, chatbot text. See how: https://huggingface.co/blog/watermarking-with-gradio
Thanks in particular to @abidlabs and Yuvraj Sharma for the code collaboration.

Hguimaraes

authored 2 papers 3 months ago

UrBAN: Urban Beehive Acoustics and PheNotyping Dataset

Paper • 2406.03657 • Published Jun 5, 2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Paper • 2411.05361 • Published Nov 8, 2024 • 3


Team members 752
