magicwpf

https://magicwpf.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published 16 days ago • 28

upvoted a paper 16 days ago

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models

Paper • 2512.15560 • Published 29 days ago • 24

upvoted a paper 21 days ago

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

Paper • 2512.21094 • Published 22 days ago • 24

upvoted a paper 22 days ago

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published 23 days ago • 91

upvoted 3 papers 27 days ago

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Paper • 2512.16905 • Published 28 days ago • 31

StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Paper • 2512.16915 • Published 28 days ago • 37

Kling-Omni Technical Report

Paper • 2512.16776 • Published 28 days ago • 166

upvoted 2 papers 29 days ago

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Paper • 2512.14699 • Published 30 days ago • 27

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Paper • 2512.12675 • Published Dec 14, 2025 • 40

upvoted 5 papers about 1 month ago

KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published about 1 month ago • 42

SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Paper • 2512.11749 • Published Dec 12, 2025 • 38

UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation

Paper • 2512.07831 • Published Dec 8, 2025 • 16

ViDiC: Video Difference Captioning

Paper • 2512.03405 • Published Dec 3, 2025 • 27

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Paper • 2512.03041 • Published Dec 2, 2025 • 62

upvoted 4 papers about 2 months ago

Terminal Velocity Matching

Paper • 2511.19797 • Published Nov 24, 2025 • 12

Monet: Reasoning in Latent Visual Space Beyond Images and Language

Paper • 2511.21395 • Published Nov 26, 2025 • 16

Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO

Paper • 2511.16669 • Published Nov 20, 2025 • 31

Simulating the Visual World with Artificial Intelligence: A Roadmap

Paper • 2511.08585 • Published Nov 11, 2025 • 29

upvoted 2 papers 2 months ago

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Paper • 2511.07250 • Published Nov 10, 2025 • 17

VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models

Paper • 2511.02712 • Published Nov 4, 2025 • 4