On Vacation
Ji Xie
PRO
sanaka87
68 followers · 23 following
https://horizonwind2004.github.io/
HorizonWind2004
AI & ML interests
Generative Model
Recent Activity
Liked a dataset 16 days ago: Marlo-Z/SegLLM_dataset
Reacted to their post with 🔥 20 days ago:

Introducing VideoCoF: Unified Video Editing with a Temporal Reasoner (Chain-of-Frames)!

We're excited to introduce VideoCoF, a unified framework for instruction-based video editing that enables temporal reasoning and ~4× video length extrapolation, trained with only 50k video pairs. 🔥

What makes VideoCoF different?
🧠 Chain-of-Frames reasoning: mimics a human thinking process (Seeing → Reasoning → Editing) to apply edits accurately over time without external masks, ensuring physically plausible results.
Strong length generalization: trained on 33-frame clips, yet supports multi-shot editing and long-video extrapolation (~4×).
🎯 Unified fine-grained editing: Object Removal, Addition, Swap, and Local Style Transfer, with instance-level and part-level, spatial-aware control.
⚡ Fast inference update: on H100, ~20s / video with 4-step inference, making high-quality video editing far more practical for real-world use.

Links
Paper: https://arxiv.org/abs/2512.07469
💻 Code: https://github.com/knightyxp/VideoCoF
🤗 Demo: https://huggingface.co/spaces/XiangpengYang/VideoCoF
🧩 Models: https://huggingface.co/XiangpengYang/VideoCoF
Project Page: https://videocof.github.io/

#VideoEditing #DiffusionModels #GenerativeAI #ComputerVision #AI
Posted an update 21 days ago: the VideoCoF announcement quoted above.
Organizations
None yet
sanaka87's Spaces (4)
VideoCoF (Sleeping): Unified Video Editing with Temporal Reasoner
undefined (Running): Explore Ji Xie's AI research portfolio
BAGEL (Running on Zero, 18 likes): Demo for BAGEL
reca-page (Running)