knightnemo

https://knightnemo.github.io

AI & ML interests

World Models, World Action Models, VLA Models, Test-time Adaptation & Self-Improvement, Dexterous Manipulation.

Recent Activity

updated a model about 10 hours ago

knightnemo/robotwin-icl-v3-arx-x5-vam-ti2v5b-openwam-eef-mask-v3-no-ref-train20-100k

published a model about 10 hours ago

knightnemo/robotwin-icl-v3-arx-x5-vam-ti2v5b-openwam-eef-mask-v3-no-ref-train20-100k

updated a model 1 day ago

knightnemo/robotwin-icl-v3-arx-x5-vam-ti2v5b-openwam-eef-ref-train20-100k

Organizations

upvoted a paper 3 days ago

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Paper • 2605.18739 • Published 7 days ago • 108

upvoted a collection 3 days ago

Cambrian-P Models

Collection

5 items • Updated 3 days ago • 1

upvoted a paper 10 days ago

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published 11 days ago • 80

upvoted 2 papers 13 days ago

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

Paper • 2605.08678 • Published 16 days ago • 8

STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation

Paper • 2605.08029 • Published 17 days ago • 12

upvoted a paper 19 days ago

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published 21 days ago • 336

upvoted a collection 21 days ago

Nano-World-Model

Collection

🌍 A minimalist repository for training video world models based on diffusion-forcing. • 20 items • Updated 8 days ago • 7

upvoted 2 papers 24 days ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published 28 days ago • 71

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

Paper • 2604.26694 • Published 26 days ago • 6

upvoted a paper about 1 month ago

ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published Apr 10 • 23

upvoted 2 papers 2 months ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 154

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

upvoted 3 papers 4 months ago

upvoted 5 papers 6 months ago

Generative Neural Video Compression via Video Diffusion Prior

Paper • 2512.05016 • Published Dec 4, 2025 • 10

What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards

Paper • 2512.00425 • Published Nov 29, 2025 • 53

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published Nov 19, 2025 • 54

WorldGen: From Text to Traversable and Interactive 3D Worlds

Paper • 2511.16825 • Published Nov 20, 2025 • 24

RynnVLA-002: A Unified Vision-Language-Action and World Model

Paper • 2511.17502 • Published Nov 21, 2025 • 28
