idea - a Hankto Collection

updated about 8 hours ago

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published about 1 month ago • 151
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data

Paper • 2603.15594 • Published 12 days ago • 145
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

Paper • 2603.13398 • Published 17 days ago • 150
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders

Paper • 2603.06569 • Published 22 days ago • 117
Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 25 days ago • 100
Demystifying Video Reasoning

Paper • 2603.16870 • Published 11 days ago • 361
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published 16 days ago • 63
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Paper • 2603.03269 • Published 25 days ago • 61
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

Paper • 2603.06577 • Published 22 days ago • 48
Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 11 days ago • 57
HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising

Paper • 2603.08703 • Published 19 days ago • 31
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Paper • 2603.09095 • Published 19 days ago • 28
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

Paper • 2603.25702 • Published 2 days ago • 4
MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution

Paper • 2603.18718 • Published 10 days ago • 6
Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Paper • 2603.13904 • Published 14 days ago • 2