engineerA314's Collections
interesting papers
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper
•
2502.05171
•
Published
•
151
Agency Is Frame-Dependent
Paper
•
2502.04403
•
Published
•
23
Distillation Scaling Laws
Paper
•
2502.08606
•
Published
•
47
LLM Pretraining with Continuous Concepts
Paper
•
2502.08524
•
Published
•
29
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper
•
2502.18890
•
Published
•
30
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Paper
•
2503.01307
•
Published
•
38
RWKV-7 "Goose" with Expressive Dynamic State Evolution
Paper
•
2503.14456
•
Published
•
153
Transformers without Normalization
Paper
•
2503.10622
•
Published
•
170
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning
Paper
•
2503.07572
•
Published
•
47
Implicit Reasoning in Transformers is Reasoning through Shortcuts
Paper
•
2503.07604
•
Published
•
23