engineerA314's Collections
interesting papers

updated Mar 20, 2025

  • Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

    Paper • 2502.05171 • Published Feb 7, 2025 • 151

  • Agency Is Frame-Dependent

    Paper • 2502.04403 • Published Feb 6, 2025 • 23

  • Distillation Scaling Laws

    Paper • 2502.08606 • Published Feb 12, 2025 • 47

  • LLM Pretraining with Continuous Concepts

    Paper • 2502.08524 • Published Feb 12, 2025 • 29

  • From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens

    Paper • 2502.18890 • Published Feb 26, 2025 • 30

  • Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

    Paper • 2503.01307 • Published Mar 3, 2025 • 38

  • RWKV-7 "Goose" with Expressive Dynamic State Evolution

    Paper • 2503.14456 • Published Mar 18, 2025 • 153

  • Transformers without Normalization

    Paper • 2503.10622 • Published Mar 13, 2025 • 170

  • Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Paper • 2503.07572 • Published Mar 10, 2025 • 47

  • Implicit Reasoning in Transformers is Reasoning through Shortcuts

    Paper • 2503.07604 • Published Mar 10, 2025 • 23