Lu Li

luli2949

Recent Activity

liked a Space 2 months ago

HuggingFaceTB/smol-training-playbook: The Smol Training Playbook

📚

2.81k

The secrets to building world-class LLMs

authored 3 papers 2 months ago

STRICT: Stress Test of Rendering Images Containing Text

Paper • 2505.18985 • Published May 25, 2025

MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

Paper • 2406.07529 • Published Jun 11, 2024

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 221

upvoted a paper 2 months ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29, 2025 • 221

upvoted a collection 8 months ago

Context Clues

Collection

Models from the paper Context Clues • 16 items • Updated Nov 7, 2025 • 8

upvoted an article 11 months ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • Dec 19, 2024 • 717

liked a dataset about 1 year ago

danjacobellis/chexpert

Viewer • Updated Jul 18, 2024 • 224k • 677 • 15

upvoted a collection over 1 year ago

Qwen2

Collection

Qwen2 language models: pretrained and instruction-tuned models in 5 sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B). • 39 items • Updated 8 days ago • 374

authored a paper over 1 year ago

VCR: Visual Caption Restoration

Paper • 2406.06462 • Published Jun 10, 2024 • 13

upvoted a paper over 1 year ago