Natural and Artificial Intelligence Lab

university

https://k-r-allen.github.io

Recent Activity

AgPerry updated a dataset 3 days ago

NAIL-Group/ClawBenchV1Trace

AgPerry submitted a paper 9 days ago

RewardHarness: Self-Evolving Agentic Post-Training

AgPerry updated a dataset 10 days ago

NAIL-Group/ClawBenchV2Trace

Papers

RewardHarness: Self-Evolving Agentic Post-Training

ClawBench: Can AI Agents Complete Everyday Online Tasks?

updated a dataset 3 days ago

NAIL-Group/ClawBenchV1Trace

Updated 3 days ago • 1.32k

submitted a paper to Daily Papers 9 days ago

RewardHarness: Self-Evolving Agentic Post-Training

Paper • 2605.08703 • Published 15 days ago • 9

updated 2 datasets 10 days ago

NAIL-Group/ClawBenchV2Trace

Updated 10 days ago • 386

NAIL-Group/ClawBench

Viewer • Updated 10 days ago • 153 • 405 • 2

updated a Space 12 days ago

ClawBench Leaderboard

Live leaderboard for the ClawBench web-agent benchmark

updated a collection 12 days ago

ClawBench — Browser Agent Benchmark Suite

Benchmark dataset (V1+V2), live leaderboard Space, and full V1 execution traces — everything you need to run, regrade, or compare on ClawBench. • 5 items • Updated 12 days ago • 1

published a dataset 12 days ago

NAIL-Group/ClawBenchV2Trace

Updated 10 days ago • 386

updated a collection 13 days ago

ClawBench — Browser Agent Benchmark Suite

Benchmark dataset (V1+V2), live leaderboard Space, and full V1 execution traces — everything you need to run, regrade, or compare on ClawBench. • 5 items • Updated 12 days ago • 1

published a Space 14 days ago

ClawBench Leaderboard

Live leaderboard for the ClawBench web-agent benchmark

published a dataset 19 days ago

NAIL-Group/ClawBenchV1Trace

Updated 3 days ago • 1.32k

published a bucket about 1 month ago

NAIL-Group/README-storage

published a Space about 1 month ago

README

published a dataset about 1 month ago

NAIL-Group/ClawBench

Viewer • Updated 10 days ago • 153 • 405 • 2

submitted 2 papers to Daily Papers about 1 month ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published Apr 9 • 263

submitted a paper to Daily Papers about 2 months ago

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published Apr 6 • 36