Jeff

JiayuJeff

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

Creative Robot Tool Use with Large Language Models

Paper • 2310.13065 • Published Oct 19, 2023 • 10

upvoted a paper 7 days ago

Dancing in Chains: Strategic Persuasion in Academic Rebuttal via Theory of Mind

Paper • 2601.15715 • Published 12 days ago • 13

upvoted 2 papers 11 days ago

Adaptation of Agentic AI

Paper • 2512.16301 • Published Dec 18, 2025 • 105

Ministral 3

Paper • 2601.08584 • Published 21 days ago • 51

authored a paper 12 days ago

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published 18 days ago • 30

upvoted a paper 14 days ago

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published 18 days ago • 30

submitted a paper to Daily Papers 14 days ago

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published 18 days ago • 30

upvoted a paper 20 days ago

The Confidence Dichotomy: Analyzing and Mitigating Miscalibration in Tool-Use Agents

Paper • 2601.07264 • Published 22 days ago • 24

authored a paper 3 months ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published Oct 28, 2025 • 4

upvoted a paper 3 months ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published Oct 28, 2025 • 4

commented on 2 papers 3 months ago

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

Paper • 2510.24505 • Published Oct 28, 2025 • 4

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 22

authored 2 papers 3 months ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 22

Mathematical Proof as a Litmus Test: Revealing Failure Modes of Advanced Large Reasoning Models

Paper • 2506.17114 • Published Jun 20, 2025

upvoted a paper 3 months ago

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 22

upvoted a paper 4 months ago

UserRL: Training Interactive User-Centric Agent via Reinforcement Learning

Paper • 2509.19736 • Published Sep 24, 2025 • 12

liked a model 5 months ago

emrecanacikgoz/Qwen2.5-7B-Instruct-ToolRL-grpo-cold

Updated Apr 22, 2025 • 81 • 3

upvoted a paper 6 months ago

UserBench: An Interactive Gym Environment for User-Centric Agents

Paper • 2507.22034 • Published Jul 29, 2025 • 30

New activity in osunlp/Mind2Web 6 months ago

Dataset Viewer issue: TooBigContentError

#7 opened 9 months ago by coung21

authored a paper 6 months ago

Diversity-Enhanced Reasoning for Subjective Questions

Paper • 2507.20187 • Published Jul 27, 2025 • 26
