Yang

jacklanda

AI & ML interests

Reasoning, Mech Interp, Semantics

Recent Activity

new activity 2 days ago

RuleReasoner/RuleCollection-32K:Update README

new activity 2 days ago

RuleReasoner/RuleCollection-32K:Update README.md

liked a Space 3 days ago

JoaquinVanschoren/croissant-checker

Organizations

Collections 3

Papers 13

spaces 2

Croissant Checker - Dev

🔎

Validate Croissant JSON‑LD for NeurIPS submissions

Distinct

👀

Create a static web page by editing HTML

models 1

jacklanda/Qwen-2.5-1.5B-Simple-RL

Updated Feb 17, 2025

datasets 2

jacklanda/SemanticQA

Updated 16 days ago • 45

jacklanda/LexBench

Preview • Updated May 21, 2024 • 9

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia

$OneMillion-Bench: How Far are Language Agents from Human Experts?

Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models

Revisiting a Pain in the Neck: A Semantic Reasoning Benchmark for Language Models

LM-Lexicon: Improving Definition Modeling via Harmonizing Semantic Experts

Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs

