airev-ae/qwen-agentic-json-cream
World's first sub-1B parameter model with functional tool calling capability.
Built by AIREV for the OnDemand Agentic AI Platform.
| Metric | Score |
|---|---|
| JSON Validity (Easy queries) | 92% |
| Correct Plugin Selection | 94% |
| Exact Plugin ID Match (Easy) | 81% |
| Production Composite Score | 75.6% |
| Parameters | 752M |
| Quantized Size | ~400MB |
| Edge Inference Speed | ~30 tok/s |
Generates structured JSON execution plans for tool/plugin orchestration. Given a user request and available tools, it produces a valid JSON object specifying which tools to call, with what parameters, and in what order.
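To make the idea of an execution plan concrete, the sketch below parses a hypothetical plan and checks that each step uses an available plugin and depends only on earlier steps. The field names (`steps`, `id`, `plugin_id`, `params`, `depends_on`) and the plugin IDs are assumptions made for this example; this card does not specify the model's actual output schema.

```python
import json

# Hypothetical plan: field names and plugin IDs are illustrative assumptions,
# not the model's documented output schema.
EXAMPLE_PLAN = """
{
  "steps": [
    {"id": "s1", "plugin_id": "web_search",
     "params": {"query": "latest AI news"}, "depends_on": []},
    {"id": "s2", "plugin_id": "summarizer",
     "params": {"max_words": 100}, "depends_on": ["s1"]}
  ]
}
"""


def check_plan(raw: str, available_plugins: set) -> list:
    """Return a list of problems found in a plan; an empty list means it passed."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(plan, dict) or not isinstance(plan.get("steps"), list):
        return ["plan must be an object with a 'steps' list"]

    problems = []
    seen_ids = set()
    for index, step in enumerate(plan["steps"]):
        if not isinstance(step, dict):
            problems.append(f"step {index}: not an object")
            continue
        step_id = step.get("id", f"step {index}")
        plugin_id = step.get("plugin_id")
        if not isinstance(plugin_id, str) or plugin_id not in available_plugins:
            problems.append(f"{step_id}: unknown plugin {plugin_id!r}")
        if not isinstance(step.get("params"), dict):
            problems.append(f"{step_id}: 'params' must be an object")
        # A step may only depend on steps that appear before it in the plan.
        for dep in step.get("depends_on") or []:
            if dep not in seen_ids:
                problems.append(f"{step_id}: depends on {dep!r}, which is not an earlier step")
        seen_ids.add(step_id)
    return problems


print(check_plan(EXAMPLE_PLAN, {"web_search", "summarizer"}))
```

Returning a list of problems rather than a boolean keeps the checks independent, which loosely mirrors how the card reports JSON validity, plugin IDs, and dependencies as separate metrics.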
| Metric | This Model | Base Qwen 0.8B | Improvement (percentage points) |
|---|---|---|---|
| Valid JSON | 94.0% | 18.0% | +76.0 |
| Correct Plugin IDs | 44.0% | 0.0% | +44.0 |
| Params Correct Type | 94.0% | 0.0% | +94.0 |
| Param Keys Match | 66.0% | 0.0% | +66.0 |
| Real Production IDs | 94.0% | 0.0% | +94.0 |
| Dependencies Present | 88.0% | 0.0% | +88.0 |
| Composite | 75.6% | 4.8% | +70.8 |
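The improvement column is an absolute difference in percentage points (fine-tuned score minus base score), not a relative gain. The short sketch below recomputes it from the table values; the dictionary and function names are only illustrative.

```python
# (fine-tuned %, base %) pairs copied from the comparison table.
RESULTS = {
    "Valid JSON": (94.0, 18.0),
    "Correct Plugin IDs": (44.0, 0.0),
    "Params Correct Type": (94.0, 0.0),
    "Param Keys Match": (66.0, 0.0),
    "Real Production IDs": (94.0, 0.0),
    "Dependencies Present": (88.0, 0.0),
    "Composite": (75.6, 4.8),
}


def improvement_pp(model_pct: float, base_pct: float) -> float:
    """Absolute improvement in percentage points, rounded to one decimal."""
    return round(model_pct - base_pct, 1)


for metric, (model_pct, base_pct) in RESULTS.items():
    print(f"{metric}: +{improvement_pp(model_pct, base_pct)} pp")
```

The composite is not the unweighted mean of the six individual metrics (which would be 80.0% for the fine-tuned model); the card does not state how it is weighted.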
| Difficulty | JSON Valid | Real Plugin IDs | Exact Match |
|---|---|---|---|
| Easy (1 tool) | 92% | 92% | 81% |
| Medium (2-3 tools) | 96% | 96% | 4% |
This model was trained using a novel multi-stage approach developed by AIREV, combining supervised fine-tuning (SFT) with GRPO. The training configuration:
| Parameter | Value |
|---|---|
| Base Model | Qwen 3.5-0.8B (752M params) |
| Training Data | 47,400 samples (real production plugins) |
| SFT Epochs | 3 |
| GRPO Steps | 1,250 |
| Precision | bf16 |
| Hardware | NVIDIA H100 80GB |
| Attention | SDPA |
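GRPO optimizes the model against a scalar reward computed for each sampled completion. This card does not describe the reward that was used, so the following is only a hypothetical sketch of a structure-based reward for JSON plans. The function name, the equal weighting of checks, and the field names (`steps`, `plugin_id`, `params`) are assumptions for illustration, not AIREV's training code.

```python
import json


def plan_reward(completion: str, known_plugins: set) -> float:
    """Hypothetical reward in [0, 1]: the fraction of structural checks passed."""
    try:
        plan = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # Unparseable output earns nothing.

    steps = plan.get("steps") if isinstance(plan, dict) else None
    has_steps = isinstance(steps, list) and len(steps) > 0
    step_dicts = [s for s in steps if isinstance(s, dict)] if has_steps else []
    all_dicts = has_steps and len(step_dicts) == len(steps)

    checks = [
        True,  # The completion parsed as JSON.
        # Every step names a plugin from the known set.
        all_dicts and all(
            isinstance(s.get("plugin_id"), str) and s["plugin_id"] in known_plugins
            for s in step_dicts
        ),
        # Every step carries its parameters as a JSON object.
        all_dicts and all(isinstance(s.get("params"), dict) for s in step_dicts),
    ]
    return sum(checks) / len(checks)


good = '{"steps": [{"plugin_id": "web_search", "params": {"query": "AI news"}}]}'
print(plan_reward(good, {"web_search"}))
```

Partial credit of this kind gives the optimizer a gradient toward valid structure even when the plugin choice is still wrong, which is one plausible reason to stage the checks from easy to hard in a curriculum.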
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "airev-ae/Qwen-0.8B-AgentJSON"

# Load the weights in bf16, matching the training precision.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    {"role": "system", "content": "You are an AI agent orchestrator. Generate a JSON execution plan."},
    {"role": "user", "content": "Search for the latest AI news"},
]

# Render the chat template as text, then tokenize it for generation.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, temperature=0.8, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True))
```
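Because the reported JSON validity is below 100%, callers will likely want to parse the decoded text defensively before acting on it. A minimal sketch, assuming the plan is the first JSON object in the generated text; the helper name `extract_plan` is hypothetical and not part of any library.

```python
import json
from typing import Optional


def extract_plan(generated: str) -> Optional[dict]:
    """Return the first JSON object found in `generated`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = generated.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _end = decoder.raw_decode(generated, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try again from the next opening brace.
        start = generated.find("{", start + 1)
    return None


plan = extract_plan('Here is the plan: {"steps": []} Done.')
if plan is None:
    print("No valid JSON plan found; consider retrying generation.")
else:
    print(plan)
```

Returning `None` rather than raising lets the caller decide whether to retry with a lower temperature, fall back to another model, or surface an error.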
```bibtex
@misc{airev2026agentjson,
  title={AIREV Qwen-0.8B-AgentJSON: Sub-1B Tool Calling via Progressive Curriculum GRPO},
  author={AIREV FZ-LLC},
  year={2026},
  url={https://huggingface.co/airev-ae/Qwen-0.8B-AgentJSON}
}
```
Built by AIREV | OnDemand Platform | Abu Dhabi, UAE
Trained with Progressive Curriculum GRPO — a novel approach for sub-1B structured output generation.