gemma3-1b-cardholder-task
On-device, INT4-quantized MediaPipe .task build of google/gemma-3-1b-it,
LoRA fine-tuned to emit search_contacts(...) tool calls for an offline business-card /
contact-holder Android app (via expo-llm-mediapipe).
The model decides whether a user request needs a contact search and, if so, emits a single
structured search_contacts(...) call. It runs fully offline, including on phones with only
2–4 GB of RAM.
Files
| File | Size | Notes |
|---|---|---|
| gemma3-1b-cardholder-int4.task | ~665 MB | INT4 weights (4-bit feed-forward, 8-bit attention/embedding), CPU backend, self-contained tokenizer |
Tool contract
The model is trained to call a single function:
search_contacts(name=..., company=..., title=..., email=..., phone=..., note=...)
All arguments are optional; only the fields relevant to the request are filled. For small-talk / non-search requests the model replies in natural language without a tool call.
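The contract above can be checked on the host side with a small parser. The following is a minimal sketch, not the app's actual parser: it assumes the model writes the call in Python-like keyword syntax with quoted string values (the exact output format is not specified here), and the helper name parse_search_contacts is hypothetical.

```python
import ast

# Field names taken from the tool contract above.
ALLOWED_FIELDS = {"name", "company", "title", "email", "phone", "note"}


def parse_search_contacts(text):
    """Return the call's arguments as a dict, or None if `text` is not a valid call.

    Assumption: the model emits Python-like keyword syntax with quoted strings,
    e.g. search_contacts(name="Ana", company="Acme").
    """
    try:
        tree = ast.parse(text.strip(), mode="eval")
    except (SyntaxError, ValueError):
        return None  # natural-language reply or malformed call

    call = tree.body
    if not (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "search_contacts"
        and not call.args  # positional arguments are not part of the contract
    ):
        return None

    args = {}
    for kw in call.keywords:
        # Reject unknown fields, **kwargs (kw.arg is None), and repeated fields.
        if kw.arg not in ALLOWED_FIELDS or kw.arg in args:
            return None
        # Only plain string literals are accepted as values.
        if not (isinstance(kw.value, ast.Constant) and isinstance(kw.value.value, str)):
            return None
        args[kw.arg] = kw.value.value
    return args
```

A reply for which this returns None would be treated as ordinary natural-language text; a dict (possibly empty, since all arguments are optional) would be passed on to the contact search.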
Training
- Base: google/gemma-3-1b-it (text-only, ~1B params)
- Method: LoRA (r=16, alpha=16, dropout=0), 4-bit QLoRA via Unsloth
- Data: 5,000 multi-turn function-calling examples (es / en / no / pt / de / fr)
- Schedule: 3 epochs, 900 steps, effective batch 16, LR 2e-4 cosine, train_on_responses_only
- Final train loss: 0.0515
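As a quick sanity check, the schedule figures can be related to the dataset size. This sketch only does the arithmetic; it assumes the 200 held-out evaluation examples were carved out of the 5,000-example dataset, which is not stated explicitly.

```python
# Figures from the training summary above.
epochs = 3
total_steps = 900
effective_batch = 16
dataset_size = 5000

# Assumption: the 200 held-out evaluation examples come out of the 5,000.
held_out = 200

steps_per_epoch = total_steps // epochs                  # 300 optimizer steps per epoch
examples_per_epoch = steps_per_epoch * effective_batch   # 4800 examples seen per epoch

# Under the assumption above, the schedule covers the training split exactly.
assert examples_per_epoch == dataset_size - held_out
```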
Evaluation (held-out 200 examples)
| Metric | Value |
|---|---|
| Examples checked | 200 |
| Should tool-call | 176 |
| Valid parseable calls | 176 of 176 (100.0%) |
| Gate (≥95%) | PASS |
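The gate in the table can be reproduced from the two counts. This is an illustrative sketch: it assumes the percentage is taken over the examples that should produce a tool call (176), not over all 200, and the function name parse_rate_gate is hypothetical.

```python
def parse_rate_gate(valid_calls, should_call, threshold=0.95):
    """Return (rate, passed) for the share of expected tool calls that parsed.

    Assumption: the rate is computed over examples that should tool-call,
    not over the whole held-out set.
    """
    if should_call <= 0:
        raise ValueError("should_call must be positive")
    rate = valid_calls / should_call
    return rate, rate >= threshold


rate, passed = parse_rate_gate(valid_calls=176, should_call=176)
print(f"{rate:.1%} -> {'PASS' if passed else 'FAIL'}")  # prints "100.0% -> PASS"
```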
Usage
Load with MediaPipe LLM Inference / expo-llm-mediapipe on Android and point the runtime
at gemma3-1b-cardholder-int4.task.
License
Use is governed by the Gemma Terms of Use.