gemma3-1b-cardholder-task

On-device, INT4-quantized MediaPipe .task build of google/gemma-3-1b-it, LoRA fine-tuned to emit search_contacts(...) tool calls for an offline business-card / contact-holder Android app (via expo-llm-mediapipe).

The model decides whether a user request needs a contact search and, if so, emits a single structured search_contacts(...) call. It runs fully offline, including on low-memory phones with 2–4 GB of RAM.

Files

| File | Size | Notes |
| --- | --- | --- |
| gemma3-1b-cardholder-int4.task | ~665 MB | INT4 weights (4-bit feed-forward, 8-bit attention/embedding), CPU backend, self-contained tokenizer |

Tool contract

The model is trained to call a single function:

search_contacts(name=..., company=..., title=..., email=..., phone=..., note=...)

All arguments are optional; only the fields relevant to the request are filled. For small-talk / non-search requests the model replies in natural language without a tool call.
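As an illustration of the contract above, here is a minimal Python sketch of how a reply could be classified as either a tool call or plain text. It assumes the model emits the call as a single Python-style expression with string-literal keyword arguments; the card does not specify the exact wire format, so the parsing strategy and the helper name parse_search_contacts are hypothetical. An on-device app would need an equivalent parser in its own language.

```python
import ast

# The six optional fields named in the tool contract.
ALLOWED_FIELDS = {"name", "company", "title", "email", "phone", "note"}


def parse_search_contacts(reply: str):
    """Return the search fields as a dict, or None if the reply is not a valid call.

    Assumption: the call is written like a Python expression, e.g.
    search_contacts(name="Ana", company="Acme").
    """
    try:
        tree = ast.parse(reply.strip(), mode="eval")
    except (SyntaxError, ValueError):
        return None  # most natural-language replies fail to parse as an expression

    call = tree.body
    is_expected_call = (
        isinstance(call, ast.Call)
        and isinstance(call.func, ast.Name)
        and call.func.id == "search_contacts"
        and not call.args  # keyword arguments only
    )
    if not is_expected_call:
        return None

    fields = {}
    for kw in call.keywords:
        value = kw.value
        # Reject unknown fields, **kwargs, and non-string values.
        if kw.arg not in ALLOWED_FIELDS:
            return None
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            return None
        fields[kw.arg] = value.value
    return fields
```

Parsing with ast rather than eval means the reply is never executed, and any field outside the contract causes the whole call to be rejected instead of being silently dropped.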

Training

  • Base: google/gemma-3-1b-it (text-only, ~1B params)
  • Method: LoRA (r=16, alpha=16, dropout=0), 4-bit QLoRA via Unsloth
  • Data: 5,000 multi-turn function-calling examples (es / en / no / pt / de / fr)
  • Schedule: 3 epochs (900 steps), effective batch size 16, learning rate 2e-4 with a cosine schedule, loss computed on assistant responses only (train_on_responses_only)
  • Final train loss: 0.0515
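The schedule figures can be cross-checked with a short calculation. The sketch below assumes that the 200 held-out evaluation examples were taken out of the 5,000 examples before training, which the card does not state explicitly; under that assumption the step count matches the reported 900. The function name optimizer_steps is illustrative only.

```python
import math


def optimizer_steps(num_examples: int, effective_batch: int, epochs: int) -> int:
    """Total optimizer steps when each epoch rounds up to whole batches."""
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


total_examples = 5000
held_out = 200  # assumption: the evaluation split comes out of the 5,000
train_examples = total_examples - held_out  # 4800

# 4800 / 16 = 300 steps per epoch, times 3 epochs = 900 steps
steps = optimizer_steps(train_examples, effective_batch=16, epochs=3)
print(steps)
```

Training on all 5,000 examples with the same settings would give 313 steps per epoch and 939 steps in total, so the reported 900 is more consistent with a 4,800-example training split.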

Evaluation (200 held-out examples)

| Metric | Value |
| --- | --- |
| Examples checked | 200 |
| Should tool-call | 176 |
| Valid parseable calls | 176 (100.0%) |
| Gate (≥95%) | PASS |
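The gate can be read as a simple ratio check. The sketch below assumes the rate is computed as valid parseable calls divided by the number of examples that should produce a tool call; the card does not spell out the formula, and the function name gate is hypothetical.

```python
def gate(valid_calls: int, expected_calls: int, threshold: float = 0.95):
    """Return (rate, passed) for the parseable-call gate."""
    if expected_calls <= 0:
        raise ValueError("expected_calls must be positive")
    rate = valid_calls / expected_calls
    return rate, rate >= threshold


rate, passed = gate(valid_calls=176, expected_calls=176)
print(f"{rate:.1%} -> {'PASS' if passed else 'FAIL'}")
```

With 176 expected calls, a 95% threshold under this reading requires at least 168 valid calls, since 167 of 176 is about 94.9%.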

Usage

Load with MediaPipe LLM Inference / expo-llm-mediapipe on Android and point the runtime at gemma3-1b-cardholder-int4.task.

License

Use is governed by the Gemma Terms of Use.
