gemma3-1b-cardholder-task

On-device, INT4-quantized MediaPipe .task build of google/gemma-3-1b-it, LoRA fine-tuned to emit search_contacts(...) tool calls for an offline business-card / contact-holder Android app (via expo-llm-mediapipe).

The model decides whether a user request needs a contact search and, if so, emits a single structured search_contacts(...) call. It runs fully offline, including on low-memory phones with 2–4 GB of RAM.

Files

File	Size	Notes
`gemma3-1b-cardholder-int4.task`	~665 MB	INT4 weights (4-bit feed-forward, 8-bit attention/embedding), CPU backend, self-contained tokenizer

Tool contract

The model is trained to call a single function:

search_contacts(name=..., company=..., title=..., email=..., phone=..., note=...)

All arguments are optional; only the fields relevant to the request are filled. For small-talk / non-search requests the model replies in natural language without a tool call.
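On the app side, the model output still has to be told apart from a plain-language reply, and the arguments extracted. The sketch below shows one way to do that, assuming the model writes the call in Python-style keyword syntax with quoted string values (this card does not specify the exact surface format). `parse_search_contacts` and `ALLOWED_FIELDS` are hypothetical names used for illustration; they are not part of the model or of expo-llm-mediapipe.

```python
import ast

# Field names taken from the tool contract above.
ALLOWED_FIELDS = {"name", "company", "title", "email", "phone", "note"}


def parse_search_contacts(text):
    """Return the call's arguments as a dict, or None if `text` is not
    exactly one well-formed search_contacts(...) call."""
    try:
        # Parse without executing anything; assumes Python-like call syntax.
        tree = ast.parse(text.strip(), mode="eval")
    except (SyntaxError, ValueError):
        return None  # natural-language reply or malformed call

    call = tree.body
    if not isinstance(call, ast.Call):
        return None
    if not (isinstance(call.func, ast.Name) and call.func.id == "search_contacts"):
        return None
    if call.args:
        return None  # positional arguments are not part of the contract

    args = {}
    for kw in call.keywords:
        if kw.arg not in ALLOWED_FIELDS or kw.arg in args:
            return None  # unknown field, **kwargs, or repeated field
        if not (isinstance(kw.value, ast.Constant) and isinstance(kw.value.value, str)):
            return None  # only string literals are accepted in this sketch
        args[kw.arg] = kw.value.value
    return args


print(parse_search_contacts('search_contacts(name="Ana", company="Acme")'))
print(parse_search_contacts("Hi! How can I help?"))
```

A call with no arguments yields `{}`, which is distinct from `None` (no valid call), so callers should test `is not None` rather than truthiness.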

Training

Base: google/gemma-3-1b-it (text-only, ~1B params)
Method: LoRA (r=16, alpha=16, dropout=0), 4-bit QLoRA via Unsloth
Data: 5,000 multi-turn function-calling examples (es / en / no / pt / de / fr)
Schedule: 3 epochs, 900 steps, effective batch 16, LR 2e-4 cosine, train_on_responses_only
Final train loss: 0.0515
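As a sanity check on the schedule, the step count can be related to the dataset size. The sketch below assumes that "steps" means optimizer steps, that the last partial batch of each epoch is kept, and that the 200 held-out evaluation examples were taken out of the 5,000 (the card does not state this; it is an inference that happens to make the numbers line up). `total_steps` is a hypothetical helper.

```python
import math


def total_steps(n_examples, effective_batch, epochs):
    """Optimizer steps for a run, keeping the last partial batch of each epoch."""
    return epochs * math.ceil(n_examples / effective_batch)


# Assumption: 5,000 examples minus 200 held out leaves 4,800 for training.
print(total_steps(4800, 16, 3))  # 3 * 300 = 900, matching the reported schedule
print(total_steps(5000, 16, 3))  # 3 * 313 = 939, if nothing were held out
```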

Evaluation (200 held-out examples)

Metric	Value
Examples checked	200
Expected to tool-call	176
Valid parseable calls	176 of 176 (100.0%)
Gate (≥95% valid calls)	PASS
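The gate can be reproduced from the table with a short calculation. This sketch assumes the percentage is taken over the 176 examples that were expected to produce a tool call, which is the only reading consistent with 100.0%; the table does not report results for the remaining 24 examples. `tool_call_gate` is a hypothetical helper, not the authors' evaluation script.

```python
def tool_call_gate(expected_calls, valid_calls, threshold=0.95):
    """Return (valid-call rate, whether the rate meets the threshold)."""
    rate = valid_calls / expected_calls
    return rate, rate >= threshold


rate, passed = tool_call_gate(expected_calls=176, valid_calls=176)
print(f"{rate:.1%}", "PASS" if passed else "FAIL")
```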

Usage

Load with MediaPipe LLM Inference / expo-llm-mediapipe on Android and point the runtime at gemma3-1b-cardholder-int4.task.

License

Use is governed by the Gemma Terms of Use.


Model tree for ai-colombia/gemma3-1b-cardholder-task

Base model: google/gemma-3-1b-pt
Fine-tuned: google/gemma-3-1b-it
Fine-tuned: this model