KZ/RU Privacy Filter — Phase 7 (morphology-hardened)

A fine-tune of openai/privacy-filter (OPF) specialized for Kazakh / Russian PII detection, with particular focus on case-inflected and morphologically varied personal names. It targets a specific failure mode: the base model and naive fine-tunes leak Kazakh names that appear inline, declined, or as patronymics.

It is a drop-in OPF checkpoint: same architecture, same 8-category label taxonomy, run with the opf CLI / API.

Scope: this is a pre-send risk-reduction filter, not a guaranteed anonymizer. See "Limitations" — a small fraction of bare, standalone, inflected names can still pass.

Labels

The 8 OPF categories: private_person, private_address, private_email, private_phone, private_url, private_date, account_number, secret.

Usage

Command line:

pip install <the opf package>   # https://github.com/openai/privacy-filter (Apache-2.0)
opf redact "Кеше Айгүлге хабарластық, ал Кузнецову Дмитрию Андреевичу жазылды." \
  --checkpoint /path/to/this/checkpoint --device cpu

Python:

from opf._api import OPF
opf = OPF(model="/path/to/this/checkpoint", device="cpu", output_mode="typed")
for s in opf.redact("Айгүлге хабарластық.").detected_spans:
    print(s.label, s.start, s.end, s.text)
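
The detected spans in the Python snippet above expose label, start, and end, which is enough to build a masked copy of the text before it leaves your system. The following is a minimal sketch, not part of the opf package: Span and mask_spans are hypothetical names, and it assumes start/end are character offsets with an exclusive end and that spans do not overlap.

```python
from typing import NamedTuple


class Span(NamedTuple):
    # Hypothetical stand-in for a detected span; field names mirror the
    # attributes printed in the usage snippet (label, start, end).
    label: str
    start: int  # assumed: character offset, inclusive
    end: int    # assumed: character offset, exclusive


def mask_spans(text: str, spans: list[Span]) -> str:
    """Replace each span with a [LABEL] placeholder (assumes no overlaps)."""
    out = text
    # Work right to left so offsets of earlier spans stay valid.
    for s in sorted(spans, key=lambda s: s.start, reverse=True):
        out = out[:s.start] + f"[{s.label.upper()}]" + out[s.end:]
    return out


if __name__ == "__main__":
    text = "Call Aigul at 555-0100."
    name_at = text.index("Aigul")
    phone_at = text.index("555-0100")
    spans = [
        Span("private_person", name_at, name_at + len("Aigul")),
        Span("private_phone", phone_at, phone_at + len("555-0100")),
    ]
    print(mask_spans(text, spans))
```

With real output you would build the list from the label, start, and end of each detected span; if the offsets turn out to be byte- or token-based, the slicing would need to change accordingly.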

Evaluation

Out-of-distribution Kazakh/Russian morphology stress set (109 hand-built examples / 120 spans / 66 person spans), span-level, compared across the project's fine-tune lineage. All numbers are copied from the eval metric files (constrained-Viterbi decode, typed eval mode):

Metric                      Base OPF   Phase 4D (synthetic)   Phase 5 (web)   Phase 6 (inline)   Phase 7 (this model)
detection.f1                0.8565     0.8758                 0.8794          0.9390             0.9412
detection span recall       0.6000     0.6500                 0.6167          0.8500             0.8833
detection span F1           0.6667     0.7126                 0.7114          0.8206             0.8547
detection span precision    0.7500     0.7885                 0.8404          0.7931             0.8279
private_person recall       0.4697     0.5909                 0.4242          0.8485             0.9091
private_person precision    0.8750     0.8400                 0.8750          0.8254             0.8769
private_person F1           0.6113     0.6938                 0.5714          0.8368             0.8927

This checkpoint is the best of the lineage on every person metric: person recall rises from the base model's 0.470 to 0.909, while precision stays essentially level (0.875 to 0.877).
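
For readers who want to reproduce span-level numbers on their own data, here is a minimal sketch of span-level precision, recall, and F1. It assumes a predicted span counts as correct only when its label and both offsets match a gold span exactly; the source does not state which matching rule the eval files use, so treat that as an assumption. The function names are hypothetical.

```python
SpanKey = tuple[str, int, int]  # (label, start, end)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def span_prf(gold: set[SpanKey], pred: set[SpanKey]) -> tuple[float, float, float]:
    """Exact-match span scoring (assumed rule): returns (precision, recall, F1)."""
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


if __name__ == "__main__":
    gold = {("private_person", 0, 7), ("private_phone", 20, 32)}
    pred = {("private_person", 0, 7), ("private_person", 40, 48)}
    print(span_prf(gold, pred))
    # Consistency check against the table: Phase 7 person precision and recall.
    print(round(f1_score(0.8769, 0.9091), 4))
```

The last line recomputes the Phase 7 private_person F1 from the reported precision and recall, which is a quick way to check that the table rows agree with one another.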

Statistical caveat (read this): the eval set is small — 66 person spans. 95% confidence intervals are wide (e.g. person recall ≈ [0.82, 0.96]) and overlap between adjacent phases. Treat the lineage trend as solid but individual ±1–2-point differences as within noise.
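
To see how wide the interval is for a sample this small, the sketch below computes a 95% Wilson score interval. Two assumptions are made here: the source does not say which interval method it used (Wilson is one common choice for proportions), and the count of 60 recovered spans is inferred from 0.9091 × 66 rather than stated.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Assumed counts: 60 of 66 person spans recovered (recall 0.9091).
    low, high = wilson_interval(60, 66)
    print(f"[{low:.2f}, {high:.2f}]")  # prints "[0.82, 0.96]"
```

Under these assumptions the result matches the interval quoted above, and running the same function on the Phase 6 recall shows how much adjacent phases overlap.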

OOD-purity caveat: Phase 7's training injector deliberately adds short standalone-name and name-final-boundary patterns that resemble the eval distribution. They never duplicate it exactly: exact text collisions are dropped, and a build-time assertion checks that none remain. Even so, for those two categories the eval is less purely out-of-distribution than it was for earlier phases, so treat the absolute gains as an upper bound on true generalization.
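
The collision guard described above can be sketched as follows. This is an illustration only, not the project's build script: drop_eval_collisions is a hypothetical name, and it assumes "exact text collision" means plain string equality with no normalization.

```python
def drop_eval_collisions(train_texts: list[str], eval_texts: list[str]) -> list[str]:
    """Remove training examples whose text exactly equals an eval example."""
    eval_set = set(eval_texts)
    kept = [text for text in train_texts if text not in eval_set]
    # Build-time guard: fail loudly if any exact eval text survived the filter.
    assert not eval_set.intersection(kept), "eval text leaked into training set"
    return kept


if __name__ == "__main__":
    train = ["Айгүлге хабарластық.", "Кеше Асқарға жаздым.", "Бүгін жиналыс болды."]
    evals = ["Кеше Асқарға жаздым."]
    clean = drop_eval_collisions(train, evals)
    print(len(train) - len(clean), "collision(s) dropped")
```

Exact matching only catches verbatim duplicates; near-duplicates that differ by a single suffix still pass, which is why the caveat above treats the gains as an upper bound.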

Training data & provenance

  • Carrier text: real Kazakh/Russian prose from nitec/kazakh-unsorted (a privately collected Kazakh web corpus, collected by QOSI), used with the collector's permission.
  • PII is entirely synthetic and fabricated. Real PII was stripped from the carrier text in a two-stage scrub (heuristic + model pass); fabricated-but-structurally-valid PII (IIN/IBAN with valid check digits, real KZ phone prefixes, public city/street names) was then injected at known offsets. No real personal data is in the training set or this model's intended outputs.
  • Corpus: 50k train / 6k val / 8k test / 4k hard-test, 1 epoch from the base checkpoint on an H200 (~4 min). Same size/config as the prior phases; only the injection distribution changed (boundary drills + public-name negatives + bare-standalone positives).
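
As an illustration of what "structurally valid" means for the fabricated IBANs mentioned above, here is a minimal sketch of the standard IBAN mod-97 check-digit scheme (ISO 13616). It is not the project's injector, whose implementation is not shown in this card; the function names are hypothetical, and the KZ account body in the example is a made-up placeholder.

```python
def _to_digits(s: str) -> str:
    # Letters map to 10..35 (A=10, ..., Z=35); digits map to themselves.
    return "".join(str(int(ch, 36)) for ch in s.upper())


def iban_check_digits(country: str, bban: str) -> str:
    """Compute the two IBAN check digits for a country code and account body."""
    # Move the country code to the end with "00" as placeholder check digits.
    remainder = int(_to_digits(bban + country + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_iban(iban: str) -> bool:
    """Validate check digits only (does not check country-specific length)."""
    compact = iban.replace(" ", "").upper()
    return int(_to_digits(compact[4:] + compact[:4])) % 97 == 1


if __name__ == "__main__":
    # Hypothetical, fabricated account body used purely for illustration.
    body = "125KZT5004100100"
    fake_iban = "KZ" + iban_check_digits("KZ", body) + body
    print(fake_iban, is_valid_iban(fake_iban))
```

A number built this way passes the same checksum a real validator applies, so the model sees realistic surface patterns without any real account being involved.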

Limitations

  • Not a guaranteed anonymizer: it reduces risk before text is sent to an external LLM. In the full gateway (deterministic rules + this model), 1 of 90 PII-bearing OOD examples still leaked: a bare, inflected, standalone Kazakh given name with no other signal.
  • Public-name precision frontier: the model occasionally tags public organizations (Назарбаев Университеті, Ұлттық банк) as persons.
  • Bare-number over-tagging: the model tags bare 12-digit numbers as account_number even without an IIN/БСН context token.
  • Optimized for Kazakh + Russian; not evaluated on other languages.

License & attribution

  • License: Apache-2.0, inherited from the base model openai/privacy-filter, whose Apache-2.0 license covers the weights.
  • This is a modified derivative. It was fine-tuned from openai/privacy-filter; the weights differ. See NOTICE.
  • OpenAI's name and marks are used only to identify the upstream base model and do not imply endorsement.
  • The bundled viterbi_calibration.json is the base model's neutral (all-zero) calibration, included so this is a complete OPF checkpoint; it matches how the metrics above were produced.