KZ/RU Privacy Filter — Phase 7 (morphology-hardened)

A fine-tune of openai/privacy-filter (OPF) specialized for Kazakh / Russian PII detection, with particular focus on case-inflected and morphologically varied personal names — the failure mode where the base model and naive fine-tunes leak Kazakh names that appear inline, declined, or as patronymics.

It is a drop-in OPF checkpoint: same architecture, same 8-category label taxonomy, run with the opf CLI / API.

Scope: this is a pre-send risk-reduction filter, not a guaranteed anonymizer. See "Limitations" — a small fraction of bare, standalone, inflected names can still pass.

Labels

The 8 OPF categories: private_person, private_address, private_email, private_phone, private_url, private_date, account_number, secret.

Usage

Command line:

pip install <the opf package>   # https://github.com/openai/privacy-filter (Apache-2.0)
opf redact "Кеше Айгүлге хабарластық, ал Кузнецову Дмитрию Андреевичу жазылды." \
  --checkpoint /path/to/this/checkpoint --device cpu

Python API:

from opf._api import OPF
opf = OPF(model="/path/to/this/checkpoint", device="cpu", output_mode="typed")
for s in opf.redact("Айгүлге хабарластық.").detected_spans:
    print(s.label, s.start, s.end, s.text)
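
If custom placeholders are needed, the label/start/end fields printed above can drive a simple masking pass. The following is a minimal sketch, not part of the opf package: `mask_spans` is a hypothetical helper, spans are passed as plain `(label, start, end)` tuples, and it assumes that offsets are character offsets into the input string and that spans do not overlap.

```python
def mask_spans(text, spans):
    """Replace each (label, start, end) span in text with a [label] placeholder.

    Hypothetical helper: assumes character offsets and non-overlapping spans.
    """
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for label, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        out = out[:start] + f"[{label}]" + out[end:]
    return out


masked = mask_spans("Айгүлге хабарластық.", [("private_person", 0, 7)])
print(masked)
```

Replacing from the end of the string backwards avoids having to recompute offsets after each substitution.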

Evaluation

Out-of-distribution Kazakh/Russian morphology stress set (109 hand-built examples / 120 spans / 66 person spans), span-level, compared across the project's fine-tune lineage. All numbers are copied from the eval metric files (constrained-Viterbi decode, typed eval mode):

metric	base OPF	4D (synthetic)	5 (web)	6 (inline)	7 (this model)
detection.f1	0.8565	0.8758	0.8794	0.9390	0.9412
detection span recall	0.6000	0.6500	0.6167	0.8500	0.8833
detection span F1	0.6667	0.7126	0.7114	0.8206	0.8547
detection span precision	0.7500	0.7885	0.8404	0.7931	0.8279
`private_person` recall	0.4697	0.5909	0.4242	0.8485	0.9091
`private_person` precision	0.8750	0.8400	0.8750	0.8254	0.8769
`private_person` F1	0.6113	0.6938	0.5714	0.8368	0.8927

This checkpoint is the best of the lineage on every person metric: person recall rises from the base model's 0.470 to 0.909, while precision holds at roughly the base level (0.875 to 0.877).
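
The F1 rows in the table can be cross-checked from the precision and recall rows. This sketch assumes the eval reports the standard harmonic-mean F1; the function name is illustrative, and the inputs are the rounded table values, so the last digit may differ slightly from the metric files.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) for private_person, taken from the table above.
person = {
    "base OPF": (0.8750, 0.4697),
    "7 (this model)": (0.8769, 0.9091),
}
for name, (p, r) in person.items():
    print(name, round(f1_score(p, r), 4))
```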

Statistical caveat (read this): the eval set is small — 66 person spans. 95% confidence intervals are wide (e.g. person recall ≈ [0.82, 0.96]) and overlap between adjacent phases. Treat the lineage trend as solid but individual ±1–2-point differences as within noise.
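
To make the width of that interval concrete, here is a minimal sketch using the Wilson score interval. The card does not say which interval method was used, so Wilson is an assumption here, as is the count of 60 recalled spans (inferred from 0.9091 × 66). Under those assumptions the result is roughly 0.82 to 0.96, in line with the range quoted above.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Assumed counts: 60 of 66 person spans recalled (0.9091 * 66).
low, high = wilson_interval(60, 66)
print(f"person recall 95% CI: [{low:.2f}, {high:.2f}]")
```

With only 66 spans, a single additional hit or miss moves recall by about 1.5 points, which is why small differences between phases should not be over-read.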

OOD-purity caveat: Phase 7's training injector deliberately adds short standalone-name and name-final-boundary patterns that resemble the eval distribution. They never duplicate it exactly: exact text collisions are dropped, and the build asserts that none remain. Even so, for those two categories the eval is less purely out-of-distribution than it was for earlier phases, so treat the absolute gains as an upper bound on true generalization.

Training data & provenance

Carrier text: real Kazakh/Russian prose from nitec/kazakh-unsorted (a privately-collected Kazakh web corpus, collected by QOSI), used with the collector's permission.
PII is entirely synthetic and fabricated. Real PII was stripped from the carrier text in a two-stage scrub (heuristic + model pass); fabricated-but-structurally-valid PII (IIN/IBAN with valid check digits, real KZ phone prefixes, public city/street names) was then injected at known offsets. No real personal data is in the training set or this model's intended outputs.
Corpus: 50k train / 6k val / 8k test / 4k hard-test, 1 epoch from the base checkpoint on an H200 (~4 min). Same size/config as the prior phases; only the injection distribution changed (boundary drills + public-name negatives + bare-standalone positives).
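
As an illustration of what "fabricated-but-structurally-valid" means for IBANs, the sketch below generates a random account string and computes its check digits with the standard mod-97 scheme (ISO 13616). It is not the project's injector: the function names are hypothetical, and the Kazakh layout used here (a 3-digit bank code followed by 13 alphanumeric characters, 20 characters in total) is an assumption.

```python
import random
import string


def _to_digits(s):
    # IBAN letter mapping: A=10 ... Z=35; digits map to themselves.
    return "".join(str(int(ch, 36)) for ch in s)


def iban_check_digits(country, bban):
    """Compute the two mod-97 check digits for a country code and BBAN."""
    remainder = int(_to_digits(bban + country + "00")) % 97
    return f"{98 - remainder:02d}"


def is_valid_iban(iban):
    """Check the mod-97 condition only (no per-country length check)."""
    iban = iban.replace(" ", "").upper()
    return int(_to_digits(iban[4:] + iban[:4])) % 97 == 1


def fabricate_kz_iban(rng):
    """Build a random KZ-style IBAN (assumed layout: 3 digits + 13 alphanumerics)."""
    bank = "".join(rng.choice(string.digits) for _ in range(3))
    alphabet = string.ascii_uppercase + string.digits
    account = "".join(rng.choice(alphabet) for _ in range(13))
    bban = bank + account
    return "KZ" + iban_check_digits("KZ", bban) + bban


iban = fabricate_kz_iban(random.Random(0))
print(iban, is_valid_iban(iban))
```

A value produced this way passes a checksum-based validator while being randomly generated rather than copied from any real account.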

Limitations

Not a guaranteed anonymizer — risk reduction before sending text to an external LLM. In the full gateway (deterministic rules + this model), 1 of 90 PII-bearing OOD examples still leaked: a bare, inflected, standalone Kazakh given name with no other signal.
Public-name precision frontier: the model occasionally tags public organizations (Назарбаев Университеті, Ұлттық банк) as persons, and tags bare 12-digit numbers as account_number even without an IIN/БСН context token.
Optimized for Kazakh + Russian; not evaluated on other languages.

License & attribution

License: Apache 2.0, inherited from the base model openai/privacy-filter, whose Apache-2.0 license covers the weights.
This is a modified derivative. It was fine-tuned from openai/privacy-filter; the weights differ. See NOTICE.
OpenAI's name and marks are used only to identify the upstream base model and do not imply endorsement.
The bundled viterbi_calibration.json is the base model's neutral (all-zero) calibration, included so this is a complete OPF checkpoint; it matches how the metrics above were produced.

Model details

Repository: QOSIkz/kz-privacy-filter-v1
Base model: openai/privacy-filter
Model size: 1B params
Tensor type: BF16 (Safetensors)