khopilot/khmer-tokenizer-v7
Tags: Feature Extraction, Transformers, Khmer, tokenization, graph-regularization, sentencepiece, nlp, semantic-embeddings
License: mit
Branch: main
Repository size: 41 MB
Contributors: 1
History: 29 commits
Latest commit: 7f966f9 (verified) by khopilot, 5 months ago: "Upload CHANGELOG.md with huggingface_hub"
| File | Size | Storage | Scan |
|---|---|---|---|
| CHANGELOG.md | 2.4 kB | | Safe |
| CITATION.cff | 757 Bytes | | Safe |
| README.md | 7.03 kB | | Safe |
| edges_pruned.tsv | 126 kB | | Safe |
| lexeme_embeddings.pt | 38.9 MB | xet | Safe (pickle) |
| lexeme_subwords_prod8k_v22.tsv | 688 kB | | Safe |
| metrics_corrected.yaml | 5.86 kB | | Safe |
| nodes.tsv | 950 kB | | Safe |
| spm_km_8k_prod.model | 164 kB | xet | Safe |
| spm_km_8k_prod.vocab | 169 kB | | Safe |
| tokenizer_config.json | 410 Bytes | | Safe |

Each file's last commit is "Upload <filename> with huggingface_hub", made 5 months ago.

Detected pickle imports for lexeme_embeddings.pt (3): "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.FloatStorage"
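The listing reports three pickle imports for lexeme_embeddings.pt. As a rough illustration of how such imports can be listed without unpickling anything, the sketch below walks a pickle stream with the standard-library `pickletools` module and collects the targets of `GLOBAL` opcodes. This is not the Hub's scanner. The helper name `list_pickle_globals` is hypothetical, and the sample payload is a small `OrderedDict` built in the example, not the repository's checkpoint. The sketch also assumes an older pickle protocol: protocol 4 and later use `STACK_GLOBAL`, which this helper does not resolve.

```python
import collections
import pickle
import pickletools


def list_pickle_globals(data):
    """Return the 'module.name' targets of GLOBAL opcodes in a pickle stream.

    Hypothetical helper: it only inspects opcodes and never unpickles,
    so nothing in `data` is executed. STACK_GLOBAL (protocol 4+) is ignored.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            found.add(module + "." + name)
    return found


# Stand-in payload (an assumption for illustration): protocol 2 emits GLOBAL
# opcodes; fix_imports=False keeps Python 3 module names.
payload = pickle.dumps(collections.OrderedDict(a=1), protocol=2, fix_imports=False)
imports = list_pickle_globals(payload)
print(sorted(imports))
```

Files written by recent versions of `torch.save` are zip archives with the pickle stored in a `data.pkl` member, so applying this to a real checkpoint would first require reading that member, for example with `zipfile`.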