metadata
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:1122
- loss:BinaryCrossEntropyLoss
base_model: cross-encoder/stsb-roberta-large
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- accuracy
- accuracy_threshold
- f1
- f1_threshold
- precision
- recall
- average_precision
model-index:
- name: CrossEncoder based on cross-encoder/stsb-roberta-large
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: validation
      type: validation
    metrics:
    - type: accuracy
      value: 0.95
      name: Accuracy
    - type: accuracy_threshold
      value: 0.4996068775653839
      name: Accuracy Threshold
    - type: f1
      value: 0.9517241379310345
      name: F1
    - type: f1_threshold
      value: 0.3665258288383484
      name: F1 Threshold
    - type: precision
      value: 0.9452054794520548
      name: Precision
    - type: recall
      value: 0.9583333333333334
      name: Recall
    - type: average_precision
      value: 0.9752684979366919
      name: Average Precision
CrossEncoder based on cross-encoder/stsb-roberta-large
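This is a Cross Encoder model fine-tuned from cross-encoder/stsb-roberta-large using the sentence-transformers library. It computes a score for each pair of texts, which can be used for text reranking and semantic search. Judging by the samples shown in this card, the pairs are Indian postal addresses, labeled 1.0 when both strings refer to the same address and 0.0 otherwise.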
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: cross-encoder/stsb-roberta-large
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("pujithapsx/address-crossencoder-stsb-roberta-large-finetuned")

# Get scores for pairs of texts
pairs = [
    ['C/O Rakesh Tower C Sector 137 Gurgaon', 'C Tower Sec-137 Gurugram'],
    ['Tellapur Hyderabad', 'Telapur Hyderabad'],
    ['Flat 703 Electronic City Bangalore', 'Flat 703 Electronic City Mumbai'],
    ['B-12 Malviya Nagar Delhi', 'B-22 Malviya Nagar Delhi'],
    ['Flat 1203 Lower Parel Mumbai', 'Flat 1203 Lower Parel Chennai'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'C/O Rakesh Tower C Sector 137 Gurgaon',
    [
        'C Tower Sec-137 Gurugram',
        'Telapur Hyderabad',
        'Flat 703 Electronic City Mumbai',
        'B-22 Malviya Nagar Delhi',
        'Flat 1203 Lower Parel Chennai',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
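To turn scores into match / no-match decisions, one option is to compare them against the thresholds reported in the Evaluation section. The sketch below assumes that `model.predict` returns scores on the same scale the evaluator used when it selected those thresholds, and that a score at or above the threshold counts as a match. The helper `label_matches` and the example scores are hypothetical and stand in for real model output.

```python
# Thresholds reported in the Evaluation section of this card (rounded).
ACCURACY_THRESHOLD = 0.4996
F1_THRESHOLD = 0.3665


def label_matches(scores, threshold=ACCURACY_THRESHOLD):
    """Return one boolean per score: True where the score reaches the threshold.

    Hypothetical helper, not part of sentence-transformers.
    """
    return [float(score) >= threshold for score in scores]


# Made-up scores standing in for the output of model.predict(pairs).
example_scores = [0.97, 0.91, 0.08, 0.42, 0.03]

print(label_matches(example_scores))
print(label_matches(example_scores, threshold=F1_THRESHOLD))
```

The lower F1 threshold labels more borderline pairs as matches, which trades some precision for recall.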
Evaluation
Metrics
Cross Encoder Classification
- Dataset: validation
- Evaluated with CrossEncoderClassificationEvaluator
| Metric | Value |
|---|---|
| accuracy | 0.95 |
| accuracy_threshold | 0.4996 |
| f1 | 0.9517 |
| f1_threshold | 0.3665 |
| precision | 0.9452 |
| recall | 0.9583 |
| average_precision | 0.9753 |
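The precision, recall, and F1 values in the table are related by F1 = 2PR / (P + R). As a quick consistency check, the sketch below reproduces all three from a single set of confusion counts. The counts (69 true positives, 4 false positives, 3 false negatives) are an assumption chosen because they match the reported values and fit within the 140 evaluation samples; the card itself does not list them.

```python
import math
from fractions import Fraction

# Assumed confusion counts; not stated in the card.
true_positives, false_positives, false_negatives = 69, 4, 3

precision = Fraction(true_positives, true_positives + false_positives)
recall = Fraction(true_positives, true_positives + false_negatives)
f1 = 2 * precision * recall / (precision + recall)

# Compare against the full-precision values from the card's metadata.
reported = {
    "precision": 0.9452054794520548,
    "recall": 0.9583333333333334,
    "f1": 0.9517241379310345,
}
computed = {"precision": precision, "recall": recall, "f1": f1}

for name, value in computed.items():
    matches = math.isclose(float(value), reported[name], rel_tol=1e-12)
    print(f"{name}: {value} = {float(value):.4f} (matches reported: {matches})")
```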
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,122 training samples
- Columns: sentence1, sentence2, and label
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | label |
|---|---|---|---|
| type | string | string | float |
| min | 8 characters | 8 characters | 0.0 |
| mean | 27.83 characters | 27.83 characters | 0.55 |
| max | 55 characters | 61 characters | 1.0 |

- Samples:

| sentence1 | sentence2 | label |
|---|---|---|
| Eighty Eight 8th Cross HSR Layout Bengaluru | 52 Fifty Two D Second Lane Marathahalli Bengaluru | 0.0 |
| Flat 301 C/O Sharma Kondapur Near Hitech City Hyderabad | Flat 301 C/O Sharma Kondapoor Near Hi Tech City Hyd | 1.0 |
| Anna Nagar 12B Chennai 600040 | 12B Anna Nagar Chennai | 1.0 |

- Loss: BinaryCrossEntropyLoss with these parameters:
  { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
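The `Identity` activation suggests that the loss receives the model's raw logits rather than probabilities. As an illustration of what binary cross-entropy on a logit computes, here is a minimal pure-Python sketch for a single example. It is not the library's implementation, and the function name `bce_with_logits` is hypothetical.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy for one raw logit and a label in [0, 1].

    Uses the numerically stable form
        max(x, 0) - x * y + log(1 + exp(-|x|))
    which equals -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident, correct prediction gives a small loss ...
print(bce_with_logits(4.0, 1.0))
# ... while the same logit with the opposite label gives a large one.
print(bce_with_logits(4.0, 0.0))
```

With `pos_weight` left as `null`, positive and negative pairs contribute to the loss with equal weight.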
Evaluation Dataset
Unnamed Dataset
- Size: 140 evaluation samples
- Columns: sentence1, sentence2, and label
- Approximate statistics based on the first 140 samples:

| | sentence1 | sentence2 | label |
|---|---|---|---|
| type | string | string | float |
| min | 13 characters | 12 characters | 0.0 |
| mean | 29.04 characters | 29.13 characters | 0.51 |
| max | 74 characters | 62 characters | 1.0 |

- Samples:

| sentence1 | sentence2 | label |
|---|---|---|
| C/O Rakesh Tower C Sector 137 Gurgaon | C Tower Sec-137 Gurugram | 1.0 |
| Tellapur Hyderabad | Telapur Hyderabad | 1.0 |
| Flat 703 Electronic City Bangalore | Flat 703 Electronic City Mumbai | 0.0 |

- Loss: BinaryCrossEntropyLoss with these parameters:
  { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
Training Hyperparameters
Non-Default Hyperparameters
- num_train_epochs: 6
- learning_rate: 1.5e-05
- warmup_steps: 0.1
- weight_decay: 0.01
- gradient_accumulation_steps: 4
- disable_tqdm: True
- eval_strategy: epoch
- per_device_eval_batch_size: 16
- load_best_model_at_end: True
All Hyperparameters
- per_device_train_batch_size: 8
- num_train_epochs: 6
- max_steps: -1
- learning_rate: 1.5e-05
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_steps: 0.1
- optim: adamw_torch_fused
- optim_args: None
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- optim_target_modules: None
- gradient_accumulation_steps: 4
- average_tokens_across_devices: True
- max_grad_norm: 1.0
- label_smoothing_factor: 0.0
- bf16: False
- fp16: False
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- use_liger_kernel: False
- liger_kernel_config: None
- use_cache: False
- neftune_noise_alpha: None
- torch_empty_cache_steps: None
- auto_find_batch_size: False
- log_on_each_node: True
- logging_nan_inf_filter: True
- include_num_input_tokens_seen: no
- log_level: passive
- log_level_replica: warning
- disable_tqdm: True
- project: huggingface
- trackio_space_id: trackio
- eval_strategy: epoch
- per_device_eval_batch_size: 16
- prediction_loss_only: True
- eval_on_start: False
- eval_do_concat_batches: True
- eval_use_gather_object: False
- eval_accumulation_steps: None
- include_for_metrics: []
- batch_eval_metrics: False
- save_only_model: False
- save_on_each_node: False
- enable_jit_checkpoint: False
- push_to_hub: False
- hub_private_repo: None
- hub_model_id: None
- hub_strategy: every_save
- hub_always_push: False
- hub_revision: None
- load_best_model_at_end: True
- ignore_data_skip: False
- restore_callback_states_from_checkpoint: False
- full_determinism: False
- seed: 42
- data_seed: None
- use_cpu: False
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- dataloader_prefetch_factor: None
- remove_unused_columns: True
- label_names: None
- train_sampling_strategy: random
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- ddp_backend: None
- ddp_timeout: 1800
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- deepspeed: None
- debug: []
- skip_memory_metrics: True
- do_predict: False
- resume_from_checkpoint: None
- warmup_ratio: None
- local_rank: -1
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | validation_average_precision |
|---|---|---|---|---|
| 0.2837 | 10 | 0.4552 | - | - |
| 0.5674 | 20 | 0.4294 | - | - |
| 0.8511 | 30 | 0.4078 | - | - |
| 1.0 | 36 | - | 0.2787 | 0.9570 |
| 1.1135 | 40 | 0.3982 | - | - |
| 1.3972 | 50 | 0.3678 | - | - |
| 1.6809 | 60 | 0.3367 | - | - |
| 1.9645 | 70 | 0.4198 | - | - |
| 2.0 | 72 | - | 0.2252 | 0.9702 |
| 2.2270 | 80 | 0.3148 | - | - |
| 2.5106 | 90 | 0.3862 | - | - |
| 2.7943 | 100 | 0.3374 | - | - |
| 3.0 | 108 | - | 0.1974 | 0.9725 |
| 3.0567 | 110 | 0.3272 | - | - |
| 3.3404 | 120 | 0.2932 | - | - |
| 3.6241 | 130 | 0.3010 | - | - |
| 3.9078 | 140 | 0.3119 | - | - |
| 4.0 | 144 | - | 0.1829 | 0.9736 |
| 4.1702 | 150 | 0.3005 | - | - |
| 4.4539 | 160 | 0.3292 | - | - |
| 4.7376 | 170 | 0.2207 | - | - |
| 5.0 | 180 | 0.2954 | 0.1745 | 0.9750 |
| 5.2837 | 190 | 0.2853 | - | - |
| 5.5674 | 200 | 0.2969 | - | - |
| 5.8511 | 210 | 0.2600 | - | - |
| **6.0** | **216** | **-** | **0.1719** | **0.9753** |
- The bold row denotes the saved checkpoint.
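The step counts in the log follow from the dataset size and the batch settings listed under the hyperparameters. The sketch below redoes that arithmetic, assuming training ran on a single device (the card does not state the device count) and that a partial final batch or accumulation group still counts as a step.

```python
import math

# Values taken from this card.
train_samples = 1122
per_device_train_batch_size = 8
gradient_accumulation_steps = 4
num_train_epochs = 6

# Assumption: one device; the card does not say how many were used.
num_devices = 1

# Batches the dataloader yields per epoch (the last one is partial).
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)

# One optimizer step per group of accumulated batches.
steps_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
total_steps = steps_per_epoch * num_train_epochs

# Number of samples contributing to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

print(f"batches per epoch:    {batches_per_epoch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"total steps:          {total_steps}")
print(f"effective batch size: {effective_batch_size}")
```

Under these assumptions the result is 36 steps per epoch and 216 steps in total, which agrees with the steps at which epochs 1.0 and 6.0 appear in the table.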
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 5.3.0
- Transformers: 5.3.0
- PyTorch: 2.11.0+cpu
- Accelerate: 1.13.0
- Datasets: 4.8.4
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}