genv3pair1NoGT_1.5B_cdpo_ebs32_lr1e-06_beta0.1_epoch16.0_42
This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the YuchenLi01/MATH_Qwen2.5-1.5BInstruct_DPO_MoreUniqueResponseNoGTv3pair1 dataset. It achieves the following results on the evaluation set:
- Loss: 1.0560
- Rewards/chosen: -0.0768
- Rewards/rejected: 0.0
- Rewards/accuracies: 0.5500
- Rewards/margins: -0.0768
- Logps/rejected: -56.2549
- Logps/chosen: -30.8380
- Logits/rejected: -3.3810
- Logits/chosen: -3.2576
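The reward figures above are related: the reported margin equals the chosen reward minus the rejected reward. The short check below reproduces that arithmetic from the listed values. It assumes the usual preference-training convention that the margin is this difference and that Rewards/accuracies is the fraction of evaluation pairs whose chosen reward exceeds the rejected one; the card itself does not define these metrics.

```python
# Final evaluation values copied from the list above.
rewards_chosen = -0.0768
rewards_rejected = 0.0
rewards_accuracy = 0.5500

# Assumption: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected
print(f"margin = {margin:.4f}")

# A negative margin together with an accuracy near 0.5 suggests the final
# checkpoint barely separates chosen from rejected responses on this set.
print(f"distance of accuracy from chance (0.5): {rewards_accuracy - 0.5:+.4f}")
```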
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-06
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 16.0
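As a rough illustration of how these settings fit together, the sketch below recomputes the total batch size and evaluates a linear-warmup, cosine-decay learning-rate curve. Several values are assumptions rather than facts from this card: gradient accumulation is taken to be 1 because none is listed, the figure of 179 optimizer steps per epoch is estimated from the results table (step 20 is logged at epoch 0.1117), and `lr_at` is a hypothetical helper that mirrors the common warmup-plus-cosine formula, not code taken from the training run.

```python
import math

# Values from the hyperparameter list above.
learning_rate = 1e-06
train_batch_size = 4      # per device
num_devices = 8
warmup_ratio = 0.1
num_epochs = 16

# Assumption: no gradient accumulation (none is listed).
total_train_batch_size = train_batch_size * num_devices

# Assumption: about 179 optimizer steps per epoch, estimated from the
# results table (step 20 is logged at epoch 0.1117).
steps_per_epoch = 179
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup to the peak rate, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("estimated total steps:", total_steps, "warmup steps:", warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>4}: lr = {lr_at(step):.3e}")
```

Under these assumptions the learning rate peaks at 1e-06 after roughly the first 10% of training and decays to zero by the last step.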
Training results
| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6912 | 0.1117 | 20 | -2.3786 | -2.2314 | -30.0473 | -41.6374 | 0.6928 | 0.5 | 0.0023 | 0.0023 | 0.0 |
| 0.6775 | 0.2235 | 40 | -2.3835 | -2.2418 | -29.7868 | -41.3972 | 0.6781 | 0.875 | 0.0283 | 0.0283 | 0.0 |
| 0.6352 | 0.3352 | 60 | -2.4506 | -2.3140 | -28.9355 | -40.3581 | 0.6334 | 0.9750 | 0.1135 | 0.1135 | 0.0 |
| 0.5527 | 0.4469 | 80 | -2.5961 | -2.4815 | -26.8917 | -38.1349 | 0.5456 | 1.0 | 0.3178 | 0.3178 | 0.0 |
| 0.4155 | 0.5587 | 100 | -2.7877 | -2.7043 | -24.6228 | -35.5825 | 0.4618 | 1.0 | 0.5447 | 0.5447 | 0.0 |
| 0.4148 | 0.6704 | 120 | -2.9855 | -2.9357 | -22.7730 | -33.9602 | 0.4124 | 0.9750 | 0.7297 | 0.7297 | 0.0 |
| 0.4048 | 0.7821 | 140 | -3.0851 | -3.0553 | -21.8011 | -33.0451 | 0.3880 | 0.9750 | 0.8269 | 0.8269 | 0.0 |
| 0.367 | 0.8939 | 160 | -3.1732 | -3.1657 | -20.7670 | -32.3537 | 0.3679 | 1.0 | 0.9303 | 0.9303 | 0.0 |
| 0.3302 | 1.0056 | 180 | -3.2066 | -3.2086 | -20.4659 | -31.9908 | 0.3615 | 0.9750 | 0.9604 | 0.9604 | 0.0 |
| 0.3375 | 1.1173 | 200 | -3.2463 | -3.2582 | -20.0837 | -31.6332 | 0.3559 | 0.9750 | 0.9986 | 0.9986 | 0.0 |
| 0.3172 | 1.2291 | 220 | -3.2746 | -3.2954 | -19.8890 | -31.5264 | 0.3517 | 0.9750 | 1.0181 | 1.0181 | 0.0 |
| 0.3177 | 1.3408 | 240 | -3.2941 | -3.3199 | -19.7086 | -31.3858 | 0.3488 | 0.9750 | 1.0361 | 1.0361 | 0.0 |
| 0.318 | 1.4525 | 260 | -3.3066 | -3.3382 | -19.4988 | -31.3553 | 0.3468 | 0.9750 | 1.0571 | 1.0571 | 0.0 |
| 0.3004 | 1.5642 | 280 | -3.3013 | -3.3296 | -19.4541 | -31.2717 | 0.3439 | 0.9750 | 1.0616 | 1.0616 | 0.0 |
| 0.3579 | 1.6760 | 300 | -3.3224 | -3.3619 | -19.2996 | -31.1175 | 0.3442 | 0.9750 | 1.0770 | 1.0770 | 0.0 |
| 0.3457 | 1.7877 | 320 | -3.3044 | -3.3413 | -19.3727 | -31.2252 | 0.3402 | 0.9750 | 1.0697 | 1.0697 | 0.0 |
| 0.3362 | 1.8994 | 340 | -3.3132 | -3.3506 | -19.2465 | -31.0188 | 0.3390 | 0.9750 | 1.0824 | 1.0824 | 0.0 |
| 0.2611 | 2.0112 | 360 | -3.3525 | -3.3943 | -19.0097 | -31.0481 | 0.3390 | 0.9750 | 1.1060 | 1.1060 | 0.0 |
| 0.2496 | 2.1229 | 380 | -3.3850 | -3.4511 | -19.0207 | -31.5875 | 0.3513 | 0.9500 | 1.1049 | 1.1049 | 0.0 |
| 0.2294 | 2.2346 | 400 | -3.4066 | -3.4721 | -18.7581 | -31.4791 | 0.3479 | 0.9750 | 1.1312 | 1.1312 | 0.0 |
| 0.2657 | 2.3464 | 420 | -3.4068 | -3.4797 | -18.9398 | -32.0904 | 0.3536 | 0.9500 | 1.1130 | 1.1130 | 0.0 |
| 0.2225 | 2.4581 | 440 | -3.4011 | -3.4714 | -18.8087 | -31.7098 | 0.3474 | 0.9750 | 1.1261 | 1.1261 | 0.0 |
| 0.2554 | 2.5698 | 460 | -3.4113 | -3.4856 | -18.9376 | -31.7534 | 0.3510 | 0.9500 | 1.1132 | 1.1132 | 0.0 |
| 0.2282 | 2.6816 | 480 | -3.4261 | -3.4980 | -18.8271 | -31.7000 | 0.3481 | 0.9500 | 1.1243 | 1.1243 | 0.0 |
| 0.2073 | 2.7933 | 500 | -3.4131 | -3.4875 | -18.7871 | -31.8831 | 0.3500 | 0.9500 | 1.1283 | 1.1283 | 0.0 |
| 0.2345 | 2.9050 | 520 | -3.4157 | -3.4942 | -18.8716 | -31.9919 | 0.3530 | 0.9500 | 1.1198 | 1.1198 | 0.0 |
| 0.1987 | 3.0168 | 540 | -3.4264 | -3.5064 | -18.6744 | -32.0745 | 0.3496 | 0.9250 | 1.1396 | 1.1396 | 0.0 |
| 0.1838 | 3.1285 | 560 | -3.4521 | -3.5512 | -19.5243 | -34.0481 | 0.3871 | 0.9000 | 1.0546 | 1.0546 | 0.0 |
| 0.1988 | 3.2402 | 580 | -3.4623 | -3.5623 | -19.6098 | -33.7099 | 0.3881 | 0.9000 | 1.0460 | 1.0460 | 0.0 |
| 0.1733 | 3.3520 | 600 | -3.4567 | -3.5549 | -19.3341 | -33.2678 | 0.3793 | 0.9000 | 1.0736 | 1.0736 | 0.0 |
| 0.1543 | 3.4637 | 620 | -3.4552 | -3.5590 | -19.7700 | -34.0098 | 0.3932 | 0.9250 | 1.0300 | 1.0300 | 0.0 |
| 0.17 | 3.5754 | 640 | -3.4528 | -3.5567 | -19.9679 | -34.2455 | 0.4013 | 0.9000 | 1.0102 | 1.0102 | 0.0 |
| 0.1905 | 3.6872 | 660 | -3.4496 | -3.5554 | -20.0944 | -34.5604 | 0.3994 | 0.9000 | 0.9976 | 0.9976 | 0.0 |
| 0.2188 | 3.7989 | 680 | -3.4679 | -3.5720 | -19.9418 | -34.7166 | 0.3983 | 0.9000 | 1.0128 | 1.0128 | 0.0 |
| 0.1544 | 3.9106 | 700 | -3.4570 | -3.5612 | -19.6301 | -34.3280 | 0.3955 | 0.9000 | 1.0440 | 1.0440 | 0.0 |
| 0.1485 | 4.0223 | 720 | -3.4488 | -3.5566 | -19.6168 | -34.5071 | 0.3981 | 0.9000 | 1.0453 | 1.0453 | 0.0 |
| 0.1312 | 4.1341 | 740 | -3.4462 | -3.5609 | -20.5134 | -37.0993 | 0.4548 | 0.9000 | 0.9557 | 0.9557 | 0.0 |
| 0.137 | 4.2458 | 760 | -3.4631 | -3.5800 | -20.7353 | -37.0974 | 0.4546 | 0.9000 | 0.9335 | 0.9335 | 0.0 |
| 0.1264 | 4.3575 | 780 | -3.4360 | -3.5525 | -20.9012 | -38.0580 | 0.4690 | 0.9000 | 0.9169 | 0.9169 | 0.0 |
| 0.1676 | 4.4693 | 800 | -3.4399 | -3.5572 | -20.5777 | -36.9247 | 0.4478 | 0.9000 | 0.9492 | 0.9492 | 0.0 |
| 0.1669 | 4.5810 | 820 | -3.4320 | -3.5501 | -20.6078 | -36.8322 | 0.4565 | 0.9000 | 0.9462 | 0.9462 | 0.0 |
| 0.2087 | 4.6927 | 840 | -3.4383 | -3.5573 | -20.8896 | -37.6431 | 0.4653 | 0.9000 | 0.9180 | 0.9180 | 0.0 |
| 0.1917 | 4.8045 | 860 | -3.4337 | -3.5523 | -20.8375 | -37.4122 | 0.4579 | 0.9000 | 0.9233 | 0.9233 | 0.0 |
| 0.1446 | 4.9162 | 880 | -3.4445 | -3.5623 | -21.0812 | -37.0777 | 0.4579 | 0.9000 | 0.8989 | 0.8989 | 0.0 |
| 0.1087 | 5.0279 | 900 | -3.4355 | -3.5585 | -21.4054 | -38.2455 | 0.4944 | 0.875 | 0.8665 | 0.8665 | 0.0 |
| 0.1107 | 5.1397 | 920 | -3.4260 | -3.5516 | -21.8624 | -39.4764 | 0.5151 | 0.9000 | 0.8208 | 0.8208 | 0.0 |
| 0.1205 | 5.2514 | 940 | -3.4314 | -3.5559 | -21.9157 | -40.4403 | 0.5230 | 0.875 | 0.8154 | 0.8154 | 0.0 |
| 0.1228 | 5.3631 | 960 | -3.4247 | -3.5500 | -21.7546 | -40.5583 | 0.5338 | 0.875 | 0.8315 | 0.8315 | 0.0 |
| 0.1248 | 5.4749 | 980 | -3.4306 | -3.5546 | -22.0767 | -40.2288 | 0.5253 | 0.875 | 0.7993 | 0.7993 | 0.0 |
| 0.1221 | 5.5866 | 1000 | -3.4348 | -3.5552 | -21.7049 | -39.9160 | 0.5152 | 0.9000 | 0.8365 | 0.8365 | 0.0 |
| 0.113 | 5.6983 | 1020 | -3.4238 | -3.5488 | -21.9340 | -40.4565 | 0.5297 | 0.875 | 0.8136 | 0.8136 | 0.0 |
| 0.1521 | 5.8101 | 1040 | -3.4173 | -3.5440 | -21.9173 | -40.1878 | 0.5310 | 0.875 | 0.8153 | 0.8153 | 0.0 |
| 0.1454 | 5.9218 | 1060 | -3.4203 | -3.5454 | -21.6730 | -40.2415 | 0.5311 | 0.9000 | 0.8397 | 0.8397 | 0.0 |
| 0.0924 | 6.0335 | 1080 | -3.4155 | -3.5412 | -22.3914 | -40.8432 | 0.5556 | 0.9000 | 0.7679 | 0.7679 | 0.0 |
| 0.0872 | 6.1453 | 1100 | -3.4080 | -3.5353 | -23.4502 | -42.1279 | 0.5976 | 0.875 | 0.6620 | 0.6620 | 0.0 |
| 0.1171 | 6.2570 | 1120 | -3.4042 | -3.5301 | -23.1504 | -42.3703 | 0.5868 | 0.875 | 0.6920 | 0.6920 | 0.0 |
| 0.1352 | 6.3687 | 1140 | -3.3884 | -3.5140 | -23.5642 | -42.2649 | 0.5905 | 0.8500 | 0.6506 | 0.6506 | 0.0 |
| 0.1121 | 6.4804 | 1160 | -3.3825 | -3.5106 | -22.8764 | -42.2739 | 0.5852 | 0.8500 | 0.7194 | 0.7194 | 0.0 |
| 0.1095 | 6.5922 | 1180 | -3.3981 | -3.5221 | -23.0010 | -42.5219 | 0.5904 | 0.875 | 0.7069 | 0.7069 | 0.0 |
| 0.1029 | 6.7039 | 1200 | -3.3989 | -3.5260 | -23.4520 | -43.0338 | 0.6176 | 0.8500 | 0.6618 | 0.6618 | 0.0 |
| 0.0999 | 6.8156 | 1220 | -3.4005 | -3.5274 | -23.5220 | -42.7762 | 0.6177 | 0.8500 | 0.6548 | 0.6548 | 0.0 |
| 0.1516 | 6.9274 | 1240 | -3.4047 | -3.5306 | -23.3477 | -43.3137 | 0.6201 | 0.875 | 0.6722 | 0.6722 | 0.0 |
| 0.1004 | 7.0391 | 1260 | -3.3980 | -3.5243 | -23.2792 | -42.6266 | 0.6063 | 0.8500 | 0.6791 | 0.6791 | 0.0 |
| 0.1213 | 7.1508 | 1280 | -3.3665 | -3.4955 | -25.0291 | -45.2048 | 0.6887 | 0.75 | 0.5041 | 0.5041 | 0.0 |
| 0.0933 | 7.2626 | 1300 | -3.3802 | -3.5086 | -24.5921 | -44.9486 | 0.6780 | 0.8250 | 0.5478 | 0.5478 | 0.0 |
| 0.0938 | 7.3743 | 1320 | -3.3620 | -3.4917 | -24.9811 | -44.8765 | 0.6838 | 0.7750 | 0.5089 | 0.5089 | 0.0 |
| 0.142 | 7.4860 | 1340 | -3.3718 | -3.5008 | -24.9029 | -45.3435 | 0.6902 | 0.7750 | 0.5167 | 0.5167 | 0.0 |
| 0.127 | 7.5978 | 1360 | -3.3774 | -3.5059 | -24.6134 | -45.3868 | 0.6831 | 0.7750 | 0.5457 | 0.5457 | 0.0 |
| 0.0918 | 7.7095 | 1380 | -3.3833 | -3.5088 | -24.9553 | -45.2748 | 0.6842 | 0.7750 | 0.5115 | 0.5115 | 0.0 |
| 0.1178 | 7.8212 | 1400 | -3.3662 | -3.4939 | -25.5868 | -46.2983 | 0.7027 | 0.7250 | 0.4483 | 0.4483 | 0.0 |
| 0.1169 | 7.9330 | 1420 | -3.3615 | -3.4883 | -25.5333 | -45.5611 | 0.6962 | 0.7750 | 0.4537 | 0.4537 | 0.0 |
| 0.1522 | 8.0447 | 1440 | -3.3602 | -3.4881 | -25.4634 | -45.9803 | 0.7025 | 0.7250 | 0.4607 | 0.4607 | 0.0 |
| 0.1275 | 8.1564 | 1460 | -3.3589 | -3.4881 | -26.1552 | -48.0564 | 0.7525 | 0.7250 | 0.3915 | 0.3915 | 0.0 |
| 0.0848 | 8.2682 | 1480 | -3.3670 | -3.4939 | -25.9976 | -47.4926 | 0.7414 | 0.7000 | 0.4072 | 0.4072 | 0.0 |
| 0.1039 | 8.3799 | 1500 | -3.3645 | -3.4922 | -26.2727 | -48.1866 | 0.7770 | 0.7250 | 0.3797 | 0.3797 | 0.0 |
| 0.1203 | 8.4916 | 1520 | -3.3499 | -3.4808 | -26.0705 | -47.7090 | 0.7492 | 0.7250 | 0.4000 | 0.4000 | 0.0 |
| 0.0618 | 8.6034 | 1540 | -3.3557 | -3.4830 | -26.3612 | -48.1454 | 0.7531 | 0.75 | 0.3709 | 0.3709 | 0.0 |
| 0.0899 | 8.7151 | 1560 | -3.3437 | -3.4721 | -26.0612 | -47.8259 | 0.7493 | 0.7250 | 0.4009 | 0.4009 | 0.0 |
| 0.1591 | 8.8268 | 1580 | -3.3498 | -3.4778 | -26.6553 | -48.5006 | 0.7682 | 0.6750 | 0.3415 | 0.3415 | 0.0 |
| 0.0961 | 8.9385 | 1600 | -3.3444 | -3.4703 | -27.0557 | -48.8106 | 0.7781 | 0.6750 | 0.3014 | 0.3014 | 0.0 |
| 0.0892 | 9.0503 | 1620 | -3.3324 | -3.4578 | -26.8131 | -48.4665 | 0.7767 | 0.6500 | 0.3257 | 0.3257 | 0.0 |
| 0.1226 | 9.1620 | 1640 | -3.3216 | -3.4506 | -27.6348 | -50.1392 | 0.8401 | 0.6500 | 0.2435 | 0.2435 | 0.0 |
| 0.1012 | 9.2737 | 1660 | -3.3205 | -3.4494 | -27.0937 | -50.0571 | 0.8226 | 0.6750 | 0.2976 | 0.2976 | 0.0 |
| 0.0825 | 9.3855 | 1680 | -3.3115 | -3.4402 | -27.6659 | -50.3234 | 0.8500 | 0.6500 | 0.2404 | 0.2404 | 0.0 |
| 0.1345 | 9.4972 | 1700 | -3.3097 | -3.4362 | -27.4484 | -50.4653 | 0.8455 | 0.7000 | 0.2622 | 0.2622 | 0.0 |
| 0.1178 | 9.6089 | 1720 | -3.3052 | -3.4332 | -27.4479 | -50.4414 | 0.8560 | 0.7000 | 0.2622 | 0.2622 | 0.0 |
| 0.138 | 9.7207 | 1740 | -3.3118 | -3.4380 | -27.7125 | -50.2650 | 0.8467 | 0.625 | 0.2358 | 0.2358 | 0.0 |
| 0.1465 | 9.8324 | 1760 | -3.3077 | -3.4372 | -27.4658 | -49.9790 | 0.8316 | 0.6750 | 0.2604 | 0.2604 | 0.0 |
| 0.0996 | 9.9441 | 1780 | -3.3067 | -3.4345 | -27.8794 | -50.4825 | 0.8560 | 0.625 | 0.2191 | 0.2191 | 0.0 |
| 0.1189 | 10.0559 | 1800 | -3.3062 | -3.4330 | -27.8648 | -50.3372 | 0.8531 | 0.625 | 0.2205 | 0.2205 | 0.0 |
| 0.1078 | 10.1676 | 1820 | -3.3068 | -3.4342 | -28.0472 | -50.6231 | 0.8598 | 0.6750 | 0.2023 | 0.2023 | 0.0 |
| 0.0897 | 10.2793 | 1840 | -3.2946 | -3.4231 | -28.0809 | -51.4095 | 0.8764 | 0.625 | 0.1989 | 0.1989 | 0.0 |
| 0.0831 | 10.3911 | 1860 | -3.2957 | -3.4226 | -28.6957 | -52.6367 | 0.9198 | 0.6000 | 0.1374 | 0.1374 | 0.0 |
| 0.0949 | 10.5028 | 1880 | -3.3039 | -3.4304 | -28.8144 | -53.0358 | 0.9273 | 0.5750 | 0.1256 | 0.1256 | 0.0 |
| 0.1054 | 10.6145 | 1900 | -3.3017 | -3.4274 | -28.8699 | -52.3021 | 0.9307 | 0.5750 | 0.1200 | 0.1200 | 0.0 |
| 0.0955 | 10.7263 | 1920 | -3.2988 | -3.4243 | -28.5700 | -52.1654 | 0.9160 | 0.6000 | 0.1500 | 0.1500 | 0.0 |
| 0.1188 | 10.8380 | 1940 | -3.2948 | -3.4213 | -28.7836 | -52.6606 | 0.9211 | 0.5750 | 0.1286 | 0.1286 | 0.0 |
| 0.1093 | 10.9497 | 1960 | -3.2943 | -3.4220 | -28.8628 | -52.4896 | 0.9208 | 0.6000 | 0.1207 | 0.1207 | 0.0 |
| 0.133 | 11.0615 | 1980 | -3.2865 | -3.4145 | -29.1527 | -53.5414 | 0.9443 | 0.6000 | 0.0917 | 0.0917 | 0.0 |
| 0.1008 | 11.1732 | 2000 | -3.2907 | -3.4160 | -29.3006 | -53.6194 | 0.9536 | 0.6000 | 0.0769 | 0.0769 | 0.0 |
| 0.0795 | 11.2849 | 2020 | -3.2707 | -3.3993 | -29.3426 | -53.7331 | 0.9601 | 0.5500 | 0.0727 | 0.0727 | 0.0 |
| 0.1097 | 11.3966 | 2040 | -3.2880 | -3.4116 | -29.6362 | -53.5940 | 0.9723 | 0.6000 | 0.0434 | 0.0434 | 0.0 |
| 0.1379 | 11.5084 | 2060 | -3.2820 | -3.4072 | -29.6360 | -53.6930 | 0.9694 | 0.6000 | 0.0434 | 0.0434 | 0.0 |
| 0.0688 | 11.6201 | 2080 | -3.2765 | -3.4013 | -29.5118 | -54.0503 | 0.9727 | 0.5750 | 0.0558 | 0.0558 | 0.0 |
| 0.1238 | 11.7318 | 2100 | -3.2738 | -3.3992 | -29.3518 | -53.9649 | 0.9643 | 0.5750 | 0.0718 | 0.0718 | 0.0 |
| 0.102 | 11.8436 | 2120 | -3.2728 | -3.3986 | -29.6406 | -54.0280 | 0.9608 | 0.5750 | 0.0429 | 0.0429 | 0.0 |
| 0.0781 | 11.9553 | 2140 | -3.2767 | -3.4007 | -29.5877 | -54.2530 | 0.9735 | 0.5750 | 0.0482 | 0.0482 | 0.0 |
| 0.0932 | 12.0670 | 2160 | -3.2663 | -3.3913 | -29.9586 | -54.9684 | 0.9989 | 0.6000 | 0.0111 | 0.0111 | 0.0 |
| 0.0887 | 12.1788 | 2180 | -3.2603 | -3.3863 | -30.0693 | -55.1009 | 1.0059 | 0.5500 | 0.0001 | 0.0001 | 0.0 |
| 0.1067 | 12.2905 | 2200 | -3.2736 | -3.3964 | -30.0728 | -54.9080 | 1.0082 | 0.5500 | -0.0003 | -0.0003 | 0.0 |
| 0.0828 | 12.4022 | 2220 | -3.2638 | -3.3891 | -29.9830 | -54.6157 | 1.0083 | 0.5750 | 0.0087 | 0.0087 | 0.0 |
| 0.1253 | 12.5140 | 2240 | -3.2658 | -3.3908 | -30.0711 | -55.2232 | 1.0133 | 0.5750 | -0.0001 | -0.0001 | 0.0 |
| 0.106 | 12.6257 | 2260 | -3.2566 | -3.3827 | -30.2506 | -55.4329 | 1.0206 | 0.5750 | -0.0181 | -0.0181 | 0.0 |
| 0.0846 | 12.7374 | 2280 | -3.2632 | -3.3876 | -30.6053 | -55.6918 | 1.0288 | 0.5750 | -0.0535 | -0.0535 | 0.0 |
| 0.1123 | 12.8492 | 2300 | -3.2658 | -3.3896 | -30.2257 | -55.2281 | 1.0185 | 0.5500 | -0.0156 | -0.0156 | 0.0 |
| 0.1002 | 12.9609 | 2320 | -3.2582 | -3.3840 | -30.5014 | -55.0736 | 1.0290 | 0.5250 | -0.0431 | -0.0431 | 0.0 |
| 0.109 | 13.0726 | 2340 | -3.2652 | -3.3902 | -30.2310 | -54.9712 | 1.0176 | 0.5750 | -0.0161 | -0.0161 | 0.0 |
| 0.0913 | 13.1844 | 2360 | -3.2562 | -3.3817 | -30.5128 | -55.4944 | 1.0323 | 0.5500 | -0.0443 | -0.0443 | 0.0 |
| 0.1148 | 13.2961 | 2380 | -3.2598 | -3.3840 | -30.4925 | -55.8914 | 1.0291 | 0.5500 | -0.0422 | -0.0422 | 0.0 |
| 0.1118 | 13.4078 | 2400 | -3.2585 | -3.3820 | -30.6324 | -56.0202 | 1.0348 | 0.5750 | -0.0562 | -0.0562 | 0.0 |
| 0.0812 | 13.5196 | 2420 | -3.2651 | -3.3868 | -30.6670 | -55.8481 | 1.0371 | 0.5750 | -0.0597 | -0.0597 | 0.0 |
| 0.1006 | 13.6313 | 2440 | -3.2630 | -3.3861 | -30.3289 | -55.7380 | 1.0399 | 0.5250 | -0.0259 | -0.0259 | 0.0 |
| 0.076 | 13.7430 | 2460 | -3.2569 | -3.3818 | -30.7570 | -55.9962 | 1.0451 | 0.5250 | -0.0687 | -0.0687 | 0.0 |
| 0.0781 | 13.8547 | 2480 | -3.2592 | -3.3826 | -30.6637 | -55.9950 | 1.0458 | 0.5500 | -0.0594 | -0.0594 | 0.0 |
| 0.0892 | 13.9665 | 2500 | -3.2666 | -3.3896 | -30.5092 | -55.8917 | 1.0433 | 0.5500 | -0.0439 | -0.0439 | 0.0 |
| 0.1012 | 14.0782 | 2520 | -3.2542 | -3.3787 | -30.5755 | -56.2111 | 1.0447 | 0.5500 | -0.0505 | -0.0505 | 0.0 |
| 0.1257 | 14.1899 | 2540 | -3.2573 | -3.3817 | -30.8714 | -56.0508 | 1.0483 | 0.5250 | -0.0801 | -0.0801 | 0.0 |
| 0.1197 | 14.3017 | 2560 | -3.2555 | -3.3789 | -30.9074 | -56.2701 | 1.0567 | 0.5750 | -0.0837 | -0.0837 | 0.0 |
| 0.1024 | 14.4134 | 2580 | -3.2508 | -3.3756 | -30.6748 | -56.0792 | 1.0568 | 0.5500 | -0.0605 | -0.0605 | 0.0 |
| 0.0841 | 14.5251 | 2600 | -3.2517 | -3.3768 | -30.8211 | -56.1266 | 1.0530 | 0.5750 | -0.0751 | -0.0751 | 0.0 |
| 0.1166 | 14.6369 | 2620 | -3.2521 | -3.3776 | -30.7581 | -56.2008 | 1.0520 | 0.5250 | -0.0688 | -0.0688 | 0.0 |
| 0.0786 | 14.7486 | 2640 | -3.2588 | -3.3815 | -30.6962 | -56.2018 | 1.0571 | 0.5500 | -0.0626 | -0.0626 | 0.0 |
| 0.1008 | 14.8603 | 2660 | -3.2547 | -3.3795 | -30.6797 | -55.7110 | 1.0555 | 0.5250 | -0.0610 | -0.0610 | 0.0 |
| 0.1146 | 14.9721 | 2680 | -3.2503 | -3.3754 | -30.6332 | -55.9605 | 1.0556 | 0.5500 | -0.0563 | -0.0563 | 0.0 |
| 0.0965 | 15.0838 | 2700 | -3.2535 | -3.3774 | -30.8946 | -56.2887 | 1.0547 | 0.5750 | -0.0825 | -0.0825 | 0.0 |
| 0.0833 | 15.1955 | 2720 | -3.2535 | -3.3774 | -30.6622 | -55.8942 | 1.0543 | 0.5750 | -0.0592 | -0.0592 | 0.0 |
| 0.1128 | 15.3073 | 2740 | -3.2487 | -3.3741 | -30.6323 | -55.9934 | 1.0510 | 0.5500 | -0.0562 | -0.0562 | 0.0 |
| 0.1008 | 15.4190 | 2760 | -3.2560 | -3.3804 | -30.8279 | -56.2560 | 1.0566 | 0.5250 | -0.0758 | -0.0758 | 0.0 |
| 0.1223 | 15.5307 | 2780 | -3.2573 | -3.3805 | -30.8725 | -56.2568 | 1.0601 | 0.5250 | -0.0802 | -0.0802 | 0.0 |
| 0.1231 | 15.6425 | 2800 | -3.2499 | -3.3753 | -30.7777 | -56.1515 | 1.0558 | 0.5500 | -0.0708 | -0.0708 | 0.0 |
| 0.1085 | 15.7542 | 2820 | -3.2575 | -3.3811 | -30.6396 | -56.0537 | 1.0494 | 0.5500 | -0.0570 | -0.0570 | 0.0 |
| 0.0907 | 15.8659 | 2840 | -3.2582 | -3.3814 | -30.8428 | -56.6040 | 1.0577 | 0.5250 | -0.0773 | -0.0773 | 0.0 |
| 0.1089 | 15.9777 | 2860 | -3.2576 | -3.3810 | -30.7676 | -56.1844 | 1.0558 | 0.5500 | -0.0698 | -0.0698 | 0.0 |
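In the table above, validation loss reaches its minimum of 0.3390 at steps 340 and 360 and then rises to about 1.06 by step 2860, while training loss keeps falling, a pattern that usually indicates overfitting. The sketch below shows one way to pick the step with the lowest validation loss from a handful of rows copied from the table. The `rows` list and its tuple layout are illustrative choices, and whether intermediate checkpoints were actually saved is not stated in this card.

```python
# (step, validation_loss, rewards_accuracy) for a few rows of the table above.
rows = [
    (20, 0.6928, 0.5),
    (180, 0.3615, 0.9750),
    (340, 0.3390, 0.9750),
    (360, 0.3390, 0.9750),
    (540, 0.3496, 0.9250),
    (1080, 0.5556, 0.9000),
    (1800, 0.8531, 0.625),
    (2860, 1.0558, 0.5500),
]

# min() returns the first row with the smallest validation loss, so the
# earlier of two tied steps is selected.
best_step, best_loss, best_acc = min(rows, key=lambda row: row[1])
final_step, final_loss, final_acc = rows[-1]

print(f"lowest validation loss: {best_loss:.4f} at step {best_step}")
print(f"final validation loss:  {final_loss:.4f} at step {final_step}")
print(f"increase from best to final: {final_loss - best_loss:.4f}")
```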
Framework versions
- Transformers 4.45.2
- Pytorch 2.5.1+cu121
- Datasets 3.5.0
- Tokenizers 0.20.3