This model is the final checkpoint from the exp2199a2_redo2 experiment, which fine-tunes the long-context-extended Marin-8B model on the OpenThoughts3 dataset (open-thoughts/OpenThoughts3-1.2M).
| Parameter | Value |
|---|---|
| Epochs | 5 |
| Batch Size | 512 |
| Learning Rate | 8e-5 |
| Max Sequence Length | 16384 |
| LR Schedule | Cosine |
| Warmup | 10% |
| Decay | 0.9 |
| Weight Decay | 0.0 |
| Beta1 | 0.9 |
| Beta2 | 0.999 |
| Hardware | TPU v4-512 |
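
To make the table concrete, here is a minimal sketch of what these settings imply for step counts and the learning-rate curve. It is not the experiment's training code. It assumes roughly 1.2M training examples (inferred from the dataset name), linear warmup, and a cosine decay to zero. The `Decay` value of 0.9 is not modeled because its exact meaning is not stated here, and the tokens-per-step figure is only an upper bound since actual sequence lengths are not given. The names `total_steps` and `lr_at` are hypothetical helpers.

```python
import math

# Values taken from the hyperparameter table.
EPOCHS = 5
BATCH_SIZE = 512
PEAK_LR = 8e-5
MAX_SEQ_LEN = 16384
WARMUP_FRACTION = 0.10

# Assumption: about 1.2M examples, inferred from the dataset name.
NUM_EXAMPLES = 1_200_000
# Assumption: the cosine schedule decays to zero (the final LR is not stated).
MIN_LR = 0.0

# Upper bound on tokens per optimizer step (every sequence at max length).
MAX_TOKENS_PER_STEP = BATCH_SIZE * MAX_SEQ_LEN


def total_steps(num_examples=NUM_EXAMPLES, epochs=EPOCHS, batch_size=BATCH_SIZE):
    """Number of optimizer steps needed to cover all epochs."""
    return math.ceil(num_examples * epochs / batch_size)


def lr_at(step, total, peak_lr=PEAK_LR, warmup_fraction=WARMUP_FRACTION, min_lr=MIN_LR):
    """Linear warmup to peak_lr, then cosine decay to min_lr."""
    warmup = int(total * warmup_fraction)
    if step < warmup:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    steps = total_steps()
    print("total steps:", steps)
    print("warmup steps:", int(steps * WARMUP_FRACTION))
    print("max tokens per step:", MAX_TOKENS_PER_STEP)
    for s in (0, steps // 10, steps // 2, steps - 1):
        print(f"step {s}: lr = {lr_at(s, steps):.3e}")
```

Under these assumptions the run comes to about 11.7k optimizer steps, with roughly the first 1.2k spent in warmup. A different dataset size or a nonzero final learning rate would shift these numbers.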