rudyon committed on
Commit 0053036 · verified · 1 Parent(s): 0493f64

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -50,7 +50,7 @@ The model is a 12-layer causal transformer with the following architecture:
 
 ## training
 
-- **Datasets**: HuggingFaceFW/fineweb-edu (~700k docs) + mlfoundations/dclm-baseline-1.0 (~250k docs)
+- **Datasets**: HuggingFaceFW/fineweb-edu (\~700k docs) + mlfoundations/dclm-baseline-1.0 (\~250k docs)
 - **Tokenizer**: Custom ByteLevelBPE (vocab size: 32768)
 - **Batch size**: 524,288 tokens
 - **Sequence length**: 1024