Post 4163: I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models ran out and I can't pre-train them anymore.
Post 611: Training for SnowflakeCore-G1-1B and 7B will resume, because I have now implemented DeepSpeed and managed to use two GPUs.
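For context, here is a minimal sketch of what a two-GPU DeepSpeed pre-training loop can look like. The config values, MyMoEModel, and dataloader below are hypothetical placeholders, not the actual SnowflakeCore-G1 training code, which has not been published in the post.

    import deepspeed

    # Hypothetical DeepSpeed config; the real hyperparameters for
    # SnowflakeCore-G1 are an assumption here.
    ds_config = {
        "train_micro_batch_size_per_gpu": 8,
        "gradient_accumulation_steps": 4,
        "fp16": {"enabled": True},
        # ZeRO stage 2 shards optimizer state across the two GPUs
        "zero_optimization": {"stage": 2},
        "optimizer": {"type": "AdamW", "params": {"lr": 3e-4}},
    }

    # Hypothetical placeholder model whose forward pass returns the loss
    model = MyMoEModel()

    # deepspeed.initialize wraps the model, builds the optimizer from the
    # config, and sets up the distributed engine across available GPUs
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )

    for batch in dataloader:  # hypothetical dataloader of tokenized batches
        loss = engine(batch)
        engine.backward(loss)  # handles fp16 loss scaling internally
        engine.step()          # applies gradient accumulation and updates

A script like this would be launched on both GPUs with the standard DeepSpeed launcher: deepspeed --num_gpus=2 train.py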
Chess: a collection dedicated to my pre-trained LMs for chess. FlameF0X/ChessSLM (Text Generation, 30.3M params)
LFM2-350M-Pro: FlameF0X/LFM2-350M-Pro (Text Generation, 0.4B params); GGUF version at mradermacher/LFM2-350M-Pro-GGUF (0.4B params)