Methods on RL refinement training
Great job on the improvement. I am curious if there is a preprint on arXiv or a post about your methods on the RL refinement process. How did you do the RL training and what was the inspiration that led you to refining in this way?
Thanks, I am a beginner and trying to make things work, but with baby steps. I will try to answer your question tonight; I am really very new to this.
I appreciate it! Same here, I am new to HF as a whole framework and platform so I'm trying to learn from you.
lol bad idea, I had good ideas at the beginning of January but then a sudden block. I deleted a script this morning where my API was public; Rabbitbot helped me fix the script. It was on GitHub. But I was busy with too many things at the same time, overload. I was at a point where I did things that I didn't understand, but it worked haha. I'll keep you posted.