[ICLR'24 Spotlight] Tool-Augmented Reward Modeling
ernie-research
community
AI & ML interests: Large Language Models
Models: 12
ernie-research/Themis-7b • 16 • 4
ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10 • 9B • 51
ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10 • 3B • 11
ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5 • 3B • 18
ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5 • 9B • 14
ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5 • 9B • 16
ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5 • 3B • 16 • 1
ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5 • 27B • 19
ernie-research/ernie-code-560m • 22 • 10
ernie-research/MonoGPT • Text Generation • 0.4B • 22 • 2