can you update and add the new models

#2
by shadi123 - opened

Many new models have been released, such as GLM 5, Kimi 2.5, and Claude 4.6.
Can you add all the new models?

SILMA AI - Arabic Language Models org

Thanks for the recommendation, Shadi. We will benchmark them soon isa.

shadi123 changed discussion status to closed
SILMA AI - Arabic Language Models org

@shadi123 done. Surprising results 🙂

really interesting results 😆
gpt-4.1 still 🤯 but GLM 🤔

SILMA AI - Arabic Language Models org

GPT-4.1 ranks high only because it is the judge.

Can I ask why 4.1 is the judge?
What would the results be if it were removed and replaced with a newer model?
I ask especially because OpenAI announced the retirement of multiple models, including 4.1:
https://openai.com/index/retiring-gpt-4o-and-older-models/

SILMA AI - Arabic Language Models org

Why is 4.1 the judge?

  • It was the best at that time

What will the results be if it is removed?

  • The comparison would not be fair to the new models, since the judge that scored the old models would be different from the judge that scored the new ones

If 4.1 is acting as the judge, shouldn’t it be excluded from the tier list to avoid bias?
Also, I think it should be evaluated by a separate judge.

SILMA AI - Arabic Language Models org

Yes, since GPT 4.1 is no longer relevant and it is also the judge, we will consider removing it from the list.
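
For illustration, dropping the judge from a ranking could look something like the sketch below. This is only a rough sketch, not the benchmark's actual code: the function name `rank_without_judge`, the model names, and the scores are all hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def rank_without_judge(scores: Dict[str, float], judge: str) -> List[Tuple[str, float]]:
    """Return (model, score) pairs sorted from highest to lowest score,
    leaving out the model that acted as the judge."""
    contenders = {model: score for model, score in scores.items() if model != judge}
    return sorted(contenders.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder names and scores, not real benchmark results.
scores = {"judge-model": 9.1, "model-a": 8.7, "model-b": 8.9}
leaderboard = rank_without_judge(scores, judge="judge-model")

for position, (model, score) in enumerate(leaderboard, start=1):
    print(position, model, score)
```

The judge still scores every other model the same way, so the remaining entries stay comparable with each other; only the judge's own row is left out of the list.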
