Commit 6efaeab by Antreas (verified) · 1 Parent(s): 7ac6838

Initial upload: ogma-micro embedding model

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +927 -0
  2. config.json +36 -0
  3. config.py +161 -0
  4. config.yaml +18 -0
  5. embeddings.py +143 -0
  6. model.pt +3 -0
  7. model.safetensors +3 -0
  8. ogma_model.py +203 -0
  9. pooling.py +99 -0
  10. results/AmazonCounterfactualClassification.json +268 -0
  11. results/AmazonPolarityClassification.json +140 -0
  12. results/AmazonReviewsClassification.json +140 -0
  13. results/ArXivHierarchicalClusteringP2P.json +47 -0
  14. results/ArXivHierarchicalClusteringS2S.json +47 -0
  15. results/ArguAna.json +167 -0
  16. results/AskUbuntuDupQuestions.json +167 -0
  17. results/BIOSSES.json +27 -0
  18. results/Banking77Classification.json +140 -0
  19. results/BiorxivClusteringP2P.json +33 -0
  20. results/BiorxivClusteringS2S.json +33 -0
  21. results/CQADupstackAndroidRetrieval.json +167 -0
  22. results/CQADupstackEnglishRetrieval.json +167 -0
  23. results/CQADupstackGamingRetrieval.json +167 -0
  24. results/CQADupstackGisRetrieval.json +167 -0
  25. results/CQADupstackMathematicaRetrieval.json +167 -0
  26. results/CQADupstackPhysicsRetrieval.json +167 -0
  27. results/CQADupstackProgrammersRetrieval.json +167 -0
  28. results/CQADupstackRetrieval.json +20 -0
  29. results/CQADupstackStatsRetrieval.json +167 -0
  30. results/CQADupstackTexRetrieval.json +167 -0
  31. results/CQADupstackUnixRetrieval.json +167 -0
  32. results/CQADupstackWebmastersRetrieval.json +167 -0
  33. results/CQADupstackWordpressRetrieval.json +167 -0
  34. results/ClimateFEVER.json +167 -0
  35. results/DBPedia.json +167 -0
  36. results/EmotionClassification.json +140 -0
  37. results/FEVER.json +167 -0
  38. results/FiQA2018.json +167 -0
  39. results/HotpotQA.json +167 -0
  40. results/ImdbClassification.json +140 -0
  41. results/MSMARCO.json +167 -0
  42. results/MTOPDomainClassification.json +140 -0
  43. results/MTOPIntentClassification.json +140 -0
  44. results/MassiveIntentClassification.json +140 -0
  45. results/MassiveScenarioClassification.json +140 -0
  46. results/MedrxivClusteringP2P.json +33 -0
  47. results/MedrxivClusteringS2S.json +33 -0
  48. results/MindSmallReranking.json +252 -0
  49. results/NFCorpus.json +167 -0
  50. results/NQ.json +167 -0
README.md ADDED
@@ -0,0 +1,927 @@
1
+ ---
2
+ language:
3
+ - en
4
+ license: mit
5
+ tags:
6
+ - mteb
7
+ - sentence-transformers
8
+ - embedding
9
+ - text-embedding
10
+ - ogma
11
+ - axiotic
12
+ - matryoshka
13
+ - small-model
14
+ model-index:
15
+ - name: ogma-micro
16
+ results:
17
+ - task:
18
+ type: Classification
19
+ dataset:
20
+ type: mteb/AmazonCounterfactualClassification
21
+ name: MTEB AmazonCounterfactualClassification
22
+ config: default
23
+ split: test
24
+ revision: 1f7e6a9d6fa6e64c53d146e428565640410c0df1
25
+ metrics:
26
+ - type: accuracy
27
+ value: 65.31
28
+ - task:
29
+ type: Classification
30
+ dataset:
31
+ type: mteb/AmazonPolarityClassification
32
+ name: MTEB AmazonPolarityClassification
33
+ config: default
34
+ split: test
35
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
36
+ metrics:
37
+ - type: accuracy
38
+ value: 67.63
39
+ - task:
40
+ type: Classification
41
+ dataset:
42
+ type: mteb/AmazonReviewsClassification
43
+ name: MTEB AmazonReviewsClassification
44
+ config: default
45
+ split: test
46
+ revision: 6b5d328eaae8ef408dd7d775040245cf86f92e9d
47
+ metrics:
48
+ - type: accuracy
49
+ value: 35.23
50
+ - task:
51
+ type: Clustering
52
+ dataset:
53
+ type: mteb/ArXivHierarchicalClusteringP2P
54
+ name: MTEB ArXivHierarchicalClusteringP2P
55
+ config: default
56
+ split: test
57
+ revision: 0bbdb47bcbe3a90093699aefeed338a0f28a7ee8
58
+ metrics:
59
+ - type: v_measure
60
+ value: 55.05
61
+ - task:
62
+ type: Clustering
63
+ dataset:
64
+ type: mteb/ArXivHierarchicalClusteringS2S
65
+ name: MTEB ArXivHierarchicalClusteringS2S
66
+ config: default
67
+ split: test
68
+ revision: b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3
69
+ metrics:
70
+ - type: v_measure
71
+ value: 50.36
72
+ - task:
73
+ type: Retrieval
74
+ dataset:
75
+ type: mteb/ArguAna
76
+ name: MTEB ArguAna
77
+ config: default
78
+ split: test
79
+ revision: c22ab2a51041ffd869aaddef7af8d8215647e41a
80
+ metrics:
81
+ - type: ndcg_at_10
82
+ value: 41.94
83
+ - task:
84
+ type: Reranking
85
+ dataset:
86
+ type: mteb/AskUbuntuDupQuestions
87
+ name: MTEB AskUbuntuDupQuestions
88
+ config: default
89
+ split: test
90
+ revision: c5691e3c48741d5f83b5cc8e630653d7a8cfc048
91
+ metrics:
92
+ - type: map
93
+ value: 55.94
94
+ - task:
95
+ type: STS
96
+ dataset:
97
+ type: mteb/BIOSSES
98
+ name: MTEB BIOSSES
99
+ config: default
100
+ split: test
101
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
102
+ metrics:
103
+ - type: cosine_spearman
104
+ value: 78.85
105
+ - task:
106
+ type: Classification
107
+ dataset:
108
+ type: mteb/Banking77Classification
109
+ name: MTEB Banking77Classification
110
+ config: default
111
+ split: test
112
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
113
+ metrics:
114
+ - type: accuracy
115
+ value: 70.03
116
+ - task:
117
+ type: Clustering
118
+ dataset:
119
+ type: mteb/BiorxivClusteringP2P
120
+ name: MTEB BiorxivClusteringP2P
121
+ config: default
122
+ split: test
123
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
124
+ metrics:
125
+ - type: v_measure
126
+ value: 31.05
127
+ - task:
128
+ type: Clustering
129
+ dataset:
130
+ type: mteb/BiorxivClusteringS2S
131
+ name: MTEB BiorxivClusteringS2S
132
+ config: default
133
+ split: test
134
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
135
+ metrics:
136
+ - type: v_measure
137
+ value: 20.2
138
+ - task:
139
+ type: Retrieval
140
+ dataset:
141
+ type: mteb/CQADupstackAndroidRetrieval
142
+ name: MTEB CQADupstackAndroidRetrieval
143
+ config: default
144
+ split: test
145
+ revision: 9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3
146
+ metrics:
147
+ - type: ndcg_at_10
148
+ value: 26.14
149
+ - task:
150
+ type: Retrieval
151
+ dataset:
152
+ type: mteb/CQADupstackEnglishRetrieval
153
+ name: MTEB CQADupstackEnglishRetrieval
154
+ config: default
155
+ split: test
156
+ revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
157
+ metrics:
158
+ - type: ndcg_at_10
159
+ value: 19.82
160
+ - task:
161
+ type: Retrieval
162
+ dataset:
163
+ type: mteb/CQADupstackGamingRetrieval
164
+ name: MTEB CQADupstackGamingRetrieval
165
+ config: default
166
+ split: test
167
+ revision: 4885aa143210c98657558c04aaf3dc47cfb54340
168
+ metrics:
169
+ - type: ndcg_at_10
170
+ value: 35.92
171
+ - task:
172
+ type: Retrieval
173
+ dataset:
174
+ type: mteb/CQADupstackGisRetrieval
175
+ name: MTEB CQADupstackGisRetrieval
176
+ config: default
177
+ split: test
178
+ revision: 5003b3064772da1887988e05400cf3806fe491f2
179
+ metrics:
180
+ - type: ndcg_at_10
181
+ value: 21.3
182
+ - task:
183
+ type: Retrieval
184
+ dataset:
185
+ type: mteb/CQADupstackMathematicaRetrieval
186
+ name: MTEB CQADupstackMathematicaRetrieval
187
+ config: default
188
+ split: test
189
+ revision: 90fceea13679c63fe563ded68f3b6f06e50061de
190
+ metrics:
191
+ - type: ndcg_at_10
192
+ value: 14.54
193
+ - task:
194
+ type: Retrieval
195
+ dataset:
196
+ type: mteb/CQADupstackPhysicsRetrieval
197
+ name: MTEB CQADupstackPhysicsRetrieval
198
+ config: default
199
+ split: test
200
+ revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
201
+ metrics:
202
+ - type: ndcg_at_10
203
+ value: 28.06
204
+ - task:
205
+ type: Retrieval
206
+ dataset:
207
+ type: mteb/CQADupstackProgrammersRetrieval
208
+ name: MTEB CQADupstackProgrammersRetrieval
209
+ config: default
210
+ split: test
211
+ revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
212
+ metrics:
213
+ - type: ndcg_at_10
214
+ value: 24.33
215
+ - task:
216
+ type: Retrieval
217
+ dataset:
218
+ type: mteb/CQADupstackRetrieval
219
+ name: MTEB CQADupstackRetrieval
220
+ config: default
221
+ split: test
222
+ revision: '1'
223
+ metrics:
224
+ - type: ndcg_at_10
225
+ value: 22.34
226
+ - task:
227
+ type: Retrieval
228
+ dataset:
229
+ type: mteb/CQADupstackStatsRetrieval
230
+ name: MTEB CQADupstackStatsRetrieval
231
+ config: default
232
+ split: test
233
+ revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
234
+ metrics:
235
+ - type: ndcg_at_10
236
+ value: 21.58
237
+ - task:
238
+ type: Retrieval
239
+ dataset:
240
+ type: mteb/CQADupstackTexRetrieval
241
+ name: MTEB CQADupstackTexRetrieval
242
+ config: default
243
+ split: test
244
+ revision: 46989137a86843e03a6195de44b09deda022eec7
245
+ metrics:
246
+ - type: ndcg_at_10
247
+ value: 15.04
248
+ - task:
249
+ type: Retrieval
250
+ dataset:
251
+ type: mteb/CQADupstackUnixRetrieval
252
+ name: MTEB CQADupstackUnixRetrieval
253
+ config: default
254
+ split: test
255
+ revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
256
+ metrics:
257
+ - type: ndcg_at_10
258
+ value: 20.12
259
+ - task:
260
+ type: Retrieval
261
+ dataset:
262
+ type: mteb/CQADupstackWebmastersRetrieval
263
+ name: MTEB CQADupstackWebmastersRetrieval
264
+ config: default
265
+ split: test
266
+ revision: 160c094312a0e1facb97e55eeddb698c0abe3571
267
+ metrics:
268
+ - type: ndcg_at_10
269
+ value: 23.43
270
+ - task:
271
+ type: Retrieval
272
+ dataset:
273
+ type: mteb/CQADupstackWordpressRetrieval
274
+ name: MTEB CQADupstackWordpressRetrieval
275
+ config: default
276
+ split: test
277
+ revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
278
+ metrics:
279
+ - type: ndcg_at_10
280
+ value: 17.79
281
+ - task:
282
+ type: Retrieval
283
+ dataset:
284
+ type: mteb/ClimateFEVER
285
+ name: MTEB ClimateFEVER
286
+ config: default
287
+ split: test
288
+ revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
289
+ metrics:
290
+ - type: ndcg_at_10
291
+ value: 20.6
292
+ - task:
293
+ type: Retrieval
294
+ dataset:
295
+ type: mteb/DBPedia
296
+ name: MTEB DBPedia
297
+ config: default
298
+ split: test
299
+ revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
300
+ metrics:
301
+ - type: ndcg_at_10
302
+ value: 27.27
303
+ - task:
304
+ type: Classification
305
+ dataset:
306
+ type: mteb/EmotionClassification
307
+ name: MTEB EmotionClassification
308
+ config: default
309
+ split: test
310
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
311
+ metrics:
312
+ - type: accuracy
313
+ value: 35.98
314
+ - task:
315
+ type: Retrieval
316
+ dataset:
317
+ type: mteb/FEVER
318
+ name: MTEB FEVER
319
+ config: default
320
+ split: test
321
+ revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
322
+ metrics:
323
+ - type: ndcg_at_10
324
+ value: 62.89
325
+ - task:
326
+ type: Retrieval
327
+ dataset:
328
+ type: mteb/FiQA2018
329
+ name: MTEB FiQA2018
330
+ config: default
331
+ split: test
332
+ revision: 27a168819829fe9bcd655c2df245fb19452e8e06
333
+ metrics:
334
+ - type: ndcg_at_10
335
+ value: 17.79
336
+ - task:
337
+ type: Retrieval
338
+ dataset:
339
+ type: mteb/HotpotQA
340
+ name: MTEB HotpotQA
341
+ config: default
342
+ split: test
343
+ revision: ab518f4d6fcca38d87c25209f94beba119d02014
344
+ metrics:
345
+ - type: ndcg_at_10
346
+ value: 38.75
347
+ - task:
348
+ type: Classification
349
+ dataset:
350
+ type: mteb/ImdbClassification
351
+ name: MTEB ImdbClassification
352
+ config: default
353
+ split: test
354
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
355
+ metrics:
356
+ - type: accuracy
357
+ value: 65.25
358
+ - task:
359
+ type: Retrieval
360
+ dataset:
361
+ type: mteb/MSMARCO
362
+ name: MTEB MSMARCO
363
+ config: default
364
+ split: test
365
+ revision: c5a29a104738b98a9e76336939199e264163d4a0
366
+ metrics:
367
+ - type: ndcg_at_10
368
+ value: 0
369
+ - task:
370
+ type: Classification
371
+ dataset:
372
+ type: mteb/MTOPDomainClassification
373
+ name: MTEB MTOPDomainClassification
374
+ config: default
375
+ split: test
376
+ revision: a76d16fae880597b9c73047b50159220a441cb54
377
+ metrics:
378
+ - type: accuracy
379
+ value: 83.45
380
+ - task:
381
+ type: Classification
382
+ dataset:
383
+ type: mteb/MTOPIntentClassification
384
+ name: MTEB MTOPIntentClassification
385
+ config: default
386
+ split: test
387
+ revision: 2992d820f31312593c49a4890430aadadb0f0039
388
+ metrics:
389
+ - type: accuracy
390
+ value: 51.72
391
+ - task:
392
+ type: Classification
393
+ dataset:
394
+ type: mteb/MassiveIntentClassification
395
+ name: MTEB MassiveIntentClassification
396
+ config: default
397
+ split: test
398
+ revision: 4672e20407010da34463acc759c162ca9734bca6
399
+ metrics:
400
+ - type: accuracy
401
+ value: 58.75
402
+ - task:
403
+ type: Classification
404
+ dataset:
405
+ type: mteb/MassiveScenarioClassification
406
+ name: MTEB MassiveScenarioClassification
407
+ config: default
408
+ split: test
409
+ revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
410
+ metrics:
411
+ - type: accuracy
412
+ value: 66.84
413
+ - task:
414
+ type: Clustering
415
+ dataset:
416
+ type: mteb/MedrxivClusteringP2P
417
+ name: MTEB MedrxivClusteringP2P
418
+ config: default
419
+ split: test
420
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
421
+ metrics:
422
+ - type: v_measure
423
+ value: 30.43
424
+ - task:
425
+ type: Clustering
426
+ dataset:
427
+ type: mteb/MedrxivClusteringS2S
428
+ name: MTEB MedrxivClusteringS2S
429
+ config: default
430
+ split: test
431
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
432
+ metrics:
433
+ - type: v_measure
434
+ value: 25.15
435
+ - task:
436
+ type: Reranking
437
+ dataset:
438
+ type: mteb/MindSmallReranking
439
+ name: MTEB MindSmallReranking
440
+ config: default
441
+ split: test
442
+ revision: 227478e3235572039f4f7661840e059f31ef6eb1
443
+ metrics:
444
+ - type: map
445
+ value: 30.1
446
+ - task:
447
+ type: Retrieval
448
+ dataset:
449
+ type: mteb/NFCorpus
450
+ name: MTEB NFCorpus
451
+ config: default
452
+ split: test
453
+ revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
454
+ metrics:
455
+ - type: ndcg_at_10
456
+ value: 23.83
457
+ - task:
458
+ type: Retrieval
459
+ dataset:
460
+ type: mteb/NQ
461
+ name: MTEB NQ
462
+ config: default
463
+ split: test
464
+ revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
465
+ metrics:
466
+ - type: ndcg_at_10
467
+ value: 29.35
468
+ - task:
469
+ type: Retrieval
470
+ dataset:
471
+ type: mteb/QuoraRetrieval
472
+ name: MTEB QuoraRetrieval
473
+ config: default
474
+ split: test
475
+ revision: e4e08e0b7dbe3c8700f0daef558ff32256715259
476
+ metrics:
477
+ - type: ndcg_at_10
478
+ value: 47.12
479
+ - task:
480
+ type: Clustering
481
+ dataset:
482
+ type: mteb/RedditClustering
483
+ name: MTEB RedditClustering
484
+ config: default
485
+ split: test
486
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
487
+ metrics:
488
+ - type: v_measure
489
+ value: 37.83
490
+ - task:
491
+ type: Clustering
492
+ dataset:
493
+ type: mteb/RedditClusteringP2P
494
+ name: MTEB RedditClusteringP2P
495
+ config: default
496
+ split: test
497
+ revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
498
+ metrics:
499
+ - type: v_measure
500
+ value: 46.91
501
+ - task:
502
+ type: Retrieval
503
+ dataset:
504
+ type: mteb/SCIDOCS
505
+ name: MTEB SCIDOCS
506
+ config: default
507
+ split: test
508
+ revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
509
+ metrics:
510
+ - type: ndcg_at_10
511
+ value: 11.97
512
+ - task:
513
+ type: STS
514
+ dataset:
515
+ type: mteb/SICK-R
516
+ name: MTEB SICK-R
517
+ config: default
518
+ split: test
519
+ revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
520
+ metrics:
521
+ - type: cosine_spearman
522
+ value: 69.97
523
+ - task:
524
+ type: STS
525
+ dataset:
526
+ type: mteb/STS12
527
+ name: MTEB STS12
528
+ config: default
529
+ split: test
530
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
531
+ metrics:
532
+ - type: cosine_spearman
533
+ value: 67.62
534
+ - task:
535
+ type: STS
536
+ dataset:
537
+ type: mteb/STS13
538
+ name: MTEB STS13
539
+ config: default
540
+ split: test
541
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
542
+ metrics:
543
+ - type: cosine_spearman
544
+ value: 76.93
545
+ - task:
546
+ type: STS
547
+ dataset:
548
+ type: mteb/STS14
549
+ name: MTEB STS14
550
+ config: default
551
+ split: test
552
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
553
+ metrics:
554
+ - type: cosine_spearman
555
+ value: 74.41
556
+ - task:
557
+ type: STS
558
+ dataset:
559
+ type: mteb/STS15
560
+ name: MTEB STS15
561
+ config: default
562
+ split: test
563
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
564
+ metrics:
565
+ - type: cosine_spearman
566
+ value: 81.84
567
+ - task:
568
+ type: STS
569
+ dataset:
570
+ type: mteb/STS16
571
+ name: MTEB STS16
572
+ config: default
573
+ split: test
574
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
575
+ metrics:
576
+ - type: cosine_spearman
577
+ value: 77.59
578
+ - task:
579
+ type: STS
580
+ dataset:
581
+ type: mteb/STSBenchmark
582
+ name: MTEB STSBenchmark
583
+ config: default
584
+ split: test
585
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
586
+ metrics:
587
+ - type: cosine_spearman
588
+ value: 77.82
589
+ - task:
590
+ type: Reranking
591
+ dataset:
592
+ type: mteb/SciDocsRR
593
+ name: MTEB SciDocsRR
594
+ config: default
595
+ split: test
596
+ revision: 39b8377811871075eed9de3b8a7e21aaa6acb3d8
597
+ metrics:
598
+ - type: map
599
+ value: 71.62
600
+ - task:
601
+ type: Retrieval
602
+ dataset:
603
+ type: mteb/SciFact
604
+ name: MTEB SciFact
605
+ config: default
606
+ split: test
607
+ revision: d56462d0e63a25450459c4f213e49ffdb866f7f9
608
+ metrics:
609
+ - type: ndcg_at_10
610
+ value: 47.96
611
+ - task:
612
+ type: PairClassification
613
+ dataset:
614
+ type: mteb/SprintDuplicateQuestions
615
+ name: MTEB SprintDuplicateQuestions
616
+ config: default
617
+ split: test
618
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
619
+ metrics:
620
+ - type: cosine_ap
621
+ value: 93.48
622
+ - task:
623
+ type: Clustering
624
+ dataset:
625
+ type: mteb/StackExchangeClustering
626
+ name: MTEB StackExchangeClustering
627
+ config: default
628
+ split: test
629
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
630
+ metrics:
631
+ - type: v_measure
632
+ value: 43.63
633
+ - task:
634
+ type: Clustering
635
+ dataset:
636
+ type: mteb/StackExchangeClusteringP2P
637
+ name: MTEB StackExchangeClusteringP2P
638
+ config: default
639
+ split: test
640
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
641
+ metrics:
642
+ - type: v_measure
643
+ value: 33.44
644
+ - task:
645
+ type: Reranking
646
+ dataset:
647
+ type: mteb/StackOverflowDupQuestions
648
+ name: MTEB StackOverflowDupQuestions
649
+ config: default
650
+ split: test
651
+ revision: 5debda000fe8e27ebb5c123d38081f92e1847a59
652
+ metrics:
653
+ - type: map
654
+ value: 41.29
655
+ - task:
656
+ type: Summarization
657
+ dataset:
658
+ type: mteb/SummEval
659
+ name: MTEB SummEval
660
+ config: default
661
+ split: test
662
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
663
+ metrics:
664
+ - type: cosine_spearman
665
+ value: 31.77
666
+ - task:
667
+ type: Retrieval
668
+ dataset:
669
+ type: mteb/TRECCOVID
670
+ name: MTEB TRECCOVID
671
+ config: default
672
+ split: test
673
+ revision: bb9466bac8153a0349341eb1b22e06409e78ef4e
674
+ metrics:
675
+ - type: ndcg_at_10
676
+ value: 59.52
677
+ - task:
678
+ type: Retrieval
679
+ dataset:
680
+ type: mteb/Touche2020
681
+ name: MTEB Touche2020
682
+ config: default
683
+ split: test
684
+ revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
685
+ metrics:
686
+ - type: ndcg_at_10
687
+ value: 23.28
688
+ - task:
689
+ type: Classification
690
+ dataset:
691
+ type: mteb/ToxicConversationsClassification
692
+ name: MTEB ToxicConversationsClassification
693
+ config: default
694
+ split: test
695
+ revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de
696
+ metrics:
697
+ - type: accuracy
698
+ value: 60.13
699
+ - task:
700
+ type: Classification
701
+ dataset:
702
+ type: mteb/TweetSentimentExtractionClassification
703
+ name: MTEB TweetSentimentExtractionClassification
704
+ config: default
705
+ split: test
706
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
707
+ metrics:
708
+ - type: accuracy
709
+ value: 54.53
710
+ - task:
711
+ type: Clustering
712
+ dataset:
713
+ type: mteb/TwentyNewsgroupsClustering
714
+ name: MTEB TwentyNewsgroupsClustering
715
+ config: default
716
+ split: test
717
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
718
+ metrics:
719
+ - type: v_measure
720
+ value: 31.59
721
+ - task:
722
+ type: PairClassification
723
+ dataset:
724
+ type: mteb/TwitterSemEval2015
725
+ name: MTEB TwitterSemEval2015
726
+ config: default
727
+ split: test
728
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
729
+ metrics:
730
+ - type: cosine_ap
731
+ value: 60.03
732
+ - task:
733
+ type: PairClassification
734
+ dataset:
735
+ type: mteb/TwitterURLCorpus
736
+ name: MTEB TwitterURLCorpus
737
+ config: default
738
+ split: test
739
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
740
+ metrics:
741
+ - type: cosine_ap
742
+ value: 82.36
743
+ ---
744
+
745
+ # ogma-micro
746
+
747
+ **2.3M parameter text embedding model** by [Axiotic AI](https://axiotic.ai), achieving **49.77 average** on MTEB English v1 (54/54 tasks).
748
+
749
+ 2-layer transformer with hidden dimension 128 and token-embedding dimension 64; the smallest model in the comparison below.
750
+
751
+ ## Highlights
752
+
753
+ - **2.3M parameters** — small enough for CPU inference, edge deployment, and resource-constrained environments
754
+ - **49.77 MTEB average**: 1.45 points behind Potion-32M (51.22) and slightly ahead of Potion-8M (49.58), despite being significantly smaller
755
+ - **Matryoshka embeddings** — use dimensions [32, 64, 128] for flexible storage/compute tradeoffs
756
+ - **Asymmetric encoding** — dedicated `[QRY]`, `[DOC]`, `[SYM]` task tokens for query-document and symmetric tasks
757
+ - **1024 token context** — handles longer passages than typical small models (Potion: 512)
758
+ - **Pure PyTorch** — no external transformer library dependencies
759
+
760
+ ## Architecture
761
+
762
+ | Component | Details |
763
+ |-----------|---------|
764
+ | Parameters | 2.3M |
765
+ | Layers | 2 |
766
+ | Hidden dim (d_model) | 128 |
767
+ | Embedding dim (d_embed) | 64 |
768
+ | Output dim (d_output) | 128 |
769
+ | Attention heads | 2 |
770
+ | Max sequence length | 1024 |
771
+ | Matryoshka dims | [32, 64, 128] |
772
+ | Pooling | Mean (mask-aware) |
773
+ | Position encoding | RoPE |
774
+ | FFN | SwiGLU |
775
+ | Normalization | Pre-LayerNorm |
776
+ | Tokenizer | SentencePiece Unigram (30K vocab) |
777
+ | Training | Knowledge distillation from teacher model |
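The "Mean (mask-aware)" pooling entry can be illustrated with a small dependency-free sketch. This is not the code in `pooling.py`, whose contents are not shown here; the helper name `masked_mean_pool` and the list-of-lists layout are assumptions made for illustration. The idea is to average token vectors only over positions where the attention mask is 1, so padding does not dilute the sentence embedding.

```python
from typing import List


def masked_mean_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    if len(hidden) != len(mask):
        raise ValueError("hidden and mask must have the same length")
    dim = len(hidden[0]) if hidden else 0
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(hidden, mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding sequence to avoid dividing by zero.
    denom = max(count, 1)
    return [t / denom for t in total]


# Two real tokens followed by one padding token.
hidden_states = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]
pooled = masked_mean_pool(hidden_states, attention_mask)
print(pooled)
```

The padding vector is deliberately large here so that any leak of masked positions into the average would be obvious in the result.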
778
+
779
+ ## MTEB Results
780
+
781
+ ### Category-Level Scores
782
+
783
+ | Category | ogma-micro | Potion-32M | Potion-8M | vs Potion-32M |
784
+ |----------|------------|------------|-----------|---------------|
785
+ | Classification | **59.49** | 66.01 | 64.46 | -6.52 |
786
+ | Clustering | **36.88** | 39.24 | 36.88 | -2.36 |
787
+ | PairClassification | **78.62** | 78.17 | 76.62 | +0.45 |
788
+ | Reranking | **49.74** | 50.92 | 49.73 | -1.18 |
789
+ | Retrieval | **33.09** | 32.21 | 30.43 | +0.88 |
790
+ | STS | **75.63** | 73.86 | 72.93 | +1.77 |
791
+ | Summarization | **31.77** | 29.77 | 29.26 | +2.00 |
792
+ | **Overall** | **49.77** | 51.22 | 49.58 | **-1.45** |
793
+
794
+ > **Potion scores are locally reproduced** using the same evaluation pipeline and hardware for fair head-to-head comparison. These are not self-reported numbers from the Potion model card.
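As a quick sanity check, the "vs Potion-32M" column can be recomputed from the two score columns. The sketch below only restates numbers from the table above; the variable names are illustrative.

```python
# Category scores copied from the table above: (ogma-micro, Potion-32M).
scores = {
    "Classification": (59.49, 66.01),
    "Clustering": (36.88, 39.24),
    "PairClassification": (78.62, 78.17),
    "Reranking": (49.74, 50.92),
    "Retrieval": (33.09, 32.21),
    "STS": (75.63, 73.86),
    "Summarization": (31.77, 29.77),
    "Overall": (49.77, 51.22),
}

# Difference ogma-micro minus Potion-32M, rounded to two decimals.
deltas = {name: round(ogma - potion, 2) for name, (ogma, potion) in scores.items()}
wins = sorted(name for name, delta in deltas.items() if delta > 0)

for name, delta in deltas.items():
    print(f"{name:<20} {delta:+.2f}")
print("ogma-micro ahead in:", ", ".join(wins))
```

On these numbers ogma-micro leads in four of the seven categories while trailing on the overall average, which is driven mainly by the Classification gap.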
795
+
796
+ ## Usage
797
+
798
+ ### Quick Start
799
+
+ ```python
+ import torch
+
+ from config import TaskToken
+ from ogma_model import OgmaModel
+ from tokenizer import OgmaTokenizer
+
+ # Load the model from a checkpoint directory
+ model = OgmaModel.from_checkpoint("path/to/ogma-micro", device="cpu")
+ model.eval()
+
+ # Load the tokenizer. OgmaTokenizer needs the SentencePiece .model file,
+ # which is embedded in tokenizer.json; extract it from there or use:
+ tokenizer = OgmaTokenizer("path/to/tokenizer.model")
+
+ # Encode text
+ texts = ["This is a query", "This is a document"]
+ encoded = tokenizer.batch_encode(texts, max_length=1024)
+
+ token_ids = torch.tensor(encoded["input_ids"])
+ attention_mask = torch.tensor(encoded["attention_mask"])
+
+ # Use task tokens for asymmetric encoding
+ with torch.no_grad():
+     # For symmetric tasks (STS, clustering, classification)
+     embeddings = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+     # For retrieval, encode queries and documents separately
+     query_embs = model.encode(token_ids[:1], attention_mask[:1], task=TaskToken.QRY)
+     doc_embs = model.encode(token_ids[1:], attention_mask[1:], task=TaskToken.DOC)
+
+ print(f"Embedding shape: {embeddings.shape}")  # (2, 128)
+ ```
838
+
839
+ ### Matryoshka Dimensionality Reduction
840
+
+ ```python
+ # Full embeddings: 128d
+ full_embs = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+ # Reduce to any Matryoshka dimension: [32, 64, 128]
+ dim = 64
+ reduced_embs = torch.nn.functional.normalize(full_embs[:, :dim], p=2, dim=-1)
+ # These reduced embeddings are trained to be effective at lower dims
+ ```
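The same truncate-then-renormalize step can be tried without PyTorch or a loaded model. The sketch below uses toy vectors in place of real model outputs; `truncate_and_normalize` and `cosine` are hypothetical helper names, and the byte counts assume float32 storage.

```python
import math
from typing import List, Sequence

MATRYOSHKA_DIMS = (32, 64, 128)  # from the model card


def truncate_and_normalize(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if dim > len(vec):
        raise ValueError("dim exceeds the embedding length")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit-norm inputs."""
    return sum(x * y for x, y in zip(a, b))


# Toy 128-d vectors standing in for model outputs.
emb_a = [math.sin(i) for i in range(128)]
emb_b = [math.sin(i + 0.1) for i in range(128)]

for dim in MATRYOSHKA_DIMS:
    a = truncate_and_normalize(emb_a, dim)
    b = truncate_and_normalize(emb_b, dim)
    print(dim, round(cosine(a, b), 4), f"{dim * 4} bytes per float32 vector")
```

Renormalizing after truncation matters because a prefix of a unit-norm vector is generally no longer unit-norm, so dot products on raw prefixes would not be cosine similarities.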
850
+
851
+ ### Loading with safetensors
852
+
+ ```python
+ import json
+
+ from safetensors.torch import load_file
+
+ from config import OgmaConfig
+ from ogma_model import OgmaModel
+
+ # Load config
+ with open("path/to/ogma-micro/config.json") as f:
+     config_dict = json.load(f)
+
+ config = OgmaConfig.from_dict(config_dict)
+ model = OgmaModel(config)
+
+ # Load weights from safetensors
+ state_dict = load_file("path/to/ogma-micro/model.safetensors")
+ model.load_state_dict(state_dict)
+ model.eval()
+ ```
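`config.json` carries Hugging Face metadata keys (`architectures`, `model_type`, `auto_map`) alongside the model hyperparameters. How `OgmaConfig.from_dict` treats those extra keys is not visible in this part of the commit. One common approach, sketched here with a hypothetical stand-in dataclass `MiniConfig` rather than the real `OgmaConfig`, is to keep only the keys that match dataclass fields.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class MiniConfig:
    """Hypothetical stand-in holding a few hyperparameters from config.json."""

    d_embed: int = 128
    d_model: int = 256
    n_layers: int = 1
    matryoshka_dims: List[int] = field(default_factory=lambda: [32, 64, 128, 256])

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "MiniConfig":
        # Drop keys that are not dataclass fields (e.g. "architectures").
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


raw_config = {
    "architectures": ["OgmaModel"],
    "model_type": "ogma",
    "d_embed": 64,
    "d_model": 128,
    "n_layers": 2,
    "matryoshka_dims": [32, 64, 128],
}
cfg = MiniConfig.from_dict(raw_config)
print(cfg)
```

Filtering this way lets the same JSON file serve both the Hub tooling and the dataclass, at the cost of silently ignoring misspelled keys.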
873
+
874
+ ## Task Tokens
875
+
876
+ Ogma uses task-specific prefix tokens for asymmetric encoding:
877
+
878
+ | Token | ID | Use Case |
879
+ |-------|-----|----------|
880
+ | `[QRY]` | 4 | Query encoding for retrieval |
881
+ | `[DOC]` | 5 | Document/passage encoding for retrieval |
882
+ | `[SYM]` | 6 | Symmetric tasks (STS, classification, clustering) |
883
+
884
+ For retrieval tasks, encode queries with `[QRY]` and documents with `[DOC]`. For all other tasks, use `[SYM]`.
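To make the role of the task tokens concrete, here is a minimal sketch of one way a task token could be attached to a tokenized input. Whether `model.encode` prepends the token, replaces a position, or injects it some other way is not shown in this README, so prepending is an assumption, and `prepend_task_token` is a hypothetical helper. Only the token IDs come from the table above.

```python
from typing import List, Tuple

# IDs from the Task Tokens table.
TASK_TOKEN_IDS = {"QRY": 4, "DOC": 5, "SYM": 6}


def prepend_task_token(
    input_ids: List[int],
    attention_mask: List[int],
    task: str,
    max_length: int = 1024,
) -> Tuple[List[int], List[int]]:
    """Put the task token first and keep the sequence within max_length."""
    if task not in TASK_TOKEN_IDS:
        raise ValueError(f"unknown task {task!r}")
    ids = [TASK_TOKEN_IDS[task]] + list(input_ids)
    mask = [1] + list(attention_mask)
    return ids[:max_length], mask[:max_length]


# The same text encoded as a query and as a document differs only in the prefix.
query_ids, query_mask = prepend_task_token([17, 42, 99], [1, 1, 1], "QRY")
doc_ids, doc_mask = prepend_task_token([17, 42, 99], [1, 1, 1], "DOC")
print(query_ids, doc_ids)
```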
885
+
886
+ ## Training
887
+
888
+ Ogma is trained via **knowledge distillation** from a larger teacher embedding model. The training pipeline:
889
+
890
+ 1. **Tokenizer**: SentencePiece Unigram model trained on the distillation corpus (30K vocab)
891
+ 2. **Token embeddings**: PCA-reduced embeddings from the teacher model, providing a strong initialization
892
+ 3. **Distillation**: MSE loss between student and teacher embeddings, with Matryoshka loss at multiple dimensions
893
+ 4. **Architecture**: Standard transformer encoder with RoPE positional encoding and SwiGLU FFN
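The distillation objective in step 3 can be sketched in a few lines. The exact loss is not given in the README, so this sketch makes two assumptions: the teacher vector has already been brought to the same 128 dimensions as the student, and the Matryoshka term is the plain MSE applied to each nested prefix and averaged. The function names are illustrative.

```python
from typing import Sequence

MATRYOSHKA_DIMS = (32, 64, 128)  # from the model card


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def matryoshka_mse(
    student: Sequence[float],
    teacher: Sequence[float],
    dims: Sequence[int] = MATRYOSHKA_DIMS,
) -> float:
    """Average the MSE over each nested prefix of the embedding (assumed form)."""
    losses = [mse(student[:d], teacher[:d]) for d in dims]
    return sum(losses) / len(losses)


# Toy vectors: the student is off from the teacher by a constant 0.1.
teacher_vec = [0.01 * i for i in range(128)]
student_vec = [0.01 * i + 0.1 for i in range(128)]
loss = matryoshka_mse(student_vec, teacher_vec)
print(f"{loss:.4f}")
```

Because every prefix contributes its own term, the leading dimensions appear in all three losses and are therefore weighted most heavily, which is what pushes useful information toward the front of the vector.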
894
+
895
+ ## Files
896
+
897
+ | File | Description |
898
+ |------|-------------|
899
+ | `model.safetensors` | Model weights (safetensors format) |
900
+ | `model.pt` | Model weights (PyTorch format) |
901
+ | `config.json` | Model configuration |
902
+ | `config.yaml` | Original training config |
903
+ | `tokenizer.json` | HuggingFace tokenizer |
904
+ | `tokenizer_config.json` | Tokenizer configuration |
905
+ | `token_embeds_128d.npy` | Pre-computed token embeddings (30K × 128, float16) |
906
+ | `ogma_model.py` | OgmaModel class |
907
+ | `config.py` | OgmaConfig dataclass |
908
+ | `embeddings.py` | Token embedding + RoPE |
909
+ | `pooling.py` | Pooling strategies |
910
+ | `variants/transformer.py` | Transformer encoder variant |
911
+ | `tokenizer.py` | OgmaTokenizer wrapper |
912
+ | `results/` | MTEB result JSONs |
913
+
914
+ ## Citation
915
+
916
+ ```bibtex
917
+ @misc{ogma2026,
918
+ title={Ogma: Small High-Performance Text Embeddings},
919
+ author={Axiotic AI},
920
+ year={2026},
921
+ url={https://huggingface.co/axiotic/ogma-micro}
922
+ }
923
+ ```
924
+
925
+ ## License
926
+
927
+ MIT
config.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "architectures": [
+     "OgmaModel"
+   ],
+   "model_type": "ogma",
+   "auto_map": {
+     "AutoModel": "ogma_model.OgmaModel"
+   },
+   "variant": "transformer",
+   "d_embed": 64,
+   "d_model": 128,
+   "d_output": 128,
+   "n_layers": 2,
+   "n_heads": 2,
+   "vocab_size": 30000,
+   "max_seq_len": 1024,
+   "matryoshka_dims": [
+     32,
+     64,
+     128
+   ],
+   "pooling": "mean",
+   "ffn_mult": 2.6666666666666665,
+   "conv_kernel_size": 7,
+   "spatial_rank": 32,
+   "n_random_features": 128,
+   "dropout": 0.0,
+   "pad_id": 0,
+   "unk_id": 1,
+   "bos_id": 2,
+   "eos_id": 3,
+   "qry_id": 4,
+   "doc_id": 5,
+   "sym_id": 6,
+   "n_special_tokens": 7
+ }
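A rough back-of-the-envelope check shows how these hyperparameters could add up to the stated 2.3M parameters. The layer layout below is an assumption, not confirmed by the files shown: no biases, normalization parameters ignored, one `d_embed -> d_model` input projection, one `d_model -> d_output` head, and a three-matrix SwiGLU FFN whose hidden size is `int(ffn_mult * d_model)`.

```python
# Hyperparameters from config.json.
vocab_size, d_embed, d_model, d_output = 30_000, 64, 128, 128
n_layers, ffn_mult = 2, 8 / 3

# Assumed layout (see the text above): hidden FFN size rounded down.
d_ffn = int(ffn_mult * d_model)

embedding = vocab_size * d_embed           # token embedding table
input_proj = d_embed * d_model             # assumed projection into the model width
attention = 4 * d_model * d_model          # Q, K, V and output projections
swiglu = 3 * d_model * d_ffn               # gate, up and down projections
output_head = d_model * d_output           # assumed final projection

total = embedding + input_proj + n_layers * (attention + swiglu) + output_head
print(f"d_ffn={d_ffn}, embedding share={embedding / total:.0%}, total={total:,}")
```

Under these assumptions the embedding table accounts for most of the parameter budget, so the real breakdown may differ in the small terms without moving the total far from 2.3M.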
config.py ADDED
@@ -0,0 +1,161 @@
+ """Model configuration for Ogma."""
+
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from enum import StrEnum
+ from typing import Any
+
+ __all__ = ["OgmaConfig", "VariantType", "PoolingType", "TaskToken"]
+
+
+ class VariantType(StrEnum):
+     """Architecture variant identifiers."""
+
+     TRANSFORMER = "transformer"
+     DEEP_NARROW = "deep_narrow"
+     CONV = "conv"
+     LINEAR_ATTENTION = "linear_attention"
+     MLP_MIXER = "mlp_mixer"
+     TRANSFORMER_RESA = "transformer_resa"
+     GLA = "gla"
+
+
+ class PoolingType(StrEnum):
+     """Pooling strategy identifiers."""
+
+     TASK_TOKEN = "task_token"
+     LATENT_ATTENTION = "latent_attention"
+     MEAN = "mean"
+
+
+ class TaskToken(StrEnum):
+     """Task token identifiers for asymmetric encoding."""
+
+     QRY = "QRY"
+     DOC = "DOC"
+     SYM = "SYM"
+
+
+ @dataclass
+ class OgmaConfig:
+     """Configuration for an Ogma model instance.
+
+     Args:
+         variant: Architecture variant to use.
+         d_embed: Token embedding dimension (from teacher PCA).
+         d_model: Internal model dimension after projection.
+         n_layers: Number of fusion layers/blocks.
+         n_heads: Number of attention heads (attention variants only).
+         vocab_size: Vocabulary size for embedding table.
+         max_seq_len: Maximum sequence length.
+         matryoshka_dims: Nested output dimensions for Matryoshka.
+         pooling: Pooling strategy.
+         d_output: Final output dimension.
+         ffn_mult: SwiGLU FFN hidden dimension multiplier.
+         conv_kernel_size: Kernel size for conv variant.
+         spatial_rank: Rank of spatial mixing in MLP mixer.
+         n_random_features: Random features for linear attention.
+         dropout: Dropout rate (0 for inference).
+     """
+
+     variant: VariantType = VariantType.TRANSFORMER
+     d_embed: int = 128
+     d_model: int = 256
+     n_layers: int = 1
+     n_heads: int = 4
+     vocab_size: int = 30_000
+     max_seq_len: int = 512
+     matryoshka_dims: list[int] = field(
+         default_factory=lambda: [32, 64, 128, 256]
+     )
+     pooling: PoolingType = PoolingType.TASK_TOKEN
+     d_output: int = 256
+     ffn_mult: float = 8 / 3  # SwiGLU: 8/3 * d_model ≈ 683 for d=256
+     conv_kernel_size: int = 7
+     spatial_rank: int = 32
+     n_random_features: int = 128
+     dropout: float = 0.0
+
+     # ReSA scorer settings
+     scorer_type: str = "dot"
+     scorer_alpha_init: float = 0.1
+     scorer_hidden: int = 0  # 0 defaults to d_head
+
+     # GLA (Gated Linear Attention) settings
+     gla_expand_k: float = 0.5  # key dim expansion (key_dim = d_model * expand_k)
+     gla_expand_v: float = 1.0  # value dim expansion (value_dim = d_model * expand_v)
+     gla_gate_low_rank_dim: int = 16  # low-rank dim for gating projection
+     gla_gate_logit_normalizer: int = 16  # normalizer for gate logits
+     gla_use_short_conv: bool = True  # whether to use short conv on Q,K,V
+     gla_conv_size: int = 4  # short conv kernel size
+
+     # Special token IDs
+     pad_id: int = 0
+     unk_id: int = 1
+     bos_id: int = 2
+     eos_id: int = 3
+     qry_id: int = 4
+     doc_id: int = 5
+     sym_id: int = 6
+     n_special_tokens: int = 7
+
+     @property
+     def d_head(self) -> int:
+         """Per-head dimension."""
+         return self.d_model // self.n_heads
+
+     @property
+     def ffn_hidden(self) -> int:
+         """SwiGLU FFN hidden dimension."""
+         return int(self.d_model * self.ffn_mult)
+
+     def task_token_id(self, task: TaskToken) -> int:
+         """Return token ID for a task token."""
+         mapping = {
+             TaskToken.QRY: self.qry_id,
+             TaskToken.DOC: self.doc_id,
+             TaskToken.SYM: self.sym_id,
+         }
+         return mapping[task]
+
+     def to_dict(self) -> dict[str, Any]:
+         """Serialize config to dictionary."""
+         return {
+             "variant": self.variant.value,
+             "d_embed": self.d_embed,
+             "d_model": self.d_model,
+             "n_layers": self.n_layers,
+             "n_heads": self.n_heads,
+             "vocab_size": self.vocab_size,
+             "max_seq_len": self.max_seq_len,
+             "matryoshka_dims": self.matryoshka_dims,
+             "pooling": self.pooling.value,
+             "d_output": self.d_output,
+             "ffn_mult": self.ffn_mult,
+             "conv_kernel_size": self.conv_kernel_size,
+             "spatial_rank": self.spatial_rank,
+             "n_random_features": self.n_random_features,
+             "dropout": self.dropout,
+             "scorer_type": self.scorer_type,
+             "scorer_alpha_init": self.scorer_alpha_init,
+             "scorer_hidden": self.scorer_hidden,
+             "gla_expand_k": self.gla_expand_k,
+             "gla_expand_v": self.gla_expand_v,
+             "gla_gate_low_rank_dim": self.gla_gate_low_rank_dim,
+             "gla_gate_logit_normalizer": self.gla_gate_logit_normalizer,
+             "gla_use_short_conv": self.gla_use_short_conv,
+             "gla_conv_size": self.gla_conv_size,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict[str, Any]) -> OgmaConfig:
+         """Deserialize config from dictionary."""
+         data = dict(data)
+         if "variant" in data:
+             data["variant"] = VariantType(data["variant"])
+         if "pooling" in data:
+             data["pooling"] = PoolingType(data["pooling"])
+         known = {f.name for f in cls.__dataclass_fields__.values()}
+         filtered = {k: v for k, v in data.items() if k in known}
+         return cls(**filtered)
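`OgmaConfig.from_dict` converts string values to enums, silently drops keys that are not dataclass fields, and leaves missing keys at their defaults. The following minimal sketch reproduces that pattern with a hypothetical `MiniConfig` and `Pooling` enum (stand-ins invented for illustration; they are not part of the repository). It uses `class Pooling(str, Enum)` instead of `StrEnum` so that it also runs on Python versions older than 3.11.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Any


class Pooling(str, Enum):  # hypothetical stand-in for PoolingType
    MEAN = "mean"
    TASK_TOKEN = "task_token"


@dataclass
class MiniConfig:  # hypothetical stand-in for OgmaConfig
    d_model: int = 256
    n_heads: int = 4
    pooling: Pooling = Pooling.TASK_TOKEN

    @classmethod
    def from_dict(cls, data: "dict[str, Any]") -> "MiniConfig":
        data = dict(data)  # copy so the caller's dict is not mutated
        if "pooling" in data:
            data["pooling"] = Pooling(data["pooling"])
        known = {f.name for f in fields(cls)}
        # Unknown keys (e.g. "model_type") are dropped rather than rejected.
        return cls(**{k: v for k, v in data.items() if k in known})


raw = {"d_model": 128, "pooling": "mean", "model_type": "ogma"}
cfg = MiniConfig.from_dict(raw)
print(cfg)
```

One consequence of this design is that a misspelled key is ignored rather than reported, so the affected field quietly keeps its default value.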
config.yaml ADDED
@@ -0,0 +1,18 @@
+ conv_kernel_size: 7
+ d_embed: 64
+ d_model: 128
+ d_output: 128
+ dropout: 0.0
+ ffn_mult: 2.6666666666666665
+ matryoshka_dims:
+ - 32
+ - 64
+ - 128
+ max_seq_len: 1024
+ n_heads: 2
+ n_layers: 2
+ n_random_features: 128
+ pooling: mean
+ spatial_rank: 32
+ variant: transformer
+ vocab_size: 30000
embeddings.py ADDED
@@ -0,0 +1,143 @@
+ """Token embeddings, task token embeddings, and RoPE for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+
+ from ogma.model.config import OgmaConfig
+
+ __all__ = ["TokenEmbedding", "RotaryPositionalEncoding"]
+
+
+ class TokenEmbedding(nn.Module):
+     """Token embedding with optional linear projection.
+
+     Loads a vocab_size x d_embed embedding table and projects to d_model.
+     Includes 3 learnable task token embeddings ([QRY], [DOC], [SYM]).
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embed = nn.Embedding(
+             config.vocab_size + config.n_special_tokens,
+             config.d_embed,
+             padding_idx=config.pad_id,
+         )
+         if config.d_embed != config.d_model:
+             self.proj = nn.Linear(config.d_embed, config.d_model)
+         else:
+             self.proj = nn.Identity()  # type: ignore[assignment]
+
+         # Task token embeddings are learned separately at d_model
+         self.task_tokens = nn.Embedding(3, config.d_model)
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Embed tokens and prepend task token.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, S+1, d_model) embeddings with task token prepended.
+         """
+         # Embed and project regular tokens
+         x = self.embed(token_ids)  # (B, S, d_embed)
+         x = self.proj(x)  # (B, S, d_model)
+
+         # Get task token embeddings (map 4,5,6 -> 0,1,2)
+         task_idx = task_token_ids - self.config.qry_id  # (B,)
+         task_emb = self.task_tokens(task_idx)  # (B, d_model)
+         task_emb = task_emb.unsqueeze(1)  # (B, 1, d_model)
+
+         # Prepend task token
+         return torch.cat([task_emb, x], dim=1)  # (B, S+1, d_model)
+
+     def load_pretrained_embeddings(
+         self, embeddings: torch.Tensor
+     ) -> None:
+         """Load pre-computed token embeddings (e.g., from teacher PCA).
+
+         Args:
+             embeddings: (vocab_size, d_embed) tensor.
+         """
+         with torch.no_grad():
+             n = min(embeddings.shape[0], self.config.vocab_size)
+             start = self.config.n_special_tokens
+             self.embed.weight[start : n + start] = embeddings[:n]
+
+
+ class RotaryPositionalEncoding(nn.Module):
+     """Rotary Position Embedding (RoPE). Zero trainable parameters."""
+
+     def __init__(self, dim: int, max_seq_len: int = 512) -> None:
+         super().__init__()
+         inv_freq = 1.0 / (
+             10000.0 ** (torch.arange(0, dim, 2).float() / dim)
+         )
+         self.register_buffer("inv_freq", inv_freq)
+         self._build_cache(max_seq_len)
+
+     def _build_cache(self, seq_len: int) -> None:
+         inv_freq: torch.Tensor = self.inv_freq  # type: ignore[assignment]
+         # Build positions on the same device as inv_freq so that a cache
+         # rebuild after the module has been moved (e.g. to GPU) does not
+         # mix devices in torch.outer.
+         t = torch.arange(
+             seq_len, dtype=inv_freq.dtype, device=inv_freq.device
+         )
+         freqs = torch.outer(t, inv_freq)
+         cos_cached = freqs.cos()
+         sin_cached = freqs.sin()
+         self.register_buffer("cos_cached", cos_cached, persistent=False)
+         self.register_buffer("sin_cached", sin_cached, persistent=False)
+
+     def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+         """Return cos and sin for sequence length of x.
+
+         Args:
+             x: (B, S, ...) tensor to determine sequence length.
+
+         Returns:
+             Tuple of (cos, sin) each of shape (S, d_head//2).
+         """
+         seq_len = x.shape[1]
+         cos: torch.Tensor = self.cos_cached  # type: ignore[assignment]
+         sin: torch.Tensor = self.sin_cached  # type: ignore[assignment]
+         if seq_len > cos.shape[0]:
+             self._build_cache(seq_len)
+             cos = self.cos_cached  # type: ignore[assignment]
+             sin = self.sin_cached  # type: ignore[assignment]
+         return cos[:seq_len], sin[:seq_len]
+
+
+ def apply_rope(
+     q: torch.Tensor,
+     k: torch.Tensor,
+     cos: torch.Tensor,
+     sin: torch.Tensor,
+ ) -> tuple[torch.Tensor, torch.Tensor]:
+     """Apply rotary embeddings to query and key tensors.
+
+     Args:
+         q: (B, n_heads, S, d_head) query tensor.
+         k: (B, n_heads, S, d_head) key tensor.
+         cos: (S, d_head//2) cosine cache.
+         sin: (S, d_head//2) sine cache.
+
+     Returns:
+         Rotated (q, k) tensors.
+     """
+
+     def _rotate(x: torch.Tensor) -> torch.Tensor:
+         x1 = x[..., : x.shape[-1] // 2]
+         x2 = x[..., x.shape[-1] // 2 :]
+         cos_exp = cos.unsqueeze(0).unsqueeze(0)  # (1, 1, S, d_head//2)
+         sin_exp = sin.unsqueeze(0).unsqueeze(0)
+         return torch.cat(
+             [x1 * cos_exp - x2 * sin_exp, x2 * cos_exp + x1 * sin_exp],
+             dim=-1,
+         )
+
+     return _rotate(q), _rotate(k)
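The point of rotating queries and keys this way is that their dot product depends only on the offset between positions, not on the absolute positions. The sketch below checks this numerically in plain Python for a single head, using the same half-split layout and the same frequency formula as `apply_rope` and `RotaryPositionalEncoding`. The helper names (`rope_angles`, `rotate`, `dot`) and the toy vectors are illustrative assumptions and do not appear in the repository.

```python
import math


def rope_angles(pos, dim, base=10000.0):
    """Angles pos * inv_freq, one per channel pair (dim // 2 values)."""
    return [pos / (base ** (i / dim)) for i in range(0, dim, 2)]


def rotate(x, pos):
    """Rotate x for position `pos` using the half-split layout of apply_rope."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    ang = rope_angles(pos, len(x))
    first = [a * math.cos(t) - b * math.sin(t) for a, b, t in zip(x1, x2, ang)]
    second = [b * math.cos(t) + a * math.sin(t) for a, b, t in zip(x1, x2, ang)]
    return first + second


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.5, -0.75, 1.0]

# The same offset (3) at two different absolute positions.
score_a = dot(rotate(q, 5), rotate(k, 2))
score_b = dot(rotate(q, 13), rotate(k, 10))
print(abs(score_a - score_b) < 1e-9)
```

Each channel pair is rotated by a plane rotation, so vector norms are preserved as well; only the relative angle between query and key changes with position.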
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38f44e896ac5ada99528f51cf0b1ce391c0daed5b5a089f4161d236a79955eb9
+ size 9301776
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab1df84379586cadb57f6c68ec5237845cff5aa33cec1fe7a8f5fe5422d6b8ee
+ size 9295512
ogma_model.py ADDED
@@ -0,0 +1,203 @@
+ """OgmaModel: top-level model wrapping any architecture variant."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, TaskToken, VariantType
+ from ogma.model.embeddings import TokenEmbedding
+ from ogma.model.pooling import create_pooling
+ from ogma.model.variants.conv import ConvVariant
+ from ogma.model.variants.deep_narrow import DeepNarrowVariant
+ from ogma.model.variants.linear_attention import LinearAttentionVariant
+ from ogma.model.variants.mlp_mixer import MLPMixerVariant
+ from ogma.model.variants.transformer import TransformerVariant
+ from ogma.model.variants.transformer_resa import TransformerReSAVariant
+ from ogma.model.variants.gla import GLAVariant
+
+ __all__ = ["OgmaModel"]
+
+ MAX_PARAMS = 10_000_000
+
+
+ def _build_variant(config: OgmaConfig) -> nn.Module:
+     """Instantiate the appropriate architecture variant."""
+     if config.variant == VariantType.TRANSFORMER:
+         return TransformerVariant(config)
+     elif config.variant == VariantType.DEEP_NARROW:
+         return DeepNarrowVariant(config)
+     elif config.variant == VariantType.CONV:
+         return ConvVariant(config)
+     elif config.variant == VariantType.LINEAR_ATTENTION:
+         return LinearAttentionVariant(config)
+     elif config.variant == VariantType.MLP_MIXER:
+         return MLPMixerVariant(config)
+     elif config.variant == VariantType.TRANSFORMER_RESA:
+         return TransformerReSAVariant(config)
+     elif config.variant == VariantType.GLA:
+         return GLAVariant(config)
+     raise ValueError(f"Unknown variant: {config.variant}")
+
+
+ class OgmaModel(nn.Module):
+     """Ogma embedding model.
+
+     Wraps any architecture variant with shared embedding, pooling, and
+     normalization. Produces L2-normalized embeddings at d_output dimensions,
+     Matryoshka-compatible at configured sub-dimensions.
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embedding = TokenEmbedding(config)
+         self.variant = _build_variant(config)
+         self.pooling = create_pooling(config)
+
+         # Output projection if variant output != d_output
+         needs_proj = (
+             config.variant == VariantType.DEEP_NARROW
+             and config.d_model != config.d_output
+         )
+         # DeepNarrowVariant already has output_proj, so no extra needed here
+         if not needs_proj and config.d_model != config.d_output:
+             self.output_proj: nn.Module = nn.Linear(
+                 config.d_model, config.d_output
+             )
+         else:
+             self.output_proj = nn.Identity()
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Forward pass producing L2-normalized embeddings.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask (1=valid, 0=pad).
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         # Embed tokens with task token prepended -> (B, S+1, d_model)
+         x = self.embedding(token_ids, task_token_ids)
+
+         # Extend attention mask for prepended task token
+         task_mask = torch.ones(
+             attention_mask.shape[0], 1,
+             device=attention_mask.device,
+             dtype=attention_mask.dtype,
+         )
+         extended_mask = torch.cat([task_mask, attention_mask], dim=1)
+
+         # Run through variant
+         x = self.variant(x, extended_mask)
+
+         # Pool
+         x = self.pooling(x, extended_mask)
+
+         # Project if needed
+         x = self.output_proj(x)
+
+         # L2 normalize
+         return F.normalize(x, p=2, dim=-1)
+
+     def encode(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task: TaskToken = TaskToken.SYM,
+     ) -> torch.Tensor:
+         """Encode tokens with a specified task mode.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask.
+             task: Task token to use.
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         task_ids = torch.full(
+             (token_ids.shape[0],),
+             self.config.task_token_id(task),
+             device=token_ids.device,
+             dtype=torch.long,
+         )
+         return self.forward(token_ids, attention_mask, task_ids)
+
+     def param_count(self) -> int:
+         """Count total trainable parameters."""
+         return sum(p.numel() for p in self.parameters() if p.requires_grad)
+
+     def assert_param_budget(self) -> None:
+         """Assert model is under the 10M parameter budget."""
+         count = self.param_count()
+         assert count < MAX_PARAMS, (
+             f"Model has {count:,} params, exceeds {MAX_PARAMS:,} budget"
+         )
+
+     @classmethod
+     def from_config(cls, config: OgmaConfig) -> OgmaModel:
+         """Factory method to build a model from config."""
+         model = cls(config)
+         model.assert_param_budget()
+         return model
+
+     @classmethod
+     def from_checkpoint(
+         cls,
+         path: str,
+         device: str = "cpu",
+     ) -> OgmaModel:
+         """Load model from a checkpoint directory.
+
+         Args:
+             path: Path to checkpoint directory containing config.yaml
+                 and model.pt.
+             device: Device to load model to.
+
+         Returns:
+             Loaded OgmaModel.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         with open(ckpt_path / "config.yaml") as f:
+             config_dict = yaml.safe_load(f)
+         config = OgmaConfig.from_dict(config_dict)
+
+         model = cls(config)
+         state_dict = torch.load(
+             ckpt_path / "model.pt",
+             map_location=device,
+             weights_only=True,
+         )
+         model.load_state_dict(state_dict)
+         model.to(device)
+         model.eval()
+         return model
+
+     def save_checkpoint(self, path: str) -> None:
+         """Save model checkpoint.
+
+         Args:
+             path: Directory to save config.yaml and model.pt.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         ckpt_path.mkdir(parents=True, exist_ok=True)
+         with open(ckpt_path / "config.yaml", "w") as f:
+             yaml.dump(self.config.to_dict(), f, default_flow_style=False)
+         torch.save(self.state_dict(), ckpt_path / "model.pt")
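The `OgmaModel` docstring describes the output as L2-normalized and "Matryoshka-compatible at configured sub-dimensions". The model code shown here always returns the full `d_output` vector, so any truncation would happen on the caller's side. The sketch below illustrates the usual Matryoshka usage, which is to keep a prefix of the vector and re-normalize it; that this is the intended usage for this model is an assumption, and the helper names and toy 8-dimensional vectors are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def truncate(embedding, dim):
    """Keep the first `dim` components and re-normalize to unit length."""
    return l2_normalize(embedding[:dim])


def cosine(u, v):
    """Cosine similarity for unit vectors reduces to a dot product."""
    return sum(a * b for a, b in zip(u, v))


# Toy vectors standing in for two rows of a (B, d_output) model output.
a = l2_normalize([0.9, 0.1, -0.3, 0.4, 0.05, -0.02, 0.01, 0.03])
b = l2_normalize([0.8, 0.2, -0.2, 0.5, -0.04, 0.06, 0.02, -0.01])

for dim in (2, 4, 8):
    print(dim, round(cosine(truncate(a, dim), truncate(b, dim)), 4))
```

Re-normalizing after truncation matters because a prefix of a unit vector generally has norm below 1, which would otherwise shrink dot-product scores.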
pooling.py ADDED
@@ -0,0 +1,99 @@
+ """Pooling strategies for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, PoolingType
+
+ __all__ = [
+     "create_pooling",
+     "TaskTokenPooling",
+     "LatentAttentionPooling",
+     "MeanPooling",
+ ]
+
+
+ def create_pooling(config: OgmaConfig) -> nn.Module:
+     """Factory for pooling layers."""
+     if config.pooling == PoolingType.TASK_TOKEN:
+         return TaskTokenPooling()
+     elif config.pooling == PoolingType.LATENT_ATTENTION:
+         return LatentAttentionPooling(config.d_model)
+     elif config.pooling == PoolingType.MEAN:
+         return MeanPooling()
+     raise ValueError(f"Unknown pooling type: {config.pooling}")
+
+
+ class TaskTokenPooling(nn.Module):
+     """Use the output at position 0 (task token) as the sentence embedding."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Extract task token output.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: unused, for interface compatibility.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         return x[:, 0, :]
+
+
+ class LatentAttentionPooling(nn.Module):
+     """Learned query vector attends over all token outputs."""
+
+     def __init__(self, d_model: int) -> None:
+         super().__init__()
+         self.query = nn.Parameter(torch.randn(d_model))
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Attend over sequence with learned query.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         # (B, S)
+         scores = torch.matmul(x, self.query) / (x.shape[-1] ** 0.5)
+         if attention_mask is not None:
+             scores = scores.masked_fill(attention_mask == 0, float("-inf"))
+         weights = F.softmax(scores, dim=-1)  # (B, S)
+         return torch.bmm(weights.unsqueeze(1), x).squeeze(1)  # (B, D)
+
+
+ class MeanPooling(nn.Module):
+     """Average all token outputs (excluding padding)."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Mean pool over valid tokens.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         if attention_mask is None:
+             return x.mean(dim=1)
+         mask = attention_mask.unsqueeze(-1).float()  # (B, S, 1)
+         return (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
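`MeanPooling` is the strategy selected by this checkpoint's config (`pooling: mean`). As a plain-Python illustration of the same arithmetic for a single sequence, the sketch below sums only the valid rows and divides by the number of valid tokens, with a lower bound on the divisor that mirrors `clamp(min=1e-9)`. The function name `masked_mean` and the toy data are assumptions made for this example.

```python
def masked_mean(x, mask, eps=1e-9):
    """x: S rows of D floats; mask: S ints (1=valid, 0=pad). Returns D floats."""
    dim = len(x[0])
    total = [0.0] * dim
    for row, m in zip(x, mask):
        for j in range(dim):
            total[j] += row[j] * m  # padded rows contribute nothing
    count = max(float(sum(mask)), eps)  # avoids division by zero
    return [t / count for t in total]


tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last row is padding
pooled = masked_mean(tokens, [1, 1, 0])
print(pooled)  # [2.0, 3.0]
```

Note that `OgmaModel.forward` passes the extended mask to the pooling layer, so with mean pooling the prepended task token position is averaged together with the regular tokens.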
results/AmazonCounterfactualClassification.json ADDED
@@ -0,0 +1,268 @@
+ {
+   "dataset_revision": "1f7e6a9d6fa6e64c53d146e428565640410c0df1",
+   "task_name": "AmazonCounterfactualClassification",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.694153,
+             "f1": 0.574219,
+             "f1_weighted": 0.753103,
+             "precision": 0.5942,
+             "precision_weighted": 0.887395,
+             "recall": 0.733927,
+             "recall_weighted": 0.694153,
+             "ap": 0.198002,
+             "ap_weighted": 0.198002
+           },
+           {
+             "accuracy": 0.651424,
+             "f1": 0.527587,
+             "f1_weighted": 0.719054,
+             "precision": 0.56284,
+             "precision_weighted": 0.864599,
+             "recall": 0.659219,
+             "recall_weighted": 0.651424,
+             "ap": 0.156012,
+             "ap_weighted": 0.156012
+           },
+           {
+             "accuracy": 0.652924,
+             "f1": 0.541777,
+             "f1_weighted": 0.720423,
+             "precision": 0.579342,
+             "precision_weighted": 0.880513,
+             "recall": 0.704557,
+             "recall_weighted": 0.652924,
+             "ap": 0.177086,
+             "ap_weighted": 0.177086
+           },
+           {
+             "accuracy": 0.562219,
+             "f1": 0.465276,
+             "f1_weighted": 0.645508,
+             "precision": 0.538622,
+             "precision_weighted": 0.850107,
+             "recall": 0.60307,
+             "recall_weighted": 0.562219,
+             "ap": 0.130999,
+             "ap_weighted": 0.130999
+           },
+           {
+             "accuracy": 0.675412,
+             "f1": 0.549623,
+             "f1_weighted": 0.738039,
+             "precision": 0.575711,
+             "precision_weighted": 0.873125,
+             "recall": 0.688501,
+             "recall_weighted": 0.675412,
+             "ap": 0.171742,
+             "ap_weighted": 0.171742
+           },
+           {
+             "accuracy": 0.670165,
+             "f1": 0.547769,
+             "f1_weighted": 0.734008,
+             "precision": 0.576391,
+             "precision_weighted": 0.87466,
+             "recall": 0.69193,
+             "recall_weighted": 0.670165,
+             "ap": 0.172833,
+             "ap_weighted": 0.172833
+           },
+           {
+             "accuracy": 0.61919,
+             "f1": 0.494769,
+             "f1_weighted": 0.693242,
+             "precision": 0.541343,
+             "precision_weighted": 0.848086,
+             "recall": 0.606261,
+             "recall_weighted": 0.61919,
+             "ap": 0.133487,
+             "ap_weighted": 0.133487
+           },
+           {
+             "accuracy": 0.647676,
+             "f1": 0.516935,
+             "f1_weighted": 0.715873,
+             "precision": 0.552417,
+             "precision_weighted": 0.855409,
+             "recall": 0.631697,
+             "recall_weighted": 0.647676,
+             "ap": 0.144229,
+             "ap_weighted": 0.144229
+           },
+           {
+             "accuracy": 0.706897,
+             "f1": 0.577121,
+             "f1_weighted": 0.762565,
+             "precision": 0.590771,
+             "precision_weighted": 0.881359,
+             "recall": 0.718789,
+             "recall_weighted": 0.706897,
+             "ap": 0.191879,
+             "ap_weighted": 0.191879
+           },
+           {
+             "accuracy": 0.651424,
+             "f1": 0.528566,
+             "f1_weighted": 0.719077,
+             "precision": 0.564001,
+             "precision_weighted": 0.865698,
+             "recall": 0.662397,
+             "recall_weighted": 0.651424,
+             "ap": 0.157407,
+             "ap_weighted": 0.157407
+           }
+         ],
+         "accuracy": 0.653148,
+         "f1": 0.532364,
+         "f1_weighted": 0.720089,
+         "precision": 0.567564,
+         "precision_weighted": 0.868095,
+         "recall": 0.670035,
+         "recall_weighted": 0.653148,
+         "ap": 0.163368,
+         "ap_weighted": 0.163368,
+         "main_score": 0.653148,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.585075,
+             "f1": 0.53891,
+             "f1_weighted": 0.627755,
+             "precision": 0.577848,
+             "precision_weighted": 0.762129,
+             "recall": 0.62366,
+             "recall_weighted": 0.585075,
+             "ap": 0.250283,
+             "ap_weighted": 0.250283
+           },
+           {
+             "accuracy": 0.661194,
+             "f1": 0.586347,
+             "f1_weighted": 0.693497,
+             "precision": 0.593012,
+             "precision_weighted": 0.76473,
+             "recall": 0.639189,
+             "recall_weighted": 0.661194,
+             "ap": 0.265176,
+             "ap_weighted": 0.265176
+           },
+           {
+             "accuracy": 0.632836,
+             "f1": 0.57121,
+             "f1_weighted": 0.670199,
+             "precision": 0.589383,
+             "precision_weighted": 0.766672,
+             "recall": 0.638899,
+             "recall_weighted": 0.632836,
+             "ap": 0.262175,
+             "ap_weighted": 0.262175
+           },
+           {
+             "accuracy": 0.623881,
+             "f1": 0.575591,
+             "f1_weighted": 0.662769,
+             "precision": 0.604335,
+             "precision_weighted": 0.785576,
+             "recall": 0.665114,
+             "recall_weighted": 0.623881,
+             "ap": 0.277003,
+             "ap_weighted": 0.277003
+           },
+           {
+             "accuracy": 0.653731,
+             "f1": 0.600785,
+             "f1_weighted": 0.689318,
+             "precision": 0.619073,
+             "precision_weighted": 0.795829,
+             "recall": 0.686556,
+             "recall_weighted": 0.653731,
+             "ap": 0.294219,
+             "ap_weighted": 0.294219
+           },
+           {
+             "accuracy": 0.638806,
+             "f1": 0.573603,
+             "f1_weighted": 0.67514,
+             "precision": 0.588751,
+             "precision_weighted": 0.764807,
+             "recall": 0.636831,
+             "recall_weighted": 0.638806,
+             "ap": 0.261489,
+             "ap_weighted": 0.261489
+           },
+           {
+             "accuracy": 0.708955,
+             "f1": 0.626367,
+             "f1_weighted": 0.733338,
+             "precision": 0.621397,
+             "precision_weighted": 0.782315,
+             "recall": 0.671763,
+             "recall_weighted": 0.708955,
+             "ap": 0.294222,
+             "ap_weighted": 0.294222
+           },
+           {
+             "accuracy": 0.695522,
+             "f1": 0.621683,
+             "f1_weighted": 0.723462,
+             "precision": 0.621033,
+             "precision_weighted": 0.786205,
+             "recall": 0.67786,
+             "recall_weighted": 0.695522,
+             "ap": 0.295622,
+             "ap_weighted": 0.295622
+           },
+           {
+             "accuracy": 0.61791,
+             "f1": 0.575399,
+             "f1_weighted": 0.657213,
+             "precision": 0.610702,
+             "precision_weighted": 0.794506,
+             "recall": 0.675849,
+             "recall_weighted": 0.61791,
+             "ap": 0.282911,
+             "ap_weighted": 0.282911
+           },
+           {
+             "accuracy": 0.620896,
+             "f1": 0.562928,
+             "f1_weighted": 0.659857,
+             "precision": 0.585763,
+             "precision_weighted": 0.76494,
+             "recall": 0.634367,
+             "recall_weighted": 0.620896,
+             "ap": 0.258547,
+             "ap_weighted": 0.258547
+           }
+         ],
+         "accuracy": 0.643881,
+         "f1": 0.583282,
+         "f1_weighted": 0.679255,
+         "precision": 0.60113,
+         "precision_weighted": 0.776771,
+         "recall": 0.655009,
+         "recall_weighted": 0.643881,
+         "ap": 0.274165,
+         "ap_weighted": 0.274165,
+         "main_score": 0.643881,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 16.708924055099487,
+   "kg_co2_emissions": null,
+   "date": null
+ }
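In this results file, each subset-level score appears to be the arithmetic mean of the ten entries in `scores_per_experiment`, and `main_score` repeats the subset-level accuracy. This is an inference from the numbers rather than something the file states. The sketch below recomputes the "en-ext" accuracy from the per-experiment values copied from the file.

```python
from statistics import mean

# Per-experiment accuracies for the "en-ext" subset, copied from the file above.
accuracies = [
    0.694153, 0.651424, 0.652924, 0.562219, 0.675412,
    0.670165, 0.61919, 0.647676, 0.706897, 0.651424,
]
reported = 0.653148  # subset-level "accuracy" and "main_score"

recomputed = mean(accuracies)
# Scores are stored with six decimals, so compare with a small tolerance.
print(abs(recomputed - reported) < 5e-6)
```

The same check can be repeated for the other metrics or for the "en" subset by swapping in the corresponding values.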
results/AmazonPolarityClassification.json ADDED
@@ -0,0 +1,140 @@
+ {
+   "dataset_revision": "e2d317d38cd51312af73b3d32a06d1a08b442046",
+   "task_name": "AmazonPolarityClassification",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.73794,
+             "f1": 0.737817,
+             "f1_weighted": 0.737817,
+             "precision": 0.738387,
+             "precision_weighted": 0.738387,
+             "recall": 0.73794,
+             "recall_weighted": 0.73794,
+             "ap": 0.678148,
+             "ap_weighted": 0.678148
+           },
+           {
+             "accuracy": 0.631555,
+             "f1": 0.629723,
+             "f1_weighted": 0.629723,
+             "precision": 0.634211,
+             "precision_weighted": 0.634211,
+             "recall": 0.631555,
+             "recall_weighted": 0.631555,
+             "ap": 0.58095,
+             "ap_weighted": 0.58095
+           },
+           {
+             "accuracy": 0.680837,
+             "f1": 0.680055,
+             "f1_weighted": 0.680055,
+             "precision": 0.682624,
+             "precision_weighted": 0.682624,
+             "recall": 0.680837,
+             "recall_weighted": 0.680837,
+             "ap": 0.620178,
+             "ap_weighted": 0.620178
+           },
+           {
+             "accuracy": 0.637493,
+             "f1": 0.619155,
+             "f1_weighted": 0.619155,
+             "precision": 0.670291,
+             "precision_weighted": 0.670291,
+             "recall": 0.637493,
+             "recall_weighted": 0.637493,
+             "ap": 0.602435,
+             "ap_weighted": 0.602435
+           },
+           {
+             "accuracy": 0.741537,
+             "f1": 0.741521,
+             "f1_weighted": 0.741521,
+             "precision": 0.741601,
+             "precision_weighted": 0.741601,
+             "recall": 0.741537,
+             "recall_weighted": 0.741537,
+             "ap": 0.680069,
+             "ap_weighted": 0.680069
+           },
+           {
+             "accuracy": 0.660072,
+             "f1": 0.659239,
+             "f1_weighted": 0.659239,
+             "precision": 0.661654,
+             "precision_weighted": 0.661654,
+             "recall": 0.660073,
+             "recall_weighted": 0.660072,
+             "ap": 0.608473,
+             "ap_weighted": 0.608473
+           },
+           {
+             "accuracy": 0.65324,
+             "f1": 0.652185,
+             "f1_weighted": 0.652185,
+             "precision": 0.655122,
+             "precision_weighted": 0.655122,
+             "recall": 0.65324,
+             "recall_weighted": 0.65324,
+             "ap": 0.597773,
+             "ap_weighted": 0.597773
+           },
+           {
+             "accuracy": 0.724,
+             "f1": 0.720896,
+             "f1_weighted": 0.720896,
+             "precision": 0.73443,
+             "precision_weighted": 0.73443,
+             "recall": 0.724,
+             "recall_weighted": 0.724,
+             "ap": 0.675589,
+             "ap_weighted": 0.675589
+           },
+           {
+             "accuracy": 0.60639,
+             "f1": 0.60148,
+             "f1_weighted": 0.60148,
+             "precision": 0.611905,
+             "precision_weighted": 0.611905,
+             "recall": 0.60639,
+             "recall_weighted": 0.60639,
+             "ap": 0.562458,
+             "ap_weighted": 0.562458
+           },
+           {
+             "accuracy": 0.689525,
+             "f1": 0.687915,
+             "f1_weighted": 0.687915,
+             "precision": 0.693519,
+             "precision_weighted": 0.693519,
+             "recall": 0.689525,
+             "recall_weighted": 0.689525,
+             "ap": 0.62617,
+             "ap_weighted": 0.62617
+           }
+         ],
+         "accuracy": 0.676259,
+         "f1": 0.672998,
+         "f1_weighted": 0.672998,
+         "precision": 0.682375,
+         "precision_weighted": 0.682375,
+         "recall": 0.676259,
+         "recall_weighted": 0.676259,
+         "ap": 0.623224,
+         "ap_weighted": 0.623224,
+         "main_score": 0.676259,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 4011.0146062374115,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/AmazonReviewsClassification.json ADDED
@@ -0,0 +1,140 @@
+ {
+   "dataset_revision": "6b5d328eaae8ef408dd7d775040245cf86f92e9d",
+   "task_name": "AmazonReviewsClassification",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.365,
+             "f1": 0.349148,
+             "f1_weighted": 0.349148,
+             "precision": 0.359526,
+             "precision_weighted": 0.359526,
+             "recall": 0.365,
+             "recall_weighted": 0.365,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3698,
+             "f1": 0.351267,
+             "f1_weighted": 0.351267,
+             "precision": 0.352476,
+             "precision_weighted": 0.352476,
+             "recall": 0.3698,
+             "recall_weighted": 0.3698,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3704,
+             "f1": 0.363844,
+             "f1_weighted": 0.363844,
+             "precision": 0.362399,
+             "precision_weighted": 0.362399,
+             "recall": 0.3704,
+             "recall_weighted": 0.3704,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3466,
+             "f1": 0.343156,
+             "f1_weighted": 0.343156,
+             "precision": 0.341429,
+             "precision_weighted": 0.341429,
+             "recall": 0.3466,
+             "recall_weighted": 0.3466,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3762,
+             "f1": 0.369796,
+             "f1_weighted": 0.369796,
+             "precision": 0.387146,
+             "precision_weighted": 0.387146,
+             "recall": 0.3762,
+             "recall_weighted": 0.3762,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3362,
+             "f1": 0.326752,
+             "f1_weighted": 0.326752,
+             "precision": 0.331569,
+             "precision_weighted": 0.331569,
+             "recall": 0.3362,
+             "recall_weighted": 0.3362,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.2954,
+             "f1": 0.291593,
+             "f1_weighted": 0.291593,
+             "precision": 0.290708,
+             "precision_weighted": 0.290708,
+             "recall": 0.2954,
+             "recall_weighted": 0.2954,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.3608,
+             "f1": 0.358988,
+             "f1_weighted": 0.358988,
+             "precision": 0.361238,
+             "precision_weighted": 0.361238,
+             "recall": 0.3608,
+             "recall_weighted": 0.3608,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.352,
+             "f1": 0.341936,
+             "f1_weighted": 0.341936,
101
+ "precision": 0.357493,
102
+ "precision_weighted": 0.357493,
103
+ "recall": 0.352,
104
+ "recall_weighted": 0.352,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3502,
110
+ "f1": 0.342521,
111
+ "f1_weighted": 0.342521,
112
+ "precision": 0.351133,
113
+ "precision_weighted": 0.351133,
114
+ "recall": 0.3502,
115
+ "recall_weighted": 0.3502,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.35226,
121
+ "f1": 0.3439,
122
+ "f1_weighted": 0.3439,
123
+ "precision": 0.349512,
124
+ "precision_weighted": 0.349512,
125
+ "recall": 0.35226,
126
+ "recall_weighted": 0.35226,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.35226,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 69.37177419662476,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/ArXivHierarchicalClusteringP2P.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "dataset_revision": "0bbdb47bcbe3a90093699aefeed338a0f28a7ee8",
+   "task_name": "ArXivHierarchicalClusteringP2P",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "v_measures": {
+           "Level 0": [
+             0.530497,
+             0.533721,
+             0.488791,
+             0.510777,
+             0.492772,
+             0.498383,
+             0.522707,
+             0.526601,
+             0.510455,
+             0.551458
+           ],
+           "Level 1": [
+             0.554812,
+             0.593484,
+             0.565365,
+             0.591058,
+             0.596269,
+             0.571074,
+             0.581801,
+             0.592718,
+             0.595931,
+             0.602276
+           ]
+         },
+         "v_measure": 0.550548,
+         "v_measure_std": 0.037962,
+         "main_score": 0.550548,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 2.146183967590332,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/ArXivHierarchicalClusteringS2S.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "dataset_revision": "b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3",
+   "task_name": "ArXivHierarchicalClusteringS2S",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "v_measures": {
+           "Level 0": [
+             0.457681,
+             0.451223,
+             0.463337,
+             0.412571,
+             0.417498,
+             0.433274,
+             0.440556,
+             0.457856,
+             0.447197,
+             0.432635
+           ],
+           "Level 1": [
+             0.534436,
+             0.531277,
+             0.553902,
+             0.602184,
+             0.585077,
+             0.535768,
+             0.588285,
+             0.590154,
+             0.570984,
+             0.566735
+           ]
+         },
+         "v_measure": 0.503631,
+         "v_measure_std": 0.065663,
+         "main_score": 0.503631,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 2.2057979106903076,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/ArguAna.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c22ab2a51041ffd869aaddef7af8d8215647e41a",
3
+ "task_name": "ArguAna",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.19203,
9
+ "ndcg_at_3": 0.31055,
10
+ "ndcg_at_5": 0.36099,
11
+ "ndcg_at_10": 0.41938,
12
+ "ndcg_at_20": 0.4518,
13
+ "ndcg_at_100": 0.47563,
14
+ "ndcg_at_1000": 0.48106,
15
+ "map_at_1": 0.19203,
16
+ "map_at_3": 0.28058,
17
+ "map_at_5": 0.30878,
18
+ "map_at_10": 0.33313,
19
+ "map_at_20": 0.34224,
20
+ "map_at_100": 0.34566,
21
+ "map_at_1000": 0.34587,
22
+ "recall_at_1": 0.19203,
23
+ "recall_at_3": 0.39758,
24
+ "recall_at_5": 0.5192,
25
+ "recall_at_10": 0.69844,
26
+ "recall_at_20": 0.82504,
27
+ "recall_at_100": 0.95164,
28
+ "recall_at_1000": 0.9936,
29
+ "accuracy": 0.19203,
30
+ "precision_at_1": 0.19203,
31
+ "precision_at_3": 0.13253,
32
+ "precision_at_5": 0.10384,
33
+ "precision_at_10": 0.06984,
34
+ "precision_at_20": 0.04125,
35
+ "precision_at_100": 0.00952,
36
+ "precision_at_1000": 0.00099,
37
+ "mrr_at_1": 0.200569,
38
+ "mrr_at_3": 0.283073,
39
+ "mrr_at_5": 0.311984,
40
+ "mrr_at_10": 0.336188,
41
+ "mrr_at_20": 0.345299,
42
+ "mrr_at_100": 0.348723,
43
+ "mrr_at_1000": 0.348937,
44
+ "nauc_ndcg_at_1_max": -0.033963,
45
+ "nauc_ndcg_at_1_std": -0.078887,
46
+ "nauc_ndcg_at_1_diff1": 0.137136,
47
+ "nauc_ndcg_at_3_max": -0.001158,
48
+ "nauc_ndcg_at_3_std": -0.046323,
49
+ "nauc_ndcg_at_3_diff1": 0.090432,
50
+ "nauc_ndcg_at_5_max": 0.012382,
51
+ "nauc_ndcg_at_5_std": -0.052262,
52
+ "nauc_ndcg_at_5_diff1": 0.090119,
53
+ "nauc_ndcg_at_10_max": 0.017533,
54
+ "nauc_ndcg_at_10_std": -0.039259,
55
+ "nauc_ndcg_at_10_diff1": 0.085411,
56
+ "nauc_ndcg_at_20_max": 0.030605,
57
+ "nauc_ndcg_at_20_std": -0.02752,
58
+ "nauc_ndcg_at_20_diff1": 0.097365,
59
+ "nauc_ndcg_at_100_max": 0.022808,
60
+ "nauc_ndcg_at_100_std": -0.025696,
61
+ "nauc_ndcg_at_100_diff1": 0.095737,
62
+ "nauc_ndcg_at_1000_max": 0.015301,
63
+ "nauc_ndcg_at_1000_std": -0.036379,
64
+ "nauc_ndcg_at_1000_diff1": 0.095835,
65
+ "nauc_map_at_1_max": -0.033963,
66
+ "nauc_map_at_1_std": -0.078887,
67
+ "nauc_map_at_1_diff1": 0.137136,
68
+ "nauc_map_at_3_max": -0.009863,
69
+ "nauc_map_at_3_std": -0.053146,
70
+ "nauc_map_at_3_diff1": 0.09841,
71
+ "nauc_map_at_5_max": -0.002054,
72
+ "nauc_map_at_5_std": -0.056483,
73
+ "nauc_map_at_5_diff1": 0.098083,
74
+ "nauc_map_at_10_max": -0.001443,
75
+ "nauc_map_at_10_std": -0.051552,
76
+ "nauc_map_at_10_diff1": 0.096516,
77
+ "nauc_map_at_20_max": 0.001471,
78
+ "nauc_map_at_20_std": -0.049334,
79
+ "nauc_map_at_20_diff1": 0.099675,
80
+ "nauc_map_at_100_max": 0.000836,
81
+ "nauc_map_at_100_std": -0.048746,
82
+ "nauc_map_at_100_diff1": 0.099654,
83
+ "nauc_map_at_1000_max": 0.00064,
84
+ "nauc_map_at_1000_std": -0.049046,
85
+ "nauc_map_at_1000_diff1": 0.099637,
86
+ "nauc_recall_at_1_max": -0.033963,
87
+ "nauc_recall_at_1_std": -0.078887,
88
+ "nauc_recall_at_1_diff1": 0.137136,
89
+ "nauc_recall_at_3_max": 0.02158,
90
+ "nauc_recall_at_3_std": -0.029026,
91
+ "nauc_recall_at_3_diff1": 0.070872,
92
+ "nauc_recall_at_5_max": 0.052827,
93
+ "nauc_recall_at_5_std": -0.042115,
94
+ "nauc_recall_at_5_diff1": 0.070059,
95
+ "nauc_recall_at_10_max": 0.08948,
96
+ "nauc_recall_at_10_std": 0.007646,
97
+ "nauc_recall_at_10_diff1": 0.046799,
98
+ "nauc_recall_at_20_max": 0.214884,
99
+ "nauc_recall_at_20_std": 0.116807,
100
+ "nauc_recall_at_20_diff1": 0.103556,
101
+ "nauc_recall_at_100_max": 0.390406,
102
+ "nauc_recall_at_100_std": 0.460203,
103
+ "nauc_recall_at_100_diff1": 0.059268,
104
+ "nauc_recall_at_1000_max": 0.447529,
105
+ "nauc_recall_at_1000_std": 0.356611,
106
+ "nauc_recall_at_1000_diff1": -0.123719,
107
+ "nauc_precision_at_1_max": -0.033963,
108
+ "nauc_precision_at_1_std": -0.078887,
109
+ "nauc_precision_at_1_diff1": 0.137136,
110
+ "nauc_precision_at_3_max": 0.02158,
111
+ "nauc_precision_at_3_std": -0.029026,
112
+ "nauc_precision_at_3_diff1": 0.070872,
113
+ "nauc_precision_at_5_max": 0.052827,
114
+ "nauc_precision_at_5_std": -0.042115,
115
+ "nauc_precision_at_5_diff1": 0.070059,
116
+ "nauc_precision_at_10_max": 0.08948,
117
+ "nauc_precision_at_10_std": 0.007646,
118
+ "nauc_precision_at_10_diff1": 0.046799,
119
+ "nauc_precision_at_20_max": 0.214884,
120
+ "nauc_precision_at_20_std": 0.116807,
121
+ "nauc_precision_at_20_diff1": 0.103556,
122
+ "nauc_precision_at_100_max": 0.390406,
123
+ "nauc_precision_at_100_std": 0.460203,
124
+ "nauc_precision_at_100_diff1": 0.059268,
125
+ "nauc_precision_at_1000_max": 0.447529,
126
+ "nauc_precision_at_1000_std": 0.356611,
127
+ "nauc_precision_at_1000_diff1": -0.123719,
128
+ "nauc_mrr_at_1_max": -0.028703,
129
+ "nauc_mrr_at_1_std": -0.073285,
130
+ "nauc_mrr_at_1_diff1": 0.103193,
131
+ "nauc_mrr_at_3_max": -0.01921,
132
+ "nauc_mrr_at_3_std": -0.051538,
133
+ "nauc_mrr_at_3_diff1": 0.074044,
134
+ "nauc_mrr_at_5_max": -0.009936,
135
+ "nauc_mrr_at_5_std": -0.054795,
136
+ "nauc_mrr_at_5_diff1": 0.075035,
137
+ "nauc_mrr_at_10_max": -0.009061,
138
+ "nauc_mrr_at_10_std": -0.050279,
139
+ "nauc_mrr_at_10_diff1": 0.07313,
140
+ "nauc_mrr_at_20_max": -0.006401,
141
+ "nauc_mrr_at_20_std": -0.048066,
142
+ "nauc_mrr_at_20_diff1": 0.075701,
143
+ "nauc_mrr_at_100_max": -0.007115,
144
+ "nauc_mrr_at_100_std": -0.047474,
145
+ "nauc_mrr_at_100_diff1": 0.075425,
146
+ "nauc_mrr_at_1000_max": -0.007314,
147
+ "nauc_mrr_at_1000_std": -0.047768,
148
+ "nauc_mrr_at_1000_diff1": 0.07539,
149
+ "hit_rate_at_1": 0.19203,
150
+ "hit_rate_at_3": 0.39758,
151
+ "hit_rate_at_5": 0.5192,
152
+ "hit_rate_at_10": 0.69844,
153
+ "hit_rate_at_20": 0.82504,
154
+ "hit_rate_at_100": 0.95164,
155
+ "hit_rate_at_1000": 0.9936,
156
+ "main_score": 0.41938,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 17.12871503829956,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/AskUbuntuDupQuestions.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5691e3c48741d5f83b5cc8e630653d7a8cfc048",
3
+ "task_name": "AskUbuntuDupQuestions",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.54017,
9
+ "ndcg_at_3": 0.51945,
10
+ "ndcg_at_5": 0.5399,
11
+ "ndcg_at_10": 0.60638,
12
+ "ndcg_at_20": 0.72633,
13
+ "ndcg_at_100": 0.72633,
14
+ "ndcg_at_1000": 0.72633,
15
+ "map_at_1": 0.14839,
16
+ "map_at_3": 0.26743,
17
+ "map_at_5": 0.34239,
18
+ "map_at_10": 0.44889,
19
+ "map_at_20": 0.55938,
20
+ "map_at_100": 0.55938,
21
+ "map_at_1000": 0.55938,
22
+ "recall_at_1": 0.14839,
23
+ "recall_at_3": 0.3105,
24
+ "recall_at_5": 0.45348,
25
+ "recall_at_10": 0.69196,
26
+ "recall_at_20": 1.0,
27
+ "recall_at_100": 1.0,
28
+ "recall_at_1000": 1.0,
29
+ "accuracy": 0.14839,
30
+ "precision_at_1": 0.54017,
31
+ "precision_at_3": 0.45891,
32
+ "precision_at_5": 0.42161,
33
+ "precision_at_10": 0.35208,
34
+ "precision_at_20": 0.27355,
35
+ "precision_at_100": 0.05471,
36
+ "precision_at_1000": 0.00547,
37
+ "mrr_at_1": 0.540166,
38
+ "mrr_at_3": 0.638042,
39
+ "mrr_at_5": 0.663666,
40
+ "mrr_at_10": 0.675241,
41
+ "mrr_at_20": 0.678552,
42
+ "mrr_at_100": 0.678552,
43
+ "mrr_at_1000": 0.678552,
44
+ "nauc_ndcg_at_1_max": 0.187969,
45
+ "nauc_ndcg_at_1_std": 0.098684,
46
+ "nauc_ndcg_at_1_diff1": 0.140114,
47
+ "nauc_ndcg_at_3_max": 0.169333,
48
+ "nauc_ndcg_at_3_std": 0.078815,
49
+ "nauc_ndcg_at_3_diff1": 0.101676,
50
+ "nauc_ndcg_at_5_max": 0.105826,
51
+ "nauc_ndcg_at_5_std": 0.10383,
52
+ "nauc_ndcg_at_5_diff1": 0.093132,
53
+ "nauc_ndcg_at_10_max": 0.104036,
54
+ "nauc_ndcg_at_10_std": 0.11952,
55
+ "nauc_ndcg_at_10_diff1": 0.065996,
56
+ "nauc_ndcg_at_20_max": 0.182239,
57
+ "nauc_ndcg_at_20_std": 0.075949,
58
+ "nauc_ndcg_at_20_diff1": 0.109653,
59
+ "nauc_ndcg_at_100_max": 0.182239,
60
+ "nauc_ndcg_at_100_std": 0.075949,
61
+ "nauc_ndcg_at_100_diff1": 0.109653,
62
+ "nauc_ndcg_at_1000_max": 0.182239,
63
+ "nauc_ndcg_at_1000_std": 0.075949,
64
+ "nauc_ndcg_at_1000_diff1": 0.109653,
65
+ "nauc_map_at_1_max": 0.029568,
66
+ "nauc_map_at_1_std": 0.027308,
67
+ "nauc_map_at_1_diff1": 0.194186,
68
+ "nauc_map_at_3_max": 0.039012,
69
+ "nauc_map_at_3_std": 0.094197,
70
+ "nauc_map_at_3_diff1": 0.154304,
71
+ "nauc_map_at_5_max": 0.034493,
72
+ "nauc_map_at_5_std": 0.129768,
73
+ "nauc_map_at_5_diff1": 0.131847,
74
+ "nauc_map_at_10_max": 0.077837,
75
+ "nauc_map_at_10_std": 0.134118,
76
+ "nauc_map_at_10_diff1": 0.084892,
77
+ "nauc_map_at_20_max": 0.146276,
78
+ "nauc_map_at_20_std": 0.092598,
79
+ "nauc_map_at_20_diff1": 0.093169,
80
+ "nauc_map_at_100_max": 0.146276,
81
+ "nauc_map_at_100_std": 0.092598,
82
+ "nauc_map_at_100_diff1": 0.093169,
83
+ "nauc_map_at_1000_max": 0.146276,
84
+ "nauc_map_at_1000_std": 0.092598,
85
+ "nauc_map_at_1000_diff1": 0.093169,
86
+ "nauc_recall_at_1_max": 0.029568,
87
+ "nauc_recall_at_1_std": 0.027308,
88
+ "nauc_recall_at_1_diff1": 0.194186,
89
+ "nauc_recall_at_3_max": 0.010515,
90
+ "nauc_recall_at_3_std": 0.079916,
91
+ "nauc_recall_at_3_diff1": 0.138109,
92
+ "nauc_recall_at_5_max": -0.064249,
93
+ "nauc_recall_at_5_std": 0.124009,
94
+ "nauc_recall_at_5_diff1": 0.086964,
95
+ "nauc_recall_at_10_max": -0.092403,
96
+ "nauc_recall_at_10_std": 0.137224,
97
+ "nauc_recall_at_10_diff1": -0.040869,
98
+ "nauc_recall_at_20_max": NaN,
99
+ "nauc_recall_at_20_std": NaN,
100
+ "nauc_recall_at_20_diff1": NaN,
101
+ "nauc_recall_at_100_max": NaN,
102
+ "nauc_recall_at_100_std": NaN,
103
+ "nauc_recall_at_100_diff1": NaN,
104
+ "nauc_recall_at_1000_max": NaN,
105
+ "nauc_recall_at_1000_std": NaN,
106
+ "nauc_recall_at_1000_diff1": NaN,
107
+ "nauc_precision_at_1_max": 0.187969,
108
+ "nauc_precision_at_1_std": 0.098684,
109
+ "nauc_precision_at_1_diff1": 0.140114,
110
+ "nauc_precision_at_3_max": 0.183564,
111
+ "nauc_precision_at_3_std": 0.093594,
112
+ "nauc_precision_at_3_diff1": 0.043976,
113
+ "nauc_precision_at_5_max": 0.143817,
114
+ "nauc_precision_at_5_std": 0.101552,
115
+ "nauc_precision_at_5_diff1": -0.007572,
116
+ "nauc_precision_at_10_max": 0.17357,
117
+ "nauc_precision_at_10_std": 0.019679,
118
+ "nauc_precision_at_10_diff1": -0.061011,
119
+ "nauc_precision_at_20_max": 0.184725,
120
+ "nauc_precision_at_20_std": -0.043787,
121
+ "nauc_precision_at_20_diff1": -0.013542,
122
+ "nauc_precision_at_100_max": 0.184725,
123
+ "nauc_precision_at_100_std": -0.043787,
124
+ "nauc_precision_at_100_diff1": -0.013542,
125
+ "nauc_precision_at_1000_max": 0.184725,
126
+ "nauc_precision_at_1000_std": -0.043787,
127
+ "nauc_precision_at_1000_diff1": -0.013542,
128
+ "nauc_mrr_at_1_max": 0.187969,
129
+ "nauc_mrr_at_1_std": 0.098684,
130
+ "nauc_mrr_at_1_diff1": 0.140114,
131
+ "nauc_mrr_at_3_max": 0.196748,
132
+ "nauc_mrr_at_3_std": 0.082906,
133
+ "nauc_mrr_at_3_diff1": 0.147367,
134
+ "nauc_mrr_at_5_max": 0.179894,
135
+ "nauc_mrr_at_5_std": 0.07978,
136
+ "nauc_mrr_at_5_diff1": 0.143359,
137
+ "nauc_mrr_at_10_max": 0.180577,
138
+ "nauc_mrr_at_10_std": 0.074538,
139
+ "nauc_mrr_at_10_diff1": 0.136449,
140
+ "nauc_mrr_at_20_max": 0.18298,
141
+ "nauc_mrr_at_20_std": 0.079361,
142
+ "nauc_mrr_at_20_diff1": 0.140037,
143
+ "nauc_mrr_at_100_max": 0.18298,
144
+ "nauc_mrr_at_100_std": 0.079361,
145
+ "nauc_mrr_at_100_diff1": 0.140037,
146
+ "nauc_mrr_at_1000_max": 0.18298,
147
+ "nauc_mrr_at_1000_std": 0.079361,
148
+ "nauc_mrr_at_1000_diff1": 0.140037,
149
+ "hit_rate_at_1": 0.54017,
150
+ "hit_rate_at_3": 0.759,
151
+ "hit_rate_at_5": 0.86981,
152
+ "hit_rate_at_10": 0.95568,
153
+ "hit_rate_at_20": 1.0,
154
+ "hit_rate_at_100": 1.0,
155
+ "hit_rate_at_1000": 1.0,
156
+ "main_score": 0.55938,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 8.199298858642578,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/BIOSSES.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "dataset_revision": "d3fb88f8f02e40887cd149695127462bbcf29b4a",
+   "task_name": "BIOSSES",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "pearson": 0.818482,
+         "spearman": 0.788464,
+         "cosine_pearson": 0.818482,
+         "cosine_spearman": 0.788464,
+         "manhattan_pearson": 0.801505,
+         "manhattan_spearman": 0.786048,
+         "euclidean_pearson": 0.805944,
+         "euclidean_spearman": 0.788464,
+         "main_score": 0.788464,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 0.6089718341827393,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/Banking77Classification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "0fd18e25b25c072e09e0d92ab615fda904d66300",
3
+ "task_name": "Banking77Classification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.709416,
11
+ "f1": 0.698921,
12
+ "f1_weighted": 0.698921,
13
+ "precision": 0.727839,
14
+ "precision_weighted": 0.727839,
15
+ "recall": 0.709416,
16
+ "recall_weighted": 0.709416,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.706494,
22
+ "f1": 0.693388,
23
+ "f1_weighted": 0.693388,
24
+ "precision": 0.737462,
25
+ "precision_weighted": 0.737462,
26
+ "recall": 0.706494,
27
+ "recall_weighted": 0.706494,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.705195,
33
+ "f1": 0.695146,
34
+ "f1_weighted": 0.695146,
35
+ "precision": 0.737951,
36
+ "precision_weighted": 0.737951,
37
+ "recall": 0.705195,
38
+ "recall_weighted": 0.705195,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.710065,
44
+ "f1": 0.700522,
45
+ "f1_weighted": 0.700522,
46
+ "precision": 0.737201,
47
+ "precision_weighted": 0.737201,
48
+ "recall": 0.710065,
49
+ "recall_weighted": 0.710065,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.697078,
55
+ "f1": 0.684067,
56
+ "f1_weighted": 0.684067,
57
+ "precision": 0.720585,
58
+ "precision_weighted": 0.720585,
59
+ "recall": 0.697078,
60
+ "recall_weighted": 0.697078,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.708766,
66
+ "f1": 0.703068,
67
+ "f1_weighted": 0.703068,
68
+ "precision": 0.735709,
69
+ "precision_weighted": 0.735709,
70
+ "recall": 0.708766,
71
+ "recall_weighted": 0.708766,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.681494,
77
+ "f1": 0.66852,
78
+ "f1_weighted": 0.66852,
79
+ "precision": 0.700757,
80
+ "precision_weighted": 0.700757,
81
+ "recall": 0.681494,
82
+ "recall_weighted": 0.681494,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.691234,
88
+ "f1": 0.680002,
89
+ "f1_weighted": 0.680002,
90
+ "precision": 0.714478,
91
+ "precision_weighted": 0.714478,
92
+ "recall": 0.691234,
93
+ "recall_weighted": 0.691234,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.698377,
99
+ "f1": 0.687639,
100
+ "f1_weighted": 0.687639,
101
+ "precision": 0.72417,
102
+ "precision_weighted": 0.72417,
103
+ "recall": 0.698377,
104
+ "recall_weighted": 0.698377,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.69513,
110
+ "f1": 0.682002,
111
+ "f1_weighted": 0.682002,
112
+ "precision": 0.721036,
113
+ "precision_weighted": 0.721036,
114
+ "recall": 0.69513,
115
+ "recall_weighted": 0.69513,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.700325,
121
+ "f1": 0.689327,
122
+ "f1_weighted": 0.689327,
123
+ "precision": 0.725719,
124
+ "precision_weighted": 0.725719,
125
+ "recall": 0.700325,
126
+ "recall_weighted": 0.700325,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.700325,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 24.914488792419434,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/BiorxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "dataset_revision": "65b79d1d13f80053f67aca9498d9402c2d9f1f40",
+   "task_name": "BiorxivClusteringP2P",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "v_measure": 0.310535,
+         "v_measure_std": 0.010861,
+         "v_measures": [
+           0.306892,
+           0.309547,
+           0.318327,
+           0.282693,
+           0.306354,
+           0.310341,
+           0.30948,
+           0.32279,
+           0.32023,
+           0.318697
+         ],
+         "main_score": 0.310535,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 119.78395438194275,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/BiorxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "dataset_revision": "258694dd0231531bc1fd9de6ceb52a0853c6d908",
+   "task_name": "BiorxivClusteringS2S",
+   "mteb_version": "2.10.12",
+   "scores": {
+     "test": [
+       {
+         "v_measure": 0.202003,
+         "v_measure_std": 0.005492,
+         "v_measures": [
+           0.197235,
+           0.202476,
+           0.199225,
+           0.196073,
+           0.194072,
+           0.204599,
+           0.210904,
+           0.203625,
+           0.20072,
+           0.211096
+         ],
+         "main_score": 0.202003,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 104.13116502761841,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/CQADupstackAndroidRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3",
3
+ "task_name": "CQADupstackAndroidRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.18884,
9
+ "ndcg_at_3": 0.2231,
10
+ "ndcg_at_5": 0.24157,
11
+ "ndcg_at_10": 0.2614,
12
+ "ndcg_at_20": 0.28319,
13
+ "ndcg_at_100": 0.31782,
14
+ "ndcg_at_1000": 0.35087,
15
+ "map_at_1": 0.15194,
16
+ "map_at_3": 0.195,
17
+ "map_at_5": 0.20855,
18
+ "map_at_10": 0.21893,
19
+ "map_at_20": 0.22541,
20
+ "map_at_100": 0.23093,
21
+ "map_at_1000": 0.23231,
22
+ "recall_at_1": 0.15194,
23
+ "recall_at_3": 0.24004,
24
+ "recall_at_5": 0.29012,
25
+ "recall_at_10": 0.34836,
26
+ "recall_at_20": 0.42972,
27
+ "recall_at_100": 0.59592,
28
+ "recall_at_1000": 0.82776,
29
+ "accuracy": 0.15194,
30
+ "precision_at_1": 0.18884,
31
+ "precision_at_3": 0.10968,
32
+ "precision_at_5": 0.08183,
33
+ "precision_at_10": 0.05122,
34
+ "precision_at_20": 0.03262,
35
+ "precision_at_100": 0.00983,
36
+ "precision_at_1000": 0.00156,
37
+ "mrr_at_1": 0.188841,
38
+ "mrr_at_3": 0.235813,
39
+ "mrr_at_5": 0.247902,
40
+ "mrr_at_10": 0.256707,
41
+ "mrr_at_20": 0.263373,
42
+ "mrr_at_100": 0.267998,
43
+ "mrr_at_1000": 0.26881,
44
+ "nauc_ndcg_at_1_max": 0.197606,
45
+ "nauc_ndcg_at_1_std": -0.019884,
46
+ "nauc_ndcg_at_1_diff1": 0.458478,
47
+ "nauc_ndcg_at_3_max": 0.178271,
48
+ "nauc_ndcg_at_3_std": 0.003251,
49
+ "nauc_ndcg_at_3_diff1": 0.398115,
50
+ "nauc_ndcg_at_5_max": 0.175196,
51
+ "nauc_ndcg_at_5_std": 0.005964,
52
+ "nauc_ndcg_at_5_diff1": 0.385679,
53
+ "nauc_ndcg_at_10_max": 0.175056,
54
+ "nauc_ndcg_at_10_std": 0.010168,
55
+ "nauc_ndcg_at_10_diff1": 0.382267,
56
+ "nauc_ndcg_at_20_max": 0.179689,
57
+ "nauc_ndcg_at_20_std": 0.006895,
58
+ "nauc_ndcg_at_20_diff1": 0.376813,
59
+ "nauc_ndcg_at_100_max": 0.191768,
60
+ "nauc_ndcg_at_100_std": 0.03296,
61
+ "nauc_ndcg_at_100_diff1": 0.375628,
62
+ "nauc_ndcg_at_1000_max": 0.199238,
63
+ "nauc_ndcg_at_1000_std": 0.045336,
64
+ "nauc_ndcg_at_1000_diff1": 0.380325,
65
+ "nauc_map_at_1_max": 0.197751,
66
+ "nauc_map_at_1_std": 0.006797,
67
+ "nauc_map_at_1_diff1": 0.494829,
68
+ "nauc_map_at_3_max": 0.192164,
69
+ "nauc_map_at_3_std": 0.005593,
70
+ "nauc_map_at_3_diff1": 0.435257,
71
+ "nauc_map_at_5_max": 0.188238,
72
+ "nauc_map_at_5_std": 0.007005,
73
+ "nauc_map_at_5_diff1": 0.420792,
74
+ "nauc_map_at_10_max": 0.187103,
75
+ "nauc_map_at_10_std": 0.009857,
76
+ "nauc_map_at_10_diff1": 0.416412,
77
+ "nauc_map_at_20_max": 0.189281,
78
+ "nauc_map_at_20_std": 0.009555,
79
+ "nauc_map_at_20_diff1": 0.414144,
80
+ "nauc_map_at_100_max": 0.191584,
81
+ "nauc_map_at_100_std": 0.014557,
82
+ "nauc_map_at_100_diff1": 0.414225,
83
+ "nauc_map_at_1000_max": 0.191893,
84
+ "nauc_map_at_1000_std": 0.014986,
85
+ "nauc_map_at_1000_diff1": 0.414433,
86
+ "nauc_recall_at_1_max": 0.197751,
87
+ "nauc_recall_at_1_std": 0.006797,
88
+ "nauc_recall_at_1_diff1": 0.494829,
89
+ "nauc_recall_at_3_max": 0.163213,
90
+ "nauc_recall_at_3_std": 0.014729,
91
+ "nauc_recall_at_3_diff1": 0.366635,
92
+ "nauc_recall_at_5_max": 0.146997,
93
+ "nauc_recall_at_5_std": 0.018867,
94
+ "nauc_recall_at_5_diff1": 0.322006,
95
+ "nauc_recall_at_10_max": 0.143418,
96
+ "nauc_recall_at_10_std": 0.027174,
97
+ "nauc_recall_at_10_diff1": 0.306785,
98
+ "nauc_recall_at_20_max": 0.15676,
99
+ "nauc_recall_at_20_std": 0.011467,
100
+ "nauc_recall_at_20_diff1": 0.281034,
101
+ "nauc_recall_at_100_max": 0.203891,
102
+ "nauc_recall_at_100_std": 0.135535,
103
+ "nauc_recall_at_100_diff1": 0.247326,
104
+ "nauc_recall_at_1000_max": 0.308993,
105
+ "nauc_recall_at_1000_std": 0.415926,
106
+ "nauc_recall_at_1000_diff1": 0.218363,
107
+ "nauc_precision_at_1_max": 0.197606,
108
+ "nauc_precision_at_1_std": -0.019884,
109
+ "nauc_precision_at_1_diff1": 0.458478,
110
+ "nauc_precision_at_3_max": 0.178951,
111
+ "nauc_precision_at_3_std": -0.003178,
112
+ "nauc_precision_at_3_diff1": 0.306824,
113
+ "nauc_precision_at_5_max": 0.166294,
114
+ "nauc_precision_at_5_std": -0.005949,
115
+ "nauc_precision_at_5_diff1": 0.253107,
116
+ "nauc_precision_at_10_max": 0.14608,
117
+ "nauc_precision_at_10_std": -0.008317,
118
+ "nauc_precision_at_10_diff1": 0.209245,
119
+ "nauc_precision_at_20_max": 0.130098,
120
+ "nauc_precision_at_20_std": -0.025533,
121
+ "nauc_precision_at_20_diff1": 0.166091,
122
+ "nauc_precision_at_100_max": 0.073078,
123
+ "nauc_precision_at_100_std": 0.016469,
124
+ "nauc_precision_at_100_diff1": 0.089054,
125
+ "nauc_precision_at_1000_max": -0.027536,
126
+ "nauc_precision_at_1000_std": 0.030826,
127
+ "nauc_precision_at_1000_diff1": -0.053711,
128
+ "nauc_mrr_at_1_max": 0.197606,
129
+ "nauc_mrr_at_1_std": -0.019884,
130
+ "nauc_mrr_at_1_diff1": 0.458478,
131
+ "nauc_mrr_at_3_max": 0.181138,
132
+ "nauc_mrr_at_3_std": -0.007979,
133
+ "nauc_mrr_at_3_diff1": 0.401962,
134
+ "nauc_mrr_at_5_max": 0.178232,
135
+ "nauc_mrr_at_5_std": -0.009104,
136
+ "nauc_mrr_at_5_diff1": 0.390387,
137
+ "nauc_mrr_at_10_max": 0.177762,
138
+ "nauc_mrr_at_10_std": -0.006743,
139
+ "nauc_mrr_at_10_diff1": 0.388696,
140
+ "nauc_mrr_at_20_max": 0.176751,
141
+ "nauc_mrr_at_20_std": -0.008116,
142
+ "nauc_mrr_at_20_diff1": 0.38646,
143
+ "nauc_mrr_at_100_max": 0.17894,
144
+ "nauc_mrr_at_100_std": -0.005141,
145
+ "nauc_mrr_at_100_diff1": 0.386043,
146
+ "nauc_mrr_at_1000_max": 0.179046,
147
+ "nauc_mrr_at_1000_std": -0.004902,
148
+ "nauc_mrr_at_1000_diff1": 0.386362,
149
+ "hit_rate_at_1": 0.18884,
150
+ "hit_rate_at_3": 0.29614,
151
+ "hit_rate_at_5": 0.34907,
152
+ "hit_rate_at_10": 0.41202,
153
+ "hit_rate_at_20": 0.5093,
154
+ "hit_rate_at_100": 0.67525,
155
+ "hit_rate_at_1000": 0.88698,
156
+ "main_score": 0.2614,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 29.754338264465332,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackEnglishRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ad9991cb51e31e31e430383c75ffb2885547b5f0",
3
+ "task_name": "CQADupstackEnglishRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.15541,
9
+ "ndcg_at_3": 0.17117,
10
+ "ndcg_at_5": 0.184,
11
+ "ndcg_at_10": 0.19824,
12
+ "ndcg_at_20": 0.2098,
13
+ "ndcg_at_100": 0.23209,
14
+ "ndcg_at_1000": 0.2615,
15
+ "map_at_1": 0.1213,
16
+ "map_at_3": 0.14994,
17
+ "map_at_5": 0.15878,
18
+ "map_at_10": 0.16559,
19
+ "map_at_20": 0.16941,
20
+ "map_at_100": 0.17291,
21
+ "map_at_1000": 0.17405,
22
+ "recall_at_1": 0.1213,
23
+ "recall_at_3": 0.181,
24
+ "recall_at_5": 0.21512,
25
+ "recall_at_10": 0.2589,
26
+ "recall_at_20": 0.30168,
27
+ "recall_at_100": 0.40954,
28
+ "recall_at_1000": 0.61684,
29
+ "accuracy": 0.1213,
30
+ "precision_at_1": 0.15541,
31
+ "precision_at_3": 0.08259,
32
+ "precision_at_5": 0.06038,
33
+ "precision_at_10": 0.03822,
34
+ "precision_at_20": 0.02328,
35
+ "precision_at_100": 0.00694,
36
+ "precision_at_1000": 0.00119,
37
+ "mrr_at_1": 0.155414,
38
+ "mrr_at_3": 0.18673,
39
+ "mrr_at_5": 0.195902,
40
+ "mrr_at_10": 0.202599,
41
+ "mrr_at_20": 0.206189,
42
+ "mrr_at_100": 0.208986,
43
+ "mrr_at_1000": 0.209771,
44
+ "nauc_ndcg_at_1_max": 0.195123,
45
+ "nauc_ndcg_at_1_std": 0.038382,
46
+ "nauc_ndcg_at_1_diff1": 0.445102,
47
+ "nauc_ndcg_at_3_max": 0.166586,
48
+ "nauc_ndcg_at_3_std": 0.044121,
49
+ "nauc_ndcg_at_3_diff1": 0.406788,
50
+ "nauc_ndcg_at_5_max": 0.159946,
51
+ "nauc_ndcg_at_5_std": 0.049405,
52
+ "nauc_ndcg_at_5_diff1": 0.39172,
53
+ "nauc_ndcg_at_10_max": 0.152806,
54
+ "nauc_ndcg_at_10_std": 0.063562,
55
+ "nauc_ndcg_at_10_diff1": 0.377718,
56
+ "nauc_ndcg_at_20_max": 0.148911,
57
+ "nauc_ndcg_at_20_std": 0.074106,
58
+ "nauc_ndcg_at_20_diff1": 0.367546,
59
+ "nauc_ndcg_at_100_max": 0.153238,
60
+ "nauc_ndcg_at_100_std": 0.090583,
61
+ "nauc_ndcg_at_100_diff1": 0.366295,
62
+ "nauc_ndcg_at_1000_max": 0.157625,
63
+ "nauc_ndcg_at_1000_std": 0.102018,
64
+ "nauc_ndcg_at_1000_diff1": 0.364599,
65
+ "nauc_map_at_1_max": 0.187378,
66
+ "nauc_map_at_1_std": 0.026564,
67
+ "nauc_map_at_1_diff1": 0.468802,
68
+ "nauc_map_at_3_max": 0.172059,
69
+ "nauc_map_at_3_std": 0.033254,
70
+ "nauc_map_at_3_diff1": 0.427237,
71
+ "nauc_map_at_5_max": 0.167614,
72
+ "nauc_map_at_5_std": 0.039833,
73
+ "nauc_map_at_5_diff1": 0.415822,
74
+ "nauc_map_at_10_max": 0.165745,
75
+ "nauc_map_at_10_std": 0.049076,
76
+ "nauc_map_at_10_diff1": 0.407507,
77
+ "nauc_map_at_20_max": 0.1652,
78
+ "nauc_map_at_20_std": 0.054117,
79
+ "nauc_map_at_20_diff1": 0.403296,
80
+ "nauc_map_at_100_max": 0.16542,
81
+ "nauc_map_at_100_std": 0.057323,
82
+ "nauc_map_at_100_diff1": 0.402134,
83
+ "nauc_map_at_1000_max": 0.165638,
84
+ "nauc_map_at_1000_std": 0.058157,
85
+ "nauc_map_at_1000_diff1": 0.401782,
86
+ "nauc_recall_at_1_max": 0.187378,
87
+ "nauc_recall_at_1_std": 0.026564,
88
+ "nauc_recall_at_1_diff1": 0.468802,
89
+ "nauc_recall_at_3_max": 0.14534,
90
+ "nauc_recall_at_3_std": 0.037792,
91
+ "nauc_recall_at_3_diff1": 0.390695,
92
+ "nauc_recall_at_5_max": 0.127747,
93
+ "nauc_recall_at_5_std": 0.048975,
94
+ "nauc_recall_at_5_diff1": 0.346998,
95
+ "nauc_recall_at_10_max": 0.112594,
96
+ "nauc_recall_at_10_std": 0.08543,
97
+ "nauc_recall_at_10_diff1": 0.308944,
98
+ "nauc_recall_at_20_max": 0.099463,
99
+ "nauc_recall_at_20_std": 0.112498,
100
+ "nauc_recall_at_20_diff1": 0.275566,
101
+ "nauc_recall_at_100_max": 0.120063,
102
+ "nauc_recall_at_100_std": 0.166315,
103
+ "nauc_recall_at_100_diff1": 0.274569,
104
+ "nauc_recall_at_1000_max": 0.131686,
105
+ "nauc_recall_at_1000_std": 0.227781,
106
+ "nauc_recall_at_1000_diff1": 0.253972,
107
+ "nauc_precision_at_1_max": 0.195123,
108
+ "nauc_precision_at_1_std": 0.038382,
109
+ "nauc_precision_at_1_diff1": 0.445102,
110
+ "nauc_precision_at_3_max": 0.169315,
111
+ "nauc_precision_at_3_std": 0.074688,
112
+ "nauc_precision_at_3_diff1": 0.338806,
113
+ "nauc_precision_at_5_max": 0.158105,
114
+ "nauc_precision_at_5_std": 0.095622,
115
+ "nauc_precision_at_5_diff1": 0.294577,
116
+ "nauc_precision_at_10_max": 0.150364,
117
+ "nauc_precision_at_10_std": 0.132855,
118
+ "nauc_precision_at_10_diff1": 0.243428,
119
+ "nauc_precision_at_20_max": 0.145514,
120
+ "nauc_precision_at_20_std": 0.170178,
121
+ "nauc_precision_at_20_diff1": 0.196139,
122
+ "nauc_precision_at_100_max": 0.132973,
123
+ "nauc_precision_at_100_std": 0.209984,
124
+ "nauc_precision_at_100_diff1": 0.128863,
125
+ "nauc_precision_at_1000_max": 0.127605,
126
+ "nauc_precision_at_1000_std": 0.224395,
127
+ "nauc_precision_at_1000_diff1": 0.025289,
128
+ "nauc_mrr_at_1_max": 0.195123,
129
+ "nauc_mrr_at_1_std": 0.038382,
130
+ "nauc_mrr_at_1_diff1": 0.445102,
131
+ "nauc_mrr_at_3_max": 0.167737,
132
+ "nauc_mrr_at_3_std": 0.045563,
133
+ "nauc_mrr_at_3_diff1": 0.406255,
134
+ "nauc_mrr_at_5_max": 0.165352,
135
+ "nauc_mrr_at_5_std": 0.048412,
136
+ "nauc_mrr_at_5_diff1": 0.398228,
137
+ "nauc_mrr_at_10_max": 0.160695,
138
+ "nauc_mrr_at_10_std": 0.052171,
139
+ "nauc_mrr_at_10_diff1": 0.391532,
140
+ "nauc_mrr_at_20_max": 0.15996,
141
+ "nauc_mrr_at_20_std": 0.055153,
142
+ "nauc_mrr_at_20_diff1": 0.388486,
143
+ "nauc_mrr_at_100_max": 0.160708,
144
+ "nauc_mrr_at_100_std": 0.057743,
145
+ "nauc_mrr_at_100_diff1": 0.38863,
146
+ "nauc_mrr_at_1000_max": 0.160638,
147
+ "nauc_mrr_at_1000_std": 0.057739,
148
+ "nauc_mrr_at_1000_diff1": 0.388626,
149
+ "hit_rate_at_1": 0.15541,
150
+ "hit_rate_at_3": 0.22803,
151
+ "hit_rate_at_5": 0.26815,
152
+ "hit_rate_at_10": 0.31847,
153
+ "hit_rate_at_20": 0.36879,
154
+ "hit_rate_at_100": 0.48408,
155
+ "hit_rate_at_1000": 0.69236,
156
+ "main_score": 0.19824,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 50.95594382286072,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGamingRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4885aa143210c98657558c04aaf3dc47cfb54340",
3
+ "task_name": "CQADupstackGamingRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.26771,
9
+ "ndcg_at_3": 0.31374,
10
+ "ndcg_at_5": 0.33452,
11
+ "ndcg_at_10": 0.35916,
12
+ "ndcg_at_20": 0.37826,
13
+ "ndcg_at_100": 0.40794,
14
+ "ndcg_at_1000": 0.43124,
15
+ "map_at_1": 0.22891,
16
+ "map_at_3": 0.28622,
17
+ "map_at_5": 0.29984,
18
+ "map_at_10": 0.31116,
19
+ "map_at_20": 0.31728,
20
+ "map_at_100": 0.32189,
21
+ "map_at_1000": 0.32279,
22
+ "recall_at_1": 0.22891,
23
+ "recall_at_3": 0.34627,
24
+ "recall_at_5": 0.39909,
25
+ "recall_at_10": 0.47317,
26
+ "recall_at_20": 0.54427,
27
+ "recall_at_100": 0.69127,
28
+ "recall_at_1000": 0.86522,
29
+ "accuracy": 0.22891,
30
+ "precision_at_1": 0.26771,
31
+ "precision_at_3": 0.14211,
32
+ "precision_at_5": 0.09918,
33
+ "precision_at_10": 0.05994,
34
+ "precision_at_20": 0.03539,
35
+ "precision_at_100": 0.00936,
36
+ "precision_at_1000": 0.00121,
37
+ "mrr_at_1": 0.267712,
38
+ "mrr_at_3": 0.320585,
39
+ "mrr_at_5": 0.333062,
40
+ "mrr_at_10": 0.343451,
41
+ "mrr_at_20": 0.348304,
42
+ "mrr_at_100": 0.352006,
43
+ "mrr_at_1000": 0.352627,
44
+ "nauc_ndcg_at_1_max": 0.319948,
45
+ "nauc_ndcg_at_1_std": -0.072456,
46
+ "nauc_ndcg_at_1_diff1": 0.525071,
47
+ "nauc_ndcg_at_3_max": 0.291007,
48
+ "nauc_ndcg_at_3_std": -0.090812,
49
+ "nauc_ndcg_at_3_diff1": 0.476591,
50
+ "nauc_ndcg_at_5_max": 0.287366,
51
+ "nauc_ndcg_at_5_std": -0.080975,
52
+ "nauc_ndcg_at_5_diff1": 0.468406,
53
+ "nauc_ndcg_at_10_max": 0.292958,
54
+ "nauc_ndcg_at_10_std": -0.074516,
55
+ "nauc_ndcg_at_10_diff1": 0.46282,
56
+ "nauc_ndcg_at_20_max": 0.293328,
57
+ "nauc_ndcg_at_20_std": -0.06336,
58
+ "nauc_ndcg_at_20_diff1": 0.456373,
59
+ "nauc_ndcg_at_100_max": 0.304053,
60
+ "nauc_ndcg_at_100_std": -0.04995,
61
+ "nauc_ndcg_at_100_diff1": 0.455003,
62
+ "nauc_ndcg_at_1000_max": 0.309144,
63
+ "nauc_ndcg_at_1000_std": -0.046061,
64
+ "nauc_ndcg_at_1000_diff1": 0.455945,
65
+ "nauc_map_at_1_max": 0.282218,
66
+ "nauc_map_at_1_std": -0.096136,
67
+ "nauc_map_at_1_diff1": 0.526218,
68
+ "nauc_map_at_3_max": 0.284393,
69
+ "nauc_map_at_3_std": -0.100838,
70
+ "nauc_map_at_3_diff1": 0.489897,
71
+ "nauc_map_at_5_max": 0.283897,
72
+ "nauc_map_at_5_std": -0.092852,
73
+ "nauc_map_at_5_diff1": 0.484307,
74
+ "nauc_map_at_10_max": 0.288442,
75
+ "nauc_map_at_10_std": -0.087498,
76
+ "nauc_map_at_10_diff1": 0.481224,
77
+ "nauc_map_at_20_max": 0.289385,
78
+ "nauc_map_at_20_std": -0.083784,
79
+ "nauc_map_at_20_diff1": 0.479195,
80
+ "nauc_map_at_100_max": 0.291723,
81
+ "nauc_map_at_100_std": -0.081364,
82
+ "nauc_map_at_100_diff1": 0.479064,
83
+ "nauc_map_at_1000_max": 0.292141,
84
+ "nauc_map_at_1000_std": -0.080995,
85
+ "nauc_map_at_1000_diff1": 0.479006,
86
+ "nauc_recall_at_1_max": 0.282218,
87
+ "nauc_recall_at_1_std": -0.096136,
88
+ "nauc_recall_at_1_diff1": 0.526218,
89
+ "nauc_recall_at_3_max": 0.261137,
90
+ "nauc_recall_at_3_std": -0.10829,
91
+ "nauc_recall_at_3_diff1": 0.443618,
92
+ "nauc_recall_at_5_max": 0.253186,
93
+ "nauc_recall_at_5_std": -0.079944,
94
+ "nauc_recall_at_5_diff1": 0.416745,
95
+ "nauc_recall_at_10_max": 0.268213,
96
+ "nauc_recall_at_10_std": -0.060301,
97
+ "nauc_recall_at_10_diff1": 0.400226,
98
+ "nauc_recall_at_20_max": 0.269083,
99
+ "nauc_recall_at_20_std": -0.016869,
100
+ "nauc_recall_at_20_diff1": 0.373554,
101
+ "nauc_recall_at_100_max": 0.311129,
102
+ "nauc_recall_at_100_std": 0.067892,
103
+ "nauc_recall_at_100_diff1": 0.335425,
104
+ "nauc_recall_at_1000_max": 0.377506,
105
+ "nauc_recall_at_1000_std": 0.218338,
106
+ "nauc_recall_at_1000_diff1": 0.268279,
107
+ "nauc_precision_at_1_max": 0.319948,
108
+ "nauc_precision_at_1_std": -0.072456,
109
+ "nauc_precision_at_1_diff1": 0.525071,
110
+ "nauc_precision_at_3_max": 0.300187,
111
+ "nauc_precision_at_3_std": -0.057119,
112
+ "nauc_precision_at_3_diff1": 0.392438,
113
+ "nauc_precision_at_5_max": 0.290237,
114
+ "nauc_precision_at_5_std": -0.016718,
115
+ "nauc_precision_at_5_diff1": 0.357827,
116
+ "nauc_precision_at_10_max": 0.305965,
117
+ "nauc_precision_at_10_std": 0.015331,
118
+ "nauc_precision_at_10_diff1": 0.309473,
119
+ "nauc_precision_at_20_max": 0.305504,
120
+ "nauc_precision_at_20_std": 0.073078,
121
+ "nauc_precision_at_20_diff1": 0.244688,
122
+ "nauc_precision_at_100_max": 0.312301,
123
+ "nauc_precision_at_100_std": 0.159407,
124
+ "nauc_precision_at_100_diff1": 0.143996,
125
+ "nauc_precision_at_1000_max": 0.30517,
126
+ "nauc_precision_at_1000_std": 0.214828,
127
+ "nauc_precision_at_1000_diff1": 0.02925,
128
+ "nauc_mrr_at_1_max": 0.319948,
129
+ "nauc_mrr_at_1_std": -0.072456,
130
+ "nauc_mrr_at_1_diff1": 0.525071,
131
+ "nauc_mrr_at_3_max": 0.30788,
132
+ "nauc_mrr_at_3_std": -0.071505,
133
+ "nauc_mrr_at_3_diff1": 0.480736,
134
+ "nauc_mrr_at_5_max": 0.306595,
135
+ "nauc_mrr_at_5_std": -0.065682,
136
+ "nauc_mrr_at_5_diff1": 0.476921,
137
+ "nauc_mrr_at_10_max": 0.307978,
138
+ "nauc_mrr_at_10_std": -0.063976,
139
+ "nauc_mrr_at_10_diff1": 0.474644,
140
+ "nauc_mrr_at_20_max": 0.308056,
141
+ "nauc_mrr_at_20_std": -0.061668,
142
+ "nauc_mrr_at_20_diff1": 0.473649,
143
+ "nauc_mrr_at_100_max": 0.309712,
144
+ "nauc_mrr_at_100_std": -0.060248,
145
+ "nauc_mrr_at_100_diff1": 0.474014,
146
+ "nauc_mrr_at_1000_max": 0.309686,
147
+ "nauc_mrr_at_1000_std": -0.060217,
148
+ "nauc_mrr_at_1000_diff1": 0.473992,
149
+ "hit_rate_at_1": 0.26771,
150
+ "hit_rate_at_3": 0.38871,
151
+ "hit_rate_at_5": 0.44263,
152
+ "hit_rate_at_10": 0.521,
153
+ "hit_rate_at_20": 0.5906,
154
+ "hit_rate_at_100": 0.73292,
155
+ "hit_rate_at_1000": 0.89028,
156
+ "main_score": 0.35916,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 57.79883646965027,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGisRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "5003b3064772da1887988e05400cf3806fe491f2",
3
+ "task_name": "CQADupstackGisRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.14802,
9
+ "ndcg_at_3": 0.18404,
10
+ "ndcg_at_5": 0.19539,
11
+ "ndcg_at_10": 0.21297,
12
+ "ndcg_at_20": 0.22774,
13
+ "ndcg_at_100": 0.25841,
14
+ "ndcg_at_1000": 0.28962,
15
+ "map_at_1": 0.13701,
16
+ "map_at_3": 0.16959,
17
+ "map_at_5": 0.17601,
18
+ "map_at_10": 0.18335,
19
+ "map_at_20": 0.18737,
20
+ "map_at_100": 0.19143,
21
+ "map_at_1000": 0.19245,
22
+ "recall_at_1": 0.13701,
23
+ "recall_at_3": 0.21143,
24
+ "recall_at_5": 0.23855,
25
+ "recall_at_10": 0.29166,
26
+ "recall_at_20": 0.34871,
27
+ "recall_at_100": 0.51082,
28
+ "recall_at_1000": 0.75759,
29
+ "accuracy": 0.13701,
30
+ "precision_at_1": 0.14802,
31
+ "precision_at_3": 0.07834,
32
+ "precision_at_5": 0.05356,
33
+ "precision_at_10": 0.03311,
34
+ "precision_at_20": 0.02,
35
+ "precision_at_100": 0.00599,
36
+ "precision_at_1000": 0.00091,
37
+ "mrr_at_1": 0.148023,
38
+ "mrr_at_3": 0.183427,
39
+ "mrr_at_5": 0.190377,
40
+ "mrr_at_10": 0.197632,
41
+ "mrr_at_20": 0.201503,
42
+ "mrr_at_100": 0.205503,
43
+ "mrr_at_1000": 0.206352,
44
+ "nauc_ndcg_at_1_max": 0.241446,
45
+ "nauc_ndcg_at_1_std": -0.094702,
46
+ "nauc_ndcg_at_1_diff1": 0.438894,
47
+ "nauc_ndcg_at_3_max": 0.201467,
48
+ "nauc_ndcg_at_3_std": -0.040296,
49
+ "nauc_ndcg_at_3_diff1": 0.38263,
50
+ "nauc_ndcg_at_5_max": 0.206062,
51
+ "nauc_ndcg_at_5_std": -0.021721,
52
+ "nauc_ndcg_at_5_diff1": 0.371526,
53
+ "nauc_ndcg_at_10_max": 0.202634,
54
+ "nauc_ndcg_at_10_std": -0.010369,
55
+ "nauc_ndcg_at_10_diff1": 0.3549,
56
+ "nauc_ndcg_at_20_max": 0.203727,
57
+ "nauc_ndcg_at_20_std": 0.007283,
58
+ "nauc_ndcg_at_20_diff1": 0.344759,
59
+ "nauc_ndcg_at_100_max": 0.201585,
60
+ "nauc_ndcg_at_100_std": 0.007198,
61
+ "nauc_ndcg_at_100_diff1": 0.343017,
62
+ "nauc_ndcg_at_1000_max": 0.20987,
63
+ "nauc_ndcg_at_1000_std": 0.008911,
64
+ "nauc_ndcg_at_1000_diff1": 0.339753,
65
+ "nauc_map_at_1_max": 0.229695,
66
+ "nauc_map_at_1_std": -0.090778,
67
+ "nauc_map_at_1_diff1": 0.442674,
68
+ "nauc_map_at_3_max": 0.205399,
69
+ "nauc_map_at_3_std": -0.05078,
70
+ "nauc_map_at_3_diff1": 0.398154,
71
+ "nauc_map_at_5_max": 0.207879,
72
+ "nauc_map_at_5_std": -0.039771,
73
+ "nauc_map_at_5_diff1": 0.391316,
74
+ "nauc_map_at_10_max": 0.206349,
75
+ "nauc_map_at_10_std": -0.034135,
76
+ "nauc_map_at_10_diff1": 0.383829,
77
+ "nauc_map_at_20_max": 0.207149,
78
+ "nauc_map_at_20_std": -0.028556,
79
+ "nauc_map_at_20_diff1": 0.380681,
80
+ "nauc_map_at_100_max": 0.207302,
81
+ "nauc_map_at_100_std": -0.028331,
82
+ "nauc_map_at_100_diff1": 0.379522,
83
+ "nauc_map_at_1000_max": 0.207586,
84
+ "nauc_map_at_1000_std": -0.028329,
85
+ "nauc_map_at_1000_diff1": 0.379381,
86
+ "nauc_recall_at_1_max": 0.229695,
87
+ "nauc_recall_at_1_std": -0.090778,
88
+ "nauc_recall_at_1_diff1": 0.442674,
89
+ "nauc_recall_at_3_max": 0.172604,
90
+ "nauc_recall_at_3_std": -0.013815,
91
+ "nauc_recall_at_3_diff1": 0.350774,
92
+ "nauc_recall_at_5_max": 0.184876,
93
+ "nauc_recall_at_5_std": 0.018596,
94
+ "nauc_recall_at_5_diff1": 0.326955,
95
+ "nauc_recall_at_10_max": 0.178987,
96
+ "nauc_recall_at_10_std": 0.041875,
97
+ "nauc_recall_at_10_diff1": 0.287236,
98
+ "nauc_recall_at_20_max": 0.174097,
99
+ "nauc_recall_at_20_std": 0.095634,
100
+ "nauc_recall_at_20_diff1": 0.2543,
101
+ "nauc_recall_at_100_max": 0.155359,
102
+ "nauc_recall_at_100_std": 0.094894,
103
+ "nauc_recall_at_100_diff1": 0.248938,
104
+ "nauc_recall_at_1000_max": 0.208551,
105
+ "nauc_recall_at_1000_std": 0.159475,
106
+ "nauc_recall_at_1000_diff1": 0.170272,
107
+ "nauc_precision_at_1_max": 0.241446,
108
+ "nauc_precision_at_1_std": -0.094702,
109
+ "nauc_precision_at_1_diff1": 0.438894,
110
+ "nauc_precision_at_3_max": 0.191184,
111
+ "nauc_precision_at_3_std": -0.009893,
112
+ "nauc_precision_at_3_diff1": 0.345798,
113
+ "nauc_precision_at_5_max": 0.203031,
114
+ "nauc_precision_at_5_std": 0.03056,
115
+ "nauc_precision_at_5_diff1": 0.317716,
116
+ "nauc_precision_at_10_max": 0.189584,
117
+ "nauc_precision_at_10_std": 0.059963,
118
+ "nauc_precision_at_10_diff1": 0.278288,
119
+ "nauc_precision_at_20_max": 0.205948,
120
+ "nauc_precision_at_20_std": 0.103578,
121
+ "nauc_precision_at_20_diff1": 0.241858,
122
+ "nauc_precision_at_100_max": 0.192145,
123
+ "nauc_precision_at_100_std": 0.105171,
124
+ "nauc_precision_at_100_diff1": 0.20836,
125
+ "nauc_precision_at_1000_max": 0.203403,
126
+ "nauc_precision_at_1000_std": 0.133913,
127
+ "nauc_precision_at_1000_diff1": 0.081355,
128
+ "nauc_mrr_at_1_max": 0.241446,
129
+ "nauc_mrr_at_1_std": -0.094702,
130
+ "nauc_mrr_at_1_diff1": 0.438894,
131
+ "nauc_mrr_at_3_max": 0.220107,
132
+ "nauc_mrr_at_3_std": -0.050075,
133
+ "nauc_mrr_at_3_diff1": 0.385785,
134
+ "nauc_mrr_at_5_max": 0.224537,
135
+ "nauc_mrr_at_5_std": -0.038183,
136
+ "nauc_mrr_at_5_diff1": 0.379752,
137
+ "nauc_mrr_at_10_max": 0.222755,
138
+ "nauc_mrr_at_10_std": -0.033685,
139
+ "nauc_mrr_at_10_diff1": 0.372554,
140
+ "nauc_mrr_at_20_max": 0.223395,
141
+ "nauc_mrr_at_20_std": -0.028966,
142
+ "nauc_mrr_at_20_diff1": 0.369638,
143
+ "nauc_mrr_at_100_max": 0.223009,
144
+ "nauc_mrr_at_100_std": -0.029118,
145
+ "nauc_mrr_at_100_diff1": 0.369214,
146
+ "nauc_mrr_at_1000_max": 0.22309,
147
+ "nauc_mrr_at_1000_std": -0.029286,
148
+ "nauc_mrr_at_1000_diff1": 0.369222,
149
+ "hit_rate_at_1": 0.14802,
150
+ "hit_rate_at_3": 0.23051,
151
+ "hit_rate_at_5": 0.26102,
152
+ "hit_rate_at_10": 0.31751,
153
+ "hit_rate_at_20": 0.37627,
154
+ "hit_rate_at_100": 0.55028,
155
+ "hit_rate_at_1000": 0.78418,
156
+ "main_score": 0.21297,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 50.68451809883118,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackMathematicaRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "90fceea13679c63fe563ded68f3b6f06e50061de",
3
+ "task_name": "CQADupstackMathematicaRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.09453,
9
+ "ndcg_at_3": 0.1192,
10
+ "ndcg_at_5": 0.13291,
11
+ "ndcg_at_10": 0.14541,
12
+ "ndcg_at_20": 0.1642,
13
+ "ndcg_at_100": 0.19451,
14
+ "ndcg_at_1000": 0.22926,
15
+ "map_at_1": 0.07654,
16
+ "map_at_3": 0.10319,
17
+ "map_at_5": 0.11125,
18
+ "map_at_10": 0.11665,
19
+ "map_at_20": 0.12185,
20
+ "map_at_100": 0.12613,
21
+ "map_at_1000": 0.12739,
22
+ "recall_at_1": 0.07654,
23
+ "recall_at_3": 0.13873,
24
+ "recall_at_5": 0.17145,
25
+ "recall_at_10": 0.20914,
26
+ "recall_at_20": 0.27913,
27
+ "recall_at_100": 0.42889,
28
+ "recall_at_1000": 0.68302,
29
+ "accuracy": 0.07654,
30
+ "precision_at_1": 0.09453,
31
+ "precision_at_3": 0.05929,
32
+ "precision_at_5": 0.04527,
33
+ "precision_at_10": 0.02811,
34
+ "precision_at_20": 0.01859,
35
+ "precision_at_100": 0.00604,
36
+ "precision_at_1000": 0.00104,
37
+ "mrr_at_1": 0.094527,
38
+ "mrr_at_3": 0.126451,
39
+ "mrr_at_5": 0.136277,
40
+ "mrr_at_10": 0.141619,
41
+ "mrr_at_20": 0.147175,
42
+ "mrr_at_100": 0.151338,
43
+ "mrr_at_1000": 0.152318,
44
+ "nauc_ndcg_at_1_max": 0.077207,
45
+ "nauc_ndcg_at_1_std": -0.067272,
46
+ "nauc_ndcg_at_1_diff1": 0.319897,
47
+ "nauc_ndcg_at_3_max": 0.08314,
48
+ "nauc_ndcg_at_3_std": -0.026599,
49
+ "nauc_ndcg_at_3_diff1": 0.200026,
50
+ "nauc_ndcg_at_5_max": 0.093394,
51
+ "nauc_ndcg_at_5_std": -0.000477,
52
+ "nauc_ndcg_at_5_diff1": 0.18578,
53
+ "nauc_ndcg_at_10_max": 0.100755,
54
+ "nauc_ndcg_at_10_std": 0.019686,
55
+ "nauc_ndcg_at_10_diff1": 0.184693,
56
+ "nauc_ndcg_at_20_max": 0.099157,
57
+ "nauc_ndcg_at_20_std": 0.027921,
58
+ "nauc_ndcg_at_20_diff1": 0.174479,
59
+ "nauc_ndcg_at_100_max": 0.101368,
60
+ "nauc_ndcg_at_100_std": 0.050521,
61
+ "nauc_ndcg_at_100_diff1": 0.162507,
62
+ "nauc_ndcg_at_1000_max": 0.10998,
63
+ "nauc_ndcg_at_1000_std": 0.054672,
64
+ "nauc_ndcg_at_1000_diff1": 0.164291,
65
+ "nauc_map_at_1_max": 0.126332,
66
+ "nauc_map_at_1_std": -0.03207,
67
+ "nauc_map_at_1_diff1": 0.287554,
68
+ "nauc_map_at_3_max": 0.098769,
69
+ "nauc_map_at_3_std": -0.021683,
70
+ "nauc_map_at_3_diff1": 0.217052,
71
+ "nauc_map_at_5_max": 0.102821,
72
+ "nauc_map_at_5_std": -0.006616,
73
+ "nauc_map_at_5_diff1": 0.210328,
74
+ "nauc_map_at_10_max": 0.107241,
75
+ "nauc_map_at_10_std": 0.00354,
76
+ "nauc_map_at_10_diff1": 0.208195,
77
+ "nauc_map_at_20_max": 0.10666,
78
+ "nauc_map_at_20_std": 0.006284,
79
+ "nauc_map_at_20_diff1": 0.20485,
80
+ "nauc_map_at_100_max": 0.106404,
81
+ "nauc_map_at_100_std": 0.010426,
82
+ "nauc_map_at_100_diff1": 0.201273,
83
+ "nauc_map_at_1000_max": 0.106967,
84
+ "nauc_map_at_1000_std": 0.010934,
85
+ "nauc_map_at_1000_diff1": 0.20099,
86
+ "nauc_recall_at_1_max": 0.126332,
87
+ "nauc_recall_at_1_std": -0.03207,
88
+ "nauc_recall_at_1_diff1": 0.287554,
89
+ "nauc_recall_at_3_max": 0.06686,
90
+ "nauc_recall_at_3_std": -0.010096,
91
+ "nauc_recall_at_3_diff1": 0.142578,
92
+ "nauc_recall_at_5_max": 0.086885,
93
+ "nauc_recall_at_5_std": 0.033905,
94
+ "nauc_recall_at_5_diff1": 0.12688,
95
+ "nauc_recall_at_10_max": 0.099594,
96
+ "nauc_recall_at_10_std": 0.071126,
97
+ "nauc_recall_at_10_diff1": 0.127556,
98
+ "nauc_recall_at_20_max": 0.097851,
99
+ "nauc_recall_at_20_std": 0.086969,
100
+ "nauc_recall_at_20_diff1": 0.105784,
101
+ "nauc_recall_at_100_max": 0.105419,
102
+ "nauc_recall_at_100_std": 0.148496,
103
+ "nauc_recall_at_100_diff1": 0.077287,
104
+ "nauc_recall_at_1000_max": 0.154024,
105
+ "nauc_recall_at_1000_std": 0.201868,
106
+ "nauc_recall_at_1000_diff1": 0.079451,
107
+ "nauc_precision_at_1_max": 0.077207,
108
+ "nauc_precision_at_1_std": -0.067272,
109
+ "nauc_precision_at_1_diff1": 0.319897,
110
+ "nauc_precision_at_3_max": 0.040015,
111
+ "nauc_precision_at_3_std": -0.043795,
112
+ "nauc_precision_at_3_diff1": 0.166762,
113
+ "nauc_precision_at_5_max": 0.059346,
114
+ "nauc_precision_at_5_std": -0.005412,
115
+ "nauc_precision_at_5_diff1": 0.148022,
116
+ "nauc_precision_at_10_max": 0.083602,
117
+ "nauc_precision_at_10_std": 0.042926,
118
+ "nauc_precision_at_10_diff1": 0.135007,
119
+ "nauc_precision_at_20_max": 0.068478,
120
+ "nauc_precision_at_20_std": 0.058986,
121
+ "nauc_precision_at_20_diff1": 0.101052,
122
+ "nauc_precision_at_100_max": 0.06294,
123
+ "nauc_precision_at_100_std": 0.127428,
124
+ "nauc_precision_at_100_diff1": 0.043471,
125
+ "nauc_precision_at_1000_max": 0.052201,
126
+ "nauc_precision_at_1000_std": 0.078407,
127
+ "nauc_precision_at_1000_diff1": 0.001548,
128
+ "nauc_mrr_at_1_max": 0.077207,
129
+ "nauc_mrr_at_1_std": -0.067272,
130
+ "nauc_mrr_at_1_diff1": 0.319897,
131
+ "nauc_mrr_at_3_max": 0.071593,
132
+ "nauc_mrr_at_3_std": -0.04877,
133
+ "nauc_mrr_at_3_diff1": 0.23058,
134
+ "nauc_mrr_at_5_max": 0.080837,
135
+ "nauc_mrr_at_5_std": -0.032153,
136
+ "nauc_mrr_at_5_diff1": 0.22148,
137
+ "nauc_mrr_at_10_max": 0.083453,
138
+ "nauc_mrr_at_10_std": -0.02441,
139
+ "nauc_mrr_at_10_diff1": 0.22032,
140
+ "nauc_mrr_at_20_max": 0.082106,
141
+ "nauc_mrr_at_20_std": -0.022884,
142
+ "nauc_mrr_at_20_diff1": 0.217055,
143
+ "nauc_mrr_at_100_max": 0.082625,
144
+ "nauc_mrr_at_100_std": -0.018697,
145
+ "nauc_mrr_at_100_diff1": 0.215846,
146
+ "nauc_mrr_at_1000_max": 0.082873,
147
+ "nauc_mrr_at_1000_std": -0.018702,
148
+ "nauc_mrr_at_1000_diff1": 0.215873,
149
+ "hit_rate_at_1": 0.09453,
150
+ "hit_rate_at_3": 0.17164,
151
+ "hit_rate_at_5": 0.21393,
152
+ "hit_rate_at_10": 0.25498,
153
+ "hit_rate_at_20": 0.33458,
154
+ "hit_rate_at_100": 0.50373,
155
+ "hit_rate_at_1000": 0.74627,
156
+ "main_score": 0.14541,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 24.72330355644226,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackPhysicsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4",
3
+ "task_name": "CQADupstackPhysicsRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.20982,
9
+ "ndcg_at_3": 0.24578,
10
+ "ndcg_at_5": 0.26211,
11
+ "ndcg_at_10": 0.28061,
12
+ "ndcg_at_20": 0.3014,
13
+ "ndcg_at_100": 0.33628,
14
+ "ndcg_at_1000": 0.36784,
15
+ "map_at_1": 0.1734,
16
+ "map_at_3": 0.21843,
17
+ "map_at_5": 0.22941,
18
+ "map_at_10": 0.23899,
19
+ "map_at_20": 0.24502,
20
+ "map_at_100": 0.25048,
21
+ "map_at_1000": 0.25187,
22
+ "recall_at_1": 0.1734,
23
+ "recall_at_3": 0.26761,
24
+ "recall_at_5": 0.30865,
25
+ "recall_at_10": 0.36498,
26
+ "recall_at_20": 0.4409,
27
+ "recall_at_100": 0.60724,
28
+ "recall_at_1000": 0.82477,
29
+ "accuracy": 0.1734,
30
+ "precision_at_1": 0.20982,
31
+ "precision_at_3": 0.1155,
32
+ "precision_at_5": 0.08335,
33
+ "precision_at_10": 0.0514,
34
+ "precision_at_20": 0.03162,
35
+ "precision_at_100": 0.00936,
36
+ "precision_at_1000": 0.0014,
37
+ "mrr_at_1": 0.209817,
38
+ "mrr_at_3": 0.260026,
39
+ "mrr_at_5": 0.271816,
40
+ "mrr_at_10": 0.279856,
41
+ "mrr_at_20": 0.285732,
42
+ "mrr_at_100": 0.289895,
43
+ "mrr_at_1000": 0.290729,
44
+ "nauc_ndcg_at_1_max": 0.254828,
45
+ "nauc_ndcg_at_1_std": -0.043127,
46
+ "nauc_ndcg_at_1_diff1": 0.470295,
47
+ "nauc_ndcg_at_3_max": 0.248655,
48
+ "nauc_ndcg_at_3_std": -0.044729,
49
+ "nauc_ndcg_at_3_diff1": 0.438501,
50
+ "nauc_ndcg_at_5_max": 0.242948,
51
+ "nauc_ndcg_at_5_std": -0.042773,
52
+ "nauc_ndcg_at_5_diff1": 0.446666,
53
+ "nauc_ndcg_at_10_max": 0.237071,
54
+ "nauc_ndcg_at_10_std": -0.035274,
55
+ "nauc_ndcg_at_10_diff1": 0.431205,
56
+ "nauc_ndcg_at_20_max": 0.236073,
57
+ "nauc_ndcg_at_20_std": -0.024615,
58
+ "nauc_ndcg_at_20_diff1": 0.419382,
59
+ "nauc_ndcg_at_100_max": 0.245409,
60
+ "nauc_ndcg_at_100_std": 0.005871,
61
+ "nauc_ndcg_at_100_diff1": 0.410673,
62
+ "nauc_ndcg_at_1000_max": 0.254067,
63
+ "nauc_ndcg_at_1000_std": 0.012659,
64
+ "nauc_ndcg_at_1000_diff1": 0.40722,
65
+ "nauc_map_at_1_max": 0.218463,
66
+ "nauc_map_at_1_std": -0.105778,
67
+ "nauc_map_at_1_diff1": 0.523157,
68
+ "nauc_map_at_3_max": 0.234017,
69
+ "nauc_map_at_3_std": -0.073921,
70
+ "nauc_map_at_3_diff1": 0.472238,
71
+ "nauc_map_at_5_max": 0.235592,
72
+ "nauc_map_at_5_std": -0.066933,
73
+ "nauc_map_at_5_diff1": 0.471779,
74
+ "nauc_map_at_10_max": 0.23685,
75
+ "nauc_map_at_10_std": -0.059201,
76
+ "nauc_map_at_10_diff1": 0.461365,
77
+ "nauc_map_at_20_max": 0.236919,
78
+ "nauc_map_at_20_std": -0.055254,
79
+ "nauc_map_at_20_diff1": 0.457038,
80
+ "nauc_map_at_100_max": 0.238626,
81
+ "nauc_map_at_100_std": -0.049963,
82
+ "nauc_map_at_100_diff1": 0.455506,
83
+ "nauc_map_at_1000_max": 0.239134,
84
+ "nauc_map_at_1000_std": -0.04921,
85
+ "nauc_map_at_1000_diff1": 0.455119,
86
+ "nauc_recall_at_1_max": 0.218463,
87
+ "nauc_recall_at_1_std": -0.105778,
88
+ "nauc_recall_at_1_diff1": 0.523157,
89
+ "nauc_recall_at_3_max": 0.225161,
90
+ "nauc_recall_at_3_std": -0.05949,
91
+ "nauc_recall_at_3_diff1": 0.415325,
92
+ "nauc_recall_at_5_max": 0.217246,
93
+ "nauc_recall_at_5_std": -0.038944,
94
+ "nauc_recall_at_5_diff1": 0.413718,
95
+ "nauc_recall_at_10_max": 0.2042,
96
+ "nauc_recall_at_10_std": -0.009831,
97
+ "nauc_recall_at_10_diff1": 0.36635,
98
+ "nauc_recall_at_20_max": 0.198935,
99
+ "nauc_recall_at_20_std": 0.026311,
100
+ "nauc_recall_at_20_diff1": 0.321622,
101
+ "nauc_recall_at_100_max": 0.229836,
102
+ "nauc_recall_at_100_std": 0.175171,
103
+ "nauc_recall_at_100_diff1": 0.266453,
104
+ "nauc_recall_at_1000_max": 0.327015,
105
+ "nauc_recall_at_1000_std": 0.368444,
106
+ "nauc_recall_at_1000_diff1": 0.16318,
107
+ "nauc_precision_at_1_max": 0.254828,
108
+ "nauc_precision_at_1_std": -0.043127,
109
+ "nauc_precision_at_1_diff1": 0.470295,
110
+ "nauc_precision_at_3_max": 0.287493,
111
+ "nauc_precision_at_3_std": 0.035242,
112
+ "nauc_precision_at_3_diff1": 0.347047,
113
+ "nauc_precision_at_5_max": 0.276118,
114
+ "nauc_precision_at_5_std": 0.058746,
115
+ "nauc_precision_at_5_diff1": 0.330775,
116
+ "nauc_precision_at_10_max": 0.266077,
117
+ "nauc_precision_at_10_std": 0.096808,
118
+ "nauc_precision_at_10_diff1": 0.242079,
119
+ "nauc_precision_at_20_max": 0.242944,
120
+ "nauc_precision_at_20_std": 0.128028,
121
+ "nauc_precision_at_20_diff1": 0.172198,
122
+ "nauc_precision_at_100_max": 0.210621,
123
+ "nauc_precision_at_100_std": 0.216472,
124
+ "nauc_precision_at_100_diff1": 0.050983,
125
+ "nauc_precision_at_1000_max": 0.144491,
126
+ "nauc_precision_at_1000_std": 0.202161,
127
+ "nauc_precision_at_1000_diff1": -0.100711,
128
+ "nauc_mrr_at_1_max": 0.254828,
129
+ "nauc_mrr_at_1_std": -0.043127,
130
+ "nauc_mrr_at_1_diff1": 0.470295,
131
+ "nauc_mrr_at_3_max": 0.2626,
132
+ "nauc_mrr_at_3_std": -0.022023,
133
+ "nauc_mrr_at_3_diff1": 0.423527,
134
+ "nauc_mrr_at_5_max": 0.258401,
135
+ "nauc_mrr_at_5_std": -0.019565,
136
+ "nauc_mrr_at_5_diff1": 0.427366,
137
+ "nauc_mrr_at_10_max": 0.254231,
138
+ "nauc_mrr_at_10_std": -0.017202,
139
+ "nauc_mrr_at_10_diff1": 0.419566,
140
+ "nauc_mrr_at_20_max": 0.253555,
141
+ "nauc_mrr_at_20_std": -0.014647,
142
+ "nauc_mrr_at_20_diff1": 0.416868,
143
+ "nauc_mrr_at_100_max": 0.255094,
144
+ "nauc_mrr_at_100_std": -0.011623,
145
+ "nauc_mrr_at_100_diff1": 0.416154,
146
+ "nauc_mrr_at_1000_max": 0.255383,
147
+ "nauc_mrr_at_1000_std": -0.011407,
148
+ "nauc_mrr_at_1000_diff1": 0.416047,
149
+ "hit_rate_at_1": 0.20982,
150
+ "hit_rate_at_3": 0.32243,
151
+ "hit_rate_at_5": 0.37344,
152
+ "hit_rate_at_10": 0.43311,
153
+ "hit_rate_at_20": 0.51877,
154
+ "hit_rate_at_100": 0.68046,
155
+ "hit_rate_at_1000": 0.87199,
156
+ "main_score": 0.28061,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 49.752307653427124,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackProgrammersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6184bc1440d2dbc7612be22b50686b8826d22b32",
3
+ "task_name": "CQADupstackProgrammersRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.17123,
9
+ "ndcg_at_3": 0.20564,
10
+ "ndcg_at_5": 0.223,
11
+ "ndcg_at_10": 0.24329,
12
+ "ndcg_at_20": 0.2617,
13
+ "ndcg_at_100": 0.29386,
14
+ "ndcg_at_1000": 0.32936,
15
+ "map_at_1": 0.13695,
16
+ "map_at_3": 0.18082,
17
+ "map_at_5": 0.19216,
18
+ "map_at_10": 0.20111,
19
+ "map_at_20": 0.20666,
20
+ "map_at_100": 0.21144,
21
+ "map_at_1000": 0.21283,
22
+ "recall_at_1": 0.13695,
23
+ "recall_at_3": 0.22597,
24
+ "recall_at_5": 0.27226,
25
+ "recall_at_10": 0.33283,
26
+ "recall_at_20": 0.39844,
27
+ "recall_at_100": 0.55443,
28
+ "recall_at_1000": 0.81224,
29
+ "accuracy": 0.13695,
30
+ "precision_at_1": 0.17123,
31
+ "precision_at_3": 0.10008,
32
+ "precision_at_5": 0.07306,
33
+ "precision_at_10": 0.04612,
34
+ "precision_at_20": 0.02848,
35
+ "precision_at_100": 0.00853,
36
+ "precision_at_1000": 0.00132,
37
+ "mrr_at_1": 0.171233,
38
+ "mrr_at_3": 0.22032,
39
+ "mrr_at_5": 0.23242,
40
+ "mrr_at_10": 0.241276,
41
+ "mrr_at_20": 0.246273,
42
+ "mrr_at_100": 0.249915,
43
+ "mrr_at_1000": 0.250883,
44
+ "nauc_ndcg_at_1_max": 0.297837,
45
+ "nauc_ndcg_at_1_std": 0.050543,
46
+ "nauc_ndcg_at_1_diff1": 0.436876,
47
+ "nauc_ndcg_at_3_max": 0.252802,
48
+ "nauc_ndcg_at_3_std": 0.06258,
49
+ "nauc_ndcg_at_3_diff1": 0.374362,
50
+ "nauc_ndcg_at_5_max": 0.254951,
51
+ "nauc_ndcg_at_5_std": 0.069593,
52
+ "nauc_ndcg_at_5_diff1": 0.372934,
53
+ "nauc_ndcg_at_10_max": 0.260979,
54
+ "nauc_ndcg_at_10_std": 0.079412,
55
+ "nauc_ndcg_at_10_diff1": 0.368212,
56
+ "nauc_ndcg_at_20_max": 0.267407,
57
+ "nauc_ndcg_at_20_std": 0.084276,
58
+ "nauc_ndcg_at_20_diff1": 0.355256,
59
+ "nauc_ndcg_at_100_max": 0.284182,
60
+ "nauc_ndcg_at_100_std": 0.121671,
61
+ "nauc_ndcg_at_100_diff1": 0.3371,
62
+ "nauc_ndcg_at_1000_max": 0.290727,
63
+ "nauc_ndcg_at_1000_std": 0.132371,
64
+ "nauc_ndcg_at_1000_diff1": 0.344108,
65
+ "nauc_map_at_1_max": 0.259783,
66
+ "nauc_map_at_1_std": 0.028129,
67
+ "nauc_map_at_1_diff1": 0.465984,
68
+ "nauc_map_at_3_max": 0.249202,
69
+ "nauc_map_at_3_std": 0.054965,
70
+ "nauc_map_at_3_diff1": 0.400199,
71
+ "nauc_map_at_5_max": 0.252626,
72
+ "nauc_map_at_5_std": 0.05979,
73
+ "nauc_map_at_5_diff1": 0.397735,
74
+ "nauc_map_at_10_max": 0.256813,
75
+ "nauc_map_at_10_std": 0.06531,
76
+ "nauc_map_at_10_diff1": 0.394459,
77
+ "nauc_map_at_20_max": 0.259883,
78
+ "nauc_map_at_20_std": 0.066747,
79
+ "nauc_map_at_20_diff1": 0.390146,
80
+ "nauc_map_at_100_max": 0.263054,
81
+ "nauc_map_at_100_std": 0.073393,
82
+ "nauc_map_at_100_diff1": 0.387143,
83
+ "nauc_map_at_1000_max": 0.263528,
84
+ "nauc_map_at_1000_std": 0.074184,
85
+ "nauc_map_at_1000_diff1": 0.38713,
86
+ "nauc_recall_at_1_max": 0.259783,
87
+ "nauc_recall_at_1_std": 0.028129,
88
+ "nauc_recall_at_1_diff1": 0.465984,
89
+ "nauc_recall_at_3_max": 0.225685,
90
+ "nauc_recall_at_3_std": 0.070156,
91
+ "nauc_recall_at_3_diff1": 0.338033,
92
+ "nauc_recall_at_5_max": 0.234934,
93
+ "nauc_recall_at_5_std": 0.083898,
94
+ "nauc_recall_at_5_diff1": 0.324011,
95
+ "nauc_recall_at_10_max": 0.244585,
96
+ "nauc_recall_at_10_std": 0.108694,
97
+ "nauc_recall_at_10_diff1": 0.303497,
98
+ "nauc_recall_at_20_max": 0.257007,
99
+ "nauc_recall_at_20_std": 0.121982,
100
+ "nauc_recall_at_20_diff1": 0.266772,
101
+ "nauc_recall_at_100_max": 0.317556,
102
+ "nauc_recall_at_100_std": 0.277659,
103
+ "nauc_recall_at_100_diff1": 0.181336,
104
+ "nauc_recall_at_1000_max": 0.439803,
105
+ "nauc_recall_at_1000_std": 0.520536,
106
+ "nauc_recall_at_1000_diff1": 0.163722,
107
+ "nauc_precision_at_1_max": 0.297837,
108
+ "nauc_precision_at_1_std": 0.050543,
109
+ "nauc_precision_at_1_diff1": 0.436876,
110
+ "nauc_precision_at_3_max": 0.264821,
111
+ "nauc_precision_at_3_std": 0.099565,
112
+ "nauc_precision_at_3_diff1": 0.30401,
113
+ "nauc_precision_at_5_max": 0.272477,
114
+ "nauc_precision_at_5_std": 0.113092,
115
+ "nauc_precision_at_5_diff1": 0.297559,
116
+ "nauc_precision_at_10_max": 0.28837,
117
+ "nauc_precision_at_10_std": 0.127413,
118
+ "nauc_precision_at_10_diff1": 0.250093,
119
+ "nauc_precision_at_20_max": 0.297813,
120
+ "nauc_precision_at_20_std": 0.137803,
121
+ "nauc_precision_at_20_diff1": 0.183561,
122
+ "nauc_precision_at_100_max": 0.268929,
123
+ "nauc_precision_at_100_std": 0.198684,
124
+ "nauc_precision_at_100_diff1": 0.043035,
125
+ "nauc_precision_at_1000_max": 0.132982,
126
+ "nauc_precision_at_1000_std": 0.148289,
127
+ "nauc_precision_at_1000_diff1": -0.064595,
128
+ "nauc_mrr_at_1_max": 0.297837,
129
+ "nauc_mrr_at_1_std": 0.050543,
130
+ "nauc_mrr_at_1_diff1": 0.436876,
131
+ "nauc_mrr_at_3_max": 0.266739,
132
+ "nauc_mrr_at_3_std": 0.058922,
133
+ "nauc_mrr_at_3_diff1": 0.374443,
134
+ "nauc_mrr_at_5_max": 0.267419,
135
+ "nauc_mrr_at_5_std": 0.062022,
136
+ "nauc_mrr_at_5_diff1": 0.371348,
137
+ "nauc_mrr_at_10_max": 0.270051,
138
+ "nauc_mrr_at_10_std": 0.065013,
139
+ "nauc_mrr_at_10_diff1": 0.3702,
140
+ "nauc_mrr_at_20_max": 0.270384,
141
+ "nauc_mrr_at_20_std": 0.066474,
142
+ "nauc_mrr_at_20_diff1": 0.366143,
143
+ "nauc_mrr_at_100_max": 0.272236,
144
+ "nauc_mrr_at_100_std": 0.071285,
145
+ "nauc_mrr_at_100_diff1": 0.364096,
146
+ "nauc_mrr_at_1000_max": 0.272188,
147
+ "nauc_mrr_at_1000_std": 0.071368,
148
+ "nauc_mrr_at_1000_diff1": 0.3645,
149
+ "hit_rate_at_1": 0.17123,
150
+ "hit_rate_at_3": 0.27854,
151
+ "hit_rate_at_5": 0.33105,
152
+ "hit_rate_at_10": 0.40068,
153
+ "hit_rate_at_20": 0.47374,
154
+ "hit_rate_at_100": 0.62329,
155
+ "hit_rate_at_1000": 0.85959,
156
+ "main_score": 0.24329,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 42.90849280357361,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackRetrieval.json ADDED
@@ -0,0 +1,20 @@
+ {
+ "dataset_revision": "1",
+ "task_name": "CQADupstackRetrieval",
+ "mteb_version": "2.10.12",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_10": 0.223393,
+ "main_score": 0.223393,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 617.5961873531342,
+ "kg_co2_emissions": NaN,
+ "date": 1774563874.323062
+ }
results/CQADupstackStatsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "65ac3a16b8e91f9cee4c9828cc7c335575432a2a",
3
+ "task_name": "CQADupstackStatsRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.17791,
9
+ "ndcg_at_3": 0.196,
10
+ "ndcg_at_5": 0.20613,
11
+ "ndcg_at_10": 0.21584,
12
+ "ndcg_at_20": 0.23198,
13
+ "ndcg_at_100": 0.2541,
14
+ "ndcg_at_1000": 0.2823,
15
+ "map_at_1": 0.15916,
16
+ "map_at_3": 0.18159,
17
+ "map_at_5": 0.18832,
18
+ "map_at_10": 0.19288,
19
+ "map_at_20": 0.19742,
20
+ "map_at_100": 0.20071,
21
+ "map_at_1000": 0.20158,
22
+ "recall_at_1": 0.15916,
23
+ "recall_at_3": 0.20938,
24
+ "recall_at_5": 0.23693,
25
+ "recall_at_10": 0.26651,
26
+ "recall_at_20": 0.32725,
27
+ "recall_at_100": 0.44016,
28
+ "recall_at_1000": 0.66098,
29
+ "accuracy": 0.15916,
30
+ "precision_at_1": 0.17791,
31
+ "precision_at_3": 0.08282,
32
+ "precision_at_5": 0.05675,
33
+ "precision_at_10": 0.03221,
34
+ "precision_at_20": 0.01986,
35
+ "precision_at_100": 0.00544,
36
+ "precision_at_1000": 0.00086,
37
+ "mrr_at_1": 0.177914,
38
+ "mrr_at_3": 0.202454,
39
+ "mrr_at_5": 0.209509,
40
+ "mrr_at_10": 0.213756,
41
+ "mrr_at_20": 0.218521,
42
+ "mrr_at_100": 0.221363,
43
+ "mrr_at_1000": 0.222132,
44
+ "nauc_ndcg_at_1_max": 0.183033,
45
+ "nauc_ndcg_at_1_std": 0.032397,
46
+ "nauc_ndcg_at_1_diff1": 0.494928,
47
+ "nauc_ndcg_at_3_max": 0.149511,
48
+ "nauc_ndcg_at_3_std": 0.02736,
49
+ "nauc_ndcg_at_3_diff1": 0.452294,
50
+ "nauc_ndcg_at_5_max": 0.140584,
51
+ "nauc_ndcg_at_5_std": 0.022732,
52
+ "nauc_ndcg_at_5_diff1": 0.441298,
53
+ "nauc_ndcg_at_10_max": 0.127631,
54
+ "nauc_ndcg_at_10_std": 0.022824,
55
+ "nauc_ndcg_at_10_diff1": 0.428288,
56
+ "nauc_ndcg_at_20_max": 0.14009,
57
+ "nauc_ndcg_at_20_std": 0.04502,
58
+ "nauc_ndcg_at_20_diff1": 0.420768,
59
+ "nauc_ndcg_at_100_max": 0.139258,
60
+ "nauc_ndcg_at_100_std": 0.077834,
61
+ "nauc_ndcg_at_100_diff1": 0.401035,
62
+ "nauc_ndcg_at_1000_max": 0.140727,
63
+ "nauc_ndcg_at_1000_std": 0.08085,
64
+ "nauc_ndcg_at_1000_diff1": 0.399758,
65
+ "nauc_map_at_1_max": 0.162492,
66
+ "nauc_map_at_1_std": -0.005286,
67
+ "nauc_map_at_1_diff1": 0.516272,
68
+ "nauc_map_at_3_max": 0.147141,
69
+ "nauc_map_at_3_std": 0.011165,
70
+ "nauc_map_at_3_diff1": 0.47213,
71
+ "nauc_map_at_5_max": 0.144364,
72
+ "nauc_map_at_5_std": 0.012534,
73
+ "nauc_map_at_5_diff1": 0.46441,
74
+ "nauc_map_at_10_max": 0.138004,
75
+ "nauc_map_at_10_std": 0.01317,
76
+ "nauc_map_at_10_diff1": 0.459172,
77
+ "nauc_map_at_20_max": 0.141604,
78
+ "nauc_map_at_20_std": 0.019593,
79
+ "nauc_map_at_20_diff1": 0.456463,
80
+ "nauc_map_at_100_max": 0.141807,
81
+ "nauc_map_at_100_std": 0.024972,
82
+ "nauc_map_at_100_diff1": 0.453337,
83
+ "nauc_map_at_1000_max": 0.142015,
84
+ "nauc_map_at_1000_std": 0.02531,
85
+ "nauc_map_at_1000_diff1": 0.453118,
86
+ "nauc_recall_at_1_max": 0.162492,
87
+ "nauc_recall_at_1_std": -0.005286,
88
+ "nauc_recall_at_1_diff1": 0.516272,
89
+ "nauc_recall_at_3_max": 0.12429,
90
+ "nauc_recall_at_3_std": 0.022123,
91
+ "nauc_recall_at_3_diff1": 0.419082,
92
+ "nauc_recall_at_5_max": 0.111799,
93
+ "nauc_recall_at_5_std": 0.022117,
94
+ "nauc_recall_at_5_diff1": 0.382958,
95
+ "nauc_recall_at_10_max": 0.083367,
96
+ "nauc_recall_at_10_std": 0.024539,
97
+ "nauc_recall_at_10_diff1": 0.352838,
98
+ "nauc_recall_at_20_max": 0.121184,
99
+ "nauc_recall_at_20_std": 0.09396,
100
+ "nauc_recall_at_20_diff1": 0.327596,
101
+ "nauc_recall_at_100_max": 0.114498,
102
+ "nauc_recall_at_100_std": 0.228954,
103
+ "nauc_recall_at_100_diff1": 0.244536,
104
+ "nauc_recall_at_1000_max": 0.099269,
105
+ "nauc_recall_at_1000_std": 0.282387,
106
+ "nauc_recall_at_1000_diff1": 0.195672,
107
+ "nauc_precision_at_1_max": 0.183033,
108
+ "nauc_precision_at_1_std": 0.032397,
109
+ "nauc_precision_at_1_diff1": 0.494928,
110
+ "nauc_precision_at_3_max": 0.151787,
111
+ "nauc_precision_at_3_std": 0.067144,
112
+ "nauc_precision_at_3_diff1": 0.368324,
113
+ "nauc_precision_at_5_max": 0.142679,
114
+ "nauc_precision_at_5_std": 0.063829,
115
+ "nauc_precision_at_5_diff1": 0.347896,
116
+ "nauc_precision_at_10_max": 0.106501,
117
+ "nauc_precision_at_10_std": 0.064661,
118
+ "nauc_precision_at_10_diff1": 0.311837,
119
+ "nauc_precision_at_20_max": 0.148322,
120
+ "nauc_precision_at_20_std": 0.139417,
121
+ "nauc_precision_at_20_diff1": 0.287879,
122
+ "nauc_precision_at_100_max": 0.15121,
123
+ "nauc_precision_at_100_std": 0.277803,
124
+ "nauc_precision_at_100_diff1": 0.19018,
125
+ "nauc_precision_at_1000_max": 0.142965,
126
+ "nauc_precision_at_1000_std": 0.246585,
127
+ "nauc_precision_at_1000_diff1": 0.088476,
128
+ "nauc_mrr_at_1_max": 0.183033,
129
+ "nauc_mrr_at_1_std": 0.032397,
130
+ "nauc_mrr_at_1_diff1": 0.494928,
131
+ "nauc_mrr_at_3_max": 0.165672,
132
+ "nauc_mrr_at_3_std": 0.042397,
133
+ "nauc_mrr_at_3_diff1": 0.45951,
134
+ "nauc_mrr_at_5_max": 0.161531,
135
+ "nauc_mrr_at_5_std": 0.040125,
136
+ "nauc_mrr_at_5_diff1": 0.449297,
137
+ "nauc_mrr_at_10_max": 0.157097,
138
+ "nauc_mrr_at_10_std": 0.04048,
139
+ "nauc_mrr_at_10_diff1": 0.443077,
140
+ "nauc_mrr_at_20_max": 0.16014,
141
+ "nauc_mrr_at_20_std": 0.045543,
142
+ "nauc_mrr_at_20_diff1": 0.44011,
143
+ "nauc_mrr_at_100_max": 0.159399,
144
+ "nauc_mrr_at_100_std": 0.049364,
145
+ "nauc_mrr_at_100_diff1": 0.437581,
146
+ "nauc_mrr_at_1000_max": 0.159609,
147
+ "nauc_mrr_at_1000_std": 0.049379,
148
+ "nauc_mrr_at_1000_diff1": 0.437715,
149
+ "hit_rate_at_1": 0.17791,
150
+ "hit_rate_at_3": 0.2362,
151
+ "hit_rate_at_5": 0.26687,
152
+ "hit_rate_at_10": 0.29755,
153
+ "hit_rate_at_20": 0.3635,
154
+ "hit_rate_at_100": 0.47853,
155
+ "hit_rate_at_1000": 0.70706,
156
+ "main_score": 0.21584,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 55.42213582992554,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackTexRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "46989137a86843e03a6195de44b09deda022eec7",
3
+ "task_name": "CQADupstackTexRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.10392,
9
+ "ndcg_at_3": 0.12455,
10
+ "ndcg_at_5": 0.1364,
11
+ "ndcg_at_10": 0.15038,
12
+ "ndcg_at_20": 0.16301,
13
+ "ndcg_at_100": 0.18763,
14
+ "ndcg_at_1000": 0.21916,
15
+ "map_at_1": 0.08824,
16
+ "map_at_3": 0.11102,
17
+ "map_at_5": 0.11816,
18
+ "map_at_10": 0.12414,
19
+ "map_at_20": 0.12777,
20
+ "map_at_100": 0.13106,
21
+ "map_at_1000": 0.13215,
22
+ "recall_at_1": 0.08824,
23
+ "recall_at_3": 0.1371,
24
+ "recall_at_5": 0.16692,
25
+ "recall_at_10": 0.20907,
26
+ "recall_at_20": 0.2558,
27
+ "recall_at_100": 0.3828,
28
+ "recall_at_1000": 0.6168,
29
+ "accuracy": 0.08824,
30
+ "precision_at_1": 0.10392,
31
+ "precision_at_3": 0.05816,
32
+ "precision_at_5": 0.0435,
33
+ "precision_at_10": 0.0277,
34
+ "precision_at_20": 0.01741,
35
+ "precision_at_100": 0.00543,
36
+ "precision_at_1000": 0.00095,
37
+ "mrr_at_1": 0.103923,
38
+ "mrr_at_3": 0.131624,
39
+ "mrr_at_5": 0.139504,
40
+ "mrr_at_10": 0.145981,
41
+ "mrr_at_20": 0.14976,
42
+ "mrr_at_100": 0.152972,
43
+ "mrr_at_1000": 0.153882,
44
+ "nauc_ndcg_at_1_max": 0.241745,
45
+ "nauc_ndcg_at_1_std": 0.019745,
46
+ "nauc_ndcg_at_1_diff1": 0.428808,
47
+ "nauc_ndcg_at_3_max": 0.213809,
48
+ "nauc_ndcg_at_3_std": 0.025011,
49
+ "nauc_ndcg_at_3_diff1": 0.345063,
50
+ "nauc_ndcg_at_5_max": 0.212832,
51
+ "nauc_ndcg_at_5_std": 0.031088,
52
+ "nauc_ndcg_at_5_diff1": 0.329769,
53
+ "nauc_ndcg_at_10_max": 0.207267,
54
+ "nauc_ndcg_at_10_std": 0.035282,
55
+ "nauc_ndcg_at_10_diff1": 0.314207,
56
+ "nauc_ndcg_at_20_max": 0.206616,
57
+ "nauc_ndcg_at_20_std": 0.043859,
58
+ "nauc_ndcg_at_20_diff1": 0.31432,
59
+ "nauc_ndcg_at_100_max": 0.209541,
60
+ "nauc_ndcg_at_100_std": 0.066365,
61
+ "nauc_ndcg_at_100_diff1": 0.300961,
62
+ "nauc_ndcg_at_1000_max": 0.226776,
63
+ "nauc_ndcg_at_1000_std": 0.084963,
64
+ "nauc_ndcg_at_1000_diff1": 0.296052,
65
+ "nauc_map_at_1_max": 0.24735,
66
+ "nauc_map_at_1_std": 0.037004,
67
+ "nauc_map_at_1_diff1": 0.437329,
68
+ "nauc_map_at_3_max": 0.217756,
69
+ "nauc_map_at_3_std": 0.028094,
70
+ "nauc_map_at_3_diff1": 0.369287,
71
+ "nauc_map_at_5_max": 0.218619,
72
+ "nauc_map_at_5_std": 0.031181,
73
+ "nauc_map_at_5_diff1": 0.359253,
74
+ "nauc_map_at_10_max": 0.21657,
75
+ "nauc_map_at_10_std": 0.032939,
76
+ "nauc_map_at_10_diff1": 0.350454,
77
+ "nauc_map_at_20_max": 0.216286,
78
+ "nauc_map_at_20_std": 0.035927,
79
+ "nauc_map_at_20_diff1": 0.349902,
80
+ "nauc_map_at_100_max": 0.217112,
81
+ "nauc_map_at_100_std": 0.03967,
82
+ "nauc_map_at_100_diff1": 0.347838,
83
+ "nauc_map_at_1000_max": 0.218067,
84
+ "nauc_map_at_1000_std": 0.040528,
85
+ "nauc_map_at_1000_diff1": 0.347386,
86
+ "nauc_recall_at_1_max": 0.24735,
87
+ "nauc_recall_at_1_std": 0.037004,
88
+ "nauc_recall_at_1_diff1": 0.437329,
89
+ "nauc_recall_at_3_max": 0.188438,
90
+ "nauc_recall_at_3_std": 0.02477,
91
+ "nauc_recall_at_3_diff1": 0.308673,
92
+ "nauc_recall_at_5_max": 0.188138,
93
+ "nauc_recall_at_5_std": 0.033885,
94
+ "nauc_recall_at_5_diff1": 0.282793,
95
+ "nauc_recall_at_10_max": 0.172316,
96
+ "nauc_recall_at_10_std": 0.038655,
97
+ "nauc_recall_at_10_diff1": 0.243524,
98
+ "nauc_recall_at_20_max": 0.173054,
99
+ "nauc_recall_at_20_std": 0.060024,
100
+ "nauc_recall_at_20_diff1": 0.246939,
101
+ "nauc_recall_at_100_max": 0.177744,
102
+ "nauc_recall_at_100_std": 0.134482,
103
+ "nauc_recall_at_100_diff1": 0.201046,
104
+ "nauc_recall_at_1000_max": 0.251574,
105
+ "nauc_recall_at_1000_std": 0.250301,
106
+ "nauc_recall_at_1000_diff1": 0.160561,
107
+ "nauc_precision_at_1_max": 0.241745,
108
+ "nauc_precision_at_1_std": 0.019745,
109
+ "nauc_precision_at_1_diff1": 0.428808,
110
+ "nauc_precision_at_3_max": 0.203234,
111
+ "nauc_precision_at_3_std": 0.01787,
112
+ "nauc_precision_at_3_diff1": 0.283149,
113
+ "nauc_precision_at_5_max": 0.21043,
114
+ "nauc_precision_at_5_std": 0.032574,
115
+ "nauc_precision_at_5_diff1": 0.249021,
116
+ "nauc_precision_at_10_max": 0.202864,
117
+ "nauc_precision_at_10_std": 0.049548,
118
+ "nauc_precision_at_10_diff1": 0.20947,
119
+ "nauc_precision_at_20_max": 0.205997,
120
+ "nauc_precision_at_20_std": 0.077502,
121
+ "nauc_precision_at_20_diff1": 0.202645,
122
+ "nauc_precision_at_100_max": 0.209608,
123
+ "nauc_precision_at_100_std": 0.146291,
124
+ "nauc_precision_at_100_diff1": 0.124621,
125
+ "nauc_precision_at_1000_max": 0.249273,
126
+ "nauc_precision_at_1000_std": 0.194692,
127
+ "nauc_precision_at_1000_diff1": 0.027011,
128
+ "nauc_mrr_at_1_max": 0.241745,
129
+ "nauc_mrr_at_1_std": 0.019745,
130
+ "nauc_mrr_at_1_diff1": 0.428808,
131
+ "nauc_mrr_at_3_max": 0.220519,
132
+ "nauc_mrr_at_3_std": 0.020093,
133
+ "nauc_mrr_at_3_diff1": 0.34723,
134
+ "nauc_mrr_at_5_max": 0.219326,
135
+ "nauc_mrr_at_5_std": 0.024094,
136
+ "nauc_mrr_at_5_diff1": 0.337164,
137
+ "nauc_mrr_at_10_max": 0.216468,
138
+ "nauc_mrr_at_10_std": 0.027108,
139
+ "nauc_mrr_at_10_diff1": 0.330603,
140
+ "nauc_mrr_at_20_max": 0.216346,
141
+ "nauc_mrr_at_20_std": 0.029627,
142
+ "nauc_mrr_at_20_diff1": 0.330308,
143
+ "nauc_mrr_at_100_max": 0.217423,
144
+ "nauc_mrr_at_100_std": 0.03228,
145
+ "nauc_mrr_at_100_diff1": 0.328667,
146
+ "nauc_mrr_at_1000_max": 0.217947,
147
+ "nauc_mrr_at_1000_std": 0.032705,
148
+ "nauc_mrr_at_1000_diff1": 0.32861,
149
+ "hit_rate_at_1": 0.10392,
150
+ "hit_rate_at_3": 0.16655,
151
+ "hit_rate_at_5": 0.20165,
152
+ "hit_rate_at_10": 0.25052,
153
+ "hit_rate_at_20": 0.30489,
154
+ "hit_rate_at_100": 0.44357,
155
+ "hit_rate_at_1000": 0.68032,
156
+ "main_score": 0.15038,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 102.44514036178589,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackUnixRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6c6430d3a6d36f8d2a829195bc5dc94d7e063e53",
3
+ "task_name": "CQADupstackUnixRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.15019,
9
+ "ndcg_at_3": 0.17164,
10
+ "ndcg_at_5": 0.18569,
11
+ "ndcg_at_10": 0.20122,
12
+ "ndcg_at_20": 0.21533,
13
+ "ndcg_at_100": 0.24505,
14
+ "ndcg_at_1000": 0.27639,
15
+ "map_at_1": 0.13004,
16
+ "map_at_3": 0.15652,
17
+ "map_at_5": 0.16524,
18
+ "map_at_10": 0.17179,
19
+ "map_at_20": 0.17567,
20
+ "map_at_100": 0.17953,
21
+ "map_at_1000": 0.1807,
22
+ "recall_at_1": 0.13004,
23
+ "recall_at_3": 0.18757,
24
+ "recall_at_5": 0.22323,
25
+ "recall_at_10": 0.27001,
26
+ "recall_at_20": 0.32078,
27
+ "recall_at_100": 0.47425,
28
+ "recall_at_1000": 0.70188,
29
+ "accuracy": 0.13004,
30
+ "precision_at_1": 0.15019,
31
+ "precision_at_3": 0.07556,
32
+ "precision_at_5": 0.05466,
33
+ "precision_at_10": 0.03321,
34
+ "precision_at_20": 0.02043,
35
+ "precision_at_100": 0.0063,
36
+ "precision_at_1000": 0.00101,
37
+ "mrr_at_1": 0.150187,
38
+ "mrr_at_3": 0.179415,
39
+ "mrr_at_5": 0.188884,
40
+ "mrr_at_10": 0.195785,
41
+ "mrr_at_20": 0.20022,
42
+ "mrr_at_100": 0.203962,
43
+ "mrr_at_1000": 0.20486,
44
+ "nauc_ndcg_at_1_max": 0.295887,
45
+ "nauc_ndcg_at_1_std": 0.024441,
46
+ "nauc_ndcg_at_1_diff1": 0.504683,
47
+ "nauc_ndcg_at_3_max": 0.255057,
48
+ "nauc_ndcg_at_3_std": 0.016128,
49
+ "nauc_ndcg_at_3_diff1": 0.427623,
50
+ "nauc_ndcg_at_5_max": 0.245321,
51
+ "nauc_ndcg_at_5_std": 0.017741,
52
+ "nauc_ndcg_at_5_diff1": 0.402641,
53
+ "nauc_ndcg_at_10_max": 0.241334,
54
+ "nauc_ndcg_at_10_std": 0.020173,
55
+ "nauc_ndcg_at_10_diff1": 0.382823,
56
+ "nauc_ndcg_at_20_max": 0.23189,
57
+ "nauc_ndcg_at_20_std": 0.025136,
58
+ "nauc_ndcg_at_20_diff1": 0.373795,
59
+ "nauc_ndcg_at_100_max": 0.236115,
60
+ "nauc_ndcg_at_100_std": 0.048624,
61
+ "nauc_ndcg_at_100_diff1": 0.354373,
62
+ "nauc_ndcg_at_1000_max": 0.242996,
63
+ "nauc_ndcg_at_1000_std": 0.072082,
64
+ "nauc_ndcg_at_1000_diff1": 0.360578,
65
+ "nauc_map_at_1_max": 0.296592,
66
+ "nauc_map_at_1_std": 0.033435,
67
+ "nauc_map_at_1_diff1": 0.510726,
68
+ "nauc_map_at_3_max": 0.262789,
69
+ "nauc_map_at_3_std": 0.020979,
70
+ "nauc_map_at_3_diff1": 0.446106,
71
+ "nauc_map_at_5_max": 0.257133,
72
+ "nauc_map_at_5_std": 0.02085,
73
+ "nauc_map_at_5_diff1": 0.432868,
74
+ "nauc_map_at_10_max": 0.255559,
75
+ "nauc_map_at_10_std": 0.021831,
76
+ "nauc_map_at_10_diff1": 0.422418,
77
+ "nauc_map_at_20_max": 0.252836,
78
+ "nauc_map_at_20_std": 0.023204,
79
+ "nauc_map_at_20_diff1": 0.419583,
80
+ "nauc_map_at_100_max": 0.25273,
81
+ "nauc_map_at_100_std": 0.026771,
82
+ "nauc_map_at_100_diff1": 0.417189,
83
+ "nauc_map_at_1000_max": 0.252914,
84
+ "nauc_map_at_1000_std": 0.027998,
85
+ "nauc_map_at_1000_diff1": 0.417595,
86
+ "nauc_recall_at_1_max": 0.296592,
87
+ "nauc_recall_at_1_std": 0.033435,
88
+ "nauc_recall_at_1_diff1": 0.510726,
89
+ "nauc_recall_at_3_max": 0.230233,
90
+ "nauc_recall_at_3_std": 0.010357,
91
+ "nauc_recall_at_3_diff1": 0.380169,
92
+ "nauc_recall_at_5_max": 0.206671,
93
+ "nauc_recall_at_5_std": 0.012754,
94
+ "nauc_recall_at_5_diff1": 0.324246,
95
+ "nauc_recall_at_10_max": 0.202004,
96
+ "nauc_recall_at_10_std": 0.018669,
97
+ "nauc_recall_at_10_diff1": 0.283519,
98
+ "nauc_recall_at_20_max": 0.169138,
99
+ "nauc_recall_at_20_std": 0.035128,
100
+ "nauc_recall_at_20_diff1": 0.259788,
101
+ "nauc_recall_at_100_max": 0.184388,
102
+ "nauc_recall_at_100_std": 0.127428,
103
+ "nauc_recall_at_100_diff1": 0.166812,
104
+ "nauc_recall_at_1000_max": 0.22002,
105
+ "nauc_recall_at_1000_std": 0.327518,
106
+ "nauc_recall_at_1000_diff1": 0.149213,
107
+ "nauc_precision_at_1_max": 0.295887,
108
+ "nauc_precision_at_1_std": 0.024441,
109
+ "nauc_precision_at_1_diff1": 0.504683,
110
+ "nauc_precision_at_3_max": 0.233733,
111
+ "nauc_precision_at_3_std": 0.009191,
112
+ "nauc_precision_at_3_diff1": 0.378122,
113
+ "nauc_precision_at_5_max": 0.217276,
114
+ "nauc_precision_at_5_std": 0.016583,
115
+ "nauc_precision_at_5_diff1": 0.334232,
116
+ "nauc_precision_at_10_max": 0.208799,
117
+ "nauc_precision_at_10_std": 0.021306,
118
+ "nauc_precision_at_10_diff1": 0.285606,
119
+ "nauc_precision_at_20_max": 0.173435,
120
+ "nauc_precision_at_20_std": 0.027575,
121
+ "nauc_precision_at_20_diff1": 0.236211,
122
+ "nauc_precision_at_100_max": 0.181052,
123
+ "nauc_precision_at_100_std": 0.123593,
124
+ "nauc_precision_at_100_diff1": 0.132078,
125
+ "nauc_precision_at_1000_max": 0.137968,
126
+ "nauc_precision_at_1000_std": 0.186307,
127
+ "nauc_precision_at_1000_diff1": 0.049652,
128
+ "nauc_mrr_at_1_max": 0.295887,
129
+ "nauc_mrr_at_1_std": 0.024441,
130
+ "nauc_mrr_at_1_diff1": 0.504683,
131
+ "nauc_mrr_at_3_max": 0.261039,
132
+ "nauc_mrr_at_3_std": 0.014762,
133
+ "nauc_mrr_at_3_diff1": 0.437004,
134
+ "nauc_mrr_at_5_max": 0.256131,
135
+ "nauc_mrr_at_5_std": 0.017518,
136
+ "nauc_mrr_at_5_diff1": 0.421538,
137
+ "nauc_mrr_at_10_max": 0.251926,
138
+ "nauc_mrr_at_10_std": 0.017594,
139
+ "nauc_mrr_at_10_diff1": 0.41397,
140
+ "nauc_mrr_at_20_max": 0.250024,
141
+ "nauc_mrr_at_20_std": 0.019302,
142
+ "nauc_mrr_at_20_diff1": 0.410323,
143
+ "nauc_mrr_at_100_max": 0.250205,
144
+ "nauc_mrr_at_100_std": 0.022022,
145
+ "nauc_mrr_at_100_diff1": 0.40803,
146
+ "nauc_mrr_at_1000_max": 0.250445,
147
+ "nauc_mrr_at_1000_std": 0.022606,
148
+ "nauc_mrr_at_1000_diff1": 0.408473,
149
+ "hit_rate_at_1": 0.15019,
150
+ "hit_rate_at_3": 0.21735,
151
+ "hit_rate_at_5": 0.2584,
152
+ "hit_rate_at_10": 0.31063,
153
+ "hit_rate_at_20": 0.37313,
154
+ "hit_rate_at_100": 0.53731,
155
+ "hit_rate_at_1000": 0.7584,
156
+ "main_score": 0.20122,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 65.10625290870667,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWebmastersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "160c094312a0e1facb97e55eeddb698c0abe3571",
3
+ "task_name": "CQADupstackWebmastersRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.17984,
9
+ "ndcg_at_3": 0.20963,
10
+ "ndcg_at_5": 0.21771,
11
+ "ndcg_at_10": 0.23431,
12
+ "ndcg_at_20": 0.24894,
13
+ "ndcg_at_100": 0.28075,
14
+ "ndcg_at_1000": 0.31653,
15
+ "map_at_1": 0.14097,
16
+ "map_at_3": 0.1786,
17
+ "map_at_5": 0.18656,
18
+ "map_at_10": 0.19553,
19
+ "map_at_20": 0.20073,
20
+ "map_at_100": 0.2066,
21
+ "map_at_1000": 0.20849,
22
+ "recall_at_1": 0.14097,
23
+ "recall_at_3": 0.2178,
24
+ "recall_at_5": 0.24477,
25
+ "recall_at_10": 0.30146,
26
+ "recall_at_20": 0.3626,
27
+ "recall_at_100": 0.52116,
28
+ "recall_at_1000": 0.76987,
29
+ "accuracy": 0.14097,
30
+ "precision_at_1": 0.17984,
31
+ "precision_at_3": 0.10408,
32
+ "precision_at_5": 0.07312,
33
+ "precision_at_10": 0.04625,
34
+ "precision_at_20": 0.02935,
35
+ "precision_at_100": 0.0104,
36
+ "precision_at_1000": 0.00194,
37
+ "mrr_at_1": 0.179842,
38
+ "mrr_at_3": 0.221014,
39
+ "mrr_at_5": 0.226449,
40
+ "mrr_at_10": 0.233916,
41
+ "mrr_at_20": 0.238566,
42
+ "mrr_at_100": 0.242487,
43
+ "mrr_at_1000": 0.243357,
44
+ "nauc_ndcg_at_1_max": 0.144377,
45
+ "nauc_ndcg_at_1_std": 0.078266,
46
+ "nauc_ndcg_at_1_diff1": 0.419658,
47
+ "nauc_ndcg_at_3_max": 0.147711,
48
+ "nauc_ndcg_at_3_std": 0.089727,
49
+ "nauc_ndcg_at_3_diff1": 0.357338,
50
+ "nauc_ndcg_at_5_max": 0.144376,
51
+ "nauc_ndcg_at_5_std": 0.094025,
52
+ "nauc_ndcg_at_5_diff1": 0.355542,
53
+ "nauc_ndcg_at_10_max": 0.161196,
54
+ "nauc_ndcg_at_10_std": 0.116452,
55
+ "nauc_ndcg_at_10_diff1": 0.353378,
56
+ "nauc_ndcg_at_20_max": 0.160927,
57
+ "nauc_ndcg_at_20_std": 0.124363,
58
+ "nauc_ndcg_at_20_diff1": 0.344711,
59
+ "nauc_ndcg_at_100_max": 0.163613,
60
+ "nauc_ndcg_at_100_std": 0.149225,
61
+ "nauc_ndcg_at_100_diff1": 0.349877,
62
+ "nauc_ndcg_at_1000_max": 0.171515,
63
+ "nauc_ndcg_at_1000_std": 0.152387,
64
+ "nauc_ndcg_at_1000_diff1": 0.352065,
65
+ "nauc_map_at_1_max": 0.201885,
66
+ "nauc_map_at_1_std": 0.071668,
67
+ "nauc_map_at_1_diff1": 0.45064,
68
+ "nauc_map_at_3_max": 0.179531,
69
+ "nauc_map_at_3_std": 0.078203,
70
+ "nauc_map_at_3_diff1": 0.383444,
71
+ "nauc_map_at_5_max": 0.172926,
72
+ "nauc_map_at_5_std": 0.078868,
73
+ "nauc_map_at_5_diff1": 0.379343,
74
+ "nauc_map_at_10_max": 0.178678,
75
+ "nauc_map_at_10_std": 0.09182,
76
+ "nauc_map_at_10_diff1": 0.376758,
77
+ "nauc_map_at_20_max": 0.175365,
78
+ "nauc_map_at_20_std": 0.096662,
79
+ "nauc_map_at_20_diff1": 0.373267,
80
+ "nauc_map_at_100_max": 0.172257,
81
+ "nauc_map_at_100_std": 0.104442,
82
+ "nauc_map_at_100_diff1": 0.374396,
83
+ "nauc_map_at_1000_max": 0.171002,
84
+ "nauc_map_at_1000_std": 0.105661,
85
+ "nauc_map_at_1000_diff1": 0.374857,
86
+ "nauc_recall_at_1_max": 0.201885,
87
+ "nauc_recall_at_1_std": 0.071668,
88
+ "nauc_recall_at_1_diff1": 0.45064,
89
+ "nauc_recall_at_3_max": 0.154562,
90
+ "nauc_recall_at_3_std": 0.087258,
91
+ "nauc_recall_at_3_diff1": 0.315356,
92
+ "nauc_recall_at_5_max": 0.145894,
93
+ "nauc_recall_at_5_std": 0.095729,
94
+ "nauc_recall_at_5_diff1": 0.309377,
95
+ "nauc_recall_at_10_max": 0.180662,
96
+ "nauc_recall_at_10_std": 0.154184,
97
+ "nauc_recall_at_10_diff1": 0.291307,
98
+ "nauc_recall_at_20_max": 0.161037,
99
+ "nauc_recall_at_20_std": 0.175767,
100
+ "nauc_recall_at_20_diff1": 0.262627,
101
+ "nauc_recall_at_100_max": 0.15021,
102
+ "nauc_recall_at_100_std": 0.295734,
103
+ "nauc_recall_at_100_diff1": 0.297012,
104
+ "nauc_recall_at_1000_max": 0.232754,
105
+ "nauc_recall_at_1000_std": 0.414566,
106
+ "nauc_recall_at_1000_diff1": 0.305282,
107
+ "nauc_precision_at_1_max": 0.144377,
108
+ "nauc_precision_at_1_std": 0.078266,
109
+ "nauc_precision_at_1_diff1": 0.419658,
110
+ "nauc_precision_at_3_max": 0.089233,
111
+ "nauc_precision_at_3_std": 0.089445,
112
+ "nauc_precision_at_3_diff1": 0.274911,
113
+ "nauc_precision_at_5_max": 0.057212,
114
+ "nauc_precision_at_5_std": 0.100545,
115
+ "nauc_precision_at_5_diff1": 0.244603,
116
+ "nauc_precision_at_10_max": 0.048237,
117
+ "nauc_precision_at_10_std": 0.174783,
118
+ "nauc_precision_at_10_diff1": 0.225841,
119
+ "nauc_precision_at_20_max": 0.001042,
120
+ "nauc_precision_at_20_std": 0.230658,
121
+ "nauc_precision_at_20_diff1": 0.205943,
122
+ "nauc_precision_at_100_max": -0.096917,
123
+ "nauc_precision_at_100_std": 0.296257,
124
+ "nauc_precision_at_100_diff1": 0.184804,
125
+ "nauc_precision_at_1000_max": -0.163878,
126
+ "nauc_precision_at_1000_std": 0.146807,
127
+ "nauc_precision_at_1000_diff1": 0.048667,
128
+ "nauc_mrr_at_1_max": 0.144377,
129
+ "nauc_mrr_at_1_std": 0.078266,
130
+ "nauc_mrr_at_1_diff1": 0.419658,
131
+ "nauc_mrr_at_3_max": 0.132415,
132
+ "nauc_mrr_at_3_std": 0.076628,
133
+ "nauc_mrr_at_3_diff1": 0.354441,
134
+ "nauc_mrr_at_5_max": 0.133479,
135
+ "nauc_mrr_at_5_std": 0.079324,
136
+ "nauc_mrr_at_5_diff1": 0.354965,
137
+ "nauc_mrr_at_10_max": 0.137146,
138
+ "nauc_mrr_at_10_std": 0.087689,
139
+ "nauc_mrr_at_10_diff1": 0.353278,
140
+ "nauc_mrr_at_20_max": 0.135717,
141
+ "nauc_mrr_at_20_std": 0.090461,
142
+ "nauc_mrr_at_20_diff1": 0.350742,
143
+ "nauc_mrr_at_100_max": 0.13693,
144
+ "nauc_mrr_at_100_std": 0.092824,
145
+ "nauc_mrr_at_100_diff1": 0.351948,
146
+ "nauc_mrr_at_1000_max": 0.137214,
147
+ "nauc_mrr_at_1000_std": 0.09266,
148
+ "nauc_mrr_at_1000_diff1": 0.35192,
149
+ "hit_rate_at_1": 0.17984,
150
+ "hit_rate_at_3": 0.27075,
151
+ "hit_rate_at_5": 0.29447,
152
+ "hit_rate_at_10": 0.3498,
153
+ "hit_rate_at_20": 0.41897,
154
+ "hit_rate_at_100": 0.58696,
155
+ "hit_rate_at_1000": 0.81423,
156
+ "main_score": 0.23431,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 23.373626232147217,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWordpressRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4ffe81d471b1924886b33c7567bfb200e9eec5c4",
3
+ "task_name": "CQADupstackWordpressRetrieval",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.1183,
9
+ "ndcg_at_3": 0.14215,
10
+ "ndcg_at_5": 0.16196,
11
+ "ndcg_at_10": 0.17788,
12
+ "ndcg_at_20": 0.19259,
13
+ "ndcg_at_100": 0.22007,
14
+ "ndcg_at_1000": 0.25292,
15
+ "map_at_1": 0.10672,
16
+ "map_at_3": 0.13038,
17
+ "map_at_5": 0.14213,
18
+ "map_at_10": 0.14878,
19
+ "map_at_20": 0.15284,
20
+ "map_at_100": 0.15647,
21
+ "map_at_1000": 0.15754,
22
+ "recall_at_1": 0.10672,
23
+ "recall_at_3": 0.16069,
24
+ "recall_at_5": 0.20943,
25
+ "recall_at_10": 0.25786,
26
+ "recall_at_20": 0.31275,
27
+ "recall_at_100": 0.45988,
28
+ "recall_at_1000": 0.71337,
29
+ "accuracy": 0.10672,
30
+ "precision_at_1": 0.1183,
31
+ "precision_at_3": 0.05977,
32
+ "precision_at_5": 0.04806,
33
+ "precision_at_10": 0.02976,
34
+ "precision_at_20": 0.01839,
35
+ "precision_at_100": 0.00549,
36
+ "precision_at_1000": 0.00092,
37
+ "mrr_at_1": 0.118299,
38
+ "mrr_at_3": 0.144177,
39
+ "mrr_at_5": 0.154621,
40
+ "mrr_at_10": 0.160999,
41
+ "mrr_at_20": 0.165328,
42
+ "mrr_at_100": 0.168711,
43
+ "mrr_at_1000": 0.169728,
44
+ "nauc_ndcg_at_1_max": 0.164695,
45
+ "nauc_ndcg_at_1_std": 0.032175,
46
+ "nauc_ndcg_at_1_diff1": 0.372785,
47
+ "nauc_ndcg_at_3_max": 0.164043,
48
+ "nauc_ndcg_at_3_std": 0.026037,
49
+ "nauc_ndcg_at_3_diff1": 0.339874,
50
+ "nauc_ndcg_at_5_max": 0.187577,
51
+ "nauc_ndcg_at_5_std": 0.021521,
52
+ "nauc_ndcg_at_5_diff1": 0.319884,
53
+ "nauc_ndcg_at_10_max": 0.182457,
54
+ "nauc_ndcg_at_10_std": 0.003809,
55
+ "nauc_ndcg_at_10_diff1": 0.307408,
56
+ "nauc_ndcg_at_20_max": 0.171887,
57
+ "nauc_ndcg_at_20_std": 0.005491,
58
+ "nauc_ndcg_at_20_diff1": 0.287707,
59
+ "nauc_ndcg_at_100_max": 0.158196,
60
+ "nauc_ndcg_at_100_std": 0.038635,
61
+ "nauc_ndcg_at_100_diff1": 0.285936,
62
+ "nauc_ndcg_at_1000_max": 0.170112,
63
+ "nauc_ndcg_at_1000_std": 0.057197,
64
+ "nauc_ndcg_at_1000_diff1": 0.29653,
65
+ "nauc_map_at_1_max": 0.140964,
66
+ "nauc_map_at_1_std": 0.014934,
67
+ "nauc_map_at_1_diff1": 0.389536,
68
+ "nauc_map_at_3_max": 0.153776,
69
+ "nauc_map_at_3_std": 0.018749,
70
+ "nauc_map_at_3_diff1": 0.351528,
71
+ "nauc_map_at_5_max": 0.16811,
72
+ "nauc_map_at_5_std": 0.017556,
73
+ "nauc_map_at_5_diff1": 0.337622,
74
+ "nauc_map_at_10_max": 0.165728,
75
+ "nauc_map_at_10_std": 0.010531,
76
+ "nauc_map_at_10_diff1": 0.331862,
77
+ "nauc_map_at_20_max": 0.162785,
78
+ "nauc_map_at_20_std": 0.0116,
79
+ "nauc_map_at_20_diff1": 0.325377,
80
+ "nauc_map_at_100_max": 0.16118,
81
+ "nauc_map_at_100_std": 0.017533,
82
+ "nauc_map_at_100_diff1": 0.325035,
83
+ "nauc_map_at_1000_max": 0.161753,
84
+ "nauc_map_at_1000_std": 0.018288,
85
+ "nauc_map_at_1000_diff1": 0.325424,
86
+ "nauc_recall_at_1_max": 0.140964,
87
+ "nauc_recall_at_1_std": 0.014934,
88
+ "nauc_recall_at_1_diff1": 0.389536,
89
+ "nauc_recall_at_3_max": 0.168015,
90
+ "nauc_recall_at_3_std": 0.016617,
91
+ "nauc_recall_at_3_diff1": 0.31809,
92
+ "nauc_recall_at_5_max": 0.21496,
93
+ "nauc_recall_at_5_std": 0.010094,
94
+ "nauc_recall_at_5_diff1": 0.276185,
95
+ "nauc_recall_at_10_max": 0.198277,
96
+ "nauc_recall_at_10_std": -0.031114,
97
+ "nauc_recall_at_10_diff1": 0.249145,
98
+ "nauc_recall_at_20_max": 0.164474,
99
+ "nauc_recall_at_20_std": -0.02835,
100
+ "nauc_recall_at_20_diff1": 0.195307,
101
+ "nauc_recall_at_100_max": 0.110713,
102
+ "nauc_recall_at_100_std": 0.091003,
103
+ "nauc_recall_at_100_diff1": 0.191395,
104
+ "nauc_recall_at_1000_max": 0.179606,
105
+ "nauc_recall_at_1000_std": 0.256015,
106
+ "nauc_recall_at_1000_diff1": 0.238062,
107
+ "nauc_precision_at_1_max": 0.164695,
108
+ "nauc_precision_at_1_std": 0.032175,
109
+ "nauc_precision_at_1_diff1": 0.372785,
110
+ "nauc_precision_at_3_max": 0.186435,
111
+ "nauc_precision_at_3_std": 0.042281,
112
+ "nauc_precision_at_3_diff1": 0.315545,
113
+ "nauc_precision_at_5_max": 0.235141,
114
+ "nauc_precision_at_5_std": 0.045026,
115
+ "nauc_precision_at_5_diff1": 0.26898,
116
+ "nauc_precision_at_10_max": 0.223709,
117
+ "nauc_precision_at_10_std": 0.002352,
118
+ "nauc_precision_at_10_diff1": 0.23372,
119
+ "nauc_precision_at_20_max": 0.186954,
120
+ "nauc_precision_at_20_std": 0.008967,
121
+ "nauc_precision_at_20_diff1": 0.178719,
122
+ "nauc_precision_at_100_max": 0.107271,
123
+ "nauc_precision_at_100_std": 0.131235,
124
+ "nauc_precision_at_100_diff1": 0.163608,
125
+ "nauc_precision_at_1000_max": 0.05915,
126
+ "nauc_precision_at_1000_std": 0.164467,
127
+ "nauc_precision_at_1000_diff1": 0.103451,
128
+ "nauc_mrr_at_1_max": 0.164695,
129
+ "nauc_mrr_at_1_std": 0.032175,
130
+ "nauc_mrr_at_1_diff1": 0.372785,
131
+ "nauc_mrr_at_3_max": 0.172279,
132
+ "nauc_mrr_at_3_std": 0.037802,
133
+ "nauc_mrr_at_3_diff1": 0.34134,
134
+ "nauc_mrr_at_5_max": 0.189396,
135
+ "nauc_mrr_at_5_std": 0.03853,
136
+ "nauc_mrr_at_5_diff1": 0.33312,
137
+ "nauc_mrr_at_10_max": 0.187599,
138
+ "nauc_mrr_at_10_std": 0.030987,
139
+ "nauc_mrr_at_10_diff1": 0.327884,
140
+ "nauc_mrr_at_20_max": 0.184218,
141
+ "nauc_mrr_at_20_std": 0.031971,
142
+ "nauc_mrr_at_20_diff1": 0.322255,
143
+ "nauc_mrr_at_100_max": 0.181757,
144
+ "nauc_mrr_at_100_std": 0.036436,
145
+ "nauc_mrr_at_100_diff1": 0.321251,
146
+ "nauc_mrr_at_1000_max": 0.182277,
147
+ "nauc_mrr_at_1000_std": 0.037031,
148
+ "nauc_mrr_at_1000_diff1": 0.321696,
149
+ "hit_rate_at_1": 0.1183,
150
+ "hit_rate_at_3": 0.17745,
151
+ "hit_rate_at_5": 0.22551,
152
+ "hit_rate_at_10": 0.27542,
153
+ "hit_rate_at_20": 0.33826,
154
+ "hit_rate_at_100": 0.48614,
155
+ "hit_rate_at_1000": 0.74861,
156
+ "main_score": 0.17788,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 64.67129135131836,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ClimateFEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "47f2ac6acb640fc46020b02a5b59fdda04d39380",
3
+ "task_name": "ClimateFEVER",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.18697,
9
+ "ndcg_at_3": 0.16559,
10
+ "ndcg_at_5": 0.18201,
11
+ "ndcg_at_10": 0.20597,
12
+ "ndcg_at_20": 0.22664,
13
+ "ndcg_at_100": 0.26689,
14
+ "ndcg_at_1000": 0.30112,
15
+ "map_at_1": 0.08073,
16
+ "map_at_3": 0.1171,
17
+ "map_at_5": 0.13045,
18
+ "map_at_10": 0.1405,
19
+ "map_at_20": 0.14697,
20
+ "map_at_100": 0.15434,
21
+ "map_at_1000": 0.15602,
22
+ "recall_at_1": 0.08073,
23
+ "recall_at_3": 0.15425,
24
+ "recall_at_5": 0.19965,
25
+ "recall_at_10": 0.25342,
26
+ "recall_at_20": 0.31314,
27
+ "recall_at_100": 0.46793,
28
+ "recall_at_1000": 0.6633,
29
+ "accuracy": 0.08073,
30
+ "precision_at_1": 0.18697,
31
+ "precision_at_3": 0.12421,
32
+ "precision_at_5": 0.0985,
33
+ "precision_at_10": 0.06502,
34
+ "precision_at_20": 0.04088,
35
+ "precision_at_100": 0.01294,
36
+ "precision_at_1000": 0.00192,
37
+ "mrr_at_1": 0.186971,
38
+ "mrr_at_3": 0.253855,
39
+ "mrr_at_5": 0.272584,
40
+ "mrr_at_10": 0.284693,
41
+ "mrr_at_20": 0.29073,
42
+ "mrr_at_100": 0.294483,
43
+ "mrr_at_1000": 0.295059,
44
+ "nauc_ndcg_at_1_max": 0.353095,
45
+ "nauc_ndcg_at_1_std": 0.146239,
46
+ "nauc_ndcg_at_1_diff1": 0.228305,
47
+ "nauc_ndcg_at_3_max": 0.347067,
48
+ "nauc_ndcg_at_3_std": 0.169276,
49
+ "nauc_ndcg_at_3_diff1": 0.206914,
50
+ "nauc_ndcg_at_5_max": 0.365993,
51
+ "nauc_ndcg_at_5_std": 0.193369,
52
+ "nauc_ndcg_at_5_diff1": 0.196676,
53
+ "nauc_ndcg_at_10_max": 0.388276,
54
+ "nauc_ndcg_at_10_std": 0.23325,
55
+ "nauc_ndcg_at_10_diff1": 0.185393,
56
+ "nauc_ndcg_at_20_max": 0.409102,
57
+ "nauc_ndcg_at_20_std": 0.260173,
58
+ "nauc_ndcg_at_20_diff1": 0.175124,
59
+ "nauc_ndcg_at_100_max": 0.420723,
60
+ "nauc_ndcg_at_100_std": 0.299366,
61
+ "nauc_ndcg_at_100_diff1": 0.158679,
62
+ "nauc_ndcg_at_1000_max": 0.430437,
63
+ "nauc_ndcg_at_1000_std": 0.314539,
64
+ "nauc_ndcg_at_1000_diff1": 0.158098,
65
+ "nauc_map_at_1_max": 0.345192,
66
+ "nauc_map_at_1_std": 0.089418,
67
+ "nauc_map_at_1_diff1": 0.304948,
68
+ "nauc_map_at_3_max": 0.335206,
69
+ "nauc_map_at_3_std": 0.133713,
70
+ "nauc_map_at_3_diff1": 0.242392,
71
+ "nauc_map_at_5_max": 0.344956,
72
+ "nauc_map_at_5_std": 0.151971,
73
+ "nauc_map_at_5_diff1": 0.229255,
74
+ "nauc_map_at_10_max": 0.358584,
75
+ "nauc_map_at_10_std": 0.174138,
76
+ "nauc_map_at_10_diff1": 0.2245,
77
+ "nauc_map_at_20_max": 0.369026,
78
+ "nauc_map_at_20_std": 0.187452,
79
+ "nauc_map_at_20_diff1": 0.218633,
80
+ "nauc_map_at_100_max": 0.374847,
81
+ "nauc_map_at_100_std": 0.200619,
82
+ "nauc_map_at_100_diff1": 0.212997,
83
+ "nauc_map_at_1000_max": 0.375553,
84
+ "nauc_map_at_1000_std": 0.202233,
85
+ "nauc_map_at_1000_diff1": 0.212706,
86
+ "nauc_recall_at_1_max": 0.345192,
87
+ "nauc_recall_at_1_std": 0.089418,
88
+ "nauc_recall_at_1_diff1": 0.304948,
89
+ "nauc_recall_at_3_max": 0.324329,
90
+ "nauc_recall_at_3_std": 0.169632,
91
+ "nauc_recall_at_3_diff1": 0.187597,
92
+ "nauc_recall_at_5_max": 0.335357,
93
+ "nauc_recall_at_5_std": 0.201046,
94
+ "nauc_recall_at_5_diff1": 0.15094,
95
+ "nauc_recall_at_10_max": 0.368034,
96
+ "nauc_recall_at_10_std": 0.272405,
97
+ "nauc_recall_at_10_diff1": 0.124281,
98
+ "nauc_recall_at_20_max": 0.395196,
99
+ "nauc_recall_at_20_std": 0.318697,
100
+ "nauc_recall_at_20_diff1": 0.093192,
101
+ "nauc_recall_at_100_max": 0.394402,
102
+ "nauc_recall_at_100_std": 0.395816,
103
+ "nauc_recall_at_100_diff1": 0.042315,
104
+ "nauc_recall_at_1000_max": 0.426842,
105
+ "nauc_recall_at_1000_std": 0.473,
106
+ "nauc_recall_at_1000_diff1": 0.016986,
107
+ "nauc_precision_at_1_max": 0.353095,
108
+ "nauc_precision_at_1_std": 0.146239,
109
+ "nauc_precision_at_1_diff1": 0.228305,
110
+ "nauc_precision_at_3_max": 0.370099,
111
+ "nauc_precision_at_3_std": 0.244677,
112
+ "nauc_precision_at_3_diff1": 0.141836,
113
+ "nauc_precision_at_5_max": 0.396618,
114
+ "nauc_precision_at_5_std": 0.284621,
115
+ "nauc_precision_at_5_diff1": 0.116768,
116
+ "nauc_precision_at_10_max": 0.412025,
117
+ "nauc_precision_at_10_std": 0.345366,
118
+ "nauc_precision_at_10_diff1": 0.082739,
119
+ "nauc_precision_at_20_max": 0.428439,
120
+ "nauc_precision_at_20_std": 0.380999,
121
+ "nauc_precision_at_20_diff1": 0.054982,
122
+ "nauc_precision_at_100_max": 0.351898,
123
+ "nauc_precision_at_100_std": 0.406904,
124
+ "nauc_precision_at_100_diff1": -0.010981,
125
+ "nauc_precision_at_1000_max": 0.293857,
126
+ "nauc_precision_at_1000_std": 0.383661,
127
+ "nauc_precision_at_1000_diff1": -0.030317,
128
+ "nauc_mrr_at_1_max": 0.353095,
129
+ "nauc_mrr_at_1_std": 0.146239,
130
+ "nauc_mrr_at_1_diff1": 0.228305,
131
+ "nauc_mrr_at_3_max": 0.361891,
132
+ "nauc_mrr_at_3_std": 0.19461,
133
+ "nauc_mrr_at_3_diff1": 0.184396,
134
+ "nauc_mrr_at_5_max": 0.37419,
135
+ "nauc_mrr_at_5_std": 0.206867,
136
+ "nauc_mrr_at_5_diff1": 0.182821,
137
+ "nauc_mrr_at_10_max": 0.380752,
138
+ "nauc_mrr_at_10_std": 0.220569,
139
+ "nauc_mrr_at_10_diff1": 0.176974,
140
+ "nauc_mrr_at_20_max": 0.383531,
141
+ "nauc_mrr_at_20_std": 0.223885,
142
+ "nauc_mrr_at_20_diff1": 0.177413,
143
+ "nauc_mrr_at_100_max": 0.382764,
144
+ "nauc_mrr_at_100_std": 0.223891,
145
+ "nauc_mrr_at_100_diff1": 0.178051,
146
+ "nauc_mrr_at_1000_max": 0.382749,
147
+ "nauc_mrr_at_1000_std": 0.223656,
148
+ "nauc_mrr_at_1000_diff1": 0.178182,
149
+ "hit_rate_at_1": 0.18697,
150
+ "hit_rate_at_3": 0.34267,
151
+ "hit_rate_at_5": 0.4241,
152
+ "hit_rate_at_10": 0.51596,
153
+ "hit_rate_at_20": 0.6013,
154
+ "hit_rate_at_100": 0.74267,
155
+ "hit_rate_at_1000": 0.87622,
156
+ "main_score": 0.20597,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 6545.029442071915,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/DBPedia.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c0f706b76e590d620bd6618b3ca8efdd34e2d659",
3
+ "task_name": "DBPedia",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.3725,
9
+ "ndcg_at_3": 0.30834,
10
+ "ndcg_at_5": 0.28779,
11
+ "ndcg_at_10": 0.27271,
12
+ "ndcg_at_20": 0.2639,
13
+ "ndcg_at_100": 0.28897,
14
+ "ndcg_at_1000": 0.34526,
15
+ "map_at_1": 0.06115,
16
+ "map_at_3": 0.08968,
17
+ "map_at_5": 0.10075,
18
+ "map_at_10": 0.11948,
19
+ "map_at_20": 0.13521,
20
+ "map_at_100": 0.15823,
21
+ "map_at_1000": 0.16634,
22
+ "recall_at_1": 0.06115,
23
+ "recall_at_3": 0.10106,
24
+ "recall_at_5": 0.1226,
25
+ "recall_at_10": 0.16469,
26
+ "recall_at_20": 0.20851,
27
+ "recall_at_100": 0.32697,
28
+ "recall_at_1000": 0.51643,
29
+ "accuracy": 0.06115,
30
+ "precision_at_1": 0.48,
31
+ "precision_at_3": 0.3425,
32
+ "precision_at_5": 0.286,
33
+ "precision_at_10": 0.22275,
34
+ "precision_at_20": 0.16325,
35
+ "precision_at_100": 0.0645,
36
+ "precision_at_1000": 0.01334,
37
+ "mrr_at_1": 0.48,
38
+ "mrr_at_3": 0.540833,
39
+ "mrr_at_5": 0.558083,
40
+ "mrr_at_10": 0.568289,
41
+ "mrr_at_20": 0.571822,
42
+ "mrr_at_100": 0.573882,
43
+ "mrr_at_1000": 0.574185,
44
+ "nauc_ndcg_at_1_max": 0.497154,
45
+ "nauc_ndcg_at_1_std": 0.203398,
46
+ "nauc_ndcg_at_1_diff1": 0.41684,
47
+ "nauc_ndcg_at_3_max": 0.523999,
48
+ "nauc_ndcg_at_3_std": 0.254433,
49
+ "nauc_ndcg_at_3_diff1": 0.339721,
50
+ "nauc_ndcg_at_5_max": 0.528254,
51
+ "nauc_ndcg_at_5_std": 0.282954,
52
+ "nauc_ndcg_at_5_diff1": 0.331557,
53
+ "nauc_ndcg_at_10_max": 0.508174,
54
+ "nauc_ndcg_at_10_std": 0.323317,
55
+ "nauc_ndcg_at_10_diff1": 0.311586,
56
+ "nauc_ndcg_at_20_max": 0.472542,
57
+ "nauc_ndcg_at_20_std": 0.331857,
58
+ "nauc_ndcg_at_20_diff1": 0.310803,
59
+ "nauc_ndcg_at_100_max": 0.483727,
60
+ "nauc_ndcg_at_100_std": 0.376655,
61
+ "nauc_ndcg_at_100_diff1": 0.310534,
62
+ "nauc_ndcg_at_1000_max": 0.533149,
63
+ "nauc_ndcg_at_1000_std": 0.427618,
64
+ "nauc_ndcg_at_1000_diff1": 0.303498,
65
+ "nauc_map_at_1_max": 0.137515,
66
+ "nauc_map_at_1_std": -0.03184,
67
+ "nauc_map_at_1_diff1": 0.482017,
68
+ "nauc_map_at_3_max": 0.190529,
69
+ "nauc_map_at_3_std": 0.013607,
70
+ "nauc_map_at_3_diff1": 0.437601,
71
+ "nauc_map_at_5_max": 0.227464,
72
+ "nauc_map_at_5_std": 0.058463,
73
+ "nauc_map_at_5_diff1": 0.416488,
74
+ "nauc_map_at_10_max": 0.279595,
75
+ "nauc_map_at_10_std": 0.143227,
76
+ "nauc_map_at_10_diff1": 0.374785,
77
+ "nauc_map_at_20_max": 0.332488,
78
+ "nauc_map_at_20_std": 0.219732,
79
+ "nauc_map_at_20_diff1": 0.341715,
80
+ "nauc_map_at_100_max": 0.396727,
81
+ "nauc_map_at_100_std": 0.317097,
82
+ "nauc_map_at_100_diff1": 0.292806,
83
+ "nauc_map_at_1000_max": 0.410874,
84
+ "nauc_map_at_1000_std": 0.335245,
85
+ "nauc_map_at_1000_diff1": 0.284577,
86
+ "nauc_recall_at_1_max": 0.137515,
87
+ "nauc_recall_at_1_std": -0.03184,
88
+ "nauc_recall_at_1_diff1": 0.482017,
89
+ "nauc_recall_at_3_max": 0.165529,
90
+ "nauc_recall_at_3_std": 0.002229,
91
+ "nauc_recall_at_3_diff1": 0.398718,
92
+ "nauc_recall_at_5_max": 0.182718,
93
+ "nauc_recall_at_5_std": 0.057512,
94
+ "nauc_recall_at_5_diff1": 0.351884,
95
+ "nauc_recall_at_10_max": 0.212233,
96
+ "nauc_recall_at_10_std": 0.148399,
97
+ "nauc_recall_at_10_diff1": 0.290395,
98
+ "nauc_recall_at_20_max": 0.26943,
99
+ "nauc_recall_at_20_std": 0.238377,
100
+ "nauc_recall_at_20_diff1": 0.267939,
101
+ "nauc_recall_at_100_max": 0.38184,
102
+ "nauc_recall_at_100_std": 0.377296,
103
+ "nauc_recall_at_100_diff1": 0.216022,
104
+ "nauc_recall_at_1000_max": 0.422032,
105
+ "nauc_recall_at_1000_std": 0.458662,
106
+ "nauc_recall_at_1000_diff1": 0.198624,
107
+ "nauc_precision_at_1_max": 0.567101,
108
+ "nauc_precision_at_1_std": 0.258366,
109
+ "nauc_precision_at_1_diff1": 0.45144,
110
+ "nauc_precision_at_3_max": 0.571874,
111
+ "nauc_precision_at_3_std": 0.326545,
112
+ "nauc_precision_at_3_diff1": 0.213306,
113
+ "nauc_precision_at_5_max": 0.569856,
114
+ "nauc_precision_at_5_std": 0.383785,
115
+ "nauc_precision_at_5_diff1": 0.142487,
116
+ "nauc_precision_at_10_max": 0.562835,
117
+ "nauc_precision_at_10_std": 0.48185,
118
+ "nauc_precision_at_10_diff1": 0.052277,
119
+ "nauc_precision_at_20_max": 0.548349,
120
+ "nauc_precision_at_20_std": 0.521195,
121
+ "nauc_precision_at_20_diff1": -0.009725,
122
+ "nauc_precision_at_100_max": 0.476447,
123
+ "nauc_precision_at_100_std": 0.502125,
124
+ "nauc_precision_at_100_diff1": -0.084779,
125
+ "nauc_precision_at_1000_max": 0.240115,
126
+ "nauc_precision_at_1000_std": 0.271573,
127
+ "nauc_precision_at_1000_diff1": -0.170055,
128
+ "nauc_mrr_at_1_max": 0.567101,
129
+ "nauc_mrr_at_1_std": 0.258366,
130
+ "nauc_mrr_at_1_diff1": 0.45144,
131
+ "nauc_mrr_at_3_max": 0.589796,
132
+ "nauc_mrr_at_3_std": 0.271793,
133
+ "nauc_mrr_at_3_diff1": 0.442743,
134
+ "nauc_mrr_at_5_max": 0.592412,
135
+ "nauc_mrr_at_5_std": 0.282811,
136
+ "nauc_mrr_at_5_diff1": 0.439699,
137
+ "nauc_mrr_at_10_max": 0.596289,
138
+ "nauc_mrr_at_10_std": 0.287586,
139
+ "nauc_mrr_at_10_diff1": 0.441186,
140
+ "nauc_mrr_at_20_max": 0.59656,
141
+ "nauc_mrr_at_20_std": 0.288992,
142
+ "nauc_mrr_at_20_diff1": 0.440401,
143
+ "nauc_mrr_at_100_max": 0.596245,
144
+ "nauc_mrr_at_100_std": 0.288647,
145
+ "nauc_mrr_at_100_diff1": 0.439713,
146
+ "nauc_mrr_at_1000_max": 0.596121,
147
+ "nauc_mrr_at_1000_std": 0.288472,
148
+ "nauc_mrr_at_1000_diff1": 0.439796,
149
+ "hit_rate_at_1": 0.48,
150
+ "hit_rate_at_3": 0.6125,
151
+ "hit_rate_at_5": 0.69,
152
+ "hit_rate_at_10": 0.7625,
153
+ "hit_rate_at_20": 0.8125,
154
+ "hit_rate_at_100": 0.895,
155
+ "hit_rate_at_1000": 0.9575,
156
+ "main_score": 0.27271,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 5195.377886295319,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/EmotionClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4f58c6b202a23cf9a4da393831edf4f9183cad37",
3
+ "task_name": "EmotionClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.356,
11
+ "f1": 0.313884,
12
+ "f1_weighted": 0.374535,
13
+ "precision": 0.33121,
14
+ "precision_weighted": 0.452348,
15
+ "recall": 0.361089,
16
+ "recall_weighted": 0.356,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.354,
22
+ "f1": 0.320946,
23
+ "f1_weighted": 0.366868,
24
+ "precision": 0.334794,
25
+ "precision_weighted": 0.444668,
26
+ "recall": 0.381024,
27
+ "recall_weighted": 0.354,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.3635,
33
+ "f1": 0.327014,
34
+ "f1_weighted": 0.386831,
35
+ "precision": 0.34005,
36
+ "precision_weighted": 0.456482,
37
+ "recall": 0.374576,
38
+ "recall_weighted": 0.3635,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.3785,
44
+ "f1": 0.331215,
45
+ "f1_weighted": 0.407928,
46
+ "precision": 0.350318,
47
+ "precision_weighted": 0.484867,
48
+ "recall": 0.374447,
49
+ "recall_weighted": 0.3785,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.393,
55
+ "f1": 0.353224,
56
+ "f1_weighted": 0.4149,
57
+ "precision": 0.361956,
58
+ "precision_weighted": 0.480519,
59
+ "recall": 0.400388,
60
+ "recall_weighted": 0.393,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.3605,
66
+ "f1": 0.325991,
67
+ "f1_weighted": 0.38654,
68
+ "precision": 0.344931,
69
+ "precision_weighted": 0.470827,
70
+ "recall": 0.376869,
71
+ "recall_weighted": 0.3605,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.354,
77
+ "f1": 0.317711,
78
+ "f1_weighted": 0.372707,
79
+ "precision": 0.332454,
80
+ "precision_weighted": 0.449806,
81
+ "recall": 0.375164,
82
+ "recall_weighted": 0.354,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.308,
88
+ "f1": 0.287969,
89
+ "f1_weighted": 0.321515,
90
+ "precision": 0.302838,
91
+ "precision_weighted": 0.402776,
92
+ "recall": 0.34963,
93
+ "recall_weighted": 0.308,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.3655,
99
+ "f1": 0.32421,
100
+ "f1_weighted": 0.388026,
101
+ "precision": 0.338831,
102
+ "precision_weighted": 0.467192,
103
+ "recall": 0.368127,
104
+ "recall_weighted": 0.3655,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3645,
110
+ "f1": 0.324163,
111
+ "f1_weighted": 0.390604,
112
+ "precision": 0.345343,
113
+ "precision_weighted": 0.481376,
114
+ "recall": 0.375761,
115
+ "recall_weighted": 0.3645,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.35975,
121
+ "f1": 0.322633,
122
+ "f1_weighted": 0.381045,
123
+ "precision": 0.338272,
124
+ "precision_weighted": 0.459086,
125
+ "recall": 0.373707,
126
+ "recall_weighted": 0.35975,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.35975,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 14.806202173233032,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/FEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "bea83ef9e8fb933d90a2f1d5515737465d613e12",
3
+ "task_name": "FEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.51305,
9
+ "ndcg_at_3": 0.58855,
10
+ "ndcg_at_5": 0.60905,
11
+ "ndcg_at_10": 0.62893,
12
+ "ndcg_at_20": 0.64157,
13
+ "ndcg_at_100": 0.65385,
14
+ "ndcg_at_1000": 0.66062,
15
+ "map_at_1": 0.47477,
16
+ "map_at_3": 0.55382,
17
+ "map_at_5": 0.56584,
18
+ "map_at_10": 0.57433,
19
+ "map_at_20": 0.57798,
20
+ "map_at_100": 0.57976,
21
+ "map_at_1000": 0.58002,
22
+ "recall_at_1": 0.47477,
23
+ "recall_at_3": 0.64738,
24
+ "recall_at_5": 0.69823,
25
+ "recall_at_10": 0.75902,
26
+ "recall_at_20": 0.80724,
27
+ "recall_at_100": 0.87108,
28
+ "recall_at_1000": 0.92253,
29
+ "accuracy": 0.47477,
30
+ "precision_at_1": 0.51305,
31
+ "precision_at_3": 0.23472,
32
+ "precision_at_5": 0.15206,
33
+ "precision_at_10": 0.08284,
34
+ "precision_at_20": 0.04414,
35
+ "precision_at_100": 0.00958,
36
+ "precision_at_1000": 0.00103,
37
+ "mrr_at_1": 0.513051,
38
+ "mrr_at_3": 0.59521,
39
+ "mrr_at_5": 0.607376,
40
+ "mrr_at_10": 0.615719,
41
+ "mrr_at_20": 0.619156,
42
+ "mrr_at_100": 0.620734,
43
+ "mrr_at_1000": 0.62092,
44
+ "nauc_ndcg_at_1_max": 0.297627,
45
+ "nauc_ndcg_at_1_std": -0.024369,
46
+ "nauc_ndcg_at_1_diff1": 0.575845,
47
+ "nauc_ndcg_at_3_max": 0.338889,
48
+ "nauc_ndcg_at_3_std": 0.024107,
49
+ "nauc_ndcg_at_3_diff1": 0.49428,
50
+ "nauc_ndcg_at_5_max": 0.344122,
51
+ "nauc_ndcg_at_5_std": 0.033029,
52
+ "nauc_ndcg_at_5_diff1": 0.48792,
53
+ "nauc_ndcg_at_10_max": 0.351964,
54
+ "nauc_ndcg_at_10_std": 0.049084,
55
+ "nauc_ndcg_at_10_diff1": 0.485923,
56
+ "nauc_ndcg_at_20_max": 0.354029,
57
+ "nauc_ndcg_at_20_std": 0.056291,
58
+ "nauc_ndcg_at_20_diff1": 0.485283,
59
+ "nauc_ndcg_at_100_max": 0.35023,
60
+ "nauc_ndcg_at_100_std": 0.055537,
61
+ "nauc_ndcg_at_100_diff1": 0.488315,
62
+ "nauc_ndcg_at_1000_max": 0.347499,
63
+ "nauc_ndcg_at_1000_std": 0.050659,
64
+ "nauc_ndcg_at_1000_diff1": 0.491225,
65
+ "nauc_map_at_1_max": 0.273378,
66
+ "nauc_map_at_1_std": -0.01573,
67
+ "nauc_map_at_1_diff1": 0.536779,
68
+ "nauc_map_at_3_max": 0.317032,
69
+ "nauc_map_at_3_std": 0.015589,
70
+ "nauc_map_at_3_diff1": 0.498439,
71
+ "nauc_map_at_5_max": 0.319895,
72
+ "nauc_map_at_5_std": 0.020102,
73
+ "nauc_map_at_5_diff1": 0.495492,
74
+ "nauc_map_at_10_max": 0.32294,
75
+ "nauc_map_at_10_std": 0.026048,
76
+ "nauc_map_at_10_diff1": 0.495111,
77
+ "nauc_map_at_20_max": 0.323397,
78
+ "nauc_map_at_20_std": 0.027703,
79
+ "nauc_map_at_20_diff1": 0.495098,
80
+ "nauc_map_at_100_max": 0.322927,
81
+ "nauc_map_at_100_std": 0.027525,
82
+ "nauc_map_at_100_diff1": 0.495536,
83
+ "nauc_map_at_1000_max": 0.322873,
84
+ "nauc_map_at_1000_std": 0.027401,
85
+ "nauc_map_at_1000_diff1": 0.495637,
86
+ "nauc_recall_at_1_max": 0.273378,
87
+ "nauc_recall_at_1_std": -0.01573,
88
+ "nauc_recall_at_1_diff1": 0.536779,
89
+ "nauc_recall_at_3_max": 0.361206,
90
+ "nauc_recall_at_3_std": 0.058503,
91
+ "nauc_recall_at_3_diff1": 0.423484,
92
+ "nauc_recall_at_5_max": 0.376141,
93
+ "nauc_recall_at_5_std": 0.085278,
94
+ "nauc_recall_at_5_diff1": 0.393905,
95
+ "nauc_recall_at_10_max": 0.406727,
96
+ "nauc_recall_at_10_std": 0.156474,
97
+ "nauc_recall_at_10_diff1": 0.362355,
98
+ "nauc_recall_at_20_max": 0.423185,
99
+ "nauc_recall_at_20_std": 0.218353,
100
+ "nauc_recall_at_20_diff1": 0.326159,
101
+ "nauc_recall_at_100_max": 0.40197,
102
+ "nauc_recall_at_100_std": 0.279232,
103
+ "nauc_recall_at_100_diff1": 0.284107,
104
+ "nauc_recall_at_1000_max": 0.35688,
105
+ "nauc_recall_at_1000_std": 0.284323,
106
+ "nauc_recall_at_1000_diff1": 0.237602,
107
+ "nauc_precision_at_1_max": 0.297627,
108
+ "nauc_precision_at_1_std": -0.024369,
109
+ "nauc_precision_at_1_diff1": 0.575845,
110
+ "nauc_precision_at_3_max": 0.415345,
111
+ "nauc_precision_at_3_std": 0.0588,
112
+ "nauc_precision_at_3_diff1": 0.456083,
113
+ "nauc_precision_at_5_max": 0.437404,
114
+ "nauc_precision_at_5_std": 0.089709,
115
+ "nauc_precision_at_5_diff1": 0.420775,
116
+ "nauc_precision_at_10_max": 0.47021,
117
+ "nauc_precision_at_10_std": 0.166818,
118
+ "nauc_precision_at_10_diff1": 0.373816,
119
+ "nauc_precision_at_20_max": 0.481111,
120
+ "nauc_precision_at_20_std": 0.228804,
121
+ "nauc_precision_at_20_diff1": 0.314439,
122
+ "nauc_precision_at_100_max": 0.419104,
123
+ "nauc_precision_at_100_std": 0.259242,
124
+ "nauc_precision_at_100_diff1": 0.205885,
125
+ "nauc_precision_at_1000_max": 0.277079,
126
+ "nauc_precision_at_1000_std": 0.178332,
127
+ "nauc_precision_at_1000_diff1": 0.073636,
128
+ "nauc_mrr_at_1_max": 0.297627,
129
+ "nauc_mrr_at_1_std": -0.024369,
130
+ "nauc_mrr_at_1_diff1": 0.575845,
131
+ "nauc_mrr_at_3_max": 0.347023,
132
+ "nauc_mrr_at_3_std": 0.006922,
133
+ "nauc_mrr_at_3_diff1": 0.538683,
134
+ "nauc_mrr_at_5_max": 0.350657,
135
+ "nauc_mrr_at_5_std": 0.011081,
136
+ "nauc_mrr_at_5_diff1": 0.537367,
137
+ "nauc_mrr_at_10_max": 0.35374,
138
+ "nauc_mrr_at_10_std": 0.016192,
139
+ "nauc_mrr_at_10_diff1": 0.537882,
140
+ "nauc_mrr_at_20_max": 0.354194,
141
+ "nauc_mrr_at_20_std": 0.01792,
142
+ "nauc_mrr_at_20_diff1": 0.537998,
143
+ "nauc_mrr_at_100_max": 0.353376,
144
+ "nauc_mrr_at_100_std": 0.017321,
145
+ "nauc_mrr_at_100_diff1": 0.538507,
146
+ "nauc_mrr_at_1000_max": 0.35322,
147
+ "nauc_mrr_at_1000_std": 0.017142,
148
+ "nauc_mrr_at_1000_diff1": 0.538588,
149
+ "hit_rate_at_1": 0.51305,
150
+ "hit_rate_at_3": 0.69592,
151
+ "hit_rate_at_5": 0.74887,
152
+ "hit_rate_at_10": 0.81113,
153
+ "hit_rate_at_20": 0.85914,
154
+ "hit_rate_at_100": 0.91989,
155
+ "hit_rate_at_1000": 0.964,
156
+ "main_score": 0.62893,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 7178.4095685482025,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/FiQA2018.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "27a168819829fe9bcd655c2df245fb19452e8e06",
3
+ "task_name": "FiQA2018",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.16975,
9
+ "ndcg_at_3": 0.1541,
10
+ "ndcg_at_5": 0.16404,
11
+ "ndcg_at_10": 0.17792,
12
+ "ndcg_at_20": 0.19885,
13
+ "ndcg_at_100": 0.23803,
14
+ "ndcg_at_1000": 0.28389,
15
+ "map_at_1": 0.08232,
16
+ "map_at_3": 0.11206,
17
+ "map_at_5": 0.12362,
18
+ "map_at_10": 0.13219,
19
+ "map_at_20": 0.13841,
20
+ "map_at_100": 0.1445,
21
+ "map_at_1000": 0.14666,
22
+ "recall_at_1": 0.08232,
23
+ "recall_at_3": 0.13711,
24
+ "recall_at_5": 0.17659,
25
+ "recall_at_10": 0.21994,
26
+ "recall_at_20": 0.28892,
27
+ "recall_at_100": 0.45482,
28
+ "recall_at_1000": 0.73843,
29
+ "accuracy": 0.08232,
30
+ "precision_at_1": 0.16975,
31
+ "precision_at_3": 0.10288,
32
+ "precision_at_5": 0.08025,
33
+ "precision_at_10": 0.0517,
34
+ "precision_at_20": 0.03356,
35
+ "precision_at_100": 0.01105,
36
+ "precision_at_1000": 0.0019,
37
+ "mrr_at_1": 0.169753,
38
+ "mrr_at_3": 0.20679,
39
+ "mrr_at_5": 0.219213,
40
+ "mrr_at_10": 0.227375,
41
+ "mrr_at_20": 0.233761,
42
+ "mrr_at_100": 0.238624,
43
+ "mrr_at_1000": 0.239643,
44
+ "nauc_ndcg_at_1_max": 0.178251,
45
+ "nauc_ndcg_at_1_std": 0.014228,
46
+ "nauc_ndcg_at_1_diff1": 0.30773,
47
+ "nauc_ndcg_at_3_max": 0.194185,
48
+ "nauc_ndcg_at_3_std": 0.013241,
49
+ "nauc_ndcg_at_3_diff1": 0.297239,
50
+ "nauc_ndcg_at_5_max": 0.18185,
51
+ "nauc_ndcg_at_5_std": 0.00939,
52
+ "nauc_ndcg_at_5_diff1": 0.288324,
53
+ "nauc_ndcg_at_10_max": 0.179298,
54
+ "nauc_ndcg_at_10_std": 0.021885,
55
+ "nauc_ndcg_at_10_diff1": 0.27074,
56
+ "nauc_ndcg_at_20_max": 0.174511,
57
+ "nauc_ndcg_at_20_std": 0.024129,
58
+ "nauc_ndcg_at_20_diff1": 0.265427,
59
+ "nauc_ndcg_at_100_max": 0.192485,
60
+ "nauc_ndcg_at_100_std": 0.052853,
61
+ "nauc_ndcg_at_100_diff1": 0.262464,
62
+ "nauc_ndcg_at_1000_max": 0.225912,
63
+ "nauc_ndcg_at_1000_std": 0.083654,
64
+ "nauc_ndcg_at_1000_diff1": 0.256481,
65
+ "nauc_map_at_1_max": 0.12902,
66
+ "nauc_map_at_1_std": -0.030152,
67
+ "nauc_map_at_1_diff1": 0.314637,
68
+ "nauc_map_at_3_max": 0.167905,
69
+ "nauc_map_at_3_std": 0.000359,
70
+ "nauc_map_at_3_diff1": 0.300638,
71
+ "nauc_map_at_5_max": 0.170125,
72
+ "nauc_map_at_5_std": 0.006301,
73
+ "nauc_map_at_5_diff1": 0.302449,
74
+ "nauc_map_at_10_max": 0.175099,
75
+ "nauc_map_at_10_std": 0.016106,
76
+ "nauc_map_at_10_diff1": 0.29918,
77
+ "nauc_map_at_20_max": 0.175106,
78
+ "nauc_map_at_20_std": 0.017351,
79
+ "nauc_map_at_20_diff1": 0.297618,
80
+ "nauc_map_at_100_max": 0.179393,
81
+ "nauc_map_at_100_std": 0.024671,
82
+ "nauc_map_at_100_diff1": 0.2969,
83
+ "nauc_map_at_1000_max": 0.182251,
84
+ "nauc_map_at_1000_std": 0.027343,
85
+ "nauc_map_at_1000_diff1": 0.296837,
86
+ "nauc_recall_at_1_max": 0.12902,
87
+ "nauc_recall_at_1_std": -0.030152,
88
+ "nauc_recall_at_1_diff1": 0.314637,
89
+ "nauc_recall_at_3_max": 0.158659,
90
+ "nauc_recall_at_3_std": -0.001546,
91
+ "nauc_recall_at_3_diff1": 0.263667,
92
+ "nauc_recall_at_5_max": 0.16029,
93
+ "nauc_recall_at_5_std": 0.007566,
94
+ "nauc_recall_at_5_diff1": 0.248731,
95
+ "nauc_recall_at_10_max": 0.158019,
96
+ "nauc_recall_at_10_std": 0.028845,
97
+ "nauc_recall_at_10_diff1": 0.210033,
98
+ "nauc_recall_at_20_max": 0.133979,
99
+ "nauc_recall_at_20_std": 0.03164,
100
+ "nauc_recall_at_20_diff1": 0.182313,
101
+ "nauc_recall_at_100_max": 0.162035,
102
+ "nauc_recall_at_100_std": 0.103538,
103
+ "nauc_recall_at_100_diff1": 0.157165,
104
+ "nauc_recall_at_1000_max": 0.292165,
105
+ "nauc_recall_at_1000_std": 0.2899,
106
+ "nauc_recall_at_1000_diff1": 0.076536,
107
+ "nauc_precision_at_1_max": 0.178251,
108
+ "nauc_precision_at_1_std": 0.014228,
109
+ "nauc_precision_at_1_diff1": 0.30773,
110
+ "nauc_precision_at_3_max": 0.223039,
111
+ "nauc_precision_at_3_std": 0.050037,
112
+ "nauc_precision_at_3_diff1": 0.278468,
113
+ "nauc_precision_at_5_max": 0.20883,
114
+ "nauc_precision_at_5_std": 0.038668,
115
+ "nauc_precision_at_5_diff1": 0.254262,
116
+ "nauc_precision_at_10_max": 0.21141,
117
+ "nauc_precision_at_10_std": 0.070175,
118
+ "nauc_precision_at_10_diff1": 0.200751,
119
+ "nauc_precision_at_20_max": 0.20731,
120
+ "nauc_precision_at_20_std": 0.068706,
121
+ "nauc_precision_at_20_diff1": 0.192906,
122
+ "nauc_precision_at_100_max": 0.238595,
123
+ "nauc_precision_at_100_std": 0.13889,
124
+ "nauc_precision_at_100_diff1": 0.150031,
125
+ "nauc_precision_at_1000_max": 0.309616,
126
+ "nauc_precision_at_1000_std": 0.178257,
127
+ "nauc_precision_at_1000_diff1": 0.06167,
128
+ "nauc_mrr_at_1_max": 0.178251,
129
+ "nauc_mrr_at_1_std": 0.014228,
130
+ "nauc_mrr_at_1_diff1": 0.30773,
131
+ "nauc_mrr_at_3_max": 0.186063,
132
+ "nauc_mrr_at_3_std": 0.012137,
133
+ "nauc_mrr_at_3_diff1": 0.293384,
134
+ "nauc_mrr_at_5_max": 0.185008,
135
+ "nauc_mrr_at_5_std": 0.009447,
136
+ "nauc_mrr_at_5_diff1": 0.283231,
137
+ "nauc_mrr_at_10_max": 0.184845,
138
+ "nauc_mrr_at_10_std": 0.012409,
139
+ "nauc_mrr_at_10_diff1": 0.273406,
140
+ "nauc_mrr_at_20_max": 0.181774,
141
+ "nauc_mrr_at_20_std": 0.013337,
142
+ "nauc_mrr_at_20_diff1": 0.271251,
143
+ "nauc_mrr_at_100_max": 0.183822,
144
+ "nauc_mrr_at_100_std": 0.016012,
145
+ "nauc_mrr_at_100_diff1": 0.27173,
146
+ "nauc_mrr_at_1000_max": 0.184533,
147
+ "nauc_mrr_at_1000_std": 0.01679,
148
+ "nauc_mrr_at_1000_diff1": 0.272002,
149
+ "hit_rate_at_1": 0.16975,
150
+ "hit_rate_at_3": 0.25463,
151
+ "hit_rate_at_5": 0.31019,
152
+ "hit_rate_at_10": 0.36883,
153
+ "hit_rate_at_20": 0.46296,
154
+ "hit_rate_at_100": 0.65123,
155
+ "hit_rate_at_1000": 0.86883,
156
+ "main_score": 0.17792,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 81.94854092597961,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/HotpotQA.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ab518f4d6fcca38d87c25209f94beba119d02014",
3
+ "task_name": "HotpotQA",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.47279,
9
+ "ndcg_at_3": 0.35283,
10
+ "ndcg_at_5": 0.37095,
11
+ "ndcg_at_10": 0.38753,
12
+ "ndcg_at_20": 0.39981,
13
+ "ndcg_at_100": 0.42087,
14
+ "ndcg_at_1000": 0.4407,
15
+ "map_at_1": 0.23639,
16
+ "map_at_3": 0.28874,
17
+ "map_at_5": 0.30024,
18
+ "map_at_10": 0.30798,
19
+ "map_at_20": 0.3119,
20
+ "map_at_100": 0.31537,
21
+ "map_at_1000": 0.31621,
22
+ "recall_at_1": 0.23639,
23
+ "recall_at_3": 0.32377,
24
+ "recall_at_5": 0.35962,
25
+ "recall_at_10": 0.40122,
26
+ "recall_at_20": 0.44085,
27
+ "recall_at_100": 0.53464,
28
+ "recall_at_1000": 0.66705,
29
+ "accuracy": 0.23639,
30
+ "precision_at_1": 0.47279,
31
+ "precision_at_3": 0.21585,
32
+ "precision_at_5": 0.14385,
33
+ "precision_at_10": 0.08024,
34
+ "precision_at_20": 0.04409,
35
+ "precision_at_100": 0.01069,
36
+ "precision_at_1000": 0.00133,
37
+ "mrr_at_1": 0.472789,
38
+ "mrr_at_3": 0.523498,
39
+ "mrr_at_5": 0.533322,
40
+ "mrr_at_10": 0.540316,
41
+ "mrr_at_20": 0.543449,
42
+ "mrr_at_100": 0.5457,
43
+ "mrr_at_1000": 0.546093,
44
+ "nauc_ndcg_at_1_max": 0.447852,
45
+ "nauc_ndcg_at_1_std": 0.082736,
46
+ "nauc_ndcg_at_1_diff1": 0.679787,
47
+ "nauc_ndcg_at_3_max": 0.411597,
48
+ "nauc_ndcg_at_3_std": 0.098192,
49
+ "nauc_ndcg_at_3_diff1": 0.553342,
50
+ "nauc_ndcg_at_5_max": 0.39996,
51
+ "nauc_ndcg_at_5_std": 0.102258,
52
+ "nauc_ndcg_at_5_diff1": 0.525478,
53
+ "nauc_ndcg_at_10_max": 0.392021,
54
+ "nauc_ndcg_at_10_std": 0.107106,
55
+ "nauc_ndcg_at_10_diff1": 0.504657,
56
+ "nauc_ndcg_at_20_max": 0.390068,
57
+ "nauc_ndcg_at_20_std": 0.113906,
58
+ "nauc_ndcg_at_20_diff1": 0.496535,
59
+ "nauc_ndcg_at_100_max": 0.38637,
60
+ "nauc_ndcg_at_100_std": 0.122683,
61
+ "nauc_ndcg_at_100_diff1": 0.482956,
62
+ "nauc_ndcg_at_1000_max": 0.386426,
63
+ "nauc_ndcg_at_1000_std": 0.127217,
64
+ "nauc_ndcg_at_1000_diff1": 0.476792,
65
+ "nauc_map_at_1_max": 0.447852,
66
+ "nauc_map_at_1_std": 0.082736,
67
+ "nauc_map_at_1_diff1": 0.679787,
68
+ "nauc_map_at_3_max": 0.400696,
69
+ "nauc_map_at_3_std": 0.096209,
70
+ "nauc_map_at_3_diff1": 0.534213,
71
+ "nauc_map_at_5_max": 0.39108,
72
+ "nauc_map_at_5_std": 0.100042,
73
+ "nauc_map_at_5_diff1": 0.511798,
74
+ "nauc_map_at_10_max": 0.385924,
75
+ "nauc_map_at_10_std": 0.10316,
76
+ "nauc_map_at_10_diff1": 0.499714,
77
+ "nauc_map_at_20_max": 0.385118,
78
+ "nauc_map_at_20_std": 0.1053,
79
+ "nauc_map_at_20_diff1": 0.496797,
80
+ "nauc_map_at_100_max": 0.384312,
81
+ "nauc_map_at_100_std": 0.106989,
82
+ "nauc_map_at_100_diff1": 0.493992,
83
+ "nauc_map_at_1000_max": 0.384287,
84
+ "nauc_map_at_1000_std": 0.107298,
85
+ "nauc_map_at_1000_diff1": 0.493542,
86
+ "nauc_recall_at_1_max": 0.447852,
87
+ "nauc_recall_at_1_std": 0.082736,
88
+ "nauc_recall_at_1_diff1": 0.679787,
89
+ "nauc_recall_at_3_max": 0.385602,
90
+ "nauc_recall_at_3_std": 0.10807,
91
+ "nauc_recall_at_3_diff1": 0.476182,
92
+ "nauc_recall_at_5_max": 0.349782,
93
+ "nauc_recall_at_5_std": 0.113261,
94
+ "nauc_recall_at_5_diff1": 0.404143,
95
+ "nauc_recall_at_10_max": 0.321474,
96
+ "nauc_recall_at_10_std": 0.123924,
97
+ "nauc_recall_at_10_diff1": 0.3405,
98
+ "nauc_recall_at_20_max": 0.299218,
99
+ "nauc_recall_at_20_std": 0.142078,
100
+ "nauc_recall_at_20_diff1": 0.29496,
101
+ "nauc_recall_at_100_max": 0.248058,
102
+ "nauc_recall_at_100_std": 0.167171,
103
+ "nauc_recall_at_100_diff1": 0.193928,
104
+ "nauc_recall_at_1000_max": 0.201433,
105
+ "nauc_recall_at_1000_std": 0.189114,
106
+ "nauc_recall_at_1000_diff1": 0.082959,
107
+ "nauc_precision_at_1_max": 0.447852,
108
+ "nauc_precision_at_1_std": 0.082736,
109
+ "nauc_precision_at_1_diff1": 0.679787,
110
+ "nauc_precision_at_3_max": 0.385602,
111
+ "nauc_precision_at_3_std": 0.10807,
112
+ "nauc_precision_at_3_diff1": 0.476182,
113
+ "nauc_precision_at_5_max": 0.349782,
114
+ "nauc_precision_at_5_std": 0.113261,
115
+ "nauc_precision_at_5_diff1": 0.404143,
116
+ "nauc_precision_at_10_max": 0.321474,
117
+ "nauc_precision_at_10_std": 0.123924,
118
+ "nauc_precision_at_10_diff1": 0.3405,
119
+ "nauc_precision_at_20_max": 0.299218,
120
+ "nauc_precision_at_20_std": 0.142078,
121
+ "nauc_precision_at_20_diff1": 0.29496,
122
+ "nauc_precision_at_100_max": 0.248058,
123
+ "nauc_precision_at_100_std": 0.167171,
124
+ "nauc_precision_at_100_diff1": 0.193928,
125
+ "nauc_precision_at_1000_max": 0.201433,
126
+ "nauc_precision_at_1000_std": 0.189114,
127
+ "nauc_precision_at_1000_diff1": 0.082959,
128
+ "nauc_mrr_at_1_max": 0.447852,
129
+ "nauc_mrr_at_1_std": 0.082736,
130
+ "nauc_mrr_at_1_diff1": 0.679787,
131
+ "nauc_mrr_at_3_max": 0.444397,
132
+ "nauc_mrr_at_3_std": 0.091482,
133
+ "nauc_mrr_at_3_diff1": 0.640122,
134
+ "nauc_mrr_at_5_max": 0.442845,
135
+ "nauc_mrr_at_5_std": 0.091811,
136
+ "nauc_mrr_at_5_diff1": 0.63535,
137
+ "nauc_mrr_at_10_max": 0.442276,
138
+ "nauc_mrr_at_10_std": 0.092113,
139
+ "nauc_mrr_at_10_diff1": 0.632324,
140
+ "nauc_mrr_at_20_max": 0.442532,
141
+ "nauc_mrr_at_20_std": 0.093368,
142
+ "nauc_mrr_at_20_diff1": 0.631839,
143
+ "nauc_mrr_at_100_max": 0.442706,
144
+ "nauc_mrr_at_100_std": 0.09397,
145
+ "nauc_mrr_at_100_diff1": 0.631998,
146
+ "nauc_mrr_at_1000_max": 0.442711,
147
+ "nauc_mrr_at_1000_std": 0.093954,
148
+ "nauc_mrr_at_1000_diff1": 0.632132,
149
+ "hit_rate_at_1": 0.47279,
150
+ "hit_rate_at_3": 0.58677,
151
+ "hit_rate_at_5": 0.62984,
152
+ "hit_rate_at_10": 0.68143,
153
+ "hit_rate_at_20": 0.72654,
154
+ "hit_rate_at_100": 0.81621,
155
+ "hit_rate_at_1000": 0.90871,
156
+ "main_score": 0.38753,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 7573.657062530518,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ImdbClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "3d86128a09e091d6018b6d26cad27f2739fc2db7",
3
+ "task_name": "ImdbClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.72012,
11
+ "f1": 0.719295,
12
+ "f1_weighted": 0.719295,
13
+ "precision": 0.722737,
14
+ "precision_weighted": 0.722737,
15
+ "recall": 0.72012,
16
+ "recall_weighted": 0.72012,
17
+ "ap": 0.664404,
18
+ "ap_weighted": 0.664404
19
+ },
20
+ {
21
+ "accuracy": 0.6878,
22
+ "f1": 0.686915,
23
+ "f1_weighted": 0.686915,
24
+ "precision": 0.689947,
25
+ "precision_weighted": 0.689947,
26
+ "recall": 0.6878,
27
+ "recall_weighted": 0.6878,
28
+ "ap": 0.633365,
29
+ "ap_weighted": 0.633365
30
+ },
31
+ {
32
+ "accuracy": 0.55628,
33
+ "f1": 0.55228,
34
+ "f1_weighted": 0.55228,
35
+ "precision": 0.558366,
36
+ "precision_weighted": 0.558366,
37
+ "recall": 0.55628,
38
+ "recall_weighted": 0.55628,
39
+ "ap": 0.530804,
40
+ "ap_weighted": 0.530804
41
+ },
42
+ {
43
+ "accuracy": 0.68408,
44
+ "f1": 0.681937,
45
+ "f1_weighted": 0.681937,
46
+ "precision": 0.689178,
47
+ "precision_weighted": 0.689178,
48
+ "recall": 0.68408,
49
+ "recall_weighted": 0.68408,
50
+ "ap": 0.621147,
51
+ "ap_weighted": 0.621147
52
+ },
53
+ {
54
+ "accuracy": 0.65156,
55
+ "f1": 0.649208,
56
+ "f1_weighted": 0.649208,
57
+ "precision": 0.655736,
58
+ "precision_weighted": 0.655736,
59
+ "recall": 0.65156,
60
+ "recall_weighted": 0.65156,
61
+ "ap": 0.595518,
62
+ "ap_weighted": 0.595518
63
+ },
64
+ {
65
+ "accuracy": 0.615,
66
+ "f1": 0.609233,
67
+ "f1_weighted": 0.609233,
68
+ "precision": 0.622214,
69
+ "precision_weighted": 0.622214,
70
+ "recall": 0.615,
71
+ "recall_weighted": 0.615,
72
+ "ap": 0.56814,
73
+ "ap_weighted": 0.56814
74
+ },
75
+ {
76
+ "accuracy": 0.62396,
77
+ "f1": 0.623576,
78
+ "f1_weighted": 0.623576,
79
+ "precision": 0.624469,
80
+ "precision_weighted": 0.624469,
81
+ "recall": 0.62396,
82
+ "recall_weighted": 0.62396,
83
+ "ap": 0.576423,
84
+ "ap_weighted": 0.576423
85
+ },
86
+ {
87
+ "accuracy": 0.66244,
88
+ "f1": 0.653612,
89
+ "f1_weighted": 0.653612,
90
+ "precision": 0.680879,
91
+ "precision_weighted": 0.680879,
92
+ "recall": 0.66244,
93
+ "recall_weighted": 0.66244,
94
+ "ap": 0.601221,
95
+ "ap_weighted": 0.601221
96
+ },
97
+ {
98
+ "accuracy": 0.67884,
99
+ "f1": 0.678042,
100
+ "f1_weighted": 0.678042,
101
+ "precision": 0.680632,
102
+ "precision_weighted": 0.680632,
103
+ "recall": 0.67884,
104
+ "recall_weighted": 0.67884,
105
+ "ap": 0.624942,
106
+ "ap_weighted": 0.624942
107
+ },
108
+ {
109
+ "accuracy": 0.64528,
110
+ "f1": 0.643365,
111
+ "f1_weighted": 0.643365,
112
+ "precision": 0.648469,
113
+ "precision_weighted": 0.648469,
114
+ "recall": 0.64528,
115
+ "recall_weighted": 0.64528,
116
+ "ap": 0.591048,
117
+ "ap_weighted": 0.591048
118
+ }
119
+ ],
120
+ "accuracy": 0.652536,
121
+ "f1": 0.649746,
122
+ "f1_weighted": 0.649746,
123
+ "precision": 0.657263,
124
+ "precision_weighted": 0.657263,
125
+ "recall": 0.652536,
126
+ "recall_weighted": 0.652536,
127
+ "ap": 0.600701,
128
+ "ap_weighted": 0.600701,
129
+ "main_score": 0.652536,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 162.55868816375732,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MSMARCO.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5a29a104738b98a9e76336939199e264163d4a0",
3
+ "task_name": "MSMARCO",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "dev": [
7
+ {
8
+ "ndcg_at_1": 0.11461,
9
+ "ndcg_at_3": 0.17182,
10
+ "ndcg_at_5": 0.1936,
11
+ "ndcg_at_10": 0.21784,
12
+ "ndcg_at_20": 0.23692,
13
+ "ndcg_at_100": 0.26977,
14
+ "ndcg_at_1000": 0.29731,
15
+ "map_at_1": 0.1122,
16
+ "map_at_3": 0.15632,
17
+ "map_at_5": 0.16834,
18
+ "map_at_10": 0.17843,
19
+ "map_at_20": 0.18373,
20
+ "map_at_100": 0.1881,
21
+ "map_at_1000": 0.18906,
22
+ "recall_at_1": 0.1122,
23
+ "recall_at_3": 0.2138,
24
+ "recall_at_5": 0.26625,
25
+ "recall_at_10": 0.34026,
26
+ "recall_at_20": 0.41458,
27
+ "recall_at_100": 0.59222,
28
+ "recall_at_1000": 0.81116,
29
+ "accuracy": 0.1122,
30
+ "precision_at_1": 0.11461,
31
+ "precision_at_3": 0.0734,
32
+ "precision_at_5": 0.05496,
33
+ "precision_at_10": 0.03523,
34
+ "precision_at_20": 0.02155,
35
+ "precision_at_100": 0.00621,
36
+ "precision_at_1000": 0.00086,
37
+ "mrr_at_1": 0.114613,
38
+ "mrr_at_3": 0.15936,
39
+ "mrr_at_5": 0.171681,
40
+ "mrr_at_10": 0.181781,
41
+ "mrr_at_20": 0.187041,
42
+ "mrr_at_100": 0.191327,
43
+ "mrr_at_1000": 0.192241,
44
+ "nauc_ndcg_at_1_max": 0.111535,
45
+ "nauc_ndcg_at_1_std": -0.050303,
46
+ "nauc_ndcg_at_1_diff1": 0.353043,
47
+ "nauc_ndcg_at_3_max": 0.091036,
48
+ "nauc_ndcg_at_3_std": -0.051064,
49
+ "nauc_ndcg_at_3_diff1": 0.290861,
50
+ "nauc_ndcg_at_5_max": 0.100927,
51
+ "nauc_ndcg_at_5_std": -0.040306,
52
+ "nauc_ndcg_at_5_diff1": 0.28728,
53
+ "nauc_ndcg_at_10_max": 0.110586,
54
+ "nauc_ndcg_at_10_std": -0.024864,
55
+ "nauc_ndcg_at_10_diff1": 0.280544,
56
+ "nauc_ndcg_at_20_max": 0.116732,
57
+ "nauc_ndcg_at_20_std": -0.005223,
58
+ "nauc_ndcg_at_20_diff1": 0.276306,
59
+ "nauc_ndcg_at_100_max": 0.134284,
60
+ "nauc_ndcg_at_100_std": 0.03079,
61
+ "nauc_ndcg_at_100_diff1": 0.270399,
62
+ "nauc_ndcg_at_1000_max": 0.144507,
63
+ "nauc_ndcg_at_1000_std": 0.040038,
64
+ "nauc_ndcg_at_1000_diff1": 0.272823,
65
+ "nauc_map_at_1_max": 0.111328,
66
+ "nauc_map_at_1_std": -0.053875,
67
+ "nauc_map_at_1_diff1": 0.353094,
68
+ "nauc_map_at_3_max": 0.095556,
69
+ "nauc_map_at_3_std": -0.05271,
70
+ "nauc_map_at_3_diff1": 0.302798,
71
+ "nauc_map_at_5_max": 0.101432,
72
+ "nauc_map_at_5_std": -0.046538,
73
+ "nauc_map_at_5_diff1": 0.300369,
74
+ "nauc_map_at_10_max": 0.105529,
75
+ "nauc_map_at_10_std": -0.039464,
76
+ "nauc_map_at_10_diff1": 0.296848,
77
+ "nauc_map_at_20_max": 0.107349,
78
+ "nauc_map_at_20_std": -0.033467,
79
+ "nauc_map_at_20_diff1": 0.295497,
80
+ "nauc_map_at_100_max": 0.109933,
81
+ "nauc_map_at_100_std": -0.028243,
82
+ "nauc_map_at_100_diff1": 0.294602,
83
+ "nauc_map_at_1000_max": 0.110315,
84
+ "nauc_map_at_1000_std": -0.027802,
85
+ "nauc_map_at_1000_diff1": 0.294706,
86
+ "nauc_recall_at_1_max": 0.111328,
87
+ "nauc_recall_at_1_std": -0.053875,
88
+ "nauc_recall_at_1_diff1": 0.353094,
89
+ "nauc_recall_at_3_max": 0.080301,
90
+ "nauc_recall_at_3_std": -0.048732,
91
+ "nauc_recall_at_3_diff1": 0.263059,
92
+ "nauc_recall_at_5_max": 0.100478,
93
+ "nauc_recall_at_5_std": -0.027442,
94
+ "nauc_recall_at_5_diff1": 0.259588,
95
+ "nauc_recall_at_10_max": 0.123545,
96
+ "nauc_recall_at_10_std": 0.008087,
97
+ "nauc_recall_at_10_diff1": 0.245267,
98
+ "nauc_recall_at_20_max": 0.140601,
99
+ "nauc_recall_at_20_std": 0.065103,
100
+ "nauc_recall_at_20_diff1": 0.232832,
101
+ "nauc_recall_at_100_max": 0.220679,
102
+ "nauc_recall_at_100_std": 0.237144,
103
+ "nauc_recall_at_100_diff1": 0.200915,
104
+ "nauc_recall_at_1000_max": 0.395262,
105
+ "nauc_recall_at_1000_std": 0.485329,
106
+ "nauc_recall_at_1000_diff1": 0.177546,
107
+ "nauc_precision_at_1_max": 0.111535,
108
+ "nauc_precision_at_1_std": -0.050303,
109
+ "nauc_precision_at_1_diff1": 0.353043,
110
+ "nauc_precision_at_3_max": 0.081735,
111
+ "nauc_precision_at_3_std": -0.047576,
112
+ "nauc_precision_at_3_diff1": 0.263161,
113
+ "nauc_precision_at_5_max": 0.102197,
114
+ "nauc_precision_at_5_std": -0.025197,
115
+ "nauc_precision_at_5_diff1": 0.257365,
116
+ "nauc_precision_at_10_max": 0.126583,
117
+ "nauc_precision_at_10_std": 0.012395,
118
+ "nauc_precision_at_10_diff1": 0.241528,
119
+ "nauc_precision_at_20_max": 0.146218,
120
+ "nauc_precision_at_20_std": 0.074603,
121
+ "nauc_precision_at_20_diff1": 0.225723,
122
+ "nauc_precision_at_100_max": 0.228143,
123
+ "nauc_precision_at_100_std": 0.242327,
124
+ "nauc_precision_at_100_diff1": 0.182524,
125
+ "nauc_precision_at_1000_max": 0.359227,
126
+ "nauc_precision_at_1000_std": 0.414945,
127
+ "nauc_precision_at_1000_diff1": 0.121014,
128
+ "nauc_mrr_at_1_max": 0.111535,
129
+ "nauc_mrr_at_1_std": -0.050303,
130
+ "nauc_mrr_at_1_diff1": 0.353043,
131
+ "nauc_mrr_at_3_max": 0.095327,
132
+ "nauc_mrr_at_3_std": -0.049796,
133
+ "nauc_mrr_at_3_diff1": 0.301965,
134
+ "nauc_mrr_at_5_max": 0.101145,
135
+ "nauc_mrr_at_5_std": -0.043263,
136
+ "nauc_mrr_at_5_diff1": 0.29868,
137
+ "nauc_mrr_at_10_max": 0.105596,
138
+ "nauc_mrr_at_10_std": -0.036029,
139
+ "nauc_mrr_at_10_diff1": 0.295521,
140
+ "nauc_mrr_at_20_max": 0.107637,
141
+ "nauc_mrr_at_20_std": -0.029976,
142
+ "nauc_mrr_at_20_diff1": 0.294311,
143
+ "nauc_mrr_at_100_max": 0.110082,
144
+ "nauc_mrr_at_100_std": -0.025023,
145
+ "nauc_mrr_at_100_diff1": 0.2935,
146
+ "nauc_mrr_at_1000_max": 0.110362,
147
+ "nauc_mrr_at_1000_std": -0.02471,
148
+ "nauc_mrr_at_1000_diff1": 0.293613,
149
+ "hit_rate_at_1": 0.11461,
150
+ "hit_rate_at_3": 0.21848,
151
+ "hit_rate_at_5": 0.27249,
152
+ "hit_rate_at_10": 0.34771,
153
+ "hit_rate_at_20": 0.42307,
154
+ "hit_rate_at_100": 0.60201,
155
+ "hit_rate_at_1000": 0.82006,
156
+ "main_score": 0.21784,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 13183.535454511642,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/MTOPDomainClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "a76d16fae880597b9c73047b50159220a441cb54",
3
+ "task_name": "MTOPDomainClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.818513,
11
+ "f1": 0.810204,
12
+ "f1_weighted": 0.816732,
13
+ "precision": 0.806449,
14
+ "precision_weighted": 0.823746,
15
+ "recall": 0.822678,
16
+ "recall_weighted": 0.818513,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.844733,
22
+ "f1": 0.836794,
23
+ "f1_weighted": 0.845319,
24
+ "precision": 0.831906,
25
+ "precision_weighted": 0.850534,
26
+ "recall": 0.846952,
27
+ "recall_weighted": 0.844733,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.833561,
33
+ "f1": 0.827572,
34
+ "f1_weighted": 0.832275,
35
+ "precision": 0.823368,
36
+ "precision_weighted": 0.836827,
37
+ "recall": 0.837811,
38
+ "recall_weighted": 0.833561,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.853169,
44
+ "f1": 0.847384,
45
+ "f1_weighted": 0.853745,
46
+ "precision": 0.843439,
47
+ "precision_weighted": 0.86278,
48
+ "recall": 0.858848,
49
+ "recall_weighted": 0.853169,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.832649,
55
+ "f1": 0.824785,
56
+ "f1_weighted": 0.833431,
57
+ "precision": 0.822689,
58
+ "precision_weighted": 0.843265,
59
+ "recall": 0.837919,
60
+ "recall_weighted": 0.832649,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.843137,
66
+ "f1": 0.835542,
67
+ "f1_weighted": 0.842324,
68
+ "precision": 0.827848,
69
+ "precision_weighted": 0.849532,
70
+ "recall": 0.850703,
71
+ "recall_weighted": 0.843137,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.81213,
77
+ "f1": 0.80362,
78
+ "f1_weighted": 0.809889,
79
+ "precision": 0.797391,
80
+ "precision_weighted": 0.816831,
81
+ "recall": 0.820061,
82
+ "recall_weighted": 0.81213,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.832877,
88
+ "f1": 0.821307,
89
+ "f1_weighted": 0.834923,
90
+ "precision": 0.820621,
91
+ "precision_weighted": 0.845399,
92
+ "recall": 0.83353,
93
+ "recall_weighted": 0.832877,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.833561,
99
+ "f1": 0.826919,
100
+ "f1_weighted": 0.834565,
101
+ "precision": 0.820338,
102
+ "precision_weighted": 0.84293,
103
+ "recall": 0.841185,
104
+ "recall_weighted": 0.833561,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.840173,
110
+ "f1": 0.828592,
111
+ "f1_weighted": 0.839495,
112
+ "precision": 0.82793,
113
+ "precision_weighted": 0.843386,
114
+ "recall": 0.83434,
115
+ "recall_weighted": 0.840173,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.834451,
121
+ "f1": 0.826272,
122
+ "f1_weighted": 0.83427,
123
+ "precision": 0.822198,
124
+ "precision_weighted": 0.841523,
125
+ "recall": 0.838403,
126
+ "recall_weighted": 0.834451,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.834451,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 26.04588770866394,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MTOPIntentClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "2992d820f31312593c49a4890430aadadb0f0039",
3
+ "task_name": "MTOPIntentClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.48746,
11
+ "f1": 0.32061,
12
+ "f1_weighted": 0.503444,
13
+ "precision": 0.314244,
14
+ "precision_weighted": 0.713996,
15
+ "recall": 0.522296,
16
+ "recall_weighted": 0.48746,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.5285,
22
+ "f1": 0.349812,
23
+ "f1_weighted": 0.555743,
24
+ "precision": 0.347551,
25
+ "precision_weighted": 0.743027,
26
+ "recall": 0.546812,
27
+ "recall_weighted": 0.5285,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.540356,
33
+ "f1": 0.346726,
34
+ "f1_weighted": 0.573882,
35
+ "precision": 0.340707,
36
+ "precision_weighted": 0.751017,
37
+ "recall": 0.524012,
38
+ "recall_weighted": 0.540356,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.521888,
44
+ "f1": 0.339828,
45
+ "f1_weighted": 0.547966,
46
+ "precision": 0.3375,
47
+ "precision_weighted": 0.742117,
48
+ "recall": 0.527615,
49
+ "recall_weighted": 0.521888,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.514592,
55
+ "f1": 0.345729,
56
+ "f1_weighted": 0.530088,
57
+ "precision": 0.3408,
58
+ "precision_weighted": 0.734854,
59
+ "recall": 0.535731,
60
+ "recall_weighted": 0.514592,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.508664,
66
+ "f1": 0.34659,
67
+ "f1_weighted": 0.526555,
68
+ "precision": 0.336049,
69
+ "precision_weighted": 0.746518,
70
+ "recall": 0.524452,
71
+ "recall_weighted": 0.508664,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.48176,
77
+ "f1": 0.332331,
78
+ "f1_weighted": 0.505281,
79
+ "precision": 0.330377,
80
+ "precision_weighted": 0.727788,
81
+ "recall": 0.515638,
82
+ "recall_weighted": 0.48176,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.52622,
88
+ "f1": 0.341885,
89
+ "f1_weighted": 0.548056,
90
+ "precision": 0.335051,
91
+ "precision_weighted": 0.741829,
92
+ "recall": 0.546468,
93
+ "recall_weighted": 0.52622,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.52508,
99
+ "f1": 0.350598,
100
+ "f1_weighted": 0.550702,
101
+ "precision": 0.339187,
102
+ "precision_weighted": 0.7579,
103
+ "recall": 0.550679,
104
+ "recall_weighted": 0.52508,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.537392,
110
+ "f1": 0.360391,
111
+ "f1_weighted": 0.559791,
112
+ "precision": 0.350729,
113
+ "precision_weighted": 0.734016,
114
+ "recall": 0.537884,
115
+ "recall_weighted": 0.537392,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.517191,
121
+ "f1": 0.34345,
122
+ "f1_weighted": 0.540151,
123
+ "precision": 0.33722,
124
+ "precision_weighted": 0.739306,
125
+ "recall": 0.533159,
126
+ "recall_weighted": 0.517191,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.517191,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 54.93283700942993,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MassiveIntentClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4672e20407010da34463acc759c162ca9734bca6",
3
+ "task_name": "MassiveIntentClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.603564,
11
+ "f1": 0.570528,
12
+ "f1_weighted": 0.589522,
13
+ "precision": 0.568116,
14
+ "precision_weighted": 0.655833,
15
+ "recall": 0.675778,
16
+ "recall_weighted": 0.603564,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.613652,
22
+ "f1": 0.583694,
23
+ "f1_weighted": 0.605809,
24
+ "precision": 0.569205,
25
+ "precision_weighted": 0.655619,
26
+ "recall": 0.663644,
27
+ "recall_weighted": 0.613652,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.587424,
33
+ "f1": 0.552513,
34
+ "f1_weighted": 0.573412,
35
+ "precision": 0.534538,
36
+ "precision_weighted": 0.615953,
37
+ "recall": 0.663471,
38
+ "recall_weighted": 0.587424,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.591123,
44
+ "f1": 0.566348,
45
+ "f1_weighted": 0.578383,
46
+ "precision": 0.557587,
47
+ "precision_weighted": 0.628252,
48
+ "recall": 0.663657,
49
+ "recall_weighted": 0.591123,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.584734,
55
+ "f1": 0.560817,
56
+ "f1_weighted": 0.574269,
57
+ "precision": 0.562118,
58
+ "precision_weighted": 0.628024,
59
+ "recall": 0.66789,
60
+ "recall_weighted": 0.584734,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.550773,
66
+ "f1": 0.538328,
67
+ "f1_weighted": 0.533015,
68
+ "precision": 0.539745,
69
+ "precision_weighted": 0.632553,
70
+ "recall": 0.651002,
71
+ "recall_weighted": 0.550773,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.591459,
77
+ "f1": 0.568811,
78
+ "f1_weighted": 0.577795,
79
+ "precision": 0.563855,
80
+ "precision_weighted": 0.644678,
81
+ "recall": 0.675252,
82
+ "recall_weighted": 0.591459,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.582044,
88
+ "f1": 0.555797,
89
+ "f1_weighted": 0.561683,
90
+ "precision": 0.551595,
91
+ "precision_weighted": 0.639627,
92
+ "recall": 0.648589,
93
+ "recall_weighted": 0.582044,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.565568,
99
+ "f1": 0.54813,
100
+ "f1_weighted": 0.537545,
101
+ "precision": 0.55524,
102
+ "precision_weighted": 0.598254,
103
+ "recall": 0.651648,
104
+ "recall_weighted": 0.565568,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.604573,
110
+ "f1": 0.583999,
111
+ "f1_weighted": 0.597251,
112
+ "precision": 0.578571,
113
+ "precision_weighted": 0.658989,
114
+ "recall": 0.68452,
115
+ "recall_weighted": 0.604573,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.587492,
121
+ "f1": 0.562897,
122
+ "f1_weighted": 0.572868,
123
+ "precision": 0.558057,
124
+ "precision_weighted": 0.635778,
125
+ "recall": 0.664545,
126
+ "recall_weighted": 0.587492,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.587492,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 21.46303081512451,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MassiveScenarioClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "fad2c6e8459f9e1c45d9315f4953d921437d70f8",
3
+ "task_name": "MassiveScenarioClassification",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.695024,
11
+ "f1": 0.68675,
12
+ "f1_weighted": 0.693857,
13
+ "precision": 0.671789,
14
+ "precision_weighted": 0.731113,
15
+ "recall": 0.74644,
16
+ "recall_weighted": 0.695024,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.689644,
22
+ "f1": 0.674565,
23
+ "f1_weighted": 0.687827,
24
+ "precision": 0.655422,
25
+ "precision_weighted": 0.725211,
26
+ "recall": 0.73514,
27
+ "recall_weighted": 0.689644,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.685945,
33
+ "f1": 0.663039,
34
+ "f1_weighted": 0.68531,
35
+ "precision": 0.643775,
36
+ "precision_weighted": 0.717489,
37
+ "recall": 0.722481,
38
+ "recall_weighted": 0.685945,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.6577,
44
+ "f1": 0.650784,
45
+ "f1_weighted": 0.660956,
46
+ "precision": 0.645337,
47
+ "precision_weighted": 0.720549,
48
+ "recall": 0.717772,
49
+ "recall_weighted": 0.6577,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.673504,
55
+ "f1": 0.653667,
56
+ "f1_weighted": 0.664931,
57
+ "precision": 0.634656,
58
+ "precision_weighted": 0.69932,
59
+ "recall": 0.717893,
60
+ "recall_weighted": 0.673504,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.633154,
66
+ "f1": 0.622078,
67
+ "f1_weighted": 0.628199,
68
+ "precision": 0.62024,
69
+ "precision_weighted": 0.704904,
70
+ "recall": 0.699368,
71
+ "recall_weighted": 0.633154,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.665098,
77
+ "f1": 0.654545,
78
+ "f1_weighted": 0.666008,
79
+ "precision": 0.644418,
80
+ "precision_weighted": 0.716619,
81
+ "recall": 0.71792,
82
+ "recall_weighted": 0.665098,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.677875,
88
+ "f1": 0.666261,
89
+ "f1_weighted": 0.679235,
90
+ "precision": 0.654583,
91
+ "precision_weighted": 0.726398,
92
+ "recall": 0.726468,
93
+ "recall_weighted": 0.677875,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.662408,
99
+ "f1": 0.650293,
100
+ "f1_weighted": 0.661054,
101
+ "precision": 0.641938,
102
+ "precision_weighted": 0.709202,
103
+ "recall": 0.70937,
104
+ "recall_weighted": 0.662408,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.643578,
110
+ "f1": 0.631138,
111
+ "f1_weighted": 0.641083,
112
+ "precision": 0.623784,
113
+ "precision_weighted": 0.694028,
114
+ "recall": 0.700743,
115
+ "recall_weighted": 0.643578,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.668393,
121
+ "f1": 0.655312,
122
+ "f1_weighted": 0.666846,
123
+ "precision": 0.643594,
124
+ "precision_weighted": 0.714483,
125
+ "recall": 0.719359,
126
+ "recall_weighted": 0.668393,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.668393,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 18.570295572280884,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/MedrxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "e7a26af6f3ae46b30dde8737f02c07b1505bcc73",
3
+ "task_name": "MedrxivClusteringP2P",
4
+ "mteb_version": "2.10.12",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.304325,
9
+ "v_measure_std": 0.012447,
10
+ "v_measures": [
11
+ 0.294355,
12
+ 0.298097,
13
+ 0.286296,
14
+ 0.289325,
15
+ 0.294845,
16
+ 0.310458,
17
+ 0.321027,
18
+ 0.319648,
19
+ 0.312503,
20
+ 0.316694
21
+ ],
22
+ "main_score": 0.304325,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 61.879947662353516,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/MedrxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "35191c8c0dca72d8ff3efcd72aa802307d469663",
+ "task_name": "MedrxivClusteringS2S",
+ "mteb_version": "2.10.12",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.251473,
+ "v_measure_std": 0.015194,
+ "v_measures": [
+ 0.237388,
+ 0.239269,
+ 0.231219,
+ 0.237882,
+ 0.239119,
+ 0.267754,
+ 0.25499,
+ 0.271697,
+ 0.267987,
+ 0.26743
+ ],
+ "main_score": 0.251473,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 51.59626126289368,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/MindSmallReranking.json ADDED
@@ -0,0 +1,252 @@
+ {
+ "dataset_revision": "227478e3235572039f4f7661840e059f31ef6eb1",
+ "task_name": "MindSmallReranking",
+ "mteb_version": "2.10.12",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.12488,
+ "ndcg_at_3": 0.19852,
+ "ndcg_at_5": 0.24408,
+ "ndcg_at_10": 0.30643,
+ "ndcg_at_20": 0.35975,
+ "ndcg_at_100": 0.42412,
+ "ndcg_at_1000": 0.42767,
+ "map_at_1": 0.09538,
+ "map_at_3": 0.16278,
+ "map_at_5": 0.18968,
+ "map_at_10": 0.21748,
+ "map_at_20": 0.23482,
+ "map_at_100": 0.24853,
+ "map_at_1000": 0.24897,
+ "recall_at_1": 0.09538,
+ "recall_at_3": 0.24732,
+ "recall_at_5": 0.35599,
+ "recall_at_10": 0.53246,
+ "recall_at_20": 0.71439,
+ "recall_at_100": 0.98283,
+ "recall_at_1000": 1.0,
+ "accuracy": 0.09538,
+ "precision_at_1": 0.12488,
+ "precision_at_3": 0.1096,
+ "precision_at_5": 0.09693,
+ "precision_at_10": 0.0764,
+ "precision_at_20": 0.05494,
+ "precision_at_100": 0.01763,
+ "precision_at_1000": 0.00183,
+ "mrr_at_1": 0.124882,
+ "mrr_at_3": 0.205615,
+ "mrr_at_5": 0.234869,
+ "mrr_at_10": 0.260852,
+ "mrr_at_20": 0.273243,
+ "mrr_at_100": 0.27873,
+ "mrr_at_1000": 0.278776,
+ "nauc_ndcg_at_1_max": -0.098919,
+ "nauc_ndcg_at_1_std": -0.009063,
+ "nauc_ndcg_at_1_diff1": 0.122263,
+ "nauc_ndcg_at_3_max": -0.207895,
+ "nauc_ndcg_at_3_std": -0.056492,
+ "nauc_ndcg_at_3_diff1": 0.137183,
+ "nauc_ndcg_at_5_max": -0.246361,
+ "nauc_ndcg_at_5_std": -0.06611,
+ "nauc_ndcg_at_5_diff1": 0.133926,
+ "nauc_ndcg_at_10_max": -0.284735,
+ "nauc_ndcg_at_10_std": -0.07363,
+ "nauc_ndcg_at_10_diff1": 0.129436,
+ "nauc_ndcg_at_20_max": -0.295387,
+ "nauc_ndcg_at_20_std": -0.072006,
+ "nauc_ndcg_at_20_diff1": 0.127416,
+ "nauc_ndcg_at_100_max": -0.227769,
+ "nauc_ndcg_at_100_std": -0.053103,
+ "nauc_ndcg_at_100_diff1": 0.125532,
+ "nauc_ndcg_at_1000_max": -0.215506,
+ "nauc_ndcg_at_1000_std": -0.051425,
+ "nauc_ndcg_at_1000_diff1": 0.125028,
+ "nauc_map_at_1_max": -0.169264,
+ "nauc_map_at_1_std": -0.042674,
+ "nauc_map_at_1_diff1": 0.145497,
+ "nauc_map_at_3_max": -0.221901,
+ "nauc_map_at_3_std": -0.063588,
+ "nauc_map_at_3_diff1": 0.145752,
+ "nauc_map_at_5_max": -0.241508,
+ "nauc_map_at_5_std": -0.06724,
+ "nauc_map_at_5_diff1": 0.141901,
+ "nauc_map_at_10_max": -0.257848,
+ "nauc_map_at_10_std": -0.06938,
+ "nauc_map_at_10_diff1": 0.138745,
+ "nauc_map_at_20_max": -0.259416,
+ "nauc_map_at_20_std": -0.067857,
+ "nauc_map_at_20_diff1": 0.137454,
+ "nauc_map_at_100_max": -0.245405,
+ "nauc_map_at_100_std": -0.063405,
+ "nauc_map_at_100_diff1": 0.136866,
+ "nauc_map_at_1000_max": -0.244113,
+ "nauc_map_at_1000_std": -0.063225,
+ "nauc_map_at_1000_diff1": 0.136821,
+ "nauc_recall_at_1_max": -0.169264,
+ "nauc_recall_at_1_std": -0.042674,
+ "nauc_recall_at_1_diff1": 0.145497,
+ "nauc_recall_at_3_max": -0.25088,
+ "nauc_recall_at_3_std": -0.075601,
+ "nauc_recall_at_3_diff1": 0.135896,
+ "nauc_recall_at_5_max": -0.302693,
+ "nauc_recall_at_5_std": -0.087196,
+ "nauc_recall_at_5_diff1": 0.124967,
+ "nauc_recall_at_10_max": -0.39333,
+ "nauc_recall_at_10_std": -0.106836,
+ "nauc_recall_at_10_diff1": 0.113586,
+ "nauc_recall_at_20_max": -0.491824,
+ "nauc_recall_at_20_std": -0.120887,
+ "nauc_recall_at_20_diff1": 0.107092,
+ "nauc_recall_at_100_max": -0.810174,
+ "nauc_recall_at_100_std": -0.123547,
+ "nauc_recall_at_100_diff1": 0.112294,
+ "nauc_recall_at_1000_max": -0.788982,
+ "nauc_recall_at_1000_std": -0.067133,
+ "nauc_recall_at_1000_diff1": -0.089823,
+ "nauc_precision_at_1_max": -0.098919,
+ "nauc_precision_at_1_std": -0.009063,
+ "nauc_precision_at_1_diff1": 0.122263,
+ "nauc_precision_at_3_max": -0.165937,
+ "nauc_precision_at_3_std": -0.032429,
+ "nauc_precision_at_3_diff1": 0.110777,
+ "nauc_precision_at_5_max": -0.191458,
+ "nauc_precision_at_5_std": -0.033232,
+ "nauc_precision_at_5_diff1": 0.093247,
+ "nauc_precision_at_10_max": -0.175322,
+ "nauc_precision_at_10_std": -0.019614,
+ "nauc_precision_at_10_diff1": 0.053311,
+ "nauc_precision_at_20_max": -0.065268,
+ "nauc_precision_at_20_std": 0.017691,
+ "nauc_precision_at_20_diff1": 0.006087,
+ "nauc_precision_at_100_max": 0.261064,
+ "nauc_precision_at_100_std": 0.086402,
+ "nauc_precision_at_100_diff1": -0.059402,
+ "nauc_precision_at_1000_max": 0.292315,
+ "nauc_precision_at_1000_std": 0.08814,
+ "nauc_precision_at_1000_diff1": -0.062116,
+ "nauc_mrr_at_1_max": -0.098919,
+ "nauc_mrr_at_1_std": -0.009063,
+ "nauc_mrr_at_1_diff1": 0.122263,
+ "nauc_mrr_at_3_max": -0.147941,
+ "nauc_mrr_at_3_std": -0.030114,
+ "nauc_mrr_at_3_diff1": 0.122004,
+ "nauc_mrr_at_5_max": -0.164765,
+ "nauc_mrr_at_5_std": -0.034314,
+ "nauc_mrr_at_5_diff1": 0.120039,
+ "nauc_mrr_at_10_max": -0.176443,
+ "nauc_mrr_at_10_std": -0.036872,
+ "nauc_mrr_at_10_diff1": 0.118896,
+ "nauc_mrr_at_20_max": -0.176877,
+ "nauc_mrr_at_20_std": -0.036473,
+ "nauc_mrr_at_20_diff1": 0.119153,
+ "nauc_mrr_at_100_max": -0.172641,
+ "nauc_mrr_at_100_std": -0.035365,
+ "nauc_mrr_at_100_diff1": 0.119588,
+ "nauc_mrr_at_1000_max": -0.172525,
+ "nauc_mrr_at_1000_std": -0.035349,
+ "nauc_mrr_at_1000_diff1": 0.119591,
+ "hit_rate_at_1": 0.12488,
+ "hit_rate_at_3": 0.31303,
+ "hit_rate_at_5": 0.44189,
+ "hit_rate_at_10": 0.63645,
+ "hit_rate_at_20": 0.81262,
+ "hit_rate_at_100": 0.99422,
+ "hit_rate_at_1000": 1.0,
+ "max_over_subqueries_ndcg_at_1": 0.16335,
+ "max_over_subqueries_ndcg_at_3": 0.25648,
+ "max_over_subqueries_ndcg_at_5": 0.30706,
+ "max_over_subqueries_ndcg_at_10": 0.3693,
+ "max_over_subqueries_ndcg_at_20": 0.41641,
+ "max_over_subqueries_ndcg_at_100": 0.46321,
+ "max_over_subqueries_ndcg_at_1000": 0.46515,
+ "max_over_subqueries_map_at_1": 0.13585,
+ "max_over_subqueries_map_at_3": 0.21876,
+ "max_over_subqueries_map_at_5": 0.2483,
+ "max_over_subqueries_map_at_10": 0.27601,
+ "max_over_subqueries_map_at_20": 0.29116,
+ "max_over_subqueries_map_at_100": 0.30084,
+ "max_over_subqueries_map_at_1000": 0.30104,
+ "max_over_subqueries_recall_at_1": 0.13585,
+ "max_over_subqueries_recall_at_3": 0.32143,
+ "max_over_subqueries_recall_at_5": 0.44129,
+ "max_over_subqueries_recall_at_10": 0.61992,
+ "max_over_subqueries_recall_at_20": 0.78529,
+ "max_over_subqueries_recall_at_100": 0.98951,
+ "max_over_subqueries_recall_at_1000": 0.99999,
+ "max_over_subqueries_accuracy": 0.13585,
+ "max_over_subqueries_precision_at_1": 0.16335,
+ "max_over_subqueries_precision_at_3": 0.13127,
+ "max_over_subqueries_precision_at_5": 0.11042,
+ "max_over_subqueries_precision_at_10": 0.08081,
+ "max_over_subqueries_precision_at_20": 0.05388,
+ "max_over_subqueries_precision_at_100": 0.01493,
+ "max_over_subqueries_precision_at_1000": 0.00152,
+ "max_over_subqueries_mrr_at_1_max": -0.090392,
+ "max_over_subqueries_mrr_at_1_std": 0.005459,
+ "max_over_subqueries_mrr_at_1_diff1": 0.133363,
+ "max_over_subqueries_mrr_at_3_max": -0.161398,
+ "max_over_subqueries_mrr_at_3_std": -0.038464,
+ "max_over_subqueries_mrr_at_3_diff1": 0.092479,
+ "max_over_subqueries_mrr_at_5_max": -0.175499,
+ "max_over_subqueries_mrr_at_5_std": -0.050052,
+ "max_over_subqueries_mrr_at_5_diff1": 0.069578,
+ "max_over_subqueries_mrr_at_10_max": -0.131875,
+ "max_over_subqueries_mrr_at_10_std": -0.043984,
+ "max_over_subqueries_mrr_at_10_diff1": 0.034711,
+ "max_over_subqueries_mrr_at_20_max": 0.021877,
+ "max_over_subqueries_mrr_at_20_std": 0.017501,
+ "max_over_subqueries_mrr_at_20_diff1": 0.010941,
+ "max_over_subqueries_mrr_at_100_max": 0.380003,
+ "max_over_subqueries_mrr_at_100_std": 0.154387,
+ "max_over_subqueries_mrr_at_100_diff1": -0.015138,
+ "max_over_subqueries_mrr_at_1000_max": 0.402237,
+ "max_over_subqueries_mrr_at_1000_std": 0.159152,
+ "max_over_subqueries_mrr_at_1000_diff1": -0.016636,
+ "max_over_subqueries_mrr_at_1": 0.163354,
+ "max_over_subqueries_mrr_at_3": 0.25612,
+ "max_over_subqueries_mrr_at_5": 0.286192,
+ "max_over_subqueries_mrr_at_10": 0.310937,
+ "max_over_subqueries_mrr_at_20": 0.321854,
+ "max_over_subqueries_mrr_at_100": 0.326305,
+ "max_over_subqueries_mrr_at_1000": 0.326345,
+ "max_over_subqueries_nauc_mrr_at_1_max": -0.090392,
+ "max_over_subqueries_nauc_mrr_at_1_std": 0.005459,
+ "max_over_subqueries_nauc_mrr_at_1_diff1": 0.133363,
+ "max_over_subqueries_nauc_mrr_at_3_max": -0.140351,
+ "max_over_subqueries_nauc_mrr_at_3_std": -0.024796,
+ "max_over_subqueries_nauc_mrr_at_3_diff1": 0.115522,
+ "max_over_subqueries_nauc_mrr_at_5_max": -0.154342,
+ "max_over_subqueries_nauc_mrr_at_5_std": -0.032357,
+ "max_over_subqueries_nauc_mrr_at_5_diff1": 0.11181,
+ "max_over_subqueries_nauc_mrr_at_10_max": -0.162515,
+ "max_over_subqueries_nauc_mrr_at_10_std": -0.036825,
+ "max_over_subqueries_nauc_mrr_at_10_diff1": 0.11061,
+ "max_over_subqueries_nauc_mrr_at_20_max": -0.161477,
+ "max_over_subqueries_nauc_mrr_at_20_std": -0.035703,
+ "max_over_subqueries_nauc_mrr_at_20_diff1": 0.111926,
+ "max_over_subqueries_nauc_mrr_at_100_max": -0.157975,
+ "max_over_subqueries_nauc_mrr_at_100_std": -0.033866,
+ "max_over_subqueries_nauc_mrr_at_100_diff1": 0.112638,
+ "max_over_subqueries_nauc_mrr_at_1000_max": -0.157911,
+ "max_over_subqueries_nauc_mrr_at_1000_std": -0.033841,
+ "max_over_subqueries_nauc_mrr_at_1000_diff1": 0.11264,
+ "max_over_subqueries_hit_rate_at_1": 0.16335,
+ "max_over_subqueries_hit_rate_at_3": 0.37756,
+ "max_over_subqueries_hit_rate_at_5": 0.50999,
+ "max_over_subqueries_hit_rate_at_10": 0.69496,
+ "max_over_subqueries_hit_rate_at_20": 0.84949,
+ "max_over_subqueries_hit_rate_at_100": 0.99509,
+ "max_over_subqueries_hit_rate_at_1000": 0.99999,
+ "main_score": 0.30104,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 3437.6135506629944,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/NFCorpus.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "ec0fa4fe99da2ff19ca1214b7966684033a58814",
+ "task_name": "NFCorpus",
+ "mteb_version": "2.10.12",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.31579,
+ "ndcg_at_3": 0.27462,
+ "ndcg_at_5": 0.25569,
+ "ndcg_at_10": 0.23834,
+ "ndcg_at_20": 0.21919,
+ "ndcg_at_100": 0.21742,
+ "ndcg_at_1000": 0.30869,
+ "map_at_1": 0.03145,
+ "map_at_3": 0.05121,
+ "map_at_5": 0.0577,
+ "map_at_10": 0.07051,
+ "map_at_20": 0.07759,
+ "map_at_100": 0.09077,
+ "map_at_1000": 0.10291,
+ "recall_at_1": 0.03145,
+ "recall_at_3": 0.0623,
+ "recall_at_5": 0.07403,
+ "recall_at_10": 0.10477,
+ "recall_at_20": 0.12896,
+ "recall_at_100": 0.22975,
+ "recall_at_1000": 0.55977,
+ "accuracy": 0.03145,
+ "precision_at_1": 0.33437,
+ "precision_at_3": 0.26006,
+ "precision_at_5": 0.22353,
+ "precision_at_10": 0.18421,
+ "precision_at_20": 0.13653,
+ "precision_at_100": 0.06133,
+ "precision_at_1000": 0.01877,
+ "mrr_at_1": 0.337461,
+ "mrr_at_3": 0.396285,
+ "mrr_at_5": 0.408669,
+ "mrr_at_10": 0.416509,
+ "mrr_at_20": 0.421025,
+ "mrr_at_100": 0.424748,
+ "mrr_at_1000": 0.425346,
+ "nauc_ndcg_at_1_max": 0.459318,
+ "nauc_ndcg_at_1_std": 0.27033,
+ "nauc_ndcg_at_1_diff1": 0.316696,
+ "nauc_ndcg_at_3_max": 0.474554,
+ "nauc_ndcg_at_3_std": 0.306883,
+ "nauc_ndcg_at_3_diff1": 0.218656,
+ "nauc_ndcg_at_5_max": 0.470383,
+ "nauc_ndcg_at_5_std": 0.315809,
+ "nauc_ndcg_at_5_diff1": 0.188027,
+ "nauc_ndcg_at_10_max": 0.478836,
+ "nauc_ndcg_at_10_std": 0.338947,
+ "nauc_ndcg_at_10_diff1": 0.169437,
+ "nauc_ndcg_at_20_max": 0.48157,
+ "nauc_ndcg_at_20_std": 0.351689,
+ "nauc_ndcg_at_20_diff1": 0.168868,
+ "nauc_ndcg_at_100_max": 0.480831,
+ "nauc_ndcg_at_100_std": 0.363828,
+ "nauc_ndcg_at_100_diff1": 0.189329,
+ "nauc_ndcg_at_1000_max": 0.512654,
+ "nauc_ndcg_at_1000_std": 0.409454,
+ "nauc_ndcg_at_1000_diff1": 0.186264,
+ "nauc_map_at_1_max": 0.329477,
+ "nauc_map_at_1_std": 0.004434,
+ "nauc_map_at_1_diff1": 0.383079,
+ "nauc_map_at_3_max": 0.275909,
+ "nauc_map_at_3_std": 0.016162,
+ "nauc_map_at_3_diff1": 0.304976,
+ "nauc_map_at_5_max": 0.304369,
+ "nauc_map_at_5_std": 0.045549,
+ "nauc_map_at_5_diff1": 0.29173,
+ "nauc_map_at_10_max": 0.337623,
+ "nauc_map_at_10_std": 0.092832,
+ "nauc_map_at_10_diff1": 0.281856,
+ "nauc_map_at_20_max": 0.35907,
+ "nauc_map_at_20_std": 0.133461,
+ "nauc_map_at_20_diff1": 0.267778,
+ "nauc_map_at_100_max": 0.387407,
+ "nauc_map_at_100_std": 0.197857,
+ "nauc_map_at_100_diff1": 0.237283,
+ "nauc_map_at_1000_max": 0.40509,
+ "nauc_map_at_1000_std": 0.24319,
+ "nauc_map_at_1000_diff1": 0.213861,
+ "nauc_recall_at_1_max": 0.329477,
+ "nauc_recall_at_1_std": 0.004434,
+ "nauc_recall_at_1_diff1": 0.383079,
+ "nauc_recall_at_3_max": 0.211063,
+ "nauc_recall_at_3_std": 0.00042,
+ "nauc_recall_at_3_diff1": 0.236115,
+ "nauc_recall_at_5_max": 0.239416,
+ "nauc_recall_at_5_std": 0.035873,
+ "nauc_recall_at_5_diff1": 0.212737,
+ "nauc_recall_at_10_max": 0.263227,
+ "nauc_recall_at_10_std": 0.098066,
+ "nauc_recall_at_10_diff1": 0.201827,
+ "nauc_recall_at_20_max": 0.295839,
+ "nauc_recall_at_20_std": 0.160337,
+ "nauc_recall_at_20_diff1": 0.188195,
+ "nauc_recall_at_100_max": 0.294902,
+ "nauc_recall_at_100_std": 0.260836,
+ "nauc_recall_at_100_diff1": 0.103631,
+ "nauc_recall_at_1000_max": 0.216515,
+ "nauc_recall_at_1000_std": 0.23711,
+ "nauc_recall_at_1000_diff1": 0.021351,
+ "nauc_precision_at_1_max": 0.48935,
+ "nauc_precision_at_1_std": 0.27657,
+ "nauc_precision_at_1_diff1": 0.339521,
+ "nauc_precision_at_3_max": 0.469932,
+ "nauc_precision_at_3_std": 0.344502,
+ "nauc_precision_at_3_diff1": 0.170023,
+ "nauc_precision_at_5_max": 0.460347,
+ "nauc_precision_at_5_std": 0.362955,
+ "nauc_precision_at_5_diff1": 0.112784,
+ "nauc_precision_at_10_max": 0.466397,
+ "nauc_precision_at_10_std": 0.416047,
+ "nauc_precision_at_10_diff1": 0.047258,
+ "nauc_precision_at_20_max": 0.454347,
+ "nauc_precision_at_20_std": 0.466297,
+ "nauc_precision_at_20_diff1": -0.011887,
+ "nauc_precision_at_100_max": 0.385027,
+ "nauc_precision_at_100_std": 0.512,
+ "nauc_precision_at_100_diff1": -0.080123,
+ "nauc_precision_at_1000_max": 0.275528,
+ "nauc_precision_at_1000_std": 0.419107,
+ "nauc_precision_at_1000_diff1": -0.111991,
+ "nauc_mrr_at_1_max": 0.493566,
+ "nauc_mrr_at_1_std": 0.275078,
+ "nauc_mrr_at_1_diff1": 0.330219,
+ "nauc_mrr_at_3_max": 0.518618,
+ "nauc_mrr_at_3_std": 0.316954,
+ "nauc_mrr_at_3_diff1": 0.31088,
+ "nauc_mrr_at_5_max": 0.516216,
+ "nauc_mrr_at_5_std": 0.315947,
+ "nauc_mrr_at_5_diff1": 0.301179,
+ "nauc_mrr_at_10_max": 0.5163,
+ "nauc_mrr_at_10_std": 0.318001,
+ "nauc_mrr_at_10_diff1": 0.300749,
+ "nauc_mrr_at_20_max": 0.520015,
+ "nauc_mrr_at_20_std": 0.322258,
+ "nauc_mrr_at_20_diff1": 0.298741,
+ "nauc_mrr_at_100_max": 0.520575,
+ "nauc_mrr_at_100_std": 0.323967,
+ "nauc_mrr_at_100_diff1": 0.300687,
+ "nauc_mrr_at_1000_max": 0.520306,
+ "nauc_mrr_at_1000_std": 0.323464,
+ "nauc_mrr_at_1000_diff1": 0.300687,
+ "hit_rate_at_1": 0.33437,
+ "hit_rate_at_3": 0.47059,
+ "hit_rate_at_5": 0.52322,
+ "hit_rate_at_10": 0.57895,
+ "hit_rate_at_20": 0.64396,
+ "hit_rate_at_100": 0.78947,
+ "hit_rate_at_1000": 0.94427,
+ "main_score": 0.23834,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 8.434162616729736,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/NQ.json ADDED
@@ -0,0 +1,167 @@
+ {
+ "dataset_revision": "b774495ed302d8c44a3a7ea25c90dbce03968f31",
+ "task_name": "NQ",
+ "mteb_version": "2.10.12",
+ "scores": {
+ "test": [
+ {
+ "ndcg_at_1": 0.17555,
+ "ndcg_at_3": 0.23645,
+ "ndcg_at_5": 0.26416,
+ "ndcg_at_10": 0.29353,
+ "ndcg_at_20": 0.31533,
+ "ndcg_at_100": 0.34906,
+ "ndcg_at_1000": 0.37007,
+ "map_at_1": 0.15423,
+ "map_at_3": 0.21278,
+ "map_at_5": 0.22864,
+ "map_at_10": 0.24153,
+ "map_at_20": 0.24786,
+ "map_at_100": 0.25291,
+ "map_at_1000": 0.25372,
+ "recall_at_1": 0.15423,
+ "recall_at_3": 0.28286,
+ "recall_at_5": 0.34688,
+ "recall_at_10": 0.43272,
+ "recall_at_20": 0.51504,
+ "recall_at_100": 0.68781,
+ "recall_at_1000": 0.84806,
+ "accuracy": 0.15423,
+ "precision_at_1": 0.17555,
+ "precision_at_3": 0.10921,
+ "precision_at_5": 0.08146,
+ "precision_at_10": 0.05122,
+ "precision_at_20": 0.03068,
+ "precision_at_100": 0.00829,
+ "precision_at_1000": 0.00103,
+ "mrr_at_1": 0.17555,
+ "mrr_at_3": 0.235419,
+ "mrr_at_5": 0.25099,
+ "mrr_at_10": 0.262868,
+ "mrr_at_20": 0.268445,
+ "mrr_at_100": 0.27266,
+ "mrr_at_1000": 0.273298,
+ "nauc_ndcg_at_1_max": 0.222918,
+ "nauc_ndcg_at_1_std": 0.025937,
+ "nauc_ndcg_at_1_diff1": 0.335423,
+ "nauc_ndcg_at_3_max": 0.231257,
+ "nauc_ndcg_at_3_std": 0.035808,
+ "nauc_ndcg_at_3_diff1": 0.275095,
+ "nauc_ndcg_at_5_max": 0.245777,
+ "nauc_ndcg_at_5_std": 0.056148,
+ "nauc_ndcg_at_5_diff1": 0.268844,
+ "nauc_ndcg_at_10_max": 0.2664,
+ "nauc_ndcg_at_10_std": 0.086454,
+ "nauc_ndcg_at_10_diff1": 0.259381,
+ "nauc_ndcg_at_20_max": 0.273978,
+ "nauc_ndcg_at_20_std": 0.093674,
+ "nauc_ndcg_at_20_diff1": 0.262795,
+ "nauc_ndcg_at_100_max": 0.287623,
+ "nauc_ndcg_at_100_std": 0.120658,
+ "nauc_ndcg_at_100_diff1": 0.25836,
+ "nauc_ndcg_at_1000_max": 0.288121,
+ "nauc_ndcg_at_1000_std": 0.122057,
+ "nauc_ndcg_at_1000_diff1": 0.260097,
+ "nauc_map_at_1_max": 0.206158,
+ "nauc_map_at_1_std": 0.007317,
+ "nauc_map_at_1_diff1": 0.337521,
+ "nauc_map_at_3_max": 0.225809,
+ "nauc_map_at_3_std": 0.026891,
+ "nauc_map_at_3_diff1": 0.2882,
+ "nauc_map_at_5_max": 0.235297,
+ "nauc_map_at_5_std": 0.039533,
+ "nauc_map_at_5_diff1": 0.284212,
+ "nauc_map_at_10_max": 0.245802,
+ "nauc_map_at_10_std": 0.05448,
+ "nauc_map_at_10_diff1": 0.279804,
+ "nauc_map_at_20_max": 0.248346,
+ "nauc_map_at_20_std": 0.056863,
+ "nauc_map_at_20_diff1": 0.281184,
+ "nauc_map_at_100_max": 0.250765,
+ "nauc_map_at_100_std": 0.061441,
+ "nauc_map_at_100_diff1": 0.280413,
+ "nauc_map_at_1000_max": 0.250821,
+ "nauc_map_at_1000_std": 0.061609,
+ "nauc_map_at_1000_diff1": 0.28043,
+ "nauc_recall_at_1_max": 0.206158,
+ "nauc_recall_at_1_std": 0.007317,
+ "nauc_recall_at_1_diff1": 0.337521,
+ "nauc_recall_at_3_max": 0.22917,
+ "nauc_recall_at_3_std": 0.042829,
+ "nauc_recall_at_3_diff1": 0.239276,
+ "nauc_recall_at_5_max": 0.253978,
+ "nauc_recall_at_5_std": 0.07995,
+ "nauc_recall_at_5_diff1": 0.225205,
+ "nauc_recall_at_10_max": 0.301655,
+ "nauc_recall_at_10_std": 0.153119,
+ "nauc_recall_at_10_diff1": 0.199061,
+ "nauc_recall_at_20_max": 0.327534,
+ "nauc_recall_at_20_std": 0.179411,
+ "nauc_recall_at_20_diff1": 0.207237,
+ "nauc_recall_at_100_max": 0.415498,
+ "nauc_recall_at_100_std": 0.346198,
+ "nauc_recall_at_100_diff1": 0.17209,
+ "nauc_recall_at_1000_max": 0.5239,
+ "nauc_recall_at_1000_std": 0.531621,
+ "nauc_recall_at_1000_diff1": 0.142387,
+ "nauc_precision_at_1_max": 0.222918,
+ "nauc_precision_at_1_std": 0.025937,
+ "nauc_precision_at_1_diff1": 0.335423,
+ "nauc_precision_at_3_max": 0.25372,
+ "nauc_precision_at_3_std": 0.06511,
+ "nauc_precision_at_3_diff1": 0.234871,
+ "nauc_precision_at_5_max": 0.28103,
+ "nauc_precision_at_5_std": 0.112169,
+ "nauc_precision_at_5_diff1": 0.217237,
+ "nauc_precision_at_10_max": 0.316536,
+ "nauc_precision_at_10_std": 0.185983,
+ "nauc_precision_at_10_diff1": 0.183687,
+ "nauc_precision_at_20_max": 0.328022,
+ "nauc_precision_at_20_std": 0.20735,
+ "nauc_precision_at_20_diff1": 0.179215,
+ "nauc_precision_at_100_max": 0.336547,
+ "nauc_precision_at_100_std": 0.309329,
+ "nauc_precision_at_100_diff1": 0.113377,
+ "nauc_precision_at_1000_max": 0.284647,
+ "nauc_precision_at_1000_std": 0.316097,
+ "nauc_precision_at_1000_diff1": 0.047262,
+ "nauc_mrr_at_1_max": 0.222918,
+ "nauc_mrr_at_1_std": 0.025937,
+ "nauc_mrr_at_1_diff1": 0.335423,
+ "nauc_mrr_at_3_max": 0.236996,
+ "nauc_mrr_at_3_std": 0.043899,
+ "nauc_mrr_at_3_diff1": 0.285884,
+ "nauc_mrr_at_5_max": 0.244093,
+ "nauc_mrr_at_5_std": 0.055587,
+ "nauc_mrr_at_5_diff1": 0.282415,
+ "nauc_mrr_at_10_max": 0.252022,
+ "nauc_mrr_at_10_std": 0.066881,
+ "nauc_mrr_at_10_diff1": 0.279005,
+ "nauc_mrr_at_20_max": 0.253407,
+ "nauc_mrr_at_20_std": 0.067787,
+ "nauc_mrr_at_20_diff1": 0.279861,
+ "nauc_mrr_at_100_max": 0.254793,
+ "nauc_mrr_at_100_std": 0.070438,
+ "nauc_mrr_at_100_diff1": 0.279326,
+ "nauc_mrr_at_1000_max": 0.25482,
+ "nauc_mrr_at_1000_std": 0.070489,
+ "nauc_mrr_at_1000_diff1": 0.279424,
+ "hit_rate_at_1": 0.17555,
+ "hit_rate_at_3": 0.31373,
+ "hit_rate_at_5": 0.3821,
+ "hit_rate_at_10": 0.46929,
+ "hit_rate_at_20": 0.55041,
+ "hit_rate_at_100": 0.71495,
+ "hit_rate_at_1000": 0.86269,
+ "main_score": 0.29353,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 3823.927045106888,
+ "kg_co2_emissions": null,
+ "date": null
+ }