PeterYu committed on
Commit 4650c36 · verified · 1 Parent(s): 826be5a

Update README.md

Files changed (1)
  1. README.md +84 -3
README.md CHANGED
@@ -1,3 +1,84 @@
- ---
- license: cc-by-4.0
- ---
+ ---
+ license: cc-by-4.0
+ datasets:
+ - masakhane/InjongoIntent
+ language:
+ - en
+ - am
+ - ee
+ - ha
+ - ig
+ - rw
+ - ln
+ - om
+ - sn
+ - sot
+ - sw
+ - tw
+ - wo
+ - xh
+ - yo
+ - zu
+ - lg
+ base_model:
+ - google/gemma-2-9b-it
+ library_name: transformers
+ metrics:
+ - accuracy
+ ---
+
+ # INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
+ <!--
+ ## Evaluation Comparison
+ -->
+ ## Language Codes
+
+ - **eng**: English
+ - **amh**: Amharic
+ - **ewe**: Ewe
+ - **hau**: Hausa
+ - **ibo**: Igbo
+ - **kin**: Kinyarwanda
+ - **lin**: Lingala
+ - **lug**: Luganda
+ - **orm**: Oromo
+ - **sna**: Shona
+ - **sot**: Sesotho
+ - **swa**: Swahili
+ - **twi**: Twi
+ - **wol**: Wolof
+ - **xho**: Xhosa
+ - **yor**: Yoruba
+ - **zul**: Zulu
+
+ ## Notes
+
+ - **Bold** values indicate the best-performing scores in each category
+ - The highlighted models (AfroXLMR 76L) show the top overall performance
+ - Multi-lingual training generally outperforms in-language training
+ - Standard deviations are reported alongside average scores
+ - AVG does not include English results.
+
+ ### Citation
+ ```
+ @misc{yu2025injongo,
+ title={INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages},
+ author={Hao Yu and Jesujoba O. Alabi and Andiswa Bukula and Jian Yun Zhuang and En-Shiun Annie Lee and Tadesse Kebede Guge and Israel Abebe Azime and Happy Buzaaba and Blessing Kudzaishe Sibanda and Godson K. Kalipe and Jonathan Mukiibi and Salomon Kabongo Kabenamualu and Mmasibidi Setaka and Lolwethu Ndolela and Nkiruka Odu and Rooweither Mabuya and Shamsuddeen Hassan Muhammad and Salomey Osei and Sokhar Samb and Juliet W. Murage and Dietrich Klakow and David Ifeoluwa Adelani},
+ year={2025},
+ eprint={2502.09814},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2502.09814},
+ }
+ ```
+
+ ```
+ @misc{adelani2023sib200,
+ title={SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects},
+ author={David Ifeoluwa Adelani and Hannah Liu and Xiaoyu Shen and Nikita Vassilyev and Jesujoba O. Alabi and Yanke Mao and Haonan Gao and Annie En-Shiun Lee},
+ year={2023},
+ eprint={2309.07445},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
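
The Notes in the README state that AVG leaves out English. A minimal Python sketch of that convention follows. The language codes are copied from the README's "Language Codes" section; the function name `avg_excluding_english` and the example scores are hypothetical and are not results from the paper. The README does not say how the reported standard deviations are computed, so the sketch does not attempt that.

```python
# Language codes copied from the README's "Language Codes" section.
LANGUAGE_CODES = {
    "eng": "English", "amh": "Amharic", "ewe": "Ewe", "hau": "Hausa",
    "ibo": "Igbo", "kin": "Kinyarwanda", "lin": "Lingala", "lug": "Luganda",
    "orm": "Oromo", "sna": "Shona", "sot": "Sesotho", "swa": "Swahili",
    "twi": "Twi", "wol": "Wolof", "xho": "Xhosa", "yor": "Yoruba",
    "zul": "Zulu",
}


def avg_excluding_english(scores):
    """Average per-language scores over every listed language except English.

    Hypothetical helper: `scores` maps a language code to a single score.
    """
    unknown = set(scores) - set(LANGUAGE_CODES)
    if unknown:
        raise ValueError(f"Unknown language codes: {sorted(unknown)}")
    # Drop English so the average covers the African languages only.
    african = [value for code, value in scores.items() if code != "eng"]
    if not african:
        raise ValueError("No non-English scores to average")
    return sum(african) / len(african)


# Hypothetical accuracies, for illustration only (not results from the paper).
example_scores = {"eng": 90.0, "amh": 80.0, "hau": 70.0}
print(avg_excluding_english(example_scores))  # 75.0
```

Here the English score of 90.0 is ignored, so the result is the mean of the two remaining scores.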