نادي ٢٠٢٥ | NADI 2025

Multidialectal Arabic Speech Processing

This shared task at ArabicNLP 2025 brings together three critical challenges in Arabic speech processing: Spoken Dialect Identification, Multidialectal ASR, and Diacritization Restoration. By leveraging diverse dialectal data and supporting both text-only and multimodal systems, the task aims to advance inclusive, robust, and generalizable speech technologies for the Arabic-speaking world.

Shared Task Subtasks

Three complementary tasks addressing core challenges in Arabic speech processing.

1

Spoken Arabic Dialect Identification (ADI)

Open Track

Objective

Despite notable progress in speech processing, Arabic dialect identification from speech remains a significant challenge due to the rich linguistic diversity of Arabic and the limited availability of labeled datasets. Earlier shared tasks on spoken Arabic dialect identification (e.g., Ali et al., 2017; Ali et al., 2019) laid important groundwork, but the field has since seen substantial advances, particularly with the rise of large-scale joint ASR and language identification models such as Whisper (Radford et al., 2023) and MMS (Pratap et al., 2024). This subtask is designed to (1) encourage community involvement in Arabic speech technology by providing a benchmark dataset and evaluation framework that supports innovation and collaboration, and (2) evaluate how language identification systems developed since the i-vector and x-vector eras perform on Arabic dialects. The task is of high practical relevance for multilingual AI systems, voice assistants, and digital accessibility, particularly in supporting Arabic speakers across dialect regions. It also holds promise for real-world applications, including automated transcription, conversational AI, and technologies for low-resource languages.

Dataset

  • Newly created, high-quality multidialectal Arabic speech corpus
  • 8 hours of dialect-annotated speech for adaptation
  • 8 hours for validation
  • 8 hours blind test set
  • No mandatory training data; external resources like ADI-5/ADI-17 are allowed.

Evaluation

  • Primary: Accuracy
  • Secondary: Average Cost (LRE 2022 formulation)
  • Baseline: Pretrained ECAPA-TDNN (VoxLingua107) system fine-tuned on the adaptation split
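As a sketch of how the primary metric might be computed on system outputs, the snippet below scores a list of predicted dialect labels against references. The dialect codes shown are illustrative placeholders, not the official label set, and the per-dialect breakdown is merely a suggestion for error analysis, not part of the official scoring.

```python
from collections import Counter

def accuracy(refs, hyps):
    """Fraction of utterances whose predicted dialect matches the reference."""
    assert len(refs) == len(hyps), "prediction/reference length mismatch"
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)

def per_dialect_accuracy(refs, hyps):
    """Accuracy broken down by reference dialect (useful for error analysis)."""
    totals, hits = Counter(), Counter()
    for r, h in zip(refs, hyps):
        totals[r] += 1
        hits[r] += int(r == h)
    return {d: hits[d] / totals[d] for d in totals}

# Toy example; "EGY", "MOR", "GLF" are illustrative codes only.
refs = ["EGY", "MOR", "EGY", "GLF"]
hyps = ["EGY", "EGY", "EGY", "GLF"]
print(accuracy(refs, hyps))  # 0.75
```

The secondary metric, Average Cost, follows the NIST LRE 2022 formulation and additionally weights per-dialect miss and false-alarm rates; consult the LRE 2022 evaluation plan for its exact definition.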
2

Multidialectal Arabic ASR

Open Track

Objective

The goal of this subtask is to develop ASR systems capable of accurately transcribing Arabic speech across multiple dialects. Participants must address challenges such as phonetic variation, code-switching, and dialectal diversity, using the Casablanca dataset as a benchmark for building robust, dialect-aware speech recognition models. Participants will be provided with training and validation/development data for model development, while a private, previously unseen test set will be hosted on Codabench. The subtask can be approached in a zero-shot setting, where models are evaluated directly on the validation set to prepare systems for blind testing; alternatively, participants can fine-tune models on the training data or use it for few-shot learning. For the SER subtask, participants will be evaluated directly on the test set via Codabench, without separate training or validation data provided.

Dataset

  • Casablanca dataset
  • Train: 12,800 utterances (1,600 per dialect)
  • Dev: 12,800 utterances
  • Test: 10,298 blind utterances
  • Total: 35,898 utterances

Evaluation

  • Primary: Word Error Rate (WER)
  • Secondary: Character Error Rate (CER)
  • Baseline: Zero-shot Whisper-large-v3
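For reference, WER and CER are both edit-distance ratios; the sketch below shows one self-contained way to compute them (participants would more likely use an established scoring tool, and the official scoring script may normalize text differently, e.g. for diacritics or punctuation).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution (or match)
        prev = curr
    return prev[-1]

def wer(ref, hyp):
    """Word Error Rate: word-level edits divided by reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    """Character Error Rate: character-level edits, with spaces removed."""
    r, h = ref.replace(" ", ""), hyp.replace(" ", "")
    return edit_distance(r, h) / len(r)
```

For example, a hypothesis that substitutes one of three reference words has WER 1/3.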
3

Diacritic Restoration

Open Track · Closed Track

Objective

The proposed task aims to advance research on automatic vowelization for spoken Arabic varieties. As the vast majority of existing vowelization or diacritic restoration efforts focus on Classical Arabic (CA) or Modern Standard Arabic (MSA), we aim to raise attention to more challenging spoken varieties, such as dialects and code-switching, with a focus on generalization across different varieties. The shared task also encourages multimodal (text/speech) approaches for distant supervision to achieve generalizable performance. Participants will be provided with baseline systems and relevant speech/text resources, and their submitted systems will be evaluated against manually annotated test sets that include CA, MSA, and dialects with code-switching instances.

Tracks

  • Track 1 (Open): Participants may use any speech or text data, including external resources, as long as test sets are excluded from training. All used resources must be documented in the system description paper.
  • Track 2 (Closed): Participants may use only the provided training/validation resources, for a fair and controlled comparison.

Datasets

  • Provided training and development sets for closed track are available here: https://huggingface.co/collections/MBZUAI/nadi-2025-sub-task-3-datasets-683739edbf94db861a4d4edf
  • The development sets are named test/development on Hugging Face. The official test set will be provided at a later date.
  • The datasets represent a wide range of Arabic varieties and recording conditions, with approximately 85K training sentences in total. They consist of dialectal, Modern Standard, Classical, and code-switched Arabic speech and diacritized transcriptions.

  Dataset    Type             Diacritized   Train
  MDASPC     Multi-dialectal  True          60,677
  TunSwitch  Dialectal, CS    True           5,212
  ArzEn      Dialectal, CS    False          3,344
  Mixat      Dialectal, CS    False          3,721
  ClArTTS    CA               True           9,500
  ArVoice    MSA              True           2,507

Table 1: Number of sentences in the datasets provided for the shared task. CA refers to Classical Arabic; CS refers to code-switched Arabic.

Evaluation

  • Primary: Word Error Rate (WER) and Character Error Rate (CER)
  • Baselines:
      - Multimodal: ArTST-based attention fusion
      - Text-only: CATT (https://arxiv.org/abs/2407.03236)
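Since Arabic diacritics are Unicode combining marks, a diacritization hypothesis can be sanity-checked by verifying that it changes only the marks and never the base letters. The sketch below illustrates this with the standard library; it is a utility suggestion, not part of the official evaluation.

```python
import unicodedata

def strip_diacritics(text):
    """Remove Arabic diacritics (harakat, shadda, tanwin), which are Unicode
    combining marks, leaving only the base letters."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))

def bases_match(ref, hyp):
    """A diacritization hypothesis should alter only diacritics, never the
    underlying letters: the two strings must agree once marks are stripped."""
    return strip_diacritics(ref) == strip_diacritics(hyp)

print(strip_diacritics("كَتَبَ"))  # كتب
```

WER and CER can then be computed over the fully diacritized strings, so that every missing or incorrect mark counts as an error.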

Possible Research Directions

  • Semi-supervised Data Augmentation: diacritization of speech transcripts using text-based diacritizers / LLMs for model training.
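One minimal way to realize this direction is to generate "silver" training pairs from undiacritized transcripts. In the sketch below, `diacritize` is a hypothetical placeholder for any text-based diacritizer or LLM wrapper returning a diacritized string with a confidence score; the confidence filter is one simple way to limit label noise.

```python
def build_silver_pairs(transcripts, diacritize, min_conf=0.9):
    """Create silver training pairs by auto-diacritizing plain transcripts.

    `diacritize` is a hypothetical callable returning
    (diacritized_text, confidence); low-confidence outputs are discarded.
    """
    pairs = []
    for text in transcripts:
        diacritized, confidence = diacritize(text)
        if confidence >= min_conf:
            pairs.append((text, diacritized))
    return pairs
```

The resulting pairs could then be mixed with the gold-diacritized datasets above for model training.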

Important Dates

Key milestones for the ArabicNLP 2025 Shared Task.

Training Data Release

June 10, 2025 (updated from June 1, 2025)

Release of training/dev data and evaluation scripts.

Registration Deadline

July 20, 2025

Final registration deadline and test set release.

Submission Deadline

July 25, 2025

Test submission deadline via Codabench.

Results Announcement

July 30, 2025

Final results released to participants.

Paper Submission

August 15, 2025

System description papers due.

Workshop

November 5–9, 2025

ArabicNLP 2025 Workshop in Suzhou, China.

Participation Guidelines

🤝 Please fill out [this form] to register and participate.

---

For each task, please participate through the Codabench link in the respective subtask section above. For any questions or clarifications, please visit our FAQ page. Detailed instructions for preparing and submitting your paper(s) can be found on the Paper Guidelines page.

Organizing Committee

Muhammad Abdul-Mageed, University of British Columbia

Bashar Talafha, University of British Columbia

Hawau Olamide Toyin, MBZUAI

Peter Sullivan, University of British Columbia

AbdelRahim Elmadany, University of British Columbia

Abdurrahman Juma, Birzeit University

Amirbek Djanibekov, MBZUAI

Chiyu Zhang, University of British Columbia

Hamad Alshehhi, MBZUAI

Hanan Aldarmaki, MBZUAI

Mustafa Jarrar, Birzeit University

Nizar Habash, New York University Abu Dhabi & MBZUAI
