

INTRODUCTION

Arabic has a widely varying collection of dialects, many of which remain under-studied due to the scarcity of resources. The goal of the Nuanced Arabic Dialect Identification (NADI) shared task series is to alleviate this bottleneck by providing datasets and modeling opportunities for participants to carry out dialect identification. Dialect identification is the task of automatically detecting the source variety of a given text or speech segment. In addition to nuanced dialect identification at the country level, NADI 2022 also offers a new subtask focused on country-level sentiment analysis. While we invite participation in either subtask, we hope that teams will submit systems to both. By offering two subtasks, we hope to receive systems that exploit diverse machine learning architectures. These could include multi-task learning systems as well as sequence-to-sequence architectures in a single model, such as the text-to-text Transformer; other approaches are also possible. We introduce the two subtasks next.

SHARED TASK

This shared task targets country-level dialect identification and sentiment analysis for dialectal Arabic. The subtasks are:

Subtask 1: Country-level dialect identification. In this subtask, we provide a Twitter training dataset that covers 18 dialects (a total of ~20K tweets, the same training data as NADI 2021). The final evaluation will be on two new test sets: the first test set (TEST-A) covers 18 country-level dialects, whereas the second test set (TEST-B) covers k country-level dialects, where k is kept unknown. The subtask score is the average of the scores on the two test sets.

Subtask 2: Sentiment analysis of country-level Arabic. This subtask provides a total of 5,000 tweets covering ten Arab countries (involving both MSA and dialects), manually labeled with tags from the set {positive, negative, neutral}. The dataset is split into 1,500 tweets for training (TRAIN), 500 tweets for development (DEV), and 3,000 tweets for testing (TEST). We intentionally provide a small training set to encourage a variety of approaches, including few-shot learning.

METRICS

The evaluation metrics will include precision, recall, F-score, and accuracy. Official metrics:
         (1) Macro-averaged F1 score will be the official metric for Subtask 1 (dialect ID).
         (2) Macro-F1-PN score, computed over the positive and negative classes only (ignoring the neutral class), will be the official metric for Subtask 2 (sentiment).
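
The official scorer is distributed by the organizers; as an illustration only, the macro-F1 metrics above can be sketched in plain Python (the function name and example labels below are ours, not part of the task release). Restricting the label set to {positive, negative} yields the Macro-F1-PN variant:

```python
def macro_f1(gold, pred, labels):
    """Macro-averaged F1 over the given label set only.
    Classes outside `labels` still affect the false positives/negatives
    of the scored classes, but contribute no F1 term of their own."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)

# Subtask 1: macro-F1 over all country-level dialect labels.
# Subtask 2 (Macro-F1-PN): score only positive/negative, ignore neutral.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "neutral", "neutral", "negative"]
pn_f1 = macro_f1(gold, pred, labels=["positive", "negative"])
```

The same function covers both subtasks: passing the full set of 18 dialect labels gives the Subtask 1 metric, while passing only the two polarity labels gives the PN variant for Subtask 2.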

Evaluation for the shared task will be hosted on CODALAB. Teams will be provided with a CODALAB link for each subtask.
         -CODALAB link for NADI Shared Task Subtask 1
         -CODALAB link for NADI Shared Task Subtask 2

DOWNLOAD DATASET

The training, development, and (unlabeled) test datasets have already been released to registered participants via email. The evaluation stage is over, but you can still score your system on CODALAB during the post-evaluation phase. By downloading the NADI-2022 shared task files from HERE, you agree to the terms of the license. Data registration form: https://forms.gle/kSjvKTsWD4wti1bJA

IMPORTANT DATES

         - Shared task announcement. Release of training data and scoring script.
         - August 15, 2022 (extended from August 7, 2022): Registration deadline
         - August 16, 2022 (extended from August 14, 2022): Test set made available
         - September 15, 2022 (extended from August 30, 2022): Codalab TEST system submission deadline
         - September 27, 2022 (extended from September 5, 2022): Shared task system paper submissions due
         - October 10, 2022: Notification of acceptance
         - October 21, 2022: Camera-ready version due
         - December 8, 2022: WANLP 2022 workshop at EMNLP in Abu Dhabi
         * All deadlines are 11:59 PM UTC-12:00 (Anywhere On Earth).

CONTACT

For any questions related to this shared task, please contact the organizers directly at ubc.nadi2020@gmail.com or join the Google group: https://groups.google.com/d/forum/nadi_shared_task.

ORGANIZERS

Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany (The University of British Columbia, Canada), Houda Bouamor (Carnegie Mellon University, Qatar), and Nizar Habash (New York University Abu Dhabi).