Dependency Parsing for 7 Indian Languages

Introduction

The Language Technologies Research Centre (LTRC) at the International Institute of Information Technology, Hyderabad (IIITH) is pleased to announce a shared task on dependency parsing for 7 Indian languages: Hindi, Kannada, Bengali, Telugu, Malayalam, Marathi, and Urdu.

Dependency parsing is an essential NLP task that identifies the grammatical structure of a sentence and the relationships between its words. It benefits natural language processing applications such as machine translation, dialog systems, summarization, and question answering.

Task Objective

The main goal of this shared task is to develop multilingual neural dependency parsing models for Indian languages. The dependency-annotated data also contains rich linguistic information such as part-of-speech tags, chunk tags, and morphological information (root, lexical category, gender, number, person, case, vibhakti marker). We also want to advance research on incorporating linguistic properties into neural architectures for downstream tasks like dependency parsing.

All participants are required to create a neural multilingual model that can parse a sentence in any of the languages.

Data

Participants will be provided with a corpus of dependency-annotated data for Hindi, Kannada, Bengali, Telugu, Malayalam, Marathi, and Urdu. The dependency annotations are based on the Paninian dependency framework. The annotation guidelines for the dependency labels will be released along with the data. The data will be released in two formats: SSF (Shakti Standard Format) and CoNLL.
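As an illustration of working with the CoNLL release, the sketch below reads sentences from tab-separated CoNLL text. It is not the organizers' loader. It assumes a CoNLL-X style layout in which each token line has at least 8 columns, with HEAD in column 7 and DEPREL in column 8, and in which sentences are separated by blank lines; the released files may differ. The function name read_conll and the sample tokens and labels are made up for the example.

```python
def read_conll(text):
    """Parse CoNLL-style text into a list of sentences.

    Assumption: tab-separated columns with HEAD at index 6 and
    DEPREL at index 7 (CoNLL-X style); blank lines end a sentence.
    """
    sentences = []
    current = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the sentence collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Skip comment lines, if the files contain any.
            continue
        cols = line.split("\t")
        if len(cols) < 8:
            raise ValueError(f"expected at least 8 columns, got {len(cols)}")
        current.append({
            "id": cols[0],
            "form": cols[1],
            "head": int(cols[6]),   # index of the parent token, 0 for the root
            "deprel": cols[7],      # dependency label
        })
    if current:
        sentences.append(current)
    return sentences


# Made-up two-token sentence, for illustration only.
sample = (
    "1\tRam\t_\tNNP\t_\t_\t2\tk1\t_\t_\n"
    "2\tsleeps\t_\tVM\t_\t_\t0\troot\t_\t_\n"
)
sentences = read_conll(sample)
```

Each sentence comes back as a list of token dictionaries, so the head and label of a token can be looked up as sentences[0][0]["head"] and sentences[0][0]["deprel"].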

Evaluation

The dependency parsers will be evaluated using the labeled attachment score (LAS). The evaluation will compare each model's predictions against blind test data.
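For reference, LAS is commonly defined as the fraction of tokens for which both the predicted head and the predicted dependency label match the gold annotation. The sketch below computes it under that standard definition. It is not the official scorer, which may differ in details such as the treatment of punctuation. The function name and the example heads and labels are illustrative assumptions.

```python
def labeled_attachment_score(gold, pred):
    """Compute LAS over a corpus.

    gold and pred are lists of sentences; each sentence is a list of
    (head, label) pairs, one per token, in the same token order.
    """
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        for (g_head, g_label), (p_head, p_label) in zip(gold_sent, pred_sent):
            total += 1
            # A token counts only if head AND label are both correct.
            if g_head == p_head and g_label == p_label:
                correct += 1
    return correct / total if total else 0.0


# Illustrative data: the third token has the right head but the wrong label.
gold = [[(2, "k1"), (0, "root"), (2, "k2")]]
pred = [[(2, "k1"), (0, "root"), (2, "k1")]]
las = labeled_attachment_score(gold, pred)  # 2 of 3 tokens fully correct
```

In this example the third token would count toward an unlabeled attachment score but not toward LAS, since its label is wrong.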

Registration

To register for the shared task, please fill out this form.

Timelines

  • Task Announcement: November 15, 2023
  • Training and Dev Data Release: November 20, 2023
  • Test Data Release: December 01, 2023
  • Submission Deadline: December 03, 2023
  • Evaluation Results Announcement: December 04, 2023
  • Report (4 pages) Submission Deadline: December 06, 2023

Prizes

Prizes will be awarded to the top-performing participants or teams.

  • 1st Prize: 20K INR
  • 2nd Prize: 15K INR
  • 3rd Prize: 10K INR

Contact Information

  • For any queries, please write to:
    Shared Task Admin Office: ltrc.office1@iiit.ac.in

Program Committee

  • Dipti Misra Sharma (IIITH)
  • Parameswari Krishnamurthy (IIITH)
  • Arafat Ahsan (IIITH)
  • Vandan Mujadia (IIITH)
  • Akshit Kumar (IIITH)
  • Pruthwik Mishra (IIITH)


© LTRC, IIIT-Hyderabad