Test Data Released

Adap-MT 2020: Low Resource Domain Adaptation for Indic Machine Translation



This task focuses on two things: the capability of general-domain machine translation when translating into Indic languages (English-Hindi, English-Telugu, and Hindi-Telugu), and low-resource domain adaptation of MT systems for the AI and Chemistry domains, using existing parallel corpora together with a small amount of in-domain parallel data.

Goals of the shared task:

  • To investigate the applicability of current MT techniques when translating into Indic languages
  • To test translation performance for low-resource, morphologically rich languages
  • To investigate low-resource domain adaptation when translating into Indic languages
  • To create publicly available corpora for machine translation and machine translation evaluation



Task Details

Adap-MT 2020

The subtasks are defined as follows:

  • SubTask 1: To show sentence-level machine translation capability on the general domain.
  • SubTask 2: To show sentence-level machine translation capability on the specified domains (AI, Chemistry), utilizing general-domain parallel data and very limited domain-specific data for domain adaptation.
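
As a rough illustration of how general-domain and in-domain parallel data might be combined for SubTask 2, the sketch below oversamples the small in-domain set before shuffling it into the general-domain corpus. This is only one common baseline and not a method prescribed by the organizers; the function name `mix_corpora`, the `oversample` factor, and the toy sentence pairs are all hypothetical.

```python
import random


def mix_corpora(general, in_domain, oversample=5, seed=0):
    """Combine general-domain and in-domain sentence pairs.

    Each argument is a list of (source, target) tuples. The in-domain
    pairs are repeated `oversample` times (an assumed, tunable factor)
    so they are not drowned out by the much larger general corpus.
    """
    mixed = list(general) + list(in_domain) * oversample
    # A fixed seed keeps the shuffled order reproducible across runs.
    random.Random(seed).shuffle(mixed)
    return mixed


# Toy data, made up for illustration only.
general_pairs = [("hello", "namaste"), ("thank you", "dhanyavaad")]
chemistry_pairs = [("acid", "aml")]

training_pairs = mix_corpora(general_pairs, chemistry_pairs, oversample=3)
```

The oversampling factor would normally be tuned on held-out in-domain data; fine-tuning a general-domain model on the in-domain set is another option the task description leaves open.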

Data

Use relevant parallel corpora from the following sources for the task's language pairs in the general domain:
  • CVIT-PIB
  • CVIT-MKB
  • IITB
  • Mechanical Turk
  • JW
  • ALT
  • bible-uedin
  • globalvoices
  • gnome
  • kde4
  • opensubtitles
  • tanzil
  • tatoeba
  • ubuntu
  • wikimedia

Domain-specific data (for AI and Chemistry) can be downloaded from here:

Download
Test Data Download
- The password for the data download will be shared after registration.
- By clicking download, you agree to the data license and the shared task rules.


Monolingual corpora from Indic NLP link
Before using any corpora other than those listed above, kindly ask the organizers.

Evaluation

Automatic evaluation metrics: BLEU, RIBES
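
To make the BLEU metric concrete, here is a minimal sketch of corpus-level BLEU with a single reference per sentence, uniform weights over 1- to 4-grams, and the standard brevity penalty. It assumes whitespace tokenization and no smoothing, so its scores will not necessarily match the organizers' scoring script, whose tokenization and settings are not specified here. The function names are illustrative.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tok, n)
            ref_counts = ngram_counts(ref_tok, n)
            # Clipped matches: each n-gram counts at most as often
            # as it appears in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives BLEU 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


score = corpus_bleu(["the cat sat on the mat"], ["the cat sat on a mat"])
```

RIBES, the second metric, rewards correct word order through rank correlation and is not covered by this sketch.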



Registration

To register for participation in the shared tasks, please fill out this form.


Important Dates

Please consult the Shared Task website for official dates for the Shared Tasks. All submission deadlines are 11:59 PM IST (UTC+5:30).

Event                                        Date            Revised Date
Shared Task Announcement                     Oct 07, 2020
Registration Open                            Oct 07, 2020
Data Released                                Oct 17, 2020
Deadline for Registration                    Oct 30, 2020    Nov 08, 2020
Test Set Release (Blind)                     Nov 02, 2020    Nov 10, 2020
System Runs Due                              Nov 10, 2020    Nov 18, 2020
Preliminary System Reports Due in SoftConf   Nov 20, 2020    Nov 28, 2020
Notification for Acceptance                  Dec 03, 2020
Camera Ready Due                             Dec 05, 2020
Participant Presentations at ICON 2020       TBD
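
For participants outside India, the sketch below converts an 11:59 PM IST (UTC+5:30) deadline to UTC using only the Python standard library. It uses the later of the two System Runs Due dates listed above (Nov 18, 2020) purely as an example; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed offset of +5:30 from UTC (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

# Example deadline: 11:59 PM IST on Nov 18, 2020.
deadline_ist = datetime(2020, 11, 18, 23, 59, tzinfo=IST)
deadline_utc = deadline_ist.astimezone(timezone.utc)

print(deadline_utc.isoformat())
```

The same `astimezone` call accepts any other `timezone` object if a local time is more useful than UTC.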


Contact

For further information about this task and dataset, please contact:

  • adapmt2020, adapmt2020@googlegroups.com



Organizing Committee

  • Dipti Misra Sharma (IIIT-Hyderabad)
  • Asif Ekbal (IIT-Patna)
  • Karunesh Arora (C-DAC, Noida)
  • Sudip Kumar Naskar (Jadavpur University)
  • Dipankar Ganguly (C-DAC, Noida)
  • Sobha L (AUKBC-Chennai)
  • Radhika Mamidi (IIIT-Hyderabad)
  • Sunita Arora (C-DAC, Noida)
  • Pruthwik Mishra (IIIT-Hyderabad)
  • Vandan Mujadia (IIIT-Hyderabad)



Follow us: https://twitter.com/adapmt2020

© 2020 LTRC, IIIT-Hyderabad
