Automatic extraction of domain-specific terms, based on the relevant domain concepts, from a given document is a highly challenging task. Domain term extraction can be defined as the automated process of identifying terminology in domain-specific texts. Domain terms are textual spans that represent or describe the domain. Participants are asked to develop one or more unsupervised systems that automatically identify the technical terms in a given English text (a document). Such a document provides information about a specific technical domain such as Computer Science, Physics, Life Science or Law.
Domain-dependent machine translation needs to pay special attention to the domain configuration, especially to the domain terms. Using domain terms as a feature in machine translation (MT) systems has been shown to benefit overall translation adequacy.
For this task, participants will be provided with untagged domain-specific corpora and can use this domain-specific monolingual data to develop their unsupervised domain term extraction systems. Participants may use only the provided monolingual domain corpora for the task, but they can use available tools (e.g. a POS tagger or a morphological analyser) to improve their systems.
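As an illustration of the unsupervised setting described above, the following is a minimal sketch of a frequency-based term extractor. It is not the organisers' baseline or a recommended method: the function names, the tiny stopword list, and the scoring rule (frequency multiplied by term length in words) are all assumptions chosen to keep the example short.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to", "and", "or",
             "is", "are", "be", "by", "with", "that", "this", "it", "as", "can"}


def candidate_terms(text, max_len=3):
    """Yield word n-grams (1..max_len) that neither start nor end with a stopword."""
    # Split on sentence-like boundaries so n-grams never cross them.
    for sentence in re.split(r"[.!?;:\n]+", text.lower()):
        tokens = re.findall(r"[a-z][a-z\-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                yield " ".join(gram)


def extract_terms(text, top_k=10, min_freq=2):
    """Rank candidates by frequency * length in words (a hypothetical scoring rule)."""
    counts = Counter(candidate_terms(text))
    scored = {t: c * len(t.split()) for t, c in counts.items() if c >= min_freq}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda t: (-scored[t], t))[:top_k]


if __name__ == "__main__":
    sample = ("A binary search tree is a tree. "
              "The binary search tree supports search. "
              "A hash table is a table.")
    print(extract_terms(sample, top_k=3))
```

A POS tagger, which the task permits, could replace the stopword filter by keeping only candidates that match noun-phrase patterns.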
The Task is defined as follows:
The subtask will be scored using standard evaluation metrics, including accuracy, precision, recall and F1-score. Submissions will be ranked by F1-score.
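To make the ranking metric concrete, here is a small sketch of how precision, recall and F1-score could be computed for a list of extracted terms. The matching scheme is an assumption: terms are compared by exact match after lowercasing and trimming, and the official scorer may match differently (for example at the span level). The function name `score_terms` is hypothetical.

```python
def score_terms(predicted, gold):
    """Return (precision, recall, f1) for predicted terms against gold terms.

    Assumes exact matching on lowercased, whitespace-trimmed term strings.
    """
    pred = {t.strip().lower() for t in predicted}
    ref = {t.strip().lower() for t in gold}
    true_positives = len(pred & ref)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    predicted = ["binary search tree", "hash table", "search"]
    gold = ["Binary Search Tree", "hash table", "linked list", "heap"]
    p, r, f1 = score_terms(predicted, gold)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Accuracy is left out of the sketch because it requires a definition of true negatives, which depends on how the official evaluation treats non-term spans.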
To register for participation in the shared tasks, please fill in this form.
Please consult the Shared Task website for the official dates of the Shared Tasks. All submission deadlines are 11:59 PM IST (UTC+5:30).
|Shared Task Announcement
|Oct 07, 2020
|Oct 07, 2020
|Oct 14, 2020
|Deadline for Registration
|Test Set Release (Blind)
|System Runs Due
|Preliminary System Reports Due in SoftConf
|Notification for Acceptance
|Dec 03, 2020
|Camera Ready Due
|Dec 05, 2020
|Participant Presentations at ICON 2020
For further information about this task and dataset, please contact: