Shared task & Workshop
Machine Translation System
Indian Languages

                                Proceedings will be submitted to for online publication.

TASK DESCRIPTION
About MT:
Machine translation is the process of converting text from one language to another. It has been explored extensively for English and other European languages, but it is still at an early stage for Asian languages, especially Indian languages. This shared task focuses on four Indian language pairs:
1. English - Tamil
2. English - Malayalam
3. English - Hindi
4. English - Punjabi
Goals:
In recent years, multilingual content on the web has grown exponentially along with the development of the internet. Native-language users are largely excluded from this content because of the language barrier, and automatic machine translation is the only practical way to make it available to them. This shared task therefore has the following objectives:
* To scrutinize state-of-the-art machine translation methods when translating from English to Indian languages.
* To explore the challenges of translating between languages that diverge in syntactic structure and morphology.
* To create a high-quality open-source parallel corpus for machine translation of Indian languages, which is currently lacking.
* To explore the potential of Indian languages further.
We expect both beginners and established research groups to participate in this task.
Task Descriptions:
* Parallel corpora for two domains, General and Agriculture, will be provided for the English - Tamil and English - Hindi language pairs. For the remaining pairs, a parallel corpus is provided for the General domain only.
* The parallel corpus is split into training and development data.
* Participants are requested to develop their MT system under the constrained setting (that is, using the provided training and development data).
* However, participants are allowed to use their own language model.
* Participants using their own language model should provide the statistics and source of the data used to build it.
* We also strongly encourage participants to develop any state-of-the-art MT system (own language model, translation model, and decoder) as they wish.
* Participants are allowed to use other open-source linguistic tools, such as a POS tagger or a morphological analyser/generator, when developing their MT system.
* Participants using external linguistic tools must flag their system and state which tools were used.
* The test data will be released later.
* Translation quality will be measured by manual evaluation and by an automatic evaluation metric.
* Participants are requested to contribute to the manual evaluation.
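For participants who want to check the alignment of their data or hold out an additional internal validation set, the following is a minimal sketch of how a line-aligned parallel corpus can be paired and split. It is only an illustration: the organizers supply the official training/development split, and the data format assumed here (one sentence per line, source and target in the same order) as well as the names `pair_sentences`, `split_train_dev`, `dev_fraction`, and `seed` are assumptions, not part of the task specification.

```python
import random
from typing import List, Sequence, Tuple

SentencePair = Tuple[str, str]


def pair_sentences(source_lines: Sequence[str],
                   target_lines: Sequence[str]) -> List[SentencePair]:
    """Align two line-parallel texts into (source, target) pairs.

    Assumes line i of the source corresponds to line i of the target.
    """
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = []
    for src, tgt in zip(source_lines, target_lines):
        src, tgt = src.strip(), tgt.strip()
        # Skip pairs where either side is empty; they cannot be used for training.
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


def split_train_dev(pairs: Sequence[SentencePair],
                    dev_fraction: float = 0.1,
                    seed: int = 13) -> Tuple[List[SentencePair], List[SentencePair]]:
    """Shuffle the pairs reproducibly and hold out a fraction as development data."""
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be between 0 and 1 (exclusive)")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    dev_size = max(1, int(len(shuffled) * dev_fraction)) if shuffled else 0
    return shuffled[dev_size:], shuffled[:dev_size]


if __name__ == "__main__":
    # Placeholder sentences, not real corpus data.
    english = ["first sentence\n", "second sentence\n", "third sentence\n"]
    target = ["target one\n", "target two\n", "target three\n"]
    train, dev = split_train_dev(pair_sentences(english, target), dev_fraction=0.34)
    print(len(train), "training pairs,", len(dev), "development pairs")
```

Shuffling whole pairs, rather than each side separately, keeps every source sentence attached to its translation, and the fixed seed makes the split reproducible across runs.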



Important Dates

Training Data Release: 20th April, 2017
Test Data Release: 20th May, 2017
Run Submission Deadline: 25th May, 2017
Results Declared: 5th June, 2017
Working Notes Due: 10th July, 2017

7th and 8th September, 2017


Web site launched on 16-03-2017

Anand Kumar M, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Prof. Rajendran S, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Soman K P, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Advisory committee

Prof. Ramanan, RelAgent Pvt Ltd, Chennai
Prof. N. Deiva Sundaram, NDS Lingsoft Solutions Pvt. Ltd., Chennai
Dr. V. Dhanalakshmi, Assistant Director, Tamil Virtual Academy, Chennai
Dr. Govind D, CEN, Amrita Vishwa Vidyapeetham
Mr. Vijay Krishnan Menon, CEN, Amrita Vishwa Vidyapeetham
Mr. Barathi Ganesh, Data Science practitioner, TCS Cochin

Student Coordinators
Premjith B, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Kavirajan B, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Sanjana Shree P, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Athira G, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Vaithehi S, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India
Shivkaran Singh, CEN, Amrita Vishwa Vidyapeetham, Coimbatore, India


Have any queries? Give us a call or send us an email and we will get back to you as soon as possible.


Phone: 0 422 2685000 (5594)