WAT2022 English-Bengali Multi-Modal Translation Task

After three successive editions of the English-Hindi Multimodal Translation Task (WAT 2019, WAT 2020 and WAT 2021), the Workshop on Asian Translation 2022 (WAT2022) continues the multimodal translation task with a new language, Bengali. The task relies on our “Bengali Visual Genome”, a multimodal dataset of text and images suitable for English-Bengali multimodal machine translation and multimodal research.

Timeline

  • July 11, 2022: Translations need to be submitted to the organizers
  • Aug 1, 2022: System description paper submission deadline
  • Aug 9, 2022: Review feedback for system description
  • Sept 5, 2022: Camera-ready
  • Oct 10-17, 2022: WAT2022 takes place

Task Description

The setup of the WAT2022 task is as follows:

  • Inputs:
    • An image
    • A rectangular region in that image
    • A short English caption of the rectangular region
  • Output:
    • The caption translated to Bengali
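
The input/output setup above can be pictured as one record per region. The following Python sketch is illustrative only: the class name `RegionCaption`, the helper `parse_line`, and the tab-separated column order (image id, x, y, width, height, English caption, optional Bengali caption) are assumptions made for this example, not a format specification; please check the dataset's own documentation for the actual layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RegionCaption:
    """One task example: an image, a rectangular region, and its captions.

    Hypothetical container; field names are chosen for illustration.
    """
    image_id: str
    x: int        # left edge of the region, in pixels
    y: int        # top edge of the region, in pixels
    width: int
    height: int
    english: str
    bengali: str = ""  # empty when only the source side is available

    def crop_box(self) -> Tuple[int, int, int, int]:
        """Return (left, upper, right, lower), the box convention
        accepted by e.g. Pillow's Image.crop."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


def parse_line(line: str) -> RegionCaption:
    """Parse one tab-separated line. The column order is an ASSUMPTION."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(
            f"expected at least 6 tab-separated fields, got {len(fields)}"
        )
    image_id, x, y, w, h, english = fields[:6]
    bengali = fields[6] if len(fields) > 6 else ""
    return RegionCaption(image_id, int(x), int(y), int(w), int(h), english, bengali)
```

A text-only system would read just the `english` field, while a multi-modal system could additionally crop the image with `crop_box()` and feed the region to its visual encoder.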

Types of Submissions Expected

The following types of submissions are expected:

  • Text-only translation
  • Bengali-only image captioning
  • Multi-modal translation (uses both the image and the text)

Training Data

The Bengali Visual Genome consists of:

  • 29k training examples
  • 1k dev set
  • 1.6k evaluation set

Evaluation

The WAT2022 Multi-Modal Task will be evaluated on:

  • 1.6k evaluation set of Bengali Visual Genome
  • 1.4k challenge set of Bengali Visual Genome

Means of evaluation:

  • Automatic metrics: BLEU, chrF3, and others
  • Manual evaluation, subject to the availability of Bengali speakers
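
To give an intuition for what a character n-gram F-score with β = 3 measures, here is a simplified, sentence-level sketch in Python. It is not the official scorer: the function names, the choice to drop whitespace, the maximum n-gram order of 6, and the averaging of precision and recall over n-gram orders are assumptions of this example, and established tools may differ in these details. Official scores are those computed by the organizers.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 3.0) -> float:
    """Simplified chrF: average n-gram precision and recall over orders
    1..max_n, then combine them into an F-beta score (beta=3 weights
    recall more heavily than precision)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because the score works on characters rather than words, it gives partial credit for near-matching word forms, which is one reason character-level metrics are often reported alongside BLEU for morphologically rich languages.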

Participants must indicate which tracks their translations belong to:

  • Text-only / Image-only / Multi-modal
    • see above
  • Domain-Aware / Domain-Unaware
    • Whether the full (English) Visual Genome was used in the training or not.
  • Constrained / Non-Constrained
    • Constrained submissions may use only the following data:
      • 29k training segments from the Bengali Visual Genome
      • HindEnCorp 0.5
      • (English-only) Visual Genome [making the submission a domain-aware run]
    • Non-constrained submissions may use other data, but must specify what data was used.


Download Link

http://hdl.handle.net/11234/1-3722

Submission Requirement

The system description should be a short report (4 to 6 pages) submitted to WAT 2022 describing the method(s).

Each participating team can submit at most 2 systems for each of the tasks (e.g. text-only translation, Bengali-only image captioning, multimodal translation using text and image). Please submit through the submission link available on the WAT2022 website and select the task for submission.

Paper and References

Please refer to the following papers:

[Paper]

Bengali Visual Genome: A Multimodal Dataset for Machine Translation and Image Captioning

[Reference Papers]

Multimodal Neural Machine Translation System for English to Bengali

Organizers

  • Shantipriya Parida (Silo AI, Finland)
  • Ondřej Bojar (Charles University, Czech Republic)

Contact

email: wat-multimodal-task@ufal.mff.cuni.cz

License

The data is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.

Acknowledgement

This shared task is supported by the following projects/grants from Charles University (Czech Republic):

  • Grantová agentura České republiky, Project code: 19-26934X, Project name: Neural Representations in Multi-modal and Multi-lingual Modelling