EMNLP 2024

Faculty and students presented the following papers, published in the Proceedings of the Ninth Conference on Machine Translation (Association for Computational Linguistics), at EMNLP 2024, held in Miami, Florida, from 12 to 16 November:

chrF-S: Semantics Is All You Need – Ananya Mukherjee and Manish Shrivastava

Here is the summary of the research work:

Machine translation (MT) evaluation metrics like BLEU and chrF++ are widely used reference-based metrics that do not require training and are language-independent. However, these metrics primarily focus on n-gram matching and often overlook semantic depth and contextual understanding. To address this gap, we introduce chrF-S (Semantic chrF++), an enhanced metric that integrates sentence embeddings to evaluate translation quality more comprehensively. By combining traditional character and word n-gram analysis with semantic information derived from embeddings, chrF-S captures both syntactic accuracy and sentence-level semantics. This paper presents our contributions to the WMT24 shared metrics task, showcasing our participation and the development of chrF-S. We also demonstrate that, according to preliminary results on the leaderboard, our metric performs on par with other supervised and LLM-based metrics. By merging semantic insights with n-gram precision, chrF-S offers a significant enhancement in the assessment of machine-generated translations, advancing the field of MT evaluation.

The metric outperformed several baselines, including chrF, BERTScore, BLEURT-20, XCOMET, and BLEU, on the AFRIMTE Challenge Set, which covers 13 language pairs focused primarily on African languages (https://lnkd.in/gsHveeeV).
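The idea of pairing character n-gram overlap with sentence-level semantics, as described in the summary above, can be illustrated with a minimal sketch. This is not the authors' implementation: the simplified character F-score, the cosine similarity over caller-supplied embedding vectors, the weighted average, and the names `simple_chrf`, `combined_score`, and `alpha` are all assumptions made for illustration. The summary does not state how chrF-S actually combines the two signals.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of order n, ignoring whitespace."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified character n-gram F-beta score in [0, 1] (not the full chrF++)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is too short for this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_score(hypothesis, reference, hyp_emb, ref_emb, alpha: float = 0.5) -> float:
    """Hypothetical combination: weighted average of lexical and semantic scores.

    hyp_emb and ref_emb are sentence embeddings from any encoder (assumed to be
    computed elsewhere); alpha is an illustrative weight, not a published value.
    """
    lexical = simple_chrf(hypothesis, reference)
    semantic = max(0.0, cosine(hyp_emb, ref_emb))  # clamp negatives to 0
    return alpha * lexical + (1 - alpha) * semantic
```

In this sketch, a hypothesis that shares few characters with the reference can still score above zero when its embedding is close to the reference embedding, which is the kind of case a purely n-gram-based metric would miss.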

CoST of breaking the LLMs – Ananya Mukherjee, Saumitra Yadav, and Manish Shrivastava

Here is the summary of the research work:

This paper presents an evaluation of 16 machine translation systems submitted to the Shared Task of the 9th Conference on Machine Translation (WMT24) for the English-Hindi (en-hi) language pair using our Complex Structures Test (CoST) suite. Aligning with this year’s test suite sub-task theme, “Help us break LLMs”, we curated a comprehensive test suite encompassing diverse datasets across various categories, including autobiography, poetry, legal, conversation, play, narration, technical, and mixed genres. Our evaluation reveals that all the systems struggle significantly with text in an archaic style, such as legal and technical writing, and with text that has a creative twist, such as the conversation and poetry datasets, highlighting their weaknesses in handling the complex linguistic structures and stylistic nuances inherent in these text types. Our evaluation identifies the strengths and limitations of the submitted models, pointing to specific areas where further research and development are needed to enhance their performance.

A3-108 Controlling Token Generation in Low Resource Machine Translation Systems – Saumitra Yadav, Ananya Mukherjee, and Manish Shrivastava

Here is the summary of the research work:

Translating for languages with limited resources poses a persistent challenge due to the scarcity of high-quality training data. To enhance translation accuracy, we explored controlled generation mechanisms, focusing on the importance of control tokens. During training, we encoded the target sentence length as a control token added to the source sentence, treating it as an additional feature of the source sentence. We developed various NMT models using the transformer architecture and conducted experiments across 8 language directions (English ↔ Assamese, Manipuri, Khasi, and Mizo), exploring four variations of length encoding mechanisms. Based on a comparative analysis against the baseline model, we submitted two systems for each language direction.
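The length control token described above can be sketched as a small preprocessing step on the training data. This is an illustration under stated assumptions, not the submitted systems' code: the `<len_N>` token format, prepending the token to the source, whitespace tokenization, and the optional bucketing are all hypothetical, since the summary does not specify the four encoding variations that were tried.

```python
def length_token(target: str, bucket_size: int = 1) -> str:
    """Build a hypothetical control token from the target's whitespace-token count."""
    n_tokens = len(target.split())
    if bucket_size > 1:
        # Round up to the nearest bucket boundary so similar lengths share a token.
        n_tokens = -(-n_tokens // bucket_size) * bucket_size
    return f"<len_{n_tokens}>"


def add_length_control(source: str, target: str, bucket_size: int = 1) -> str:
    """Prepend the length token to the source sentence (assumed placement)."""
    return f"{length_token(target, bucket_size)} {source}"


# Placeholder target tokens stand in for a real translation.
print(add_length_control("How are you ?", "t1 t2 t3 t4 t5"))
# prints "<len_5> How are you ?"
print(add_length_control("How are you ?", "t1 t2 t3 t4 t5", bucket_size=4))
# prints "<len_8> How are you ?"
```

At inference time the reference length is not available, so a token of this kind would have to be estimated or chosen in some other way; the summary does not say how this was handled.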

November 2024