Paper Digest: Recent Papers on Machine Translation
The Paper Digest Team extracted all recent machine translation papers on our radar and generated highlight sentences for them. The results are sorted by relevance and date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has broader coverage and is continuously updated with the most recent work on this topic.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website and are behind our services, including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated on new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: Recent Papers on Machine Translation
# | Paper | Author(s) | Source | Date |
---|---|---|---|---|
1 | Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation Highlight: To understand when and why the navigation capabilities of language IDs are weakened, we compare two extreme decoder input cases in the ZST directions: Off-Target (OFF) and On-Target (ON) cases. |
Changtong Zan et al. | arxiv-cs.CL | 2023-09-28 |
2 | A Benchmark for Learning to Translate A New Language from One Grammar Book Highlight: We turn to a field that is explicitly motivated and bottlenecked by a scarcity of web data: low-resource languages. In this paper, we introduce MTOB (Machine Translation from One Book), a benchmark for learning to translate between English and Kalamang — a language with less than 200 speakers and therefore virtually no presence on the web — using several hundred pages of field linguistics reference materials. |
Garrett Tanzer; Mirac Suzgun; Eline Visser; Dan Jurafsky; Luke Melas-Kyriazi; | arxiv-cs.CL | 2023-09-28 |
3 | Cross-Modal Multi-Tasking for Speech-to-Text Translation Via Hard Parameter Sharing Highlight: In this work, we instead propose a ST/MT multi-tasking framework with hard parameter sharing in which all model parameters are shared cross-modally. |
Brian Yan; Xuankai Chang; Antonios Anastasopoulos; Yuya Fujita; Shinji Watanabe; | arxiv-cs.CL | 2023-09-27 |
4 | MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition Highlight: Nonetheless, visual speech is not as distinguishable as audio speech, making it difficult to develop a mapping from source speech phonemes to the target language text. To address this issue, we propose MixSpeech, a cross-modality self-learning framework that utilizes audio speech to regularize the training of visual speech tasks. |
Xize Cheng et al. | iccv | 2023-09-27 |
5 | CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation Highlight: However, these are not directly applicable to MMT since they do not provide aligned multimodal multilingual features for generative tasks. To alleviate this issue, instead of designing complex modules for MMT, we propose CLIPTrans, which simply adapts the independently pre-trained multimodal M-CLIP and the multilingual mBART. |
Devaansh Gupta et al. | iccv | 2023-09-27 |
6 | Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023 Highlight: This paper describes the FBK’s participation in the Simultaneous Translation and Automatic Subtitling tracks of the IWSLT 2023 Evaluation Campaign. |
Sara Papi; Marco Gaido; Matteo Negri; | arxiv-cs.CL | 2023-09-27 |
7 | Segmentation-Free Streaming Machine Translation Highlight: This paper proposes a Segmentation-Free framework that enables the model to translate an unsegmented source stream by delaying the segmentation decision until the translation has been generated. |
Javier Iranzo-Sánchez; Jorge Iranzo-Sánchez; Adrià Giménez; Jorge Civera; Alfons Juan; | arxiv-cs.CL | 2023-09-26 |
8 | Hindi to English: Transformer-Based Neural Machine Translation Highlight: In this paper, we have developed a Neural Machine Translation (NMT) system by training the Transformer model to translate texts from Indian Language Hindi to English. |
Kavit Gangar; Hardik Ruparel; Shreyas Lele; | arxiv-cs.CL | 2023-09-22 |
9 | NJUNLP’s Participation for The WMT2023 Quality Estimation Shared Task Highlight: We introduce the submissions of the NJUNLP team to the WMT 2023 Quality Estimation (QE) shared task. |
Xiang Geng et al. | arxiv-cs.CL | 2023-09-22 |
10 | Audience-specific Explanations for Machine Translation Highlight: In this work we explore techniques to extract example explanations from a parallel corpus. |
Renhan Lou; Jan Niehues; | arxiv-cs.CL | 2023-09-22 |
11 | Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts Highlight: To this end, we developed carefully a parallel corpus for Arabic-English (AR-EN) translation in the financial domain for benchmarking different domain adaptation methods. |
Emad A. Alghamdi; Jezia Zakraoui; Fares A. Abanmy; | arxiv-cs.CL | 2023-09-22 |
12 | OSN-MDAD: Machine Translation Dataset for Arabic Multi-Dialectal Conversations on Online Social Media Highlight: While few attempts have been made to build translation datasets for dialectal Arabic, they are domain dependent and are not OSN cultural-language friendly. In this work, we attempt to alleviate these limitations by proposing an online social network-based multidialect Arabic dataset that is crafted by contextually translating English tweets into four Arabic dialects: Gulf, Yemeni, Iraqi, and Levantine. |
Fatimah Alzamzami; Abdulmotaleb El Saddik; | arxiv-cs.CL | 2023-09-21 |
13 | SpeechAlign: A Framework for Speech Translation Alignment Evaluation Highlight: Speech-to-Speech and Speech-to-Text translation are currently dynamic areas of research. To contribute to these fields, we present SpeechAlign, a framework to evaluate the underexplored field of source-target alignment in speech models. |
Belen Alastruey; Aleix Sant; Gerard I. Gállego; David Dale; Marta R. Costa-jussà; | arxiv-cs.CL | 2023-09-20 |
14 | Towards Effective Disambiguation for Machine Translation with Large Language Models Highlight: In this paper, we study the capabilities of LLMs to translate ambiguous sentences containing polysemous words and rare word senses. |
Vivek Iyer; Pinzhen Chen; Alexandra Birch; | arxiv-cs.CL | 2023-09-20 |
15 | A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Highlight: In this study, we propose a novel fine-tuning approach for LLMs that is specifically designed for the translation task, eliminating the need for the abundant parallel data that traditional translation models usually depend on. |
Haoran Xu; Young Jin Kim; Amr Sharaf; Hany Hassan Awadalla; | arxiv-cs.CL | 2023-09-20 |
16 | SignBank+: Multilingual Sign Language Translation Dataset Highlight: We introduce SignBank+, a clean version of the SignBank dataset, optimized for machine translation. |
Amit Moryossef; Zifan Jiang; | arxiv-cs.CL | 2023-09-20 |
17 | NSOAMT — New Search Only Approach to Machine Translation Highlight: The idea is to develop a solution that, by indexing an incremental set of words that combine a certain semantic meaning, makes it possible to create a process of correspondence between their native language record and the language of translation. |
João Luís; Diogo Cardoso; José Marques; Luís Campos; | arxiv-cs.CL | 2023-09-19 |
18 | A Benchmark for Text Expansion: Datasets, Metrics, and Baselines Highlight: This work presents a new task of Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations of the plain text to concretize or vivify human writings. |
Yi Chen et al. | arxiv-cs.CL | 2023-09-17 |
19 | Neural Machine Translation Models Can Learn to Be Few-shot Learners Highlight: In this work, we show that a much smaller model can be trained to perform ICL by fine-tuning towards a specialized training objective, exemplified on the task of domain adaptation for neural machine translation. |
Raphael Reinauer; Patrick Simianer; Kaden Uhlig; Johannes E. M. Mosig; Joern Wuebker; | arxiv-cs.CL | 2023-09-15 |
20 | Simultaneous Machine Translation with Large Language Models Highlight: In this paper, we explore the feasibility of utilizing LLMs for SimulMT. |
Minghan Wang et al. | arxiv-cs.CL | 2023-09-13 |
21 | Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding Highlight: In this paper, we introduce methods to mitigate both failure cases with a modified decoding objective, without requiring retraining or external models. |
Rico Sennrich; Jannis Vamvas; Alireza Mohammadshahi; | arxiv-cs.CL | 2023-09-13 |
22 | Glancing Future for Simultaneous Machine Translation Highlight: In this paper, we propose a novel method that glances future in curriculum learning to achieve the transition from the seq2seq training to prefix2prefix training. |
Shoutao Guo; Shaolei Zhang; Yang Feng; | arxiv-cs.CL | 2023-09-12 |
23 | Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval Highlight: Improperly assuming the pseudo-parallel data are correctly correlated will make the networks overfit to the noisy correspondence. Therefore, we propose Dual-view Curricular Optimal Transport (DCOT) to learn with noisy correspondence in CCR. |
Yabing Wang et al. | arxiv-cs.CV | 2023-09-11 |
24 | The Effect of Alignment Objectives on Code-Switching Translation Highlight: In this paper, we are proposing a way of training a single machine translation model that is able to translate monolingual sentences from one language to another, along with translating code-switched sentences to either language. |
Mohamed Anwar; | arxiv-cs.CL | 2023-09-10 |
25 | Epi-Curriculum: Episodic Curriculum Learning for Low-Resource Domain Adaptation in Neural Machine Translation Highlight: In this paper, we present a novel approach Epi-Curriculum to address low-resource domain adaptation (DA), which contains a new episodic training framework along with denoised curriculum learning. |
Keyu Chen; Di Zhuang; Mingchen Li; J. Morris Chang; | arxiv-cs.LG | 2023-09-05 |
26 | Advancing Text-to-GLOSS Neural Translation Using A Novel Hyper-parameter Optimization Technique Highlight: In this paper, we investigate the use of transformers for Neural Machine Translation of text-to-GLOSS for Deaf and Hard-of-Hearing communication. |
Younes Ouargani; Noussaima El Khattabi; | arxiv-cs.CL | 2023-09-05 |
27 | Task-Based MoE for Multitask Multilingual Machine Translation Highlight: In this work, we instead design a novel method that incorporates task information into MoE models at different granular levels with shared dynamic task-based adapters. |
Hai Pham et al. | arxiv-cs.CL | 2023-08-30 |
28 | Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages Highlight: The study investigates the effectiveness of utilizing multimodal information in Neural Machine Translation (NMT). |
Baban Gain; Dibyanayan Bandyopadhyay; Samrat Mukherjee; Chandranath Adak; Asif Ekbal; | arxiv-cs.CL | 2023-08-30 |
29 | A Classification-Guided Approach for Adversarial Attacks Against Neural Machine Translation Highlight: In this paper, we introduce ACT, a novel adversarial attack framework against NMT systems guided by a classifier. |
Sahar Sadrizadeh; Ljiljana Dolamic; Pascal Frossard; | arxiv-cs.CL | 2023-08-29 |
30 | An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation Highlight: In this paper, we conduct empirical studies on intra-modal and cross-modal consistency and propose two training strategies, SimRegCR and SimZeroCR, for E2E ST in regular and zero-shot scenarios. |
Pengzhi Gao; Ruiqing Zhang; Zhongjun He; Hua Wu; Haifeng Wang; | arxiv-cs.CL | 2023-08-28 |
31 | Training and Meta-Evaluating Machine Translation Evaluation Metrics at The Paragraph Level Abstract: As research on machine translation moves to translating text beyond the sentence level, it remains unclear how effective automatic evaluation metrics are at scoring longer … |
Daniel Deutsch; Juraj Juraska; Mara Finkelstein; Markus Freitag; | arxiv-cs.CL | 2023-08-25 |
32 | Improving Translation Faithfulness of Large Language Models Via Augmenting Instructions Highlight: Large Language Models (LLMs) present strong general capabilities, and a current compelling challenge is stimulating their specialized capabilities, such as machine translation, through low-cost instruction tuning. |
Yijie Chen et al. | arxiv-cs.CL | 2023-08-24 |
33 | SONAR: Sentence-Level Multimodal and Language-Agnostic Representations Highlight: We introduce SONAR, a new multilingual and multimodal fixed-size sentence embedding space. |
Paul-Ambroise Duquenne; Holger Schwenk; Benoît Sagot; | arxiv-cs.CL | 2023-08-22 |
34 | SeamlessM4T-Massively Multilingual & Multimodal Machine Translation Highlight: More specifically, conventional speech-to-speech translation systems rely on cascaded systems that perform translation progressively, putting high-performing unified systems out of reach. To address these gaps, we introduce SeamlessM4T, a single model that supports speech-to-speech translation, speech-to-text translation, text-to-speech translation, text-to-text translation, and automatic speech recognition for up to 100 languages. |
Seamless Communication et al. | arxiv-cs.CL | 2023-08-22 |
35 | An Effective Method Using Phrase Mechanism in Neural Machine Translation Highlight: In this paper, we report an effective method using a phrase mechanism, PhraseTransformer, to improve the strong baseline model Transformer in constructing a Neural Machine Translation (NMT) system for parallel corpora Vietnamese-Chinese. |
Phuong Minh Nguyen; Le Minh Nguyen; | arxiv-cs.CL | 2023-08-21 |
36 | Is Context All You Need? Scaling Neural Sign Language Translation to Large Domains of Discourse Highlight: Taking direct inspiration from how humans translate, we propose a novel multi-modal transformer architecture that tackles the translation task in a context-aware manner, as a human would. |
Ozge Mercanoglu Sincan; Necati Cihan Camgoz; Richard Bowden; | arxiv-cs.CV | 2023-08-18 |
37 | Factuality Detection Using Machine Translation — A Use Case for German Clinical Text Highlight: In the context of factuality detection, this work presents a simple solution using machine translation to translate English data to German to train a transformer-based factuality detection model. |
Mohammed Bin Sumait; Aleksandra Gabryszak; Leonhard Hennig; Roland Roller; | arxiv-cs.CL | 2023-08-17 |
38 | Fast Training of NMT Model with Data Sorting Highlight: One potential area for improvement is to address the computation of empty tokens that the Transformer computes only to discard them later, leading to an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs based on their length before batching, minimizing the waste of computing power. |
Daniela N. Rim; Kimera Richard; Heeyoul Choi; | arxiv-cs.CL | 2023-08-16 |
39 | VBD-MT Chinese-Vietnamese Translation Systems for VLSP 2022 Highlight: We present our systems participated in the VLSP 2022 machine translation shared task. |
Hai Long Trieu; Song Kiet Bui; Tan Minh Tran; Van Khanh Tran; Hai An Nguyen; | arxiv-cs.CL | 2023-08-15 |
40 | Optimizing Transformer-based Machine Translation Model for Single GPU Training: A Hyperparameter Ablation Study Highlight: The insights from this study contribute to the ongoing efforts to make machine translation more accessible and cost-effective, emphasizing the importance of precise hyperparameter tuning over mere scaling. |
Luv Verma; Ketaki N. Kolhatkar; | arxiv-cs.CL | 2023-08-11 |
41 | Exploring Linguistic Similarity and Zero-Shot Learning for Multilingual Translation of Dravidian Languages Highlight: In this work, we overcome the shortcomings of zero-shot translation by taking advantage of transliteration and linguistic similarity. |
Danish Ebadulla; Rahul Raman; S. Natarajan; Hridhay Kiran Shetty; Ashish Harish Shenoy; | arxiv-cs.CL | 2023-08-10 |
42 | Extrapolating Large Language Models to Non-English By Aligning Languages Highlight: In this paper, we propose to empower pre-trained LLMs on non-English languages by building semantic alignment across languages. |
Wenhao Zhu et al. | arxiv-cs.CL | 2023-08-09 |
43 | Evaluating and Optimizing The Effectiveness of Neural Machine Translation in Supporting Code Retrieval Models: A Study on The CAT Benchmark Highlight: In this work, we analyze the performance of NMT in natural language-to-code translation in the newly curated CAT benchmark that includes the optimized versions of three Java datasets TLCodeSum, CodeSearchNet, Funcom, and a Python dataset PCSD. |
Hung Phan; Ali Jannesari; | arxiv-cs.SE | 2023-08-09 |
44 | Character-level NMT and Language Similarity Highlight: We evaluate the models using automatic MT metrics and show that translation between similar languages benefits from character-level input segmentation, while for less related languages, character-level vanilla Transformer-base often lags behind subword-level segmentation. |
Josef Jon; Ondřej Bojar; | arxiv-cs.CL | 2023-08-08 |
45 | Negative Lexical Constraints in Neural Machine Translation Highlight: We compared various methods based on modifying either the decoding process or the training data. |
Josef Jon; Dušan Variš; Michal Novák; João Paulo Aires; Ondřej Bojar; | arxiv-cs.CL | 2023-08-07 |
46 | Sinhala-English Parallel Word Dictionary Dataset Highlight: Thus, in this work, we introduce three parallel English-Sinhala word dictionaries (En-Si-dict-large, En-Si-dict-filtered, En-Si-dict-FastText) which help in multilingual Natural Language Processing (NLP) tasks related to English and Sinhala languages. |
Kasun Wickramasinghe; Nisansa de Silva; | arxiv-cs.CL | 2023-08-04 |
47 | Do Multilingual Language Models Think Better in English? Highlight: In this work, we introduce a new approach called self-translate, which overcomes the need of an external translation system by leveraging the few-shot translation capabilities of multilingual language models. |
Julen Etxaniz; Gorka Azkune; Aitor Soroa; Oier Lopez de Lacalle; Mikel Artetxe; | arxiv-cs.CL | 2023-08-02 |
48 | Optimizing Machine Translation Through Prompt Engineering: An Investigation Into ChatGPT’s Customizability Highlight: This paper explores the influence of integrating the purpose of the translation and the target audience into prompts on the quality of translations produced by ChatGPT. |
Masaru Yamada; | arxiv-cs.CL | 2023-08-02 |
49 | Predicting Perfect Quality Segments in MT Output with Fine-Tuned OpenAI LLM: Is It Possible to Capture Editing Distance Patterns from Historical Data? Highlight: English-Italiano bilingual Abstract is available in the paper. |
Serge Gladkoff; Gleb Erofeev; Lifeng Han; Goran Nenadic; | arxiv-cs.CL | 2023-07-31 |
50 | Toward Quantum Machine Translation of Syntactically Distinct Languages Highlight: The present study aims to explore the feasibility of language translation using quantum natural language processing algorithms on noisy intermediate-scale quantum (NISQ) devices. |
Mina Abbaszade; Mariam Zomorodi; Vahid Salari; Philip Kurian; | arxiv-cs.CL | 2023-07-31 |
51 | Structural Transfer Learning in NL-to-Bash Semantic Parsers Highlight: We propose a methodology for obtaining a quantitative understanding of structural overlap between machine translation tasks. |
Kyle Duffy; Satwik Bhattamishra; Phil Blunsom; | arxiv-cs.CL | 2023-07-31 |
52 | Multilingual Lexical Simplification Via Paraphrase Generation Highlight: In this paper, we propose a novel multilingual LS method via paraphrase generation, as paraphrases provide diversity in word selection while preserving the sentence’s meaning. |
Kang Liu et al. | arxiv-cs.CL | 2023-07-27 |
53 | XDLM: Cross-lingual Diffusion Language Model for Machine Translation Highlight: Additionally, while pretraining with diffusion models has been studied within a single language, the potential of cross-lingual pretraining remains understudied. To address these gaps, we propose XDLM, a novel Cross-lingual diffusion model for machine translation, consisting of pretraining and fine-tuning stages. |
Linyao Chen; Aosong Feng; Boming Yang; Zihui Li; | arxiv-cs.CL | 2023-07-25 |
54 | Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation Through Phrase Pair Variables Highlight: In this paper, we propose a method called Joint Dropout, that addresses the challenge of low-resource neural machine translation by substituting phrases with variables, resulting in significant enhancement of compositionality, which is a key aspect of generalization. |
Ali Araabi; Vlad Niculae; Christof Monz; | arxiv-cs.CL | 2023-07-24 |
55 | Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation Highlight: In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. |
Neel Bhandari; Pin-Yu Chen; | arxiv-cs.CL | 2023-07-24 |
56 | Incorporating Human Translator Style Into English-Turkish Literary Machine Translation Highlight: In this paper, we focus on English-Turkish literary translation and develop machine translation models that take into account the stylistic features of translators. |
Zeynep Yirmibeşoğlu et al. | arxiv-cs.CL | 2023-07-21 |
57 | Syntax-Aware Complex-Valued Neural Machine Translation Highlight: In this work, we propose a method to incorporate syntax information into a complex-valued Encoder-Decoder architecture. |
Yang Liu; Yuexian Hou; | arxiv-cs.CL | 2023-07-17 |
58 | Improving End-to-End Speech Translation By Imitation-Based Knowledge Distillation with Synthetic Transcripts Highlight: We present an imitation learning approach where a teacher NMT system corrects the errors of an AST student without relying on manual transcripts. |
Rebekka Hubert; Artem Sokolov; Stefan Riezler; | arxiv-cs.CL | 2023-07-17 |
59 | A Neural-Symbolic Approach Towards Identifying Grammatically Correct Sentences Highlight: Despite the importance of having access to well-written sentences, figuring out ways to validate them is still an open area of research. To address this problem, we present a simplified way to validate English sentences through a novel neural-symbolic approach. |
Nicos Isaak; | arxiv-cs.CL | 2023-07-16 |
60 | Data Augmentation for Machine Translation Via Dependency Subtree Swapping Highlight: We present a generic framework for data augmentation via dependency subtree swapping that is applicable to machine translation. |
Attila Nagy; Dorina Petra Lakatos; Botond Barta; Patrick Nanys; Judit Ács; | arxiv-cs.CL | 2023-07-13 |
61 | Pluggable Neural Machine Translation Models Via Memory-augmented Adapters Highlight: Given the expensive training cost and the data scarcity challenge of learning a new model from scratch for each user requirement, we propose a memory-augmented adapter to steer pretrained NMT models in a pluggable manner. |
Yuzhuang Xu et al. | arxiv-cs.CL | 2023-07-12 |
62 | Neural Machine Translation Data Generation and Augmentation Using ChatGPT Highlight: We investigate an alternative to manual parallel corpora – hallucinated parallel corpora created by generative language models. |
Wayne Yang; Garrett Nicolai; | arxiv-cs.CL | 2023-07-11 |
63 | The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task Highlight: This paper describes the NPU-MSXF system for the IWSLT 2023 speech-to-speech translation (S2ST) task which aims to translate from English speech of multi-source to Chinese speech. |
Kun Song et al. | arxiv-cs.SD | 2023-07-10 |
64 | The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics Highlight: In this work, we develop and compare several neural explainability methods and demonstrate their effectiveness for interpreting state-of-the-art fine-tuned neural metrics. |
Ricardo Rei et al. | acl | 2023-07-08 |
65 | Bring More Attention to Syntactic Symmetry for Automatic Postediting of High-Quality Machine Translations Highlight: Thus, we propose a linguistically motivated method of regularization that is expected to enhance APE models’ understanding of the target language: a loss function that encourages symmetric self-attention on the given MT. Our analysis of experimental results demonstrates that the proposed method helps improving the state-of-the-art architecture’s APE quality for high-quality MTs. |
Baikjin Jung; Myungji Lee; Jong-Hyeok Lee; Yunsu Kim; | acl | 2023-07-08 |
66 | Simple and Effective Unsupervised Speech Translation Highlight: The amount of labeled data to train models for speech tasks is limited for most languages, however, the data scarcity is exacerbated for speech translation which requires labeled data covering two different languages. To address this issue, we study a simple and effective approach to build speech translation systems without labeled data by leveraging recent advances in unsupervised speech recognition, machine translation and speech synthesis, either in a pipeline approach, or to generate pseudo-labels for training end-to-end speech translation models. |
Changhan Wang et al. | acl | 2023-07-08 |
67 | Learning Optimal Policy for Simultaneous Machine Translation Via Binary Search Highlight: In this paper, we present a new method for constructing the optimal policy online via binary search. |
Shoutao Guo; Shaolei Zhang; Yang Feng; | acl | 2023-07-08 |
68 | Translation-Enhanced Multilingual Text-to-Image Generation Highlight: We provide two key contributions. 1) Relying on a multilingual multi-modal encoder, we provide a systematic empirical study of standard methods used in cross-lingual NLP when applied to mTTI: Translate Train, Translate Test, and Zero-Shot Transfer. 2) We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulic; Anna Korhonen; | acl | 2023-07-08 |
69 | MCLIP: Multilingual CLIP Via Cross-lingual Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce mCLIP, a retrieval-efficient dual-stream multilingual VLP model, trained by aligning the CLIP model and a Multilingual Text Encoder (MTE) through a novel Triangle Cross-modal Knowledge Distillation (TriKD) method. |
GUANHUA CHEN et. al. | acl | 2023-07-08 |
70 | Exploring Better Text Image Translation with Multimodal Codebook Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we first annotate a Chinese-English TIT dataset named OCRMT30K, providing convenience for subsequent studies. |
ZHIBIN LAN et. al. | acl | 2023-07-08 |
71 | A Holistic Approach to Reference-Free Evaluation of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a reference-free evaluation approach that characterizes evaluation as two aspects: (1) fluency: how well the translated text conforms to normal human language usage; (2) faithfulness: how well the translated text reflects the source data. |
Hanming Wu; Wenjuan Han; Hui Di; Yufeng Chen; Jinan Xu; | acl | 2023-07-08 |
72 | Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unpaired cross-lingual image captioning has long suffered from irrelevancy and disfluency issues, due to the inconsistencies of the semantic scene and syntax attributes during transfer. In this work, we propose to address the above problems by incorporating the scene graph (SG) structures and the syntactic constituency (SC) trees. |
Shengqiong Wu; Hao Fei; Wei Ji; Tat-Seng Chua; | acl | 2023-07-08 |
73 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a new method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
SONGMING ZHANG et. al. | acl | 2023-07-08 |
74 | Prompting PaLM for Translation: Assessing Strategies and Performance IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate various strategies for choosing translation examples for few-shot prompting, concluding that example quality is the most important factor. |
DAVID VILAR et. al. | acl | 2023-07-08 |
75 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; | acl | 2023-07-08 |
76 | Easy Guided Decoding in Providing Suggestions for Interactive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we utilize the parameterized objective function of neural machine translation (NMT) and propose a novel constrained decoding algorithm, namely Prefix-Suffix Guided Decoding (PSGD), to deal with the TS problem without additional training. |
Ke Wang; Xin Ge; Jiayi Wang; Yuqi Zhang; Yu Zhao; | acl | 2023-07-08 |
77 | Back Translation for Speech-to-text Translation Without Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to utilize large amounts of target-side monolingual data to enhance ST without transcripts. |
Qingkai Fang; Yang Feng; | acl | 2023-07-08 |
78 | What About “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. |
Anne Lauscher; Debora Nozza; Ehm Miltersen; Archie Crowley; Dirk Hovy; | acl | 2023-07-08 |
79 | CMOT: Cross-modal Mixup Via Optimal Transport for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Cross-modal Mixup via Optimal Transport (CMOT) to overcome the modality gap. |
Yan Zhou; Qingkai Fang; Yang Feng; | acl | 2023-07-08 |
80 | Do GPTs Produce Less Literal Translations? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. |
Vikas Raunak; Arul Menezes; Matt Post; Hany Hassan; | acl | 2023-07-08 |
81 | Rethinking Multimodal Entity and Relation Extraction from A Translation Point of View Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We revisit the multimodal entity and relation extraction from a translation point of view. |
Changmeng Zheng; Junhao Feng; Yi Cai; Xiaoyong Wei; Qing Li; | acl | 2023-07-08 |
82 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM’s Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; | acl | 2023-07-08 |
83 | Scene Graph As Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs. |
Hao Fei; Qian Liu; Meishan Zhang; Min Zhang; Tat-Seng Chua; | acl | 2023-07-08 |
84 | Understanding and Improving The Robustness of Terminology Constraints in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the robustness of two typical terminology translation methods: Placeholder (PH) and Code-Switch (CS), concerning (1) the number of constraints and (2) the target constraint length. |
HUAAO ZHANG et. al. | acl | 2023-07-08 |
85 | Extrinsic Evaluation of Machine Translation Metrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how useful MT metrics are at detecting the segment-level quality by correlating metrics with how useful the translations are for downstream task. |
Nikita Moghe; Tom Sherborne; Mark Steedman; Alexandra Birch; | acl | 2023-07-08 |
86 | A Simple Concatenation Can Effectively Improve Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the works of video Transformer, we propose a simple unified cross-modal ST method, which concatenates speech and text as the input, and builds a teacher that can utilize both cross-modal information simultaneously. |
Linlin Zhang; Kai Fan; Boxing Chen; Luo Si; | acl | 2023-07-08 |
87 | XPQA: Cross-Lingual Product Question Answering in 12 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing work on PQA focuses mainly on English, in practice there is need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. |
Xiaoyu Shen; Akari Asai; Bill Byrne; Adria De Gispert; | acl | 2023-07-08 |
88 | Subset Retrieval Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose “Subset kNN-MT”, which improves the decoding speed of kNN-MT by two methods: (1) retrieving neighbor target tokens from a subset that is the set of neighbor sentences of the input sentence, not from all sentences, and (2) efficient distance computation technique that is suitable for subset neighbor search using a look-up table. |
HIROYUKI DEGUCHI et. al. | acl | 2023-07-08 |
89 | Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new MMT approach based on a strong text-only MT model, which uses neural adapters, a novel guided self-attention mechanism and which is jointly trained on both visually-conditioned masking and MMT. |
Matthieu Futeral; Cordelia Schmid; Ivan Laptev; Benoît Sagot; Rachel Bawden; | acl | 2023-07-08 |
90 | On Evaluating Multilingual Compositional Generalization with Translated Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we show that this entails critical semantic distortion. To address this limitation, we craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese. |
Zi Wang; Daniel Hershcovich; | acl | 2023-07-08 |
91 | Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. |
Yasmine Karoui; Rémi Lebret; Negar Foroutan Eghlidi; Karl Aberer; | acl | 2023-07-08 |
92 | Understanding and Bridging The Modality Gap for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the modality gap is relatively small during training except for some difficult cases, but keeps increasing during inference due to the cascading effect. To address these problems, we propose the Cross-modal Regularization with Scheduled Sampling (Cress) method. |
Qingkai Fang; Yang Feng; | acl | 2023-07-08 |
93 | Multilingual Event Extraction from Historical Newspaper Adverts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. |
Nadav Borenstein; Natália da Silva Perez; Isabelle Augenstein; | acl | 2023-07-08 |
94 | RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While ACT has garnered attention in recent years due to its usefulness in real-world applications, progress in the task is currently limited by dataset availability, since most prior approaches rely on supervised methods. To address this limitation, we propose Retrieval and Attribute-Marking enhanced Prompting (RAMP), which leverages large multilingual language models to perform ACT in few-shot and zero-shot settings. |
GABRIELE SARTI et. al. | acl | 2023-07-08 |
95 | Ethical Considerations for Machine Translation of Indigenous Languages: Giving A Voice to The Speakers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The data collection, modeling and deploying machine translation systems thus result in new ethical questions that must be addressed. Motivated by this, we first survey the existing literature on ethical considerations for the documentation, translation, and general natural language processing for Indigenous languages. Afterward, we conduct and analyze an interview study to shed light on the positions of community leaders, teachers, and language activists regarding ethical concerns for the automatic translation of their languages. |
Manuel Mager; Elisabeth Mager; Katharina Kann; Ngoc Thang Vu; | acl | 2023-07-08 |
96 | Neural Machine Translation for Mathematical Formulae Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform the tasks of translating from LaTeX to Mathematica as well as from LaTeX to semantic LaTeX. |
Felix Petersen; Moritz Schubotz; Andre Greiner-Petter; Bela Gipp; | acl | 2023-07-08 |
97 | Neural Machine Translation Methods for Translating Text to Sign Language Glosses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments, we improve the performance of the transformer-based models via (1) data augmentation, (2) semi-supervised Neural Machine Translation (NMT), (3) transfer learning and (4) multilingual NMT. |
Dele Zhu; Vera Czehmann; Eleftherios Avramidis; | acl | 2023-07-08 |
98 | Learning Language-Specific Layers for Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while keeping the amount of computation and the number of parameters used in the forward pass constant. |
Telmo Pires; Robin Schmidt; Yi-Hsiu Liao; Stephan Peitz; | acl | 2023-07-08 |
99 | Binary and Ternary Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We approach the problem with a mix of statistics-based quantization for the weights and elastic quantization of the activations and demonstrate the first ternary and binary transformer models on the downstream tasks of summarization and machine translation. |
Zechun Liu; Barlas Oguz; Aasish Pappu; Yangyang Shi; Raghuraman Krishnamoorthi; | acl | 2023-07-08 |
100 | Continual Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest. |
Yuanchi Zhang; Peng Li; Maosong Sun; Yang Liu; | acl | 2023-07-08 |
101 | Discourse-Centric Evaluation of Document-level Machine Translation with A New Densely Annotated Parallel Corpus of Novels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using these annotations, we systematically investigate the similarities and differences between the discourse structures of source and target languages, and the challenges they pose to MT. We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures. This gives us a new perspective on the challenges and opportunities in document-level MT. We make our resource publicly available to spur future research in document-level MT and its generalization to other language translation tasks. |
YUCHEN ELEANOR JIANG et. al. | acl | 2023-07-08 |
102 | Considerations for Meaningful Sign Language Machine Translation Based on Glosses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we review recent works on neural gloss translation. |
Mathias Müller; Zifan Jiang; Amit Moryossef; Annette Rios; Sarah Ebling; | acl | 2023-07-08 |
103 | PEIT: Bridging The Modality Gap with Pre-trained Models for End-to-End Image Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PEIT, an end-to-end image translation framework that bridges the modality gap with pre-trained models. |
Shaolin Zhu; Shangjie Li; Yikun Lei; Deyi Xiong; | acl | 2023-07-08 |
104 | BIG-C: A Multimodal Multi-Purpose Dataset for Bemba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BIG-C (Bemba Image Grounded Conversations), a large multimodal dataset for Bemba. |
Claytone Sikasote; Eunice Mukonde; Md Mahfuz Ibn Alam; Antonios Anastasopoulos; | acl | 2023-07-08 |
105 | Songs Across Borders: Singable and Controllable Neural Lyric Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper bridges the singability quality gap by formalizing lyric translation into a constrained translation problem, converting theoretical guidance and practical techniques from translatology literature to prompt-driven NMT approaches, exploring better adaptation methods, and instantiating them to an English-Chinese lyric translation system. |
Longshen Ou; Xichu Ma; Min-Yen Kan; Ye Wang; | acl | 2023-07-08 |
106 | A Survey on Zero Pronoun Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon has been studied extensively in machine translation (MT), as it poses a significant challenge for MT systems due to the difficulty in determining the correct antecedent for the pronoun. This survey paper highlights the major works that have been undertaken in zero pronoun translation (ZPT) after the neural revolution so that researchers can recognize the current state and future directions of this field. |
LONGYUE WANG et. al. | acl | 2023-07-08 |
107 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; | acl | 2023-07-08 |
108 | Multi-VALUE: A Framework for Cross-Dialectal English NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a suite of resources for evaluating and achieving English dialect invariance. |
CALEB ZIEMS et. al. | acl | 2023-07-08 |
109 | TeCS: A Dataset and Benchmark for Tense Consistency of Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a parallel tense test set, containing French-English 552 utterances. |
Yiming Ai; Zhiwei He; Kai Yu; Rui Wang; | acl | 2023-07-08 |
110 | Text Style Transfer Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer to modify the source side of BT data. |
DAIMENG WEI et. al. | acl | 2023-07-08 |
111 | INK: Injecting KNN Knowledge in Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters. |
Wenhao Zhu; Jingjing Xu; Shujian Huang; Lingpeng Kong; Jiajun Chen; | acl | 2023-07-08 |
112 | Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation. |
Frank Palma Gomez; Subhadarshi Panda; Michael Flor; Alla Rozovskaya; | acl | 2023-07-08 |
113 | To Be or Not to Be: A Translation Reception Study of A Literary Text Translated Into Dutch and Catalan Using Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents the results of a study involving the reception of a fictional story by Kurt Vonnegut translated from English into Catalan and Dutch in three conditions: machine-translated (MT), post-edited (PE) and translated from scratch (HT). |
Ana Guerberof Arenas; Antonio Toral; | arxiv-cs.CL | 2023-07-05 |
114 | X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Task-oriented dialogue research has mainly focused on a few popular languages like English and Chinese, due to the high dataset creation cost for a new language. |
MEHRAD MORADSHAHI et. al. | arxiv-cs.CL | 2023-06-30 |
115 | Stop Pre-Training: Adapt Visual-Language Models to Unseen Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet efficient approach to adapt VLP to unseen languages using MPLM. |
Yasmine Karoui; Rémi Lebret; Negar Foroutan; Karl Aberer; | arxiv-cs.CL | 2023-06-29 |
116 | Scaling Laws for Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we provide a large-scale empirical study of the scaling properties of multilingual neural machine translation models. |
Patrick Fernandes; Behrooz Ghorbani; Xavier Garcia; Markus Freitag; Orhan Firat; | icml | 2023-06-27 |
117 | The Unreasonable Effectiveness of Few-shot Learning for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning, is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. |
XAVIER GARCIA et. al. | icml | 2023-06-27 |
118 | Quality Estimation of Machine Translated Texts Based on Direct Evidence from Training Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we show that the parallel corpus used as training data for training the MT system holds direct clues for estimating the quality of translations produced by the MT system. |
Vibhuti Kumari; Narayana Murthy Kavi; | arxiv-cs.CL | 2023-06-27 |
119 | Constructing Multilingual Code Search Dataset Using Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we create a multilingual code search dataset in four natural and four programming languages using a neural machine translation model. |
Ryo Sekizawa; Nan Duan; Shuai Lu; Hitomi Yanaka; | arxiv-cs.CL | 2023-06-27 |
120 | Prompting Neural Machine Translation with Translation Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a simple but effective method to introduce TMs into neural machine translation (NMT) systems. |
ABUDUREXITI REHEMAN et. al. | aaai | 2023-06-26 |
121 | AMOM: Adaptive Masking Over Masking for Conditional Masked Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we further introduce a simple yet effective adaptive masking over masking strategy to enhance the refinement capability of the decoder and make the encoder optimization easier. |
YISHENG XIAO et. al. | aaai | 2023-06-26 |
122 | Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a data-driven approach for Formality-Sensitive Machine Translation (FSMT) that caters to the unique linguistic properties of four target languages. |
Seungjun Lee; Hyeonseok Moon; Chanjun Park; Heuiseok Lim; | arxiv-cs.CL | 2023-06-26 |
123 | VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose VideoDubber, a machine translation system tailored for the task of video dubbing, which directly considers the speech duration of each token in translation, to match the length of source and target speech. |
YIHAN WU et. al. | aaai | 2023-06-26 |
124 | A Graph Fusion Approach for Cross-Lingual Machine Reading Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel approach, which jointly models the cross-lingual alignment information and the mono-lingual syntax information using a graph. |
ZENAN XU et. al. | aaai | 2023-06-26 |
125 | Evaluation of Chinese-English Machine Translation of Emotion-Loaded Microblog Texts: A Human Annotated Dataset for The Quality Assessment of Emotion Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on how current Machine Translation (MT) tools perform on the translation of emotion-loaded texts by evaluating outputs from Google Translate according to a framework proposed in this paper. |
Shenbin Qian; Constantin Orasan; Felix do Carmo; Qiuliang Li; Diptesh Kanojia; | arxiv-cs.CL | 2023-06-20 |
126 | EvolveMT: An Ensemble MT Engine Improving Itself with Usage Only Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents EvolveMT for efficiently combining multiple machine translation (MT) engines. |
Kamer Ali Yuksel; Ahmet Gunduz; Mohamed Al-Badrashiny; Shreyas Sharma; Hassan Sawaf; | arxiv-cs.CL | 2023-06-20 |
127 | BayLing: Bridging Cross-lingual Alignment and Instruction Following Through Interactive Translation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To minimize human workload, we propose to transfer the capabilities of language generation and instruction following from English to other languages through an interactive translation task. |
SHAOLEI ZHANG et. al. | arxiv-cs.CL | 2023-06-19 |
128 | Sheffield’s Submission to The AmericasNLP Shared Task on Machine Translation Into Indigenous Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we describe the University of Sheffield’s submission to the AmericasNLP 2023 Shared Task on Machine Translation into Indigenous Languages which comprises the translation from Spanish to eleven indigenous languages. |
Edward Gow-Smith; Danae Sánchez Villegas; | arxiv-cs.CL | 2023-06-16 |
129 | Discourse Representation Structure Parsing for Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the pipeline of automatically collecting the linearized Chinese meaning representation data for sequential-to sequential neural networks. |
Chunliu Wang; Xiao Zhang; Johan Bos; | arxiv-cs.CL | 2023-06-16 |
130 | Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Babel-ImageNet, a massively multilingual benchmark that offers (partial) translations of 1000 ImageNet labels to 92 languages, built without resorting to machine translation (MT) or requiring manual annotation. |
Gregor Geigle; Radu Timofte; Goran Glavaš; | arxiv-cs.CL | 2023-06-14 |
131 | A Survey of Vision-Language Pre-training from The Lens of Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We summarize the common architectures, pre-training objectives, and datasets from literature and conjecture what further is needed to make progress on multimodal machine translation. |
Jeremy Gwinnup; Kevin Duh; | arxiv-cs.CL | 2023-06-12 |
132 | Textual Augmentation Techniques Applied to Low Resource Machine Translation: Case of Swahili Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we investigate the impact of applying textual data augmentation tasks to low resource machine translation. |
Catherine Gitau; Vukosi Marivate; | arxiv-cs.CL | 2023-06-12 |
133 | Measuring Sentiment Bias in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we explore how machine translation might introduce a bias in sentiments as classified by sentiment analysis models. |
KAI HARTUNG et. al. | arxiv-cs.CL | 2023-06-12 |
134 | Rethinking Translation Memory Augmented Neural Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper rethinks translation memory augmented neural machine translation (TM-augmented NMT) from two perspectives, i.e., a probabilistic view of retrieval and the variance-bias … |
HONGKUN HAO et. al. | arxiv-cs.CL | 2023-06-12 |
135 | Good, But Not Always Fair: An Evaluation of Gender Bias for Three Commercial Machine Translation Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, analyses have been redirected to more nuanced aspects, intricate phenomena, as well as potential risks that may arise from the widespread use of MT tools. Along this line, this paper offers a meticulous assessment of three commercial MT systems – Google Translate, DeepL, and Modern MT – with a specific focus on gender translation and bias. |
Silvia Alma Piazzolla; Beatrice Savoldi; Luisa Bentivogli; | arxiv-cs.CL | 2023-06-09 |
136 | Assisting Language Learners: Automated Trans-Lingual Definition Generation Via Contrastive Prompt Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the native speaker’s language. |
HENGYUAN ZHANG et. al. | arxiv-cs.CL | 2023-06-09 |
137 | Improving Language Model Integration for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, some works on automatic speech recognition have demonstrated that, if the implicit language model is neutralized in decoding, further improvements can be gained when integrating an external language model. In this work, we transfer this concept to the task of machine translation and compare with the most prominent way of including additional monolingual data – namely back-translation. |
Christian Herold; Yingbo Gao; Mohammad Zeineldeen; Hermann Ney; | arxiv-cs.CL | 2023-06-08 |
138 | A Little Is Enough: Few-Shot Quality Estimation Based Corpus Filtering Improves Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: All the scripts and datasets utilized in this study will be publicly available. |
Akshay Batheja; Pushpak Bhattacharyya; | arxiv-cs.CL | 2023-06-06 |
139 | Extract and Attend: Improving Entity Translation in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When we humans encounter an unknown entity during translation, we usually first look up in a dictionary and then organize the entity translation together with the translations of other parts to form a smooth target sentence. Inspired by this translation process, we propose an Extract-and-Attend approach to enhance entity translation in NMT, where the translation candidates of source entities are first extracted from a dictionary and then attended to by the NMT model to generate the target sentence. |
ZIXIN ZENG et. al. | arxiv-cs.CL | 2023-06-03 |
140 | Evaluating Machine Translation Quality with Conformal Predictive Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new approach for assessing uncertainty in machine translation by simultaneously evaluating translation quality and providing a reliable confidence score. |
Patrizio Giovannotti; | arxiv-cs.CL | 2023-06-02 |
141 | Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the submission of the UPC Machine Translation group to the IWSLT 2023 Offline Speech Translation task. |
Ioannis Tsiamas; Gerard I. Gállego; José A. R. Fonollosa; Marta R. Costa-jussà; | arxiv-cs.CL | 2023-06-02 |
142 | Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the impact of data volume and the use of similar languages on transfer learning in a machine translation task. |
Juuso Eronen; Michal Ptaszynski; Karol Nowakowski; Zheng Lin Chia; Fumito Masui; | arxiv-cs.CL | 2023-06-01 |
143 | Improved Cross-Lingual Transfer Learning For Automatic Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this work is to improve cross-lingual transfer learning in multilingual speech-to-text translation via semantic knowledge distillation. |
SAMEER KHURANA et. al. | arxiv-cs.CL | 2023-06-01 |
144 | How Does Pretraining Improve Discourse-Aware Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the underlying reasons for their strong performance have not been well explained. To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge. |
Zhihong Huang; Longyue Wang; Siyou Liu; Derek F. Wong; | arxiv-cs.CL | 2023-05-31 |
145 | Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We tackle the task of automatically discriminating between human and machine translations. |
Malina Chichirau; Rik van Noord; Antonio Toral; | arxiv-cs.CL | 2023-05-31 |
146 | Translation-Enhanced Multilingual Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulić; Anna Korhonen; | arxiv-cs.CL | 2023-05-30 |
147 | A Corpus for Sentence-level Subjectivity Detection on English News Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel corpus for subjectivity detection at the sentence level. |
FRANCESCO ANTICI et. al. | arxiv-cs.CL | 2023-05-29 |
148 | An Open-Source Gloss-Based Baseline for Spoken to Signed Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present an open-source implementation of a text-to-gloss-to-pose-to-video pipeline approach, demonstrating conversion from German to Swiss German Sign Language, French to French Sign Language of Switzerland, and Italian to Italian Sign Language of Switzerland. |
AMIT MORYOSSEF et. al. | arxiv-cs.CL | 2023-05-28 |
149 | Neural Machine Translation with Dynamic Graph Convolutional Decoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most previous works merely focus on leveraging the source syntax in the well-known encoder-decoder framework. In sharp contrast, this paper proposes an end-to-end translation architecture from the (graph & sequence) structural inputs to the (graph & sequence) outputs, where the target translation and its corresponding syntactic graph are jointly modeled and generated. |
Lei Li; Kai Fan; Lingyu Yang; Hongjia Li; Chun Yuan; | arxiv-cs.CL | 2023-05-28 |
150 | HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. |
SHANTIPRIYA PARIDA et. al. | arxiv-cs.CL | 2023-05-28 |
151 | Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes CIC NLP’s submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas. |
ATNAFU LAMBEBO TONJA et. al. | arxiv-cs.CL | 2023-05-27 |
152 | Do GPTs Produce Less Literal Translations? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there has been relatively little investigation on how such translations differ qualitatively from the translations generated by standard Neural Machine Translation (NMT) models. In this work, we investigate these differences in terms of the literalness of translations produced by the two systems. |
Vikas Raunak; Arul Menezes; Matt Post; Hany Hassan Awadalla; | arxiv-cs.CL | 2023-05-26 |
153 | Robustness of Multi-Source MT to Transcription Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Automatic speech translation is sensitive to speech recognition errors, but in a multilingual scenario, the same content may be available in various languages via simultaneous interpreting, dubbing or subtitling. In this paper, we hypothesize that leveraging multiple sources will improve translation quality if the sources complement one another in terms of correct information they contain. |
Dominik Macháček; Peter Polák; Ondřej Bojar; Raj Dabre; | arxiv-cs.CL | 2023-05-26 |
154 | On The Copying Problem of Unsupervised NMT: A Training Schedule with A Language Discriminator Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a simple but effective training schedule that incorporates a language discriminator loss. |
Yihong Liu; Alexandra Chronopoulou; Hinrich Schütze; Alexander Fraser; | arxiv-cs.CL | 2023-05-26 |
155 | Disambiguated Lexically Constrained Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose disambiguated LCNMT (D-LCNMT) to solve the problem. |
JINPENG ZHANG et. al. | arxiv-cs.CL | 2023-05-26 |
156 | CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Neural machine translation (NMT) systems exhibit limited robustness in handling source-side linguistic variations. Their performance tends to degrade when faced with even slight … |
Md Mahfuz Ibn Alam; Sina Ahmadi; Antonios Anastasopoulos; | arxiv-cs.CL | 2023-05-26 |
157 | Gender Lost In Translation: How Bridging The Gap Between Languages Affects Gender Bias in Zero-Shot Multilingual Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore how bridging the gap between languages for which parallel data is not available affects gender bias in multilingual NMT, specifically for zero-shot directions. |
Lena Cabrera; Jan Niehues; | arxiv-cs.CL | 2023-05-26 |
158 | What About Em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. |
Anne Lauscher; Debora Nozza; Archie Crowley; Ehm Miltersen; Dirk Hovy; | arxiv-cs.CL | 2023-05-25 |
159 | MTCue: Learning Zero-Shot Control of Extra-Textual Attributes By Leveraging Unstructured Context in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces MTCue, a novel neural machine translation (NMT) framework that interprets all context (including discrete variables) as text. |
Sebastian Vincent; Robert Flynn; Carolina Scarton; | arxiv-cs.CL | 2023-05-25 |
160 | Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages in the tasks without the need of labeled data for the target language. |
Shivanshu Gupta; Yoshitomo Matsubara; Ankit Chadha; Alessandro Moschitti; | arxiv-cs.CL | 2023-05-25 |
161 | Eliciting The Translation Ability of Large Language Models Via Multilingual Finetuning with Translation Instructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a detailed analysis by finetuning a multilingual pretrained language model, XGLM-7B, to perform multilingual translation following given instructions. |
Jiahuan Li; Hao Zhou; Shujian Huang; Shanbo Cheng; Jiajun Chen; | arxiv-cs.CL | 2023-05-24 |
162 | Unit-based Speech-to-Speech Translation Without Parallel Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose an unsupervised speech-to-speech translation (S2ST) system that does not rely on parallel data between the source and target languages. |
Anuj Diwan; Anirudh Srinivasan; David Harwath; Eunsol Choi; | arxiv-cs.CL | 2023-05-24 |
163 | Leveraging GPT-4 for Automatic Translation Post-Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we formalize the task of translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs across several language pairs. |
Vikas Raunak; Amr Sharaf; Hany Hassan Awadallah; Arul Menezes; | arxiv-cs.CL | 2023-05-24 |
164 | ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. |
CHENYANG LE et. al. | arxiv-cs.CL | 2023-05-24 |
165 | Dolphin: A Challenging and Diverse Benchmark for Arabic NLG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Dolphin, a novel benchmark that addresses the need for an evaluation framework for the wide collection of Arabic languages and varieties. |
El Moatez Billah Nagoudi; Ahmed El-Shangiti; AbdelRahim Elmadany; Muhammad Abdul-Mageed; | arxiv-cs.CL | 2023-05-24 |
166 | Cascaded Beam Search: Plug-and-Play Terminology-Forcing For Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a plug-and-play approach for translation with terminology constraints. |
Frédéric Odermatt; Béni Egressy; Roger Wattenhofer; | arxiv-cs.CL | 2023-05-23 |
167 | Revisiting Machine Translation for Cross-lingual Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The optimal approach, however, is highly task dependent, as we identify various sources of cross-lingual transfer gap that affect different tasks and approaches differently. |
Mikel Artetxe; Vedanuj Goswami; Shruti Bhosale; Angela Fan; Luke Zettlemoyer; | arxiv-cs.CL | 2023-05-23 |
168 | BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To better model the common semantics shared across texts and videos, we introduce a contrastive learning method in the cross-modal encoder. |
LIYAN KANG et. al. | arxiv-cs.CV | 2023-05-23 |
169 | WYWEB: A NLP Evaluation Benchmark For Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For the prosperity of the NLP community, in this paper, we introduce the WYWEB evaluation benchmark, which consists of nine NLP tasks in classical Chinese, implementing sentence classification, sequence labeling, reading comprehension, and machine translation. |
Bo Zhou; Qianglong Chen; Tianyu Wang; Xiaomi Zhong; Yin Zhang; | arxiv-cs.CL | 2023-05-23 |
170 | Challenges in Context-Aware Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems. In this work, we investigate several challenges that impede progress within this field, relating to discourse phenomena, context usage, model architectures, and document-level evaluation. |
Linghao Jin; Jacqueline He; Jonathan May; Xuezhe Ma; | arxiv-cs.CL | 2023-05-23 |
171 | In-context Example Selection for Machine Translation Using Multiple Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a general framework for combining different features influencing example selection. |
Aswanth Kumar; Anoop Kunchukuttan; Ratish Puduppully; Raj Dabre; | arxiv-cs.CL | 2023-05-23 |
172 | Empowering LLM-based Machine Translation with Cultural Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce a new data curation pipeline to construct a culturally relevant parallel corpus, enriched with annotations of cultural-specific entities. |
Binwei Yao; Ming Jiang; Diyi Yang; Junjie Hu; | arxiv-cs.CL | 2023-05-23 |
173 | Improving Speech Translation By Fusing Speech and Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we harness the complementary strengths of speech and text, which are disparate modalities. |
WENBIAO YIN et. al. | arxiv-cs.CL | 2023-05-23 |
174 | Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we specifically target the unambiguous gender bias issue of multilingual machine translation models and propose a new mitigation method based on a novel perspective on the problem. |
MINWOO LEE et. al. | arxiv-cs.CL | 2023-05-23 |
175 | Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We include training splits from our contemporary dataset and the Sanskrit-English parallel sentences from the training split of Itihāsa, a previously released classical era machine translation dataset containing Sanskrit. |
AYUSH MAHESHWARI et. al. | arxiv-cs.CL | 2023-05-23 |
176 | Non-parametric, Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through qualitative analysis, we found particular improvements when it comes to translating grammatical relations or function words, which results in increased fluency of our model. |
Jiayi Wang; Ke Wang; Yuqi Zhang; Yu Zhao; Pontus Stenetorp; | arxiv-cs.CL | 2023-05-22 |
177 | Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce target factors in a transformer model to predict durations jointly with target language phoneme sequences. |
PROYAG PAL et. al. | arxiv-cs.CL | 2023-05-22 |
178 | Non-Autoregressive Document-Level Machine Translation (NA-DMT): Exploring Effective Approaches, Challenges, and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experiments reveal that, although NAT models significantly accelerate text generation on documents, they do not perform as effectively as on sentences. To bridge this performance gap, we introduce a novel design that underscores the importance of sentence-level alignment for non-autoregressive document-level machine translation (NA-DMT). |
Guangsheng Bao; Zhiyang Teng; Yue Zhang; | arxiv-cs.CL | 2023-05-22 |
179 | Neural Machine Translation for Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we survey the NMT for code generation literature, cataloging the variety of methods that have been explored according to input and output representations, model architectures, optimization techniques used, data sets, and evaluation methods. |
Dharma KC; Clayton T. Morrison; | arxiv-cs.CL | 2023-05-22 |
180 | Decomposed Prompting for Machine Translation Between Related Languages Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach of few-shot prompting that decomposes the translation process into a sequence of word chunk translations. |
Ratish Puduppully; Raj Dabre; Ai Ti Aw; Nancy F. Chen; | arxiv-cs.CL | 2023-05-22 |
181 | Is Translation Helpful? An Empirical Analysis of Cross-Lingual Transfer in Low-Resource Dialog Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A typical approach is to leverage off-the-shelf machine translation (MT) systems to utilize either the training corpus or developed models from high-resource languages. In this work, we investigate whether it is helpful to utilize MT at all in this task. |
Lei Shen; Shuai Yu; Xiaoyu Shen; | arxiv-cs.CL | 2023-05-21 |
182 | VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present our deployment-ready Speech-to-Speech Machine Translation (SSMT) system for English-Hindi, English-Marathi, and Hindi-Marathi language pairs. |
SHIVAM MHASKAR et. al. | arxiv-cs.CL | 2023-05-21 |
183 | ReSeTOX: Re-learning Attention Weights for Toxicity Mitigation in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed method, ReSeTOX (REdo SEarch if TOXic), addresses the issue of Neural Machine Translation (NMT) generating translation outputs that contain toxic words not present in the input. |
Javier García Gilabert; Carlos Escolano; Marta R. Costa-Jussà; | arxiv-cs.CL | 2023-05-19 |
184 | HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we release an annotated dataset for the hallucination and omission phenomena covering 18 translation directions with varying resource levels and scripts. |
DAVID DALE et. al. | arxiv-cs.CL | 2023-05-19 |
185 | The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we develop and compare several neural explainability methods and demonstrate their effectiveness for interpreting state-of-the-art fine-tuned neural metrics. |
RICARDO REI et. al. | arxiv-cs.CL | 2023-05-19 |
186 | Accurate Knowledge Distillation with N-best Reranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose extending the Sequence-level Knowledge Distillation (Kim and Rush, 2016) with n-best reranking to consider not only the top-1 hypotheses but also the top n-best hypotheses of teacher models. |
Hendra Setiawan; | arxiv-cs.CL | 2023-05-19 |
187 | Viewing Knowledge Transfer in Multilingual Machine Translation Through A Representational Lens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. |
David Stap; Vlad Niculae; Christof Monz; | arxiv-cs.CL | 2023-05-19 |
188 | Discourse Centric Evaluation of Machine Translation with A Densely Annotated Parallel Corpus Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new dataset with rich discourse annotations, built upon the large-scale parallel corpus BWB introduced in Jiang et al. (2022). |
YUCHEN ELEANOR JIANG et. al. | arxiv-cs.CL | 2023-05-18 |
189 | NollySenti: Leveraging Transfer Learning and Machine Translation for Nigerian Movie Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we focus on the task of sentiment classification for cross domain adaptation. |
Iyanuoluwa Shode; David Ifeoluwa Adelani; Jing Peng; Anna Feldman; | arxiv-cs.CL | 2023-05-18 |
190 | DUB: Discrete Unit Back-translation for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With DUB, the back-translation technique can successfully be applied on direct ST and obtains an average boost of 5.5 BLEU on MuST-C En-De/Fr/Es. |
Dong Zhang; Rong Ye; Tom Ko; Mingxuan Wang; Yaqian Zhou; | arxiv-cs.CL | 2023-05-18 |
191 | AlignAtt: Using Attention-based Audio-Translation Alignments As A Guide for Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose AlignAtt, a novel policy for simultaneous ST (SimulST) that exploits the attention information to generate source-target alignments that guide the model during inference. |
Sara Papi; Marco Turchi; Matteo Negri; | arxiv-cs.CL | 2023-05-18 |
192 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; | arxiv-cs.CL | 2023-05-18 |
193 | On The Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we find that failing in encoding discriminative target language signal will lead to off-target and a closer lexical distance (i.e., KL-divergence) between two languages’ vocabularies is related with a higher off-target rate. |
Liang Chen; Shuming Ma; Dongdong Zhang; Furu Wei; Baobao Chang; | arxiv-cs.CL | 2023-05-18 |
194 | Multilingual Event Extraction from Historical Newspaper Adverts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new multilingual dataset in English, French, and Dutch composed of newspaper ads from the early modern colonial period reporting on enslaved people who liberated themselves from enslavement. |
Nadav Borenstein; Natalia da Silva Perez; Isabelle Augenstein; | arxiv-cs.CL | 2023-05-18 |
195 | Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce a novel method to enhance neural interlingua representations by making their length variable, thereby overcoming the constraint of fixed-length neural interlingua representations. |
Zhuoyuan Mao; Haiyue Song; Raj Dabre; Chenhui Chu; Sadao Kurohashi; | arxiv-cs.CL | 2023-05-17 |
196 | ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings Across Bengali and Five Other Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this multicultural age, language translation is one of the most performed tasks, and it is becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims to be proficient in such translation tasks and in this paper, we put that claim to the test. |
Sourojit Ghosh; Aylin Caliskan; | arxiv-cs.CY | 2023-05-17 |
197 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM’s Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; | arxiv-cs.CL | 2023-05-17 |
198 | Progressive Translation: Improving Domain Robustness of Neural Machine Translation with Intermediate Sequences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Borrowing techniques from Statistical Machine Translation, we propose intermediate signals which are intermediate sequences from the source-like structure to the target-like structure. |
Chaojun Wang; Yang Liu; Wai Lam; | arxiv-cs.CL | 2023-05-16 |
199 | The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided By Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated particularly by the task of cross-lingual SLU, we demonstrate that the task of speech translation (ST) is a good means of pretraining speech models for end-to-end SLU on both monolingual and cross-lingual scenarios. |
Mutian He; Philip N. Garner; | arxiv-cs.CL | 2023-05-16 |
200 | XPQA: Cross-Lingual Product Question Answering Across 12 Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While existing work on PQA focuses mainly on English, in practice there is need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages across 9 branches, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. |
Xiaoyu Shen; Akari Asai; Bill Byrne; Adrià de Gispert; | arxiv-cs.CL | 2023-05-16 |
201 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a novel method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
SONGMING ZHANG et. al. | arxiv-cs.CL | 2023-05-14 |
202 | PESTS: Persian_English Cross Lingual Corpus for Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, the corpus of semantic textual similarity between sentences in Persian and English languages has been produced for the first time by using linguistic experts. |
Mohammad Abdous; Poorya Piroozfar; Behrouz Minaei Bidgoli; | arxiv-cs.CL | 2023-05-13 |
203 | Improving The Quality of Neural Machine Translation Through Proper Translation of Name Entities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we have shown a method of improving the quality of neural machine translation by translating/transliterating name entities as a preprocessing step. |
Radhika Sharma; Pragya Katyayan; Nisheeth Joshi; | arxiv-cs.CL | 2023-05-12 |
204 | Improving Zero-shot Multilingual Neural Machine Translation By Leveraging Cross-lingual Consistency Regularization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a cross-lingual consistency regularization, CrossConST, to bridge the representation gap among different languages and boost zero-shot translation performance. |
Pengzhi Gao; Liwen Zhang; Zhongjun He; Hua Wu; Haifeng Wang; | arxiv-cs.CL | 2023-05-12 |
205 | Perturbation-based QE: An Explainable, Unsupervised Word-level Quality Estimation Method for Blackbox Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Perturbation-based QE – a word-level Quality Estimation approach that works simply by analyzing MT system output on perturbed input source sentences. |
Tu Anh Dinh; Jan Niehues; | arxiv-cs.CL | 2023-05-12 |
206 | Chain-of-Dictionary Prompting Elicits Translation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a novel method, CoD, which augments LLMs with prior knowledge with the chains of multilingual dictionaries for a subset of input words to elicit translation abilities for LLMs. |
HONGYUAN LU et. al. | arxiv-cs.CL | 2023-05-11 |
207 | Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To use SSMT during inference we propose dynamic decoding, a text generation algorithm that adapts segmentations as it generates translations. |
Francois Meyer; Jan Buys; | arxiv-cs.CL | 2023-05-11 |
208 | How Good Are Commercial Large Language Models on African Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a preliminary analysis of commercial large language models on two tasks (machine translation and text classification) across eight African languages, spanning different language families and geographical areas. |
Jessica Ojo; Kelechi Ogueji; | arxiv-cs.CL | 2023-05-10 |
209 | PriGen: Towards Automated Translation of Android Applications’ Code to Privacy Captions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous work has attempted to help developers create privacy notices through a questionnaire or predefined templates. In this paper, we propose a novel approach and a framework, called PriGen, that extends this prior work. |
Vijayanta Jain; Sanonda Datta Gupta; Sepideh Ghanavati; Sai Teja Peddinti; | arxiv-cs.SE | 2023-05-10 |
210 | Multi-Teacher Knowledge Distillation For Text Image Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Multi-Teacher Knowledge Distillation (MTKD) method to effectively distillate knowledge into the end-to-end TIMT model from the pipeline model. |
CONG MA et. al. | arxiv-cs.CL | 2023-05-09 |
211 | Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address the task of machine translation from an extremely low-resource language (LRL) to English using cross-lingual transfer from a closely related high-resource language (HRL). |
Kaushal Kumar Maurya; Rahul Kejriwal; Maunendra Sankar Desarkar; Anoop Kunchukuttan; | arxiv-cs.CL | 2023-05-09 |
212 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; | arxiv-cs.CL | 2023-05-08 |
213 | Label-Free Multi-Domain Machine Translation with Stage-wise Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a label-free multi-domain machine translation model which requires only a few or no domain-annotated data in training and no domain labels in inference. |
Fan Zhang; Mei Tu; Sangha Kim; Song Liu; Jinyao Yan; | arxiv-cs.CL | 2023-05-06 |
214 | Exploring Human-Like Translation Strategy with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast to traditional machine translation that focuses solely on source-target mapping, LLM-based translation can potentially mimic the human translation process that takes many preparatory steps to ensure high-quality translation. This work aims to explore this possibility by proposing the MAPS framework, which stands for Multi-Aspect Prompting and Selection. |
ZHIWEI HE et. al. | arxiv-cs.CL | 2023-05-06 |
215 | In-context Learning As Maintaining Coherency: A Study of On-the-fly Machine Translation Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The phenomenon of in-context learning has typically been thought of as learning from examples. In this work, which focuses on Machine Translation, we present a perspective of in-context learning as the desired generation task maintaining coherency with its context, i.e., the prompt examples. |
Suzanna Sia; Kevin Duh; | arxiv-cs.CL | 2023-05-05 |
216 | Unified Model Learning for Various Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although the dataset-specific models have achieved impressive performance, it is cumbersome as each dataset demands a model to be designed, trained, and stored. In this work, we aim to unify these translation tasks into a more general setting. |
YUNLONG LIANG et. al. | arxiv-cs.CL | 2023-05-04 |
217 | Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate lexical sharing in multilingual machine translation (MT) from Hindi, Gujarati, Nepali into English. |
Sonal Sannigrahi; Rachel Bawden; | arxiv-cs.CL | 2023-05-04 |
218 | Learning Language-Specific Layers for Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while keeping the amount of computation and the number of parameters used in the forward pass constant. |
Telmo Pessoa Pires; Robin M. Schmidt; Yi-Hsiu Liao; Stephan Peitz; | arxiv-cs.CL | 2023-05-04 |
219 | End-to-end Training and Decoding for Pivot-based Cascaded Translation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes an end-to-end training method for the cascaded translation model and configures an improved decoding algorithm. |
Hao Cheng; Meng Zhang; Liangyou Li; Qun Liu; Zhihua Zhang; | arxiv-cs.CL | 2023-05-03 |
220 | Evaluating The Efficacy of Length-Controllable Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that BLEURT and COMET have the highest correlation with human evaluation and are most suitable as evaluation metrics for length-controllable machine translation. |
HAO CHENG et. al. | arxiv-cs.CL | 2023-05-03 |
221 | SLTUNET: A Simple Unified Model for Sign Language Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose SLTUNET, a simple unified neural model designed to support multiple SLT-related tasks jointly, such as sign-to-gloss, gloss-to-text and sign-to-text translation. |
Biao Zhang; Mathias Müller; Rico Sennrich; | arxiv-cs.CL | 2023-05-02 |
222 | Low-Resourced Machine Translation for Senegalese Wolof Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a parallel Wolof/French corpus of 123,000 sentences on which we conducted experiments on machine translation models based on Recurrent Neural Networks (RNN) in different data configurations. |
Derguene Mbaye; Moussa Diallo; Thierno Ibrahima Diop; | arxiv-cs.CL | 2023-04-30 |
223 | Synthetic Cross-language Information Retrieval Training Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While MS MARCO is a large resource, it is of fixed size; its genre and domain of discourse are fixed; and the translated documents are not written in the language of a native speaker of the language, but rather in translationese. To address these problems, we introduce the JH-POLO CLIR training set creation methodology. |
JAMES MAYFIELD et. al. | arxiv-cs.IR | 2023-04-29 |
224 | Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. |
N. Bhandari; P. -Y. Chen; | icassp | 2023-04-27 |
225 | LEAPT: Learning Adaptive Prefix-to-Prefix Translation For Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by strategies utilized by human interpreters and wait policies, we propose a novel adaptive prefix-to-prefix training policy called LEAPT, which allows our machine translation model to learn how to translate source sentence prefixes and make use of the future context. |
L. Lin; S. Li; X. Shi; | icassp | 2023-04-27 |
226 | Rethinking The Reasonability of The Test Set for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set, denoted as SiMuST-C. |
M. LIU et. al. | icassp | 2023-04-27 |
227 | Improving Speech-to-Speech Translation Through Unlabeled Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an effective way to utilize the massive existing unlabeled text from different languages to create a large amount of S2ST data to improve S2ST performance by applying various acoustic effects to the generated synthetic data. |
X. -P. NGUYEN et. al. | icassp | 2023-04-27 |
228 | M3ST: Mix at Three Levels for Speech Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Mix at three levels for Speech Translation (M3ST) method to increase the diversity of the augmented training corpus. |
X. CHENG et. al. | icassp | 2023-04-27 |
229 | Targeted Adversarial Attacks Against Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new targeted adversarial attack against NMT models. |
S. Sadrizadeh; A. D. Aghdam; L. Dolamic; P. Frossard; | icassp | 2023-04-27 |
230 | Escaping The Sentence-level Paradigm in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Much work in document-context machine translation exists, but for various reasons has been unable to catch hold. This paper suggests a path out of this rut by addressing three impediments at once: what architectures should we use? |
Matt Post; Marcin Junczys-Dowmunt; | arxiv-cs.CL | 2023-04-25 |
231 | Translationese Reduction Using Abstract Meaning Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that across four metrics, and qualitatively, using AMR as an interlingua enables the reduction of translationese and we compare our results to two additional approaches: one based on round-trip machine translation and one based on syntactically controlled generation. |
Shira Wein; Nathan Schneider; | arxiv-cs.CL | 2023-04-22 |
232 | Exploring Paracrawl for Document-level Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we extract parallel paragraphs from Paracrawl parallel webpages using automatic sentence alignments and we use the extracted parallel paragraphs as parallel documents for training document-level translation models. |
Yusser Al Ghussin; Jingyi Zhang; Josef van Genabith; | arxiv-cs.CL | 2023-04-20 |
233 | Improving Speech Translation By Cross-Modal Multi-Grained Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the final model often performs worse on the MT task than the MT model trained alone, which means that the knowledge transfer ability of this method is also limited. To deal with these problems, we propose the FCCL (Fine- and Coarse- Granularity Contrastive Learning) approach for E2E-ST, which makes explicit knowledge transfer through cross-modal multi-grained contrastive learning. |
HAO ZHANG et. al. | arxiv-cs.CL | 2023-04-20 |
234 | The EBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the eBible corpus: a dataset containing 1009 translations of portions of the Bible with data in 833 different languages across 75 language families. |
VESA AKERMAN et. al. | arxiv-cs.CL | 2023-04-19 |
235 | An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, works focusing on distilling knowledge from large multilingual neural machine translation (MNMT) models into smaller ones are practically nonexistent, despite the popularity and superiority of MNMT. This paper bridges this gap by presenting an empirical investigation of knowledge distillation for compressing MNMT models. |
Varun Gumma; Raj Dabre; Pratyush Kumar; | arxiv-cs.CL | 2023-04-18 |
236 | Improving Autoregressive NLP Tasks Via Modular Linearized Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes modular linearized attention (MLA), which combines multiple efficient attention mechanisms, including cosFormer, to maximize inference quality while achieving notable speedups. |
Victor Agostinelli; Lizhong Chen; | arxiv-cs.CL | 2023-04-17 |
237 | Neural Machine Translation For Low Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The goal of this paper is to investigate the realm of low resource languages and build a Neural Machine Translation model to achieve state-of-the-art results. |
Vakul Goyle; Parvathy Krishnaswamy; Kannan Girija Ravikumar; Utsa Chattopadhyay; Kartikay Goyle; | arxiv-cs.CL | 2023-04-16 |
238 | TransDocs: Optical Character Recognition with Word to Word Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, I have shown comparative study for pre-trained OCR while using deep learning model using LSTM-based seq2seq architecture with attention for machine translation. |
Abhishek Bamotra; Phani Krishna Uppala; | arxiv-cs.CV | 2023-04-15 |
239 | Learning Homographic Disambiguation Representation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach to tackle homographic issues of NMT in the latent space. |
Weixuan Wang; Wei Peng; Qun Liu; | arxiv-cs.CL | 2023-04-12 |
240 | Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we systematically investigate the advantages and challenges of LLMs for MMT by answering two questions: 1) How well do LLMs perform in translating a massive number of languages? |
WENHAO ZHU et. al. | arxiv-cs.CL | 2023-04-10 |
241 | RISC: Generating Realistic Synthetic Bilingual Insurance Contract Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents RISC, an open-source Python package data generator (https://github.com/GRAAL-Research/risc). |
David Beauchemin; Richard Khoury; | arxiv-cs.CL | 2023-04-09 |
242 | ParroT: Translating During Chat Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models are only accessible through restricted APIs, which creates barriers to new research and advancements in the field. Therefore, we propose the ParroT framework to enhance and regulate the translation abilities during chat based on open-sourced LLMs (i.e., LLaMA-7b, BLOOMZ-7b-mt) and human written translation and evaluation data. |
WENXIANG JIAO et. al. | arxiv-cs.CL | 2023-04-05 |
243 | MUFIN: Improving Neural Repair Models with Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The main problem is to generate interesting and diverse pairs that maximize the effectiveness of training. As a contribution to this problem, we propose to use back-translation, a technique coming from neural machine translation. |
André Silva; João F. Ferreira; He Ye; Martin Monperrus; | arxiv-cs.SE | 2023-04-05 |
244 | How to Design Translation Prompts for ChatGPT: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, in this paper, we explore how to assist machine translation with ChatGPT. |
Yuan Gao; Ruili Wang; Feng Hou; | arxiv-cs.CL | 2023-04-04 |
245 | Document-Level Machine Translation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The study focuses on three aspects: 1) Effects of Discourse-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modelling Abilities, where we further probe discourse knowledge encoded in LLMs and examine the impact of training techniques on discourse modeling. |
LONGYUE WANG et. al. | arxiv-cs.CL | 2023-04-04 |
246 | LAHM : Large Annotated Dataset for Multi-Domain and Multilingual Hate Speech Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new multilingual hate speech analysis dataset for English, Hindi, Arabic, French, German and Spanish languages for multiple domains across hate speech – Abuse, Racism, Sexism, Religious Hate and Extremism. |
Ankit Yadav; Shubham Chandel; Sushant Chatufale; Anil Bandhakavi; | arxiv-cs.CL | 2023-04-03 |
247 | PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a novel semi-supervised method, SPDG, that generates high-quality pseudo-parallel data for multilingual pre-training. |
Alireza Salemi; Amirhossein Abaskohi; Sara Tavakoli; Yadollah Yaghoobzadeh; Azadeh Shakery; | arxiv-cs.CL | 2023-04-03 |
248 | ε KÚ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the performance of massively multilingual neural machine translation (NMT) systems in translating Yorùbá greetings (ε kú [MASK]), which are a big part of Yorùbá language and culture, into English. To evaluate these models, we present IkiniYorùbá, a Yorùbá-English translation dataset containing some Yorùbá greetings, and sample use cases. |
Idris Akinade; Jesujoba Alabi; David Adelani; Clement Odoje; Dietrich Klakow; | arxiv-cs.CL | 2023-03-31 |
249 | Hallucinations in Large Multilingual Translation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing research on hallucinations has primarily focused on small bilingual models trained on high-resource languages, leaving a gap in our understanding of hallucinations in massively multilingual models across diverse translation scenarios. In this work, we fill this gap by conducting a comprehensive analysis on both the M2M family of conventional neural machine translation models and ChatGPT, a general-purpose large language model (LLM) that can be prompted for translation. |
NUNO M. GUERREIRO et. al. | arxiv-cs.CL | 2023-03-28 |
250 | Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We test the efficacy of bilingual lexica in a real-world set-up, on 200-language translation models trained on web-crawled text. We present several findings: (1) using lexical data augmentation, we demonstrate sizable performance gains for unsupervised translation; (2) we compare several families of data augmentation, demonstrating that they yield similar improvements, and can be combined for even greater improvements; (3) we demonstrate the importance of carefully curated lexica over larger, noisier ones, especially with larger models; and (4) we compare the efficacy of multilingual lexicon data versus human-translated parallel data. |
Alex Jones; Isaac Caswell; Ishank Saxena; Orhan Firat; | arxiv-cs.CL | 2023-03-27 |
251 | Linguistically Informed ChatGPT Prompts to Enhance Japanese-Chinese Machine Translation: A Case Study on Attributive Clauses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Present-day machine translation tools often fail to accurately translate attributive clauses from Japanese to Chinese. In light of this, this paper investigates the linguistic problem underlying such difficulties, namely how does the semantic role of the modified noun affect the selection of translation patterns for attributive clauses, from a linguistic perspective. |
Wenshi Gu; | arxiv-cs.CL | 2023-03-27 |
252 | Translate The Beauty in Songs: Jointly Learning to Align Melody and Translate Lyrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Lyrics-Melody Translation with Adaptive Grouping (LTAG), a holistic solution to automatic song translation by jointly modeling lyrics translation and lyrics-melody alignment. |
CHENGXI LI et. al. | arxiv-cs.CL | 2023-03-27 |
253 | Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative large language models (LLMs), e.g., ChatGPT, have demonstrated remarkable proficiency across several NLP tasks such as machine translation, question answering, text summarization, and natural language understanding. |
Qingyu Lu; Baopu Qiu; Liang Ding; Liping Xie; Dacheng Tao; | arxiv-cs.CL | 2023-03-24 |
254 | Towards Making The Most of ChatGPT for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this report, we aim to further mine ChatGPT’s translation ability by revisiting several aspects: temperature, task information, and domain information, and correspondingly propose two (simple but effective) prompts: Task-Specific Prompts (TSP) and Domain-Specific Prompts (DSP). |
KEQIN PENG et. al. | arxiv-cs.CL | 2023-03-23 |
255 | Selective Data Augmentation for Robust Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to use an e2e architecture for English-Hindi (en-hi) ST. We use two imperfect machine translation (MT) services to translate Libri-trans en text into hi text. |
Rajul Acharya; Ashish Panda; Sunil Kumar Kopparapu; | arxiv-cs.CL | 2023-03-22 |
256 | LEAPT: Learning Adaptive Prefix-to-prefix Translation For Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by strategies utilized by human interpreters and wait policies, we propose a novel adaptive prefix-to-prefix training policy called LEAPT, which allows our machine translation model to learn how to translate source sentence prefixes and make use of the future context. |
Lei Lin; Shuangtao Li; Xiaodong Shi; | arxiv-cs.CL | 2023-03-21 |
257 | Translate Your Gibberish: Black-box Adversarial Attack on Machine Translation Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a simple approach to fool state-of-the-art machine translation tools in the task of translation from Russian to English and vice versa. |
Andrei Chertkov; Olga Tsymboi; Mikhail Pautov; Ivan Oseledets; | arxiv-cs.CL | 2023-03-20 |
258 | Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A contributing factor to this problem is that NMT models trained with the one-to-one paradigm struggle to handle the source diversity phenomenon, where inputs with the same meaning can be expressed differently. In this work, we treat this problem as a bilevel optimization problem and present a consistency-aware meta-learning (CAML) framework derived from the model-agnostic meta-learning (MAML) algorithm to address it. |
Rongxiang Weng; Qiang Wang; Wensen Cheng; Changfeng Zhu; Min Zhang; | arxiv-cs.CL | 2023-03-20 |
259 | AMOM: Adaptive Masking Over Masking for Conditional Masked Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer-based autoregressive (AR) methods have achieved appealing performance for varied sequence-to-sequence generation tasks, e.g., neural machine translation, summarization, and code generation, but suffer from low inference efficiency. |
YISHENG XIAO et. al. | arxiv-cs.CL | 2023-03-13 |
260 | ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To relax the dependency on labeled data of downstream tasks, we propose an intuitive and effective zero-shot learning framework, ZeroNLG, which can deal with multiple NLG tasks, including image-to-text (image captioning), video-to-text (video captioning), and text-to-text (neural machine translation), across English, Chinese, German, and French within a unified framework. |
BANG YANG et. al. | arxiv-cs.CL | 2023-03-11 |
261 | MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nonetheless, visual speech is not as distinguishable as audio speech, making it difficult to develop a mapping from source speech phonemes to the target language text. To address this issue, we propose MixSpeech, a cross-modality self-learning framework that utilizes audio speech to regularize the training of visual speech tasks. |
XIZE CHENG et. al. | arxiv-cs.CV | 2023-03-09 |
262 | GATE: A Challenge Set for Gender-Ambiguous Translation Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent work has led to the development of gender rewriters that generate alternative gender translations on such ambiguous inputs, but such systems are plagued by poor linguistic coverage. To encourage better performance on this task we present and release GATE, a linguistically diverse corpus of gender-ambiguous source sentences along with multiple alternative target language translations. |
Spencer Rarrick; Ranjita Naik; Varun Mathur; Sundar Poudel; Vishal Chowdhary; | arxiv-cs.CL | 2023-03-07 |
263 | Exploiting Language Relatedness in Machine Translation Through Domain Adaptation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to tackle the challenges faced by MT, we present a novel approach of using a scaled similarity score of sentences, especially for related languages, based on a 5-gram KenLM language model with Kneser-Ney smoothing, for filtering in-domain data from out-of-domain corpora to boost the translation quality of MT. Furthermore, we employ other domain adaptation techniques such as multi-domain, fine-tuning and iterative back-translation approaches to compare with our novel approach on the Hindi-Nepali language pair for NMT and SMT. |
Amit Kumar; Rupjyoti Baruah; Ajay Pratap; Mayank Swarnkar; Anil Kumar Singh; | arxiv-cs.CL | 2023-03-03 |
264 | Rethinking The Reasonability of The Test Set for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we manually annotate a monotonic test set based on the MuST-C English-Chinese test set, denoted as SiMuST-C. |
MENGGE LIU et. al. | arxiv-cs.CL | 2023-03-02 |
265 | Targeted Adversarial Attacks Against Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new targeted adversarial attack against NMT models. |
Sahar Sadrizadeh; AmirHossein Dabiri Aghdam; Ljiljana Dolamic; Pascal Frossard; | arxiv-cs.CL | 2023-03-02 |
266 | Federated Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework that, instead of multi-round model-based interactions, leverages one-round memorization-based interaction to share knowledge across different clients to build low-overhead privacy-preserving systems. |
YICHAO DU et. al. | arxiv-cs.CL | 2023-02-23 |
267 | Simple and Scalable Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple and scalable nearest neighbor machine translation framework to drastically promote the decoding and storage efficiency of kNN-based models while maintaining the translation performance. |
YUHAN DAI et. al. | arxiv-cs.CL | 2023-02-23 |
268 | Exploring The Potential of Machine Translation for Generating Named Entity Datasets: A Case Study Between Persian and English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study focuses on the generation of Persian named entity datasets through the application of machine translation on English datasets. |
Amir Sartipi; Afsaneh Fatemi; | arxiv-cs.CL | 2023-02-19 |
269 | Zero and Few-Shot Localization of Task-Oriented Dialogue Agents with A Distilled Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose automatic methods that use ToD training data in a source language to build a high-quality functioning dialogue agent in another target language that has no training data (i.e. zero-shot) or a small training set (i.e. few-shot). |
Mehrad Moradshahi; Sina J. Semnani; Monica S. Lam; | arxiv-cs.CL | 2023-02-18 |
270 | How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a comprehensive evaluation of GPT models for machine translation, covering various aspects such as quality of different GPT models in comparison with state-of-the-art research and commercial systems, effect of prompting strategies, robustness towards domain shifts and document-level translation. |
AMR HENDY et al. | arxiv-cs.CL | 2023-02-17 |
271 | Evaluating and Improving The Coreference Capabilities of Machine Translation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we ask: How well do MT models learn coreference resolution from implicit signal? |
Asaf Yehudai; Arie Cattan; Omri Abend; Gabriel Stanovsky; | arxiv-cs.CL | 2023-02-16 |
272 | Document Flattening: Beyond Concatenating Context for Document-Level Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct comprehensive experiments and analyses on three benchmark datasets for English-German translation, and validate the effectiveness of two variants of DocFlat. |
Minghao Wu; George Foster; Lizhen Qu; Gholamreza Haffari; | arxiv-cs.CL | 2023-02-15 |
273 | Encoding Sentence Position in Context-Aware Neural Machine Translation with Concatenation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare various methods to encode sentence positions into token representations, including novel methods. |
Lorenzo Lupo; Marco Dinarelli; Laurent Besacier; | arxiv-cs.CL | 2023-02-13 |
274 | Language-Aware Multilingual Machine Translation with Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Finally, we apply intra-distillation to this co-training approach. Combining these two approaches significantly improves MMT performance, outperforming three state-of-the-art SSL methods by a large margin, e.g., 11.3% and 3.7% improvement on an 8-language and a 15-language benchmark compared with MASS, respectively. |
Haoran Xu; Jean Maillard; Vedanuj Goswami; | arxiv-cs.CL | 2023-02-09 |
275 | Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Training such metrics requires data which can be expensive and difficult to acquire, particularly for lower-resource languages. We show how knowledge can be distilled from Large Language Models (LLMs) to improve upon such learned metrics without requiring human annotators, by creating synthetic datasets which can be mixed into existing datasets, requiring only a corpus of text in the target language. |
Amirkeivan Mohtashami; Mauro Verzetti; Paul K. Rubenstein; | arxiv-cs.CL | 2023-02-07 |
276 | The Unreasonable Effectiveness of Few-shot Learning for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that with only 5 examples of high-quality translation data shown at inference, a transformer decoder-only model trained solely with self-supervised learning, is able to match specialized supervised state-of-the-art models as well as more general commercial translation systems. |
XAVIER GARCIA et al. | arxiv-cs.CL | 2023-02-02 |
277 | Code Translation with Compiler Representations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we leverage low-level compiler intermediate representations (IR) to improve code translation. |
MARC SZAFRANIEC et al. | iclr | 2023-02-01 |
278 | An Evaluation of Persian-English Machine Translation Datasets with Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Nowadays, many researchers are focusing their attention on the subject of machine translation (MT). However, Persian machine translation has remained unexplored despite a vast … |
Amir Sartipi; Meghdad Dehghan; Afsaneh Fatemi; | arxiv-cs.CL | 2023-02-01 |
279 | Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel architecture named as attention link (AL) to help improve transformer models’ performance, especially in low training resources. |
Zeping Min; | arxiv-cs.CL | 2023-02-01 |
280 | Adaptive Machine Translation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work aims to investigate how we can utilize in-context learning to improve real-time adaptive MT. Our extensive experiments show promising results at translation time. |
Yasmin Moslem; Rejwanul Haque; John D. Kelleher; Andy Way; | arxiv-cs.CL | 2023-01-30 |
281 | KG-BERTScore: Incorporating Knowledge Graph Into BERTScore for Reference-Free Machine Translation Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we incorporate multilingual knowledge graph into BERTScore and propose a metric named KG-BERTScore, which linearly combines the results of BERTScore and bilingual named entity matching for reference-free machine translation evaluation. |
ZHANGLIN WU et al. | arxiv-cs.CL | 2023-01-30 |
282 | Extremal Domain Translation with Neural Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the extremal transport (ET) which is a mathematical formalization of the theoretically best possible unpaired translation between a pair of domains w.r.t. the given similarity function. |
Milena Gazdieva; Alexander Korotin; Daniil Selikhanovych; Evgeny Burnaev; | arxiv-cs.LG | 2023-01-30 |
283 | Improving Cross-lingual Information Retrieval on Low-Resource Languages Via Optimal Transport Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose OPTICAL: Optimal Transport distillation for low-resource Cross-lingual information retrieval. |
Zhiqi Huang; Puxuan Yu; James Allan; | arxiv-cs.CL | 2023-01-29 |
284 | Gender Neutralization for An Inclusive Machine Translation: from Theoretical Foundations to Open Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore gender-neutral translation (GNT) as a form of gender inclusivity and a goal to be achieved by machine translation (MT) models, which have been found to perpetuate gender bias and discrimination. |
Andrea Piergentili; Dennis Fucci; Beatrice Savoldi; Luisa Bentivogli; Matteo Negri; | arxiv-cs.CL | 2023-01-24 |
285 | Interactive-Chain-Prompting: Ambiguity Resolution for Crosslingual Conditional Generation with Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Choosing the incorrect option might significantly affect translation usefulness and quality. We propose a novel method interactive-chain prompting — a series of question, answering and generation intermediate steps between a Translator model and a User model — that reduces translations into a list of subproblems addressing ambiguities and then resolving such subproblems before producing the final text to be translated. |
Jonathan Pilault; Xavier Garcia; Arthur Bražinskas; Orhan Firat; | arxiv-cs.LG | 2023-01-24 |
286 | Is ChatGPT A Good Translator? Yes With GPT-4 As The Engine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This report provides a preliminary evaluation of ChatGPT for machine translation, including translation prompt, multilingual translation, and translation robustness. |
Wenxiang Jiao; Wenxuan Wang; Jen-tse Huang; Xing Wang; Zhaopeng Tu; | arxiv-cs.CL | 2023-01-20 |
287 | Improving Machine Translation with Phrase Pair Injection and Corpus Filtering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. |
Akshay Batheja; Pushpak Bhattacharyya; | arxiv-cs.CL | 2023-01-19 |
288 | Machine Translation for Accessible Multi-Language Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to leverage those very advances to demonstrate that multi-language analysis is currently accessible to all computational scholars. |
Edward W. Chew; William D. Weisman; Jingying Huang; Seth Frey; | arxiv-cs.CL | 2023-01-19 |
289 | Understanding and Detecting Hallucinations in Neural Machine Translation Via Model Introspection Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: Neural sequence generation models are known to hallucinate, by producing outputs that are unrelated to the source text. These hallucinations are potentially harmful, yet it … |
Weijia Xu; Sweta Agrawal; Eleftheria Briakou; Marianna J. Martindale; Marine Carpuat; | arxiv-cs.CL | 2023-01-18 |
290 | HanoiT: Enhancing Context-aware Translation Via Selective Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The irrelevant or trivial words may bring some noise and distract the model from learning the relationship between the current sentence and the auxiliary context. To mitigate this problem, we propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context. |
JIAN YANG et al. | arxiv-cs.CL | 2023-01-17 |
291 | Unsupervised Mandarin-Cantonese Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The key contributions of our project include: 1. |
Megan Dare; Valentina Fajardo Diaz; Averie Ho Zoen So; Yifan Wang; Shibingfeng Zhang; | arxiv-cs.CL | 2023-01-10 |
292 | Automatic Standardization of Arabic Dialects for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Carrying out this research could then lead to combining “automatic standardization” software and automatic translation software so that we take the output of the first software and introduce it as input into the second one to obtain at the end a quality machine translation. |
Abidrabbo Alnassan; | arxiv-cs.CL | 2023-01-09 |
293 | Applying Automated Machine Translation to Educational Video Courses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We studied the capability of automated machine translation in the online video education space by automatically translating Khan Academy videos with state-of-the-art translation models and applying text-to-speech synthesis and audio/video synchronization to build engaging videos in target languages. |
Linden Wang; | arxiv-cs.CL | 2023-01-08 |
294 | Building A Parallel Corpus and Training Translation Models Between Luganda and English Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we build a parallel corpus with 41,070 pairwise sentences for Luganda and English which is based on three different open-sourced corpora. |
Richard Kimera; Daniela N. Rim; Heeyoul Choi; | arxiv-cs.CL | 2023-01-06 |
295 | Statistical Machine Translation for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Different preprocessing approaches are proposed in this paper to handle the noise of the dataset. |
Sudhansu Bala Das; Divyajoti Panda; Tapas Kumar Mishra; Bidyut Kr. Patra; | arxiv-cs.CL | 2023-01-02 |
296 | From Inclusive Language to Gender-Neutral Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Gender inclusivity in language has become a central topic of debate and research. Its application in the cross-lingual contexts of human and machine translation (MT), however, … |
Andrea Piergentili; Dennis Fucci; Beatrice Savoldi; L. Bentivogli; Matteo Negri; | ArXiv | 2023-01-01 |
297 | Is ChatGPT A Good Translator? A Preliminary Study IF:4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This report provides a preliminary evaluation of ChatGPT for machine translation, including translation prompt, multilingual translation, and translation robustness. We adopt the … |
Wenxiang Jiao; Wenxuan Wang; Jen-tse Huang; Xing Wang; Zhaopeng Tu; | ArXiv | 2023-01-01 |
298 | Non-Autoregressive Neural Machine Translation: A Call for Clarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we take a step back and revisit several techniques that have been proposed for improving non-autoregressive translation models and compare their combined translation quality and speed implications under third-party testing environments. |
Robin Schmidt; Telmo Pires; Stephan Peitz; Jonas Lööf; | emnlp | 2022-12-30 |
299 | Breaking The Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel representation method for Chinese characters to break the bottlenecks, namely StrokeNet, which represents a Chinese character by a Latinized stroke sequence. |
Zhijun Wang; Xuebo Liu; Min Zhang; | emnlp | 2022-12-30 |
300 | Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To compress and accelerate Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank and meanwhile reduces operations and parameters. |
SUNZHU LI et al. | emnlp | 2022-12-30 |
301 | IndicXNLI: Evaluating Multilingual Inference for Indian Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we introduce INDICXNLI, an NLI dataset for 11 Indic languages. |
Divyanshu Aggarwal; Vivek Gupta; Anoop Kunchukuttan; | emnlp | 2022-12-30 |
302 | Neural Machine Translation with Contrastive Translation Memories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Different from previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other providing maximal information gain in three phases. |
Xin Cheng; Shen Gao; Lemao Liu; Dongyan Zhao; Rui Yan; | emnlp | 2022-12-30 |
303 | Modeling Consistency Preference Via Lexical Chains for Document-level Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we aim to relieve the issue of lexical translation inconsistency for document-level neural machine translation (NMT) by modeling consistency preference for lexical chains, which consist of repeated words in a source-side document and provide a representation of the lexical consistency structure of the document. |
XINGLIN LYU et al. | emnlp | 2022-12-30 |
304 | IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages. |
AMAN KUMAR et al. | emnlp | 2022-12-30 |
305 | LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages. Then, an effective baseline LVP-M3 using visual prompts is proposed to support translations between different languages, which includes three stages (token encoding, language-aware visual prompt generation, and language translation). |
HONGCHENG GUO et al. | emnlp | 2022-12-30 |
306 | Information-Transport-based Policy for Simultaneous Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we treat the translation as information transport from source to target and accordingly propose an Information-Transport-based Simultaneous Translation (ITST). |
Shaolei Zhang; Yang Feng; | emnlp | 2022-12-30 |
307 | Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The experts note that MT outputs contain not only mistranslations, but also discourse-disrupting errors and stylistic inconsistencies. To address these problems, we train a post-editing model whose output is preferred over normal MT output at a rate of 69% by experts. |
KATHERINE THAI et al. | emnlp | 2022-12-30 |
308 | DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. |
Gabriele Sarti; Arianna Bisazza; Ana Guerberof-Arenas; Antonio Toral; | emnlp | 2022-12-30 |
309 | A Template-based Method for Constrained Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a template-based method that can yield results with high translation quality and match accuracy and the inference speed of our method is comparable with unconstrained NMT models. |
SHUO WANG et al. | emnlp | 2022-12-30 |
310 | WeTS: A Benchmark for Translation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To break these limitations mentioned above and spur the research in TS, we create a benchmark dataset, called WeTS, which is a golden corpus annotated by expert translators on four translation directions. |
Zhen Yang; Fandong Meng; Yingxue Zhang; Ernan Li; Jie Zhou; | emnlp | 2022-12-30 |
311 | PreQuEL: Quality Estimation of Machine Translation Outputs in Advance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the task of PreQuEL, Pre-(Quality-Estimation) Learning. |
Shachar Don-Yehiya; Leshem Choshen; Omri Abend; | emnlp | 2022-12-30 |
312 | DEMETR: Diagnosing Evaluation Metrics for Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The operations of newer learned metrics (e.g., BLEURT, COMET), which leverage pretrained language models to achieve higher correlations with human quality judgments than BLEU, are opaque in comparison. In this paper, we shed light on the behavior of these learned metrics by creating DEMETR, a diagnostic dataset with 31K English examples (translated from 10 source languages) for evaluating the sensitivity of MT evaluation metrics to 35 different linguistic perturbations spanning semantic, syntactic, and morphological error categories. |
MARZENA KARPINSKA et al. | emnlp | 2022-12-30 |
313 | Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. |
TU VU et al. | emnlp | 2022-12-30 |
314 | Low-resource Neural Machine Translation with Cross-modal Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we turn to connect several low-resource languages to a particular high-resource one by additional visual modality. |
Zhe Yang; Qingkai Fang; Yang Feng; | emnlp | 2022-12-30 |
315 | Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, in this work, we introduce IKD-MMT, a novel MMT framework to support the image-free inference phase via an inversion knowledge distillation scheme. |
Ru Peng; Yawen Zeng; Jake Zhao; | emnlp | 2022-12-30 |
316 | Entropy-Based Vocabulary Substitution for Incremental Learning in Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an entropy-based vocabulary substitution (EVS) method that just needs to walk through new language pairs for incremental learning in a large-scale multilingual data updating while remaining the size of the vocabulary. |
Kaiyu Huang; Peng Li; Jin Ma; Yang Liu; | emnlp | 2022-12-30 |
317 | Competency-Aware Neural Machine Translation: Can Machine Translation Know Its Own Translation Quality? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is in sharp contrast to human translators who give feedback or conduct further investigations whenever they are in doubt about predictions. To fill this gap, we propose a novel competency-aware NMT by extending conventional NMT with a self-estimator, offering abilities to translate a source sentence and estimate its competency. |
PEI ZHANG et al. | emnlp | 2022-12-30 |
318 | MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely-spoken languages. |
ANNA CURREY et al. | emnlp | 2022-12-30 |
319 | GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To bridge the data and evaluation gaps, we propose a benchmark testset for target evaluation on Chinese-English ZP translation. |
MINGZHOU XU et al. | emnlp | 2022-12-30 |
320 | ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer knowledge from the parent model during the training of the child model. |
Zhaocong Li; Xuebo Liu; Derek F. Wong; Lidia S. Chao; Min Zhang; | emnlp | 2022-12-30 |
321 | T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks. |
Paul-Ambroise Duquenne; Hongyu Gong; Benoît Sagot; Holger Schwenk; | emnlp | 2022-12-30 |
322 | Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In order to enable zero-shot ST, we propose a novel Discrete Cross-Modal Alignment (DCMA) method that employs a shared discrete vocabulary space to accommodate and match both modalities of speech and text. |
CHEN WANG et al. | emnlp | 2022-12-30 |
323 | Bilingual Synchronization: Restoring Translational Relationships with Editing Operations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. |
Jitao Xu; Josep Crego; François Yvon; | emnlp | 2022-12-30 |
324 | Optimizing Deep Transformers for Chinese-Thai Low-Resource Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the use of deep Transformer translation model for the CCMT 2022 Chinese-Thai low-resource machine translation task. |
Wenjie Hao; Hongfei Xu; Lingling Mu; Hongying Zan; | arxiv-cs.CL | 2022-12-24 |
325 | Beyond Triplet: Leveraging The Most Data for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: First, they can only utilize triple data (bilingual texts with images), which is scarce; second, current benchmarks are relatively restricted and do not correspond to realistic scenarios. Therefore, this paper correspondingly establishes new methods and new datasets for MMT. |
YAOMING ZHU et al. | arxiv-cs.CL | 2022-12-20 |
326 | Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new MMT approach based on a strong text-only MT model, which uses neural adapters, a novel guided self-attention mechanism and which is jointly trained on both visually-conditioned masking and MMT. |
Matthieu Futeral; Cordelia Schmid; Ivan Laptev; Benoît Sagot; Rachel Bawden; | arxiv-cs.CL | 2022-12-20 |
327 | T-Projection: High Quality Annotation Projection for Sequence Labeling Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present T-Projection, a new approach for annotation projection that leverages large pretrained text2text language models and state-of-the-art machine translation technology. |
Iker García-Ferrero; Rodrigo Agerri; German Rigau; | arxiv-cs.CL | 2022-12-20 |
328 | IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Indian languages, having over a billion speakers, are linguistically different from English, and to date, there has not been a systematic study of evaluating MT systems from English into Indian languages. In this paper, we fill this gap by creating an MQM dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems, and use it to establish correlations between annotator scores and scores obtained using existing automatic metrics. |
ANANYA B. SAI et al. | arxiv-cs.CL | 2022-12-20 |
329 | Mu2SLAM: Multitask, Multilingual Speech and Language Models Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present Mu²SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech … |
Yong Cheng; Yu Zhang; Melvin Johnson; Wolfgang Macherey; Ankur Bapna; | ArXiv | 2022-12-19 |
330 | Mu²SLAM: Multitask, Multilingual Speech and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Mu²SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages. |
Yong Cheng; Yu Zhang; Melvin Johnson; Wolfgang Macherey; Ankur Bapna; | arxiv-cs.CL | 2022-12-19 |
331 | AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AdaTranS for end-to-end ST. It adapts the speech features with a new shrinking mechanism to mitigate the length mismatch between speech and text features by predicting word boundaries. |
Xingshan Zeng; Liangyou Li; Qun Liu; | arxiv-cs.CL | 2022-12-17 |
332 | Controlling Styles in Neural Machine Translation with Activation Prompt Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address both challenges, this paper presents a new benchmark and approach. |
Yifan Wang; Zewei Sun; Shanbo Cheng; Weiguo Zheng; Mingxuan Wang; | arxiv-cs.CL | 2022-12-17 |
333 | Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT. |
Jiahuan Li; Shanbo Cheng; Zewei Sun; Mingxuan Wang; Shujian Huang; | arxiv-cs.CL | 2022-12-17 |
334 | Attention As A Guide for Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although its patterns have been exploited to perform different tasks, from neural network understanding to textual alignment, no previous work has analysed the encoder-decoder attention behavior in speech translation (ST) nor used it to improve ST on a specific task. In this paper, we fill this gap by proposing an attention-based policy (EDAtt) for simultaneous ST (SimulST) that is motivated by an analysis of the existing attention relations between audio input and textual output. |
Sara Papi; Matteo Negri; Marco Turchi; | arxiv-cs.CL | 2022-12-15 |
335 | Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show effective regularization strategies, namely dropout techniques for MoE layers in EOM and FOM, Conditional MoE Routing and Curriculum Learning methods that prevent over-fitting and improve the performance of MoE models on low-resource tasks without adversely affecting high-resource tasks. |
Maha Elbayad; Anna Sun; Shruti Bhosale; | arxiv-cs.CL | 2022-12-14 |
336 | ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we step towards bridging the gap between multilingual NLs and multilingual PLs for large language models (LLMs). |
YEKUN CHAI et al. | arxiv-cs.CL | 2022-12-13 |
337 | Towards A General Purpose Machine Translation System for Sranantongo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study we create a general purpose machine translation system for srn. |
Just Zwennicker; David Stap; | arxiv-cs.CL | 2022-12-13 |
338 | End-to-End Speech Translation of Arabic to English Broadcast News Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic to English speech translation system. |
Fethi Bougares; Salim Jouili; | arxiv-cs.CL | 2022-12-11 |
339 | M3ST: Mix at Three Levels for Speech Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Mix at three levels for Speech Translation (M^3ST) method to increase the diversity of the augmented training corpus. |
XUXIN CHENG et. al. | arxiv-cs.CL | 2022-12-07 |
340 | Life-long Learning for Multilingual Neural Machine Translation with Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, in one-to-many scenario, we propose a multilingual distillation method to make the new model (student) jointly learn multilingual output from old model (teacher) and new task. |
YANG ZHAO et. al. | arxiv-cs.CL | 2022-12-06 |
341 | Impact of Domain-Adapted Multilingual Neural Machine Translation in The Medical Domain Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare the out-of-domain MNMT with the in-domain adapted MNMT. |
Miguel Rios; Raluca-Maria Chereji; Alina Secara; Dragos Ciobanu; | arxiv-cs.CL | 2022-12-05 |
342 | In-context Examples Selection for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to understand the properties of good in-context examples for MT in both in-domain and out-of-domain settings. |
Sweta Agrawal; Chunting Zhou; Mike Lewis; Luke Zettlemoyer; Marjan Ghazvininejad; | arxiv-cs.CL | 2022-12-05 |
343 | Democratizing Neural Machine Translation with OPUS-MT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows. |
JÖRG TIEDEMANN et. al. | arxiv-cs.CL | 2022-12-04 |
344 | The RoyalFlush System for The WMT 2022 Efficiency Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the submission of the RoyalFlush neural machine translation system for the WMT 2022 translation efficiency task. |
BO QIN et. al. | arxiv-cs.CL | 2022-12-03 |
345 | Tackling Low-Resourced Sign Language Translation: UPC at WMT-SLT 22 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes the system developed at the Universitat Politècnica de Catalunya for the Workshop on Machine Translation 2022 Sign Language Translation Task, in particular, for the sign-to-text direction. |
Laia Tarrés; Gerard I. Gàllego; Xavier Giró-i-Nieto; Jordi Torres; | arxiv-cs.CL | 2022-12-02 |
346 | CUNI Systems for The WMT22 Czech-Ukrainian Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Charles University submissions to the WMT22 General Translation Shared Task on Czech-Ukrainian and Ukrainian-Czech machine translation. |
Martin Popel; Jindřich Libovický; Jindřich Helcl; | arxiv-cs.CL | 2022-12-01 |
347 | Sevi: Speech-to-Visualization Through Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Arguably, the most natural way to specify what to visualize is through natural language or speech, similar to our daily search on Google or Apple Siri, leaving to the system the task of reasoning about what to visualize and how. In this demo, we present Sevi, an end-to-end data visualization system that acts as a virtual assistant to allow novices to create visualizations through either natural language or speech. |
Jiawei Tang; Yuyu Luo; Mourad Ouzzani; Guoliang Li; Hongyang Chen; | sigmod | 2022-11-30 |
348 | Word Alignment in The Era of Deep Learning: A Tutorial Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The word alignment task, despite its prominence in the era of statistical machine translation (SMT), is niche and under-explored today. In this two-part tutorial, we argue for the continued relevance of word alignment. |
Bryan Li; | arxiv-cs.CL | 2022-11-30 |
349 | Extending The Subwording Model of Multilingual Pretrained Models for New Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we add new subwords to the SentencePiece tokenizer to apply a multilingual pretrained model to new languages (Inuktitut in this paper). |
Kenji Imamura; Eiichiro Sumita; | arxiv-cs.CL | 2022-11-29 |
350 | CUNI Submission in WMT22 General Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the CUNI-Bergamot submission for the WMT22 General translation task. |
Josef Jon; Martin Popel; Ondřej Bojar; | arxiv-cs.CL | 2022-11-29 |
351 | Findings of The WMT 2022 Shared Task on Translation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We report the result of the first edition of the WMT shared task on Translation Suggestion (TS). |
Zhen Yang; Fandong Meng; Yingxue Zhang; Ernan Li; Jie Zhou; | arxiv-cs.CL | 2022-11-29 |
352 | Domain Mismatch Doesn’t Always Prevent Cross-Lingual Transfer Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that a simple initialization regimen can overcome much of the effect of domain mismatch in cross-lingual transfer. |
Daniel Edmiston; Phillip Keung; Noah A. Smith; | arxiv-cs.CL | 2022-11-29 |
353 | Considerations for Meaningful Sign Language Machine Translation Based on Glosses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we review recent works on neural gloss translation. |
Mathias Müller; Zifan Jiang; Amit Moryossef; Annette Rios; Sarah Ebling; | arxiv-cs.CL | 2022-11-28 |
354 | Summer: WeChat Neural Machine Translation Systems for The WMT22 Biomedical Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces WeChat’s participation in WMT 2022 shared biomedical translation task on Chinese to English. |
Ernan Li; Fandong Meng; Jie Zhou; | arxiv-cs.CL | 2022-11-27 |
355 | BJTU-WeChat’s Systems for The WMT22 Chat Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT’22 chat translation task for English-German. |
Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou; | arxiv-cs.CL | 2022-11-27 |
356 | Competency-Aware Neural Machine Translation: Can Machine Translation Know Its Own Translation Quality? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is in sharp contrast to human translators who give feedback or conduct further investigations whenever they are in doubt about predictions. To fill this gap, we propose a novel competency-aware NMT by extending conventional NMT with a self-estimator, offering abilities to translate a source sentence and estimate its competency. |
PEI ZHANG et. al. | arxiv-cs.CL | 2022-11-24 |
357 | ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic-English Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present our work on collecting ArzEn-ST, a code-switched Egyptian Arabic-English Speech Translation Corpus. This corpus is an extension of the ArzEn speech corpus, which was … |
Injy Hamed; Nizar Habash; S. Abdennadher; Ngoc Thang Vu; | Workshop on Arabic Natural Language Processing | 2022-11-22 |
358 | Average Token Delay: A Latency Metric for Simultaneous Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel latency evaluation metric called Average Token Delay (ATD) that focuses on the end timings of partial translations in simultaneous translation. |
Yasumasa Kano; Katsuhito Sudoh; Satoshi Nakamura; | arxiv-cs.CL | 2022-11-22 |
359 | ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic – English Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we collect translations in both directions, monolingual Egyptian Arabic and monolingual English, forming a three-way speech translation corpus. |
Injy Hamed; Nizar Habash; Slim Abdennadher; Ngoc Thang Vu; | arxiv-cs.CL | 2022-11-21 |
360 | Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we propose a simple back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation, utilizing a large amount of unlabeled text data. |
CHUNYU QIANG et. al. | arxiv-cs.SD | 2022-11-17 |
361 | TSMind: Alibaba and Soochow University’s Submission to The WMT22 Translation Suggestion Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the joint submission of Alibaba and Soochow University, TSMind, to the WMT 2022 Shared Task on Translation Suggestion (TS). |
XIN GE et. al. | arxiv-cs.CL | 2022-11-16 |
362 | MT Metrics Correlate with Human Ratings of Simultaneous Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we leverage the evaluations of candidate systems submitted to the English-German SST task at IWSLT 2022 and conduct an extensive correlation analysis of CR and the aforementioned metrics. |
Dominik Macháček; Ondřej Bojar; Raj Dabre; | arxiv-cs.CL | 2022-11-15 |
363 | Findings of The Covid-19 MLIA Machine Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents the results of the machine translation (MT) task from the Covid-19 MLIA @ Eval initiative, a community effort to improve the generation of MT systems focused on the current Covid-19 crisis. |
FRANCISCO CASACUBERTA et. al. | arxiv-cs.CL | 2022-11-14 |
364 | Easy Guided Decoding in Providing Suggestions for Interactive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we utilize the parameterized objective function of neural machine translation (NMT) and propose a novel constrained decoding algorithm, namely Prefix Suffix Guided Decoding (PSGD), to deal with the TS problem without additional training. |
Ke Wang; Xin Ge; Jiayi Wang; Yu Zhao; Yuqi Zhang; | arxiv-cs.CL | 2022-11-13 |
365 | Grammatical Error Correction: A Survey of The State of The Art IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this survey paper, we condense the field into a single article and first outline some of the linguistic challenges of the task, introduce the most popular datasets that are available to researchers (for both English and other languages), and summarise the various methods and techniques that have been developed with a particular focus on artificial error generation. |
CHRISTOPHER BRYANT et. al. | arxiv-cs.CL | 2022-11-09 |
366 | ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ERNIE-UniX2, a unified cross-lingual cross-modal pre-training framework for both generation and understanding tasks. |
BIN SHAN et. al. | arxiv-cs.CV | 2022-11-09 |
367 | Review of Coreference Resolution in English and Persian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, it has a significant effect on the quality of these systems. This article reviews the existing corpora and evaluation metrics in this field. |
Hassan Haji Mohammadi; Alireza Talebpour; Ahmad Mahmoudi Aznaveh; Samaneh Yazdani; | arxiv-cs.CL | 2022-11-08 |
368 | Refining Low-Resource Unsupervised Translation By Language Disentanglement of Multilingual Translation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a simple refinement procedure to separate languages from a pre-trained multilingual UMT model for it to focus on only the target low-resource task. |
Xuan-Phi Nguyen; Shafiq Joty; Kui Wu; Ai Ti Aw; | nips | 2022-11-06 |
369 | InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose InsNet, an expressive insertion-based text generator with efficient training and flexible decoding (parallel or sequential). |
Sidi Lu; Tao Meng; Nanyun Peng; | nips | 2022-11-06 |
370 | MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely-spoken languages. |
ANNA CURREY et. al. | arxiv-cs.CL | 2022-11-02 |
371 | Domain Curricula for Code-Switched MT at MixMT 2022 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our approach and results for the Code-mixed Machine Translation (MixMT) shared task at WMT 2022: the task consists of two subtasks, monolingual to code-mixed machine translation (Subtask-1) and code-mixed to monolingual machine translation (Subtask-2). |
Lekan Raheem; Maab Elrashid; | arxiv-cs.CL | 2022-10-31 |
372 | Domain Adaptation of Machine Translation with Crowdworkers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework that efficiently and effectively collects parallel sentences in a target domain from the web with the help of crowdworkers. |
Makoto Morishita; Jun Suzuki; Masaaki Nagata; | arxiv-cs.CL | 2022-10-27 |
373 | ACES: Translation Accuracy Challenge Sets for Evaluating Machine Translation Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use ACES to evaluate a wide range of MT metrics including the submissions to the WMT 2022 metrics shared task and perform several analyses leading to general recommendations for metric developers. |
Chantal Amrhein; Nikita Moghe; Liane Guillou; | arxiv-cs.CL | 2022-10-27 |
374 | COMET-QE and Active Learning for Low-Resource Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use COMET-QE, a reference-free evaluation metric, to select sentences for low-resource neural machine translation. |
Everlyn Asiko Chimoto; Bruce A. Bassett; | arxiv-cs.CL | 2022-10-27 |
375 | The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents the first relatively large-scale Amharic-English parallel sentence dataset. |
TADESSE DESTAW BELAY et. al. | arxiv-cs.CL | 2022-10-27 |
376 | Improving Speech-to-Speech Translation Through Unlabeled Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an effective way to utilize the massive existing unlabeled text from different languages to create a large amount of S2ST data to improve S2ST performance by applying various acoustic effects to the generated synthetic data. |
XUAN-PHI NGUYEN et. al. | arxiv-cs.CL | 2022-10-26 |
377 | A Bilingual Parallel Corpus with Discourse Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes BWB, a large parallel corpus first introduced in Jiang et al. (2022), along with an annotated test set. |
YUCHEN ELEANOR JIANG et. al. | arxiv-cs.CL | 2022-10-26 |
378 | Smart Speech Segmentation Using Acousto-Linguistic Features with Look-ahead Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a hybrid approach that leverages both acoustic and language information to improve segmentation. |
PIYUSH BEHRE et. al. | arxiv-cs.CL | 2022-10-25 |
379 | Bilingual Synchronization: Restoring Translational Relationships with Editing Operations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We consider here a more general setting which assumes an initial target sequence, that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. |
Jitao Xu; Josep Crego; François Yvon; | arxiv-cs.CL | 2022-10-24 |
380 | Analyzing The Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we examine the use of influence functions for Neural Machine Translation (NMT). |
Tsz Kin Lam; Eva Hasler; Felix Hieber; | arxiv-cs.CL | 2022-10-24 |
381 | Focused Concatenation for Context-Aware Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose an improved concatenation approach that encourages the model to focus on the translation of the current sentence, discounting the loss generated by target context. |
Lorenzo Lupo; Marco Dinarelli; Laurent Besacier; | arxiv-cs.CL | 2022-10-24 |
382 | Translation Word-Level Auto-Completion: What Can We Achieve Out of The Box? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work describes our submissions to WMT’s shared task on word-level auto-completion, for the Chinese-to-English, English-to-Chinese, German-to-English, and English-to-German language directions. |
Yasmin Moslem; Rejwanul Haque; Andy Way; | arxiv-cs.CL | 2022-10-23 |
383 | University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper describes the University of Cape Town’s submission to the constrained track of the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages. |
Khalid N. Elmadani; Francois Meyer; Jan Buys; | arxiv-cs.CL | 2022-10-21 |
384 | Turning Fixed to Adaptive: Integrating Post-Evaluation Into Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a method of performing the adaptive policy via integrating post-evaluation into the fixed policy. |
Shoutao Guo; Shaolei Zhang; Yang Feng; | arxiv-cs.CL | 2022-10-21 |
385 | A Semi-supervised Approach for A Better Translation of Sentiment in Dialectical Arabic UGT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we aim to improve the translation of sentiment in UGT written in the dialectical versions of the Arabic language to English. |
Hadeel Saadany; Constantin Orasan; Emad Mohamed; Ashraf Tantawy; | arxiv-cs.CL | 2022-10-21 |
386 | Gui at MixMT 2022 : English-Hinglish: An MT Approach for Translation of Code Mixed Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we discuss the use of mBART with some special pre-processing and post-processing (transliteration from Devanagari to Roman) for the first task in detail and the experiments that we performed for the second task of translating code-mixed Hinglish to monolingual English. |
AKSHAT GAHOI et. al. | arxiv-cs.CL | 2022-10-21 |
387 | Is Encoder-Decoder Redundant for Neural Machine Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the aforementioned concept for machine translation. |
Yingbo Gao; Christian Herold; Zijian Yang; Hermann Ney; | arxiv-cs.CL | 2022-10-21 |
388 | Wait-info Policy: Balancing Source and Target at Information Level for Simultaneous Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Wait-info Policy to balance source and target at the information level. |
Shaolei Zhang; Shoutao Guo; Yang Feng; | arxiv-cs.CL | 2022-10-20 |
389 | Can Domains Be Transferred Across Languages in Multi-Domain Multilingual Neural Machine Translation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous works mostly focus on either multilingual or multi-domain aspects of neural machine translation (NMT). |
Thuy-Trang Vu; Shahram Khadivi; Xuanli He; Dinh Phung; Gholamreza Haffari; | arxiv-cs.CL | 2022-10-20 |
390 | The University of Edinburgh’s Submission to The WMT22 Code-Mixing Shared Task (MixMT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For subtask 2, we investigated different pretraining techniques, namely comparing simple initialisation from existing machine translation models and aligned augmentation. |
Faheem Kirefu; Vivek Iyer; Pinzhen Chen; Laurie Burchell; | arxiv-cs.CL | 2022-10-20 |
391 | SIT at MixMT 2022: Fluent Translation Built on Giant Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes the Stevens Institute of Technology’s submission for the WMT 2022 Shared Task: Code-mixed Machine Translation (MixMT). |
Abdul Rafae Khan; Hrishikesh Kanade; Girish Amar Budhrani; Preet Jhanglani; Jia Xu; | arxiv-cs.CL | 2022-10-20 |
392 | The VolcTrans System for WMT22 Multilingual Machine Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report describes our VolcTrans system for the WMT22 shared task on large-scale multilingual machine translation. |
XIAN QIAN et. al. | arxiv-cs.CL | 2022-10-20 |
393 | Hybrid-Regressive Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we empirically confirm that non-autoregressive translation with an iterative refinement mechanism (IR-NAT) suffers from poor acceleration robustness because it is more sensitive to decoding batch size and computing device setting than autoregressive translation (AT). |
Qiang Wang; Xinhui Hu; Ming Chen; | arxiv-cs.CL | 2022-10-19 |
394 | LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advances still struggle to train a separate model for each language pair, which is costly and unaffordable when the number of languages increases in the real world. |
HONGCHENG GUO et. al. | arxiv-cs.CL | 2022-10-19 |
395 | Separating Grains from The Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work describes our approach, which is based on filtering the given noisy data using a sentence-pair classifier that was built by fine-tuning a pre-trained language model. |
IDRIS ABDULMUMIN et. al. | arxiv-cs.CL | 2022-10-19 |
396 | Domain Specific Sub-network for Multi-Domain Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Domain-Specific Sub-network (DoSS). |
Amr Hendy; Mohamed Abdelghaffar; Mohamed Afify; Ahmed Y. Tawfik; | arxiv-cs.CL | 2022-10-18 |
397 | Tencent’s Multilingual Machine Translation System for WMT22 Large-Scale African Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes Tencent’s multilingual machine translation systems for the WMT22 shared task on Large-Scale Machine Translation Evaluation for African Languages. |
WENXIANG JIAO et. al. | arxiv-cs.CL | 2022-10-18 |
398 | Alibaba-Translate China’s Submission for WMT 2022 Quality Estimation Shared Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present our submission to the sentence-level MQM benchmark at Quality Estimation Shared Task, named UniTE (Unified Translation Evaluation). |
KEQIN BAO et. al. | arxiv-cs.CL | 2022-10-18 |
399 | Tencent AI Lab – Shanghai Jiao Tong University Low-Resource Translation System for The WMT22 Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes Tencent AI Lab – Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task. |
Zhiwei He; Xing Wang; Zhaopeng Tu; Shuming Shi; Rui Wang; | arxiv-cs.CL | 2022-10-17 |
400 | Modeling Context With Linear Attention for Scalable Document-Level Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the efficacy of a recent linear attention model by Peng et al. (2021) on document translation and augment it with a sentential gate to promote a recency inductive bias. |
Zhaofeng Wu; Hao Peng; Nikolaos Pappas; Noah A. Smith; | arxiv-cs.CL | 2022-10-15 |
401 | Categorizing Semantic Representations for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they have recently been shown to suffer limitation in compositional generalization, failing to effectively learn the translation of atoms (e.g., words) and their semantic composition (e.g., modification) from seen compounds (e.g., phrases), and thus suffering from significantly weakened translation performance on unseen compounds during inference. We address this issue by introducing categorization to the source contextualized representations. |
Yongjing Yin; Yafu Li; Fandong Meng; Jie Zhou; Yue Zhang; | arxiv-cs.CL | 2022-10-13 |
402 | DICTDIS: Dictionary Constrained Disambiguation for Improved NMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present DICTDIS, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries. |
Ayush Maheshwari; Piyush Sharma; Preethi Jyothi; Ganesh Ramakrishnan; | arxiv-cs.CL | 2022-10-13 |
403 | Improved Data Augmentation for Translation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the system used in our submission to the WMT’22 Translation Suggestion shared task. |
HONGXIAO ZHANG et. al. | arxiv-cs.CL | 2022-10-12 |
404 | Integrating Translation Memories Into Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By modifying the data presentation and introducing an extra deletion operation, we obtain performance that is on par with an autoregressive approach, while reducing the decoding load. |
Jitao Xu; Josep Crego; François Yvon; | arxiv-cs.CL | 2022-10-12 |
405 | Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain Via Transfer Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Massively multilingual pre-trained language models (MMPLMs) are developed in recent years demonstrating superpowers and the pre-knowledge they acquire for downstream tasks. |
Lifeng Han; Gleb Erofeev; Irina Sorokina; Serge Gladkoff; Goran Nenadic; | arxiv-cs.CL | 2022-10-12 |
406 | Streaming Punctuation for Long-form Dictation with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Automatic Speech Recognition (ASR) production systems, however, are constrained by real-time requirements, making it hard to incorporate the right context when making punctuation decisions. In this paper, we propose a streaming approach for punctuation or re-punctuation of ASR output using dynamic decoding windows and measure its impact on punctuation and segmentation accuracy across scenarios. |
Piyush Behre; Sharman Tan; Padma Varadharajan; Shuangyu Chang; | arxiv-cs.CL | 2022-10-11 |
407 | CTC Alignments Improve Autoregressive Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we argue that CTC does in fact make sense for translation if applied in a joint CTC/attention framework wherein CTC’s core properties can counteract several key weaknesses of pure-attention models during training and decoding. |
BRIAN YAN et. al. | arxiv-cs.CL | 2022-10-11 |
408 | Machine Translation Between Spoken Languages and Signed Languages Represented in SignWriting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents work on novel machine translation (MT) systems between spoken and signed languages, where signed languages are represented in SignWriting, a sign language writing system. |
Zifan Jiang; Amit Moryossef; Mathias Müller; Sarah Ebling; | arxiv-cs.CL | 2022-10-11 |
409 | Viterbi Decoding of Directed Acyclic Transformer for Non-Autoregressive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a Viterbi decoding framework for DA-Transformer, which guarantees to find the joint optimal solution for the translation and decoding path under any length constraint. |
Chenze Shao; Zhengrui Ma; Yang Feng; | arxiv-cs.CL | 2022-10-11 |
410 | Exploring Segmentation Approaches for Neural Machine Translation of Code-Switched Egyptian Arabic-English Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the effectiveness of different segmentation approaches on MT performance, covering morphology-based and frequency-based segmentation techniques. |
MARWA GASER et. al. | arxiv-cs.CL | 2022-10-11 |
411 | Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, in this work, we introduce IKD-MMT, a novel MMT framework to support the image-free inference phase via an inversion knowledge distillation scheme. |
Ru Peng; Yawen Zeng; Junbo Zhao; | arxiv-cs.CL | 2022-10-10 |
412 | Improving Robustness of Retrieval Augmented Translation Via Shuffling of Suggestions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a simple method to expose fuzzy-match NMT systems during training and show that it results in a system that is much more tolerant (regaining up to 5.8 BLEU) to inference with TMs with domain mismatch. |
Cuong Hoang; Devendra Sachan; Prashant Mathur; Brian Thompson; Marcello Federico; | arxiv-cs.CL | 2022-10-10 |
413 | Checks and Strategies for Enabling Code-Switched Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work explores multilingual NMT models’ ability to handle code-switched text. |
Thamme Gowda; Mozhdeh Gheini; Jonathan May; | arxiv-cs.CL | 2022-10-10 |
414 | Ngram-OAXE: Phrase-Based Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Starting from the intuition that reordering generally occurs between phrases, we extend OAXE by only allowing reordering between ngram phrases and still requiring a strict match of word order within the phrases. |
Cunxiao Du; Zhaopeng Tu; Longyue Wang; Jing Jiang; | arxiv-cs.CL | 2022-10-08 |
415 | Improving End-to-End Text Image Translation From The Auxiliary Text Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel text translation enhanced text image translation, which trains the end-to-end model with text translation as an auxiliary task. |
CONG MA et. al. | arxiv-cs.CL | 2022-10-07 |
416 | Toxicity in Multilingual Machine Translation at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we focus on one type of critical error: added toxicity. |
MARTA R. COSTA-JUSSÀ et. al. | arxiv-cs.CL | 2022-10-06 |
417 | The Boundaries of Meaning: A Case Study in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: But do they have any linguistic or philosophical plausibility? I attempt to cast light on this question by reviewing the relevant details of the subword segmentation algorithms and by relating them to important philosophical and linguistic debates, in the spirit of making artificial intelligence more transparent and explainable. |
Yuri Balashov; | arxiv-cs.CL | 2022-10-02 |
418 | MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We empirically demonstrate the effectiveness of self-supervised pre-training and data augmentation for zero-shot multi-lingual machine translation. |
Kshitij Gupta; | arxiv-cs.CL | 2022-10-01 |
419 | Multimodality Information Fusion for Automated Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View |
Lin Li; Turghun Tayir; Yifeng Han; Xiaohui Tao; Juan D. Velasquez; | Inf. Fusion | 2022-10-01 |
420 | FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present FRMT, a new dataset and evaluation benchmark for Few-shot Region-aware Machine Translation, a type of style-targeted translation. |
PARKER RILEY et. al. | arxiv-cs.CL | 2022-10-01 |
421 | Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer. |
Alexandra Chronopoulou; Dario Stojanovski; Alexander Fraser; | arxiv-cs.CL | 2022-09-30 |
422 | QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its high utility in the real world, there remain several limitations concerning manual QE data creation: inevitably incurred non-trivial costs due to the need for translation experts, and issues with data scaling and language expansion. To tackle these limitations, we present QUAK, a Korean-English synthetic QE dataset generated in a fully automatic manner. |
SUGYEONG EO et. al. | arxiv-cs.CL | 2022-09-30 |
423 | Blur The Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English Via Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a novel approach to building a practical NMT model for Buddhist scriptures. |
DENGHAO LI et. al. | arxiv-cs.CL | 2022-09-29 |
424 | Revamping Multilingual Agreement Bidirectionally Via Switched Back-translation for Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Bidirectional Multilingual Agreement via Switched Back-translation (BMA-SBT), a novel and universal multilingual agreement framework for fine-tuning pre-trained MNMT models, which (i) exempts the need for aforementioned parallel data by using a novel method called switched BT that creates synthetic text written in another source language using the translation target and (ii) optimizes the agreement bidirectionally with the Kullback-Leibler Divergence loss. |
HONGYUAN LU et. al. | arxiv-cs.CL | 2022-09-28 |
425 | Effective General-Domain Data Inclusion for The Machine Translation Task By Vanilla Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Not only is it revolutionary for various translation tasks, but also for a majority of other NLP tasks. In this paper, we aim at a Transformer-based system that is able to translate a source sentence in German to its counterpart target sentence in English. |
Hassan Soliman; | arxiv-cs.CL | 2022-09-28 |
426 | An Automatic Evaluation of The WMT22 General Machine Translation Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report presents an automatic evaluation of the general machine translation task of the Seventh Conference on Machine Translation (WMT22). |
Benjamin Marie; | arxiv-cs.CL | 2022-09-28 |
427 | Improving Multilingual Neural Machine Translation System for Indic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an MNMT system to address the issues related to low-resource language translation. |
Sudhansu Bala Das; Atharv Biradar; Tapas Kumar Mishra; Bidyut Kumar Patra; | arxiv-cs.CL | 2022-09-27 |
428 | Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a non-tuning paradigm, resolving domain adaptation with a prompt-based method. |
ZEWEI SUN et. al. | arxiv-cs.CL | 2022-09-23 |
429 | Approaching English-Polish Machine Translation Quality Assessment with Neural-based Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our contribution to the PolEval 2021 Task 2: Evaluation of translation quality assessment metrics. |
Artur Nowakowski; | arxiv-cs.CL | 2022-09-22 |
430 | PePe: Personalized Post-editing Model Utilizing User-generated Post-edits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the recent advancement of machine translation, it remains a demanding task to properly reflect personal style. In this paper, we introduce a personalized automatic post-editing framework to address this challenge, which effectively generates sentences considering distinct personal behaviors. |
Jihyeon Lee; Taehee Kim; Yunwon Tae; Cheonbok Park; Jaegul Choo; | arxiv-cs.CL | 2022-09-21 |
431 | Vega-MT: The JD Explore Academy Machine Translation System for WMT22 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We describe the JD Explore Academy’s submission of the WMT 2022 shared general translation task. We participated in all high-resource tracks and one medium-resource track, … |
CHANGTONG ZAN et. al. | ArXiv | 2022-09-20 |
432 | Vega-MT: The JD Explore Academy Translation System for WMT22 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the JD Explore Academy’s submission of the WMT 2022 shared general translation task. |
CHANGTONG ZAN et. al. | arxiv-cs.CL | 2022-09-19 |
433 | A Snapshot Into The Possibility of Video Game Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present in this article what we believe to be one of the first attempts at video game machine translation. |
Damien Hansen; Pierre-Yves Houlmont; | arxiv-cs.CL | 2022-09-19 |
434 | The First Neural Machine Translation System for The Erzya Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first neural machine translation system for translation between the endangered Erzya language and Russian and the dataset collected by us to train and evaluate it. |
David Dale; | arxiv-cs.CL | 2022-09-19 |
435 | Normalization of Code-switched Text for Speech Synthesis Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In multilingual communities, code-switching is a common phenomenon. Due to the increase in usage of social media, high level of code-switching is present in social media text as … |
Sreeram Manghat; Sreeja Manghat; Tanja Schultz; | Interspeech | 2022-09-18 |
436 | Learning Decoupled Retrieval Representation for Nearest Neighbour Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generally, kNN-MT borrows the off-the-shelf context representation in the translation task, e.g., the output of the last decoder layer, as the query vector of the retrieval task. In this work, we highlight that coupling the representations of these two tasks is sub-optimal for fine-grained retrieval. |
Qiang Wang; Rongxiang Weng; Ming Chen; | arxiv-cs.CL | 2022-09-18 |
437 | Changing The Representation: Examining Language Representation for Neural Sign Language Production Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Neural Sign Language Production (SLP) aims to automatically translate from spoken language sentences to sign language videos. Historically the SLP task has been broken into two … |
Harry Walsh; Ben Saunders; R. Bowden; | ArXiv | 2022-09-16 |
438 | Rethinking Round-Trip Translation for Machine Translation Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we report the surprising finding that round-trip translation can be used for automatic evaluation without the references. |
Terry Yue Zhuo; Qiongkai Xu; Xuanli He; Trevor Cohn; | arxiv-cs.CL | 2022-09-15 |
439 | Data-adaptive Transfer Learning for Translation: A Case Study in Haitian and Jamaican Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multilingual transfer techniques often improve low-resource machine translation (MT). |
Nathaniel R. Robinson; Cameron J. Hogan; Nancy Fulda; David R. Mortensen; | arxiv-cs.CL | 2022-09-13 |
440 | Rethink About The Word-level Quality Estimation for Machine Translation from Human Judgement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Typically, conventional works on word-level QE are designed to predict the translation quality in terms of the post-editing effort, where the word labels (OK and BAD) are automatically generated by comparing words between MT sentences and the post-edited sentences through a Translation Error Rate (TER) toolkit. |
Zhen Yang; Fandong Meng; Yuanmeng Yan; Jie Zhou; | arxiv-cs.CL | 2022-09-12 |
441 | Adapting to Non-Centered Languages for Zero-shot Multilingual Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a simple, lightweight yet effective language-specific modeling method by adapting to non-centered languages and combining the shared information and the language-specific information to counteract the instability of zero-shot translation. |
Zhi Qu; Taro Watanabe; | arxiv-cs.CL | 2022-09-09 |
442 | On The Complementarity Between Pre-Training and Random-Initialization for Resource-Rich Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We take the first step to investigate the complementarity between PT and RI in resource-rich scenarios via two probing analyses, and find that: 1) PT improves NOT the accuracy, but the generalization by achieving flatter loss landscapes than that of RI; 2) PT improves NOT the confidence of lexical choice, but the negative diversity by assigning smoother lexical probability distributions than that of RI. Based on these insights, we propose to combine their complementarities with a model fusion algorithm that utilizes optimal transport to align neurons between PT and RI. |
CHANGTONG ZAN et. al. | arxiv-cs.CL | 2022-09-07 |
443 | Adam Mickiewicz University at WMT 2022: NER-Assisted and Quality-Aware Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Adam Mickiewicz University’s (AMU) submissions to the constrained track of the WMT 2022 General MT Task. |
Artur Nowakowski; Gabriela Pałka; Kamil Guttmann; Mikołaj Pokrywka; | arxiv-cs.CL | 2022-09-07 |
444 | Facilitating Global Team Meetings Between Language-Based Subgroups: When and How Can Machine Translation Help? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the current study, we investigate the idea of leveraging machine translation (MT) to facilitate global team meetings. |
Yongle Zhang; Dennis Asamoah Owusu; Marine Carpuat; Ge Gao; | arxiv-cs.CL | 2022-09-06 |
445 | Rare But Severe Neural Machine Translation Errors Induced By Minimal Deletion: An Empirical Study on Chinese and English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We examine the inducement of rare but severe errors in English-Chinese and Chinese-English in-domain neural machine translation by minimal deletion of the source text with character-based models. |
Ruikang Shi; Alvin Grissom II; Duc Minh Trinh; | arxiv-cs.CL | 2022-09-05 |
446 | Informative Language Representation Learning for Massively Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, recent studies show that prepending language tokens sometimes fails to navigate the multilingual neural machine translation models into right translation directions, especially on zero-shot translation. To mitigate this issue, we propose two methods, language embedding embodiment and language-aware multi-head attention, to learn informative language representations to channel translation into right directions. |
Renren Jin; Deyi Xiong; | arxiv-cs.CL | 2022-09-04 |
447 | Nearest Neighbor Non-autoregressive Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous studies addressed this issue through iterative decoding. This study proposes using nearest neighbors as the initial state of an NAR decoder and editing them iteratively. |
Ayana Niwa; Sho Takase; Naoaki Okazaki; | arxiv-cs.CL | 2022-08-26 |
448 | Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Kencorpus is a Kenyan Language corpus that intends to bridge the gap on how to collect and store text and speech data that is good enough to enable data-driven solutions in applications such as machine translation, question answering and transcription in multilingual communities. |
BARACK WANJAWA et. al. | arxiv-cs.CL | 2022-08-25 |
449 | MuMUR : Multilingual Multimodal Universal Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a framework MuMUR, that utilizes knowledge transfer from a multilingual model to boost the performance of multi-modal (image and video) retrieval. |
AVINASH MADASU et. al. | arxiv-cs.CV | 2022-08-24 |
450 | Improving Video Retrieval Using Multilingual Knowledge Transfer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Video retrieval has seen tremendous progress with the development of vision-language models. However, further improving these models require additional labelled data which is a … |
AVINASH MADASU et. al. | European Conference on Information Retrieval | 2022-08-24 |
451 | Domain-Specific Text Generation for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach to domain adaptation leveraging state-of-the-art pretrained language models (LMs) for domain-specific data augmentation for MT, simulating the domain characteristics of either (a) a small bilingual dataset, or (b) the monolingual source text to be translated. |
Yasmin Moslem; Rejwanul Haque; John D. Kelleher; Andy Way; | arxiv-cs.CL | 2022-08-11 |
452 | Looking for A Needle in A Haystack: A Comprehensive Study of Hallucinations in Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we set foundations for the study of NMT hallucinations. |
Nuno M. Guerreiro; Elena Voita; André F. T. Martins; | arxiv-cs.CL | 2022-08-10 |
453 | Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate the proposed method on four low-resource language pairs of WMT21 QE shared task, as well as a new English-Farsi test dataset introduced in this paper. |
Fatemeh Azadi; Heshaam Faili; Mohammad Javad Dousti; | arxiv-cs.CL | 2022-07-31 |
454 | Benchmarking Azerbaijani Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we benchmark the performance of Azerbaijani-English NMT systems on a range of techniques and datasets. |
Chih-Chen Chen; William Chen; | arxiv-cs.CL | 2022-07-29 |
455 | GTrans: Grouping and Fusing Transformer Layers for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, vanilla Transformer mainly exploits the top-layer representation, assuming the lower layers provide trivial or redundant information and thus ignoring the bottom-layer feature that is potentially valuable. In this work, we propose the Group-Transformer model (GTrans) that flexibly divides multi-layer representations of both encoder and decoder into different groups and then fuses these group features to generate target words. |
JIAN YANG et. al. | arxiv-cs.CL | 2022-07-29 |
456 | Thutmose Tagger: Single-pass Neural Model for Inverse Text Normalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, such neural models are prone to hallucinations that could lead to unacceptable errors. To mitigate this issue, we propose a single-pass token classifier model that regards ITN as a tagging task. |
Alexandra Antonova; Evelina Bakhturina; Boris Ginsburg; | arxiv-cs.CL | 2022-07-29 |
457 | Multimodal Neural Machine Translation with Search Engine Based Image Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an open-vocabulary image retrieval method to collect descriptive images for a bilingual parallel corpus using an image search engine. |
ZhenHao Tang; XiaoBing Zhang; Zi Long; XiangHua Fu; | arxiv-cs.CV | 2022-07-26 |
458 | Lagrangian Method for Q-Function Learning (with Applications to Machine Translation) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses a new approach to the fundamental problem of learning optimal Q-functions. |
Huang Bojun; | arxiv-cs.LG | 2022-07-22 |
459 | Unifying Cross-lingual Summarization and Machine Translation with Compression Rate Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel task, Cross-lingual Summarization with Compression rate (CSC), to benefit Cross-Lingual Summarization by large-scale Machine Translation corpus. |
YU BAI et. al. | sigir | 2022-07-12 |
460 | No Language Left Behind: Scaling Human-Centered Machine Translation IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. |
NLLB TEAM et. al. | arxiv-cs.CL | 2022-07-11 |
461 | UM4: Unified Multilingual Multiple Teacher-Student Model for Zero-Resource Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel method, named as Unified Multilingual Multiple teacher-student Model for NMT (UM4). |
JIAN YANG et. al. | arxiv-cs.CL | 2022-07-11 |
462 | Tricks for Training Sparse Translation Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that sparse architectures for multilingual machine translation can perform poorly out of the box and propose two straightforward techniques to mitigate this – a temperature heating mechanism and dense pre-training. |
DHEERU DUA et. al. | naacl | 2022-07-09 |
463 | Non-Autoregressive Machine Translation: It’s Not As Fast As It Seems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we point out flaws in the evaluation methodology present in the literature on NAR models and we provide a fair comparison between a state-of-the-art NAR model and the autoregressive submissions to the shared task. |
Jindrich Helcl; Barry Haddow; Alexandra Birch; | naacl | 2022-07-09 |
464 | Language Model Augmented Monotonic Attention for Simultaneous Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a framework to aid monotonic attention with an external language model to improve its decisions. |
Sathish Reddy Indurthi; Mohd Abbas Zaidi; Beomseok Lee; Nikhil Kumar Lakumarapu; Sangha Kim; | naacl | 2022-07-09 |
465 | Does Summary Evaluation Survive Translation to Other Languages? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To investigate how much we can trust machine translation of summarization datasets, we translate the English SummEval dataset to seven languages and compare performances across automatic evaluation measures. |
Spencer Braun; Oleg Vasilyev; Neslihan Iskender; John Bohannon; | naacl | 2022-07-09 |
466 | Semantically Informed Slang Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a semantically informed slang interpretation (SSI) framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang. |
Zhewei Sun; Richard Zemel; Yang Xu; | naacl | 2022-07-09 |
467 | Original or Translated? A Causal Analysis of The Impact of Translationese on Machine Translation Performance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we collect CausalMT, a dataset where the MT training data are also labeled with the human translation directions. |
Jingwei Ni; Zhijing Jin; Markus Freitag; Mrinmaya Sachan; Bernhard Schölkopf; | naacl | 2022-07-09 |
468 | Quantifying Synthesis and Fusion and Their Impact on Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, literature in Natural Language Processing (NLP) typically labels a whole language with a strict type of morphology, e.g. fusional or agglutinative. In this work, we propose to reduce the rigidity of such claims, by quantifying morphological typology at the word and segment level. |
ARTURO ONCEVAY et. al. | naacl | 2022-07-09 |
469 | On Systematic Style Differences Between Unsupervised and Supervised MT and An Application for High-Resource Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We compare translations from supervised and unsupervised MT systems of similar quality, finding that unsupervised output is more fluent and more structurally different in comparison to human translation than is supervised MT. We then demonstrate a way to combine the benefits of both methods into a single system which results in improved adequacy and fluency as rated by human evaluators. |
Kelly Marchisio; Markus Freitag; David Grangier; | naacl | 2022-07-09 |
470 | Quality-Aware Decoding for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search. In this paper, we bring together these two lines of research and propose quality-aware decoding for NMT, by leveraging recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods like N-best reranking and minimum Bayes risk decoding. |
PATRICK FERNANDES et. al. | naacl | 2022-07-09 |
471 | Building Multilingual Machine Translation Systems That Serve Arbitrary XY Translations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model suffers from poor performance in one-to-many and many-to-many with zero-shot setup. To address this issue, this paper discusses how to practically build MNMT systems that serve arbitrary X-Y translation directions while leveraging multilinguality with a two-stage training strategy of pretraining and finetuning. |
Akiko Eriguchi; Shufang Xie; Tao Qin; Hany Hassan; | naacl | 2022-07-09 |
472 | AdMix: A Mixed Sample Data Augmentation Method for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel data augmentation approach for NMT, which is independent of any additional training data. |
Chang Jin; Shigui Qiu; Nini Xiao; Hao Jia; | ijcai | 2022-07-01 |
473 | Explicit Alignment Learning for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose two approaches: an explicit alignment learning approach, in which we further remove the need for the additional alignment model, and perform embedding mixup with the alignment based on encoder–decoder attention weights in the NMT model. |
Zuchao Li; Hai Zhao; Fengshun Xiao; Masao Utiyama; Eiichiro Sumita; | ijcai | 2022-07-01 |
474 | Reduce Indonesian Vocabularies with An Indonesian Sub-word Separator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a strategy to address the unique word problem of the neural machine translation (NMT) system, which uses Indonesian as a pair language. |
Mukhlis Amien; Feng Chong; Huang Heyan; | arxiv-cs.CL | 2022-07-01 |
475 | Towards Discourse-Aware Document-Level Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim at incorporating the coherence information hidden within the RST-style discourse structure into machine translation. |
Xin Tan; Longyin Zhang; Fang Kong; Guodong Zhou; | ijcai | 2022-07-01 |
476 | Code Translation with Compiler Representations IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we leverage low-level compiler intermediate representations (IR) to improve code translation. |
MARC SZAFRANIEC et. al. | arxiv-cs.PL | 2022-06-30 |
477 | GERNERMED++: Transfer Learning in German Medical NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a statistical model for German medical natural language processing trained for named entity recognition (NER) as an open, publicly available model. |
Johann Frei; Ludwig Frei-Stuber; Frank Kramer; | arxiv-cs.CL | 2022-06-29 |
478 | Building Multilingual Machine Translation Systems That Serve Arbitrary X-Y Translations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model suffers from poor performance in one-to-many and many-to-many with zero-shot setup. To address this issue, this paper discusses how to practically build MNMT systems that serve arbitrary X-Y translation directions while leveraging multilinguality with a two-stage training strategy of pretraining and finetuning. |
Akiko Eriguchi; Shufang Xie; Tao Qin; Hany Hassan Awadalla; | arxiv-cs.CL | 2022-06-29 |
479 | On The Impact of Noises in Crowd-Sourced Data for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: What are the impacts of these data quality issues for model development and evaluation? In this paper, we propose an automatic method to fix or filter the above quality issues, using English-German (En-De) translation as an example. |
Siqi Ouyang; Rong Ye; Lei Li; | arxiv-cs.CL | 2022-06-28 |
480 | Human Evaluation of English-Irish Transformer-Based NMT Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced … |
Séamus Lankford; Haithem Afli; Andy Way; | Inf. | 2022-06-25 |
481 | Comparing Formulaic Language in Human and Machine Translation: Insight from A Parliamentary Corpus Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A recent study has shown that, compared to human translations, neural machine translations contain more strongly-associated formulaic sequences made of relatively high-frequency words, but far fewer strongly-associated formulaic sequences made of relatively rare words. |
Yves Bestgen; | arxiv-cs.CL | 2022-06-22 |
482 | Scaling Autoregressive Models for Content-Rich Text-to-Image Generation IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the Pathways Autoregressive Text-to-Image (Parti) model, which generates high-fidelity photorealistic images and supports content-rich synthesis involving complex compositions and world knowledge. |
JIAHUI YU et. al. | arxiv-cs.CV | 2022-06-21 |
483 | Reliable and Safe Use of Machine Translation in Medical Settings Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Language barriers between patients and clinicians contribute to disparities in quality of care. Machine Translation (MT) tools are widely used in healthcare settings, but even … |
Nikita Mehandru; Samantha Robertson; Niloufar Salehi; | 2022 ACM Conference on Fairness, Accountability, and … | 2022-06-20 |
484 | Understanding and Being Understood: User Strategies for Identifying and Recovering From Mistranslations in Machine Translation-Mediated Chat Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Machine translation (MT) is now widely and freely available, and has the potential to greatly improve cross-lingual communication. In order to use MT reliably and safely, end … |
Samantha Robertson; Mark Díaz; | Proceedings of the 2022 ACM Conference on Fairness, … | 2022-06-20 |
485 | The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task, which translates from English audio to German, Chinese, and Japanese. |
ZIQIANG ZHANG et. al. | arxiv-cs.CL | 2022-06-12 |
486 | The YiTrans Speech Translation System for IWSLT 2022 Offline Shared Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task, which translates from English audio to German, Chinese, … |
Ziqiang Zhang; Junyi Ao; | ArXiv | 2022-06-12 |
487 | A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Chinese dialect TTS frontend with a translation module, which converts Mandarin text into dialectic expressions to improve the intelligibility and naturalness of synthesized speech. |
Junhui Zhang; Wudi Bao; Junjie Pan; Xiang Yin; Zejun Ma; | arxiv-cs.CL | 2022-06-10 |
488 | VALHALLA: Visual Hallucination for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a visual hallucination framework, called VALHALLA, which requires only source sentences at inference time and instead uses hallucinated visual representations for multimodal machine translation. |
YI LI et. al. | cvpr | 2022-06-07 |
489 | Globetrotter: Connecting Languages By Connecting Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a method that uses visual observations to bridge the gap between languages, rather than relying on parallel corpora or topological properties of the representations. |
Dídac Surís; Dave Epstein; Carl Vondrick; | cvpr | 2022-06-07 |
490 | LegoNN: Building Modular Encoder-Decoder Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To achieve this reusability, the interface between encoder and decoder modules is grounded to a sequence of marginal distributions over a pre-defined discrete vocabulary. We present two approaches for ingesting these marginals; one is differentiable, allowing the flow of gradients across the entire network, and the other is gradient-isolating. |
SIDDHARTH DALMIA et. al. | arxiv-cs.CL | 2022-06-07 |
491 | MorisienMT: A Dataset for Mauritian Creole Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe MorisienMT, a dataset for benchmarking machine translation quality of Mauritian Creole. |
Raj Dabre; Aneerav Sukhoo; | arxiv-cs.CL | 2022-06-06 |
492 | Finetuning A Kalaallisut-English Machine Translation System Using Web-crawled Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we attempt to finetune a pretrained Kalaallisut-to-English neural machine translation (NMT) system using web-crawled pseudoparallel sentences from around 30 multilingual websites. |
Alex Jones; | arxiv-cs.CL | 2022-06-05 |
493 | Multilingual Neural Machine Translation with Deep Encoder and Multiple Shallow Decoders IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that for many-to-one translation we can indeed increase decoder speed without sacrificing quality using this approach, but for one-to-many translation, shallow decoders cause a clear quality drop. To ameliorate this drop, we propose a deep encoder with multiple shallow decoders (DEMSD) where each shallow decoder is responsible for a disjoint subset of target languages. |
XIANG KONG et. al. | arxiv-cs.CL | 2022-06-04 |
494 | Findings of The RuATD Shared Task 2022 on Artificial Text Detection in Russian Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the shared task on artificial text detection in Russian, which is organized as a part of the Dialogue Evaluation initiative, held in 2022. |
TATIANA SHAMARDINA et. al. | arxiv-cs.CL | 2022-06-03 |
495 | Exploring Diversity in Back Translation for Low-Resource Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work puts forward a more nuanced framework for understanding diversity in training data, splitting it into lexical diversity and syntactic diversity. We present novel metrics for measuring these different aspects of diversity and carry out empirical analysis into the effect of these types of diversity on final neural machine translation model performance for low-resource English$\leftrightarrow$Turkish and mid-resource English$\leftrightarrow$Icelandic. |
Laurie Burchell; Alexandra Birch; Kenneth Heafield; | arxiv-cs.CL | 2022-06-01 |
496 | NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on developing resources for languages in Indonesia. |
GENTA INDRA WINATA et. al. | arxiv-cs.CL | 2022-05-31 |
497 | Refining Low-Resource Unsupervised Translation By Language Disentanglement of Multilingual Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a simple refinement procedure to separate languages from a pre-trained multilingual UMT model for it to focus on only the target low-resource task. |
Xuan-Phi Nguyen; Shafiq Joty; Wu Kui; Ai Ti Aw; | arxiv-cs.CL | 2022-05-31 |
498 | VALHALLA: Visual Hallucination for Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a visual hallucination framework, called VALHALLA, which requires only source sentences at inference time and instead uses hallucinated visual representations for multimodal machine translation. |
YI LI et. al. | arxiv-cs.CV | 2022-05-31 |
499 | Preparing An Endangered Language for The Digital Age: The Case of Judeo-Spanish Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For text-to-speech synthesis, we present a 3.5 hour single speaker speech corpus for building a neural speech synthesis engine. |
Alp Öktem; Rodolfo Zevallos; Yasmin Moslem; Güneş Öztürk; Karen Şarhon; | arxiv-cs.CL | 2022-05-31 |
500 | X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we fill this research gap and present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain, which enables us to train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese. |
Sotaro Takeshita; Tommaso Green; Niklas Friedrich; Kai Eckert; Simone Paolo Ponzetto; | arxiv-cs.CL | 2022-05-30 |
501 | Can Transformer Be Too Compositional? Analysing Idiom Processing in Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer, by analysing the hidden states and attention patterns for models with English as source language and one of seven European languages as target language. |
Verna Dankers; Christopher G. Lucas; Ivan Titov; | arxiv-cs.CL | 2022-05-30 |
502 | BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While most of the research attention is given to the English language in a monolingual setting, resource-constrained languages like Bangla remain out of focus, predominantly due to a lack of standard datasets. Addressing this issue, we present a new dataset BAN-Cap following the widely used Flickr8k dataset, where we collect Bangla captions of the images provided by qualified annotators. |
Mohammad Faiyaz Khan; S. M. Sadiq-Ur-Rahman Shifath; Md Saiful Islam; | arxiv-cs.CL | 2022-05-28 |
503 | TURJUMAN: A Public Toolkit for Neural Arabic Machine Translation Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: We present TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA). TURJUMAN exploits the recently-introduced text-to-text Transformer AraT5 … |
El Moatez Billah Nagoudi; AbdelRahim Elmadany; M. Abdul-Mageed; | ArXiv | 2022-05-27 |
504 | Investigating Lexical Replacements for Arabic-English Code-Switched Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate data augmentation techniques for synthesizing dialectal Arabic-English CS text. |
Injy Hamed; Nizar Habash; Slim Abdennadher; Ngoc Thang Vu; | arxiv-cs.CL | 2022-05-25 |
505 | FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we provide baselines for the tasks based on multilingual pre-trained models like mSLAM. |
ALEXIS CONNEAU et. al. | arxiv-cs.CL | 2022-05-24 |
506 | DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. |
Gabriele Sarti; Arianna Bisazza; Ana Guerberof Arenas; Antonio Toral; | arxiv-cs.CL | 2022-05-24 |
507 | T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks. |
Paul-Ambroise Duquenne; Hongyu Gong; Benoît Sagot; Holger Schwenk; | arxiv-cs.CL | 2022-05-24 |
508 | Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As an alternative, we propose performing back-translation via code summarization and generation. |
Wasi Uddin Ahmad; Saikat Chakraborty; Baishakhi Ray; Kai-Wei Chang; | arxiv-cs.CL | 2022-05-23 |
509 | Tackling Data Scarcity in Speech Translation Using Zero-Shot Multilingual Machine Translation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the related field of multilingual text translation, several techniques have been proposed for zero-shot translation. |
T. A. Dinh; D. Liu; J. Niehues; | icassp | 2022-05-22 |
510 | ISOMETRIC MT: Neural Machine Translation for Automatic Dubbing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces a self-learning approach that allows a transformer model to directly learn to generate outputs that closely match the source length, in short Isometric MT. In particular, our approach does not require to generate multiple hypotheses nor any auxiliary ranking function. |
S. M. Lakew; Y. Virkar; P. Mathur; M. Federico; | icassp | 2022-05-22 |
511 | Context-Adaptive Document-Level Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces a data-adaptive method that enables the model to adopt the necessary and helpful context. |
L. Zhang; Z. Zhang; B. Chen; W. Luo; L. Si; | icassp | 2022-05-22 |
512 | Integrating Multiple ASR Systems Into NLP Backend with Attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we reduce the impact of ASR errors on the NLP back-end by combining transcriptions from various ASR systems. |
T. Kano; A. Ogawa; M. Delcroix; S. Watanabe; | icassp | 2022-05-22 |
513 | Non-Autoregressive Neural Machine Translation: A Call for Clarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Non-autoregressive approaches aim to improve the inference speed of translation models by only requiring a single forward pass to generate the output sequence instead of iteratively producing each predicted token. |
Robin M. Schmidt; Telmo Pires; Stephan Peitz; Jonas Lööf; | arxiv-cs.CL | 2022-05-21 |
514 | Understanding and Mitigating The Uncertainty in Zero-Shot Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand and alleviate the off-target issues from the perspective of uncertainty in zero-shot translation. |
Wenxuan Wang; Wenxiang Jiao; Shuo Wang; Zhaopeng Tu; Michael R. Lyu; | arxiv-cs.CL | 2022-05-20 |
515 | Translating Hanja Historical Documents to Contemporary Korean and English Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose H2KE, a neural machine translation model, that translates historical documents in Hanja to more easily understandable Korean and to English. |
JUHEE SON et. al. | arxiv-cs.CL | 2022-05-20 |
516 | Data Augmentation to Address Out-of-Vocabulary Problem in Low-Resource Sinhala-English Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a word and phrase replacement-based DA technique that considers both types of OOV, by augmenting (1) rare words in the existing parallel corpus, and (2) new words from a bilingual dictionary. |
Aloka Fernando; Surangika Ranathunga; | arxiv-cs.CL | 2022-05-18 |
517 | Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we report our recent achievements in S2ST. |
QIANQIAN DONG et. al. | arxiv-cs.CL | 2022-05-18 |
518 | Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel posterior alignment technique that is truly online in its execution and superior in terms of alignment error rates compared to existing methods. |
Soumya Chatterjee; Sunita Sarawagi; Preethi Jyothi; | acl | 2022-05-17 |
519 | Scheduled Multi-task Learning for Neural Chat Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although the NCT models have achieved impressive success, it is still far from satisfactory due to insufficient chat translation data and simple joint training manners. To address the above issues, we propose a scheduled multi-task learning framework for NCT. |
Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou; | acl | 2022-05-17 |
520 | An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output. |
Sweta Agrawal; Marine Carpuat; | acl | 2022-05-17 |
521 | Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While one possible solution is to directly take target contexts into these statistical metrics, the target-context-aware statistical computing is extremely expensive, and the corresponding storage overhead is unrealistic. To solve the above issues, we propose a target-context-aware metric, named conditional bilingual mutual information (CBMI), which makes it feasible to supplement target context information for statistical metrics. |
SONGMING ZHANG et. al. | acl | 2022-05-17 |
522 | DEEP: DEnoising Entity Pre-training for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Earlier named entity translation methods mainly focus on phonetic transliteration, which ignores the sentence context for translation and is limited in domain and language coverage. To address this limitation, we propose DEEP, a DEnoising Entity Pre-training method that leverages large amounts of monolingual data and a knowledge base to improve named entity translation accuracy within sentences. |
Junjie Hu; Hiroaki Hayashi; Kyunghyun Cho; Graham Neubig; | acl | 2022-05-17 |
523 | Efficient Cluster-Based K-Nearest-Neighbor Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To make it practical, in this paper, we explore a more efficient kNN-MT and propose to use clustering to improve the retrieval efficiency. |
Dexin Wang; Kai Fan; Boxing Chen; Deyi Xiong; | acl | 2022-05-17 |
524 | Zero-Shot Cross-lingual Semantic Parsing IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-logical form paired data and in-domain natural language corpora in each new language. |
Tom Sherborne; Mirella Lapata; | acl | 2022-05-17 |
525 | Confidence Based Bidirectional Global Context Aware Training Framework for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Confidence Based Bidirectional Global Context Aware (CBBGCA) training framework for NMT, where the NMT model is jointly trained with an auxiliary conditional masked language model (CMLM). |
CHULUN ZHOU et. al. | acl | 2022-05-17 |
526 | On Vision Features in Multimodal Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the impact of vision models on MMT. |
BEI LI et. al. | acl | 2022-05-17 |
527 | UniTE: Unified Translation Evaluation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose UniTE, which is the first unified framework engaged with abilities to handle all three evaluation tasks. |
YU WAN et. al. | acl | 2022-05-17 |
528 | BiTIIMT: A Bilingual Text-infilling Method for Interactive Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel BiTIIMT system, Bilingual Text-Infilling for Interactive Neural Machine Translation. |
YANLING XIAO et. al. | acl | 2022-05-17 |
529 | Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce multilingual crossover encoder-decoder (mXEncDec) to fuse language pairs at an instance level. |
YONG CHENG et. al. | acl | 2022-05-17 |
530 | Towards Making The Most of Cross-Lingual Transfer for Zero-Shot Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper demonstrates that multilingual pretraining and multilingual fine-tuning are both critical for facilitating cross-lingual transfer in zero-shot translation, where the neural machine translation (NMT) model is tested on source languages unseen during supervised training. Following this idea, we present SixT+, a strong many-to-English NMT model that supports 100 source languages but is trained with a parallel dataset in only six source languages. |
GUANHUA CHEN et. al. | acl | 2022-05-17 |
531 | Sub-Word Alignment Is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We leverage embedding duplication between aligned sub-words to extend the Parent-Child transfer learning method, so as to improve low-resource machine translation. |
Minhan Xu; Yu Hong; | acl | 2022-05-17 |
532 | STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing techniques often attempt to transfer powerful machine translation (MT) capabilities to ST, but neglect the representation discrepancy across modalities. In this paper, we propose the Speech-TExt Manifold Mixup (STEMM) method to calibrate such discrepancy. |
Qingkai Fang; Rong Ye; Lei Li; Yang Feng; Mingxuan Wang; | acl | 2022-05-17 |
533 | Measuring and Mitigating Name Biases in Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we describe a new source of bias prevalent in NMT systems, relating to translations of sentences containing person names. |
Jun Wang; Benjamin Rubinstein; Trevor Cohn; | acl | 2022-05-17 |
534 | Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an adaptive segmentation policy for end-to-end ST. Inspired by human interpreters, the policy learns to segment the source streaming speech into meaningful units by considering both acoustic features and translation history, maintaining consistency between the segmentation and translation. |
Ruiqing Zhang; Zhongjun He; Hua Wu; Haifeng Wang; | acl | 2022-05-17 |
535 | Triangular Transfer: Freezing The Pivot for Triangular Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a transfer-learning-based approach that utilizes all types of auxiliary data. |
Meng Zhang; Liangyou Li; Qun Liu; | acl | 2022-05-17 |
536 | MSCTD: A Multimodal Sentiment Chat Translation Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce a new task named Multimodal Chat Translation (MCT), aiming to generate more accurate translations with the help of the associated dialogue history and visual context. |
Yunlong Liang; Fandong Meng; Jinan Xu; Yufeng Chen; Jie Zhou; | acl | 2022-05-17 |
537 | Machine Translation for Livonian: Catering to 20 Speakers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we tackle the task of developing neural machine translation (NMT) between Livonian and English, with a two-fold aim: on one hand, preserving the language and on the other – enabling access to Livonian folklore, lifestories and other textual intangible heritage as well as making it easier to create further parallel corpora. |
Matiss Rikters; Marili Tomingas; Tuuli Tuisk; Valts Ernštreits; Mark Fishel; | acl | 2022-05-17 |
538 | Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT), which augments each training instance with an adjacency semantic region that could cover adequate variants of literal expression under the same meaning. |
XIANGPENG WEI et. al. | acl | 2022-05-17 |
539 | ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores a deeper relationship between Transformer and numerical ODE methods. |
BEI LI et. al. | acl | 2022-05-17 |
540 | Can Transformer Be Too Compositional? Analysing Idiom Processing in Neural Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer, by analysing the hidden states and attention patterns for models with English as source language and one of seven European languages as target language. |
Verna Dankers; Christopher Lucas; Ivan Titov; | acl | 2022-05-17 |
541 | Redistributing Low-Frequency Words: Making The Most of Monolingual Data in Non-Autoregressive Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we provide an appealing alternative for NAT – monolingual KD, which trains NAT student on external monolingual data with AT teacher trained on the original bilingual data. |
Liang Ding; Longyue Wang; Shuming Shi; Dacheng Tao; Zhaopeng Tu; | acl | 2022-05-17 |
542 | DiBiMT: A Novel Benchmark for Measuring Word Sense Disambiguation Biases in Machine Translation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present DiBiMT, the first entirely manually-curated evaluation benchmark which enables an extensive study of semantic biases in Machine Translation of nominal and verbal words in five different language combinations, namely, English and one or other of the following languages: Chinese, German, Italian, Russian and Spanish. |
Niccolò Campolungo; Federico Martelli; Francesco Saina; Roberto Navigli; | acl | 2022-05-17 |
543 | Focus on The Target’s Vocabulary: Masked Label Smoothing for Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: When allocating smoothed probability, original label smoothing treats the source-side words that would never appear in the target language equally to the real target-side words, which could bias the translation model. To address this issue, we propose Masked Label Smoothing (MLS), a new mechanism that masks the soft label probability of source-side words to zero. |
Liang Chen; Runxin Xu; Baobao Chang; | acl | 2022-05-17 |
544 | A Variational Hierarchical Model for Neural Cross-Lingual Summarization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, it is very challenging for the model to directly conduct CLS as it requires both the abilities to translate and summarize. To address this issue, we propose a hierarchical model for the CLS task, based on the conditional variational auto-encoder. |
YUNLONG LIANG et. al. | acl | 2022-05-17 |
545 | Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the potential benefits of transfer learning in an English-isiZulu translation framework. |
Muhammad Umair Nasir; Innocent Amos Mchechesi; | arxiv-cs.CL | 2022-05-17 |
546 | As Little As Possible, As Much As Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Omission and addition of content is a typical issue in neural machine translation. We propose a method for detecting such phenomena with off-the-shelf translation models. |
Jannis Vamvas; Rico Sennrich; | acl | 2022-05-17 |
547 | Bridging The Data Gap Between Training and Inference for Unsupervised Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To narrow the data gap, we propose an online self-training approach, which simultaneously uses the pseudo parallel data {natural source, translated target} to mimic the inference scenario. |
Zhiwei He; Xing Wang; Rui Wang; Shuming Shi; Zhaopeng Tu; | acl | 2022-05-17 |
548 | Consistent Human Evaluation of Machine Translation Across Language Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new metric called XSTS that is more focused on semantic equivalence and a cross-lingual calibration method that enables more consistent assessment. |
DANIEL LICHT et. al. | arxiv-cs.CL | 2022-05-17 |
549 | From Simultaneous to Streaming Machine Translation By Leveraging Streaming History Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation, that is successfully evaluated on streaming conditions for a reference IWSLT task. |
Javier Iranzo Sanchez; Jorge Civera; Alfons Juan-Císcar; | acl | 2022-05-17 |
550 | Bias Mitigation in Machine Translation Quality Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We analyse the partial input bias in further detail and evaluate four approaches to use auxiliary tasks for bias mitigation. |
Hanna Behnke; Marina Fomicheva; Lucia Specia; | acl | 2022-05-17 |
551 | AppTek’s Submission to The IWSLT 2022 Isometric Spoken Language Translation Task Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: To participate in the Isometric Spoken Language Translation Task of the IWSLT 2022 evaluation, constrained condition, AppTek developed neural Transformer-based systems for … |
Patrick Wilken; Evgeny Matusov; | arxiv-cs.CL | 2022-05-11 |
552 | Controlling Extra-Textual Attributes About Dialogue Participants — A Case Study of English-to-Polish Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on the underresearched problem of utilising external metadata in automatic translation of TV dialogue, proposing a case study where a wide range of approaches for controlling attributes in translation is employed in a multi-attribute scenario. |
Sebastian T. Vincent; Loïc Barrault; Carolina Scarton; | arxiv-cs.CL | 2022-05-10 |
553 | Controlling Extra-Textual Attributes About Dialogue Participants: A Case Study of English-to-Polish Neural Machine Translation Abstract: Unlike English, morphologically rich languages can reveal characteristics of speakers or their conversational partners, such as gender and number, via pronouns, morphological … |
S. Vincent; Loïc Barrault; Carolina Scarton; | ArXiv | 2022-05-10 |
554 | ParaCotta: Synthetic Multilingual Paraphrase Corpora from The Most Diverse Translation Sample Pair Highlight: We generate multiple translation samples using beam search and choose the most lexically diverse pair according to their sentence BLEU. |
ALHAM FIKRI AJI et. al. | arxiv-cs.CL | 2022-05-09 |
555 | CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality IF:3 Highlight: We introduce an annotated dataset (CoCoA-MT) and an associated evaluation metric for training and evaluating formality-controlled MT models for six diverse target languages. |
MARIA NĂDEJDE et. al. | arxiv-cs.CL | 2022-05-09 |
556 | Example-Based Machine Translation from Text to A Hierarchical Representation of Sign Language Highlight: This article presents an original method for Text-to-Sign Translation. |
Élise Bertin-Lemée; Annelies Braffort; Camille Challant; Claire Danet; Michael Filhol; | arxiv-cs.CL | 2022-05-06 |
557 | Bridging The Domain Gap for Stance Detection for The Zulu Language Highlight: We propose a black-box non-intrusive method that utilizes techniques from Domain Adaptation to reduce the domain gap, without requiring any human expertise in the target language, by leveraging low-quality data in both a supervised and unsupervised manner. |
Gcinizwe Dlamini; Imad Eddine Ibrahim Bekkouch; Adil Khan; Leon Derczynski; | arxiv-cs.CL | 2022-05-06 |
558 | Non-Autoregressive Machine Translation: It’s Not As Fast As It Seems Highlight: In this paper, we point out flaws in the evaluation methodology present in the literature on NAR models and we provide a fair comparison between a state-of-the-art NAR model and the autoregressive submissions to the shared task. |
Jindřich Helcl; Barry Haddow; Alexandra Birch; | arxiv-cs.CL | 2022-05-04 |
559 | ON-TRAC Consortium Systems for The IWSLT 2022 Dialect and Low-resource Speech Translation Tasks Highlight: This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2022: low-resource and dialect speech translation. |
MARCELY ZANON BOITO et. al. | arxiv-cs.CL | 2022-05-04 |
560 | Original or Translated? A Causal Analysis of The Impact of Translationese on Machine Translation Performance Highlight: In this work, we collect CausalMT, a dataset where the MT training data are also labeled with the human translation directions. |
Jingwei Ni; Zhijing Jin; Markus Freitag; Mrinmaya Sachan; Bernhard Schölkopf; | arxiv-cs.CL | 2022-05-04 |
561 | The Implicit Length Bias of Label Smoothing on Beam Search Decoding Highlight: We verify our theory by applying a simple rectification function at inference time to restore the unbiased distributions from the label-smoothed model predictions. |
Bowen Liang; Pidong Wang; Yuan Cao; | arxiv-cs.CL | 2022-05-02 |
562 | Quality-Aware Decoding for Neural Machine Translation IF:3 Highlight: Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search. In this paper, we bring together these two lines of research and propose quality-aware decoding for NMT, by leveraging recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods like N-best reranking and minimum Bayes risk decoding. |
PATRICK FERNANDES et. al. | arxiv-cs.CL | 2022-05-02 |
563 | Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation Highlight: Therefore, there is a need to create training and evaluation data for implementing machine learning tasks and bridging the research gap in the language. This work presents the Hausa Visual Genome (HaVG), a dataset that contains the description of an image or a section within the image in Hausa and its equivalent in English. |
IDRIS ABDULMUMIN et. al. | arxiv-cs.CL | 2022-05-02 |
564 | Improving Machine Translation Systems Via Isotopic Replacement IF:3 Abstract: Machine translation plays an essential role in people’s daily international communication. However, machine translation systems are far from perfect. To tackle this problem, … |
ZEYU SUN et. al. | 2022 IEEE/ACM 44th International Conference on Software … | 2022-05-01 |
565 | The Cross-lingual Conversation Summarization Challenge Highlight: We propose the shared task of cross-lingual conversation summarization, ConvSumX Challenge, opening new avenues for researchers to investigate solutions that integrate conversation summarization and machine translation. |
YULONG CHEN et. al. | arxiv-cs.CL | 2022-04-30 |
566 | Can Machine Translation Be A Reasonable Alternative for Multilingual Question Answering Systems Over Knowledge Graphs? Highlight: In this work, we discuss Knowledge Graph Question Answering (KGQA) systems that aim at providing natural language access to data stored in Knowledge Graphs (KG). |
Aleksandr Perevalov; Andreas Both; Dennis Diefenbach; Axel-Cyrille Ngonga Ngomo; | www | 2022-04-29 |
567 | How Robust Is Neural Machine Translation to Language Imbalance in Multilingual Tokenizer Training? Highlight: In this work, we analyze how translation performance changes as the data ratios among languages vary in the tokenizer training corpus. |
SHIYUE ZHANG et. al. | arxiv-cs.CL | 2022-04-29 |
568 | NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures Highlight: Translation-based similarity measures include direct and pivot translation probability, as well as translation cross-likelihood, which has not been studied so far. We analyze these measures in the common framework of multilingual NMT, releasing the NMTScore library (available at https://github.com/ZurichNLP/nmtscore). |
Jannis Vamvas; Rico Sennrich; | arxiv-cs.CL | 2022-04-28 |
569 | UniTE: Unified Translation Evaluation IF:3 Highlight: In this paper, we propose UniTE, which is the first unified framework engaged with abilities to handle all three evaluation tasks. |
YU WAN et. al. | arxiv-cs.CL | 2022-04-28 |
570 | Data-Driven Adaptive Simultaneous Machine Translation Highlight: However, wait-k suffers from two major limitations: (a) it is a fixed policy that cannot adaptively adjust latency given context, and (b) its training is much slower than full-sentence translation. To alleviate these issues, we propose a novel and efficient training scheme for adaptive SimulMT by augmenting the training corpus with adaptive prefix-to-prefix pairs, while the training complexity remains the same as that of training full-sentence translation models. |
GUANGXU XUN et. al. | arxiv-cs.CL | 2022-04-26 |
571 | When Do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation? Highlight: This work proposes a word-level contrastive objective to leverage word alignments for many-to-many NMT. |
ZHUOYUAN MAO et. al. | arxiv-cs.CL | 2022-04-26 |
572 | Efficient Machine Translation Domain Adaptation IF:3 Highlight: In this paper, we explore several approaches to speed up nearest neighbor machine translation. |
Pedro Henrique Martins; Zita Marinho; André F. T. Martins; | arxiv-cs.CL | 2022-04-26 |
573 | Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation Abstract: We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to … |
Mathieu De Coster; J. Dambre; | Inf. | 2022-04-23 |