Paper Digest: EMNLP 2022 Highlights
To help the community quickly catch up on the work presented at this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly grasp the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. These models power this website and are behind our services, including "search engine", "summarization", "question answering", and "literature review".
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to receive updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to stay updated with new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: EMNLP 2022 Highlights
Paper | Author(s)
---|---
1 | Generative Knowledge Graph Construction: A Review. Highlight: In this study, we summarize the recent compelling progress in generative knowledge graph construction. |
Hongbin Ye; Ningyu Zhang; Hui Chen; Huajun Chen; |
2 | CDConv: A Benchmark for Contradiction Detection in Chinese Conversations. Highlight: In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. |
Chujie Zheng; Jinfeng Zhou; Yinhe Zheng; Libiao Peng; Zhen Guo; Wenquan Wu; Zheng-Yu Niu; Hua Wu; Minlie Huang; |
3 | Transformer Feed-Forward Layers Build Predictions By Promoting Concepts in The Vocabulary Space. Highlight: Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood. In this work, we make a substantial step towards unveiling this underlying prediction process, by reverse-engineering the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models. |
Mor Geva; Avi Caciularu; Kevin Wang; Yoav Goldberg; |
4 | Learning to Generate Question By Asking Question: A Primal-Dual Approach with Uncommon Word Generation. Highlight: Moreover, unseen or rare word generation has not been studied in previous works. In this paper, we propose a novel approach which incorporates question generation with its dual problem, question answering, into a unified primal-dual framework. |
Qifan Wang; Li Yang; Xiaojun Quan; Fuli Feng; Dongfang Liu; Zenglin Xu; Sinong Wang; Hao Ma; |
5 | Graph-based Model Generation for Few-Shot Relation Extraction. Highlight: Existing models follow a "one-for-all" scheme where one general large model performs all individual N-way-K-shot tasks in FSRE, which prevents the model from achieving the optimal point on each task. In view of this, we propose a model generation framework that consists of one general model for all tasks and many tiny task-specific models for each individual task. |
Wanli Li; Tieyun Qian; |
6 | Backdoor Attacks in Federated Learning By Rare Embeddings and Gradient Ensembling. Highlight: This paper investigates the feasibility of model poisoning for backdoor attacks through rare word embeddings of NLP models. |
Ki Yoon Yoo; Nojun Kwak; |
7 | Generating Natural Language Proofs with Verifier-Guided Search. Highlight: In this work, we focus on proof generation: Given a hypothesis and a set of supporting facts, the model generates a proof tree indicating how to derive the hypothesis from supporting facts. |
Kaiyu Yang; Jia Deng; Danqi Chen; |
8 | Toward Unifying Text Segmentation and Long Document Summarization. Highlight: In this paper, we explore the role that section segmentation plays in extractive summarization of written and spoken documents. |
Sangwoo Cho; Kaiqiang Song; Xiaoyang Wang; Fei Liu; Dong Yu; |
9 | The Geometry of Multilingual Language Model Representations. Highlight: We assess how multilingual language models maintain a shared multilingual representation space while still encoding language-sensitive information in each language. |
Tyler Chang; Zhuowen Tu; Benjamin Bergen; |
10 | Improving Complex Knowledge Base Question Answering Via Question-to-Action and Question-to-Question Alignment. Highlight: However, there is a significant semantic and structural gap between natural language and action sequences, which makes this conversion difficult. In this paper, we introduce an alignment-enhanced complex question answering framework, called ALCQA, which mitigates this gap through question-to-action alignment and question-to-question alignment. |
Yechun Tang; Xiaoxia Cheng; Weiming Lu; |
11 | PAIR: Prompt-Aware MargIn Ranking for Counselor Reflection Scoring in Motivational Interviewing. Highlight: In this paper, we propose a system for the analysis of counselor reflections. |
Do June Min; Verónica Pérez-Rosas; Kenneth Resnicow; Rada Mihalcea; |
12 | Co-guiding Net: Achieving Mutual Guidances Between Multiple Intent Detection and Slot Filling Via Heterogeneous Semantics-Label Graphs. Highlight: In this paper, we propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving the mutual guidances between the two tasks. |
Bowen Xing; Ivor Tsang; |
13 | The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains. Highlight: We propose a general task-agnostic method, namely intra-distillation, appended to the regular training loss to balance parameter sensitivity. |
Haoran Xu; Philipp Koehn; Kenton Murray; |
14 | Interpreting Language Models with Contrastive Explanations. Highlight: To disentangle the different decisions in language modeling, we focus on explaining language models contrastively: we look for salient input tokens that explain why the model predicted one token instead of another. |
Kayo Yin; Graham Neubig; |
15 | RankGen: Improving Text Generation with Large Ranking Models. Highlight: Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts. To address these issues we present RankGen, a 1.2B parameter encoder model for English that scores model generations given a prefix. |
Kalpesh Krishna; Yapei Chang; John Wieting; Mohit Iyyer; |
16 | Learning A Grammar Inducer from Massive Uncurated Instructional Videos. Highlight: While previous work focuses on building systems for inducing grammars on text that are well-aligned with video content, we investigate the scenario, in which text and video are only in loose correspondence. |
Songyang Zhang; Linfeng Song; Lifeng Jin; Haitao Mi; Kun Xu; Dong Yu; Jiebo Luo; |
17 | Normalized Contrastive Learning for Text-Video Retrieval. Highlight: Specifically, we show that many test instances are either over- or under-represented during retrieval, significantly hurting the retrieval performance. To address this problem, we propose Normalized Contrastive Learning (NCL) which utilizes the Sinkhorn-Knopp algorithm to compute the instance-wise biases that properly normalize the sum retrieval probabilities of each instance so that every text and video instance is fairly represented during cross-modal retrieval. |
Yookoon Park; Mahmoud Azab; Seungwhan Moon; Bo Xiong; Florian Metze; Gourab Kundu; Kirmani Ahmed; |
18 | Estimating Soft Labels for Out-of-Domain Intent Detection. Highlight: In this paper, we propose an adaptive soft pseudo labeling (ASoul) method that can estimate soft labels for pseudo OOD samples when training OOD detectors. |
Hao Lang; Yinhe Zheng; Jian Sun; Fei Huang; Luo Si; Yongbin Li; |
19 | Multi-VQG: Generating Engaging Questions for Multiple Images. Highlight: In this paper, we propose generating engaging questions from multiple images. |
Min-Hsuan Yeh; Vincent Chen; Ting-Hao Huang; Lun-Wei Ku; |
20 | Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation. Highlight: In this paper, we present the first systematic conceptual and data-driven analysis to examine the shortcomings of token-level equivalence measures. |
Jannis Bulian; Christian Buck; Wojciech Gajewski; Benjamin Börschinger; Tal Schuster; |
21 | Non-Parametric Domain Adaptation for End-to-End Speech Translation. Highlight: In this paper, we propose a novel non-parametric method that leverages in-domain text translation corpus to achieve domain adaptation for E2E-ST systems. |
Yichao Du; Weizhi Wang; Zhirui Zhang; Boxing Chen; Tong Xu; Jun Xie; Enhong Chen; |
22 | Prompting for Multimodal Hateful Meme Classification. Highlight: However, there is no known explicit external knowledge base that could provide such hate speech contextual information. To address this gap, we propose PromptHate, a simple yet effective prompt-based model that prompts pre-trained language models (PLMs) for hateful meme classification. |
Rui Cao; Roy Ka-Wei Lee; Wen-Haw Chong; Jing Jiang; |
23 | Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking. Highlight: In this paper, we propose the concept of certified error control of candidate set pruning for relevance ranking, which means that the test error after pruning is guaranteed to be controlled under a user-specified threshold with high probability. |
Minghan Li; Xinyu Zhang; Ji Xin; Hongyang Zhang; Jimmy Lin; |
24 | Linearizing Transformer with Key-Value Memory. Highlight: We propose Memsizer, an approach towards closing the performance gap while improving the efficiency even with short generation. |
Yizhe Zhang; Deng Cai; |
25 | Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions. Highlight: In this work, we investigate the robustness of multimodal classifiers to cross-modal dilutions, a plausible variation. |
Gaurav Verma; Vishwa Vinay; Ryan Rossi; Srijan Kumar; |
26 | Translation Between Molecules and Natural Language. Highlight: We present MolT5 – a self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings. |
Carl Edwards; Tuan Lai; Kevin Ros; Garrett Honke; Kyunghyun Cho; Heng Ji; |
27 | What Makes Instruction Learning Hard? An Investigation and A New Challenge in A Synthetic Environment. Highlight: We thus propose Hard RegSet as a challenging instruction learning dataset, and a controlled environment for studying instruction learning. |
Matthew Finlayson; Kyle Richardson; Ashish Sabharwal; Peter Clark; |
28 | Sentence-Incremental Neural Coreference Resolution. Highlight: We propose a sentence-incremental neural coreference resolution system which incrementally builds clusters after marking mention boundaries in a shift-reduce method. |
Matt Grenander; Shay B. Cohen; Mark Steedman; |
29 | SNaC: Coherence Error Detection for Narrative Summarization. Highlight: In this work, we introduce SNaC, a narrative coherence evaluation framework for fine-grained annotations of long summaries. |
Tanya Goyal; Junyi Jessy Li; Greg Durrett; |
30 | HydraSum: Disentangling Style Features in Text Summarization with Multi-Decoder Models. Highlight: However, these are implicitly encoded within model parameters and specific styles cannot be enforced. To address this, we introduce HydraSum, a new summarization architecture that extends the single decoder framework of current models to a mixture-of-experts version with multiple decoders. |
Tanya Goyal; Nazneen Rajani; Wenhao Liu; Wojciech Kryscinski; |
31 | A Good Neighbor, A Found Treasure: Mining Treasured Neighbors for Knowledge Graph Entity Typing. Highlight: Besides, we also observe that there are co-occurrence relations between types, which is very helpful to alleviate false-negative problem. In this paper, we propose a novel method called Mining Treasured Neighbors (MiNer) to make use of these two characteristics. |
Zhuoran Jin; Pengfei Cao; Yubo Chen; Kang Liu; Jun Zhao; |
32 | Guiding Neural Entity Alignment with Compatibility. Highlight: In this work, we argue that different entities within one KG should have compatible counterparts in the other KG due to the potential dependencies among the entities. |
Bing Liu; Harrisen Scells; Wen Hua; Guido Zuccon; Genghong Zhao; Xia Zhang; |
33 | InstructDial: Improving Zero and Few-shot Generalization in Dialogue Through Instruction Tuning. Highlight: We introduce InstructDial, an instruction tuning framework for dialogue, which consists of a repository of 48 diverse dialogue tasks in a unified text-to-text format created from 59 openly available dialogue datasets. |
Prakhar Gupta; Cathy Jiao; Yi-Ting Yeh; Shikib Mehri; Maxine Eskenazi; Jeffrey Bigham; |
34 | Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling. Highlight: In this work, we suggest unsupervised statistical boundary information instead, and propose an architecture to encode the information directly into pre-trained language models, resulting in Boundary-Aware BERT (BABERT). |
Peijie Jiang; Dingkun Long; Yanzhao Zhang; Pengjun Xie; Meishan Zhang; Min Zhang; |
35 | RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder. Highlight: In this paper, we propose RetroMAE, a new retrieval oriented pre-training paradigm based on Masked Auto-Encoder (MAE). |
Shitao Xiao; Zheng Liu; Yingxia Shao; Zhao Cao; |
36 | Aligning Recommendation and Conversation Via Dual Imitation. Highlight: However, existing conversational recommendation systems (CRS) ignore the advantage of user interest shift in connecting recommendation and conversation, which leads to an ineffective loose coupling structure of CRS. To address this issue, by modeling the recommendation actions as recommendation paths in a knowledge graph (KG), we propose DICR (Dual Imitation for Conversational Recommendation), which designs a dual imitation to explicitly align the recommendation paths and user interest shift paths in a recommendation module and a conversation module, respectively. |
Jinfeng Zhou; Bo Wang; Minlie Huang; Dongming Zhao; Kun Huang; Ruifang He; Yuexian Hou; |
37 | QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance. Highlight: In this paper, we propose QRelScore, a context-aware Relevance evaluation metric for Question Generation. |
Xiaoqiang Wang; Bang Liu; Siliang Tang; Lingfei Wu; |
38 | Abstract Visual Reasoning with Tangram Shapes. Highlight: We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. |
Anya Ji; Noriyuki Kojima; Noah Rush; Alane Suhr; Wai Keen Vong; Robert Hawkins; Yoav Artzi; |
39 | UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. Highlight: Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation by proposing the UnifiedSKG framework, which unifies 21 SKG tasks into a text-to-text format, aiming to promote systematic SKG research, instead of being exclusive to a single task, domain, or dataset. |
Tianbao Xie; Chen Henry Wu; Peng Shi; Ruiqi Zhong; Torsten Scholak; Michihiro Yasunaga; Chien-Sheng Wu; Ming Zhong; Pengcheng Yin; Sida I. Wang; Victor Zhong; Bailin Wang; Chengzu Li; Connor Boyle; Ansong Ni; Ziyu Yao; Dragomir Radev; Caiming Xiong; Lingpeng Kong; Rui Zhang; Noah A. Smith; Luke Zettlemoyer; Tao Yu; |
40 | Balanced Adversarial Training: Balancing Tradeoffs Between Fickleness and Obstinacy in NLP Models. Highlight: We show that standard adversarial training methods focused on reducing vulnerability to fickle adversarial examples may make a model more vulnerable to obstinate adversarial examples, with experiments for both natural language inference and paraphrase identification tasks. To counter this phenomenon, we introduce Balanced Adversarial Training, which incorporates contrastive learning to increase robustness against both fickle and obstinate adversarial examples. |
Hannah Chen; Yangfeng Ji; David Evans; |
41 | When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks. Highlight: In this work, we present a simple transformer-based model that outperforms specialized architectures on ReaSCAN and a modified version (Qiu et al., 2021) of gSCAN (Ruis et al., 2020). |
Ankur Sikarwar; Arkil Patel; Navin Goyal; |
42 | Generative Language Models for Paragraph-Level Question Generation. Highlight: However, it is difficult to measure advances in QG research since there are no standardized resources that allow a uniform comparison among approaches. In this paper, we introduce QG-Bench, a multilingual and multidomain benchmark for QG that unifies existing question answering datasets by converting them to a standard QG setting. |
Asahi Ushio; Fernando Alva-Manchego; Jose Camacho-Collados; |
43 | A Unified Encoder-Decoder Framework with Entity Memory. Highlight: In this work, we propose an encoder-decoder framework with an entity memory, namely EDMem. |
Zhihan Zhang; Wenhao Yu; Chenguang Zhu; Meng Jiang; |
44 | Segmenting Numerical Substitution Ciphers. Highlight: In this work, we propose the first automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and unigram language models. |
Nada Aldarrab; Jonathan May; |
45 | Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset. Highlight: In this paper we present the Crossmodal-3600 dataset (XM3600 in short), a geographically diverse set of 3600 images annotated with human-generated reference captions in 36 languages. |
Ashish V. Thapliyal; Jordi Pont Tuset; Xi Chen; Radu Soricut; |
46 | ReSel: N-ary Relation Extraction from Scientific Text and Tables By Learning to Retrieve and Select. Highlight: We study the problem of extracting N-ary relation tuples from scientific articles. |
Yuchen Zhuang; Yinghao Li; Junyang Zhang; Yue Yu; Yingjun Mou; Xiang Chen; Le Song; Chao Zhang; |
47 | GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs. Highlight: An additional limitation is that the union operator is non-closure, which undermines the model to handle a series of union operators. To address these problems, we propose a novel probabilistic embedding model, namely Gamma Embeddings (GammaE), for encoding entities and queries to answer different types of FOL queries on KGs. |
Dong Yang; Peijun Qing; Yang Li; Haonan Lu; Xiaodong Lin; |
48 | Reasoning Like Program Executors. Highlight: In this paper, we showcase two simple instances POET-Math and POET-Logic, in addition to a complex instance, POET-SQL. |
Xinyu Pi; Qian Liu; Bei Chen; Morteza Ziyadi; Zeqi Lin; Qiang Fu; Yan Gao; Jian-Guang Lou; Weizhu Chen; |
49 | SEM-F1: An Automatic Way for Semantic Evaluation of Multi-Narrative Overlap Summaries at Scale. Highlight: In this paper, we exclusively focus on the automated evaluation of the SOS task using the benchmark dataset. |
Naman Bansal; Mousumi Akter; Shubhra Kanti Karmaker Santu; |
50 | Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning. Highlight: In this paper, we propose to understand and further develop prefix-tuning through the kernel lens. |
Yifan Chen; Devamanyu Hazarika; Mahdi Namazifar; Yang Liu; Di Jin; Dilek Hakkani-Tur; |
51 | DocInfer: Document-level Natural Language Inference Using Optimal Evidence Selection. Highlight: We present DocInfer – a novel, end-to-end Document-level Natural Language Inference model that builds a hierarchical document graph enriched through inter-sentence relations (topical, entity-based, concept-based), performs paragraph pruning using the novel SubGraph Pooling layer, followed by optimal evidence selection based on REINFORCE algorithm to identify the most important context sentences for a given hypothesis. |
Puneet Mathur; Gautam Kunapuli; Riyaz Bhat; Manish Shrivastava; Dinesh Manocha; Maneesh Singh; |
52 | LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework Via Three-view Label Propagation. Highlight: In this paper, we argue that existing complex EA methods inevitably inherit the inborn defects from their neural network lineage: poor interpretability and weak scalability. |
Xin Mao; Wenting Wang; Yuanbin Wu; Man Lan; |
53 | Metric-guided Distillation: Distilling Knowledge from The Metric to Ranker and Retriever for Generative Commonsense Reasoning. Highlight: Another problem is that re-ranking is very expensive, but only using retrievers will seriously degrade the performance of their generation models. To solve these problems, we propose the metric distillation rule to distill knowledge from the metric (e.g., BLEU) to the ranker. |
Xingwei He; Yeyun Gong; A-Long Jin; Weizhen Qi; Hang Zhang; Jian Jiao; Bartuer Zhou; Biao Cheng; Sm Yiu; Nan Duan; |
54 | Efficient Document Retrieval By End-to-End Refining and Quantizing BERT Embedding with Contrastive Product Quantization. Highlight: Furthermore, the Hamming distance can only be equal to one of several integer values, significantly limiting its representational ability for document distances. To address these issues, in this paper, we propose to leverage BERT embeddings to perform efficient retrieval based on the product quantization technique, which will assign for every document a real-valued codeword from the codebook, instead of a binary code as in semantic hashing. |
Zexuan Qiu; Qinliang Su; Jianxing Yu; Shijing Si; |
55 | Curriculum Knowledge Distillation for Emoji-supervised Cross-lingual Sentiment Analysis. Highlight: In this work, based on the intuitive assumption that the relationships between emojis and sentiments are consistent across different languages, we investigate transferring sentiment knowledge across languages with the help of emojis. |
Jianyang Zhang; Tao Liang; Mingyang Wan; Guowu Yang; Fengmao Lv; |
56 | Correctable-DST: Mitigating Historical Context Mismatch Between Training and Inference for Improved Dialogue State Tracking. Highlight: However, only the previously predicted dialogue state can be used in inference. This discrepancy might lead to error propagation, i.e., mistakes made by the model in the current turn are likely to be carried over to the following turns. To solve this problem, we propose Correctable Dialogue State Tracking (Correctable-DST). |
Hongyan Xie; Haoxiang Su; Shuangyong Song; Hao Huang; Bo Zou; Kun Deng; Jianghua Lin; Zhihui Zhang; Xiaodong He; |
57 | DropMix: A Textual Data Augmentation Combining Dropout with Mixup. Highlight: In this paper, we argue that the property is essential to overcome overfitting in text learning. |
Fanshuang Kong; Richong Zhang; Xiaohui Guo; Samuel Mensah; Yongyi Mao; |
58 | Cross-document Event Coreference Search: Task, Dataset and Modeling. Highlight: We propose an appealing, and often more applicable, complementary set up for the task: Cross-document Coreference Search, focusing in this paper on event coreference. |
Alon Eirew; Avi Caciularu; Ido Dagan; |
59 | VIRT: Improving Representation-based Text Matching Via Virtual Interaction. Highlight: However, these models suffer from severe performance degradation due to the lack of interactions between the pair of texts. To remedy this, we propose a Virtual InteRacTion mechanism (VIRT) for improving representation-based text matching while maintaining its efficiency. |
Dan Li; Yang Yang; Hongyin Tang; Jiahao Liu; Qifan Wang; Jingang Wang; Tong Xu; Wei Wu; Enhong Chen; |
60 | MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction. Highlight: Different types of event relations naturally interact with each other, but existing datasets only cover limited relation types at once, which prevents models from taking full advantage of relation interactions. To address these issues, we construct a unified large-scale human-annotated ERE dataset MAVEN-ERE with improved annotation schemes. |
Xiaozhi Wang; Yulin Chen; Ning Ding; Hao Peng; Zimu Wang; Yankai Lin; Xu Han; Lei Hou; Juanzi Li; Zhiyuan Liu; Peng Li; Jie Zhou; |
61 | Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models. Highlight: In this paper, we introduce effective ways to select data from unlabeled corpora of target domains for language model pretraining to improve the performances in target entity extraction tasks. |
Aniruddha Mahapatra; Sharmila Reddy Nangi; Aparna Garimella; Anandhavelu N; |
62 | How Large Language Models Are Transforming Machine-Paraphrase Plagiarism. Highlight: This work explores T5 and GPT3 for machine-paraphrase generation on scientific articles from arXiv, student theses, and Wikipedia. |
Jan Philip Wahle; Terry Ruas; Frederic Kirstein; Bela Gipp; |
63 | M2D2: A Massively Multi-Domain Language Modeling Dataset. Highlight: We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs). |
Machel Reid; Victor Zhong; Suchin Gururangan; Luke Zettlemoyer; |
64 | "Will You Find These Shortcuts?" A Protocol for Evaluating The Faithfulness of Input Salience Methods for Text Classification. Highlight: Existing work on faithfulness evaluation is not conclusive and does not provide a clear answer as to how different methods are to be compared. Focusing on text classification and the model debugging scenario, our main contribution is a protocol for faithfulness evaluation that makes use of partially synthetic data to obtain ground truth for feature importance ranking. |
Jasmijn Bastings; Sebastian Ebert; Polina Zablotskaia; Anders Sandholm; Katja Filippova; |
65 | Information-Transport-based Policy for Simultaneous Translation. Highlight: In this paper, we treat the translation as information transport from source to target and accordingly propose an Information-Transport-based Simultaneous Translation (ITST). |
Shaolei Zhang; Yang Feng; |
66 | Learning to Adapt to Low-Resource Paraphrase Generation. Highlight: However, transferring a paraphrasing model to another domain encounters the problem of domain shifting especially when the data is sparse. At the same time, widely using large pre-trained language models (PLMs) faces the overfitting problem when training on scarce labeled data. To mitigate these two issues, we propose, LAPA, an effective adapter for PLMs optimized by meta-learning. |
Zhigen Li; Yanmeng Wang; Rizhao Fan; Ye Wang; Jianfeng Li; Shaojun Wang; |
67 | A Distributional Lens for Multi-Aspect Controllable Text Generation. Highlight: Existing methods achieve complex multi-aspect control by fusing multiple controllers learned from single-aspect, but suffer from attribute degeneration caused by the mutual interference of these controllers. To address this, we provide observations on attribute fusion from a distributional perspective and propose to directly search for the intersection areas of multiple attribute distributions as their combination for generation. |
Yuxuan Gu; Xiaocheng Feng; Sicheng Ma; Lingyuan Zhang; Heng Gong; Bing Qin; |
68 | ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation. Highlight: In this paper, we propose ELMER: an efficient and effective PLM for NAR text generation to explicitly model the token dependency during NAR generation. |
Junyi Li; Tianyi Tang; Wayne Xin Zhao; Jian-Yun Nie; Ji-Rong Wen; |
69 | Multilingual Relation Classification Via Efficient and Effective Prompting. Highlight: In this paper, we present the first work on prompt-based multilingual relation classification (RC), by introducing an efficient and effective method that constructs prompts from relation triples and involves only minimal translation for the class labels. |
Yuxuan Chen; David Harbecke; Leonhard Hennig; |
70 | Topic-Regularized Authorship Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To handle a large number of unseen authors and topics, we propose Authorship Representation Regularization (ARR), a distillation framework that creates authorship representation with reduced reliance on topic-specific information. |
Jitkapat Sawatphol; Nonthakit Chaiwong; Can Udomcharoenchaikit; Sarana Nutanong; |
71 | Fine-grained Contrastive Learning for Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose fine-grained contrastive learning (FineCL) for RE, which leverages fine-grained information about which silver labels are and are not noisy to improve the quality of learned relationship representations for RE. |
William Hogan; Jiacheng Li; Jingbo Shang; |
72 | Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Succinctly summarizing dialogue is a task of growing interest, but inherent challenges, such as insufficient training data and low information density, impede our ability to train abstractive models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. |
Changqun Li; Linlin Wang; Xin Lin; Gerard de Melo; Liang He; |
73 | Zero-Shot Text Classification with Self-Training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the fact that such models are unfamiliar with the target task can lead to instability and performance issues. We propose a plug-and-play method to bridge this gap using a simple self-training approach, requiring only the class names along with an unlabeled dataset, and without the need for domain expertise or trial and error. |
Ariel Gera; Alon Halfon; Eyal Shnarch; Yotam Perlitz; Liat Ein-Dor; Noam Slonim; |
74 | Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work demonstrates that Legal Judgment Prediction systems without expert-informed adjustments can be vulnerable to shallow, distracting surface signals that arise from corpus construction, case distribution, and confounding factors. |
T.y.s.s Santosh; Shanshan Xu; Oana Ichim; Matthias Grabmair; |
75 | SQuALITY: Building A Long-Document Summarization Dataset The Hard Way Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we turn to a slower but more straightforward approach to developing summarization benchmark data: We hire highly-qualified contractors to read stories and write original summaries from scratch. |
Alex Wang; Richard Yuanzhe Pang; Angelica Chen; Jason Phang; Samuel R. Bowman; |
76 | MetaASSIST: Robust Dialogue State Tracking with Meta Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we propose three schemes with varying degrees of flexibility, ranging from slot-wise to both slot-wise and instance-wise, to convert the weighting parameter into learnable functions. |
Fanghua Ye; Xi Wang; Jie Huang; Shenghui Li; Samuel Stern; Emine Yilmaz; |
77 | Multilingual Machine Translation with Hyper-Adapters Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters: hyper-networks that generate adapters from language and layer embeddings. |
Christos Baziotis; Mikel Artetxe; James Cross; Shruti Bhosale; |
78 | Z-LaVI: Zero-Shot Language Solver Fueled By Visual Imagination Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they generally suffer from reporting bias, the phenomenon describing the lack of explicit commonsense knowledge in written text, e.g., “an orange is orange”. To overcome this limitation, we develop a novel approach, Z-LaVI, to endow language models with visual imagination capabilities. |
Yue Yang; Wenlin Yao; Hongming Zhang; Xiaoyang Wang; Dong Yu; Jianshu Chen; |
79 | Using Commonsense Knowledge to Answer Why-Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: What aspects can be made accessible via external commonsense resources? We study these questions in the context of answering questions in the TellMeWhy dataset using COMET as a source of relevant commonsense relations. |
Yash Kumar Lal; Niket Tandon; Tanvi Aggarwal; Horace Liu; Nathanael Chambers; Raymond Mooney; Niranjan Balasubramanian; |
80 | Affective Idiosyncratic Responses to Music Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Despite consensus that idiosyncratic factors play a key role in regulating how listeners emotionally respond to music, precisely measuring the marginal effects of these variables has proved challenging. To address this gap, we develop computational methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform. |
Sky CH-Wang; Evan Li; Oliver Li; Smaranda Muresan; Zhou Yu; |
81 | Successive Prompting for Decomposing Complex Questions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce a way to generate a synthetic dataset which can be used to bootstrap a model’s ability to decompose and answer intermediate questions. |
Dheeru Dua; Shivanshu Gupta; Sameer Singh; Matt Gardner; |
82 | Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we develop Maieutic Prompting, which aims to infer a correct answer to a question even from the unreliable generations of LM. |
Jaehun Jung; Lianhui Qin; Sean Welleck; Faeze Brahman; Chandra Bhagavatula; Ronan Le Bras; Yejin Choi; |
83 | DANLI: Deliberative Agent for Following Natural Language Instructions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These reactive agents are insufficient for long-horizon complex tasks. To address this limitation, we propose a neuro-symbolic deliberative agent that, while following language instructions, proactively applies reasoning and planning based on its neural and symbolic representations acquired from past experience (e.g., natural language and egocentric vision). |
Yichi Zhang; Jianing Yang; Jiayi Pan; Shane Storks; Nikhil Devraj; Ziqiao Ma; Keunwoo Yu; Yuwei Bao; Joyce Chai; |
84 | Tracing Semantic Variation in Slang Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore these theories using computational models and test them against historical slang dictionary entries, with a focus on characterizing regularity in the geographical variation of slang usages attested in the US and the UK over the past two centuries. |
Zhewei Sun; Yang Xu; |
85 | Fine-grained Category Discovery Under Coarse-grained Supervision with Hierarchical Weighted Self-contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Considering most current methods cannot transfer knowledge from coarse-grained level to fine-grained level, we propose a hierarchical weighted self-contrastive network by building a novel weighted self-contrastive module and combining it with supervised learning in a hierarchical manner. |
Wenbin An; Feng Tian; Ping Chen; Siliang Tang; Qinghua Zheng; QianYing Wang; |
86 | PLM-based World Models for Text-based Games Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: As the core tasks of world models are future prediction and commonsense understanding, our claim is that pre-trained language models (PLMs) already provide a strong base upon which to build world models. |
Minsoo Kim; Yeonjoon Jung; Dohyeon Lee; Seung-won Hwang; |
87 | Prompt-Based Meta-Learning For Few-shot Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Prompt-tuning has recently proved to be another effective few-shot learner by bridging the gap between pre-train and downstream tasks. In this work, we closely combine the two promising few-shot learning methodologies in structure and propose a Prompt-Based Meta-Learning (PBML) model to overcome the above meta-learning problem by adding the prompting mechanism. |
Haoxing Zhang; Xiaofeng Zhang; Haibo Huang; Lei Yu; |
88 | How Well Can Text-to-Image Generative Models Understand Ethical Natural Language Interventions? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce an Ethical NaTural Language Interventions in Text-to-Image GENeration (ENTIGEN) benchmark dataset to evaluate the change in image generations conditional on ethical interventions across three social axes: gender, skin color, and culture. |
Hritik Bansal; Da Yin; Masoud Monajatipoor; Kai-Wei Chang; |
89 | Geographic Citation Gaps in NLP Research Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In the spirit of “what we do not measure, we cannot improve”, this work asks a series of questions on the relationship between geographical location and publication success (acceptance in top NLP venues and citation impact). |
Mukund Rungta; Janvijay Singh; Saif M. Mohammad; Diyi Yang; |
90 | Language Models of Code Are Few-Shot Commonsense Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we show that when we instead frame structured commonsense reasoning tasks as code generation tasks, pre-trained LMs of code are better structured commonsense reasoners than LMs of natural language, even when the downstream task does not involve source code at all.We demonstrate our approach across three diverse structured commonsense reasoning tasks. |
Aman Madaan; Shuyan Zhou; Uri Alon; Yiming Yang; Graham Neubig; |
91 | Numerical Optimizations for Weighted Low-rank Estimation on Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Unlike standard SVD, weighted value decomposition is a non-convex optimization problem that lacks a closed-form solution. We systematically investigated multiple optimization strategies to tackle the problem and examined our method by compressing Transformer-based language models. |
Ting Hua; Yen-Chang Hsu; Felicity Wang; Qian Lou; Yilin Shen; Hongxia Jin; |
92 | Generative Multi-hop Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such a bi-encoder approach has limitations in multi-hop settings; (1) the reformulated query gets longer as the number of hops increases, which further tightens the embedding bottleneck of the query vector, and (2) it is prone to error propagation. In this paper, we focus on alleviating these limitations in multi-hop settings by formulating the problem in a fully generative way. |
Hyunji Lee; Sohee Yang; Hanseok Oh; Minjoon Seo; |
93 | Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Accordingly, we annotate a dataset manually to facilitate the investigation of the newly-introduced task, and then build several benchmark encoder-decoder models by using VL-BART and VL-T5 as backbones. |
Yu Zhao; Jianguo Wei; ZhiChao Lin; Yueheng Sun; Meishan Zhang; Min Zhang; |
94 | M3: A Multi-View Fusion and Multi-Decoding Network for Multi-Document Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel method that tries to employ a multi-view fusion and multi-decoding mechanism to achieve it. |
Liang Wen; Houfeng Wang; Yingwei Luo; Xiaolin Wang; |
95 | COCO-DR: Combating The Distribution Shift in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new zero-shot dense retrieval (ZeroDR) method, COCO-DR, to improve the generalization ability of dense retrieval by combating the distribution shifts between source training tasks and target scenarios. |
Yue Yu; Chenyan Xiong; Si Sun; Chao Zhang; Arnold Overwijk; |
96 | Language Model Pre-Training with Sparse Latent Typing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we manage to push the language models to obtain a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. |
Liliang Ren; Zixuan Zhang; Han Wang; Clare Voss; ChengXiang Zhai; Heng Ji; |
97 | On The Transformation of Latent Space in Fine-Tuned NLP Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the evolution of latent space in fine-tuned NLP models. |
Nadir Durrani; Hassan Sajjad; Fahim Dalvi; Firoj Alam; |
98 | Watch The Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified K-nearest neighbor contrastive learning framework to discover OOD intents. |
Yutao Mou; Keqing He; Pei Wang; Yanan Wu; Jingang Wang; Wei Wu; Weiran Xu; |
99 | Extracted BERT Model Leaks More Information Than You Think! Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work bridges this gap by launching an attribute inference attack against the extracted BERT model. |
Xuanli He; Lingjuan Lyu; Chen Chen; Qiongkai Xu; |
100 | Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We take a first step in closing this gap by creating a new multimodal task targeted at evaluating understanding of predicate-noun dependencies in a controlled setup. |
Mitja Nikolaus; Emmanuelle Salin; Stephane Ayache; Abdellah Fourtassi; Benoit Favre; |
101 | A Multilingual Perspective Towards The Evaluation of Attribution Methods in Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present a multilingual approach for evaluating attribution methods for the Natural Language Inference (NLI) task in terms of faithfulness and plausibility. |
Kerem Zaman; Yonatan Belinkov; |
102 | Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel method for transferring labels from multiple high-resource source to low-resource target languages. |
Ayyoob ImaniGooghari; Silvia Severini; Masoud Jalili Sabet; François Yvon; Hinrich Schütze; |
103 | SubeventWriter: Iterative Sub-event Sequence Generation with Coherence Controller Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new task of sub-event generation for an unseen process to evaluate the understanding of the coherence of sub-event actions and objects. |
Zhaowei Wang; Hongming Zhang; Tianqing Fang; Yangqiu Song; Ginny Wong; Simon See; |
104 | Infinite SCAN: An Infinite Model of Diachronic Semantic Change Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we propose a Bayesian model that can jointly estimate the number of senses of words and their changes through time. The model combines a dynamic topic model on Gaussian Markov random fields with a logistic stick-breaking process that realizes the Dirichlet process. |
Seiichi Inoue; Mamoru Komachi; Toshinobu Ogiso; Hiroya Takamura; Daichi Mochihashi; |
105 | Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study how IT can be improved with unlabeled data. |
Yuxian Gu; Pei Ke; Xiaoyan Zhu; Minlie Huang; |
106 | Counterfactual Data Augmentation Via Perspective Transition for Open-Domain Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a data augmentation method to automatically augment high-quality responses with different semantics by counterfactual inference. |
Jiao Ou; Jinchao Zhang; Yang Feng; Jie Zhou; |
107 | SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here we present SQUIRE, the first Sequence-to-sequence based multi-hop reasoning framework, which utilizes an encoder-decoder Transformer structure to translate the query to a path. |
Yushi Bai; Xin Lv; Juanzi Li; Lei Hou; Yincen Qu; Zelin Dai; Feiyu Xiong; |
108 | SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified-modal speech-unit-text pre-training model, SpeechUT, to connect the representations of a speech encoder and a text decoder with a shared unit encoder. |
Ziqiang Zhang; Long Zhou; Junyi Ao; Shujie Liu; Lirong Dai; Jinyu Li; Furu Wei; |
109 | Learning Label Modular Prompts for Text Classification in The Wild Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, current modular approaches in NLP do not take advantage of recent advances in parameter efficient tuning of pretrained language models. To close this gap, we propose ModularPrompt, a label-modular prompt tuning framework for text classification tasks. |
Hailin Chen; Amrita Saha; Shafiq Joty; Steven C.H. Hoi; |
110 | Unbiased and Efficient Sampling of Dependency Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we show that their fastest algorithm for sampling with replacement, Wilson-RC, is in fact producing biased samples and we provide two alternatives that are unbiased. |
Miloš Stanojević; |
111 | Continual Learning of Neural Machine Translation Within Low Forgetting Risk Regions Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To solve the problem, we propose a two-stage training method based on the local features of the real loss. |
Shuhao Gu; Bojie Hu; Yang Feng; |
112 | COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the trade-off of early exiting, we propose a joint training approach that calibrates slenderization and preserves contributive structures to each exit instead of only the final layer. |
Bowen Shen; Zheng Lin; Yuanxin Liu; Zhengxiao Liu; Lei Wang; Weiping Wang; |
113 | Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a simple enhancement of RE using k nearest neighbors (kNN-RE). |
Zhen Wan; Qianying Liu; Zhuoyuan Mao; Fei Cheng; Sadao Kurohashi; Jiwei Li; |
114 | StoryER: Automatic Story Evaluation Via Ranking, Rating and Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing automatic story evaluation methods place a premium on story lexical level coherence, deviating from human preference. We go beyond this limitation by considering a novel Story Evaluation method that mimics human preference when judging a story, namely StoryER, which consists of three sub-tasks: Ranking, Rating and Reasoning. |
Hong Chen; Duc Vo; Hiroya Takamura; Yusuke Miyao; Hideki Nakayama; |
115 | Enhancing Self-Consistency and Performance of Pre-Trained Language Models Through Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To address this failure mode, we propose a framework, Consistency Correction through Relation Detection, or ConCoRD, for boosting the consistency and accuracy of pre-trained NLP models using pre-trained natural language inference (NLI) models without fine-tuning or re-training. |
Eric Mitchell; Joseph Noh; Siyan Li; Will Armstrong; Ananth Agarwal; Patrick Liu; Chelsea Finn; Christopher Manning; |
116 | Robustness of Demonstration-based Learning Under Limited Data Scenario Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling and show that (1) demonstrations composed of random tokens still make the model a better few-shot learner; (2) the length of random demonstrations and the relevance of random tokens are the main factors affecting the performance; (3) demonstrations increase the confidence of model predictions on captured superficial patterns. |
Hongxin Zhang; Yanzhe Zhang; Ruiyi Zhang; Diyi Yang; |
117 | Modeling Information Change in Science Communication with Semantically Matched Paraphrases Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we present the SCIENTIFIC PARAPHRASE AND INFORMATION CHANGE DATASET (SPICED), the first paraphrase dataset of scientific findings annotated for degree of information change. |
Dustin Wright; Jiaxin Pei; David Jurgens; Isabelle Augenstein; |
118 | Word Order Matters When You Increase Masking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, recent experiments have shown that explicit position encoding is not always useful, since some models without such a feature managed to achieve state-of-the-art performance on some tasks. To understand this phenomenon better, we examine the effect of removing position encodings on the pre-training objective itself (i.e., masked language modelling), to test whether models can reconstruct position information from co-occurrences alone. |
Karim Lasri; Alessandro Lenci; Thierry Poibeau; |
119 | An Empirical Analysis of Memorization in Fine-tuned Autoregressive Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we empirically study memorization of fine-tuning methods using membership inference and extraction attacks, and show that their susceptibility to attacks is very different. |
Fatemehsadat Mireshghallah; Archit Uniyal; Tianhao Wang; David Evans; Taylor Berg-Kirkpatrick; |
120 | Style Transfer As Data Augmentation: A Case Study on Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we take the named entity recognition task in the English language as a case study and explore style transfer as a data augmentation method to increase the size and diversity of training data in low-resource scenarios. |
Shuguang Chen; Leonardo Neves; Thamar Solorio; |
121 | Linguistic Corpus Annotation for Automatic Text Simplification Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose annotations of the ASSET corpus that can be used to shed more light on ATS evaluation. |
Rémi Cardon; Adrien Bibal; Rodrigo Wilkens; David Alfter; Magali Norré; Adeline Müller; Watrin Patrick; Thomas François; |
122 | Semantic Framework Based Query Generation for Temporal Question Answering Over Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on the semantic framework, we propose a temporal question answering method, SF-TQA, which generates query graphs by exploring the relevant facts of mentioned entities, where the exploring process is restricted by SF-TCons. |
Wentao Ding; Hao Chen; Huayu Li; Yuzhong Qu; |
123 | There Is No Standard Answer: Knowledge-Grounded Dialogue Generation with Adversarial Activated Multi-Reference Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we establish a multi-reference KGC dataset and propose a series of metrics to systematically assess the one-to-many efficacy of existing KGC models. |
Xueliang Zhao; Tingchen Fu; Chongyang Tao; Rui Yan; |
124 | Stop Measuring Calibration When Humans Disagree Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recently, calibration to human majority has been measured on tasks where humans inherently disagree about which class applies. We show that measuring calibration to human majority given inherent disagreements is theoretically problematic, demonstrate this empirically on the ChaosNLI dataset, and derive several instance-level measures of calibration that capture key statistical properties of human judgements – including class frequency, ranking and entropy. |
Joris Baan; Wilker Aziz; Barbara Plank; Raquel Fernandez; |
125 | Improving Compositional Generalization for Multi-step Quantitative Reasoning in Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Quantitative reasoning is an important aspect of question answering, especially when numeric and verbal cues interact to indicate sophisticated, multi-step programs. In this paper, we demonstrate how modeling the compositional nature of quantitative text can enhance the performance and robustness of QA models, allowing them to capture arithmetic logic that is expressed verbally. |
Armineh Nourbakhsh; Cathy Jiao; Sameena Shah; Carolyn Rosé; |
126 | A Comprehensive Comparison of Neural Networks As Cognitive Models of Inflection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This debate has gravitated into NLP by way of the question: Are neural networks a feasible account for human behavior in morphological inflection? We address that question by measuring the correlation between human judgments and neural network probabilities for unknown word inflections. |
Adam Wiemerslage; Shiran Dudy; Katharina Kann; |
127 | Can Visual Context Improve Automatic Speech Recognition for An Embodied Agent? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we present a method to incorporate a robot’s visual information into an ASR system and improve the recognition of a spoken utterance containing a visible entity. |
Pradip Pramanick; Chayan Sarkar; |
128 | AfroLID: A Neural Language Identification Tool for African Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Problematically, most of the world’s 7000+ languages today are not covered by LID technologies. We address this pressing issue for Africa by introducing AfroLID, a neural LID toolkit for 517 African languages and varieties. |
Ife Adebara; AbdelRahim Elmadany; Muhammad Abdul-Mageed; Alcides Inciarte; |
129 | EvEntS ReaLM: Event Reasoning of Entity States Via Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates models of event implications. |
Evangelia Spiliopoulou; Artidoro Pagnoni; Yonatan Bisk; Eduard Hovy; |
130 | Large Language Models Are Few-shot Clinical Information Extractors Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that large language models, such as InstructGPT (Ouyang et al., 2022), perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain. |
Monica Agrawal; Stefan Hegselmann; Hunter Lang; Yoon Kim; David Sontag; |
131 | Towards A Unified Multi-Dimensional Evaluator for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified multi-dimensional evaluator UniEval for NLG. |
Ming Zhong; Yang Liu; Da Yin; Yuning Mao; Yizhu Jiao; Pengfei Liu; Chenguang Zhu; Heng Ji; Jiawei Han; |
132 | GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a benchmark dataset, Geo-diverse Commonsense Multilingual Language Models Analysis (GeoMLAMA), for probing the diversity of the relational knowledge in multilingual PLMs. |
Da Yin; Hritik Bansal; Masoud Monajatipoor; Liunian Harold Li; Kai-Wei Chang; |
133 | The (Undesired) Attenuation of Human Biases By Multilinguality Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce and release CA-WEAT, multilingual cultural aware tests to quantify biases, and compare them to previous English-centric tests. |
Cristina España-Bonet; Alberto Barrón-Cedeño;
134 | Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our goal is a question-answering (QA) system that can show how its answers are implied by its own internal beliefs via a systematic chain of reasoning. |
Oyvind Tafjord; Bhavana Dalvi Mishra; Peter Clark; |
135 | Near-Negative Distinction: Giving A Second Life to Human Evaluation Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new and simple automatic evaluation method for NLG called Near-Negative Distinction (NND) that repurposes prior human annotations into NND tests.In an NND test, an NLG model must place a higher likelihood on a high-quality output candidate than on a near-negative candidate with a known error. |
Philippe Laban; Chien-Sheng Wu; Wenhao Liu; Caiming Xiong; |
136 | ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: It is also difficult to collect a large-scale hate speech annotated dataset. In this work, we frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its "constituent" parts. |
Badr AlKhamissi; Faisal Ladhak; Srinivasan Iyer; Veselin Stoyanov; Zornitsa Kozareva; Xian Li; Pascale Fung; Lambert Mathias; Asli Celikyilmaz; Mona Diab; |
137 | Are Hard Examples Also Harder to Explain? A Study with Human and Model-Generated Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study the connection between explainability and sample hardness by investigating the following research question: "Are LLMs and humans equally good at explaining data labels for both easy and hard samples?" |
Swarnadeep Saha; Peter Hase; Nazneen Rajani; Mohit Bansal; |
138 | Stanceosaurus: Classifying Stance Towards Multicultural Misinformation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi and Arabic annotated with stance towards 250 misinformation claims. |
Jonathan Zheng; Ashutosh Baheti; Tarek Naous; Wei Xu; Alan Ritter; |
139 | Gendered Mental Health Stigma in Masked Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate gendered mental health stigma in masked language models. |
Inna Lin; Lucille Njoo; Anjalie Field; Ashish Sharma; Katharina Reinecke; Tim Althoff; Yulia Tsvetkov; |
140 | Efficient Nearest Neighbor Search for Cross-Encoder Models Using Matrix Factorization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. |
Nishant Yadav; Nicholas Monath; Rico Angell; Manzil Zaheer; Andrew McCallum; |
141 | Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a method for arbitrary textual style transfer (TST), the task of transforming a text into any given style, utilizing general-purpose pre-trained language models. |
Mirac Suzgun; Luke Melas-Kyriazi; Dan Jurafsky; |
142 | Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we look at large-scale intermediate pre-training of decomposition-based transformers using distant supervision from comparable texts, particularly large-scale parallel news. |
Ben Zhou; Kyle Richardson; Xiaodong Yu; Dan Roth; |
143 | Why Is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify the dataset's main challenges through a suite of experiments on related tasks (probing task, image retrieval task), data augmentation, and manual inspection of the dataset. |
Anuj Diwan; Layne Berry; Eunsol Choi; David Harwath; Kyle Mahowald; |
144 | Gradient-based Constrained Sampling from Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Large pretrained language models are successful at generating fluent text but are notoriously hard to controllably sample from. In this work, we study constrained sampling from such language models, i.e., generating text that satisfies user-defined constraints, while maintaining fluency and the model's performance in a downstream task. |
Sachin Kumar; Biswajit Paria; Yulia Tsvetkov; |
145 | TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions Over Tabular Data Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, auto-regressive PLMs are challenged by recent emerging numerical reasoning datasets, such as TAT-QA, due to the error-prone implicit calculation. In this paper, we present TaCube, to pre-compute aggregation/arithmetic results for the table in advance, so that they are handy and readily available for PLMs to answer numerical reasoning questions. |
Fan Zhou; Mengkang Hu; Haoyu Dong; Zhoujun Cheng; Fan Cheng; Shi Han; Dongmei Zhang; |
146 | Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we simulate knowledge conflicts (i.e., where parametric knowledge suggests one answer and different passages suggest different answers) and examine model behaviors. |
Hung-Ting Chen; Michael Zhang; Eunsol Choi; |
147 | QA Domain Adaptation Using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel self-supervised framework called QADA for QA domain adaptation. |
Zhenrui Yue; Huimin Zeng; Bernhard Kratzwald; Stefan Feuerriegel; Dong Wang; |
148 | When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel domain specific Financial LANGuage model (FLANG) which uses financial keywords and phrases for better masking, together with span boundary objective and in-filling objective. |
Raj Shah; Kunal Chawla; Dheeraj Eidnani; Agam Shah; Wendi Du; Sudheer Chava; Natraj Raman; Charese Smiley; Jiaao Chen; Diyi Yang; |
149 | Retrieval As Attention: End-to-end Learning of Retrieval and Reading Within A Single Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: These two components are usually modeled separately, which necessitates a cumbersome implementation and is awkward to optimize in an end-to-end fashion. In this paper, we revisit this design and eschew the separate architecture and training in favor of a single Transformer that performs retrieval as attention (RAA), and end-to-end training solely based on supervision from the end QA task. |
Zhengbao Jiang; Luyu Gao; Zhiruo Wang; Jun Araki; Haibo Ding; Jamie Callan; Graham Neubig; |
150 | Reproducibility in Computational Linguistics: Is Source Code Enough? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This work studies trends in source code availability at major computational linguistics conferences, namely, ACL, EMNLP, LREC, NAACL, and COLING. |
Mohammad Arvan; Luís Pina; Natalie Parde;
151 | Generating Information-Seeking Conversations from Unlabeled Documents Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce a novel framework, **SimSeek**, (**Sim**ulating information-**Seek**ing conversation from unlabeled documents), and compare its two variants. |
Gangwoo Kim; Sungdong Kim; Kang Min Yoo; Jaewoo Kang; |
152 | Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thus, in this work, we introduce IKD-MMT, a novel MMT framework to support the image-free inference phase via an inversion knowledge distillation scheme. |
Ru Peng; Yawen Zeng; Jake Zhao; |
153 | A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by recent work on cross-topic authorship identification and content preservation in summarization, we re-evaluate different authorship obfuscation techniques on detection evasion and content preservation. |
Malik Altakrori; Thomas Scialom; Benjamin C. M. Fung; Jackie Chi Kit Cheung; |
154 | SafeText: A Benchmark for Exploring Physical Safety in Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We create the first benchmark dataset, SafeText, comprising real-life scenarios with paired safe and physically unsafe pieces of advice. |
Sharon Levy; Emily Allaway; Melanie Subbiah; Lydia Chilton; Desmond Patton; Kathleen McKeown; William Yang Wang; |
155 | Ground-Truth Labels Matter: A Deeper Look Into Input-Label Demonstrations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Intrigued by this counter-intuitive observation, we re-examine the importance of ground-truth labels in in-context learning. |
Kang Min Yoo; Junyeob Kim; Hyuhng Joon Kim; Hyunsoo Cho; Hwiyeol Jo; Sang-Woo Lee; Sang-goo Lee; Taeuk Kim; |
156 | D4: A Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Based on clinical depression diagnostic criteria ICD-11 and DSM-5, we designed a 3-phase procedure to construct D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat, which simulates the dialogue between doctors and patients during the diagnosis of depression, including diagnosis results and symptom summary given by professional psychiatrists for each conversation. |
Binwei Yao; Chao Shi; Likai Zou; Lingfeng Dai; Mengyue Wu; Lu Chen; Zhen Wang; Kai Yu; |
157 | Exploiting Domain-slot Related Keywords Description for Few-Shot Cross-Domain Dialogue State Tracking Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel framework based on domain-slot related description to tackle the challenge of few-shot cross-domain DST. |
Gao Qixiang; Guanting Dong; Yutao Mou; Liwen Wang; Chen Zeng; Daichi Guo; Mingyang Sun; Weiran Xu; |
158 | CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present CoCoa, an encoder-decoder translation model that converts monolingual Hindi text to Hindi-English code-switched text with both encoder-side and decoder-side interventions to achieve fine-grained controllable generation. |
Sneha Mondal; Ritika .; Shreya Pathak; Preethi Jyothi; Aravindan Raghuveer; |
159 | Towards Climate Awareness in NLP Research Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a remedy, we propose a climate performance model card with the primary purpose of being practically usable with only limited information about experiments and the underlying computer hardware. |
Daniel Hershcovich; Nicolas Webersinke; Mathias Kraus; Julia Bingler; Markus Leippold; |
160 | Navigating Connected Memories with A Task-oriented Dialog System Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose dialogs for connected memories as a powerful tool to empower users to search their media collection through a multi-turn, interactive conversation. |
Satwik Kottur; Seungwhan Moon; Alborz Geramifard; Babak Damavandi; |
161 | Language Model Decomposition: Quantifying The Dependency and Correlation of Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, a theoretical framework for studying their relationships is still missing. In this paper, we fill this gap by investigating the linear dependency between pre-trained LMs. |
Hao Zhang; |
162 | SynGEC: Syntax-Enhanced Grammatical Error Correction with A Tailored GEC-Oriented Parser Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work proposes a syntax-enhanced grammatical error correction (GEC) approach named SynGEC that effectively incorporates dependency syntactic information into the encoder part of GEC models. |
Yue Zhang; Bo Zhang; Zhenghua Li; Zuyi Bao; Chen Li; Min Zhang; |
163 | Varifocal Question Generation for Fact-checking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present Varifocal, a method that generates questions based on different focal points within a given claim, i.e. different spans of the claim and its metadata, such as its source and date. |
Nedjma Ousidhoum; Zhangdie Yuan; Andreas Vlachos; |
164 | Bilingual Lexicon Induction for Low-Resource Languages Using Graph Matching Via Optimal Transport Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we improve bilingual lexicon induction performance across 40 language pairs with a graph-matching method based on optimal transport. |
Kelly Marchisio; Ali Saad-Eldin; Kevin Duh; Carey Priebe; Philipp Koehn; |
165 | Whose Language Counts As High Quality? Measuring Language Ideologies in Text Data Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Using a new dataset of U.S. high school newspaper articles—written by students from across the country—we investigate whose language is preferred by the quality filter used for GPT-3. |
Suchin Gururangan; Dallas Card; Sarah Dreier; Emily Gade; Leroy Wang; Zeyu Wang; Luke Zettlemoyer; Noah A. Smith; |
166 | ConReader: Exploring Implicit Relations in Contracts for Contract Clause Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study automatic Contract Clause Extraction (CCE) by modeling implicit relations in legal contracts. |
Weiwen Xu; Yang Deng; Wenqiang Lei; Wenlong Zhao; Tat-Seng Chua; Wai Lam; |
167 | Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current approaches for Natural Language Understanding (NLU) tasks use CL to improve in-distribution data performance often via heuristic-oriented or task-agnostic difficulties. In this work, instead, we employ CL for NLU by taking advantage of training dynamics as difficulty metrics, i.e., statistics that measure the behavior of the model at hand on specific task-data instances during training and propose modifications of existing CL schedulers based on these statistics. |
Fenia Christopoulou; Gerasimos Lampouras; Ignacio Iacobacci; |
168 | Revisiting Parameter-Efficient Tuning: Are We Really There Yet? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: By tuning just a fraction of the parameters compared to full model finetuning, PETuning methods claim to have achieved performance on par with or even better than finetuning. In this work, we take a step back and re-examine these PETuning methods by conducting the first comprehensive investigation into their training and evaluation. |
Guanzheng Chen; Fangyu Liu; Zaiqiao Meng; Shangsong Liang; |
169 | Transfer Learning from Semantic Role Labeling to Event Argument Extraction with Template-based Slot Querying Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we investigate transfer learning from semantic role labeling (SRL) to event argument extraction (EAE), considering their similar argument structures. |
Zhisong Zhang; Emma Strubell; Eduard Hovy; |
170 | Calibrating Zero-shot Cross-lingual (Un-)structured Predictions Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study different post-training calibration methods in structured and unstructured prediction tasks. |
Zhengping Jiang; Anqi Liu; Benjamin Van Durme; |
171 | PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper introduces a simple yet effective pre-training paradigm, equipped with a knowledge-enhanced decoder that predicts the next entity token with noises in the prefix, explicitly strengthening the representation learning of entities that span over multiple input tokens. |
Song Xu; Haoran Li; Peng Yuan; Youzheng Wu; Xiaodong He; |
172 | How Far Are We from Robust Long Abstractive Summarization? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Abstractive summarization has made tremendous progress in recent years. In this work, we perform fine-grained human annotations to evaluate long document abstractive summarization systems (i.e., models and metrics) with the aim of implementing them to generate reliable summaries. |
Huan Yee Koh; Jiaxin Ju; He Zhang; Ming Liu; Shirui Pan; |
173 | Measuring Context-Word Biases in Lexical Semantic Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We question this assumption by presenting the first quantitative analysis on the context-word interaction being tested in major contextual lexical semantic tasks. To achieve this, we run probing baselines on masked input, and propose measures to calculate and visualize the degree of context or word biases in existing datasets. |
Qianchu Liu; Diana McCarthy; Anna Korhonen; |
174 | Iteratively Prompt Pre-trained Language Models for Chain of Thought Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore an iterative prompting framework, a new prompting paradigm which progressively elicits relevant knowledge from PLMs for multi-step inference. |
Boshi Wang; Xiang Deng; Huan Sun; |
175 | Unobserved Local Structures Make Compositional Generalization Hard Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the factors that make generalization to certain test instances challenging. |
Ben Bogin; Shivanshu Gupta; Jonathan Berant; |
176 | Mitigating Data Sparsity for Short Text Topic Modeling By Topic-Semantic Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To better address data sparsity, in this paper we propose a novel short text topic modeling framework, Topic-Semantic Contrastive Topic Model (TSCTM). |
Xiaobao Wu; Anh Tuan Luu; Xinshuai Dong; |
177 | Back to The Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Recent studies of dialogue modeling commonly employ pre-trained language models (PrLMs) to encode the dialogue history as successive tokens, which is insufficient in capturing the temporal characteristics of dialogues. Therefore, we propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder, which explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks. |
Yiyang Li; Hai Zhao; Zhuosheng Zhang; |
178 | Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Similarly, the explanations may tell us when the model might know and when it does not. Inspired by this, we propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions. |
Dongfang Li; Baotian Hu; Qingcai Chen; |
179 | Non-Autoregressive Neural Machine Translation: A Call for Clarity Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we take a step back and revisit several techniques that have been proposed for improving non-autoregressive translation models and compare their combined translation quality and speed implications under third-party testing environments. |
Robin Schmidt; Telmo Pires; Stephan Peitz; Jonas Lööf;
180 | RED-ACE: Robust Error Detection for ASR Using Confidence Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Specifically, we add an ASR Confidence Embedding (ACE) layer to the AED model’s encoder, allowing us to jointly encode the confidence scores and the transcribed text into a contextualized representation. |
Zorik Gekhman; Dina Zverinski; Jonathan Mallinson; Genady Beryozkin; |
181 | Fast-R2D2: A Pretrained Recursive Neural Network Based on Pruned CKY for Grammar Induction and Text Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, its rule-based pruning process suffers from local optima and slow inference. In this paper, we propose a unified R2D2 method that overcomes these issues. |
Xiang Hu; Haitao Mi; Liang Li; Gerard de Melo; |
182 | A Localized Geometric Method to Match Knowledge in Low-dimensional Hyperbolic Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a localized geometric method to find equivalent entities in hyperbolic space. |
Bo Hui; Tian Xia; Wei-Shinn Ku; |
183 | Memory-assisted Prompt Editing to Improve GPT-3 After Deployment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Large LMs such as GPT-3 are powerful, but can commit mistakes that are obvious to humans. For example, GPT-3 would mistakenly interpret "What word is similar to good?" to mean a homophone, while the user intended a synonym. Our goal is to effectively correct such errors via user interactions with the system but without retraining, which will be prohibitively costly. |
Aman Madaan; Niket Tandon; Peter Clark; Yiming Yang; |
184 | LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we first propose the Multilingual MMT task by establishing two new Multilingual MMT benchmark datasets covering seven languages. Then, an effective baseline LVP-M3 using visual prompts is proposed to support translations between different languages, which includes three stages (token encoding, language-aware visual prompt generation, and language translation). |
Hongcheng Guo; Jiaheng Liu; Haoyang Huang; Jian Yang; Zhoujun Li; Dongdong Zhang; Zheng Cui; |
185 | PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to formulate EHRs generation as a text-to-text translation task by language models (LMs), which suffices to highly flexible event imputation during generation. |
Zifeng Wang; Jimeng Sun; |
186 | ROSE: Robust Selective Fine-tuning for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they are still limited due to redundant attack search spaces and the inability to defend against various types of attacks. In this work, we present a novel fine-tuning approach called RObust SElective fine-tuning (ROSE) to address this issue. |
Lan Jiang; Hao Zhou; Yankai Lin; Peng Li; Jie Zhou; Rui Jiang; |
187 | CodeRetriever: A Large Scale Contrastive Pre-Training Method for Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the CodeRetriever model, which learns the function-level code semantic representations through large-scale code-text contrastive pre-training. |
Xiaonan Li; Yeyun Gong; Yelong Shen; Xipeng Qiu; Hang Zhang; Bolun Yao; Weizhen Qi; Daxin Jiang; Weizhu Chen; Nan Duan; |
188 | Open-Topic False Information Detection on Social Networks with Contrastive Adversarial Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this open-topic scenario, we empirically find that the existing models suffer from impairment in the detection performance for seen or unseen topic data, resulting in poor overall model performance. To address this issue, we propose a novel Contrastive Adversarial Learning Network, CALN, that employs an unsupervised topic clustering method to capture topic-specific features to enhance the model's performance for seen topics and an unsupervised adversarial learning method to align data representation distributions to enhance the model's generalisation to unseen topics. |
Guanghui Ma; Chunming Hu; Ling Ge; Hong Zhang; |
189 | Mitigating Inconsistencies in Multimodal Sentiment Analysis Under Uncertain Missing Modalities Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the issue, we propose an Ensemble-based Missing Modality Reconstruction (EMMR) network to detect and recover semantic features of the key missing modality. |
Jiandian Zeng; Jiantao Zhou; Tianyi Liu; |
190 | ConvTrans: Transforming Web Search Sessions for Conversational Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present ConvTrans, a data augmentation method that can automatically transform easily-accessible web search sessions into conversational search sessions to fundamentally alleviate the data scarcity problem for conversational dense retrieval. |
Kelong Mao; Zhicheng Dou; Hongjin Qian; Fengran Mo; Xiaohua Cheng; Zhao Cao; |
191 | MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service. |
Xiangyu Xi; Jianwei Lv; Shuaipeng Liu; Wei Ye; Fan Yang; Guanglu Wan; |
192 | Reproducibility Issues for BERT-based Evaluation Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we ask whether results and claims from four recent BERT-based metrics can be reproduced. |
Yanran Chen; Jonas Belouadi; Steffen Eger; |
193 | Improving Multi-task Stance Detection with Multi-task Interaction Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they neglect to explore capturing the fine-grained task-specific interaction between stance detection and sentiment tasks, thus degrading performance. To address this issue, this paper proposes a novel multi-task interaction network (MTIN) for improving the performance of stance detection and sentiment analysis tasks simultaneously. |
Heyan Chai; Siyu Tang; Jinhao Cui; Ye Ding; Binxing Fang; Qing Liao; |
194 | Neural-based Mixture Probabilistic Query Embedding for Answering FOL Queries on Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a Neural-based Mixture Probabilistic Query Embedding Model (NMP-QEM) that encodes the answer set of each mini-query as a mixed Gaussian distribution with multiple means and covariance parameters, which can approximate any random distribution arbitrarily well in real KGs. |
Xiao Long; Liansheng Zhuang; Li Aodi; Shafei Wang; Houqiang Li; |
195 | Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In comparison, multi-turn ES conversation systems can provide ES more effectively, but face several new technical challenges, including: (1) how to adopt appropriate support strategies to achieve the long-term dialogue goal of comforting the user's emotion; (2) how to dynamically model the user's state. In this paper, we propose a novel system MultiESC to address these issues. |
Yi Cheng; Wenge Liu; Wenjie Li; Jiashuo Wang; Ruihui Zhao; Bang Liu; Xiaodan Liang; Yefeng Zheng; |
196 | Conformal Predictor for Improving Zero-Shot Text Classification Efficiency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we improve the efficiency of such cross-encoder-based 0shot models by restricting the number of likely labels using another fast base classifier-based conformal predictor (CP) calibrated on samples labeled by the 0shot model. |
Prafulla Kumar Choubey; Yu Bai; Chien-Sheng Wu; Wenhao Liu; Nazneen Rajani; |
197 | Effective and Efficient Query-aware Snippet Extraction for Web Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose an effective query-aware webpage snippet extraction method named DeepQSE. |
Jingwei Yi; Fangzhao Wu; Chuhan Wu; Xiaolong Huang; Binxing Jiao; Guangzhong Sun; Xing Xie; |
198 | You Only Need One Model for Open-domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This allows us to use a single question answering model trained end-to-end, which is a more efficient use of model capacity and also leads to better gradient flow. We present a pre-training method to effectively train this architecture and evaluate our model on the Natural Questions and TriviaQA open datasets. |
Haejun Lee; Akhil Kedia; Jongwon Lee; Ashwin Paranjape; Christopher Manning; Kyoung-Gu Woo; |
199 | Generative Entity Typing with Curriculum Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The traditional classification-based entity typing paradigm has two unignorable drawbacks: 1) it fails to assign an entity to the types beyond the predefined type set, and 2) it can hardly handle few-shot and zero-shot situations where many long-tail types only have few or even no training instances. To overcome these drawbacks, we propose a novel generative entity typing (GET) paradigm: given a text with an entity mention, the multiple types for the role that the entity plays in the text are generated with a pre-trained language model (PLM). |
Siyu Yuan; Deqing Yang; Jiaqing Liang; Zhixu Li; Jinxi Liu; Jingyue Huang; Yanghua Xiao; |
200 | SetGNER: General Named Entity Recognition As Entity Set Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Motivated by the observation that the target output of NER is essentially a set of sequences, we propose a novel entity set generation framework for general NER scenes in this paper. |
Yuxin He; Buzhou Tang; |
201 | Opinion Summarization By Weak-Supervision from Mix-structured Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we convert each review into a mix of structured and unstructured data, which we call opinion-aspect pairs (OAs) and implicit sentences (ISs). |
Yizhu Liu; Qi Jia; Kenny Zhu; |
202 | Multi-level Distillation of Semantic Knowledge for Pre-training Multilingual Language Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose Multi-level Multilingual Knowledge Distillation (MMKD), a novel method for improving multilingual language models. |
Mingqi Li; Fei Ding; Dan Zhang; Long Cheng; Hongxin Hu; Feng Luo; |
203 | Empowering Dual-Encoder with Query Generator for Cross-Lingual Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose to use a query generator as the teacher in the cross-lingual setting, which is less dependent on enough training samples and high-quality negative samples. |
Houxing Ren; Linjun Shou; Ning Wu; Ming Gong; Daxin Jiang; |
204 | R2F: A General Retrieval, Reading and Fusion Framework for Document-level Natural Language Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we establish a general solution, named Retrieval, Reading and Fusion (R2F) framework, and a new setting, by analyzing the main challenges of DOCNLI: interpretability, long-range dependency, and cross-sentence inference. |
Hao Wang; Yixin Cao; Yangguang Li; Zhen Huang; Kun Wang; Jing Shao; |
205 | Revisiting Pre-trained Language Models and Their Evaluation for Arabic Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they are still limited due to redundant attack search spaces and the inability to defend against various types of attacks. In this work, we present a novel fine-tuning approach called RObust SElective fine-tuning (ROSE) to address this issue. |
Abbas Ghaddar; Yimeng Wu; Sunyam Bagga; Ahmad Rashid; Khalil Bibi; Mehdi Rezagholizadeh; Chao Xing; Yasheng Wang; Xinyu Duan; Zhefeng Wang; Baoxing Huai; Xin Jiang; Qun Liu; Phillippe Langlais; |
206 | KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, most existing approaches for MRC may perform poorly in the few-shot learning scenario. To solve this issue, we propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP). |
Jianing Wang; Chengyu Wang; Minghui Qiu; Qiuhui Shi; Hongbin Wang; Jun Huang; Ming Gao; |
207 | Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Then we design multiple continuous prompts rules and transform the knowledge sub-graph into natural language prompts. To further leverage the factual knowledge from these prompts, we propose two novel knowledge-aware self-supervised tasks including prompt relevance inspection and masked prompt modeling. |
Jianing Wang; Wenkang Huang; Minghui Qiu; Qiuhui Shi; Hongbin Wang; Xiang Li; Ming Gao; |
208 | On The Evaluation Metrics for Paraphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) Reference-free metrics achieve better performance than their reference-based counterparts. |
Lingfeng Shen; Lemao Liu; Haiyun Jiang; Shuming Shi; |
209 | Curriculum Learning Meets Weakly Supervised Multimodal Correlation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we inject curriculum learning into weakly supervised multimodal correlation learning. |
Sijie Mai; Ya Sun; Haifeng Hu; |
210 | Rethinking Positional Encoding in Tree Transformer for Code Representation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose a novel tree Transformer encoding node positions based on our new description method for tree structures. |
Han Peng; Ge Li; Yunfei Zhao; Zhi Jin; |
211 | RASAT: Integrating Relational Structures Into Pretrained Seq2Seq Model for Text-to-SQL Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, introducing these structural relations comes with prices: they often result in a specialized model structure, which largely prohibits using large pretrained models in text-to-SQL. To address this problem, we propose RASAT: a Transformer seq2seq architecture augmented with relation-aware self-attention that could leverage a variety of relational structures while inheriting the pretrained parameters from the T5 model effectively. |
Jiexing Qi; Jingyao Tang; Ziwei He; Xiangpeng Wan; Yu Cheng; Chenghu Zhou; Xinbing Wang; Quanshi Zhang; Zhouhan Lin; |
212 | COM-MRC: A COntext-Masked Machine Reading Comprehension Framework for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel COntext-Masked MRC (COM-MRC) framework for ASTE. |
Zepeng Zhai; Hao Chen; Fangxiang Feng; Ruifan Li; Xiaojie Wang; |
213 | CEM: Machine-Human Chatting Handoff Via Causal-Enhance Module Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These variables are significantly associated with handoff decisions, resulting in prediction bias and increased cost. Therefore, we propose the Causal-Enhance Module (CEM) by establishing the causal graph of MHCH based on these two variables, which is a simple yet effective module and can be easily plugged into existing MHCH methods. |
Shanshan Zhong; Jinghui Qin; Zhongzhan Huang; Daifeng Li; |
214 | Nearest Neighbor Zero-Shot Inference Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Retrieval-augmented language models (LMs) use non-parametric memory to substantially outperform their non-retrieval counterparts on perplexity-based evaluations, but it is an open question whether they achieve similar gains in few- and zero-shot end-task accuracy. We extensively study one such model, the k-nearest neighbor LM (kNN-LM), showing that the gains marginally transfer. |
Weijia Shi; Julian Michael; Suchin Gururangan; Luke Zettlemoyer; |
215 | Robots-Dont-Cry: Understanding Falsely Anthropomorphic Utterances in Dialog Systems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We collect human ratings on the feasibility of approximately 900 two-turn dialogs sampled from 9 diverse data sources. |
David Gros; Yu Li; Zhou Yu; |
216 | A Joint Learning Framework for Restaurant Survival Prediction and Explanation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we tackle the practical problem of restaurant survival prediction. |
Xin Li; Xiaojie Zhang; Peng JiaHao; Rui Mao; Mingyang Zhou; Xing Xie; Hao Liao; |
217 | Making Pretrained Language Models Good Long-tailed Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This motivates us to check the hypothesis that prompt-tuning is also a promising choice for long-tailed classification, since the tail classes are intuitively few-shot ones. To achieve this aim, we conduct empirical studies to examine the hypothesis. |
Chen Zhang; Lei Ren; Jingang Wang; Wei Wu; Dawei Song; |
218 | UniGeo: Unifying Geometry Logical Reasoning Via Reformulating Mathematical Expression Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, in essence, these two tasks have similar problem representations and overlapped math knowledge which can improve the understanding and reasoning ability of a deep model on both two tasks. Therefore, we construct a large-scale Unified Geometry problem benchmark, UniGeo, which contains 4,998 calculation problems and 9,543 proving problems. |
Jiaqi Chen; Tong Li; Jinghui Qin; Pan Lu; Liang Lin; Chongyu Chen; Xiaodan Liang; |
219 | Face-Sensitive Image-to-Emotional-Text Cross-modal Translation for Multimodal Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a face-sensitive image-to-emotional-text translation (FITE) method, which focuses on capturing visual sentiment cues through facial expressions and selectively matching and fusing with the target aspect in textual modality. |
Hao Yang; Yanyan Zhao; Bing Qin; |
220 | FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we are motivated to propose a multi-dimensional dialogue-level metric, which consists of three sub-metrics with each targeting a specific dimension. |
Chen Zhang; Luis Fernando D'Haro; Qiquan Zhang; Thomas Friedrichs; Haizhou Li; |
221 | Sentence Representation Learning with Generative Objective Rather Than Contrastive Objective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We instead propose a novel generative self-supervised learning objective based on phrase reconstruction. |
Bohong Wu; Hai Zhao; |
222 | RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL). |
Mingkai Deng; Jianyu Wang; Cheng-Ping Hsieh; Yihan Wang; Han Guo; Tianmin Shu; Meng Song; Eric Xing; Zhiting Hu; |
223 | DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a new CTG approach, namely DisCup, which incorporates the attribute knowledge of discriminator to optimize the control-prompts, steering a frozen CLM to produce attribute-specific texts. |
Hanqing Zhang; Dawei Song; |
224 | CPL: Counterfactual Prompt Learning for Vision and Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Towards non-spurious and efficient prompt learning from limited examples, this paper presents a novel Counterfactual Prompt Learning (CPL) method for vision and language models, which simultaneously employs counterfactual generation and contrastive learning in a joint optimization framework. |
Xuehai He; Diji Yang; Weixi Feng; Tsu-Jui Fu; Arjun Akula; Varun Jampani; Pradyumna Narayana; Sugato Basu; William Yang Wang; Xin Wang; |
225 | Red Teaming Language Models with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases (“red teaming”) using another LM. |
Ethan Perez; Saffron Huang; Francis Song; Trevor Cai; Roman Ring; John Aslanides; Amelia Glaese; Nat McAleese; Geoffrey Irving; |
226 | CapOnImage: Context-driven Dense-Captioning on Image Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we introduce a new task called captioning on image (CapOnImage), which aims to generate dense captions at different locations of the image based on contextual information. |
Yiqi Gao; Xinglin Hou; Yuanmeng Zhang; Tiezheng Ge; Yuning Jiang; Peng Wang; |
227 | SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach, including span extraction and mention classification. |
Jianing Wang; Chengyu Wang; Chuanqi Tan; Minghui Qiu; Songfang Huang; Jun Huang; Ming Gao; |
228 | Discovering Differences in The Representation of People Using Contextualized Semantic Axes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In particular, past work has compared embeddings against “semantic axes” that represent two opposing concepts. We extend this paradigm to BERT embeddings, and construct contextualized axes that mitigate the pitfall where antonyms have neighboring representations. |
Li Lucy; Divya Tadimeti; David Bamman; |
229 | Generating Literal and Implied Subquestions to Fact-check Complex Claims Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on decomposing a complex claim into a comprehensive set of yes-no subquestions whose answers influence the veracity of the claim. |
Jifan Chen; Aniruddh Sriram; Eunsol Choi; Greg Durrett; |
230 | Machine Translation Robustness to Natural Asemantic Variation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: An important yet under-studied category involves minor variations in nuance (non-typos) that preserve meaning w.r.t. the target language. We introduce and formalize this category as Natural Asemantic Variation (NAV) and investigate it in the context of MT robustness. |
Jacob Bremerman; Xiang Ren; Jonathan May; |
231 | Natural Language to Code Translation with Execution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce execution result-based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks. |
Freda Shi; Daniel Fried; Marjan Ghazvininejad; Luke Zettlemoyer; Sida I. Wang; |
232 | Life Is A Circus and We Are The Clowns: Automatically Finding Analogies Between Situations and Processes Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Our goal is to automatically extract entities and their relations from the text and find a mapping between the different domains based on relational similarity (e.g., blood is mapped to water). |
Oren Sultan; Dafna Shahaf; |
233 | Language Contamination Helps Explain The Cross-lingual Capabilities of English Pretrained Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: These models are generally presented as being trained only on English text but have been found to transfer surprisingly well to other languages. We investigate this phenomenon and find that common English pretraining corpora actually contain significant amounts of non-English text: even when less than 1% of data is not English (well within the error rate of strong language classifiers), this leads to hundreds of millions of foreign language tokens in large-scale datasets. |
Terra Blevins; Luke Zettlemoyer; |
234 | Analyzing The Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, because these analyses have focused on fully trained multilingual models, little is known about the dynamics of the multilingual pretraining process. We investigate when these models acquire their in-language and cross-lingual abilities by probing checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks. |
Terra Blevins; Hila Gonen; Luke Zettlemoyer; |
235 | Neural Machine Translation with Contrastive Translation Memories Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Different from previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence while individually contrastive to each other, providing maximal information gain in three phases. |
Xin Cheng; Shen Gao; Lemao Liu; Dongyan Zhao; Rui Yan; |
236 | Distilling Causal Effect from Miscellaneous Other-Class for Continual Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Thanks to the causal inference, we identify that the forgetting is caused by the missing causal effect from the old data. To this end, we propose a unified causal framework to retrieve the causality from both new entity types and Other-Class. |
Junhao Zheng; Zhanxian Liang; Haibin Chen; Qianli Ma; |
237 | Exploring The Secrets Behind The Learning Difficulty of Meaning Representations for Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a data-aware metric called ISS (denoting incremental structural stability) of MRs, and demonstrate that ISS is highly correlated with the final performance. |
Zhenwen Li; Jiaqi Guo; Qian Liu; Jian-Guang Lou; Tao Xie; |
238 | That's The Wrong Lung! Evaluating and Improving The Interpretability of Unsupervised Multimodal Encoders for Medical Data Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. |
Jered McInerney; Geoffrey Young; Jan-Willem van de Meent; Byron Wallace; |
239 | Unsupervised Tokenization Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In the presented study, we discover that the so-called “transition freedom” metric appears superior for unsupervised tokenization purposes in comparison to statistical metrics such as mutual information and conditional probability, providing F-measure scores in the range from 0.71 to 1.0 across explored multilingual corpora. |
Anton Kolonin; Vignav Ramesh; |
240 | A Template-based Method for Constrained Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a template-based method that can yield results with high translation quality and match accuracy, and the inference speed of our method is comparable with unconstrained NMT models. |
Shuo Wang; Peng Li; Zhixing Tan; Zhaopeng Tu; Maosong Sun; Yang Liu; |
241 | PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present PATS (Perturbation According To Sensitivity), a noisy training mechanism which considers each parameter's importance in the downstream task to help fine-tune PLMs. |
Yupeng Zhang; Hongzhi Zhang; Sirui Wang; Wei Wu; Zhoujun Li; |
242 | Towards Reinterpreting Neural Topic Models Via Composite Activations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a model-free two-stage process to reinterpret NTM and derive further insights on the state of the trained model. |
Jia Peng Lim; Hady Lauw; |
243 | Few-shot Query-Focused Summarization with Prefix-Merging Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate whether we can integrate and transfer the knowledge of text summarization and question answering to assist few-shot learning in query-focused summarization. |
Ruifeng Yuan; Zili Wang; Ziqiang Cao; Wenjie Li; |
244 | Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, we find that the existing approaches capture few interactions between the input sentence pairs, which degrades the word alignment quality severely, especially for the ambiguous words in the monolingual context. To remedy this problem, we propose Cross-Align to model deep interactions between the input sentence pairs, in which the source and target sentences are encoded separately with the shared self-attention modules in the shallow layers, while cross-lingual interactions are explicitly constructed by the cross-attention modules in the upper layers. |
Siyu Lai; Zhen Yang; Fandong Meng; Yufeng Chen; Jinan Xu; Jie Zhou; |
245 | BERTScore Is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To that end, this work presents the first systematic study on the social bias in PLM-based metrics. |
Tianxiang Sun; Junliang He; Xipeng Qiu; Xuanjing Huang; |
246 | HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To bridge the gap, in this paper, we propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label MLM perspective. |
Zihan Wang; Peiyi Wang; Tianyu Liu; Binghuai Lin; Yunbo Cao; Zhifang Sui; Houfeng Wang; |
247 | Not to Overfit or Underfit The Source Domains? An Empirical Study of Domain Generalization in Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Here we examine the contrasting view that multi-source domain generalization (DG) is first and foremost a problem of mitigating source domain underfitting: models not adequately learning the signal already present in their multi-domain training data. |
Md Arafat Sultan; Avi Sil; Radu Florian; |
248 | Neural Theory-of-Mind? On The Limits of Social Intelligence in Large LMs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. |
Maarten Sap; Ronan Le Bras; Daniel Fried; Yejin Choi; |
249 | Improving Passage Retrieval with Zero-Shot Question Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. |
Devendra Sachan; Mike Lewis; Mandar Joshi; Armen Aghajanyan; Wen-tau Yih; Joelle Pineau; Luke Zettlemoyer; |
250 | Summarizing Community-based Question-Answer Pairs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To help users quickly digest the key information, we propose the novel CQA summarization task that aims to create a concise summary from CQA pairs. |
Ting-Yao Hsu; Yoshi Suhara; Xiaolan Wang; |
251 | Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Unlike prior work, we show that improved interpretability can be achieved without decreasing the predictive accuracy. |
Joe Stacey; Pasquale Minervini; Haim Dubossarsky; Marek Rei; |
252 | How to Disagree Well: Investigating The Dispute Tactics Used on Wikipedia Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Disagreements are frequently studied from the perspective of either detecting toxicity or analysing argument structure. We propose a framework of dispute tactics which unifies these two perspectives, as well as other dialogue acts which play a role in resolving disputes, such as asking questions and providing clarification. |
Christine De Kock; Andreas Vlachos; |
253 | Chapter Ordering in Novels Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Understanding narrative flow and text coherence in long-form documents (novels) remains an open problem in NLP. To gain insight, we explore the task of chapter ordering, reconstructing the original order of chapters in a novel given a random permutation of the text. |
Allen Kim; Steve Skiena; |
254 | Open-ended Knowledge Tracing for Computer Science Education Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We develop an initial solution to the OKT problem, a student knowledge-guided code generation approach, that combines program synthesis methods using language models with student knowledge tracing methods. |
Naiming Liu; Zichao Wang; Richard Baraniuk; Andrew Lan; |
255 | Logical Neural Networks for Knowledge Base Completion with Embeddings & Rules Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose to utilize logical neural networks (LNN), a powerful neuro-symbolic AI framework that can express both kinds of rules and learn these end-to-end using gradient-based optimization. |
Prithviraj Sen; Breno William Carvalho; Ibrahim Abdelaziz; Pavan Kapanipathi; Salim Roukos; Alexander Gray; |
256 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we decouple images and texts for multimodal contrastive learning, thus scaling the usable training data in a combinatorial magnitude with low cost. |
Zifeng Wang; Zhenbang Wu; Dinesh Agarwal; Jimeng Sun; |
257 | GA-SAM: Gradient-Strength Based Adaptive Sharpness-Aware Minimization for Improved Generalization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, it has some difficulty applying SAM to some natural language tasks, especially to models with drastic gradient changes, such as RNNs. In this work, we analyze the relation between the flatness of the local minimum and its generalization ability from a novel and straightforward theoretical perspective. |
Zhiyuan Zhang; Ruixuan Luo; Qi Su; Xu Sun; |
258 | Sparse Teachers Can Be Dense with Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To remove the parameters that result in student-unfriendliness, we propose a sparse teacher trick under the guidance of an overall knowledgeable score for each teacher parameter. |
Yi Yang; Chen Zhang; Dawei Song; |
259 | BBTv2: Towards A Gradient-Free Future with Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present BBTv2, an improved version of Black-Box Tuning, to drive PTMs for few-shot learning. |
Tianxiang Sun; Zhengfu He; Hong Qian; Yunhua Zhou; Xuanjing Huang; Xipeng Qiu; |
260 | Passage-Mask: A Learnable Regularization Strategy for Retriever-Reader Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Retriever-reader models achieve competitive performance across many different NLP tasks such as open question answering and dialogue conversations. In this work, we notice these models easily overfit the top-rank retrieval passages and standard training fails to reason over the entire retrieval passages. |
Shujian Zhang; Chengyue Gong; Xingchao Liu; |
261 | Mixed-effects Transformers for Hierarchical Adaptation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the mixed-effects transformer (MET), a novel approach for learning hierarchically-structured prefixes (lightweight modules prepended to an input sequence) to account for structured variation in language use. |
Julia White; Noah Goodman; Robert Hawkins; |
262 | On Measuring The Intrinsic Few-Shot Hardness of Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We consider an extensive set of recent few-shot learning methods and show that their performance across a large number of datasets is highly correlated, showing that few-shot hardness may be intrinsic to datasets, for a given pre-trained model. |
Xinran Zhao; Shikhar Murty; Christopher Manning; |
263 | Group Is Better Than Individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Therefore, in this paper, we first construct a Heterogeneous Label Graph (HLG) containing two kinds of topologies: (1) statistical dependencies based on labels’ co-occurrence patterns and hierarchies in slot labels; (2) rich relations among the label nodes. Then we propose a novel model termed ReLa-Net, which can capture beneficial correlations among the labels from HLG. |
Bowen Xing; Ivor Tsang; |
264 | An Empirical Study on Finding Spans Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present an empirical study on methods for span finding, the selection of consecutive tokens in text for some downstream tasks. |
Weiwei Gu; Boyuan Zheng; Yunmo Chen; Tongfei Chen; Benjamin Van Durme; |
265 | MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features. To deal with these issues, we propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time. |
Zilong Wang; Jiuxiang Gu; Chris Tensmeyer; Nikolaos Barmpalios; Ani Nenkova; Tong Sun; Jingbo Shang; Vlad Morariu; |
266 | Understanding Jargon: Combining Extraction and Generation for Definition Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to combine extraction and generation for jargon definition modeling: first extract self- and correlative definitional information of target jargon from the Web and then generate the final definitions by incorporating the extracted definitional information. |
Jie Huang; Hanyin Shao; Kevin Chen-Chuan Chang; Jinjun Xiong; Wen-mei Hwu; |
267 | ProsocialDialog: A Prosocial Backbone for Conversational Agents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms. |
Hyunwoo Kim; Youngjae Yu; Liwei Jiang; Ximing Lu; Daniel Khashabi; Gunhee Kim; Yejin Choi; Maarten Sap; |
268 | Exploiting Global and Local Hierarchies for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To exploit global and local hierarchies, we propose Hierarchy-guided BERT with Global and Local hierarchies (HBGL), which utilizes the large-scale parameters and prior language knowledge of BERT to model both global and local hierarchies. |
Ting Jiang; Deqing Wang; Leilei Sun; Zhongzhi Chen; Fuzhen Zhuang; Qinghong Yang; |
269 | Semantic-aware Contrastive Learning for More Accurate Semantic Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations and take the overall sequence-level semantic into consideration. |
Shan Wu; Chunlei Xin; Bo Chen; Xianpei Han; Le Sun; |
270 | Scientific Paper Extractive Summarization Enhanced By Citation Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings. |
Xiuying Chen; Mingzhe Li; Shen Gao; Rui Yan; Xin Gao; Xiangliang Zhang; |
271 | Hardness-guided Domain Adaptation to Recognise Biomedical Named Entities Under Low-resource Scenarios Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present a simple yet effective hardness-guided domain adaptation framework for bioNER tasks that can effectively leverage the domain hardness information to improve the adaptability of the learnt model in the low-resource scenarios. |
Ngoc Dang Nguyen; Lan Du; Wray Buntine; Changyou Chen; Richard Beare; |
272 | Syntactic Multi-view Learning for Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we model both constituency and dependency trees into word-level graphs, and enable neural OpenIE to learn from the syntactic structures. |
Kuicai Dong; Aixin Sun; Jung-Jae Kim; Xiaoli Li; |
273 | TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Though previous VLP works have proved the effectiveness of ViTs, they still suffer from the computational inefficiency brought by the long visual sequence. To tackle this problem, in this paper, we propose an efficient vision-and-language pre-training model with Text-Relevant Image Patch Selection, namely TRIPS, which reduces the visual sequence progressively with a text-guided patch-selection layer in the visual backbone for efficient training and inference. |
Chaoya Jiang; Haiyang Xu; Chenliang Li; Ming Yan; Wei Ye; Shikun Zhang; Bin Bi; Songfang Huang; |
274 | CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and the shortage of annotated data. To better solve the above problems, we propose CGoDial, a new challenging and comprehensive Chinese benchmark for multi-domain Goal-oriented Dialog evaluation. |
Yinpei Dai; Wanwei He; Bowen Li; Yuchuan Wu; Zheng Cao; Zhongqi An; Jian Sun; Yongbin Li; |
275 | Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, such two-stage methods scale up the computational complexity of the training process and obstruct valid feature information while mitigating bias. To address this issue, we utilize the representation normalization method which aims at disentangling the correlations between features of encoded sentences. |
SongYang Gao; Shihan Dou; Qi Zhang; Xuanjing Huang; |
276 | A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To solve the common incomplete labeling problem, we propose a unified positive-unlabeled learning framework – shift and squared ranking loss positive-unlabeled (SSR-PU) learning. |
Ye Wang; Xinxin Liu; Wenxin Hu; Tao Zhang; |
277 | Automatic Generation of Socratic Subquestions for Teaching Math Word Problems Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose various guided question generation schemes based on input conditioning and reinforcement learning. |
Kumar Shridhar; Jakub Macina; Mennatallah El-Assady; Tanmay Sinha; Manu Kapur; Mrinmaya Sachan; |
278 | Mixture of Attention Heads: Selecting Attention Heads Per Token Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes the Mixture of Attention Heads (MoA), a new architecture that combines multi-head attention with the MoE mechanism. |
Xiaofeng Zhang; Yikang Shen; Zeyu Huang; Jie Zhou; Wenge Rong; Zhang Xiong; |
279 | The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we consider the problem of sparsifying BERT models, which are a key building block for natural language processing, in order to reduce their storage and computational cost. |
Eldar Kurtic; Daniel Campos; Tuan Nguyen; Elias Frantar; Mark Kurtz; Benjamin Fineran; Michael Goin; Dan Alistarh; |
280 | Information-Theoretic Text Hallucination Reduction for Video-grounded Dialogue Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Hence, we design Text Hallucination Mitigating (THAM) framework, which incorporates Text Hallucination Regularization (THR) loss derived from the proposed information-theoretic text hallucination measurement approach. |
Sunjae Yoon; Eunseop Yoon; Hee Suk Yoon; Junyeong Kim; Chang Yoo; |
281 | DSM: Question Generation Over Knowledge Base Via Modeling Diverse Subgraphs with Meta-learner Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we show that making use of the past experience on semantically similar subgraphs can reduce the learning difficulty and promote the performance of KBQG models. To achieve this, we propose a novel approach to model diverse subgraphs with meta-learner (DSM). |
Shasha Guo; Jing Zhang; Yanling Wang; Qianyi Zhang; Cuiping Li; Hong Chen; |
282 | RelU-Net: Syntax-aware Graph U-Net for Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This is due to the absence of entity locations, which is the prerequisite for pruning noisy edges from the dependency tree, when extracting relational triples. In this paper, we propose a unified framework to tackle this challenge and incorporate syntactic information for relational triple extraction. |
Yunqi Zhang; Yubo Chen; Yongfeng Huang; |
283 | Evidence > Intuition: Transferability Estimation for Encoder Selection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose to generate quantitative evidence to predict which LM, out of a pool of models, will perform best on a target task without having to fine-tune all candidates. |
Elisa Bassignana; Max Müller-Eberstein; Mike Zhang; Barbara Plank; |
284 | Chunk-based Nearest Neighbor Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a chunk-based kNN-MT model which retrieves chunks of tokens from the datastore, instead of a single token. |
Pedro Henrique Martins; Zita Marinho; André F. T. Martins; |
285 | FiE: Building A Global Probability Space By Leveraging Early Fusion in Encoder for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose to extend transformer encoders with the ability to fuse information from multiple passages, using global representation to provide cross-sample attention over all tokens across samples. |
Akhil Kedia; Mohd Abbas Zaidi; Haejun Lee; |
286 | Inductive Relation Prediction with Logical Reasoning Using Contrastive Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel graph convolutional network (GCN)-based model LogCo with logical reasoning by contrastive representations. |
Yudai Pan; Jun Liu; Lingling Zhang; Tianzhe Zhao; Qika Lin; Xin Hu; Qianying Wang; |
287 | Improving Chinese Spelling Check By Character Pronunciation Prediction: The Effects of Adaptivity and Granularity Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As most of these spelling errors are caused by phonetic similarity, effectively modeling the pronunciation of Chinese characters is a key factor for CSC. In this paper, we consider introducing an auxiliary task of Chinese pronunciation prediction (CPP) to improve CSC, and, for the first time, systematically discuss the adaptivity and granularity of this auxiliary task. |
Jiahao Li; Quan Wang; Zhendong Mao; Junbo Guo; Yanyan Yang; Yongdong Zhang; |
288 | MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely-spoken languages. |
Anna Currey; Maria Nadejde; Raghavendra Reddy Pappagari; Mia Mayer; Stanislas Lauly; Xing Niu; Benjamin Hsu; Georgiana Dinu; |
289 | A Span-level Bidirectional Network for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present a span-level bidirectional network which utilizes all possible spans as input and extracts triplets from spans bidirectionally. |
Yuqi Chen; Chen Keming; Xian Sun; Zequn Zhang; |
290 | On The Calibration of Massively Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Overall, our work contributes towards building more reliable multilingual models by highlighting the issue of their miscalibration, understanding what language and model-specific factors influence it, and pointing out the strategies to improve the same. |
Kabir Ahuja; Sunayana Sitaram; Sandipan Dandapat; Monojit Choudhury; |
291 | Momentum Contrastive Pre-training for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Existing pre-training methods for extractive Question Answering (QA) generate cloze-like queries different from natural questions in syntax structure, which could overfit pre-trained models to simple keyword matching. In order to address this problem, we propose a novel Momentum Contrastive pRe-training fOr queStion anSwering (MCROSS) method for extractive QA. |
Minda Hu; Muzhi Li; Yasheng Wang; Irwin King; |
292 | A Second Wave of UD Hebrew Treebanking and Cross-Domain Parsing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents a new, freely available UD treebank of Hebrew stratified from a range of topics selected from Hebrew Wikipedia. |
Amir Zeldes; Nick Howell; Noam Ordan; Yifat Ben Moshe; |
293 | Finding Dataset Shortcuts with Grammar Induction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose to use probabilistic grammars to characterize and discover shortcuts in NLP datasets. |
Dan Friedman; Alexander Wettig; Danqi Chen; |
294 | Retrieval Augmentation for Commonsense Reasoning: A Unified Approach Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we systematically investigate how to leverage commonsense knowledge retrieval to improve commonsense reasoning tasks. |
Wenhao Yu; Chenguang Zhu; Zhihan Zhang; Shuohang Wang; Zhuosheng Zhang; Yuwei Fang; Meng Jiang; |
295 | Open World Classification with Adaptive Negative Samples Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose an approach based on Adaptive Negative Samples (ANS) designed to generate effective synthetic open category samples in the training stage and without requiring any prior knowledge or external datasets. |
Ke Bai; Guoyin Wang; Jiwei Li; Sunghyun Park; Sungjin Lee; Puyang Xu; Ricardo Henao; Lawrence Carin; |
296 | Re3: Generating Longer Stories With Recursive Reprompting and Revision Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Compared to prior work on shorter stories, long-range plot coherence and relevance are more central challenges here. We propose the Recursive Reprompting and Revision framework (Re3) to address these challenges by (a) prompting a general-purpose language model to construct a structured overarching plan, and (b) generating story passages by repeatedly injecting contextual information from both the plan and current story state into a language model prompt. |
Kevin Yang; Yuandong Tian; Nanyun Peng; Dan Klein; |
297 | Does Joint Training Really Help Cascaded Speech Translation? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we seek to answer the question of whether joint training really helps cascaded speech translation. |
Viet Anh Khoa Tran; David Thulke; Yingbo Gao; Christian Herold; Hermann Ney; |
298 | MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Multiple challenges exist, including the limited availability of annotated training and evaluation datasets as well as the lack of understanding of which settings, languages, and recently proposed methods like cross-lingual transfer will be effective. In this paper, we aim to move towards solutions for these challenges, focusing on the task of named entity recognition (NER). |
David Adelani; Graham Neubig; Sebastian Ruder; Shruti Rijhwani; Michael Beukman; Chester Palen-Michel; Constantine Lignos; Jesujoba Alabi; Shamsuddeen Muhammad; Peter Nabende; Cheikh M. Bamba Dione; Andiswa Bukula; Rooweither Mabuya; Bonaventure F. P. Dossou; Blessing Sibanda; Happy Buzaaba; Jonathan Mukiibi; Godson Kalipe; Derguene Mbaye; Amelia Taylor; Fatoumata Kabore; Chris Chinenye Emezue; Anuoluwapo Aremu; Perez Ogayo; Catherine Gitau; Edwin Munkoh-Buabeng; Victoire Memdjokam Koagne; Allahsera Auguste Tapo; Tebogo Macucwa; Vukosi Marivate; Mboning Tchiaze Elvis; Tajuddeen Gwadabe; Tosin Adewumi; Orevaoghene Ahia; Joyce Nakatumba-Nabende; Neo Lerato Mokono; Ignatius Ezeani; Chiamaka Chukwuneke; Mofetoluwa Oluwaseun Adeyemi; Gilles Quentin Hacheme; Idris Abdulmumin; Odunayo Ogundepo; Oreen Yousuf; Tatiana Moteu; Dietrich Klakow; |
299 | Ethics Consideration Sections in Natural Language Processing Papers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present the results of a manual classification of all ethical consideration sections for ACL 2021. |
Luciana Benotti; Patrick Blackburn; |
300 | Continued Pretraining for Better Zero- and Few-Shot Promptability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate if a dedicated continued pretraining stage could improve “promptability”, i.e., zero-shot performance with natural language prompts or few-shot performance with prompt tuning. |
Zhaofeng Wu; Robert L Logan IV; Pete Walsh; Akshita Bhagia; Dirk Groeneveld; Sameer Singh; Iz Beltagy; |
301 | Less Is More: Summary of Long Instructions Is Better for Program Synthesis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show that LMs benefit from the summarized version of complicated questions. |
Kirby Kuznia; Swaroop Mishra; Mihir Parmar; Chitta Baral; |
302 | Is A Question Decomposition Unit All We Need? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: With the growing number of new benchmarks, we build bigger and more complex LMs. |
Pruthvi Patel; Swaroop Mishra; Mihir Parmar; Chitta Baral; |
303 | Discourse-Aware Soft Prompting for Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We show that structured design of prefix parameters yields more coherent, faithful and relevant generations than the baseline prefix-tuning on all generation tasks. |
Marjan Ghazvininejad; Vladimir Karpukhin; Vera Gor; Asli Celikyilmaz; |
304 | ExPUNations: Augmenting Puns with Keywords and Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we present the ExPUNations (ExPUN) dataset, in which we augment an existing dataset of puns with detailed crowdsourced annotations of keywords denoting the most distinctive words that make the text funny, pun explanations describing why the text is funny, and fine-grained funniness ratings. |
Jiao Sun; Anjali Narayan-Chen; Shereen Oraby; Alessandra Cervone; Tagyoung Chung; Jing Huang; Yang Liu; Nanyun Peng; |
305 | SLING: Sino Linguistic Evaluation of Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To understand what kinds of linguistic knowledge are encoded by pretrained Chinese language models (LMs), we introduce the benchmark of Sino LINGuistics (SLING), which consists of 38K minimal sentence pairs in Mandarin Chinese grouped into 9 high-level linguistic phenomena. |
Yixiao Song; Kalpesh Krishna; Rajesh Bhatt; Mohit Iyyer; |
306 | Context-Situated Pun Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a new task, context-situated pun generation, where a specific context represented by a set of keywords is provided, and the task is to first identify suitable pun words that are appropriate for the context, then generate puns based on the context keywords and the identified pun words. |
Jiao Sun; Anjali Narayan-Chen; Shereen Oraby; Shuyang Gao; Tagyoung Chung; Jing Huang; Yang Liu; Nanyun Peng; |
307 | Retrieval-Augmented Generative Question Answering for Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a retrieval-augmented generative QA model (R-GQA) for event argument extraction. |
Xinya Du; Heng Ji; |
308 | Concadia: Towards Image-Based Text Generation with A Purpose Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Descriptions focus on visual features and are meant to replace an image (often to increase accessibility), whereas captions appear alongside an image to supply additional information. To motivate this distinction and help people put it into practice, we introduce the publicly available Wikipedia-based dataset Concadia consisting of 96,918 images with corresponding English-language descriptions, captions, and surrounding context. |
Elisa Kreiss; Fei Fang; Noah Goodman; Christopher Potts; |
309 | Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: The fundamental shortcoming of these metrics is that they do not take context into account, whereas contextual information is highly valued by BLV users. To substantiate these claims, we present a study with BLV participants who rated descriptions along a variety of dimensions. |
Elisa Kreiss; Cynthia Bennett; Shayan Hooshmand; Eric Zelikman; Meredith Ringel Morris; Christopher Potts; |
310 | MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a comprehensive benchmark to investigate models’ logical reasoning capabilities in complex real-life scenarios. |
Yinya Huang; Hongming Zhang; Ruixin Hong; Xiaodan Liang; Changshui Zhang; Dong Yu; |
311 | Explicit Query Rewriting for Conversational Dense Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a model CRDR that can perform query rewriting and context modelling in a unified framework in which the query rewriting’s supervision signals further enhance the context modelling. |
Hongjin Qian; Zhicheng Dou; |
312 | Efficient Nearest Neighbor Emotion Classification with BERT-whitening Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose kNN-EC, a simple and efficient non-parametric emotion classification (EC) method using nearest neighbor retrieval. |
Wenbiao Yin; Lin Shang; |
313 | FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes FastClass, an efficient weakly-supervised classification approach. |
Tingyu Xia; Yue Wang; Yuan Tian; Yi Chang; |
314 | Neural-Symbolic Inference for Robust Autoregressive Graph Parsing Via Compositional Uncertainty Quantification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study compositionality-aware approach to neural-symbolic inference informed by model confidence, performing fine-grained neural-symbolic reasoning at subgraph level (i.e., nodes and edges) and precisely targeting subgraph components with high uncertainty in the neural parser. |
Zi Lin; Jeremiah Liu; Jingbo Shang; |
315 | A Speaker-Aware Co-Attention Framework for Medical Dialogue Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, in this paper, we propose a speaker-aware co-attention framework for medical dialogue information extraction. |
Yuan Xia; Zhenhui Shi; Jingbo Zhou; Jiayu Xu; Chao Lu; Yehui Yang; Lei Wang; Haifeng Huang; Xia Zhang; Junwei Liu; |
316 | Towards Interactivity and Interpretability: A Rationale-based Legal Judgment Prediction Framework Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Following the judge’s real trial logic, in this paper, we propose a novel Rationale-based Legal Judgment Prediction (RLJP) framework. |
Yiquan Wu; Yifei Liu; Weiming Lu; Yating Zhang; Jun Feng; Changlong Sun; Fei Wu; Kun Kuang; |
317 | RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection Via Relational Contrastive Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we introduce compact language information of relation labels for regularizing the representation learning of visual relations. |
Yi Zhu; Zhaoqing Zhu; Bingqian Lin; Xiaodan Liang; Feng Zhao; Jianzhuang Liu; |
318 | Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple but effective method called “Candidate Soups,” which can obtain high-quality translations while maintaining the inference speed of NAT models. |
Huanran Zheng; Wei Zhu; Pengfei Wang; Xiaoling Wang; |
319 | Evaluating Parameter Efficient Learning for Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect of sample and model size on in-domain evaluations, (2) generalization to unseen domains and new datasets, and (3) the faithfulness of generations. |
Peng Xu; Mostofa Patwary; Shrimai Prabhumoye; Virginia Adams; Ryan Prenger; Wei Ping; Nayeon Lee; Mohammad Shoeybi; Bryan Catanzaro; |
320 | McQueen: A Benchmark for Multimodal Conversational Query Rewrite Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose the task of multimodal conversational query rewrite (McQR), which performs query rewrite under the multimodal visual conversation setting. |
Yifei Yuan; Chen Shi; Runze Wang; Liyi Chen; Feijun Jiang; Yuan You; Wai Lam; |
321 | Self-supervised Graph Masking Pre-training for Graph-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Additionally, PLMs are typically pre-trained on free text which introduces domain mismatch between pre-training and downstream G2T generation tasks. To address these shortcomings, we propose graph masking pre-training strategies that neither require supervision signals nor adjust the architecture of the underlying pre-trained encoder-decoder model. |
Jiuzhou Han; Ehsan Shareghi; |
322 | Improving Stability of Fine-Tuning Pretrained Language Models Via Component-Wise Gradient Norm Clipping Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we first point out that this method does not always work out due to the different convergence speeds of different layers/modules. Inspired by this observation, we propose a simple component-wise gradient norm clipping method to adjust the convergence speed for different components. |
Chenghao Yang; Xuezhe Ma; |
323 | Differentially Private Language Models for Secure Data Sharing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we approach the problem at hand using global differential privacy, particularly by training a generative language model in a differentially private manner and consequently sampling data from it. |
Justus Mattern; Zhijing Jin; Benjamin Weggenmann; Bernhard Schoelkopf; Mrinmaya Sachan; |
324 | Conditional Set Generation Using Seq2seq Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel algorithm for effectively sampling informative orders over the combinatorial space of label orders. |
Aman Madaan; Dheeraj Rajagopal; Niket Tandon; Yiming Yang; Antoine Bosselut; |
325 | Analyzing and Evaluating Faithfulness in Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we first perform a fine-grained human analysis on the faithfulness of dialogue summaries and observe that over 35% of generated summaries are faithfully inconsistent with respect to the source dialogues. Furthermore, we present a new model-level faithfulness evaluation method. |
Bin Wang; Chen Zhang; Yan Zhang; Yiming Chen; Haizhou Li; |
326 | Twist Decoding: Diverse Generators Guide Each Other Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce Twist decoding, a simple and general text generation algorithm that benefits from diverse models at inference time. |
Jungo Kasai; Keisuke Sakaguchi; Ronan Le Bras; Hao Peng; Ximing Lu; Dragomir Radev; Yejin Choi; Noah A. Smith; |
327 | Exploring Representation-level Augmentation for Code Search Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we explore augmentation methods that augment data (both code and query) at representation level which does not require additional data processing and training, and based on this we propose a general format of representation-level augmentation that unifies existing methods. |
Haochen Li; Chunyan Miao; Cyril Leung; Yanxian Huang; Yuan Huang; Hongyu Zhang; Yanlin Wang; |
328 | Learning Semantic Textual Similarity Via Topic-informed Discrete Latent Variables Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a topic-informed discrete latent variable model for semantic textual similarity, which learns a shared latent space for sentence-pair representation via vector quantization. |
Erxin Yu; Lan Du; Yuan Jin; Zhepei Wei; Yi Chang; |
329 | STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel type of dialogue summarization task – STRUctured DiaLoguE Summarization (STRUDEL) – that can help pre-trained language models to better understand dialogues and improve their performance on important dialogue comprehension tasks. |
Borui Wang; Chengcheng Feng; Arjun Nair; Madelyn Mao; Jai Desai; Asli Celikyilmaz; Haoran Li; Yashar Mehdad; Dragomir Radev; |
330 | Competency-Aware Neural Machine Translation: Can Machine Translation Know Its Own Translation Quality? Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This is in sharp contrast to human translators who give feedback or conduct further investigations whenever they are in doubt about predictions. To fill this gap, we propose a novel competency-aware NMT by extending conventional NMT with a self-estimator, offering abilities to translate a source sentence and estimate its competency. |
Pei Zhang; Baosong Yang; Hao-Ran Wei; Dayiheng Liu; Kai Fan; Luo Si; Jun Xie; |
331 | PASTA: Table-Operations Aware Fact Verification Via Sentence-Table Cloze Pre-training Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet, progress has been limited due to the lack of datasets that can be used to pre-train language models (LMs) to be aware of common table operations, such as aggregating a column or comparing tuples. To bridge this gap, this paper introduces PASTA for table-based fact verification via pre-training with synthesized sentence–table cloze questions. |
Zihui Gu; Ju Fan; Nan Tang; Preslav Nakov; Xiaoman Zhao; Xiaoyong Du; |
332 | Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose SentiWSP, a novel Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks. |
Shuai Fan; Chen Lin; Haonan Li; Zhenghao Lin; Jinsong Su; Hang Zhang; Yeyun Gong; Jian Guo; Nan Duan; |
333 | Towards Multi-Modal Sarcasm Detection Via Hierarchical Congruity Modeling with Knowledge Enhancement Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel hierarchical framework for sarcasm detection by exploring both the atomic-level congruity based on multi-head cross attentions and the composition-level congruity based on graph neural networks, where a post with low congruity can be identified as sarcasm. |
Hui Liu; Wenya Wang; Haoliang Li; |
334 | Efficiently Tuned Parameters Are Task Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we anticipate that task-specific parameters updated in parameter-efficient tuning methods are likely to encode task-specific information. |
Wangchunshu Zhou; Canwen Xu; Julian McAuley; |
335 | COPEN: Probing Conceptual Knowledge in Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. |
Hao Peng; Xiaozhi Wang; Shengding Hu; Hailong Jin; Lei Hou; Juanzi Li; Zhiyuan Liu; Qun Liu; |
336 | Capturing Global Structural Information in Long Document Question Answering with Compressive Graph Selector Network Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, these methods usually ignore the global structure of the long document, which is essential for long-range understanding. To tackle this problem, we propose Compressive Graph Selector Network (CGSN) to capture the global structure in a compressive and iterative manner. |
Yuxiang Nie; Heyan Huang; Wei Wei; Xian-Ling Mao; |
337 | Structural Generalization Is Hard for Sequence-to-sequence Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, recent work on compositional generalization has shown that seq2seq models achieve very low accuracy in generalizing to linguistic structures that were not seen in training. We present new evidence that this is a general limitation of seq2seq models that is present not just in semantic parsing, but also in syntactic parsing and in text-to-text tasks, and that this limitation can often be overcome by neurosymbolic models that have linguistic knowledge built in. |
Yuekun Yao; Alexander Koller; |
338 | Contrastive Learning Enhanced Author-Style Headline Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Besides, we propose two methods to use the learned stylistic features to guide both the pointer and the decoder during the generation. |
Hui Liu; Weidong Guo; Yige Chen; Xiangyang Li; |
339 | Multi-Granularity Optimization for Non-Autoregressive Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This assumption is further strengthened by cross-entropy loss, which encourages a strict match between the hypothesis and the reference token by token. To alleviate this issue, we propose multi-granularity optimization for NAT, which collects model behaviours on translation segments of various granularities and integrates feedback for backpropagation. |
Yafu Li; Leyang Cui; Yongjing Yin; Yue Zhang; |
340 | Super-NaturalInstructions: Generalization Via Declarative Instructions on 1600+ NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). |
Yizhong Wang; Swaroop Mishra; Pegah Alipoormolabashi; Yeganeh Kordi; Amirreza Mirzaei; Atharva Naik; Arjun Ashok; Arut Selvan Dhanasekaran; Anjana Arunkumar; David Stap; Eshaan Pathak; Giannis Karamanolakis; Haizhi Lai; Ishan Purohit; Ishani Mondal; Jacob Anderson; Kirby Kuznia; Krima Doshi; Kuntal Kumar Pal; Maitreya Patel; Mehrad Moradshahi; Mihir Parmar; Mirali Purohit; Neeraj Varshney; Phani Rohitha Kaza; Pulkit Verma; Ravsehaj Singh Puri; Rushang Karia; Savan Doshi; Shailaja Keyur Sampat; Siddhartha Mishra; Sujan Reddy A; Sumanta Patro; Tanay Dixit; Xudong Shen; |
341 | MetaFill: Text Infilling for Meta-Path Generation on Heterogeneous Information Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing meta-path generation approaches cannot fully exploit the rich textual information in HINs, such as node names and edge type names. To address this problem, we propose MetaFill, a text-infilling-based approach for meta-path generation. |
Zequn Liu; Kefei Duan; Junwei Yang; Hanwen Xu; Ming Zhang; Sheng Wang; |
342 | DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose DRLK (Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs), a novel model that utilizes dynamic hierarchical interactions between the QA context and KG for reasoning. |
Miao Zhang; Rufeng Dai; Ming Dong; Tingting He; |
343 | AEG: Argumentative Essay Generation Via A Dual-Decoder Model with Content Planning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new task, Argumentative Essay Generation (AEG). |
Jianzhu Bao; Yasheng Wang; Yitong Li; Fei Mi; Ruifeng Xu; |
344 | BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To build open-domain chatbots that are able to use diverse communicative skills, we propose a novel framework BotsTalk, where multiple agents grounded to the specific target skills participate in a conversation to automatically annotate multi-skill dialogues. |
Minju Kim; Chaehyeong Kim; Yong Ho Song; Seung-won Hwang; Jinyoung Yeo; |
345 | Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, a mixture of short-channel distillers (MSD) method is proposed to fully interact the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. |
Jun-Yu Ma; Beiduo Chen; Jia-Chen Gu; Zhenhua Ling; Wu Guo; Quan Liu; Zhigang Chen; Cong Liu; |
346 | An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To combine the strength of both approaches, we propose the Efficient Memory-Augmented Transformer (EMAT) – it encodes external knowledge into a key-value memory and exploits the fast maximum inner product search for memory querying. |
Yuxiang Wu; Yu Zhao; Baotian Hu; Pasquale Minervini; Pontus Stenetorp; Sebastian Riedel; |
347 | Supervised Prototypical Contrastive Learning for Emotion Recognition in Conversation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a Supervised Prototypical Contrastive Learning (SPCL) loss for the ERC task. |
Xiaohui Song; Longtao Huang; Hui Xue; Songlin Hu; |
348 | RuCoLA: Russian Corpus of Linguistic Acceptability Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce the Russian Corpus of Linguistic Acceptability (RuCoLA), built from the ground up under the well-established binary LA approach. |
Vladislav Mikhailov; Tatiana Shamardina; Max Ryabinin; Alena Pestova; Ivan Smurov; Ekaterina Artemova; |
349 | Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper aims to utilize the representation capacity of the complex hyperbolic geometry in multi-relational KG embeddings. |
Huiru Xiao; Xin Liu; Yangqiu Song; Ginny Wong; Simon See; |
350 | Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. |
Longxu Dou; Yan Gao; Xuqi Liu; Mingyang Pan; Dingzirui Wang; Wanxiang Che; Dechen Zhan; Min-Yen Kan; Jian-Guang Lou; |
351 | Should We Ban English NLP for A Year? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Many have argued that it is almost impossible to mitigate inequality amplification. I argue that, on the contrary, it is quite simple to do so, and that counter-measures would have little-to-no negative impact, except for, perhaps, in the very short term. |
Anders Søgaard; |
352 | LittleBird: Efficient Faster & Longer Transformer for Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: But it has a limitation dealing with long inputs due to its attention mechanism. Longformer, ETC and BigBird addressed this issue and effectively solved the quadratic dependency problem. However, we find that these models are not sufficient, and propose LittleBird, a novel model based on BigBird with improved speed and memory footprint while maintaining accuracy. |
Minchul Lee; Kijong Han; Myeong Cheol Shin; |
353 | WeTS: A Benchmark for Translation Suggestion Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To break these limitations mentioned above and spur the research in TS, we create a benchmark dataset, called WeTS, which is a golden corpus annotated by expert translators on four translation directions. |
Zhen Yang; Fandong Meng; Yingxue Zhang; Ernan Li; Jie Zhou; |
354 | Discrete Cross-Modal Alignment Enables Zero-Shot Speech Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In order to enable zero-shot ST, we propose a novel Discrete Cross-Modal Alignment (DCMA) method that employs a shared discrete vocabulary space to accommodate and match both modalities of speech and text. |
Chen Wang; Yuchen Liu; Boxing Chen; Jiajun Zhang; Wei Luo; Zhongqiang Huang; Chengqing Zong; |
355 | Abstractive Summarization Guided By Latent Hierarchical Document Structure Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this shortcoming, we propose a hierarchy-aware graph neural network (HierGNN) which captures such dependencies through three main steps: 1) learning a hierarchical document structure through a latent structure tree learned by a sparse matrix-tree computation; 2) propagating sentence information over this structure using a novel message-passing node propagation mechanism to identify salient information; 3) using graph-level attention to concentrate the decoder on salient information. |
Yifu Qiu; Shay B. Cohen; |
356 | Explainable Question Answering Based on Semantic Graph By Global Differentiable Learning and Dynamic Adaptive Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To alleviate it, we propose a simple yet effective Global Differentiable Learning strategy to explore optimal reasoning paths from the latent probability space so that the model learns to solve intermediate reasoning processes without expert annotations. |
Jianguo Mao; Wenbin Jiang; Xiangdong Wang; Hong Liu; Yu Xia; Yajuan Lyu; QiaoQiao She; |
357 | DuReader-Retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present DuReader-retrieval, a large-scale Chinese dataset for passage retrieval. |
Yifu Qiu; Hongyu Li; Yingqi Qu; Ying Chen; QiaoQiao She; Jing Liu; Hua Wu; Haifeng Wang; |
358 | Pair-Based Joint Encoding with Relational Graph Convolutional Networks for Emotion-Cause Pair Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This leads to an imbalance in inter-task feature interaction where features extracted later have no direct contact with the former. To address this issue, we propose a novel **P**air-**B**ased **J**oint **E**ncoding (**PBJE**) network, which generates pair and clause features simultaneously in a joint feature encoding manner to model the causal relationship in clauses. |
Junlong Liu; Xichen Shang; Qianli Ma; |
359 | Affective Knowledge Enhanced Multiple-Graph Fusion Networks for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel multi-graph fusion network (MGFN) based on latent graph to leverage the richer syntax dependency relation label information and affective semantic information of words. |
Siyu Tang; Heyan Chai; Ziyi Yao; Ye Ding; Cuiyun Gao; Binxing Fang; Qing Liao; |
360 | IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages. |
Aman Kumar; Himani Shrotriya; Prachi Sahu; Amogh Mishra; Raj Dabre; Ratish Puduppully; Anoop Kunchukuttan; Mitesh M. Khapra; Pratyush Kumar; |
361 | Improving Machine Translation with Phrase Pair Injection and Corpus Filtering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. |
Akshay Batheja; Pushpak Bhattacharyya; |
362 | An Anchor-based Relative Position Embedding Method for Cross-Modal Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a unified position embedding method for these problems, called AnChor-basEd Relative Position Embedding (ACE-RPE), in which we first introduce an anchor locating mechanism to bridge the semantic gap and locate anchors from different modalities. |
Ya Wang; Xingwu Sun; Lian Fengzong; ZhanHui Kang; Chengzhong Xu Xu; |
363 | Norm-based Noisy Corpora Filtering and Refurbishing in Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a norm-based noisy corpora filtering and refurbishing method with no external data and costly scorers. |
Yu Lu; Jiajun Zhang; |
364 | TeleMelody: Lyric-to-Melody Generation with A Template-Based Two-Stage Method Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop TeleMelody, a two-stage lyric-to-melody generation system with music template (e.g., tonality, chord progression, rhythm pattern, and cadence) to bridge the gap between lyrics and melodies (i.e., the system consists of a lyric-to-template module and a template-to-melody module). |
Zeqian Ju; Peiling Lu; Xu Tan; Rui Wang; Chen Zhang; Songruoyao Wu; Kejun Zhang; Xiang-Yang Li; Tao Qin; Tie-Yan Liu; |
365 | SEEN: Structured Event Enhancement Network for Explainable Need Detection of Information Recall Assistance Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a pilot model, structured event enhancement network (SEEN), that detects life event inconsistency, additional information in life events, and forgotten events. |
You-En Lin; An-Zi Yen; Hen-Hsen Huang; Hsin-Hsi Chen; |
366 | Rethinking Style Transformer with Energy-based Interpretation: Adversarial Unsupervised Style Transfer Using A Pretrained Model Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, adversarial training significantly degrades fluency compared to the other two metrics. In this work, we explain this phenomenon using energy-based interpretation, and leverage a pretrained language model to improve fluency. |
Hojun Cho; Dohee Kim; Seungwoo Ryu; ChaeHun Park; Hyungjong Noh; Jeong-in Hwang; Minseok Choi; Edward Choi; Jaegul Choo; |
367 | Towards Robust K-Nearest-Neighbor Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the impact of noise, we propose a confidence-enhanced kNN-MT model with robust training. |
Hui Jiang; Ziyao Lu; Fandong Meng; Chulun Zhou; Jie Zhou; Degen Huang; Jinsong Su; |
368 | Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose Tiny-NewsRec, which can improve both the effectiveness and the efficiency of PLM-based news recommendation. |
Yang Yu; Fangzhao Wu; Chuhan Wu; Jingwei Yi; Qi Liu; |
369 | TABS: Efficient Textual Adversarial Attack for Pre-trained NL Code Model Using Semantic Beam Search Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose TABS, an efficient beam search black-box adversarial attack method. |
YunSeok Choi; Hyojun Kim; Jee-Hyong Lee; |
370 | Investigating The Robustness of Natural Language Generation from Logical Forms Via Counterfactual Samples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: State-of-the-art methods based on pre-trained models have achieved remarkable performance on the standard test dataset. However, we question whether these methods really learn how to perform logical reasoning, rather than just relying on the spurious correlations between the headers of the tables and operators of the logical form. |
Chengyuan Liu; Leilei Gan; Kun Kuang; Fei Wu; |
371 | Helping The Weak Makes You Strong: Simple Multi-Task Learning Improves Non-Autoregressive Translators Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a simple and model-agnostic multi-task learning framework to provide more informative learning signals. |
Xinyou Wang; Zaixiang Zheng; Shujian Huang; |
372 | RACE: Retrieval-augmented Commit Message Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose RACE, a new retrieval-augmented neural commit message generation method, which treats the retrieved similar commit as an exemplar and leverages it to generate an accurate commit message. |
Ensheng Shi; Yanlin Wang; Wei Tao; Lun Du; Hongyu Zhang; Shi Han; Dongmei Zhang; Hongbin Sun; |
373 | PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a Pretrained Logical Form Generator (PLOG) framework to improve generation fidelity. |
Ao Liu; Haoyu Dong; Naoaki Okazaki; Shi Han; Dongmei Zhang; |
374 | GHAN: Graph-Based Hierarchical Aggregation Network for Text-Video Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, there are structural and semantic differences between text and video, making this approach challenging for fine-grained understanding. In order to solve this, we propose an end-to-end graph-based hierarchical aggregation network for text-video retrieval according to the hierarchy possessed by text and video. |
Yahan Yu; Bojie Hu; Yu Li; |
375 | MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering Over Images and Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these methods are restricted to retrieving only textual knowledge, neglecting the ubiquitous amount of knowledge in other modalities like images, much of which contains information not covered by any text. To address this limitation, we propose the first Multimodal Retrieval-Augmented Transformer (MuRAG), which accesses an external non-parametric multimodal memory to augment language generation. |
Wenhu Chen; Hexiang Hu; Xi Chen; Pat Verga; William Cohen; |
376 | PHEE: A Dataset for Pharmacovigilance Event Extraction from Text Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present PHEE, a novel dataset for pharmacovigilance comprising over 5000 annotated events from medical case reports and biomedical literature, making it the largest such public dataset to date. |
Zhaoyue Sun; Jiazheng Li; Gabriele Pergola; Byron Wallace; Bino John; Nigel Greene; Joseph Kim; Yulan He; |
377 | OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the labels in XMTC tasks are essentially an unordered set rather than an ordered sequence; the default order of labels restrains Seq2Seq models in training. To address this limitation in Seq2Seq, we propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set. |
Jie Cao; Yin Zhang; |
378 | SimQA: Detecting Simultaneous MT Errors Through Word-by-Word Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Yet, evaluations of simultaneous machine translation (SimulMT) fail to capture if systems correctly translate the most salient elements of a question: people, places, and dates. To address this problem, we introduce a downstream word-by-word question answering evaluation task (SimQA): given a source language question, translate the question word by word into the target language, and answer as soon as possible. |
HyoJung Han; Marine Carpuat; Jordan Boyd-Graber; |
379 | Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide a novel view of projecting away language-specific factors from a multilingual embedding space. |
Zhihui Xie; Handong Zhao; Tong Yu; Shuai Li; |
380 | Rethinking The Authorship Verification Experimental Setups Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we improve the experimental setup by proposing five new public splits over the PAN dataset, specifically designed to isolate and identify biases related to the text topic and to the author's writing style. |
Florin Brad; Andrei Manolache; Elena Burceanu; Antonio Barbalau; Radu Tudor Ionescu; Marius Popescu; |
381 | Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To better glue the cross-modal semantics therein, we capture hinting features from user comments, which are retrieved via jointly leveraging visual and lingual similarity. |
Chunpu Xu; Jing Li; |
382 | Training Language Models with Memory Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present TRIME, a novel yet simple training approach designed for training LMs with memory augmentation. |
Zexuan Zhong; Tao Lei; Danqi Chen; |
383 | Data-Efficient Strategies for Expanding Hate Speech Detection Into Under-Resourced Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To mitigate these issues, we explore data-efficient strategies for expanding hate speech detection into under-resourced languages. In a series of experiments with mono- and multilingual models across five non-English languages, we find that 1) a small amount of target-language fine-tuning data is needed to achieve strong performance, 2) the benefits of using more such data decrease exponentially, and 3) initial fine-tuning on readily-available English data can partially substitute target-language data and improve model generalisability. |
Paul Röttger; Debora Nozza; Federico Bianchi; Dirk Hovy; |
384 | Dimension Reduction for Efficient Dense Retrieval Via Conditional Autoencoder Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To reduce the embedding dimensions of dense retrieval, this paper proposes a Conditional Autoencoder (ConAE) to compress the high-dimensional embeddings to maintain the same embedding distribution and better recover the ranking features. |
Zhenghao Liu; Han Zhang; Chenyan Xiong; Zhiyuan Liu; Yu Gu; Xiaohua Li; |
385 | Controlled Text Reduction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Concretely, we formalize Controlled Text Reduction as a standalone task, whose input is a source text with marked spans of targeted content ("highlighting"). |
Aviv Slobodkin; Paul Roit; Eran Hirsch; Ori Ernst; Ido Dagan; |
386 | Questioning The Validity of Summarization Datasets and Improving Their Factual Consistency Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Due to this lack of well-defined formulation, a large number of popular abstractive summarization datasets are constructed in a manner that neither guarantees validity nor meets one of the most essential criteria of summarization: factual consistency. In this paper, we address this issue by combining state-of-the-art factual consistency models to identify the problematic instances present in popular summarization datasets. |
Yanzhu Guo; Chloé Clavel; Moussa Kamal Eddine; Michalis Vazirgiannis; |
387 | Invariant Language Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. |
Maxime Peyrard; Sarvjeet Ghotra; Martin Josifoski; Vidhan Agarwal; Barun Patra; Dean Carignan; Emre Kiciman; Saurabh Tiwary; Robert West; |
388 | AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose AdaMix as a general PEFT method that tunes a mixture of adaptation modules (given the underlying PEFT method of choice) introduced in each Transformer layer while keeping most of the PLM weights frozen. |
Yaqing Wang; Sahaj Agarwal; Subhabrata Mukherjee; Xiaodong Liu; Jing Gao; Ahmed Hassan Awadallah; Jianfeng Gao; |
389 | How "Multi" Is Multi-Document Summarization? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Accordingly, it is expected that both reference summaries in MDS datasets, as well as system summaries, would indeed be based on such dispersed information. In this paper, we argue for quantifying and assessing this expectation. |
Ruben Wolhandler; Arie Cattan; Ori Ernst; Ido Dagan; |
390 | BioReader: A Retrieval-Enhanced Text-to-Text Transformer for Biomedical Literature Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce BioReader, the first retrieval-enhanced text-to-text model for biomedical natural language processing. |
Giacomo Frisoni; Miki Mizutani; Gianluca Moro; Lorenzo Valgimigli; |
391 | T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We present a new approach to perform zero-shot cross-modal transfer between speech and text for translation tasks. |
Paul-Ambroise Duquenne; Hongyu Gong; Benoît Sagot; Holger Schwenk; |
392 | LILA: A Unified Benchmark for Mathematical Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Towards evaluating and improving AI systems in this domain, we propose LILA, a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions: (i) mathematical abilities, e.g., arithmetic, calculus; (ii) language format, e.g., question-answering, fill-in-the-blanks; (iii) language diversity, e.g., no language, simple language; (iv) external knowledge, e.g., commonsense, physics. |
Swaroop Mishra; Matthew Finlayson; Pan Lu; Leonard Tang; Sean Welleck; Chitta Baral; Tanmay Rajpurohit; Oyvind Tafjord; Ashish Sabharwal; Peter Clark; Ashwin Kalyan; |
393 | Leveraging Affirmative Interpretations from Negation Improves Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Inspired by the fact that understanding a negated statement often requires humans to infer affirmative interpretations, in this paper we show that doing so benefits models for three natural language understanding tasks. |
Md Mosharaf Hossain; Eduardo Blanco; |
394 | GraphQ IR: Unifying The Semantic Parsing of Graph Query Languages with One Intermediate Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper proposes a unified intermediate representation for graph query languages, named GraphQ IR. |
Lunyiu Nie; Shulin Cao; Jiaxin Shi; Jiuding Sun; Qi Tian; Lei Hou; Juanzi Li; Jidong Zhai; |
395 | InforMask: Unsupervised Informative Masking for Language Model Pretraining Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose InforMask, a new unsupervised masking strategy for training masked language models. |
Nafis Sadeq; Canwen Xu; Julian McAuley; |
396 | CTRLsum: Towards Generic Controllable Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Current summarization systems yield generic summaries that are disconnected from users' preferences and expectations. To address this limitation, we present CTRLsum, a generic framework to control generated summaries through a set of keywords. |
Junxian He; Wojciech Kryscinski; Bryan McCann; Nazneen Rajani; Caiming Xiong; |
397 | Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we contrast and compare NLP fact-checking with how professional fact-checkers combat misinformation in the absence of counter-evidence. |
Max Glockner; Yufang Hou; Iryna Gurevych; |
398 | A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we conduct a comprehensive exploration of how to best extract and incorporate those embeddings into knowledge graph completion models. |
Justin Lovelace; Carolyn Rosé; |
399 | Mutual Information Alleviates Hallucinations in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we identify a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty. |
Liam van der Poel; Ryan Cotterell; Clara Meister; |
400 | Toward The Limitation of Code-Switching in Cross-Lingual Transfer Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper mitigates the limitation of the code-switching method by not only making the token replacement but also considering the similarity between the context and the switched tokens, so that the newly substituted sentences are grammatically consistent during both training and inference. |
Yukun Feng; Feng Li; Philipp Koehn; |
401 | Syntactically Rich Discriminative Training: An Effective Method for Open Information Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose several new methods for training neural OIE models in this paper. |
Frank Mtumbuka; Thomas Lukasiewicz; |
402 | Transformer-based Entity Typing in Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel Transformer-based Entity Typing (TET) approach, effectively encoding the content of neighbours of an entity by means of a transformer mechanism. |
Zhiwei Hu; Victor Gutierrez-Basulto; Zhiliang Xiang; Ru Li; Jeff Pan; |
403 | NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we present NewsClaims, a new benchmark for attribute-aware claim detection in the news domain. |
Revanth Gangi Reddy; Sai Chetan Chinthakindi; Zhenhailong Wang; Yi Fung; Kathryn Conger; Ahmed ELsayed; Martha Palmer; Preslav Nakov; Eduard Hovy; Kevin Small; Heng Ji; |
404 | IsoVec: Controlling The Relative Isomorphism of Word Embedding Spaces Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We incorporate global measures of isomorphism directly into the skipgram loss function, successfully increasing the relative isomorphism of trained word embedding spaces and improving their ability to be mapped to a shared cross-lingual space. |
Kelly Marchisio; Neha Verma; Kevin Duh; Philipp Koehn; |
405 | Adversarial Concept Erasure in Kernel Space Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a kernelization of the recently-proposed linear concept-removal objective, and show that it is effective in guarding against the ability of certain nonlinear adversaries to recover the concept. |
Shauli Ravfogel; Francisco Vargas; Yoav Goldberg; Ryan Cotterell; |
406 | The Authenticity Gap in Human Evaluation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We suggest improvements to the standard protocol to make it more theoretically sound, but even in its improved form, it cannot be used to evaluate open-ended tasks like story generation. |
Kawin Ethayarajh; Dan Jurafsky; |
407 | BERT in Plutarch's Shadows Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper presents a BERT language model for Ancient Greek. |
Ivan Yamshchikov; Alexey Tikhonov; Yorgos Pantis; Charlotte Schubert; Jürgen Jost; |
408 | Leveraging Locality in Abstractive Text Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the quadratic memory complexity of the self-attention module with respect to the input length hinders their applications in long text summarization. Instead of designing more efficient attention modules, we approach this problem by investigating if models with a restricted context can have competitive performance compared with the memory-efficient attention models that maintain a global context by treating the input as a single sequence. |
Yixin Liu; Ansong Ni; Linyong Nan; Budhaditya Deb; Chenguang Zhu; Ahmed Hassan Awadallah; Dragomir Radev; |
409 | Salience Allocation As Guidance for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel summarization approach with a flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON). |
Fei Wang; Kaiqiang Song; Hongming Zhang; Lifeng Jin; Sangwoo Cho; Wenlin Yao; Xiaoyang Wang; Muhao Chen; Dong Yu; |
410 | Fine-tuned Language Models Are Continual Learners Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this limitation, we argue that a model should be able to keep extending its knowledge and abilities, without forgetting previous skills. |
Thomas Scialom; Tuhin Chakrabarty; Smaranda Muresan; |
411 | Natural Logic-guided Autoregressive Multi-hop Document Retrieval for Fact Verification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a novel retrieve-and-rerank method for multi-hop retrieval, that consists of a retriever that jointly scores documents in the knowledge source and sentences from previously retrieved documents using an autoregressive formulation and is guided by a proof system based on natural logic that dynamically terminates the retrieval process if the evidence is deemed sufficient. |
Rami Aly; Andreas Vlachos; |
412 | AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We explore unsupervised language model post-training to improve the overall performance, and propose a multi-label generator model to generate multiple aspect category-sentiment pairs per review sentence. |
Sabyasachi Kamila; Walid Magdy; Sourav Dutta; MingXue Wang; |
413 | Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We provide two new data resources on multiple spatial language processing tasks. |
Roshanak Mirzaee; Parisa Kordjamshidi; |
414 | A Survey of Active Learning for Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we provide a literature review of active learning (AL) for its applications in natural language processing (NLP). |
Zhisong Zhang; Emma Strubell; Eduard Hovy; |
415 | Bernice: A Multilingual Pre-trained Encoder for Twitter Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Bernice, the first multilingual RoBERTa language model trained from scratch on 2. |
Alexandra DeLucia; Shijie Wu; Aaron Mueller; Carlos Aguirre; Philip Resnik; Mark Dredze; |
416 | CEFR-Based Sentence Difficulty Annotation and Assessment Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address this problem, we created the CEFR-based Sentence Profile (CEFR-SP) corpus, containing 17k English sentences annotated with the levels based on the Common European Framework of Reference for Languages assigned by English-education professionals. |
Yuki Arase; Satoru Uchida; Tomoyuki Kajiwara; |
417 | Simple Questions Generate Named Entity Recognition Datasets Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces an ask-to-generate approach, which automatically generates NER datasets by asking simple natural language questions to an open-domain question answering system (e.g., "Which disease?") |
Hyunjae Kim; Jaehyo Yoo; Seunghyun Yoon; Jinhyuk Lee; Jaewoo Kang; |
418 | TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we introduce TemporalWiki, a lifelong benchmark for ever-evolving LMs that utilizes the difference between consecutive snapshots of English Wikipedia and English Wikidata for training and evaluation, respectively. |
Joel Jang; Seonghyeon Ye; Changho Lee; Sohee Yang; Joongbo Shin; Janghoon Han; Gyeonghun Kim; Minjoon Seo; |
419 | Bi-Directional Iterative Prompt-Tuning for Event Argument Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a bi-directional iterative prompt-tuning method for EAE, where the EAE task is treated as a cloze-style task to take full advantage of entity information and pre-trained language models (PLMs). |
Lu Dai; Bang Wang; Wei Xiang; Yijun Mo; |
420 | Learning Robust Representations for Continual Relation Extraction Via Adversarial Class Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Most previous work attributes catastrophic forgetting to the corruption of the learned representations as new relations come, with an implicit assumption that the CRE models have adequately learned the old relations. In this paper, through empirical studies we argue that this assumption may not hold, and an important reason for catastrophic forgetting is that the learned representations do not have good robustness against the appearance of analogous relations in the subsequent learning process. |
Peiyi Wang; Yifan Song; Tianyu Liu; Binghuai Lin; Yunbo Cao; Sujian Li; Zhifang Sui; |
421 | ConvFinQA: Exploring The Chain of Numerical Reasoning in Conversational Finance Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we investigate the application domain of finance that involves real-world, complex numerical reasoning. |
Zhiyu Chen; Shiyang Li; Charese Smiley; Zhiqiang Ma; Sameena Shah; William Yang Wang; |
422 | A Span-based Multimodal Variational Autoencoder for Semi-supervised Multimodal Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To fuse the text and image features for MNER effectively under semi-supervised setting, we propose a novel span-based multimodal variational autoencoder (SMVAE) model for semi-supervised MNER. |
Baohang Zhou; Ying Zhang; Kehui Song; Wenya Guo; Guoqing Zhao; Hongbin Wang; Xiaojie Yuan; |
423 | R-TeaFor: Regularized Teacher-Forcing for Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Nevertheless, they do not consider the pairwise relationship between the original training data and the modified ones, which provides more information during training. Hence, we propose Regularized Teacher-Forcing (R-TeaFor) to utilize this relationship for better regularization. |
Guan-Yu Lin; Pu-Jen Cheng; |
424 | Modeling Consistency Preference Via Lexical Chains for Document-level Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we aim to relieve the issue of lexical translation inconsistency for document-level neural machine translation (NMT) by modeling consistency preference for lexical chains, which consist of repeated words in a source-side document and provide a representation of the lexical consistency structure of the document. |
Xinglin Lyu; Junhui Li; Shimin Tao; Hao Yang; Ying Qin; Min Zhang; |
425 | Just Fine-tune Twice: Selective Differential Privacy for Large Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we develop a novel framework, *Just Fine-tune Twice* (JFT), that achieves SDP for state-of-the-art large transformer-based models. |
Weiyan Shi; Ryan Shea; Si Chen; Chiyuan Zhang; Ruoxi Jia; Zhou Yu; |
426 | Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers. |
Marcio Fonseca; Yftah Ziser; Shay B. Cohen; |
427 | Open-Domain Sign Language Translation Learned from Online Video Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce OpenASL, a large-scale American Sign Language (ASL) – English dataset collected from online video sites (e.g., YouTube). |
Bowen Shi; Diane Brentari; Gregory Shakhnarovich; Karen Livescu; |
428 | Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we empirically observe that temporal generalization is closely affiliated with lexical semantic change, which is one of the essential phenomena of natural languages. |
Zhaochen Su; Zecheng Tang; Xinyan Guan; Lijun Wu; Min Zhang; Juntao Li; |
429 | ULN: Towards Underspecified Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: As a primary step toward ULN, we propose a VLN framework that consists of a classification module, a navigation agent, and an Exploitation-to-Exploration (E2E) module. |
Weixi Feng; Tsu-Jui Fu; Yujie Lu; William Yang Wang; |
430 | Federated Model Decomposition with Private Vocabulary for Text Classification Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a federated model decomposition method that protects the privacy of vocabularies, shortened as FEDEVOCAB. |
Zhuo Zhang; Xiangjing Hu; Lizhen Qu; Qifan Wang; Zenglin Xu; |
431 | ReCo: Reliable Causal Chain Reasoning Via Structural Causal Recurrent Neural Networks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In other words, the causal pairs to be spliced may have a conflicting threshold boundary or scenario. To address these issues, we propose a novel Reliable Causal chain reasoning framework (ReCo), which introduces exogenous variables to represent the threshold and scene factors of each causal pair within the causal chain, and estimates the threshold and scene contradictions across exogenous variables via structural causal recurrent neural networks (SRNN). |
Kai Xiong; Xiao Ding; Zhongyang Li; Li Du; Ting Liu; Bing Qin; Yi Zheng; Baoxing Huai; |
432 | Video Question Answering: Datasets, Algorithms and Challenges Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This survey aims to sort out the recent advances in video question answering (VideoQA) and point towards future directions. |
Yaoyao Zhong; Wei Ji; Junbin Xiao; Yicong Li; Weihong Deng; Tat-Seng Chua; |
433 | Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce a new method to improve existing multilingual sentence embeddings with Abstract Meaning Representation (AMR). |
Deng Cai; Xin Li; Jackie Chun-Sing Ho; Lidong Bing; Wai Lam; |
434 | Breaking The Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel representation method for Chinese characters to break the bottlenecks, namely StrokeNet, which represents a Chinese character by a Latinized stroke sequence. |
Zhijun Wang; Xuebo Liu; Min Zhang; |
435 | Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To overcome these issues, we propose Boundary-Driven Table-Filling (BDTF), which represents each triplet as a relation region in the 2D table and transforms the ASTE task into detection and classification of relation regions. |
Yice Zhang; Yifan Yang; Yihui Li; Bin Liang; Shiwei Chen; Yixue Dang; Min Yang; Ruifeng Xu; |
436 | Attention and Edge-Label Guided Graph Convolutional Networks for Named Entity Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose the Attention and Edge-Label guided Graph Convolution Network (AELGCN) model. |
Renjie Zhou; Zhongyi Xie; Jian Wan; Jilin Zhang; Yong Liao; Qiang Liu; |
437 | Title2Event: Benchmarking Open Event Extraction with A Large-scale Chinese Title Dataset Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present Title2Event, a large-scale sentence-level dataset benchmarking Open Event Extraction without restricting event types. |
Haolin Deng; Yanan Zhang; Yangfan Zhang; Wangyang Ying; Changlong Yu; Jun Gao; Wei Wang; Xiaoling Bai; Nan Yang; Jin Ma; Xiang Chen; Tianhua Zhou; |
438 | Cascading Biases: Investigating The Effect of Heuristic Annotation Strategies on Data and Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we study cognitive heuristic use in the context of annotating multiple-choice reading comprehension datasets. |
Chaitanya Malaviya; Sudeep Bhatia; Mark Yatskar; |
439 | Teaching Broad Reasoning Skills for Multi-Step QA By Generating Hard Contexts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion. |
Harsh Trivedi; Niranjan Balasubramanian; Tushar Khot; Ashish Sabharwal; |
440 | ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing methods show worse than random guess performance under this scenario. To overcome this limitation, we propose a new technique, ADDMU, adversary detection with data and model uncertainty, which combines two types of uncertainty estimation for both regular and FB adversarial example detection. |
Fan Yin; Yao Li; Cho-Jui Hsieh; Kai-Wei Chang; |
441 | G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, this domain-adaptive pre-training (DAPT) tends to forget the previous general knowledge acquired by general PLMs, which leads to a catastrophic forgetting phenomenon and sub-optimal performance. To alleviate this problem, we propose a new framework of Memory-Augmented Pre-trained Language Model (MAP), which augments the domain-specific PLM by a memory built from the frozen general PLM without losing the general knowledge. |
Zhongwei Wan; Yichun Yin; Wei Zhang; Jiaxin Shi; Lifeng Shang; Guangyong Chen; Xin Jiang; Qun Liu; |
442 | Towards Unifying Reference Expression Generation and Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To address the problems, we propose a unified model for REG and REC, named UniRef. |
Duo Zheng; Tao Kong; Ya Jing; Jiaan Wang; Xiaojie Wang; |
443 | Textual Manifold-based Defense Against Natural Language Adversarial Examples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this study, we find a similar phenomenon occurs in the contextualized embedding space of natural sentences induced by pretrained language models in which textual adversarial examples tend to have their embeddings diverge off the manifold of natural sentence embeddings. |
Dang Nguyen Minh; Anh Tuan Luu; |
444 | Tiny-Attention Adapter: Contexts Are More Important Than The Number of Parameters Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the effectiveness of using tiny-attention (i.e., attention with extremely small per-head dimensionality) as adapters. |
Hongyu Zhao; Hao Tan; Hongyuan Mei; |
445 | Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Negatives Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the instability in the standard dense retrieval training, which iterates between model training and hard negative selection using the being-trained model. |
Si Sun; Chenyan Xiong; Yue Yu; Arnold Overwijk; Zhiyuan Liu; Jie Bao; |
446 | ATTEMPT: Parameter-Efficient Multi-task Tuning Via Attentional Mixtures of Soft Prompts Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This work introduces a new multi-task, parameter-efficient language model (LM) tuning method that learns to transfer knowledge across different tasks via a mixture of soft prompts: small prefix embedding vectors pre-trained for different tasks. |
Akari Asai; Mohammadreza Salehi; Matthew Peters; Hannaneh Hajishirzi; |
447 | Exploration of The Usage of Color Terms By Color-blind Participants in Online Discussion Platforms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We study this question by making a step forward towards a better understanding of the conceptual perception of colors by color-blind individuals, as reflected in their spontaneous linguistic productions. |
Ella Rabinovich; Boaz Carmeli; |
448 | DEER: Descriptive Knowledge Graph for Explaining Entity Relationships Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To construct DEER, we propose a self-supervised learning method to extract relation descriptions with the analysis of dependency patterns and generate relation descriptions with a transformer-based relation description synthesizing model, where no human labeling is required. |
Jie Huang; Kerui Zhu; Kevin Chen-Chuan Chang; Jinjun Xiong; Wen-mei Hwu; |
449 | META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a new TOD architecture: GUI-based task-oriented dialogue system (GUI-TOD). |
Liangtai Sun; Xingyu Chen; Lu Chen; Tianle Dai; Zichen Zhu; Kai Yu; |
450 | Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we provide an in-depth analysis of the mechanism of KD on attention recovery of quantized large Transformers. |
Minsoo Kim; Sihwa Lee; Suk-Jin Hong; Du-Seong Chang; Jungwook Choi; |
451 | Exploring Mode Connectivity for Pre-trained Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate the geometric connections of different minima through the lens of mode connectivity, which measures whether two minima can be connected with a low-loss path. |
Yujia Qin; Cheng Qian; Jing Yi; Weize Chen; Yankai Lin; Xu Han; Zhiyuan Liu; Maosong Sun; Jie Zhou; |
452 | Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Translation has played a crucial role in improving the performance on multilingual tasks: (1) to generate the target language data from the source language data for training and (2) to generate the source language data from the target language data for inference. However, prior works have not considered the use of both translations simultaneously. This paper shows that combining them can synergize the results on various multilingual sentence classification tasks. |
Jaehoon Oh; Jongwoo Ko; Se-Young Yun; |
453 | Increasing Visual Awareness in Multimodal Neural Machine Translation from An Information Theoretic Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we endeavor to improve MMT performance by increasing visual awareness from an information theoretic perspective. |
Baijun Ji; Tong Zhang; Yicheng Zou; Bojie Hu; Si Shen; |
454 | Improving Event Coreference Resolution Using Document-level and Topic-level Information Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: They failed to capture the interactions and contextual cues among those long-distance event mentions. Besides, high-level information, such as event topics, is rarely considered to enhance representation learning for ECR. To address the above two issues, we first apply a Longformer-based encoder to obtain the document-level embeddings and an encoder with a trigger-mask mechanism to learn sentence-level embeddings based on local context. In addition, we propose an event topic generator to infer the latent topic-level representations. |
Sheng Xu; Peifeng Li; Qiaoming Zhu; |
455 | Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: A fixed prompt, however, may not generalize well to the diverse kinds of inputs the task comprises. In order to address this, we propose Vector-quantized Input-contextualized Prompts (VIP) as an extension to the soft prompt tuning framework. |
Rishabh Bhardwaj; Amrita Saha; Steven C.H. Hoi; Soujanya Poria; |
456 | Boosting Natural Language Generation from Instructions with Meta-Learning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper we investigate whether meta-learning applied to MTIL can further improve generalization to unseen tasks in a zero-shot setting. |
Budhaditya Deb; Ahmed Hassan Awadallah; Guoqing Zheng; |
457 | Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We tackle the task of segmenting running (spoken) narratives, which poses hitherto unaddressed challenges. |
Eitan Wagner; Renana Keydar; Amit Pinchevski; Omri Abend; |
458 | Unifying The Convergences in Multilingual Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel training strategy named LSSD (Language-Specific Self-Distillation), which can alleviate the convergence inconsistency and help MNMT models achieve the best performance on each language pair simultaneously. |
Yichong Huang; Xiaocheng Feng; Xinwei Geng; Bing Qin; |
459 | Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To this end, we use an undirected graphical model called pairwise conditional random field (PCRF) to formulate the UFET problem, in which the type variables are not only unarily influenced by the input but also pairwisely relate to all the other type variables. |
Chengyue Jiang; Yong Jiang; Weiqi Wu; Pengjun Xie; Kewei Tu; |
460 | Help Me Write A Poem – Instruction Tuning As A Vehicle for Collaborative Poetry Writing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Building on the prior success of large language models in the realm of computer-assisted creativity, in this work, we present CoPoet, a collaborative poetry writing system, with the goal of studying whether LLMs actually improve the quality of the generated content. |
Tuhin Chakrabarty; Vishakh Padmakumar; He He; |
461 | Open Relation and Event Type Discovery with Type Abstraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This calls for systems that can automatically infer new types from given corpora, a task which we refer to as type discovery. To tackle this problem, we introduce the idea of type abstraction, where the model is prompted to generalize and name the type. |
Sha Li; Heng Ji; Jiawei Han; |
462 | Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we explore methods to make better use of the multilingual annotation and language agnostic property of KG triples, and present novel knowledge based multilingual language models (KMLMs) trained directly on the knowledge triples. |
Linlin Liu; Xin Li; Ruidan He; Lidong Bing; Shafiq Joty; Luo Si; |
463 | Revisiting Grammatical Error Correction Evaluation and Beyond Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To alleviate the limitation, we propose a novel GEC evaluation metric to achieve the best of both worlds, namely PT-M2 which only uses PT-based metrics to score those corrected parts. |
Peiyuan Gong; Xuebo Liu; Heyan Huang; Min Zhang; |
464 | R2D2: Robust Data-to-Text with Replacement Detection Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce R2D2, a training framework that addresses unfaithful Data-to-Text generation by training a system both as a generator and a faithfulness discriminator with additional replacement detection and unlikelihood learning tasks. |
Linyong Nan; Lorenzo Jaime Flores; Yilun Zhao; Yixin Liu; Luke Benson; Weijin Zou; Dragomir Radev; |
465 | IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Existing Indonesian MRC datasets (Purwarianti et al., 2007; Clark et al., 2020) are still inadequate because of the small size and limited question types, i.e., they only cover answerable questions. To fill this gap, we build a new Indonesian MRC dataset called I(n)Don'tKnow-MRC (IDK-MRC) by combining the automatic and manual unanswerable question generation to minimize the cost of manual dataset construction while maintaining the dataset quality. |
Rifki Afina Putri; Alice Oh; |
466 | XLM-D: Decorate Cross-lingual Pre-training Model As Non-Autoregressive Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we establish the connection between a pre-trained masked language model (MLM) and non-autoregressive generation on machine translation. |
Yong Wang; Shilin He; Guanhua Chen; Yun Chen; Daxin Jiang; |
467 | Cross-stitching Text and Knowledge Graph Encoders for Distantly Supervised Relation Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Here, we introduce cross-stitch bi-encoders, which allow full interaction between the text encoder and the KG encoder via a cross-stitch mechanism. |
Qin Dai; Benjamin Heinzerling; Kentaro Inui; |
468 | Assist Non-native Viewers: Multimodal Cross-Lingual Summarization for How2 Videos Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, existing works are restricted to monolingual video scenarios, ignoring the demands of non-native video viewers to understand the cross-language videos in practical applications. It stimulates us to propose a new task, named Multimodal Cross-Lingual Summarization for videos (MCLS), which aims to generate cross-lingual summaries from multimodal inputs of videos. |
Nayu Liu; Kaiwen Wei; Xian Sun; Hongfeng Yu; Fanglong Yao; Li Jin; Guo Zhi; Guangluan Xu; |
469 | PACIFIC: Towards Proactive Conversational Question Answering Over Tabular and Textual Data in Finance Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To facilitate conversational question answering (CQA) over hybrid contexts in finance, we present a new dataset, named PACIFIC. |
Yang Deng; Wenqiang Lei; Wenxuan Zhang; Wai Lam; Tat-Seng Chua; |
470 | Generative Data Augmentation with Contrastive Learning for Zero-Shot Stance Detection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Among them, one of the important challenges is to reduce the domain transfer between seen and unseen targets. To tackle this problem, we propose a generative data augmentation approach to generate training samples containing targets and stances for testing data, and map the real samples and generated synthetic samples into the same embedding space with contrastive learning, then perform the final classification based on the augmented data. |
Yang Li; Jiawei Yuan; |
471 | Better Few-Shot Relation Extraction with Label Prompt Dropout Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Instead, we present a novel approach called label prompt dropout, which randomly removes label descriptions in the learning process. |
Peiyuan Zhang; Wei Lu; |
472 | Break It Down Into BTS: Basic, Tiniest Subword Units for Korean Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We introduce Basic, Tiniest Subword (BTS) units for the Korean language, which are inspired by the invention principle of Hangeul, the Korean writing system. |
Nayeon Kim; Jun-Hyung Park; Joon-Young Choi; Eojin Jeon; Youjin Kang; SangKeun Lee; |
473 | The Devil in Linear Transformer Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, they usually suffer from degraded performances on various tasks and corpus. In this paper, we examine existing kernel-based linear transformers and identify two key issues that lead to such performance gaps: 1) unbounded gradients in the attention computation adversely impact the convergence of linear transformer models; 2) attention dilution which trivially distributes attention scores over long sequences while neglecting neighbouring structures. |
Zhen Qin; Xiaodong Han; Weixuan Sun; Dongxu Li; Lingpeng Kong; Nick Barnes; Yiran Zhong; |
474 | Zero-Shot Learners for Natural Language Understanding Via A Unified Multiple Choice Perspective Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose a new paradigm for zero-shot learners that is format agnostic, i.e., it is compatible with any format and applicable to a list of language tasks, such as text classification, commonsense reasoning, coreference resolution, and sentiment analysis. |
Ping Yang; Junjie Wang; Ruyi Gan; Xinyu Zhu; Lin Zhang; Ziwei Wu; Xinyu Gao; Jiaxing Zhang; Tetsuya Sakai; |
475 | Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To compress and accelerate Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank and meanwhile reduces operations and parameters. |
Sunzhu Li; Peng Zhang; Guobing Gan; Xiuqing Lv; Benyou Wang; Junqiu Wei; Xin Jiang; |
476 | FigMemes: A Dataset for Figurative Language Identification in Politically-Opinionated Memes Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To enable future research into this area, we first present FigMemes, a dataset for figurative language classification in politically-opinionated memes. |
Chen Liu; Gregor Geigle; Robin Krebs; Iryna Gurevych; |
477 | UniRel: Unified Representation and Interaction for Joint Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Therefore, the rich correlations are not fully exploited by existing works. In this paper, we propose UniRel to address these challenges. |
Wei Tang; Benfeng Xu; Yuyue Zhao; Zhendong Mao; Yifeng Liu; Yong Liao; Haiyong Xie; |
478 | X-FACTOR: A Cross-metric Evaluation of Factual Correctness in Abstractive Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we present X-FACTOR, a cross-evaluation of three high-performing fact-aware abstractive summarization methods. |
Subhajit Chaudhury; Sarathkrishna Swaminathan; Chulaka Gunasekara; Maxwell Crouse; Srinivas Ravishankar; Daiki Kimura; Keerthiram Murugesan; Ramón Fernandez Astudillo; Tahira Naseem; Pavan Kapanipathi; Alexander Gray; |
479 | ParaTag: A Dataset of Paraphrase Tagging for Fine-Grained Labels, NLG Evaluation, and Data Augmentation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a novel fine-grained paraphrase annotation schema that labels the minimum spans of tokens in a sentence that don't have corresponding paraphrases in the other sentence. |
Shuohang Wang; Ruochen Xu; Yang Liu; Chenguang Zhu; Michael Zeng; |
480 | Factual Accuracy Is Not Enough: Planning Consistent Description Order for Radiology Report Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We employ a planning-based radiology report generation system that generates the overall structure of reports as “plans” prior to generating reports that are accurate and consistent in order. Additionally, we propose a novel reinforcement learning and inference method, Coordinated Planning (CoPlan), that includes a content planner and a text generator to train and infer in a coordinated manner to alleviate the cascading of errors that are often inherent in planning-based models. |
Toru Nishino; Yasuhide Miura; Tomoki Taniguchi; Tomoko Ohkuma; Yuki Suzuki; Shoji Kido; Noriyuki Tomiyama; |
481 | FLUTE: Figurative Language Understanding Through Textual Explanations Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Yet no such data exists for figurative language, making it harder to assess genuine understanding of such expressions. To address this issue, we release FLUTE, a dataset of 9,000 figurative NLI instances with explanations, spanning four categories: Sarcasm, Simile, Metaphor, and Idioms. |
Tuhin Chakrabarty; Arkadiy Saakyan; Debanjan Ghosh; Smaranda Muresan; |
482 | Precisely The Point: Adversarial Augmentations for Faithful and Informative Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we conduct the first quantitative analysis on the robustness of pre-trained Seq2Seq models. |
Wenhao Wu; Wei Li; Jiachen Liu; Xinyan Xiao; Sujian Li; Yajuan Lyu; |
483 | RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this work, we propose RLET, a Reinforcement Learning based Entailment Tree generation framework, which is trained utilising the cumulative signals across the whole tree. |
Tengxiao Liu; Qipeng Guo; Xiangkun Hu; Yue Zhang; Xipeng Qiu; Zheng Zhang; |
484 | Let The CAT Out of The Bag: Contrastive Attributed Explanations for Text Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a method Contrastive Attributed explanations for Text (CAT) which provides contrastive explanations for natural language text data with a novel twist as we build and exploit attribute classifiers leading to more semantically meaningful explanations. |
Saneem Chemmengath; Amar Prakash Azad; Ronny Luss; Amit Dhurandhar; |
485 | MonoQA: Multi-Task Learning of Reranking and Answer Extraction for Open-Retrieval Conversational Question Answering Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we investigate the use of Multi-Task Learning (MTL) to improve performance on the ORConvQA task by sharing the reranker and reader's learned structure in a generative model. |
Sarawoot Kongyoung; Craig Macdonald; Iadh Ounis; |
486 | Composing Ci with Reinforced Non-autoregressive Text Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Moreover, with the format prepared, Ci generation can be carried out as an efficient synchronous process, something autoregressive models cannot do since they follow a character-by-character generation protocol. Therefore, in this paper, we propose to compose Ci through a non-autoregressive approach, which not only ensures that the generation process accommodates tune patterns by controlling the rhythm and essential meaning of each sentence, but also allows the model to perform synchronous generation. |
Yan Song; |
487 | MetaTKG: Learning Evolutionary Meta-Knowledge for Temporal Knowledge Graph Reasoning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Since existing models highly rely on historical information to learn embeddings for entities, they perform poorly on such entities with little historical information. To tackle these issues, we propose a novel Temporal Meta-learning framework for TKG reasoning, MetaTKG for brevity. |
Yuwei Xia; Mengqi Zhang; Qiang Liu; Shu Wu; Xiao-Yu Zhang; |
488 | MPLUG: Effective and Efficient Vision-Language Learning By Cross-modal Skip-connections Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: This paper presents mPLUG, a new vision-language foundation model for both cross-modal understanding and generation. |
Chenliang Li; Haiyang Xu; Junfeng Tian; Wei Wang; Ming Yan; Bin Bi; Jiabo Ye; He Chen; Guohai Xu; Zheng Cao; Ji Zhang; Songfang Huang; Fei Huang; Jingren Zhou; Luo Si; |
489 | Q-TOD: A Query-driven Task-oriented Dialogue System Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce a novel query-driven task-oriented dialogue system, namely Q-TOD. |
Xin Tian; Yingzhan Lin; Mengfei Song; Siqi Bao; Fan Wang; Huang He; Shuqi Sun; Hua Wu; |
490 | Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we introduce the task of learning unsupervised dialogue embeddings. |
Che Liu; Rui Wang; Junfeng Jiang; Yongbin Li; Fei Huang; |
491 | WR-One2Set: Towards Well-Calibrated Keyphrase Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Nevertheless, we observe serious calibration errors output by ONE2SET, especially the over-estimation of the ∅ token (meaning “no corresponding keyphrase”). In this paper, we deeply analyze this limitation and identify two main reasons behind it: 1) the parallel generation has to introduce excessive ∅ padding tokens into training instances; and 2) the training mechanism assigning a target to each slot is unstable and further aggravates the ∅ token over-estimation. |
Binbin Xie; Xiangpeng Wei; Baosong Yang; Huan Lin; Jun Xie; Xiaoli Wang; Min Zhang; Jinsong Su; |
492 | Eeny, Meeny, Miny, Moe. How to Choose Data for Morphological Inflection Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we explore four sampling strategies for the task of morphological inflection using a Transformer model: a pair of oracle experiments where data is chosen based on correct/incorrect predictions by the model, model confidence, entropy, and random selection. |
Saliha Muradoglu; Mans Hulden; |
493 | An Adaptive Logical Rule Embedding Model for Inductive Reasoning Over Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We combine the two methods to capture deep causal logic by learning rule embeddings, and propose an interpretable model for temporal knowledge graph reasoning called adaptive logical rule embedding model for inductive reasoning (ALRE-IR). |
Xin Mei; Libin Yang; Xiaoyan Cai; Zuowei Jiang; |
494 | UniNL: Aligning Representation Learning with Scoring Function for OOD Detection Via Unified Neighborhood Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a unified neighborhood learning framework (UniNL) to detect OOD intents. |
Yutao Mou; Pei Wang; Keqing He; Yanan Wu; Jingang Wang; Wei Wu; Weiran Xu; |
495 | Open-domain Video Commentary Generation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We detail the construction of a new large-scale dataset of transcribed commentary aligned with videos containing various human actions in a variety of domains, and propose approaches based on well-known neural architectures to tackle the task. |
Edison Marrese-Taylor; Yumi Hamazono; Tatsuya Ishigaki; Goran Topic; Yusuke Miyao; Ichiro Kobayashi; Hiroya Takamura; |
496 | One Size Does Not Fit All: Investigating Strategies for Differentially-private Learning Across NLP Tasks Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this short paper, we provide an extensive analysis of different privacy preserving strategies on seven downstream datasets in five different “typical” NLP tasks with varying complexity using modern neural models based on BERT and XtremeDistil architectures. |
Manuel Senge; Timour Igamberdiev; Ivan Habernal; |
497 | Counterfactual Recipe Generation: Exploring Compositional Generalization in A Realistic Scenario Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we investigate whether pretrained language models can perform compositional generalization in a realistic setting: recipe generation. |
Xiao Liu; Yansong Feng; Jizhi Tang; Chengang Hu; Dongyan Zhao; |
498 | Tutoring Helps Students Learn Better: Improving Knowledge Distillation for BERT with Tutor Network Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a novel KD framework, Tutor-KD, which improves the distillation effectiveness by controlling the difficulty of training examples during pre-training. |
Junho Kim; Jun-Hyung Park; Mingyu Lee; Wing-Lam Mok; Joon-Young Choi; SangKeun Lee; |
499 | Does Corpus Quality Really Matter for Low-Resource Languages? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Taking representation learning in Basque as a case study, we explore tailored crawling (manually identifying and scraping websites with high-quality content) as an alternative to filtering CommonCrawl. |
Mikel Artetxe; Itziar Aldabe; Rodrigo Agerri; Olatz Perez-de-Viñaspre; Aitor Soroa; |
500 | Unifying Data Perspectivism and Personalization: An Application to Social Norms Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we examine a corpus of social media posts about conflict from a set of 13k annotators and 210k judgements of social norms. |
Joan Plepi; Béla Neuendorf; Lucie Flek; Charles Welch; |
501 | Does Self-Rationalization Improve Robustness to Spurious Correlations? Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Specifically, we evaluate how training self-rationalization models with free-text rationales affects robustness to spurious correlations in fine-tuned encoder-decoder and decoder-only models of six different sizes. |
Alexis Ross; Matthew Peters; Ana Marasovic; |
502 | Efficient Pre-training of Masked Language Model Via Concept-based Curriculum Masking Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a novel concept-based curriculum masking (CCM) method to efficiently pre-train a language model. |
Mingyu Lee; Jun-Hyung Park; Junho Kim; Kang-Min Kim; SangKeun Lee; |
503 | Subword Evenness (SuE) As A Predictor of Cross-lingual Transfer to Low-resource Languages Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this study, we show that languages written in non-Latin and non-alphabetic scripts (mostly Asian languages) are the best choices for improving performance on the task of Masked Language Modelling (MLM) in a diverse set of 30 low-resource languages and that the success of the transfer is well predicted by our novel measure of Subword Evenness (SuE). |
Olga Pelloni; Anastassia Shaitarova; Tanja Samardzic; |
504 | A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a BERT-based model with feature projection and length-balanced loss (BERT-FP-LBL) to determine the difficulty level of a given text. |
Wenbiao Li; Wang Ziyang; Yunfang Wu; |
505 | Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: To overcome the disadvantages, we reformulate overlapped speaker diarization task as a single-label prediction problem via the proposed power set encoding (PSE). |
Zhihao Du; ShiLiang Zhang; Siqi Zheng; Zhi-Jie Yan; |
506 | GREENER: Graph Neural Networks for News Media Profiling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. |
Panayot Panayotov; Utsav Shukla; Husrev Taha Sencar; Mohamed Nabeel; Preslav Nakov; |
507 | Graph Hawkes Transformer for Extrapolated Reasoning on Temporal Knowledge Graphs Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To this end, we propose a Graph Hawkes Transformer (GHT) for both TKG entity prediction and time prediction tasks in the future time. |
Haohai Sun; Shangyi Geng; Jialun Zhong; Han Hu; Kun He; |
508 | UniRPG: Unified Discrete Reasoning Over Table and Text As Program Generation Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose UniRPG, a semantic-parsing-based approach advanced in interpretability and scalability, to perform Unified discrete Reasoning over heterogeneous knowledge resources, i.e., table and text, as Program Generation. |
Yongwei Zhou; Junwei Bao; Chaoqun Duan; Youzheng Wu; Xiaodong He; Tiejun Zhao; |
509 | Don't Prompt, Search! Mining-based Zero-Shot Learning with Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose an alternative mining-based approach for zero-shot learning. |
Mozes van de Kar; Mengzhou Xia; Danqi Chen; Mikel Artetxe; |
510 | SEMGraph: Incorporating Sentiment Knowledge and Eye Movement Into Graph Model for Sentiment Analysis Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This paper investigates the sentiment analysis task from a novel perspective by incorporating sentiment knowledge and eye movement into a graph architecture, aiming to draw the eye movement-based sentiment relationships for learning the sentiment expression of the context. |
Bingbing Wang; Bin Liang; Jiachen Du; Min Yang; Ruifeng Xu; |
511 | Cross-lingual Neural Fuzzy Matching for Exploiting Target-language Monolingual Corpora in Computer-aided Translation Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, the reduced availability of in-domain TMs, as compared to in-domain monolingual corpora, limits its adoption for a number of translation tasks. In this paper, we introduce a novel neural approach aimed at overcoming this limitation by exploiting not only TMs, but also in-domain target-language (TL) monolingual corpora, and still enabling a similar functionality to that offered by conventional TM-based CAT tools. |
Miquel Esplà-Gomis; Víctor M. Sánchez-Cartagena; Juan Antonio Pérez-Ortiz; Felipe Sánchez-Martínez; |
512 | Multi-Label Intent Detection Via Contrastive Task Specialization of Sentence Encoders Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Deploying task-oriented dialog (ToD) systems for new domains and tasks requires natural language understanding models that are 1) resource-efficient and work under low-data regimes; 2) adaptable, efficient, and quick-to-train; 3) expressive and can handle complex ToD scenarios with multiple user intents in a single utterance. Motivated by these requirements, we introduce a novel framework for multi-label intent detection (mID): MultI-ConvFiT (Multi-Label Intent Detection via Contrastive Conversational Fine-Tuning). |
Ivan Vulic; Iñigo Casanueva; Georgios Spithourakis; Avishek Mondal; Tsung-Hsien Wen; Pawel Budzianowski; |
513 | Discovering Language-neutral Sub-networks in Multilingual Language Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: However, the extent to which they learn language-neutral representations (i.e., shared representations that encode similar phenomena across languages), and the effect of such representations on cross-lingual transfer performance, remain open questions. In this work, we conceptualize language neutrality of multilingual models as a function of the overlap between language-encoding sub-networks of these models. |
Negar Foroutan; Mohammadreza Banaei; Rémi Lebret; Antoine Bosselut; Karl Aberer; |
514 | Parameter-Efficient Tuning Makes A Good Classification Head Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we find that parameter-efficient tuning makes a good classification head, with which we can simply replace the randomly initialized heads for a stable performance gain. |
Zhuoyi Yang; Ming Ding; Yanhui Guo; Qingsong Lv; Jie Tang; |
515 | STGN: An Implicit Regularization Method for Learning with Noisy Labels in Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, previous studies exert identical perturbation for all samples, which may cause overfitting on incorrect ones or optimizing correct ones inadequately. To facilitate this, we propose a novel stochastic tailor-made gradient noise (STGN), mitigating the effect of inherent label noise by introducing tailor-made benign noise for each sample. |
Tingting Wu; Xiao Ding; Minji Tang; Hao Zhang; Bing Qin; Ting Liu; |
516 | Cross-Modal Similarity-Based Curriculum Learning for Image Captioning Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we propose a simple yet efficient difficulty measurement for image captioning using cross-modal similarity calculated by a pretrained vision-language model. |
Hongkuan Zhang; Saku Sugawara; Akiko Aizawa; Lei Zhou; Ryohei Sasano; Koichi Takeda; |
517 | Debiasing Masks: A New Framework for Shortcut Mitigation in NLU Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: We propose a new debiasing method in which we identify debiased pruning masks that can be applied to a finetuned model. |
Johannes Mario Meissner; Saku Sugawara; Akiko Aizawa; |
518 | Extending Phrase Grounding with Pronouns in Visual Dialogues Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: First, we construct a dataset of phrase grounding that maps both noun phrases and pronouns to image regions. Based on the dataset, we test the performance of phrase grounding using a state-of-the-art model from this line of work. Then, we enhance the baseline grounding model with coreference information, which should potentially help our task, modeling the coreference structures with graph convolutional networks. |
Panzhong Lu; Xin Zhang; Meishan Zhang; Min Zhang; |
519 | EUR-Lex-Sum: A Multi- and Cross-lingual Dataset for Long-form Summarization in The Legal Domain Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we propose a novel dataset, called EUR-Lex-Sum, based on manually curated document summaries of legal acts from the European Union law platform (EUR-Lex). |
Dennis Aumiller; Ashish Chouhan; Michael Gertz; |
520 | Differentiable Data Augmentation for Contrastive Sentence Representation Learning Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Although the contrastive learning framework has shown its superiority on sentence representation learning over previous methods, the potential of such a framework is under-explored so far due to the simple method it used to construct positive pairs. Motivated by this, we propose a method that makes hard positives from the original training examples. |
Tianduo Wang; Wei Lu; |
521 | Text Style Transferring Via Adversarial Masking and Styled Filling Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle both challenges, in this study, we propose a style transfer model, with an adversarial masking approach and a styled filling technique (AMSF). |
Jiarui Wang; Richong Zhang; Junfan Chen; Jaein Kim; Yongyi Mao; |
522 | Character-level White-Box Adversarial Attacks Against Transformers Via Attachable Subwords Substitution Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We propose the first character-level white-box adversarial attack method against transformer models. |
Aiwei Liu; Honghai Yu; Xuming Hu; Shu'ang Li; Li Lin; Fukun Ma; Yawen Yang; Lijie Wen; |
523 | Query-based Instance Discrimination Network for Relational Triple Extraction Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, they still suffer from error propagation, relation redundancy and lack of high-level connections between triples. To address these issues, we propose a novel query-based approach to construct instance-level representations for relational triples. |
Zeqi Tan; Yongliang Shen; Xuming Hu; Wenqi Zhang; Xiaoxia Cheng; Weiming Lu; Yueting Zhuang; |
524 | Learning Inter-Entity-Interaction for Few-Shot Knowledge Graph Completion Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Such practice, however, ignores the inter-entity interaction, resulting in low-discrimination representations for entity pairs, especially when these entity pairs are associated with 1-to-N, N-to-1, and N-to-N relations. To address this issue, this paper proposes a novel FKGC model, named Cross-Interaction Attention Network (CIAN) to investigate the inter-entity interaction between head and tail entities. |
Yuling Li; Kui Yu; Xiaoling Huang; Yuhong Zhang; |
525 | Empowering The Fact-checkers! Automatic Identification of Claim Spans on Twitter Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this work, we introduce the novel task of Claim Span Identification (CSI). |
Megha Sundriyal; Atharva Kulkarni; Vaibhav Pulastya; Md. Shad Akhtar; Tanmoy Chakraborty; |
526 | ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We present ClidSum, a benchmark dataset towards building cross-lingual summarization systems on dialogue documents. |
Jiaan Wang; Fandong Meng; Ziyao Lu; Duo Zheng; Zhixu Li; Jianfeng Qu; Jie Zhou; |
527 | Spectral Probing Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: Contextualized embeddings have analogously been found to capture these phenomena at distinctive layers and frequencies. Leveraging these findings, we develop a fully learnable frequency filter to identify spectral profiles for any given task. |
Max Müller-Eberstein; Rob van der Goot; Barbara Plank; |
528 | QASem Parsing: Text-to-text Modeling of QA-based Semantics Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: More recently, an appealing trend introduces semi-structured natural-language structures as an intermediate meaning-capturing representation, often in the form of questions and answers. In this work, we further promote this line of research by considering three prior QA-based semantic representations. |
Ayal Klein; Eran Hirsch; Ron Eliav; Valentina Pyatkin; Avi Caciularu; Ido Dagan; |
529 | Keyphrase Generation Via Soft and Hard Semantic Corrections Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: To tackle the above biases, we propose a novel correction model CorrKG on top of the MLE pipeline, where the biases are corrected via the optimal transport (OT) and a frequency-based filtering-and-sorting (FreqFS) strategy. |
Guangzhen Zhao; Guoshun Yin; Peng Yang; Yu Yao; |
530 | Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: Previous works have shown promising results; however, they relied on expensive query annotations for VCMR, i.e., the corresponding moment intervals. To overcome this problem, we propose a self-supervised learning framework: Modal-specific Pseudo Query Generation Network (MPGN). |
Minjoon Jung; SeongHo Choi; JooChan Kim; Jin-Hwa Kim; Byoung-Tak Zhang; |
531 | DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating The Robustness of Question Matching Models Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we focus on the robustness evaluation of Chinese Question Matching (QM) models. |
Hongyu Zhu; Yan Chen; Jing Yan; Jing Liu; Yu Hong; Ying Chen; Hua Wu; Haifeng Wang; |
532 | DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: We introduce DivEMT, the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages. |
Gabriele Sarti; Arianna Bisazza; Ana Guerberof-Arenas; Antonio Toral; |
533 | Bridging Fairness and Environmental Sustainability in Natural Language Processing Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: This lacuna is highly problematic, since there is increasing evidence that an exclusive focus on fairness can actually hinder environmental sustainability, and vice versa. In this work, we shed light on this crucial intersection in NLP by (1) investigating the efficiency of current fairness approaches through surveying example methods for reducing unfair stereotypical bias from the literature, and (2) evaluating a common technique to reduce energy consumption (and thus environmental impact) of English NLP models, knowledge distillation (KD), for its impact on fairness. |
Marius Hessenthaler; Emma Strubell; Dirk Hovy; Anne Lauscher; |
534 | UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition Related Papers Related Patents Related Grants Related Orgs Related Experts Related Code View Highlight: In this paper, we propose a multimodal sentiment knowledge-sharing framework (UniMSE) that unifies MSA and ERC tasks from features, labels, and models. |
Guimin Hu; Ting-En Lin; Yi Zhao; Guangming Lu; Yuchuan Wu; Yongbin Li; |
535 | Is The Brain Mechanism for Hierarchical Structure Building Universal Across Languages? An fMRI Study of Chinese and English Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: In this paper, we first analyze the differences in language structure between two diverse languages: Chinese and English. By computing the working memory requirements when applying parsing strategies to different language structures, we find that top-down parsing generates less memory load for the right-branching English and bottom-up parsing is less memory-demanding for Chinese. |
Xiaohan Zhang; Shaonan Wang; Nan Lin; Chengqing Zong; |
536 | HashFormers: Towards Vocabulary-independent Pre-trained Transformers Related Papers Related Patents Related Grants Related Orgs Related Experts View Highlight: However, these methods are not pre-trained. Inspired by this line of work, we propose HashFormers, a new family of vocabulary-independent pre-trained transformers that support an unlimited vocabulary (i. e. all possible tokens in a corpus) given a substantially smaller fixed-sized embedding matrix. |
Huiyin Xue; Nikolaos Aletras; |
537 | MatchPrompt: Prompt-based Open Relation Extraction with Semantic Consistency Guided Clustering Related Papers Related Patents |