Paper Digest: ACL 2023 Highlights
The Annual Meeting of the Association for Computational Linguistics (ACL) is one of the top natural language processing conferences in the world. In 2023, it was held in Toronto, Canada.
To help the community quickly catch up on the work presented at this conference, the Paper Digest team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights/summaries to quickly grasp the main idea of each paper.
Based in New York, Paper Digest is dedicated to helping people generate content and reason over unstructured data. Unlike black-box approaches, we build deep models on semantics, which allows results to be produced with explanations. Such models power this website, and are behind our services including “search engine”, “summarization”, “question answering”, and “literature review”.
If you do not want to miss interesting academic papers, you are welcome to sign up for our daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to receive new conference digests.
Paper Digest Team
New York City, New York, 10017
team@paperdigest.org
TABLE 1: Paper Digest: ACL 2023 Highlights
# | Paper | Author(s)
---|---|---
1 | One Cannot Stand for Everyone! Leveraging Multiple User Simulators to Train Task-oriented Dialogue Systems. Highlight: In this paper, we propose a framework called MUST to optimize ToD systems via leveraging Multiple User SimulaTors. | Yajiao Liu; Xin Jiang; Yichun Yin; Yasheng Wang; Fei Mi; Qun Liu; Xiang Wan; Benyou Wang |
2 | SafeConv: Explaining and Correcting Conversational Unsafe Behavior. Highlight: In this work, we construct a new dataset called SafeConv for the research of conversational safety: (1) besides the utterance-level safety labels, SafeConv also provides unsafe spans in an utterance, information able to indicate which words contribute to the detected unsafe behavior; (2) SafeConv provides safe alternative responses to continue the conversation when unsafe behavior is detected, guiding the conversation to a gentle trajectory. | Mian Zhang; Lifeng Jin; Linfeng Song; Haitao Mi; Wenliang Chen; Dong Yu |
3 | Detecting and Mitigating Hallucinations in Machine Translation: Model Internal Workings Alone Do Well, Sentence Similarity Even Better. Highlight: We propose to use a method that evaluates the percentage of the source contribution to a generated translation. | David Dale; Elena Voita; Loic Barrault; Marta R. Costa-jussà |
4 | Explainable Recommendation with Personalized Review Retrieval and Aspect Learning. Highlight: However, historical user reviews of items are often insufficient, making it challenging to ensure the precision of generated explanation text. To address this issue, we propose a novel model, ERRA (Explainable Recommendation by personalized Review retrieval and Aspect learning). | Hao Cheng; Shuo Wang; Wensheng Lu; Wei Zhang; Mingyang Zhou; Kezhong Lu; Hao Liao |
5 | Binary and Ternary Natural Language Generation. Highlight: We approach the problem with a mix of statistics-based quantization for the weights and elastic quantization of the activations, and demonstrate the first ternary and binary transformer models on the downstream tasks of summarization and machine translation. | Zechun Liu; Barlas Oguz; Aasish Pappu; Yangyang Shi; Raghuraman Krishnamoorthi |
6 | Span-Selective Linear Attention Transformers for Effective and Robust Schema-Guided Dialogue State Tracking. Highlight: We introduce SPLAT, a novel architecture which achieves better generalization and efficiency than prior approaches by constraining outputs to a limited prediction space. | Björn Bebensee; Haejun Lee |
7 | EM Pre-training for Multi-party Dialogue Response Generation. Highlight: However, due to the lack of annotated addressee labels in multi-party dialogue datasets, it is hard to use them to pre-train a response generation model for multi-party dialogues. To tackle this obstacle, we propose an Expectation-Maximization (EM) approach that iteratively performs the expectation steps to generate addressee labels, and the maximization steps to optimize a response generation model. | Yiyang Li; Hai Zhao |
8 | ACLM: A Selective-Denoising Based Generative Data Augmentation Approach for Low-Resource Complex NER. Highlight: In this paper, we present ACLM (Attention-map aware keyword selection for Conditional Language Model fine-tuning), a novel data augmentation approach based on conditional generation, to address the data scarcity problem in low-resource complex NER. | Sreyan Ghosh; Utkarsh Tyagi; Manan Suri; Sonal Kumar; Ramaneswaran S; Dinesh Manocha |
9 | Natural Language to Code Generation in Interactive Data Science Notebooks. Highlight: It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. | Pengcheng Yin; Wen-Ding Li; Kefan Xiao; Abhishek Rao; Yeming Wen; Kensen Shi; Joshua Howland; Paige Bailey; Michele Catasta; Henryk Michalewski; Oleksandr Polozov; Charles Sutton |
10 | Subset Retrieval Nearest Neighbor Machine Translation. Highlight: In this paper, we propose “Subset kNN-MT”, which improves the decoding speed of kNN-MT by two methods: (1) retrieving neighbor target tokens from a subset that is the set of neighbor sentences of the input sentence, not from all sentences, and (2) an efficient distance computation technique that is suitable for subset neighbor search using a look-up table. | Hiroyuki Deguchi; Taro Watanabe; Yusuke Matsui; Masao Utiyama; Hideki Tanaka; Eiichiro Sumita |
11 | MIL-Decoding: Detoxifying Language Models at Token-Level Via Multiple Instance Learning. Highlight: We introduce MIL-Decoding, which detoxifies language models at the token level by interpolating them with a trained multiple instance learning (MIL) network. | Xu Zhang; Xiaojun Wan |
12 | Dependency Resolution at The Syntax-semantics Interface: Psycholinguistic and Computational Insights on Control Dependencies. Highlight: Our results show that while humans correctly identify the (un)acceptability of the strings, language models often fail to identify the correct antecedent in non-adjacent dependencies, showing their reliance on linearity. | Iria de-Dios-Flores; Juan Garcia Amboage; Marcos Garcia |
13 | Open-ended Long Text Generation Via Masked Language Modeling. Highlight: To enhance the long text generation capability of MLMs, we introduce two simple yet effective strategies for the iterative NAR model: dynamic sliding window attention (DSWA) and linear temperature decay (LTD). | Xiaobo Liang; Zecheng Tang; Juntao Li; Min Zhang |
14 | A Method for Studying Semantic Construal in Grammatical Constructions with Interpretable Contextual Embedding Spaces. Highlight: We study semantic construal in grammatical constructions using large language models. | Gabriella Chronis; Kyle Mahowald; Katrin Erk |
15 | Holographic CCG Parsing. Highlight: We propose a method for formulating CCG as a recursive composition in a continuous vector space. | Ryosuke Yamaki; Tadahiro Taniguchi; Daichi Mochihashi |
16 | Prompts Can Play Lottery Tickets Well: Achieving Lifelong Information Extraction Via Lottery Prompt Tuning. Highlight: In this paper, we study the UIE system under a more challenging yet practical scenario, i.e., “lifelong learning” settings, to evaluate its abilities in three aspects, including knowledge sharing and expansion, catastrophic forgetting prevention, and rapid generalization on few-shot and unseen tasks. To achieve these three goals, we present a novel parameter- and deployment-efficient prompt tuning method, namely Lottery Prompt Tuning (LPT). | Zujie Liang; Feng Wei; Yin Jie; Yuxi Qian; Zhenghong Hao; Bing Han |
17 | Retrieve-and-Sample: Document-level Event Argument Extraction Via Hybrid Retrieval Augmentation. Highlight: We investigate various retrieval settings from the input and label distribution views in this paper. | Yubing Ren; Yanan Cao; Ping Guo; Fang Fang; Wei Ma; Zheng Lin |
18 | WeCheck: Strong Factual Consistency Checker Via Weakly Supervised Learning. Highlight: Bias in synthetic text or upstream tasks makes them perform poorly on text actually generated by language models, especially for general evaluation for various tasks. To alleviate this problem, we propose a weakly supervised framework named WeCheck that is directly trained on actual generated samples from language models with weakly annotated labels. | Wenhao Wu; Wei Li; Xinyan Xiao; Jiachen Liu; Sujian Li; Yajuan Lyu |
19 | AMR-based Network for Aspect-based Sentiment Analysis. Highlight: However, further improvement is limited due to the potential mismatch between the dependency tree as a syntactic structure and the sentiment classification as a semantic task. To alleviate this gap, we replace the syntactic dependency tree with the semantic structure named Abstract Meaning Representation (AMR) and propose a model called AMR-based Path Aggregation Relational Network (APARN) to take full advantage of semantic structures. | Fukun Ma; Xuming Hu; Aiwei Liu; Yawen Yang; Shuang Li; Philip S. Yu; Lijie Wen |
20 | Text Adversarial Purification As Defense Against Adversarial Attacks. Highlight: In this work, we introduce a novel adversarial purification method that focuses on defending against textual adversarial attacks. | Linyang Li; Demin Song; Xipeng Qiu |
21 | SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres. Highlight: In most NLP cases, event structures are complex with manifold dependency, and it is challenging to effectively represent these complicated structured events. To address these issues, we propose Structured Prediction with Energy-based Event-Centric Hyperspheres (SPEECH). | Shumin Deng; Shengyu Mao; Ningyu Zhang; Bryan Hooi |
22 | Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection. Highlight: In this paper, we present Rule By Example (RBE): a novel exemplar-based contrastive learning approach for learning from logical rules for the task of textual content moderation. | Christopher Clarke; Matthew Hall; Gaurav Mittal; Ye Yu; Sandra Sajeev; Jason Mars; Mei Chen |
23 | What About “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns. Highlight: Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this “reality check”, we study how three commercial MT systems translate 3rd-person pronouns. | Anne Lauscher; Debora Nozza; Ehm Miltersen; Archie Crowley; Dirk Hovy |
24 | What Is Overlap Knowledge in Event Argument Extraction? APE: A Cross-datasets Transfer Learning Model for EAE. Highlight: In this paper, we clearly define the overlap knowledge across datasets and split the knowledge of the EAE task into overlap knowledge across datasets and specific knowledge of the target dataset. | Kaihang Zhang; Kai Shuang; Xinyue Yang; Xuyang Yao; Jinyu Guo |
25 | Tailor: A Soft-Prompt-Based Approach to Attribute-Based Controlled Text Generation. Highlight: Existing work usually utilizes fine-tuning or resorts to extra attribute classifiers, yet suffers from increases in storage and inference time. To address these concerns, we explore attribute-based CTG in a parameter-efficient manner. | Kexin Yang; Dayiheng Liu; Wenqiang Lei; Baosong Yang; Mingfeng Xue; Boxing Chen; Jun Xie |
26 | Knowledge of Cultural Moral Norms in Large Language Models. Highlight: We investigate the extent to which monolingual English language models contain knowledge about moral norms in different countries. | Aida Ramezani; Yang Xu |
27 | Songs Across Borders: Singable and Controllable Neural Lyric Translation. Highlight: This paper bridges the singability quality gap by formalizing lyric translation into a constrained translation problem, converting theoretical guidance and practical techniques from translatology literature to prompt-driven NMT approaches, exploring better adaptation methods, and instantiating them to an English-Chinese lyric translation system. | Longshen Ou; Xichu Ma; Min-Yen Kan; Ye Wang |
28 | Fantastic Expressions and Where to Find Them: Chinese Simile Generation with Multiple Constraints. Highlight: To this end, we introduce controllable simile generation (CSG), a new task that requires the model to generate a simile with multiple simile elements, e.g., context and vehicle. | Kexin Yang; Dayiheng Liu; Wenqiang Lei; Baosong Yang; Xiangpeng Wei; Zhengyuan Liu; Jun Xie |
29 | Revealing Single Frame Bias for Video-and-Language Learning. Highlight: In this work, we explore single-frame models for video-and-language learning. | Jie Lei; Tamara Berg; Mohit Bansal |
30 | Learning with Partial Annotations for Event Detection. Highlight: In this work, we conduct a seminal study for learning with partial annotations for ED. | Jian Liu; Dianbo Sui; Kang Liu; Haoyan Liu; Zhe Zhao |
31 | World-to-Words: Grounded Open Vocabulary Acquisition Through Fast Mapping in Vision-Language Models. Highlight: As an initial attempt, we propose World-to-Words (W2W), a novel visually-grounded language model by pre-training on image-text pairs highlighting grounding as an objective. | Ziqiao Ma; Jiayi Pan; Joyce Chai |
32 | A Causal Framework to Quantify The Robustness of Mathematical Reasoning with Language Models. Highlight: Building on the idea of behavioral testing, we propose a novel framework which pins down the causal effect of various factors in the input, e.g., the surface form of the problem text, the operands, and math operators, on the output solution. | Alessandro Stolfo; Zhijing Jin; Kumar Shridhar; Bernhard Schoelkopf; Mrinmaya Sachan |
33 | Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information. Highlight: The long-standing one-to-many issue of open-domain dialogues poses significant challenges for automatic evaluation methods, i.e., there may be multiple suitable responses which differ in semantics for a given conversational context. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robustly evaluate open-domain dialogues by augmenting Conditional Variational Autoencoders (CVAEs) with a Next Sentence Prediction (NSP) objective and employing Mutual Information (MI) to model the semantic similarity of text in the latent space. | Kun Zhao; Bohao Yang; Chenghua Lin; Wenge Rong; Aline Villavicencio; Xiaohui Cui |
34 | Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions. Highlight: In this work, we explore human-AI partnerships to facilitate high diversity and accuracy in LLM-based text data generation. | John Chung; Ece Kamar; Saleema Amershi |
35 | Pruning Pre-trained Language Models Without Fine-Tuning. Highlight: In this work, we argue that fine-tuning is redundant for first-order pruning, since first-order pruning is sufficient to converge PLMs to downstream tasks without fine-tuning. | Ting Jiang; Deqing Wang; Fuzhen Zhuang; Ruobing Xie; Feng Xia |
36 | When Does Translation Require Context? A Data-driven, Multilingual Exploration. Highlight: In this paper, we develop the Multilingual Discourse-Aware (MuDA) benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena in any given dataset. | Patrick Fernandes; Kayo Yin; Emmy Liu; André Martins; Graham Neubig |
37 | Causal Intervention and Counterfactual Reasoning for Multi-modal Fake News Detection. Highlight: In this paper, we analyze and identify the psycholinguistic bias in the text and the bias of inferring news labels based on only image features. | Ziwei Chen; Linmei Hu; Weixin Li; Yingxia Shao; Liqiang Nie |
38 | LexSym: Compositionality As Lexical Symmetry. Highlight: In this paper, we present a domain-general and model-agnostic formulation of compositionality as a constraint on symmetries of data distributions rather than models. | Ekin Akyurek; Jacob Andreas |
39 | Layer-wise Fusion with Modality Independence Modeling for Multi-modal Emotion Recognition. Highlight: In this paper, we propose that maintaining modality independence is beneficial for the model performance. | Jun Sun; Shoukang Han; Yu-Ping Ruan; Xiaoning Zhang; Shu-Kai Zheng; Yulong Liu; Yuxin Huang; Taihao Li |
40 | CASN: Class-Aware Score Network for Textual Adversarial Detection. Highlight: However, these methods suffer from significant performance degradation when the adversarial samples lie close to the non-adversarial data manifold. To address this limitation, we propose a score-based generative method to implicitly model the data distribution. | Rong Bao; Rui Zheng; Liang Ding; Qi Zhang; Dacheng Tao |
41 | Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest. Highlight: We investigate both multimodal and language-only models: the former are challenged with the cartoon images directly, while the latter are given multifaceted descriptions of the visual scene to simulate human-level visual understanding. | Jack Hessel; Ana Marasovic; Jena D. Hwang; Lillian Lee; Jeff Da; Rowan Zellers; Robert Mankoff; Yejin Choi |
42 | Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation. Highlight: In this study, we investigate whether data augmentation techniques could help improve low-resource ASR performance, focusing on four typologically diverse minority languages or language variants (West Germanic: Gronings, West-Frisian; Malayo-Polynesian: Besemah, Nasal). | Martijn Bartelds; Nay San; Bradley McDonnell; Dan Jurafsky; Martijn Wieling |
43 | CLCL: Non-compositional Expression Detection with Contrastive Learning and Curriculum Learning. Highlight: It tackles the non-compositionality challenge by leveraging contrastive learning techniques to build improved representations. | Jianing Zhou; Ziheng Zeng; Suma Bhat |
44 | Multi-VALUE: A Framework for Cross-Dialectal English NLP. Highlight: We introduce a suite of resources for evaluating and achieving English dialect invariance. | Caleb Ziems; William Held; Jingfeng Yang; Jwala Dhamala; Rahul Gupta; Diyi Yang |
45 | Self-Edit: Fault-Aware Code Editor for Code Generation. Highlight: Inspired by the process of human programming, we propose a generate-and-edit approach named Self-Edit that utilizes execution results of the generated code from LLMs to improve the code quality on the competitive programming task. | Kechi Zhang; Zhuo Li; Jia Li; Ge Li; Zhi Jin |
46 | ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning. Highlight: In this paper, we propose ColD Fusion, a method that provides the benefits of multitask learning but leverages distributed computation and requires limited communication and no sharing of data. | Shachar Don-Yehiya; Elad Venezian; Colin Raffel; Noam Slonim; Leshem Choshen |
47 | Test-time Adaptation for Machine Translation Evaluation By Uncertainty Minimization. Highlight: This paper aims to address the inference bias of neural metrics through uncertainty minimization during test time, without requiring additional data. | Runzhe Zhan; Xuebo Liu; Derek F. Wong; Cuilian Zhang; Lidia S. Chao; Min Zhang |
48 | Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling. Highlight: In this work, we propose Multi-CLS BERT, a novel ensembling method for CLS-based prediction tasks that is almost as efficient as a single BERT model. | Haw-Shiuan Chang; Ruei-Yao Sun; Kathryn Ricci; Andrew McCallum |
49 | On-the-fly Cross-lingual Masking for Multilingual Pre-training. Highlight: In this work, we present CLPM (Cross-lingual Prototype Masking), a dynamic and token-wise masking scheme for multilingual pre-training, using a special token [??] | Xi Ai; Bin Fang |
50 | How About Kind of Generating Hedges Using End-to-End Neural Models? Highlight: In this work, we develop a model of hedge generation based on i) fine-tuning state-of-the-art language models trained on human-human tutoring data, followed by ii) reranking to select the candidate that best matches the expected hedging strategy within a candidate pool using a hedge classifier. | Alafate Abulimiti; Chloé Clavel; Justine Cassell |
51 | DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models. Highlight: However, generating images with desired details requires proper prompts, and it is often unclear how a model reacts to different prompts or what the best prompts are. To help researchers tackle these critical challenges, we introduce DiffusionDB, the first large-scale text-to-image prompt dataset totaling 6. | Zijie J. Wang; Evan Montoya; David Munechika; Haoyang Yang; Benjamin Hoover; Duen Horng Chau |
52 | From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization. Highlight: While key points are more expressive than word clouds and key phrases, making sense of a long, flat list of key points, which often express related ideas in varying levels of granularity, may still be challenging. To address this limitation of KPA, we introduce the task of organizing a given set of key points into a hierarchy, according to their specificity. | Arie Cattan; Lilach Eden; Yoav Kantor; Roy Bar-Haim |
53 | When to Use What: An In-Depth Comparative Empirical Analysis of OpenIE Systems for Downstream Applications. Highlight: In this paper, we present an application-focused empirical survey of neural OpenIE models, training sets, and benchmarks in an effort to help users choose the most suitable OpenIE systems for their applications. | Kevin Pei; Ishan Jindal; Kevin Chen-Chuan Chang; ChengXiang Zhai; Yunyao Li |
54 | Subjective Crowd Disagreements for Subjective Data: Uncovering Meaningful CrowdOpinion with Population-level Learning. Highlight: In this paper, we introduce CrowdOpinion, an unsupervised learning-based approach that uses language features and label distributions to pool similar items into larger samples of label distributions. | Tharindu Cyril Weerasooriya; Sarah Luger; Saloni Poddar; Ashiqur KhudaBukhsh; Christopher Homan |
55 | Post-Abstention: Towards Reliably Re-Attempting The Abstained Instances in QA. Highlight: To this end, we present an explorative study on “Post-Abstention”, a task that allows re-attempting the abstained instances with the aim of increasing **coverage** of the system without significantly sacrificing its **accuracy**. | Neeraj Varshney; Chitta Baral |
56 | UniLG: A Unified Structure-aware Framework for Lyrics Generation. Highlight: In this paper, we propose a unified structure-aware lyrics generation framework named UniLG. | Tao Qian; Fan Lou; Jiatong Shi; Yuning Wu; Shuai Guo; Xiang Yin; Qin Jin |
57 | FC-KBQA: A Fine-to-Coarse Composition Framework for Knowledge Base Question Answering. Highlight: We propose a Fine-to-Coarse Composition framework for KBQA (FC-KBQA) to ensure both the generalization ability and the executability of the logical expression. | Lingxi Zhang; Jing Zhang; Yanling Wang; Shulin Cao; Xinmei Huang; Cuiping Li; Hong Chen; Juanzi Li |
58 | Does GPT-3 Grasp Metaphors? Identifying Metaphor Mappings with Generative Language Models. Highlight: To this end, this paper proposes to probe the ability of GPT-3 to detect metaphoric language and predict the metaphor’s source domain without any pre-set domains. | Lennart Wachowiak; Dagmar Gromann |
59 | Being Right for Whose Right Reasons? Highlight: This paper presents what we think is a first of its kind: a collection of human rationale annotations augmented with the annotators’ demographic information. | Terne Sasha Thorn Jakobsen; Laura Cabello; Anders Søgaard |
60 | ALERT: Adapt Language Models to Reasoning Tasks. Highlight: However, it is unclear whether these models are applying reasoning skills they have learnt during pre-training, or if they are simply memorizing their training corpus at finer granularity and have learnt to better understand their context. To address this question, we introduce ALERT, a benchmark and suite of analyses for evaluating reasoning skills of language models. | Ping Yu; Tianlu Wang; Olga Golovneva; Badr AlKhamissi; Siddharth Verma; Zhijing Jin; Gargi Ghosh; Mona Diab; Asli Celikyilmaz |
61 | Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages. Highlight: Our work addresses an important goal of NLP research: we should not limit NLP to a small fraction of the world’s languages and instead strive to support as many languages as possible to bring the benefits of NLP technology to all languages and cultures. | Ayyoob ImaniGooghari; Peiqin Lin; Amir Hossein Kargaran; Silvia Severini; Masoud Jalili Sabet; Nora Kassner; Chunlan Ma; Helmut Schmid; André Martins; François Yvon; Hinrich Schütze |
62 | Joint Constrained Learning with Boundary-adjusting for Emotion-Cause Pair Extraction. Highlight: In this paper, we propose a **J**oint **C**onstrained Learning framework with **B**oundary-adjusting for Emotion-Cause Pair Extraction (**JCB**). | Huawen Feng; Junlong Liu; Junhao Zheng; Haibin Chen; Xichen Shang; Qianli Ma |
63 | Pretrained Bidirectional Distillation for Machine Translation. Highlight: In this paper, we propose Pretrained Bidirectional Distillation (PBD) for NMT, which aims to efficiently transfer bidirectional language knowledge from masked language pretraining to NMT models. | Yimeng Zhuang; Mei Tu |
64 | Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning. Highlight: We show that language modeling applied directly to task-specific user histories achieves excellent results on diverse recommendation tasks. | Kyuyong Shin; Hanock Kwak; Wonjae Kim; Jisu Jeong; Seungjae Jung; Kyungmin Kim; Jung-Woo Ha; Sang-Woo Lee |
65 | Improving Continual Relation Extraction By Distinguishing Analogous Semantics. Highlight: We conduct an empirical study on existing works and observe that their performance is severely affected by analogous relations. To address this issue, we propose a novel continual extraction model for analogous relations. | Wenzheng Zhao; Yuanning Cui; Wei Hu |
66 | Improving Pretraining Techniques for Code-Switched NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore different masked language modeling (MLM) pretraining techniques for code-switched text that are cognizant of language boundaries prior to masking. |
Richeek Das; Sahasra Ranjan; Shreya Pathak; Preethi Jyothi; |
67 | A Theory of Unsupervised Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a general theoretical framework to study the properties of unsupervised ASR (ASR-U) systems based on random matrix theory and the theory of neural tangent kernels. |
Liming Wang; Mark Hasegawa-Johnson; Chang Yoo; |
68 | ThinkSum: Probabilistic Reasoning Over Sets Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a two-stage probabilistic inference paradigm, ThinkSum, which reasons over sets of objects or facts in a structured manner. |
Batu Ozturkler; Nikolay Malkin; Zhen Wang; Nebojsa Jojic; |
69 | NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we analyze automatic evaluation metrics for Natural Language Generation (NLG), specifically task-agnostic metrics and human-aligned metrics. |
Iftitahu Nimah; Meng Fang; Vlado Menkovski; Mykola Pechenizkiy; |
70 | DialoGPS: Dialogue Path Sampling in Continuous Semantic Space for Data Augmentation in Multi-Turn Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose the DialoGue Path Sampling (DialoGPS) method in continuous semantic space, the first many-to-many augmentation method for multi-turn dialogues. |
Ang Lv; Jinpeng Li; Yuhan Chen; Gao Xing; Ji Zhang; Rui Yan; |
71 | TECHS: Temporal Logical Graph Networks for Explainable Extrapolation Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose an explainable extrapolation reasoning framework TEemporal logiCal grapH networkS (TECHS), which mainly contains a temporal graph encoder and a logical decoder. |
Qika Lin; Jun Liu; Rui Mao; Fangzhi Xu; Erik Cambria; |
72 | Consistency Regularization Training for Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Without modifying model architectures, we improve the capability of Transformer on compositional generalization through consistency regularization training, which promotes representation consistency across samples and prediction consistency for a single sample. |
Yongjing Yin; Jiali Zeng; Yafu Li; Fandong Meng; Jie Zhou; Yue Zhang; |
73 | NUWA-XL: Diffusion Over Diffusion for EXtremely Long Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose NUWA-XL, a novel Diffusion over Diffusion architecture for eXtremely Long video generation. |
Shengming Yin; Chenfei Wu; Huan Yang; Jianfeng Wang; Xiaodong Wang; Minheng Ni; Zhengyuan Yang; Linjie Li; Shuguang Liu; Fan Yang; Jianlong Fu; Ming Gong; Lijuan Wang; Zicheng Liu; Houqiang Li; Nan Duan; |
74 | Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that a simple and practical recipe in the text domain is effective: simply fine-tuning a pretrained generative language model with DP enables the model to generate useful synthetic text with strong privacy protection. |
Xiang Yue; Huseyin Inan; Xuechen Li; Girish Kumar; Julia McAnallen; Hoda Shajari; Huan Sun; David Levitan; Robert Sim; |
75 | A Close Look Into The Calibration of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Pre-trained language models (PLMs) may fail in giving reliable estimates of their predictive uncertainty. We take a close look into this problem, aiming to answer two questions: (1) Do PLMs learn to become calibrated in the training process? |
Yangyi Chen; Lifan Yuan; Ganqu Cui; Zhiyuan Liu; Heng Ji; |
76 | DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose DIONYSUS (dynamic input optimization in pre-training for dialogue summarization), a pre-trained encoder-decoder model for summarizing dialogues in any new domain. |
Yu Li; Baolin Peng; Pengcheng He; Michel Galley; Zhou Yu; Jianfeng Gao; |
77 | MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we adopt a proposal-based solution that generates proposals (i.e., candidate moments) and then selects the best matching proposal. |
Wang Jing; Aixin Sun; Hao Zhang; Xiaoli Li; |
78 | Diverse Demonstrations Improve In-context Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, in the setup of compositional generalization, where models are tested on outputs with structures that are absent from the training set, selecting similar demonstrations is insufficient, as often no example will be similar enough to the input. In this work, we propose a method to select diverse demonstrations that aims to collectively cover all of the structures required in the output program, in order to encourage the model to generalize to new structures from these demonstrations. |
Itay Levy; Ben Bogin; Jonathan Berant; |
79 | Self-Adaptive In-Context Learning: An Information Compression Perspective for In-Context Example Selection and Ordering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To validate the effectiveness of self-adaptive ICL, we propose a general select-then-rank framework and instantiate it with new selection and ranking algorithms. |
Zhiyong Wu; Yaoxiang Wang; Jiacheng Ye; Lingpeng Kong; |
80 | On The Efficacy of Sampling Adapters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate this issue, various modifications to a model's sampling distribution, such as top-p or top-k sampling, have been introduced and are now ubiquitously used in language generation systems. We propose a unified framework for understanding these techniques, which we term sampling adapters. |
Clara Meister; Tiago Pimentel; Luca Malagutti; Ethan Wilcox; Ryan Cotterell; |
81 | Cross-Domain Data Augmentation with Domain-Adaptive Language Modeling for Aspect-Based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these CDDA methods still suffer from several issues: 1) preserving many source-specific attributes such as syntactic structures; 2) lack of fluency and coherence; 3) limiting the diversity of generated data. To address these issues, we propose a new cross-domain Data Augmentation approach based on Domain-Adaptive Language Modeling named DA2LM, which contains three stages: 1) assigning pseudo labels to unlabeled target-domain data; 2) unifying the process of token generation and labeling with a Domain-Adaptive Language Model (DALM) to learn the shared context and annotation across domains; 3) using the trained DALM to generate labeled target-domain data. |
Jianfei Yu; Qiankun Zhao; Rui Xia; |
82 | Compositional Data Augmentation for Abstractive Conversation Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, collecting and annotating these conversations can be a time-consuming and labor-intensive task. To address this issue, in this work, we present a sub-structure level compositional data augmentation method, Compo, for generating diverse and high-quality pairs of conversations and summaries. |
Siru Ouyang; Jiaao Chen; Jiawei Han; Diyi Yang; |
83 | PMAES: Prompt-mapping Contrastive Learning for Cross-prompt Automated Essay Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In fact, when the representations of two prompts are more similar, we can gain more shared features between them. Based on this motivation, in this paper, we propose a learning strategy called "prompt-mapping" to learn more consistent representations of source and target prompts. |
Yuan Chen; Xia Li; |
84 | Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in LLMs for intersectional demographic groups without any lexicon or data labeling. |
Myra Cheng; Esin Durmus; Dan Jurafsky; |
85 | On Prefix-tuning for Lightweight Out-of-distribution Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we depart from the classic fine-tuning based OOD detection toward a parameter-efficient alternative, and propose an unsupervised prefix-tuning based OOD detection framework termed PTO. |
Yawen Ouyang; Yongchang Cao; Yuan Gao; Zhen Wu; Jianbing Zhang; Xinyu Dai; |
86 | GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network that outputs a self-attention weight matrix that can be used in beam search to find the best permutation of input tokens (with auxiliary tokens) and a decoder network based on a step-unrolled denoising autoencoder that fills in specific tokens. |
Konstantin Yakovlev; Alexander Podolskiy; Andrey Bout; Sergey Nikolenko; Irina Piontkovskaya; |
87 | Measuring Progress in Fine-grained Vision-and-Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This has resulted in an increased interest in the community to either develop new benchmarks or models for such capabilities. To better understand and quantify progress in this direction, we investigate four competitive V&L models on four fine-grained benchmarks. |
Emanuele Bugliarello; Laurent Sartran; Aishwarya Agrawal; Lisa Anne Hendricks; Aida Nematzadeh; |
88 | Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces an unsupervised VWSD approach that uses gloss information of an external lexical knowledge-base, especially the sense definitions. |
Sunjae Kwon; Rishabh Garodia; Minhwa Lee; Zhichao Yang; Hong Yu; |
89 | Chain-of-Skills: A Configurable Model for Open-Domain Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a modular retriever where individual modules correspond to key skills that can be reused across datasets. |
Kaixin Ma; Hao Cheng; Yu Zhang; Xiaodong Liu; Eric Nyberg; Jianfeng Gao; |
90 | Elaboration-Generating Commonsense Question Answering at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge that helps improve performance. Yet the cost of working with such models is very high; in this work, we finetune smaller language models to generate useful intermediate context, referred to here as elaborations. |
Wenya Wang; Vivek Srikumar; Hannaneh Hajishirzi; Noah A. Smith; |
91 | Neural Unsupervised Reconstruction of Protolanguage Word Forms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a state-of-the-art neural approach to the unsupervised reconstruction of ancient word forms. |
Andre He; Nicholas Tomlin; Dan Klein; |
92 | DaMSTF: Domain Adversarial Learning Enhanced Meta Self-Training for Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new self-training framework for domain adaptation, namely Domain adversarial learning enhanced Self-Training Framework (DaMSTF). |
Menglong Lu; Zhen Huang; Yunxiang Zhao; Zhiliang Tian; Yang Liu; Dongsheng Li; |
93 | On Evaluating Multilingual Compositional Generalization with Translated Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, we show that this entails critical semantic distortion. To address this limitation, we craft a faithful rule-based translation of the MCWQ dataset from English to Chinese and Japanese. |
Zi Wang; Daniel Hershcovich; |
94 | FAA: Fine-grained Attention Alignment for Cascade Document Ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In fact, the document ranker can provide fine-grained supervision to make the selector more generalizable and compatible, and the selector built upon a different structure can offer a distinct perspective to assist in document ranking. Inspired by this, we propose a fine-grained attention alignment approach to jointly optimize a cascade document ranking model. |
Zhen Li; Chongyang Tao; Jiazhan Feng; Tao Shen; Dongyan Zhao; Xiubo Geng; Daxin Jiang; |
95 | Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the observation, in this paper, we study the problem of re-parameterizing and fine-tuning PLMs from a new perspective: Discovery of intrinsic task-specific subspace. |
Zhong Zhang; Bang Liu; Junming Shao; |
96 | Facilitating Multi-turn Emotional Support Conversation with Positive Emotion Elicitation: A Reinforcement Learning Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Supporter, a mixture-of-expert-based reinforcement learning model, and carefully design ES and dialogue coherence rewards to guide the policy's learning for responding. |
Jinfeng Zhou; Zhuang Chen; Bo Wang; Minlie Huang; |
97 | Query Enhanced Knowledge-Intensive Conversation Via Unsupervised Joint Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv. |
Mingzhu Cai; Siqi Bao; Xin Tian; Huang He; Fan Wang; Hua Wu; |
98 | Why Aren't We NER Yet? Artifacts of ASR Errors in Named Entity Recognition in Spontaneous Speech Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine in detail the complex relationship between ASR and NER errors which limits the ability of NER models to recover entity mentions from spontaneous speech transcripts. |
Piotr Szymanski; Lukasz Augustyniak; Mikolaj Morzy; Adrian Szymczak; Krzysztof Surdyk; Piotr Zelasko; |
99 | Precise Zero-Shot Dense Retrieval Without Relevance Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we recognize the difficulty of zero-shot learning and encoding relevance. |
Luyu Gao; Xueguang Ma; Jimmy Lin; Jamie Callan; |
100 | White-Box Multi-Objective Adversarial Attack on Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose a white-box multi-objective attack method called DGSlow. |
Yufei Li; Zexin Li; Yingfan Gao; Cong Liu; |
101 | A Cautious Generalization Goes A Long Way: Learning Morphophonological Rules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel approach for automatically learning morphophonological rules of Arabic from a corpus. |
Salam Khalifa; Sarah Payne; Jordan Kodner; Ellen Broselow; Owen Rambow; |
102 | Few-shot Adaptation Works with UnpredicTable Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Prior work on language models (LMs) shows that training on a large number of diverse tasks improves few-shot learning (FSL) performance on new tasks. We take this to the extreme, automatically extracting 413,299 tasks from internet tables – orders of magnitude more than the next-largest public datasets. |
Jun Shern Chan; Michael Pieler; Jonathan Jao; Jérémy Scheurer; Ethan Perez; |
103 | Cross-lingual Science Journalism: Select, Simplify and Rewrite Summaries for Non-expert Readers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CSJ as a downstream task of text simplification and cross-lingual scientific summarization to facilitate science journalists' work. |
Mehwish Fatima; Michael Strube; |
104 | HuCurl: Human-induced Curriculum Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the problem of curriculum discovery and describe a curriculum learning framework capable of discovering effective curricula in a curriculum space based on prior knowledge about sample difficulty. |
Mohamed Elgaar; Hadi Amiri; |
105 | KNN-TL: K-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a k-Nearest-Neighbor Transfer Learning (kNN-TL) approach for low-resource NMT, which leverages the parent knowledge throughout the entire developing process of the child model. |
Shudong Liu; Xuebo Liu; Derek F. Wong; Zhaocong Li; Wenxiang Jiao; Lidia S. Chao; Min Zhang; |
106 | Do Language Models Have Coherent Mental Models of Everyday Things? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Do language models similarly have a coherent picture of such everyday things? To investigate this, we propose a benchmark dataset consisting of 100 everyday things, their parts, and the relationships between these parts, expressed as 11,720 "X relation Y?" |
Yuling Gu; Bhavana Dalvi Mishra; Peter Clark; |
107 | Rogue Scores Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Does this widespread benchmark metric meet these three evaluation criteria? This systematic review of over two thousand publications using ROUGE finds: (A) Critical evaluation decisions and parameters are routinely omitted, making most reported scores irreproducible. |
Max Grusky; |
108 | Instruction Induction: From Few Examples to Natural Language Task Descriptions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples. To explore this ability, we introduce the instruction induction challenge, compile a dataset consisting of 24 tasks, and define a novel evaluation metric based on executing the generated instruction. |
Or Honovich; Uri Shaham; Samuel R. Bowman; Omer Levy; |
109 | In-Context Analogical Reasoning with Pre-Trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, we apply large pre-trained language models (PLMs) to visual Raven's Progressive Matrices (RPM), a common relational reasoning test. |
Xiaoyang Hu; Shane Storks; Richard Lewis; Joyce Chai; |
110 | Peek Across: Improving Multi-Document Modeling Via Cross-Document Question-Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks. In this work, we propose extending this idea by pre-training a generic multi-document model from a novel cross-document question answering pre-training objective. |
Avi Caciularu; Matthew Peters; Jacob Goldberger; Ido Dagan; Arman Cohan; |
111 | Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Learning Good Teacher Matters (LGTM), an efficient training technique for incorporating distillation influence into the teacher's learning process. |
Yuxin Ren; Zihan Zhong; Xingjian Shi; Yi Zhu; Chun Yuan; Mu Li; |
112 | REV: Information-Theoretic Evaluation of Free-Text Rationales Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: More concretely, we propose a metric called REV (Rationale Evaluation with conditional V-information), to quantify the amount of new, label-relevant information in a rationale beyond the information already available in the input or the label. |
Hanjie Chen; Faeze Brahman; Xiang Ren; Yangfeng Ji; Yejin Choi; Swabha Swayamdipta; |
113 | ELQA: A Corpus of Metalinguistic Questions and Answers About English Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present ELQA, a corpus of questions and answers in and about the English language. |
Shabnam Behzad; Keisuke Sakaguchi; Nathan Schneider; Amir Zeldes; |
114 | Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a simple and effective "divide, conquer and combine" solution, which explicitly disentangles the semantics of seen data, and leverages the performance and robustness with the mixture-of-experts mechanism. |
Qingyue Wang; Liang Ding; Yanan Cao; Yibing Zhan; Zheng Lin; Shi Wang; Dacheng Tao; Li Guo; |
115 | BIG-C: A Multimodal Multi-Purpose Dataset for Bemba Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BIG-C (Bemba Image Grounded Conversations), a large multimodal dataset for Bemba. |
Claytone Sikasote; Eunice Mukonde; Md Mahfuz Ibn Alam; Antonios Anastasopoulos; |
116 | Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose SG-USM, a novel schema-guided user satisfaction modeling framework. |
Yue Feng; Yunlong Jiao; Animesh Prasad; Nikolaos Aletras; Emine Yilmaz; Gabriella Kazai; |
117 | Robust Multi-bit Natural Language Watermarking Through Invariant Features Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore ways to advance both payload and robustness by following a well-known proposition from image watermarking and identify features in natural language that are invariant to minor corruption. |
KiYoon Yoo; Wonhyuk Ahn; Jiho Jang; Nojun Kwak; |
118 | KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, incorporating varying contexts can especially benefit long document understanding tasks that leverage pre-trained LMs, typically bounded by the input sequence length. In light of these challenges, we propose KALM, a language model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. |
Shangbin Feng; Zhaoxuan Tan; Wenqian Zhang; Zhenyu Lei; Yulia Tsvetkov; |
119 | AtTGen: Attribute Tree Generation for Real-World Attribute Joint Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: And prior studies attempting to integrate these paradigms through ensemble, pipeline, and co-training models, still face challenges like cascading errors, high computational overhead, and difficulty in training. To address these existing problems, this paper presents Attribute Tree, a unified formulation for real-world attribute extraction application, where closed-world, open-world, and semi-open attribute extraction tasks are modeled uniformly. |
Yanzeng Li; Bingcong Xue; Ruoyu Zhang; Lei Zou; |
120 | Extractive Is Not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we define a typology with five types of broad unfaithfulness problems (including and beyond not-entailment) that can appear in extractive summaries, including incorrect coreference, incomplete coreference, incorrect discourse, incomplete discourse, as well as other misleading information. |
Shiyue Zhang; David Wan; Mohit Bansal; |
121 | Improving Translation Quality Estimation with Bias Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel method to mitigate the bias of the QE model and improve estimation performance. |
Hui Huang; Shuangzhi Wu; Kehai Chen; Hui Di; Muyun Yang; Tiejun Zhao; |
122 | Breeding Machine Translations: Evolutionary Approach to Survive and Thrive in The World of Automated Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a genetic algorithm (GA) based method for modifying n-best lists produced by a machine translation (MT) system. |
Josef Jon; Ondrej Bojar; |
123 | MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems Via Moral Discussions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework, MoralDial to train and evaluate moral dialogue systems. |
Hao Sun; Zhexin Zhang; Fei Mi; Yasheng Wang; Wei Liu; Jianwei Cui; Bin Wang; Qun Liu; Minlie Huang; |
124 | Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: They often suppress the redundant and noisy information at the risk of losing critical information. Therefore, we propose a denoising bottleneck fusion (DBF) model for fine-grained video multimodal fusion. |
Shaoxiang Wu; Damai Dai; Ziwei Qin; Tianyu Liu; Binghuai Lin; Yunbo Cao; Zhifang Sui; |
125 | SimLM: Pre-training with Representation Bottleneck for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose SimLM (Similarity matching with Language Model pre-training), a simple yet effective pre-training method for dense passage retrieval. |
Liang Wang; Nan Yang; Xiaolong Huang; Binxing Jiao; Linjun Yang; Daxin Jiang; Rangan Majumder; Furu Wei; |
126 | From Ultra-Fine to Fine: Fine-tuning Ultra-Fine Entity Typing Models to Fine-grained Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new approach that can avoid the need of creating distantly labeled data whenever there is a new type schema. |
Hongliang Dai; Ziqian Zeng; |
127 | Controlling Learned Effects to Reduce Spurious Correlations in Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this can be counter-productive when the features have a non-zero causal effect on the target label and thus are important for prediction. Therefore, using methods from the causal inference literature, we propose an algorithm to regularize the learnt effect of the features on the model's prediction to the estimated effect of feature on label. |
Parikshit Bansal; Amit Sharma; |
128 | What Makes Pre-trained Language Models Better Zero-shot Learners? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection). |
Jinghui Lu; Dongsheng Zhu; Weidong Han; Rui Zhao; Brian Mac Namee; Fei Tan; |
129 | Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input using a raw text corpus. |
Xinxi Lyu; Sewon Min; Iz Beltagy; Luke Zettlemoyer; Hannaneh Hajishirzi; |
130 | Learning Optimal Policy for Simultaneous Machine Translation Via Binary Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new method for constructing the optimal policy online via binary search. |
Shoutao Guo; Shaolei Zhang; Yang Feng; |
131 | Better Simultaneous Translation with Monotonic Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach that leverages traditional translation models as teachers and employs a two-stage beam search algorithm to generate monotonic yet accurate reference translations for sequence-level knowledge distillation. |
Shushu Wang; Jing Wu; Kai Fan; Wei Luo; Jun Xiao; Zhongqiang Huang; |
132 | StoryARG: A Corpus of Narratives and Personal Experiences in Argumentative Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis of the annotations in StoryARG uncovers a positive impact on effectiveness for stories which illustrate a solution to a problem, and, in general, annotator-specific preferences that we investigate with regression analysis. |
Neele Falk; Gabriella Lapesa; |
133 | Injecting Knowledge Into Language Generation: A Case Study in Auto-charting After-visit Care Instructions from Medical Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the "utilization rate" that encodes knowledge and serves as a regularizer by maximizing the marginal probability of selected tokens. |
Maksim Eremeev; Ilya Valmianski; Xavier Amatriain; Anitha Kannan; |
134 | Sequence Parallelism: Long Sequence Training from System Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing work focuses on reducing time and space complexity from an algorithm perspective. In this work, we instead propose sequence parallelism, a memory-efficient parallelism that solves this issue from a system perspective. |
Shenggui Li; Fuzhao Xue; Chaitanya Baranwal; Yongbin Li; Yang You; |
135 | MUSTIE: Multimodal Structural Transformer for Web Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel MUltimodal Structural Transformer (MUST) that incorporates multiple modalities for web information extraction. |
Qifan Wang; Jingang Wang; Xiaojun Quan; Fuli Feng; Zenglin Xu; Shaoliang Nie; Sinong Wang; Madian Khabsa; Hamed Firooz; Dongfang Liu; |
136 | Augmentation-Adapted Retriever Improves Generalization of Language Models As Generic Plug-In Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the scheme of generic retrieval plug-in: the retriever is to assist target LMs that may not be known beforehand or are unable to be fine-tuned together. |
Zichun Yu; Chenyan Xiong; Shi Yu; Zhiyuan Liu; |
137 | TableVLM: Multi-modal Pre-training for Table Structure Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a novel multi-modal pre-training model for table structure recognition, named TableVLM. |
Leiyuan Chen; Chengsong Huang; Xiaoqing Zheng; Jinshu Lin; Xuanjing Huang; |
138 | Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present NBR, which converts biomedical RE as natural language inference formulation through indirect supervision. |
Jiashu Xu; Mingyu Derek Ma; Muhao Chen; |
139 | Dynamic Routing Transformer Network for Multimodal Sarcasm Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by routing-based dynamic network, we model the dynamic mechanism in multimodal sarcasm detection and propose the Dynamic Routing Transformer Network (DynRT-Net). |
Yuan Tian; Nan Xu; Ruike Zhang; Wenji Mao; |
140 | What Are You Token About? Dense Retrieval As Distributions Over The Vocabulary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Yet, we have little understanding of how they represent text, and why this leads to good performance. In this work, we shed light on this question via distributions over the vocabulary. |
Ori Ram; Liat Bezalel; Adi Zicher; Yonatan Belinkov; Jonathan Berant; Amir Globerson; |
141 | Cold-Start Data Selection for Better Few-shot Language Model Fine-tuning: A Prompt-based Uncertainty Propagation Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PATRON, a prompt-based data selection method for pre-trained language model fine-tuning under cold-start scenarios, i.e., no initial labeled data are available. |
Yue Yu; Rongzhi Zhang; Ran Xu; Jieyu Zhang; Jiaming Shen; Chao Zhang; |
142 | Training-free Neural Architecture Search for RNNs and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate training-free NAS metrics for recurrent neural network (RNN) and BERT-based transformer architectures, targeted towards language modeling tasks. |
Aaron Serianni; Jugal Kalita; |
143 | CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a multistage data sampling algorithm to effectively train a cross-lingual summarization model capable of summarizing an article in any target language. |
Abhik Bhattacharjee; Tahmid Hasan; Wasi Uddin Ahmad; Yuan-Fang Li; Yong-Bin Kang; Rifat Shahriyar; |
144 | Improving Gradient Trade-offs Between Tasks in Multi-task Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel gradient trade-off approach to mitigate the task conflict problem, dubbed GetMTL, which can achieve a specific trade-off among different tasks nearby the main objective of multi-task text classification (MTC), so as to improve the performance of each task simultaneously. |
Heyan Chai; Jinhao Cui; Ye Wang; Min Zhang; Binxing Fang; Qing Liao; |
145 | Bi-Phone: Modeling Inter Language Phonetic Influences in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a method to mine phoneme confusions (sounds in L2 that an L1 speaker is likely to conflate) for pairs of L1 and L2. |
Abhirut Gupta; Ananya B. Sai; Richard Sproat; Yuri Vasilevski; James Ren; Ambarish Jash; Sukhdeep Sodhi; Aravindan Raghuveer; |
146 | Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unpaired cross-lingual image captioning has long suffered from irrelevancy and disfluency issues, due to the inconsistencies of the semantic scene and syntax attributes during transfer. In this work, we propose to address the above problems by incorporating the scene graph (SG) structures and the syntactic constituency (SC) trees. |
Shengqiong Wu; Hao Fei; Wei Ji; Tat-Seng Chua; |
147 | Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the missing-step errors, we propose Plan-and-Solve (PS) Prompting. |
Lei Wang; Wanyu Xu; Yihuai Lan; Zhiqiang Hu; Yunshi Lan; Roy Ka-Wei Lee; Ee-Peng Lim; |
148 | RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel pre-training method called Duplex Masked Auto-Encoder, a.k.a. DupMAE. |
Zheng Liu; Shitao Xiao; Yingxia Shao; Zhao Cao; |
149 | DecompX: Explaining Transformers Decisions By Propagating Token Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, providing a faithful vector-based explanation for a multi-layer model could be challenging in three aspects: (1) Incorporating all components into the analysis, (2) Aggregating the layer dynamics to determine the information flow and mixture throughout the entire model, and (3) Identifying the connection between the vector-based analysis and the model's predictions. In this paper, we present DecompX to tackle these challenges. |
Ali Modarressi; Mohsen Fayyaz; Ehsan Aghazadeh; Yadollah Yaghoobzadeh; Mohammad Taher Pilehvar; |
150 | Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: 3B parameters) can still benefit from chain-of-thought prompting. To achieve this, we introduce Symbolic Chain-of-Thought Distillation (SCoTD), a method to train a smaller student model on rationalizations sampled from a significantly larger teacher model. |
Liunian Harold Li; Jack Hessel; Youngjae Yu; Xiang Ren; Kai-Wei Chang; Yejin Choi; |
151 | Generating EDU Extracts for Plan-Guided Summary Re-Ranking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re-ranking that addresses these issues. |
Griffin Adams; Alex Fabbri; Faisal Ladhak; Noémie Elhadad; Kathleen McKeown; |
152 | A Survey on Asking Clarification Questions Datasets in Conversational Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, it is noticeable that a key limitation of the existing ACQs studies is their incomparability, stemming from inconsistent use of data, distinct experimental setups, and evaluation strategies. Therefore, in this paper, to assist the development of ACQs techniques, we comprehensively analyse the current ACQs research status, which offers a detailed comparison of publicly available datasets, and discusses the applied evaluation metrics, joined with benchmarks for multiple ACQs-related tasks. |
Hossein A. Rahmani; Xi Wang; Yue Feng; Qiang Zhang; Emine Yilmaz; Aldo Lipani; |
153 | Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that CoT reasoning is possible even with invalid demonstrations – prompting with invalid reasoning steps can achieve over 80-90% of the performance obtained using CoT under various metrics, while still generating coherent lines of reasoning during inference. |
Boshi Wang; Sewon Min; Xiang Deng; Jiaming Shen; You Wu; Luke Zettlemoyer; Huan Sun; |
154 | Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We describe a broad data collection effort involving around 6k professionally translated sentence pairs for each of 39 low-resource languages, which we make publicly available. |
Jean Maillard; Cynthia Gao; Elahe Kalbassi; Kaushik Ram Sadagopan; Vedanuj Goswami; Philipp Koehn; Angela Fan; Francisco Guzman; |
155 | RMLM: A Flexible Defense Framework for Proactively Mitigating Word-level Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they often neglect to proactively mitigate adversarial attacks during inference. Towards this overlooked aspect, we propose a defense framework that aims to mitigate attacks by confusing attackers and correcting adversarial contexts that are caused by malicious perturbations. |
Zhaoyang Wang; Zhiyue Liu; Xiaopeng Zheng; Qinliang Su; Jiahai Wang; |
156 | Gradient-based Intra-attention Pruning on Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a structured pruning method GRAIN (gradient-based intra-attention pruning), which performs task-specific pruning with knowledge distillation and yields highly effective models. |
Ziqing Yang; Yiming Cui; Xin Yao; Shijin Wang; |
157 | Learning to Substitute Spans Towards Improving Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the two challenges, we first propose a novel compositional augmentation strategy dubbed Span Substitution (SpanSub) that enables multi-grained composition of substantial substructures in the whole training set. Over and above that, we introduce the Learning to Substitute Span (L2S2) framework which empowers the learning of span substitution probabilities in SpanSub in an end-to-end manner by maximizing the loss of neural sequence models, so as to outweigh those challenging compositions with elusive concepts and novel surroundings. |
Zhaoyi Li; Ying Wei; Defu Lian; |
158 | DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to use explicit control to guide the empathy expression and design a framework DiffusEmp based on conditional diffusion language model to unify the utilization of dialogue context and attribute-oriented control signals. |
Guanqun Bi; Lei Shen; Yanan Cao; Meng Chen; Yuqiang Xie; Zheng Lin; Xiaodong He; |
159 | BREAK: Breaking The Dialogue State Tracking Barrier with Beam Search and Re-ranking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our preliminary error analysis, we find that beam search produces a pool of candidates that is likely to include the correct dialogue state. Motivated by this observation, we introduce a novel framework, called BREAK (Beam search and RE-rAnKing), that achieves outstanding performance on DST. |
Seungpil Won; Heeyoung Kwak; Joongbo Shin; Janghoon Han; Kyomin Jung; |
160 | Faithful Low-Resource Data-to-Text Generation Through Cycle Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Sufficient annotated data is often not available for specific domains, leading us to seek an unsupervised approach to improve the faithfulness of output text. Since the problem is fundamentally one of consistency between the representations of the structured data and text, we evaluate the effectiveness of cycle training in this work. |
Zhuoer Wang; Marcus Collins; Nikhita Vedula; Simone Filice; Shervin Malmasi; Oleg Rokhlenko; |
161 | Towards Stable Natural Language Understanding Via Information Entropy Guided Debiasing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, our analyses show that the empirical debiasing methods may fail to capture part of the potential dataset biases and mistake semantic information of input text as biases, which limits the effectiveness of debiasing. To address these issues, we propose a debiasing framework IEGDB that comprehensively detects the dataset biases to induce a set of biased features, and then purifies the biased features with the guidance of information entropy. |
Li Du; Xiao Ding; Zhouhao Sun; Ting Liu; Bing Qin; Jingshuo Liu; |
162 | Dynamic and Efficient Inference for Text Generation Via BERT Family Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel fine-tuning method DEER, which can make a single pre-trained model support Dynamic and Efficient infERence and achieve an adaptive trade-off between model performance and latency. |
Xiaobo Liang; Juntao Li; Lijun Wu; Ziqiang Cao; Min Zhang; |
163 | Learning to Generate Equitable Text in Dialogue from Biased Training Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, there is no comprehensive study of equitable text generation in dialogue. Aptly, in this work, we use theories of computational learning to study this problem. |
Anthony Sicilia; Malihe Alikhani; |
164 | Hierarchical Verbalizer for Few-Shot Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the issue, we propose the hierarchical verbalizer ("HierVerb"), a multi-verbalizer framework treating HTC as a single- or multi-label classification problem at multiple layers and learning vectors as verbalizers constrained by hierarchical structure and hierarchical contrastive learning. |
Ke Ji; Yixin Lian; Jingsheng Gao; Baoyuan Wang; |
165 | Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to improve the summary quality through summary-oriented visual features. |
Yunlong Liang; Fandong Meng; Jinan Xu; Jiaan Wang; Yufeng Chen; Jie Zhou; |
166 | Helping A Friend or Supporting A Cause? Disentangling Active and Passive Cosponsorship in The U.S. Congress Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we develop an Encoder+RGCN based model that learns legislator representations from bill texts and speech transcripts. |
Giuseppe Russo; Christoph Gote; Laurence Brandenberger; Sophia Schlosser; Frank Schweitzer; |
167 | TREA: Tree-Structure Reasoning Schema for Conversational Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent reasoning-based models heavily rely on simplified structures such as linear structures or fixed-hierarchical structures for causality reasoning, hence they cannot fully figure out sophisticated relationships among utterances with external knowledge. To address this, we propose a novel Tree structure Reasoning schEmA named TREA. |
Wendi Li; Wei Wei; Xiaoye Qu; Xian-Ling Mao; Ye Yuan; Wenfeng Xie; Dangyang Chen; |
168 | CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Last, current datasets are biased toward English while leaving other languages underexplored. To alleviate these limitations, in this paper, we present CATS, a pragmatic Chinese answer-to-sequence dataset with large scale and high quality. |
Liang Li; Ruiying Geng; Chengyang Fang; Bing Li; Can Ma; Rongyu Cao; Binhua Li; Fei Huang; Yongbin Li; |
169 | Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing, and Persuasion Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new multilingual multifacet dataset of news articles, each annotated for genre (objective news reporting vs. opinion vs. satire), framing (what key aspects are highlighted), and persuasion techniques (logical fallacies, emotional appeals, ad hominem attacks, etc.). |
Jakub Piskorski; Nicolas Stefanovitch; Nikolaos Nikolaidis; Giovanni Da San Martino; Preslav Nakov; |
170 | Learning Action Conditions from Instructional Manuals for Instruction Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a task dubbed action condition inference, which extracts mentions of preconditions and postconditions of actions in instructional manuals. |
Te-Lin Wu; Caiqi Zhang; Qingyuan Hu; Alexander Spangher; Nanyun Peng; |
171 | StoryWars: A Dataset and Instruction Tuning Baselines for Collaborative Story Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Understanding and generating such stories remains an underexplored area due to the lack of open-domain corpora. To address this, we introduce StoryWars, a new dataset of over 40,000 collaborative stories written by 9,400 different authors from an online platform. |
Yulun Du; Lydia Chilton; |
172 | Did You Read The Instructions? Rethinking The Effectiveness of Task Definitions in Instruction Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we systematically study the role of task definitions in instruction learning. |
Fan Yin; Jesse Vig; Philippe Laban; Shafiq Joty; Caiming Xiong; Chien-Sheng Wu; |
173 | Do PLMs Know and Understand Ontological Knowledge? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on probing whether PLMs store ontological knowledge and have a semantic understanding of the knowledge rather than rote memorization of the surface form. |
Weiqi Wu; Chengyue Jiang; Yong Jiang; Pengjun Xie; Kewei Tu; |
174 | CORE: Cooperative Training of Retriever-Reranker for Effective Dialogue Response Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a cooperative training of the response retriever and the reranker whose parameters are dynamically optimized by the ground-truth labels as well as list-wise supervision signals from each other. |
Chongyang Tao; Jiazhan Feng; Tao Shen; Chang Liu; Juntao Li; Xiubo Geng; Daxin Jiang; |
175 | Exploring How Generative Adversarial Networks Learn Phonological Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores how Generative Adversarial Networks (GANs) learn representations of phonological phenomena. |
Jingyi Chen; Micha Elsner; |
176 | Interpretable Word Sense Representations Via Definition Generation: The Case of Semantic Change Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose using automatically generated natural language definitions of contextualised word usages as interpretable word and word sense representations. |
Mario Giulianelli; Iris Luden; Raquel Fernandez; Andrey Kutuzov; |
177 | Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new task of simulating NL feedback for interactive semantic parsing. |
Hao Yan; Saurabh Srivastava; Yintao Tai; Sida I. Wang; Wen-tau Yih; Ziyu Yao; |
178 | InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we humans can easily identify the problems of captions in detail, e.g., which words are inaccurate and which salient objects are not described, and then rate the caption quality. To support such informative feedback, we propose an Informative Metric for Reference-free Image Caption evaluation (InfoMetIC). |
Anwen Hu; Shizhe Chen; Liang Zhang; Qin Jin; |
179 | An Invariant Learning Characterization of Controlled Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that the performance of controlled generation may be poor if the distributions of text in response to user prompts differ from the distribution the predictor was trained on. |
Carolina Zheng; Claudia Shi; Keyon Vafa; Amir Feder; David Blei; |
180 | HistRED: A Historical Document-Level Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To demonstrate the usefulness of our dataset, we propose a bilingual RE model that leverages both Korean and Hanja contexts to predict relations between entities. |
Soyoung Yang; Minseok Choi; Youngwoo Cho; Jaegul Choo; |
181 | A Critical Evaluation of Evaluations for Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a careful analysis of experts' evaluation, which focuses on new aspects such as the comprehensiveness of the answer. |
Fangyuan Xu; Yixiao Song; Mohit Iyyer; Eunsol Choi; |
182 | HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there still poses problems when fine-tuning pre-trained language models on downstream tasks, such as over-fitting or representation collapse. In this work, we propose HyPe, a simple yet effective fine-tuning technique to alleviate such problems by perturbing hidden representations of Transformers layers. |
Hongyi Yuan; Zheng Yuan; Chuanqi Tan; Fei Huang; Songfang Huang; |
183 | Generating User-Engaging News Headlines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, presenting the same news headline to all readers is a suboptimal strategy, because it does not take into account the different preferences and interests of diverse readers, who may be confused about why a particular article has been recommended to them and do not see a clear connection between their interests and the recommended article. In this paper, we present a novel framework that addresses these challenges by incorporating user profiling to generate personalized headlines, and a combination of automated and human evaluation methods to determine user preference for personalized headlines. |
Pengshan Cai; Kaiqiang Song; Sangwoo Cho; Hongwei Wang; Xiaoyang Wang; Hong Yu; Fei Liu; Dong Yu; |
184 | Word Sense Extension Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a paradigm of word sense extension (WSE) that enables words to spawn new senses toward novel context. |
Lei Yu; Yang Xu; |
185 | PVGRU: Generating Diverse and Relevant Dialogue Responses Via Pseudo-Variational Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Pseudo-Variational Gated Recurrent Unit (PVGRU). |
Yongkang Liu; Shi Feng; Daling Wang; Yifei Zhang; Hinrich Schütze; |
186 | Decoding Symbolism in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present our evaluative framework, Symbolism Analysis (SymbA), which compares LMs (e.g., RoBERTa, GPT-J) on different types of symbolism and analyzes the outcomes along multiple metrics. |
Meiqi Guo; Rebecca Hwa; Adriana Kovashka; |
187 | A Survey on Zero Pronoun Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon has been studied extensively in machine translation (MT), as it poses a significant challenge for MT systems due to the difficulty in determining the correct antecedent for the pronoun. This survey paper highlights the major works that have been undertaken in zero pronoun translation (ZPT) after the neural revolution so that researchers can recognize the current state and future directions of this field. |
Longyue Wang; Siyou Liu; Mingzhou Xu; Linfeng Song; Shuming Shi; Zhaopeng Tu; |
188 | We Understand Elliptical Sentences, and Language Models Should Too: A New Dataset for Studying Ellipsis and Its Interaction with Thematic Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explored the issue of how the prototypicality of event participants affects the ability of Language Models (LMs) to handle elliptical sentences and to identify the omitted arguments at different degrees of thematic fit, ranging from highly typical participants to semantically anomalous ones. |
Davide Testa; Emmanuele Chersoni; Alessandro Lenci; |
189 | MPCHAT: Towards Multimodal Persona-Grounded Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we extend persona-based dialogue to the multimodal domain and make two main contributions. First, we present the first multimodal persona-based dialogue dataset named MPCHAT, which extends persona with both text and images to contain episodic memories. |
Jaewoo Ahn; Yeda Song; Sangdoo Yun; Gunhee Kim; |
190 | DOC: Improving Long Story Coherence With Detailed Outline Control Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Detailed Outline Control (DOC) framework for improving long-range plot coherence when automatically generating several-thousand-word-long stories. |
Kevin Yang; Dan Klein; Nanyun Peng; Yuandong Tian; |
191 | Dual-Alignment Pre-training for Cross-lingual Sentence Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on our findings, we propose a dual-alignment pre-training (DAP) framework for cross-lingual sentence embedding that incorporates both sentence-level and token-level alignment. |
Ziheng Li; Shaohan Huang; Zihan Zhang; Zhi-Hong Deng; Qiang Lou; Haizhen Huang; Jian Jiao; Furu Wei; Weiwei Deng; Qi Zhang; |
192 | Exploring Better Text Image Translation with Multimodal Codebook Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we first annotate a Chinese-English TIT dataset named OCRMT30K, providing convenience for subsequent studies. |
Zhibin Lan; Jiawei Yu; Xiang Li; Wen Zhang; Jian Luan; Bin Wang; Degen Huang; Jinsong Su; |
193 | FEDLEGAL: The First Real-World Federated Learning Benchmark for Legal NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, to the best of our knowledge, there is no work on applying FL to legal NLP. To fill this gap, this paper presents the first real-world FL benchmark for legal NLP, coined FEDLEGAL, which comprises five legal NLP tasks and one privacy task based on the data from Chinese courts. |
Zhuo Zhang; Xiangjing Hu; Jingyuan Zhang; Yating Zhang; Hui Wang; Lizhen Qu; Zenglin Xu; |
194 | A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: And this problem might result in forgetting the backdoor. Based on this finding, we propose a gradient control method to consolidate the attack effect, comprising two strategies. |
Naibin Gu; Peng Fu; Xiyu Liu; Zhengxiao Liu; Zheng Lin; Weiping Wang; |
195 | History Semantic Graph Enhanced Conversational KBQA with Temporal Information Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a History Semantic Graph Enhanced KBQA model (HSGE) that is able to effectively model long-range semantic dependencies in conversation history while maintaining low computational cost. |
Hao Sun; Yang Li; Liwei Deng; Bowen Li; Binyuan Hui; Binhua Li; Yunshi Lan; Yan Zhang; Yongbin Li; |
196 | From The One, Judge of The Whole: Typed Entailment Graph Construction with Predicate Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, EGs built by previous methods often suffer from the severe sparsity issues, due to limited corpora available and the long-tail phenomenon of predicate distributions. In this paper, we propose a multi-stage method, Typed Predicate-Entailment Graph Generator (TP-EGG), to tackle this problem. |
Zhibin Chen; Yansong Feng; Dongyan Zhao; |
197 | Alleviating Over-smoothing for Unsupervised Sentence Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Experimentally, we observe that the over-smoothing problem reduces the capacity of these powerful PLMs, leading to sub-optimal sentence representations. In this paper, we present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue, which samples negatives from PLMs intermediate layers, improving the quality of the sentence representation. |
Nuo Chen; Linjun Shou; Jian Pei; Ming Gong; Bowen Cao; Jianhui Chang; Jia Li; Daxin Jiang; |
198 | Memory-efficient NLLB-200: Language-specific Expert Pruning of A Massively Multilingual Machine Translation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a pruning method that enables the removal of up to 80% of experts without further finetuning and with a negligible loss in translation quality, which makes it feasible to run the model on a single 32GB GPU. |
Yeskendir Koishekenov; Alexandre Berard; Vassilina Nikoulina; |
199 | DAMP: Doubly Aligned Multilingual Parser for Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work we dramatically improve the zero-shot performance of a multilingual and codeswitched semantic parsing system using two stages of multilingual alignment. |
William Held; Christopher Hidey; Fei Liu; Eric Zhu; Rahul Goel; Diyi Yang; Rushin Shah; |
200 | From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel open-vocabulary language model that adopts a hierarchical two-level approach: one at the word level and another at the sequence level. |
Li Sun; Florian Luisier; Kayhan Batmanghelich; Dinei Florencio; Cha Zhang; |
201 | MatSci-NLP: Evaluating Scientific Language Models on Materials Science Language Tasks Using Text-to-Schema Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MatSci-NLP, a natural language benchmark for evaluating the performance of natural language processing (NLP) models on materials science text. |
Yu Song; Santiago Miret; Bang Liu; |
202 | Code4Struct: Code Generation for Few-Shot Event Structure Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We observe that semantic structures can be conveniently translated into code and propose Code4Struct to leverage such text-to-structure translation capability to tackle structured prediction tasks. |
Xingyao Wang; Sha Li; Heng Ji; |
203 | GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Then, exhaustive human expert annotations are collected to build the ontology, concluding with 115 events and 220 argument roles, with a significant portion of roles not being entities. We utilize this ontology to further introduce GENEVA, a diverse generalizability benchmarking dataset comprising four test suites aimed at evaluating models' ability to handle limited data and unseen event type generalization. |
Tanmay Parekh; I-Hung Hsu; Kuan-Hao Huang; Kai-Wei Chang; Nanyun Peng; |
204 | Efficient Semiring-Weighted Earley Parsing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Earley's (1970) context-free parsing algorithm as a deduction system, incorporating various known and new speed-ups. |
Andreas Opedal; Ran Zmigrod; Tim Vieira; Ryan Cotterell; Jason Eisner; |
205 | Tree-Based Representation and Generation of Natural and Mathematical Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a series of modifications to existing language models to jointly represent and generate text and math: representing mathematical expressions as sequences of node tokens in their operator tree format, using math symbol and tree position embeddings to preserve the semantic and structural properties of mathematical expressions, and using a constrained decoding method to generate mathematically valid expressions. |
Alexander Scarlatos; Andrew Lan; |
206 | ParaLS: Lexical Substitution Via Pretrained Paraphraser Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Since we cannot directly generate the substitutes via commonly used decoding strategies, we propose two simple decoding strategies that focus on the variations of the target word during decoding. |
Jipeng Qiang; Kang Liu; Yun Li; Yunhao Yuan; Yi Zhu; |
207 | Peer-Label Assisted Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fully explore the peer-label relationship, we develop a PeerHTC method. |
Junru Song; Feifei Wang; Yang Yang; |
208 | Free Lunch for Efficient Textual Commonsense Integration in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, incorporating textual commonsense descriptions is computationally expensive, as compared to encoding conventional symbolic knowledge. In this paper, we propose a method to improve its efficiency without modifying the model. |
Wanyun Cui; Xingran Chen; |
209 | A Probabilistic Framework for Discovering New Intents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, starting from the intuition that discovering intents could be beneficial for identifying known intents, we propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables. |
Yunhua Zhou; Guofeng Quan; Xipeng Qiu; |
210 | MultiTACRED: A Multilingual Version of The TAC Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). To address this gap, we introduce the MultiTACRED dataset, covering 12 typologically diverse languages from 9 language families, which is created by machine-translating TACRED instances and automatically projecting their entity annotations. |
Leonhard Hennig; Philippe Thomas; Sebastian Möller; |
211 | Towards Higher Pareto Frontier in Multilingual Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new training framework, Pareto Mutual Distillation (Pareto-MD), towards pushing the Pareto frontier outwards rather than making trade-offs. |
Yichong Huang; Xiaocheng Feng; Xinwei Geng; Baohang Li; Bing Qin; |
212 | Small Pre-trained Language Models Can Be Fine-tuned As Large Models Via Over-Parameterization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on scaling up the parameters of PLMs only during fine-tuning, to benefit from the over-parameterization, while without increasing the inference latency. |
Ze-Feng Gao; Kun Zhou; Peiyu Liu; Wayne Xin Zhao; Ji-Rong Wen; |
213 | Entity Tracking in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a task probing to what extent a language model can infer the final state of an entity given an English description of the initial state and a series of state-changing operations. |
Najoung Kim; Sebastian Schuster; |
214 | A Textual Dataset for Situated Proactive Response Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, we introduce a task of proactive response selection based on situational information. |
Naoki Otani; Jun Araki; HyeongSik Kim; Eduard Hovy; |
215 | DiffusionNER: Boundary Diffusion for Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose DiffusionNER, which formulates the named entity recognition task as a boundary-denoising diffusion process and thus generates named entities from noisy spans. |
Yongliang Shen; Kaitao Song; Xu Tan; Dongsheng Li; Weiming Lu; Yueting Zhuang; |
216 | WACO: Word-Aligned Contrastive Learning for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Word-Aligned COntrastive learning (WACO), a simple and effective method for extremely low-resource speech-to-text translation. |
Siqi Ouyang; Rong Ye; Lei Li; |
217 | Cross-lingual Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm, where we analyze different categories of approaches used to continually adapt to emerging data from different languages. |
Meryem M'hamdi; Xiang Ren; Jonathan May; |
218 | Faithful Question Answering with Monte-Carlo Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose FAME (FAithful question answering with MontE-carlo planning) to answer questions based on faithful reasoning steps. |
Ruixin Hong; Hongming Zhang; Hong Zhao; Dong Yu; Changshui Zhang; |
219 | Unbalanced Optimal Transport for Unbalanced Word Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To achieve unbalanced word alignment that values both alignment and null alignment, this study shows that the family of optimal transport (OT), i.e., balanced, partial, and unbalanced OT, are natural and powerful approaches even without tailor-made techniques. |
Yuki Arase; Han Bao; Sho Yokoi; |
220 | Guiding Computational Stance Detection with Expanded Stance Triangle Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the limited amount of available training data leads to subpar performance in out-of-domain and cross-target scenarios, as data-driven approaches are prone to rely on superficial and domain-specific features. In this work, we decompose the stance detection task from a linguistic perspective, and investigate key components and inference paths in this task. |
Zhengyuan Liu; Yong Keong Yap; Hai Leong Chieu; Nancy Chen; |
221 | Analyzing and Reducing The Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper analyzes the fine-tuning process, discovers when the performance gap changes and identifies which network weights affect the overall performance most. |
Yiduo Guo; Yaobo Liang; Dongyan Zhao; Bing Liu; Nan Duan; |
222 | Improving Self-training for Cross-lingual Named Entity Recognition with Contrastive and Prototype Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to improve self-training for cross-lingual NER by combining representation learning and pseudo label refinement in one coherent framework. |
Ran Zhou; Xin Li; Lidong Bing; Erik Cambria; Chunyan Miao; |
223 | MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we propose MM-SHAP, a performance-agnostic multimodality score based on Shapley values that reliably quantifies in which proportions a multimodal model uses individual modalities. |
Letitia Parcalabescu; Anette Frank; |
224 | Towards Boosting The Open-Domain Chatbot with Human Feedback Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel and efficient framework Diamante to boost the open-domain chatbot, where two kinds of human feedback (including explicit demonstration and implicit preference) are collected and leveraged. |
Hua Lu; Siqi Bao; Huang He; Fan Wang; Hua Wu; Haifeng Wang; |
225 | Knowledge-enhanced Mixed-initiative Dialogue System for Emotional Support Conversations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we study the problem of mixed-initiative ESC where the user and system can both take the initiative in leading the conversation. |
Yang Deng; Wenxuan Zhang; Yifei Yuan; Wai Lam; |
226 | UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the reformulation, we propose a Unified Token-pair Classification architecture for Information Extraction (UTC-IE), where we introduce Plusformer on top of the token-pair feature matrix. |
Hang Yan; Yu Sun; Xiaonan Li; Yunhua Zhou; Xuanjing Huang; Xipeng Qiu; |
227 | Social-Group-Agnostic Bias Mitigation Via The Stereotype Content Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose that the Stereotype Content Model (SCM), a theoretical framework developed in social psychology for understanding the content of stereotyping, can help debiasing efforts to become social-group-agnostic by capturing the underlying connection between bias and stereotypes. |
Ali Omrani; Alireza Salkhordeh Ziabari; Charles Yu; Preni Golazizian; Brendan Kennedy; Mohammad Atari; Heng Ji; Morteza Dehghani; |
228 | Revisiting The Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale, and an in-depth analysis of human evaluation is lacking. Therefore, we address the shortcomings of existing summarization evaluation along the following axes: (1) We propose a modified summarization salience protocol, Atomic Content Units (ACUs), which is based on fine-grained semantic units and allows for a high inter-annotator agreement. |
Yixin Liu; Alex Fabbri; Pengfei Liu; Yilun Zhao; Linyong Nan; Ruilin Han; Simeng Han; Shafiq Joty; Chien-Sheng Wu; Caiming Xiong; Dragomir Radev; |
229 | FIREBALL: A Dataset of Dungeons and Dragons Actual-Play with Structured Game State Information Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present FIREBALL, a large dataset containing nearly 25,000 unique sessions from real D&D gameplay on Discord with true game state info. |
Andrew Zhu; Karmanya Aggarwal; Alexander Feng; Lara Martin; Chris Callison-Burch; |
230 | A Fine-grained Comparison of Pragmatic Language Understanding in Humans and Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We perform a fine-grained comparison of language models and humans on seven pragmatic phenomena, using zero-shot prompting on an expert-curated set of English materials. |
Jennifer Hu; Sammy Floyd; Olessia Jouravlev; Evelina Fedorenko; Edward Gibson; |
231 | Counterfactual Multihop QA: A Cause-Effect Approach for Reducing Disconnected Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the existing QA models always rely on shortcuts, e.g., providing the true answer from only one fact rather than multi-hop reasoning, which is referred to as the disconnected reasoning problem. To alleviate this issue, we propose a novel counterfactual multihop QA, a cause-effect approach that reduces disconnected reasoning. |
Wangzhen Guo; Qinkang Gong; Yanghui Rao; Hanjiang Lai; |
232 | Causal-Debias: Unifying Debiasing in Pretrained Language Models and Fine-tuning Via Causal Invariant Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified debiasing framework Causal-Debias to remove unwanted stereotypical associations in PLMs during fine-tuning. |
Fan Zhou; Yuzhou Mao; Liu Yu; Yi Yang; Ting Zhong; |
233 | Parameter-Efficient Fine-Tuning Without Introducing New Latency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we demonstrate the feasibility of generating a sparse mask in a task-agnostic manner, wherein all downstream tasks share a common mask. |
Baohao Liao; Yan Meng; Christof Monz; |
234 | MANNER: A Variational Memory-Augmented Model for Cross Domain Few-Shot Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the task of cross domain few-shot named entity recognition (NER), which aims to adapt the knowledge learned from source domain to recognize named entities in target domain with only a few labeled examples. To address this challenging task, we propose MANNER, a variational memory-augmented few-shot NER model. |
Jinyuan Fang; Xiaobin Wang; Zaiqiao Meng; Pengjun Xie; Fei Huang; Yong Jiang; |
235 | MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset with 51 Typologically-Diverse Languages Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the MASSIVE dataset: Multilingual Amazon Slu resource package (SLURP) for Slot-filling, Intent classification, and Virtual assistant Evaluation. |
Jack FitzGerald; Christopher Hench; Charith Peris; Scott Mackie; Kay Rottmann; Ana Sanchez; Aaron Nash; Liam Urbach; Vishesh Kakarala; Richa Singh; Swetha Ranganath; Laurie Crist; Misha Britan; Wouter Leeuwis; Gokhan Tur; Prem Natarajan; |
236 | Distilling Script Knowledge from Large Language Models for Constrained Language Planning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we define the task of constrained language planning for the first time. |
Siyu Yuan; Jiangjie Chen; Ziquan Fu; Xuyang Ge; Soham Shah; Charles Jankowski; Yanghua Xiao; Deqing Yang; |
237 | REDFM: A Filtered and Multilingual Relation Extraction Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current RE models often rely on small datasets with low coverage of relation types, particularly when working with languages other than English. In this paper, we address the above issue and provide two new resources that enable the training and evaluation of multilingual RE systems. |
Pere-Lluís Huguet Cabot; Simone Tedeschi; Axel-Cyrille Ngonga Ngomo; Roberto Navigli; |
238 | Modeling Appropriate Language in Argumentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we operationalize appropriate language in argumentation for the first time. |
Timon Ziegenbein; Shahbaz Syed; Felix Lange; Martin Potthast; Henning Wachsmuth; |
239 | CELDA: Leveraging Black-box Language Model As Enhanced Classifier Without Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Clustering-enhanced Linear Discriminative Analysis (CELDA), a novel approach that improves the text classification accuracy with a very weak-supervision signal (i. e. , name of the labels). |
Hyunsoo Cho; Youna Kim; Sang-goo Lee; |
240 | MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Multi-view Prompting (MVP) that aggregates sentiment elements generated in different orders, leveraging the intuition of human-like problem-solving processes from different views. |
Zhibin Gou; Qingyan Guo; Yujiu Yang; |
241 | ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose ACCENT, an event commonsense evaluation metric empowered by commonsense knowledge bases (CSKBs). |
Sarik Ghazarian; Yijia Shao; Rujun Han; Aram Galstyan; Nanyun Peng; |
242 | Explanation-based Finetuning Makes Models More Robust to Spurious Cues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose explanation-based finetuning as a general approach to mitigate LLMs' reliance on spurious correlations. |
Josh Magnus Ludan; Yixuan Meng; Tai Nguyen; Saurabh Shah; Qing Lyu; Marianna Apidianaki; Chris Callison-Burch; |
243 | CAME: Confidence-guided Adaptive Memory Efficient Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first study a confidence-guided strategy to reduce the instability of existing memory efficient optimizers. Based on this strategy, we propose CAME to simultaneously achieve two goals: fast convergence as in traditional adaptive methods, and low memory usage as in memory-efficient methods. |
Yang Luo; Xiaozhe Ren; Zangwei Zheng; Zhuo Jiang; Xin Jiang; Yang You; |
244 | On Second Thought, Let's Not Think Step By Step! Bias and Toxicity in Zero-Shot Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Concretely, we perform a controlled evaluation of zero-shot CoT across two socially sensitive domains: harmful questions and stereotype benchmarks. |
Omar Shaikh; Hongxin Zhang; William Held; Michael Bernstein; Diyi Yang; |
245 | Solving Math Word Problems Via Cooperative Reasoning Induced Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We notice that human reasoning has a dual reasoning framework that consists of an immediate reaction system (system 1) and a delicate reasoning system (system 2), where the entire reasoning is determined by their interaction. This inspires us to develop a cooperative reasoning-induced PLM for solving MWPs, called Cooperative Reasoning (CoRe), resulting in a human-like reasoning architecture with system 1 as the generator and system 2 as the verifier. |
Xinyu Zhu; Junjie Wang; Lin Zhang; Yuxiang Zhang; Yongfeng Huang; Ruyi Gan; Jiaxing Zhang; Yujiu Yang; |
246 | Exploiting Biased Models to De-bias Text: A Gender-Fair Rewriting Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To eliminate the rule-based nature of data creation, we instead propose using machine translation models to create gender-biased text from real gender-fair text via round-trip translation. |
Chantal Amrhein; Florian Schottmann; Rico Sennrich; Samuel Läubli; |
247 | Early Discovery of Disappearing Entities in Microblogs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We make decisions by reacting to changes in the real world, particularly the emergence and disappearance of impermanent entities such as restaurants, services, and events. |
Satoshi Akasaki; Naoki Yoshinaga; Masashi Toyoda; |
248 | DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present DiffusionBERT, a new generative masked language model based on discrete diffusion models. |
Zhengfu He; Tianxiang Sun; Qiong Tang; Kuanning Wang; Xuanjing Huang; Xipeng Qiu; |
249 | Lifting The Curse of Capacity Gap in Distilling Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim at lifting the curse of capacity gap via enlarging the capacity of the student without notably increasing the inference compute. |
Chen Zhang; Yang Yang; Jiahao Liu; Jingang Wang; Yunsen Xian; Benyou Wang; Dawei Song; |
250 | Towards Faithful Dialogues Via Focus Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing models heavily rely on elaborate data engineering or increasing the model's parameters, ignoring the tokens that significantly influence losses, which are decisive for the optimization direction of the model in each iteration. To address this issue, we propose Focus Learning (FocusL), a novel learning approach that adjusts the contribution of each token to the optimization direction by directly scaling the corresponding objective loss. |
Yifan Deng; Xingsheng Zhang; Heyan Huang; Yue Hu; |
251 | Back Translation for Speech-to-text Translation Without Transcripts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to utilize large amounts of target-side monolingual data to enhance ST without transcripts. |
Qingkai Fang; Yang Feng; |
252 | Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our method, Prompter, uses descriptions of target domain slots to generate dynamic prefixes that are concatenated to the key and values at each layer's self-attention mechanism. |
Ibrahim Taha Aksu; Min-Yen Kan; Nancy Chen; |
253 | Enhancing Dialogue Generation Via Dynamic Graph Knowledge Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose a novel framework for knowledge graph enhanced dialogue generation. |
Chen Tang; Hongbo Zhang; Tyler Loakman; Chenghua Lin; Frank Guerin; |
254 | Multi-modal Action Chain Abductive Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate this research community, this paper sheds new light on Abductive Reasoning by studying a new vision-language task, Multi-modal Action chain abductive Reasoning (MAR), together with a large-scale Abductive Reasoning dataset: Given an incomplete set of language described events, MAR aims to imagine the most plausible event by spatio-temporal grounding in past video and then infer the hypothesis of subsequent action chain that can best explain the language premise. |
Mengze Li; Tianbao Wang; Jiahe Xu; Kairong Han; Shengyu Zhang; Zhou Zhao; Jiaxu Miao; Wenqiao Zhang; Shiliang Pu; Fei Wu; |
255 | Exploring The Capacity of Pretrained Language Models for Reasoning About Actions and Change Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose four essential RAC tasks as a comprehensive textual benchmark and generate problems in a way that minimizes the influence of other linguistic requirements (e.g., grounding) to focus on RAC. |
Weinan He; Canming Huang; Zhanhao Xiao; Yongmei Liu; |
256 | Unified Demonstration Retriever for In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Unified Demonstration Retriever (UDR), a single model to retrieve demonstrations for a wide range of tasks. |
Xiaonan Li; Kai Lv; Hang Yan; Tianyang Lin; Wei Zhu; Yuan Ni; Guotong Xie; Xiaoling Wang; Xipeng Qiu; |
257 | Movie101: A New Movie Understanding Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing works benchmark this challenge as a normal video captioning task via some simplifications, such as removing role names and evaluating narrations with n-gram-based metrics, which makes it difficult for automatic systems to meet the needs of real application scenarios. To narrow this gap, we construct a large-scale Chinese movie benchmark, named Movie101. |
Zihao Yue; Qi Zhang; Anwen Hu; Liang Zhang; Ziheng Wang; Qin Jin; |
258 | Enhancing Language Representation with Constructional Information for Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, PLMs primarily focus on acquiring lexico-semantic information, while they may be unable to adequately handle the meaning of constructions. To address this issue, we introduce construction grammar (CxG), which highlights the pairings of form and meaning, to enrich language representation. |
Lvxiaowei Xu; Jianwang Wu; Jiawei Peng; Zhilin Gong; Ming Cai; Tianxiang Wang; |
259 | Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs. |
Siyuan Wang; Zhongyu Wei; Meng Han; Zhihao Fan; Haijun Shan; Qi Zhang; Xuanjing Huang; |
260 | DimonGen: Diversified Generative Commonsense Reasoning for Explaining Concept Relationships Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose DimonGen, which aims to generate diverse sentences describing concept relationships in various everyday scenarios. |
Chenzhengyi Liu; Jie Huang; Kerui Zhu; Kevin Chen-Chuan Chang; |
261 | Incorporating Attribution Importance for Improving Faithfulness Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a simple yet effective soft erasure criterion. |
Zhixue Zhao; Nikolaos Aletras; |
262 | Reward Gaming in Conditional Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under this framework, we identify three common cases where high rewards are incorrectly assigned to undesirable patterns: noise-induced spurious correlation, naturally occurring spurious correlation, and covariate shift. |
Richard Yuanzhe Pang; Vishakh Padmakumar; Thibault Sellam; Ankur Parikh; He He; |
263 | Hidden Schema Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we introduce a novel neural language model that enforces, via inductive biases, explicit relational structures which allow for compositionality onto the output representations of pretrained language models. |
Ramses Sanchez; Lukas Conrads; Pascal Welke; Kostadin Cvejoski; Cesar Ojeda Marin; |
264 | Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a novel method that operates on the hidden representations of a PLM to reduce overfitting. |
Linlin Liu; Xingxuan Li; Megh Thakkar; Xin Li; Shafiq Joty; Luo Si; Lidong Bing; |
265 | An Ordinal Latent Variable Model of Conflict Intensity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a probabilistic generative model that assumes each observed event is associated with a latent intensity class. |
Niklas Stoehr; Lucas Torroba Hennigen; Josef Valvoda; Robert West; Ryan Cotterell; Aaron Schein; |
266 | Multilingual Conceptual Coverage in Text-to-Image Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose "Conceptual Coverage Across Languages" (CoCo-CroLa), a technique for benchmarking the degree to which any generative text-to-image system provides multilingual parity to its training language in terms of tangible nouns. |
Michael Saxon; William Yang Wang; |
267 | Pre-Training to Learn in Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability by pre-training the model on a large collection of "intrinsic tasks" in the general plain-text corpus using the simple language modeling objective. |
Yuxian Gu; Li Dong; Furu Wei; Minlie Huang; |
268 | Ethical Considerations for Machine Translation of Indigenous Languages: Giving A Voice to The Speakers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The data collection, modeling and deploying machine translation systems thus result in new ethical questions that must be addressed. Motivated by this, we first survey the existing literature on ethical considerations for the documentation, translation, and general natural language processing for Indigenous languages. Afterward, we conduct and analyze an interview study to shed light on the positions of community leaders, teachers, and language activists regarding ethical concerns for the automatic translation of their languages. |
Manuel Mager; Elisabeth Mager; Katharina Kann; Ngoc Thang Vu; |
269 | Revisiting Non-English Text Simplification: A Unified Multilingual Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the MultiSim benchmark, a collection of 27 resources in 12 distinct languages containing over 1. |
Michael Ryan; Tarek Naous; Wei Xu; |
270 | Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. |
Yu Gu; Xiang Deng; Yu Su; |
271 | Privacy-Preserving Domain Adaptation of Semantic Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study ways in which realistic user utterances can be generated synthetically, to help increase the linguistic and functional coverage of the system, without compromising the privacy of actual users. |
Fatemehsadat Mireshghallah; Yu Su; Tatsunori Hashimoto; Jason Eisner; Richard Shin; |
272 | Guide The Many-to-One Assignment: Open Information Extraction Via IoU-aware Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The commonly utilized Hungarian algorithm for this procedure is restricted to handling one-to-one assignment among the desired tuples and tuple proposals, which ignores the correlation between proposals and affects the recall of the models. To solve this problem, we propose a dynamic many-to-one label assignment strategy named IOT. |
Kaiwen Wei; Yiran Yang; Li Jin; Xian Sun; Zequn Zhang; Jingyuan Zhang; Xiao Li; Linhao Zhang; Jintao Liu; Guo Zhi; |
273 | Actively Supervised Clustering for Open Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel setting, named actively supervised clustering for OpenRE. |
Jun Zhao; Yongxin Zhang; Qi Zhang; Tao Gui; Zhongyu Wei; Minlong Peng; Mingming Sun; |
274 | ConvGQR: Generative Query Reformulation for Conversational Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose ConvGQR, a new framework to reformulate conversational queries based on generative pre-trained language models (PLMs), one for query rewriting and another for generating potential answers. |
Fengran Mo; Kelong Mao; Yutao Zhu; Yihong Wu; Kaiyu Huang; Jian-Yun Nie; |
275 | KILM: Knowledge Injection Into Encoder-Decoder Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs, via a generative knowledge infilling objective through continued pre-training. |
Yan Xu; Mahdi Namazifar; Devamanyu Hazarika; Aishwarya Padmakumar; Yang Liu; Dilek Hakkani-Tur; |
276 | VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Video-grounded Scene&Topic AwaRe dialogue (VSTAR) dataset, a large scale video-grounded dialogue understanding dataset based on 395 TV series. |
Yuxuan Wang; Zilong Zheng; Xueliang Zhao; Jinpeng Li; Yueqian Wang; Dongyan Zhao; |
277 | NLPeer: A Unified Resource for The Computational Study of Peer Review Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To remedy this, we introduce NLPeer, the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues. |
Nils Dycke; Ilia Kuznetsov; Iryna Gurevych; |
278 | IM-TQA: A Chinese Table Question Answering Dataset with Implicit and Multi-type Table Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Correspondingly, we propose an RGCN-RCI framework outperforming recent baselines. |
Mingyu Zheng; Yang Hao; Wenbin Jiang; Zheng Lin; Yajuan Lyu; QiaoQiao She; Weiping Wang; |
279 | Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. |
Pengcheng He; Baolin Peng; Song Wang; Yang Liu; Ruochen Xu; Hany Hassan; Yu Shi; Chenguang Zhu; Wayne Xiong; Michael Zeng; Jianfeng Gao; Xuedong Huang; |
280 | Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models’ Memories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters. |
Shizhe Diao; Tianyang Xu; Ruijia Xu; Jiawei Wang; Tong Zhang; |
281 | Unsupervised Graph-Text Mutual Conversion with A Unified Pretrained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose INFINITY, a simple yet effective unsupervised method with a unified pretrained language model that does not introduce external annotation tools or additional parallel information. |
Yi Xu; Shuqian Sheng; Jiexing Qi; Luoyi Fu; Zhouhan Lin; Xinbing Wang; Chenghu Zhou; |
282 | Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RSMI, a novel two-stage framework that combines randomized smoothing (RS) with masked inference (MI) to improve the adversarial robustness of NLP systems. |
Han Cheol Moon; Shafiq Joty; Ruochen Zhao; Megh Thakkar; Chi Xu; |
283 | SESCORE2: Learning Text Generation Evaluation Via Synthesizing Realistic Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SEScore2, a self-supervised approach for training a model-based metric for text generation evaluation. |
Wenda Xu; Xian Qian; Mingxuan Wang; Lei Li; William Yang Wang; |
284 | Tokenization and The Noiseless Channel Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose that good tokenizers lead to efficient channel usage, where the channel is the means by which some input is conveyed to the model and efficiency can be quantified in information-theoretic terms as the ratio of the Shannon entropy to the maximum entropy of the subword distribution. |
Vilém Zouhar; Clara Meister; Juan Gastaldi; Li Du; Mrinmaya Sachan; Ryan Cotterell; |
285 | Contextual Distortion Reveals Constituency: Masked Language Models Are Implicit Parsers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent advancements in pre-trained language models (PLMs) have demonstrated that these models possess some degree of syntactic awareness. To leverage this knowledge, we propose a novel chart-based method for extracting parse trees from masked language models (LMs) without the need to train separate parsers. |
Jiaxi Li; Wei Lu; |
286 | MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the data scarcity issue, we propose MetaAdapt, a meta learning based approach for domain adaptive few-shot misinformation detection. |
Zhenrui Yue; Huimin Zeng; Yang Zhang; Lanyu Shang; Dong Wang; |
287 | Tackling Modality Heterogeneity with Multi-View Calibration Network for Multimodal Sentiment Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: All these issues will increase the difficulty in understanding the sentiment of the multimodal content. In this paper, we propose a novel Multi-View Calibration Network (MVCN) to alleviate the above issues systematically. |
Yiwei Wei; Shaozu Yuan; Ruosong Yang; Lei Shen; Zhangmeizhi Li; Longbiao Wang; Meng Chen; |
288 | COLA: Contextualized Commonsense Causal Reasoning from The Causal Inference Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new task to detect commonsense causation between two events in an event sequence (i.e., context), called contextualized commonsense causal reasoning. |
Zhaowei Wang; Quyet V. Do; Hongming Zhang; Jiayao Zhang; Weiqi Wang; Tianqing Fang; Yangqiu Song; Ginny Wong; Simon See; |
289 | MEMEX: Detecting Explanatory Evidence for Memes Via Knowledge-Enriched Contextualization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel task, MEMEX – given a meme and a related document, the aim is to mine the context that succinctly explains the background of the meme. |
Shivam Sharma; Ramaneswaran S; Udit Arora; Md. Shad Akhtar; Tanmoy Chakraborty; |
290 | WikiHowQA: A Comprehensive Benchmark for Multi-Document Non-Factoid Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is a critical need for high-quality resources for multi-document NFQA (MD-NFQA) to train new models and evaluate answers’ grounding and factual consistency in relation to supporting documents. To address this gap, we introduce WikiHowQA, a new multi-document NFQA benchmark built on WikiHow, a website dedicated to answering “how-to” questions. |
Valeriia Bolotova-Baranova; Vladislav Blinov; Sofya Filippova; Falk Scholer; Mark Sanderson; |
291 | Making Language Models Better Reasoners with Step-Aware Verifier Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present DiVeRSe (Diverse Verifier on Reasoning Step), a novel approach that further enhances the reasoning capability of language models. |
Yifei Li; Zeqi Lin; Shizhuo Zhang; Qiang Fu; Bei Chen; Jian-Guang Lou; Weizhu Chen; |
292 | Distributed Marker Representation for Ambiguous Discourse Markers and Entangled Relations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose to learn a Distributed Marker Representation (DMR) by utilizing the (potentially) unlimited discourse marker data with a latent discourse sense, thereby bridging markers with sentence pairs. |
Dongyu Ru; Lin Qiu; Xipeng Qiu; Yue Zhang; Zheng Zhang; |
293 | MISGENDERED: Limits of Large Language Models in Understanding Pronouns Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we comprehensively evaluate popular language models for their ability to correctly use English gender-neutral pronouns (e.g., singular they, them) and neo-pronouns (e.g., ze, xe, thon) that are used by individuals whose gender identity is not represented by binary pronouns. |
Tamanna Hossain; Sunipa Dev; Sameer Singh; |
294 | Reasoning with Language Model Prompting: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce research works with comparisons and summaries and provide systematic resources to help beginners. |
Shuofei Qiao; Yixin Ou; Ningyu Zhang; Xiang Chen; Yunzhi Yao; Shumin Deng; Chuanqi Tan; Fei Huang; Huajun Chen; |
295 | Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new MMT approach based on a strong text-only MT model, which uses neural adapters, a novel guided self-attention mechanism and which is jointly trained on both visually-conditioned masking and MMT. |
Matthieu Futeral; Cordelia Schmid; Ivan Laptev; Benoît Sagot; Rachel Bawden; |
296 | Hybrid Knowledge Transfer for Improved Cross-Lingual Event Detection Via Hierarchical Sample Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the Event Detection task under a zero-shot cross-lingual setting where a model is trained on a source language but evaluated on a distinct target language for which there is no labeled data available. |
Luis Guzman Nateras; Franck Dernoncourt; Thien Nguyen; |
297 | BLEURT Has Universal Translations: An Analysis of Automatic Metrics By Minimum Risk Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we systematically analyze and compare various mainstream and cutting-edge automatic metrics from the perspective of their guidance for training machine translation systems. |
Yiming Yan; Tao Wang; Chengqi Zhao; Shujian Huang; Jiajun Chen; Mingxuan Wang; |
298 | Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that a critical component lacking from current vision-language models is relation-level alignment: the ability to match directional semantic relations in text (e.g., “mug in grass”) with spatial relationships in the image (e.g., the position of the mug relative to the grass). To tackle this problem, we show that relation alignment can be enforced by encouraging the language attention from “mug” to “grass” (capturing the semantic relation “in”) to match the visual attention from the mug to the grass (capturing the corresponding physical relation). |
Rohan Pandey; Rulin Shao; Paul Pu Liang; Ruslan Salakhutdinov; Louis-Philippe Morency; |
299 | Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we combine the advantages of the three resources to obtain a richer and more accurate persona. |
Yihong Tang; Bo Wang; Miao Fang; Dongming Zhao; Kun Huang; Ruifang He; Yuexian Hou; |
300 | Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We take a step forward and study LMs’ abilities to make inferences based on injected facts (or propagate those facts): for example, after learning that something is a TV show, does an LM predict that you can watch it? |
Yasumasa Onoe; Michael Zhang; Shankar Padmanabhan; Greg Durrett; Eunsol Choi; |
301 | Explaining How Transformers Use Context to Build Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we leverage recent advances in explainability of the Transformer and present a procedure to analyze models for language generation. |
Javier Ferrando; Gerard I. Gállego; Ioannis Tsiamas; Marta R. Costa-jussà; |
302 | DISCO: Distilling Counterfactuals with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce DISCO (DIStilled COunterfactual Data), a new method for automatically generating high-quality counterfactual data at scale. |
Zeming Chen; Qiyue Gao; Antoine Bosselut; Ashish Sabharwal; Kyle Richardson; |
303 | Non-Sequential Graph Script Induction Via Multimedia Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the new challenging task of non-sequential graph script induction, aiming to capture optional and interchangeable steps in procedural planning. |
Yu Zhou; Sha Li; Manling Li; Xudong Lin; Shih-Fu Chang; Mohit Bansal; Heng Ji; |
304 | SCOTT: Self-Consistent Chain-of-Thought Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose SCOTT, a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger. |
Peifeng Wang; Zhengyang Wang; Zheng Li; Yifan Gao; Bing Yin; Xiang Ren; |
305 | Clinical Note Owns Its Hierarchy: Multi-Level Hypergraph Neural Networks for Patient-Level Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, we propose a taxonomy-aware multi-level hypergraph neural network (TM-HGNN), where multi-level hypergraphs assemble useful neutral words with rare keywords via note and taxonomy level hyperedges to retain the clinical semantic information. |
Nayeon Kim; Yinhua Piao; Sun Kim; |
306 | Incorporating Distributions of Discourse Structure for Long Document Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the “RSTformer”, a novel summarization model that comprehensively incorporates both the types and uncertainty of rhetorical relations. |
Dongqi Pu; Yifan Wang; Vera Demberg; |
307 | Evaluating Open-Domain Question Answering in The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a thorough analysis of various open-domain QA models, including LLMs, by manually evaluating their answers on a subset of NQ-open, a popular benchmark. |
Ehsan Kamalloo; Nouha Dziri; Charles Clarke; Davood Rafiei; |
308 | No Clues Good Clues: Out of Context Lexical Relation Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are indications that commonly used PTLMs already encode enough linguistic knowledge to allow the use of minimal (or no) textual context for some linguistically motivated tasks, thus notably reducing human effort, the need for data pre-processing, and favoring techniques that are language neutral since they do not rely on syntactic structures. In this work, we explore this idea for the tasks of lexical relation classification (LRC) and graded Lexical Entailment (LE). |
Lucia Pitarch; Jordi Bernad; Lacramioara Dranca; Carlos Bobed Lisbona; Jorge Gracia; |
309 | Won’t Get Fooled Again: Answering Questions with False Premises Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such frailties of PLMs are often attributed to a lack of knowledge within them. In this paper, we find that the PLMs already possess the knowledge required to rebut such questions, and the key is how to activate the knowledge. |
Shengding Hu; Yifan Luo; Huadong Wang; Xingyi Cheng; Zhiyuan Liu; Maosong Sun; |
310 | What The DAAM: Interpreting Stable Diffusion Using Cross Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we perform a text-image attribution analysis on Stable Diffusion, a recently open-sourced model. |
Raphael Tang; Linqing Liu; Akshat Pandey; Zhiying Jiang; Gefei Yang; Karun Kumar; Pontus Stenetorp; Jimmy Lin; Ferhan Ture; |
311 | Zero-shot Faithful Factual Error Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Faithfully correcting factual errors is critical for maintaining the integrity of textual knowledge bases and preventing hallucinations in sequence-to-sequence models. Drawing on humans’ ability to identify and correct factual errors, we present a zero-shot framework that formulates questions about input claims, looks for correct answers in the given evidence, and assesses the faithfulness of each correction based on its consistency with the evidence. |
Kung-Hsiang Huang; Hou Pong Chan; Heng Ji; |
312 | Open-Domain Hierarchical Event Schema Induction By Incremental Prompting and Verification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, we propose to treat event schemas as a form of commonsense knowledge that can be derived from large language models (LLMs). |
Sha Li; Ruining Zhao; Manling Li; Heng Ji; Chris Callison-Burch; Jiawei Han; |
313 | Zero-shot Approach to Overcome Perturbation Sensitivity of Prompts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study aims to find high-quality prompts for the given task in a zero-shot setting. |
Mohna Chakraborty; Adithya Kulkarni; Qi Li; |
314 | Free Lunch: Robust Cross-Lingual Transfer Via Model Checkpoint Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, aiming to improve the robustness of “true” ZS-XLT and FS-XLT, we propose a simple and effective method that averages different checkpoints (i.e., model snapshots) during task fine-tuning. |
Fabian Schmidt; Ivan Vulic; Goran Glavaš; |
315 | Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Cross-View Language Modeling, a simple and effective pre-training framework that unifies cross-lingual and cross-modal pre-training with shared architectures and objectives. |
Yan Zeng; Wangchunshu Zhou; Ao Luo; Ziming Cheng; Xinsong Zhang; |
316 | Unsupervised Discontinuous Constituency Parsing with Mildly Context-Sensitive Grammars Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study grammar induction with mildly context-sensitive grammars for unsupervised discontinuous parsing. |
Songlin Yang; Roger Levy; Yoon Kim; |
317 | Simplicity Bias in Transformers and Their Ability to Learn Sparse Boolean Functions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct an extensive empirical study on Boolean functions to demonstrate the following: (i) Random Transformers are relatively more biased towards functions of low sensitivity. |
Satwik Bhattamishra; Arkil Patel; Varun Kanade; Phil Blunsom; |
318 | Counterspeeches Up My Sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore intent-conditioned counterspeech generation. |
Rishabh Gupta; Shaily Desai; Manvi Goel; Anil Bandhakavi; Tanmoy Chakraborty; Md. Shad Akhtar; |
319 | DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Choosing an informative subset of speech samples that are most representative of the target accents becomes important for effective ASR finetuning. To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn), which uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget. |
Suraj Kothawade; Anmol Mekala; D.Chandra Sekhara Hetha Havya; Mayank Kothyari; Rishabh Iyer; Ganesh Ramakrishnan; Preethi Jyothi; |
320 | Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Verify-and-Edit framework for CoT prompting, which seeks to increase prediction factuality by post-editing reasoning chains according to external knowledge. |
Ruochen Zhao; Xingxuan Li; Shafiq Joty; Chengwei Qin; Lidong Bing; |
321 | Bridging The Domain Gaps in Context Representations for K-Nearest Neighbor Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there often exists a significant gap between upstream and downstream domains, which hurts the datastore retrieval and the final translation quality. To deal with this issue, we propose a novel approach to boost the datastore retrieval of kNN-MT by reconstructing the original datastore. |
Zhiwei Cao; Baosong Yang; Huan Lin; Suhang Wu; Xiangpeng Wei; Dayiheng Liu; Jun Xie; Min Zhang; Jinsong Su; |
322 | Node Placement in Argument Maps: Modeling Unidirectional Relations in High & Low-Resource Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To support those users, we introduce the task of node placement: suggesting candidate nodes as parents for a new contribution. We establish an upper-bound of human performance, and conduct experiments with models of various sizes and training strategies. |
Iman Jundi; Neele Falk; Eva Maria Vecchi; Gabriella Lapesa; |
323 | Towards A Common Understanding of Contributing Factors for Cross-Lingual Transfer in Multilingual Language Models: A Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, given that the aspiration for such an ability has not been explicitly incorporated in the design of the majority of MLLMs, it is challenging to obtain a unique and straightforward explanation for its emergence. In this review paper, we survey literature that investigates different factors contributing to the capacity of MLLMs to perform zero-shot cross-lingual transfer and subsequently outline and discuss these factors in detail. |
Fred Philippy; Siwen Guo; Shohreh Haddadan; |
324 | Toward Human-Like Evaluation for Natural Language Generation with Error Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we argue that the ability to estimate sentence confidence is the tip of the iceberg for PLM-based metrics. |
Qingyu Lu; Liang Ding; Liping Xie; Kanjian Zhang; Derek F. Wong; Dacheng Tao; |
325 | Connective Prediction for Implicit Discourse Relation Recognition Via Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, these approaches spend lots of effort on template construction, negatively affecting the generalization capability. To address these problems, we propose a novel Connective Prediction via Knowledge Distillation (CP-KD) approach to instruct large-scale pre-trained language models (PLMs) in mining the latent correlations between connectives and discourse relations, which is meaningful for IDRR. |
Hongyi Wu; Hao Zhou; Man Lan; Yuanbin Wu; Yadong Zhang; |
326 | What Is The Best Recipe for Character-level Encoder-only Modelling? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper aims to benchmark recent progress in language understanding models that output contextualised representations at the character level. |
Kris Cao; |
327 | Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a more practical and scalable setting: weakly supervised multilingual VLP with only English image-text pairs and multilingual text corpora. |
Zejun Li; Zhihao Fan; Jingjing Chen; Qi Zhang; Xuanjing Huang; Zhongyu Wei; |
328 | Learning “O” Helps for Learning More: Handling The Unlabeled Entity Problem for Class-incremental NER Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct an empirical study on the “Unlabeled Entity Problem” and find that it leads to severe confusion between “O” and entities, decreasing class discrimination of old classes and declining the model’s ability to learn new classes. |
Ruotian Ma; Xuanting Chen; Zhang Lin; Xin Zhou; Junzhe Wang; Tao Gui; Qi Zhang; Xiang Gao; Yun Wen Chen; |
329 | Scene Graph As Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate a more realistic unsupervised multimodal machine translation (UMMT) setup, inference-time image-free UMMT, where the model is trained with source-text image pairs, and tested with only source-text inputs. |
Hao Fei; Qian Liu; Meishan Zhang; Min Zhang; Tat-Seng Chua; |
330 | CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these methods may suffer from label noise due to the automatic labeling process. In this paper, we propose CoLaDa, a Collaborative Label Denoising Framework, to address this problem. |
Tingting Ma; Qianhui Wu; Huiqiang Jiang; Börje Karlsson; Tiejun Zhao; Chin-Yew Lin; |
331 | Dialect-robust Evaluation of Generated Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a suite of methods to assess whether metrics are dialect robust. |
Jiao Sun; Thibault Sellam; Elizabeth Clark; Tu Vu; Timothy Dozat; Dan Garrette; Aditya Siddhant; Jacob Eisenstein; Sebastian Gehrmann; |
332 | Understanding and Improving The Robustness of Terminology Constraints in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the robustness of two typical terminology translation methods: Placeholder (PH) and Code-Switch (CS), concerning (1) the number of constraints and (2) the target constraint length. |
Huaao Zhang; Qiang Wang; Bo Qin; Zelin Shi; Haibo Wang; Ming Chen; |
333 | Language Model Acceptability Judgements Are Not Always Robust to Context Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we vary the input contexts based on: length, the types of syntactic phenomena it contains, and whether or not there are grammatical violations. |
Koustuv Sinha; Jon Gauthier; Aaron Mueller; Kanishka Misra; Keren Fuentes; Roger Levy; Adina Williams; |
334 | RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our results indicate that both state-of-the-art Table QA models and large language models (e.g., GPT-3) with few-shot learning falter in these adversarial sets. We propose to address this problem by using large language models to generate adversarial examples to enhance training, which significantly improves the robustness of Table QA models. |
Yilun Zhao; Chen Zhao; Linyong Nan; Zhenting Qi; Wenlin Zhang; Xiangru Tang; Boyu Mi; Dragomir Radev; |
335 | Morphological Inflection: A Reality Check Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To improve generalizability and reliability of results, we propose new data sampling and evaluation strategies that better reflect likely use-cases. |
Jordan Kodner; Sarah Payne; Salam Khalifa; Zoey Liu; |
336 | TOME: A Two-stage Approach for Model-based Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its attractive qualities, there remain several major challenges in model-based retrieval, including the discrepancy between pre-training and fine-tuning, and the discrepancy between training and inference. To deal with the above challenges, we propose a novel two-stage model-based retrieval approach called TOME, which makes two major technical contributions, including the utilization of tokenized URLs as identifiers and the design of a two-stage generation architecture. |
Ruiyang Ren; Wayne Xin Zhao; Jing Liu; Hua Wu; Ji-Rong Wen; Haifeng Wang; |
337 | Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation. |
Frank Palma Gomez; Subhadarshi Panda; Michael Flor; Alla Rozovskaya; |
338 | Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-intense Argumentation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a new unsupervised method for constructing Contextualized Commonsense Knowledge Graphs (CCKGs) that selects contextually relevant knowledge from large knowledge graphs (KGs) efficiently and at high quality. |
Moritz Plenz; Juri Opitz; Philipp Heinisch; Philipp Cimiano; Anette Frank; |
339 | MiCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. |
Tassilo Klein; Moin Nabi; |
340 | Learning Non-linguistic Skills Without Sacrificing Linguistic Proficiency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take a closer look into the phenomena of catastrophic forgetting as it pertains to LLMs and subsequently offer a novel framework for non-linguistic skill injection for LLMs based on information-theoretic interventions and skill-specific losses that enable the learning of strict arithmetic reasoning. |
Mandar Sharma; Nikhil Muralidhar; Naren Ramakrishnan; |
341 | Forgotten Knowledge: Examining The Citational Amnesia in NLP Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work systematically and empirically examines: How far back in time do we tend to go to cite papers? |
Janvijay Singh; Mukund Rungta; Diyi Yang; Saif Mohammad; |
342 | Measuring The Instability of Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze SD and six other measures quantifying instability of different granularity levels. |
Yupei Du; Dong Nguyen; |
343 | FairPrism: Evaluating Fairness-Related Harms in Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To that end, we introduce FairPrism, a dataset of 5,000 examples of AI-generated English text with detailed human annotations covering a diverse set of harms relating to gender and sexuality. |
Eve Fleisig; Aubrie Amstutz; Chad Atalla; Su Lin Blodgett; Hal Daumé III; Alexandra Olteanu; Emily Sheng; Dan Vann; Hanna Wallach; |
344 | Factually Consistent Summarization Via Reinforcement Learning with Textual Entailment Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. |
Paul Roit; Johan Ferret; Lior Shani; Roee Aharoni; Geoffrey Cideron; Robert Dadashi; Matthieu Geist; Sertan Girgin; Leonard Hussenot; Orgad Keller; Nikola Momchev; Sabela Ramos Garea; Piotr Stanczyk; Nino Vieillard; Olivier Bachem; Gal Elidan; Avinatan Hassidim; Olivier Pietquin; Idan Szpektor; |
345 | SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose SIMMC-VR, an extension of the SIMMC-2.0 dataset. |
Te-Lin Wu; Satwik Kottur; Andrea Madotto; Mahmoud Azab; Pedro Rodriguez; Babak Damavandi; Nanyun Peng; Seungwhan Moon; |
346 | Multilingual LLMs Are Better Cross-lingual In-context Learners with Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the prevalent mode of selecting random input-label pairs to construct the prompt-context is severely limited in the case of cross-lingual ICL, primarily due to the lack of alignment in the input as well as the output spaces. To mitigate this, we propose a novel prompt construction strategy, Cross-lingual In-context Source Target Alignment (X-InSTA). |
Eshaan Tanwar; Subhabrata Dutta; Manish Borthakur; Tanmoy Chakraborty; |
347 | APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose APOLLO, a simple adaptive pretraining approach to improve the logical reasoning skills of language models. |
Soumya Sanyal; Yichong Xu; Shuohang Wang; Ziyi Yang; Reid Pryzant; Wenhao Yu; Chenguang Zhu; Xiang Ren; |
348 | MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Furthermore, multi-table operations often result in a tabular output, which necessitates table generation capabilities of tabular QA models. To fill this gap, we propose a new task of answering questions over multiple tables. |
Vaishali Pal; Andrew Yates; Evangelos Kanoulas; Maarten de Rijke; |
349 | To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this paper points out that multi-hop relation rules are hard to memorize reliably due to the inherent deficiencies of such an implicit memorization strategy, making embedding models underperform in predicting links between distant entity pairs. To alleviate this problem, we present the Vertical Learning Paradigm (VLP), which extends embedding models by allowing them to explicitly copy target information from related factual triples for more accurate prediction. |
Rui Li; Xu Chen; Chaozhuo Li; Yanming Shen; Jianan Zhao; Yujing Wang; Weihao Han; Hao Sun; Weiwei Deng; Qi Zhang; Xing Xie; |
350 | CoAD: Automatic Diagnosis Through Symptom and Disease Collaborative Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the simplicity and superior performance demonstrated, a decline in disease diagnosis accuracy is observed, caused by 1) a mismatch between symptoms observed during training and generation, and 2) the effect of different symptom orders on disease prediction. To address the above obstacles, we introduce CoAD, a novel disease and symptom collaborative generation framework, which incorporates several key innovations to improve AD: 1) aligning sentence-level disease labels with multiple possible symptom inquiry steps to bridge the gap between training and generation; 2) expanding symptom labels for each sub-sequence of symptoms to enhance annotation and eliminate the effect of symptom order; 3) developing a repeated symptom input schema to effectively and efficiently learn the expanded disease and symptom labels. |
Huimin Wang; Wai Chung Kwan; Kam-Fai Wong; Yefeng Zheng; |
351 | Long-Tailed Question Answering in An Open World Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we define Open Long-Tailed QA (OLTQA) as learning from long-tailed distributed data and optimizing performance over seen and unseen QA tasks. |
Yi Dai; Hao Lang; Yinhe Zheng; Fei Huang; Yongbin Li; |
352 | Parallel Context Windows for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Parallel Context Windows (PCW), a method that alleviates the context window restriction for any off-the-shelf LLM without further training. |
Nir Ratner; Yoav Levine; Yonatan Belinkov; Ori Ram; Inbal Magar; Omri Abend; Ehud Karpas; Amnon Shashua; Kevin Leyton-Brown; Yoav Shoham; |
353 | Efficient Transformers with Dynamic Token Pooling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Nevertheless, natural units of meaning, such as words or phrases, display varying sizes. To address this mismatch, we equip language models with a dynamic-pooling mechanism, which predicts segment boundaries in an autoregressive fashion. |
Piotr Nawrot; Jan Chorowski; Adrian Lancucki; Edoardo Maria Ponti; |
354 | Did The Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Document-level relation extraction (DocRE) has attracted increasing research interest recently. While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied: Do they make the right predictions according to rationales? In this paper, we take the first step toward answering this question and then introduce a new perspective on comprehensively evaluating a model. |
Haotian Chen; Bingsheng Chen; Xiangdong Zhou; |
355 | ContraCLM: Contrastive Learning For Causal Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite exciting progress in causal language models, the expressiveness of their representations is largely limited due to poor discrimination ability. To remedy this issue, we present CONTRACLM, a novel contrastive learning framework at both the token-level and the sequence-level. |
Nihal Jain; Dejiao Zhang; Wasi Uddin Ahmad; Zijian Wang; Feng Nan; Xiaopeng Li; Ming Tan; Ramesh Nallapati; Baishakhi Ray; Parminder Bhatia; Xiaofei Ma; Bing Xiang; |
356 | Advancing Multi-Criteria Chinese Word Segmentation Through Criterion Classification and Denoising Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that through a simple yet elegant input-hint-based MCCWS model, we can achieve state-of-the-art (SoTA) performances on several datasets simultaneously. |
Tzu Hsuan Chou; Chun-Yi Lin; Hung-Yu Kao; |
357 | Infusing Hierarchical Guidance Into Prompt Tuning: A Parameter-Efficient Framework for Multi-level Implicit Discourse Relation Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, the comprehension of hierarchical semantics for MIDRR makes the conversion much harder. In this paper, we propose a prompt-based Parameter-Efficient Multi-level IDRR (PEMI) framework to solve the above problems. |
Haodong Zhao; Ruifang He; Mengnan Xiao; Jing Xu; |
358 | Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explain the model pathology from the view of sentence representation and argue that the counter-intuitive bias degree and direction of the out-of-distribution examples' representation cause the pathology. |
Pengwei Zhan; Jing Yang; Xiao Huang; Chunlei Jing; Jingying Li; Liming Wang; |
359 | Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work joins this interdisciplinary effort and makes a unique contribution by taking into account the event narrative structures when analyzing the social bias of stories. We propose a computational pipeline that automatically extracts a story's temporal narrative verb-based event chain for each of its characters as well as character attributes such as gender. |
Paulina Toro Isaza; Guangxuan Xu; Toye Oloko; Yufang Hou; Nanyun Peng; Dakuo Wang; |
360 | FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context using a self-training framework. |
Weihao Zeng; Keqing He; Yejie Wang; Chen Zeng; Jingang Wang; Yunsen Xian; Weiran Xu; |
361 | LAMBADA: Backward Chaining for Automated Reasoning in Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The classical automated reasoning literature has shown that reasoning in the backward direction (i.e., from the intended conclusion to supporting axioms) is significantly more efficient at proof-finding. Importing this intuition into the LM setting, we develop a Backward Chaining algorithm, called LAMBADA, that decomposes reasoning into four sub-modules that are simply implemented by few-shot prompted LLM inference. |
Mehran Kazemi; Najoung Kim; Deepti Bhatia; Xin Xu; Deepak Ramachandran; |
362 | PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we construct a new large-scale persona commonsense knowledge graph, PeaCoK, containing ~100K human-validated persona facts. |
Silin Gao; Beatriz Borges; Soyoung Oh; Deniz Bayazit; Saya Kanno; Hiromi Wakaki; Yuki Mitsufuji; Antoine Bosselut; |
363 | OpenSR: Open-Modality Speech Recognition Via Maintaining Multi-Modality Alignment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, training a model for a new domain often suffers from a lack of new-domain utterances, especially labeled visual utterances. To break through this restriction, we attempt to achieve zero-shot modality transfer by maintaining the multi-modality alignment in phoneme space learned with unlabeled multimedia utterances in the high-resource domain during pre-training, and propose a training system, Open-modality Speech Recognition (OpenSR), that enables models trained on a single modality (e.g., audio-only) to be applied to more modalities (e.g., visual-only and audio-visual). |
Xize Cheng; Tao Jin; Linjun Li; Wang Lin; Xinyu Duan; Zhou Zhao; |
364 | Retrieval-free Knowledge Injection Through Multi-Document Traversal for Dialogue Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nevertheless, retrieval-augmented approaches rely on finely annotated retrieval training data and knowledge-grounded response generation data, making them costly to transfer. To tackle this challenge, this paper proposes a retrieval-free approach, KiDG, which automatically turns knowledge documents into simulated multi-turn dialogues through a Multi-Document Traversal algorithm. |
Rui Wang; Jianzhu Bao; Fei Mi; Yi Chen; Hongru Wang; Yasheng Wang; Yitong Li; Lifeng Shang; Kam-Fai Wong; Ruifeng Xu; |
365 | BERM: Training The Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method to improve the generalization of dense retrieval via capturing matching signal called BERM. |
Shicheng Xu; Liang Pang; Huawei Shen; Xueqi Cheng; |
366 | Multiview Identifiers Enhanced Generative Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current approaches use either a numeric ID or a text piece (such as a title or substrings) as the identifier. However, these identifiers cannot cover a passage's content well. As such, we are motivated to propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage and could integrate contextualized information that text pieces lack. |
Yongqi Li; Nan Yang; Liang Wang; Furu Wei; Wenjie Li; |
367 | Prompting Language Models for Linguistic Structure Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although pretrained language models (PLMs) can be prompted to perform a wide range of language tasks, it remains an open question how much this ability comes from generalizable linguistic understanding versus surface-level lexical patterns. To test this, we present a structured prompting approach for linguistic structured prediction tasks, allowing us to perform zero- and few-shot sequence tagging with autoregressive PLMs. |
Terra Blevins; Hila Gonen; Luke Zettlemoyer; |
368 | Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. |
Agam Shah; Suvan Paturi; Sudheer Chava; |
369 | RE-Matching: A Fine-Grained Semantic Matching Method for Zero-Shot Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a fine-grained semantic matching method tailored for zero-shot relation extraction. |
Jun Zhao; WenYu Zhan; Xin Zhao; Qi Zhang; Tao Gui; Zhongyu Wei; Junzhe Wang; Minlong Peng; Mingming Sun; |
370 | SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, discussions on sensitive issues can become toxic even if the users are well-intentioned. For safer models in such scenarios, we present the Sensitive Questions and Acceptable Response (SQuARe) dataset, a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses. |
Hwaran Lee; Seokhee Hong; Joonsuk Park; Takyoung Kim; Meeyoung Cha; Yejin Choi; Byoungpil Kim; Gunhee Kim; Eun-Ju Lee; Yong Lim; Alice Oh; Sangchul Park; Jung-Woo Ha; |
371 | Towards Standardizing Korean Grammatical Error Correction: Datasets and Annotation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we collect three datasets from different sources (Kor-Lang8, Kor-Native, and Kor-Learner) that cover a wide range of Korean grammatical errors. |
Soyoung Yoon; Sungjoon Park; Gyuwan Kim; Junhee Cho; Kihyo Park; Gyu Tae Kim; Minjoon Seo; Alice Oh; |
372 | FLamE: Few-shot Learning from Natural Language Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To effectively learn from explanations, we present FLamE, a two-stage few-shot learning framework that first generates explanations using GPT-3, and then fine-tunes a smaller model (e.g., RoBERTa) with generated explanations. |
Yangqiaoyu Zhou; Yiming Zhang; Chenhao Tan; |
373 | Learning Symbolic Rules Over Abstract Meaning Representations for Textual Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Therefore, in this paper, we propose a modular, NEuro-Symbolic Textual Agent (NESTA) that combines a generic semantic parser with a rule induction system to learn abstract interpretable rules as policies. |
Subhajit Chaudhury; Sarathkrishna Swaminathan; Daiki Kimura; Prithviraj Sen; Keerthiram Murugesan; Rosario Uceda-Sosa; Michiaki Tatsubori; Achille Fokoue; Pavan Kapanipathi; Asim Munawar; Alexander Gray; |
374 | Counterfactual Debiasing for Fact Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike previous works, we propose a novel method from a counterfactual view, namely CLEVER, which is augmentation-free and mitigates biases on the inference stage. |
Weizhi Xu; Qiang Liu; Shu Wu; Liang Wang; |
375 | What Social Attitudes About Gender Does BERT Encode? Leveraging Insights from Psycholinguistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Much research has sought to evaluate the degree to which large language models reflect social biases. We complement such work with an approach to elucidating the connections between language model predictions and people's social attitudes. |
Julia Watson; Barend Beekhuizen; Suzanne Stevenson; |
376 | Rethinking Multimodal Entity and Relation Extraction from A Translation Point of View Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We revisit the multimodal entity and relation extraction from a translation point of view. |
Changmeng Zheng; Junhao Feng; Yi Cai; Xiaoyong Wei; Qing Li; |
377 | Annotating and Detecting Fine-grained Factual Errors for Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first dataset with fine-grained factual error annotations named DIASUMFACT. |
Rongxin Zhu; Jianzhong Qi; Jey Han Lau; |
378 | Improving The Robustness of Summarization Systems with Dual Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To create semantic-consistent substitutes, we propose a SummAttacker, which is an efficient approach to generating adversarial samples based on pre-trained language models. |
Xiuying Chen; Guodong Long; Chongyang Tao; Mingzhe Li; Xin Gao; Chengqi Zhang; Xiangliang Zhang; |
379 | Interpretable Math Word Problem Solution Generation Via Step-by-step Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps. |
Mengxue Zhang; Zichao Wang; Zhichao Yang; Weiqi Feng; Andrew Lan; |
380 | TemplateGEC: Improving Grammatical Error Correction with Detection Template Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Grammatical error correction (GEC) can be divided into sequence-to-edit (Seq2Edit) and sequence-to-sequence (Seq2Seq) frameworks, both of which have their pros and cons. To utilize the strengths and make up for the shortcomings of these frameworks, this paper proposes a novel method, TemplateGEC, which capitalizes on the capabilities of both Seq2Edit and Seq2Seq frameworks in error detection and correction respectively. |
Yinghao Li; Xuebo Liu; Shuo Wang; Peiyuan Gong; Derek F. Wong; Yang Gao; Heyan Huang; Min Zhang; |
381 | Deep Model Compression Also Helps Models Capture Ambiguity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this problem, we must consider how to exactly capture the degree of relationship between each sample and its candidate classes. In this work, we propose a novel method with deep model compression and show how such a relationship can be accounted for. |
Hancheol Park; Jong Park; |
382 | Are Experts Needed? On Human Evaluation of Counselling Reflection Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Laypeople-based evaluation is less expensive and easier to scale, but its quality is unknown for reflections. Therefore, we explore whether laypeople can be an alternative to experts in evaluating a fundamental quality aspect: coherence and context-consistency. |
Zixiu Wu; Simone Balloccu; Ehud Reiter; Rim Helaoui; Diego Reforgiato Recupero; Daniele Riboni; |
383 | PairSpanBERT: An Enhanced Language Model for Bridging Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PairSpanBERT, a SpanBERT-based pre-trained model specialized for bridging resolution. |
Hideo Kobayashi; Yufang Hou; Vincent Ng; |
384 | Compounding Geometric Operations for Knowledge Graph Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the synergy, we propose a new KGE model by leveraging all three operations in this work. |
Xiou Ge; Yun Cheng Wang; Bin Wang; C.-C. Jay Kuo; |
385 | Few-shot In-context Learning on Knowledge Base Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To handle questions over diverse KBQA datasets with a unified training-free framework, we propose KB-BINDER, which for the first time enables few-shot in-context learning over KBQA tasks. |
Tianle Li; Xueguang Ma; Alex Zhuang; Yu Gu; Yu Su; Wenhu Chen; |
386 | Fact-Checking Complex Claims with Program-Guided Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Program-Guided Fact-Checking (ProgramFC), a novel fact-checking model that decomposes complex claims into simpler sub-tasks that can be solved using a shared library of specialized functions. |
Liangming Pan; Xiaobao Wu; Xinyuan Lu; Anh Tuan Luu; William Yang Wang; Min-Yen Kan; Preslav Nakov; |
387 | Patton: Language Model Pretraining on Text-Rich Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose our PretrAining on TexT-Rich NetwOrk framework Patton. |
Bowen Jin; Wentao Zhang; Yu Zhang; Yu Meng; Xinyang Zhang; Qi Zhu; Jiawei Han; |
388 | Soft Language Clustering for Multilingual Model Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose XLM-P, a method that contextually retrieves prompts as flexible guidance for encoding instances conditionally. |
Jiali Zeng; Yufan Jiang; Yongjing Yin; Yi Jing; Fandong Meng; Binghuai Lin; Yunbo Cao; Jie Zhou; |
389 | Curriculum Learning for Graph Neural Networks: A Multiview Competence-based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a new perspective on curriculum learning by introducing a novel approach that builds on graph complexity formalisms (as difficulty criteria) and model competence during training. |
Nidhi Vakil; Hadi Amiri; |
390 | When and How to Paraphrase for Named Entity Recognition? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we utilize simple strategies to annotate entity spans in generations and compare established and novel methods of paraphrasing in NLP such as back translation, specialized encoder-decoder models such as Pegasus, and GPT-3 variants for their effectiveness in improving downstream performance for NER across different levels of gold annotations and paraphrasing strength on 5 datasets. |
Saket Sharma; Aviral Joshi; Yiyun Zhao; Namrata Mukhija; Hanoz Bhathena; Prateek Singh; Sashank Santhanam; |
391 | UniEvent: Unified Generative Model with Multi-Dimensional Prefix for Zero-Shot Event-Relational Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance knowledge transfer and enable zero-shot generalization among various combinations, in this work we propose a novel unified framework, called UNIEVENT. |
Zhengwei Tao; Zhi Jin; Haiyan Zhao; Chengfeng Dou; Yongqiang Zhao; Tao Shen; Chongyang Tao; |
392 | Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-text Rationales Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that, by estimating a rationale's helpfulness in answering similar unseen instances, we can measure its human utility to a better extent. We also translate this finding into an automated score, Gen-U, that we propose, which can help improve LMs' ability to generate rationales with better human utility, while maintaining most of its task performance. |
Brihi Joshi; Ziyi Liu; Sahana Ramnath; Aaron Chan; Zhewei Tong; Shaoliang Nie; Qifan Wang; Yejin Choi; Xiang Ren; |
393 | Automatic Annotation of Direct Speech in Written French Narratives Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our goal is to create a unified framework to design and evaluate AADS models in French. |
Noé Durandard; Viet Anh Tran; Gaspard Michel; Elena Epure; |
394 | Automatic Creation of Named Entity Recognition Datasets By Querying Phrase Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present a novel framework, HighGEN, that generates NER datasets with high-coverage pseudo-dictionaries. |
Hyunjae Kim; Jaehyo Yoo; Seunghyun Yoon; Jaewoo Kang; |
395 | Dynamic Transformers Provide A False Sense of Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose SAME, a simple yet effective slowdown attack framework specially tailored to reduce the efficiency of multi-exit models. |
Yiming Chen; Simin Chen; Zexin Li; Wei Yang; Cong Liu; Robby Tan; Haizhou Li; |
396 | Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose M2C, a morphologically-aware framework for behavioral testing of NLP models. |
Ester Hlavnova; Sebastian Ruder; |
397 | Local Byte Fusion for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a Local Byte Fusion (LOBEF) method for byte-based machine translation, utilizing byte n-gram and word boundaries to aggregate local semantic information. |
Makesh Narsimhan Sreedhar; Xiangpeng Wan; Yu Cheng; Junjie Hu; |
398 | Where's The Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we thus introduce a multilingual punctuation-agnostic sentence segmentation method, currently covering 85 languages, trained in a self-supervised fashion on unsegmented text, by making use of newline characters which implicitly perform segmentation into paragraphs. |
Benjamin Minixhofer; Jonas Pfeiffer; Ivan Vulic; |
399 | Multi-target Backdoor Attacks for Code Pre-trained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. |
Yanzhou Li; Shangqing Liu; Kangjie Chen; Xiaofei Xie; Tianwei Zhang; Yang Liu; |
400 | Learning Better Masking for Better Language Model Pre-training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the model may be affected in complicated ways by the pre-training status, which changes as training progresses. In this paper, we show that such time-invariant MLM settings on masking ratio and masked content are unlikely to deliver an optimal outcome, which motivates us to explore the influence of time-variant MLM settings. |
Dongjie Yang; Zhuosheng Zhang; Hai Zhao; |
401 | VisText: A Benchmark for Semantically Rich Chart Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce VisText: a dataset of 12,441 pairs of charts and captions that describe the charts' construction, report key statistics, and identify perceptual and cognitive phenomena. |
Benny Tang; Angie Boggust; Arvind Satyanarayan; |
402 | Byte-Level Grammatical Error Correction Using Synthetic and Curated Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Grammatical error correction (GEC) is the task of correcting typos, spelling, punctuation and grammatical issues in text. Approaching the problem as a sequence-to-sequence task, we compare the use of a common subword unit vocabulary and byte-level encoding. Initial synthetic training data is created using an error-generating pipeline, and used for finetuning two subword-level models and one byte-level model. |
Svanhvít Lilja Ingólfsdóttir; Petur Ragnarsson; Haukur Jónsson; Haukur Simonarson; Vilhjalmur Thorsteinsson; Vésteinn Snæbjarnarson; |
403 | Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the complementary characteristic of both methods and propose a multi-level knowledge distillation approach that integrates their strengths while mitigating their limitations. |
Qianhui Wu; Huiqiang Jiang; Haonan Yin; Börje Karlsson; Chin-Yew Lin; |
404 | Peeking Inside The Black Box: A Commonsense-aware Generative Framework for Explainable Complaint Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address the task of explainable complaint detection and propose a commonsense-aware unified generative framework by reframing the multitask problem as a text-to-text generation task. |
Apoorva Singh; Raghav Jain; Prince Jha; Sriparna Saha; |
405 | MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the MMDialog dataset to facilitate multi-modal conversation better. |
Jiazhan Feng; Qingfeng Sun; Can Xu; Pu Zhao; Yaming Yang; Chongyang Tao; Dongyan Zhao; Qingwei Lin; |
406 | ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. |
Jonas Belouadi; Steffen Eger; |
407 | Envisioning Future from The Past: Hierarchical Duality Learning for Multi-Turn Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we define a widely neglected property in dialogue text, duality, which is a hierarchical property that is reflected in human behaviours in daily conversations: Based on the logic in a conversation (or a sentence), people can infer follow-up utterances (or tokens) based on the previous text, and vice versa. |
Ang Lv; Jinpeng Li; Shufang Xie; Rui Yan; |
408 | DualGATs: Dual Graph Attention Networks for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Dual Graph ATtention networks (DualGATs) to concurrently consider the complementary aspects of discourse structure and speaker-aware context, aiming for more precise ERC. |
Duzhen Zhang; Feilong Chen; Xiuyi Chen; |
409 | Consistent Prototype Learning for Few-Shot Continual Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a new N-way-K-shot Continual Relation Extraction (NK-CRE) task and propose a novel few-shot continual relation extraction method with Consistent Prototype Learning (ConPL) to address the aforementioned issues. |
Xiudi Chen; Hui Wu; Xiaodong Shi; |
410 | Matching Pairs: Attributing Fine-Tuned Models to Their Pre-Trained Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we need a method to investigate how a model was trained or a piece of text was generated and what their pre-trained base model was. In this paper we take the first step to address this open problem by tracing back the origin of a given fine-tuned LLM to its corresponding pre-trained base model. |
Myles Foley; Ambrish Rawat; Taesung Lee; Yufang Hou; Gabriele Picco; Giulio Zizzo; |
411 | Large Language Models Meet NL2Code: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate further research and applications in this field, in this paper, we present a comprehensive survey of 27 existing large language models for NL2Code, and also review benchmarks and metrics. |
Daoguang Zan; Bei Chen; Fengji Zhang; Dianjie Lu; Bingchao Wu; Bei Guan; Wang Yongji; Jian-Guang Lou; |
412 | When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conduct a case study in Financial NLP where multiple datasets exist for skills relevant to the domain, such as numeric reasoning and sentiment analysis. |
Jingwei Ni; Zhijing Jin; Qian Wang; Mrinmaya Sachan; Markus Leippold; |
413 | Enhancing Grammatical Error Correction Systems with Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence words and grammatical error types. We propose several baselines and analyses to understand this task. |
Yuejiao Fei; Leyang Cui; Sen Yang; Wai Lam; Zhenzhong Lan; Shuming Shi; |
414 | Linguistic Representations for Fewer-shot Relation Extraction Across Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on the task of relation extraction on three datasets of procedural text in two domains, cooking and materials science. |
Sireesh Gururaja; Ritam Dutt; Tinglong Liao; Carolyn Rosé; |
415 | DarkBERT: A Language Model for The Dark Side of The Internet Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce DarkBERT, a language model pretrained on Dark Web data. |
Youngjin Jin; Eugene Jang; Jian Cui; Jin-Woo Chung; Yongjae Lee; Seungwon Shin; |
416 | MDACE: MIMIC Documents Annotated with Code Evidence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a dataset for evidence/rationale extraction on an extreme multi-label classification task over long medical documents. |
Hua Cheng; Rana Jafari; April Russell; Russell Klopfer; Edmond Lu; Benjamin Striner; Matthew Gormley; |
417 | Towards Zero-Shot Multilingual Transfer for Code-Switched Responses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a new adapter-based framework that allows for efficient transfer by learning task-specific representations and encapsulating source and target language representations. |
Ting-Wei Wu; Changsheng Zhao; Ernie Chang; Yangyang Shi; Pierce Chuang; Vikas Chandra; Biing Juang; |
418 | One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To achieve even greater storage reduction, we propose ProPETL, a novel method that enables efficient sharing of a single prototype PETL network (e.g., adapter, LoRA, and prefix-tuning) across layers and tasks. |
Guangtao Zeng; Peiyuan Zhang; Wei Lu; |
419 | Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to preliminarily test *whether NLG can generate humor as humans do*. |
Jianquan Li; XiangBo Wu; Xiaokang Liu; Qianqian Xie; Prayag Tiwari; Benyou Wang; |
420 | Convergence and Diversity in The Control Hierarchy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We adapt Weir's definition of a controllable CFG (called a labeled distinguished CFG) to give a definition of controllable pushdown automata (PDAs), called labeled distinguished PDAs. |
Alexandra Butoi; Ryan Cotterell; David Chiang; |
421 | ConFEDE: Contrastive Feature Decomposition for Multimodal Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose ConFEDE, a unified learning framework that jointly performs contrastive representation learning and contrastive feature decomposition to enhance the representation of multimodal information. |
Jiuding Yang; Yakun Yu; Di Niu; Weidong Guo; Yu Xu; |
422 | Using Domain Knowledge to Guide Dialog Structure Induction Via Neural Probabilistic Soft Logic Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Neural Probabilistic Soft Logic Dialogue Structure Induction (NEUPSL DSI), a principled approach that injects symbolic knowledge into the latent space of a generative neural model. |
Connor Pryor; Quan Yuan; Jeremiah Liu; Mehran Kazemi; Deepak Ramachandran; Tania Bedrax-Weiss; Lise Getoor; |
423 | Are You Copying My Model? Protecting The Copyright of Large Language Models for EaaS Via Backdoor Watermark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To protect the copyright of LLMs for EaaS, we propose an Embedding Watermark method called {pasted macro 'METHOD'} that implants backdoors on embeddings. |
Wenjun Peng; Jingwei Yi; Fangzhao Wu; Shangxi Wu; Bin Bin Zhu; Lingjuan Lyu; Binxing Jiao; Tong Xu; Guangzhong Sun; Xing Xie; |
424 | Answering Ambiguous Questions Via Iterative Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions. |
Weiwei Sun; Hengyi Cai; Hongshen Chen; Pengjie Ren; Zhumin Chen; Maarten de Rijke; Zhaochun Ren; |
425 | A Dataset of Argumentative Dialogues on Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ArgSciChat, a dataset of 41 argumentative dialogues between scientists on 20 NLP papers. |
Federico Ruggeri; Mohsen Mesgar; Iryna Gurevych; |
426 | Massively Multilingual Lexical Specialization of Multilingual Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Concretely, we use BabelNet's multilingual synsets to create synonym pairs (or synonym-gloss pairs) across 50 languages and then subject the MMTs (mBERT and XLM-R) to a lexical specialization procedure guided by a contrastive objective. We show that such massively multilingual lexical specialization brings substantial gains in two standard cross-lingual lexical tasks, bilingual lexicon induction and cross-lingual word similarity, as well as in cross-lingual sentence retrieval. |
Tommaso Green; Simone Paolo Ponzetto; Goran Glavaš; |
427 | RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, in the era of large general-purpose language agents, fine-tuning is neither computationally nor spatially efficient as it results in multiple copies of the network. In this work, we introduce RL4F (Reinforcement Learning for Feedback), a multi-agent collaborative framework where the critique generator is trained to maximize end-task performance of GPT-3, a fixed model more than 200 times its size. |
Afra Feyza Akyurek; Ekin Akyurek; Ashwin Kalyan; Peter Clark; Derry Tanti Wijaya; Niket Tandon; |
428 | WebIE: Faithful and Robust Information Extraction on The Web Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present WebIE, the first large-scale, entity-linked closed IE dataset consisting of 1. |
Chenxi Whitehouse; Clara Vania; Alham Fikri Aji; Christos Christodoulopoulos; Andrea Pierleoni; |
429 | NormBank: A Knowledge Bank of Situational Social Norms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present NormBank, a knowledge bank of 155k situational norms. |
Caleb Ziems; Jane Dwivedi-Yu; Yi-Chia Wang; Alon Halevy; Diyi Yang; |
430 | DIP: Dead Code Insertion Based Black-box Attack for Programming Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose DIP (Dead code Insertion based Black-box Attack for Programming Language Model), a high-performance and effective black-box attack method to generate adversarial examples using dead code insertion. |
CheolWon Na; YunSeok Choi; Jee-Hyong Lee; |
431 | Modeling Structural Similarities Between Documents for Coherence Assessment with Graph Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate a GCN-based coherence model that is capable of capturing structural similarities between documents. |
Wei Liu; Xiyan Fu; Michael Strube; |
432 | HiTIN: Hierarchy-aware Tree Isomorphism Network for Hierarchical Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Hierarchy-aware Tree Isomorphism Network (HiTIN) to enhance the text representations with only syntactic information of the label hierarchy. |
He Zhu; Chong Zhang; Junjie Huang; Junran Wu; Ke Xu; |
433 | Contextual Knowledge Learning for Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach to context and knowledge weighting as an integral part of model training. |
Wen Zheng; Natasa Milic-Frayling; Ke Zhou; |
434 | Easy Guided Decoding in Providing Suggestions for Interactive Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we utilize the parameterized objective function of neural machine translation (NMT) and propose a novel constrained decoding algorithm, namely Prefix-Suffix Guided Decoding (PSGD), to deal with the TS problem without additional training. |
Ke Wang; Xin Ge; Jiayi Wang; Yuqi Zhang; Yu Zhao; |
435 | Discourse-Centric Evaluation of Document-level Machine Translation with A New Densely Annotated Parallel Corpus of Novels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using these annotations, we systematically investigate the similarities and differences between the discourse structures of source and target languages, and the challenges they pose to MT. We discover that MT outputs differ fundamentally from human translations in terms of their latent discourse structures. This gives us a new perspective on the challenges and opportunities in document-level MT. We make our resource publicly available to spur future research in document-level MT and its generalization to other language translation tasks. |
Yuchen Eleanor Jiang; Tianyu Liu; Shuming Ma; Dongdong Zhang; Mrinmaya Sachan; Ryan Cotterell; |
436 | CMOT: Cross-modal Mixup Via Optimal Transport for Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Cross-modal Mixup via Optimal Transport (CMOT) to overcome the modality gap. |
Yan Zhou; Qingkai Fang; Yang Feng; |
437 | On The Evaluation of Neural Selective Prediction Methods for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a survey and empirical comparison of the state-of-the-art in neural selective classification for NLP tasks. |
Zhengyao Gu; Mark Hopkins; |
438 | Speech-Text Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Speech-text Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model. |
Tianshu Yu; Haoyu Gao; Ting-En Lin; Min Yang; Yuchuan Wu; Wentao Ma; Chao Wang; Fei Huang; Yongbin Li; |
439 | Text Style Transfer with Contrastive Transfer Pattern Mining Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing studies mainly focus on the transformation between styles, yet ignore that this transformation can be actually carried out via different hidden transfer patterns. To address this problem, we propose a novel approach, contrastive transfer pattern mining (CTPM), which automatically mines and utilizes inherent latent transfer patterns to improve the performance of TST. |
Jingxuan Han; Quan Wang; Licheng Zhang; Weidong Chen; Yan Song; Zhendong Mao; |
440 | Zero- and Few-Shot Event Detection Via Prompt-Based Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our framework, we propose to use the cloze-based prompt and a trigger-aware soft verbalizer to efficiently project output to unseen event types. |
Zhenrui Yue; Huimin Zeng; Mengfei Lan; Heng Ji; Dong Wang; |
441 | Text Style Transfer Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For natural inputs, BT brings only slight improvements and sometimes even adverse effects. To address this issue, we propose Text Style Transfer Back Translation (TST BT), which uses a style transfer to modify the source side of BT data. |
Daimeng Wei; Zhanglin Wu; Hengchao Shang; Zongyao Li; Minghan Wang; Jiaxin Guo; Xiaoyu Chen; Zhengzhe Yu; Hao Yang; |
442 | Generating Visual Spatial Description Via Holistic 3D Scene Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the incorporation of 3D scene features for VSD. |
Yu Zhao; Hao Fei; Wei Ji; Jianguo Wei; Meishan Zhang; Min Zhang; Tat-Seng Chua; |
443 | Continual Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a method called continual knowledge distillation to take advantage of existing translation models to improve one model of interest. |
Yuanchi Zhang; Peng Li; Maosong Sun; Yang Liu; |
444 | Query Refinement Prompts for Closed-Book Long-Form QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We resolve the difficulties of evaluating long-form output by doing both tasks at once: question answering that requires long-form answers. Such questions tend to be multifaceted, i.e., they may have ambiguities and/or require information from multiple sources. |
Reinald Kim Amplayo; Kellie Webster; Michael Collins; Dipanjan Das; Shashi Narayan; |
445 | CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Compared with short videos, long videos are also highly demanded but less explored, which brings new challenges in higher inference computation cost and weaker multi-modal alignment. To address these challenges, we propose CONE, an efficient COarse-to-fiNE alignment framework. |
Zhijian Hou; Wanjun Zhong; Lei Ji; Difei Gao; Kun Yan; W.K. Chan; Chong-Wah Ngo; Mike Zheng Shou; Nan Duan; |
446 | Few-Shot Document-Level Event Argument Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study to capture event arguments that actually spread across sentences in documents. |
Xianjun Yang; Yujie Lu; Linda Petzold; |
447 | ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset By AMR Back-Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present ParaAMR, a large-scale syntactically diverse paraphrase dataset created by abstract meaning representation back-translation. |
Kuan-Hao Huang; Varun Iyer; I-Hung Hsu; Anoop Kumar; Kai-Wei Chang; Aram Galstyan; |
448 | Towards Understanding and Improving Knowledge Distillation for Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Firstly, the current objective of KD spreads its focus to whole distributions to learn the knowledge, yet lacks special treatment on the most crucial top-1 information. Secondly, the knowledge is largely covered by the golden information due to the fact that most top-1 predictions of teachers overlap with ground-truth tokens, which further restricts the potential of KD. To address these issues, we propose a new method named Top-1 Information Enhanced Knowledge Distillation (TIE-KD). |
Songming Zhang; Yunlong Liang; Shuaibo Wang; Yufeng Chen; Wenjuan Han; Jian Liu; Jinan Xu; |
449 | Multi-Row, Multi-Span Distant Supervision For Table+Text Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This leads to a noisy multi-instance training regime. We present MITQA, a transformer-based TextTableQA system that is explicitly designed to cope with distant supervision along both these axes, through a multi-instance loss objective, together with careful curriculum design. |
Vishwajeet Kumar; Yash Gupta; Saneem Chemmengath; Jaydeep Sen; Soumen Chakrabarti; Samarth Bharadwaj; Feifei Pan; |
450 | HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing research seldom simultaneously models the graphical and sequential structure of HKGs, limiting HKGs' representation. To overcome this limitation, we propose a novel Hierarchical Attention model for HKG Embedding (HAHE), including global-level and local-level attention. |
Haoran Luo; Haihong E; Yuhao Yang; Yikai Guo; Mingzhi Sun; Tianyu Yao; Zichen Tang; Kaiyang Wan; Meina Song; Wei Lin; |
451 | ORGAN: Observation-Guided Radiology Report Generation Via Tree Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an Observation-guided radiology Report Generation framework (ORGan). |
Wenjun Hou; Kaishuai Xu; Yi Cheng; Wenjie Li; Jiang Liu; |
452 | Data Curation Alone Can Stabilize In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that carefully curating a subset of training data greatly stabilizes ICL performance without any other changes to the ICL algorithm (e.g., prompt retrieval or calibration). |
Ting-Yun Chang; Robin Jia; |
453 | MidMed: Towards Mixed-Type Dialogues for Medical Consultation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, in many real situations, due to the lack of medical knowledge, it is usually difficult for patients to determine clear goals with all necessary slots. In this paper, we identify this challenge as how to construct medical consultation dialogue systems to help patients clarify their goals. |
Xiaoming Shi; Zeming Liu; Chuan Wang; Haitao Leng; Kui Xue; Xiaofan Zhang; Shaoting Zhang; |
454 | FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by fusion-in-decoder (FiD) models which efficiently aggregate more passages and thus outperform concatenation-based models in open-domain QA, we hypothesize that similar techniques can be applied to improve the efficiency and end-task performance of ICL. To verify this, we present a comprehensive study on applying three fusion methods, concatenation-based (early fusion), FiD (intermediate), and ensemble-based (late), to ICL. |
Qinyuan Ye; Iz Beltagy; Matthew Peters; Xiang Ren; Hannaneh Hajishirzi; |
455 | S2ynRE: Two-stage Self-training with Synthetic Data for Low-resource Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose S2ynRE, a framework of two-stage Self-training with Synthetic data for Relation Extraction. |
Benfeng Xu; Quan Wang; Yajuan Lyu; Dai Dai; Yongdong Zhang; Zhendong Mao; |
456 | DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, two pain points persist for this paradigm: (a) as the pre-trained models grow bigger (e.g., 175B parameters for GPT-3), even the fine-tuning process can be time-consuming and computationally expensive; (b) the fine-tuned model has the same size as its starting point by default, which is neither sensible due to its more specialized functionality, nor practical since many fine-tuned models will be deployed in resource-constrained environments. To address these pain points, we propose a framework for resource- and parameter-efficient fine-tuning by leveraging the sparsity prior in both weight updates and the final model weights. |
Xuxi Chen; Tianlong Chen; Weizhu Chen; Ahmed Hassan Awadallah; Zhangyang Wang; Yu Cheng; |
457 | CASE: Aligning Coarse-to-Fine Cognition and Affection for Empathetic Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose the CASE model for empathetic dialogue generation. |
Jinfeng Zhou; Chujie Zheng; Bo Wang; Zheng Zhang; Minlie Huang; |
458 | Comparative Evaluation of Boundary-relaxed Annotation for Entity Linking Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For those cases, a lenient annotation guideline could relieve the annotators' workload and speed up the process. This paper presents a case study designed to verify the feasibility of such an annotation process and evaluate the impact of boundary-relaxed annotation in an Entity Linking pipeline. |
Gabriel Herman Bernardim Andrade; Shuntaro Yada; Eiji Aramaki; |
459 | Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the generalization of over 20 different models trained on CoNLL-2003, and show that NER models have very different generalization. |
Shuheng Liu; Alan Ritter; |
460 | READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, little study has been done to construct such benchmarks for Chinese, where various language-specific input noises happen in the real world. In order to fill this important gap, we construct READIN: a Chinese multi-task benchmark with REalistic And Diverse Input Noises. |
Chenglei Si; Zhengyan Zhang; Yingfa Chen; Xiaozhi Wang; Zhiyuan Liu; Maosong Sun; |
461 | MAD-TSC: A Multilingual Aligned News Dataset for Target-dependent Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce MAD-TSC, a new dataset which differs substantially from existing resources. |
Evan Dufraisse; Adrian Popescu; Julien Tourille; Armelle Brun; Jerome Deshayes; |
462 | A New Dataset and Empirical Study for Sentence Simplification in Chinese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The development of Chinese sentence simplification is relatively slow due to the lack of data. To alleviate this limitation, this paper introduces CSS, a new dataset for assessing sentence simplification in Chinese. |
Shiping Yang; Renliang Sun; Xiaojun Wan; |
463 | Factual or Contextual? Disentangling Error Types in Entity Description Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop an evaluation paradigm that enables us to disentangle these two types of errors in naturally occurring textual contexts. |
Navita Goyal; Ani Nenkova; Hal Daumé III; |
464 | Weakly Supervised Vision-and-Language Pre-training with Relative Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This affects the data quality and thus the effectiveness of pre-training. In this paper, we propose to directly take a small number of aligned image-text pairs as anchors, and represent each unaligned image and text by its similarities to these anchors, i.e., relative representations. |
Chi Chen; Peng Li; Maosong Sun; Yang Liu; |
465 | HermEs: Interactive Spreadsheet Formula Prediction Via Hierarchical Formulet Expansion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose HermEs, the first approach for spreadsheet formula prediction via HiEraRchical forMulet ExpanSion, where hierarchical expansion means generating formulas following the underlying parse tree structure, and Formulet refers to commonly-used multi-level patterns mined from real formula parse trees. |
Wanrong He; Haoyu Dong; Yihuai Gao; Zhichao Fan; Xingzhuo Guo; Zhitao Hou; Xiao Lv; Ran Jia; Shi Han; Dongmei Zhang; |
466 | ArgU: A Controllable Factual Argument Generator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce ArgU: a neural argument generator capable of producing factual arguments from input facts and real-world concepts that can be explicitly controlled for stance and argument structure using Walton�s argument scheme-based control codes. |
Sougata Saha; Rohini Srihari; |
467 | Learning Answer Generation Using Supervision from Automatic Question Answering Evaluators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel training paradigm for GenQA using supervision from automatic QA evaluation models (GAVA). |
Matteo Gabburo; Siddhant Garg; Rik Koncel-Kedziorski; Alessandro Moschitti; |
468 | RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new retrieval-enhanced approach for personalized response generation. |
Shuai Liu; Hyundong Cho; Marjorie Freedman; Xuezhe Ma; Jonathan May; |
469 | Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing Via Autoregressive Span Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a simple and unified approach for both continuous and discontinuous constituency parsing via autoregressive span selection. |
Songlin Yang; Kewei Tu; |
470 | Laziness Is A Virtue When It Comes to Compositionality in Neural Semantic Parsing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we approach semantic parsing from, quite literally, the opposite direction; that is, we introduce a neural semantic parsing generation method that constructs logical forms from the bottom up, beginning from the logical form's leaves. |
Maxwell Crouse; Pavan Kapanipathi; Subhajit Chaudhury; Tahira Naseem; Ramon Fernandez Astudillo; Achille Fokoue; Tim Klinger; |
471 | AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel attribution-driven knowledge distillation approach, which explores the token-level rationale behind the teacher model based on Integrated Gradients (IG) and transfers attribution knowledge to the student model. |
Siyue Wu; Hongzhan Chen; Xiaojun Quan; Qifan Wang; Rui Wang; |
472 | (QA)2: Question Answering with Questionable Assumptions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose (QA)2 (Question Answering with Questionable Assumptions), an open-domain evaluation dataset consisting of naturally occurring search engine queries that may or may not contain questionable assumptions. |
Najoung Kim; Phu Mon Htut; Samuel R. Bowman; Jackson Petty; |
473 | Attributable and Scalable Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a method for unsupervised opinion summarization that encodes sentences from customer reviews into a hierarchical discrete latent space, then identifies common opinions based on the frequency of their encodings. |
Tom Hosking; Hao Tang; Mirella Lapata; |
474 | Targeted Data Generation: Finding and Fixing Model Weaknesses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Targeted Data Generation (TDG), a framework that automatically identifies challenging subgroups, and generates new data for those subgroups using large language models (LLMs) with a human in the loop. |
Zexue He; Marco Tulio Ribeiro; Fereshte Khani; |
475 | HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this paradigm poses issues of inefficient updating and resource over-consumption when fine-tuning in data-scarce and resource-limited scenarios, because of the large scale of parameters in PLMs. To alleviate these concerns, in this paper, we propose a parameter-efficient fine-tuning method HiFi, that is, only the highly informative and strongly correlated attention heads for the specific task are fine-tuned. |
Anchun Gui; Han Xiao; |
476 | CFSum Coarse-to-Fine Contribution Network for Multimodal Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing multimodal summarization approaches focus on designing the fusion methods of different modalities, while ignoring the adaptive conditions under which visual modalities are useful. Therefore, we propose a novel Coarse-to-Fine contribution network for multimodal Summarization (CFSum) to consider different contributions of images for summarization. |
Min Xiao; Junnan Zhu; Haitao Lin; Yu Zhou; Chengqing Zong; |
477 | On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a case in point by revisiting the success of BERT over its baselines, ELMo and GPT-1, and demonstrate how, under comparable conditions where the baselines are tuned to a similar extent, these baselines (and even-simpler variants thereof) can, in fact, achieve competitive or better performance than BERT. |
Made Nindyatama Nityasya; Haryo Wibowo; Alham Fikri Aji; Genta Winata; Radityo Eko Prasojo; Phil Blunsom; Adhiguna Kuncoro; |
478 | End-to-end Knowledge Retrieval with Multi-modal Queries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a retriever model “ReViz” that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion without being dependent on intermediate modules such as object detectors or caption generators. |
Man Luo; Zhiyuan Fang; Tejas Gokhale; Yezhou Yang; Chitta Baral; |
479 | AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present AV-TranSpeech, the first audio-visual speech-to-speech (AV-S2ST) translation model without relying on intermediate text. |
Rongjie Huang; Huadai Liu; Xize Cheng; Yi Ren; Linjun Li; Zhenhui Ye; Jinzheng He; Lichao Zhang; Jinglin Liu; Xiang Yin; Zhou Zhao; |
480 | Dual Class Knowledge Propagation Network for Multi-label Few-shot Intent Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies are confused by the identical representation of the utterance with multiple labels and overlook the intrinsic intra-class and inter-class interactions. To address these two limitations, we propose a novel dual class knowledge propagation network in this paper. |
Feng Zhang; Wei Chen; Fei Ding; Tengjiao Wang; |
481 | VendorLink: An NLP Approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To identify relationships between illegal markets and their vendors, we propose VendorLink, an NLP-based approach that examines writing patterns to verify, identify, and link unique vendor accounts across text advertisements (ads) on seven public Darknet markets. |
Vageesh Saxena; Nils Rethmeier; Gijs van Dijck; Gerasimos Spanakis; |
482 | Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the reference summaries of those datasets turn out to be noisy, mainly in terms of factual hallucination and information redundancy. To address this challenge, we first annotate new expert-writing Element-aware test sets following the “Lasswell Communication Model” proposed by Lasswell, allowing reference summaries to focus on more fine-grained news elements objectively and comprehensively. |
Yiming Wang; Zhuosheng Zhang; Rui Wang; |
483 | Efficient Shapley Values Estimation By Amortization for Text Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the trade-off between stability and efficiency, we develop an amortized model that directly predicts each input feature�s Shapley Value without additional model evaluations. |
Chenghao Yang; Fan Yin; He He; Kai-Wei Chang; Xiaofei Ma; Bing Xiang; |
484 | PeerDA: Data Augmentation Via Modeling Peer Relation for Span Identification Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Different from previous works that merely leverage the Subordinate (SUB) relation (i.e., if a span is an instance of a certain category) to train models, this paper for the first time explores the Peer (PR) relation, which indicates that two spans are instances of the same category and share similar features. |
Weiwen Xu; Xin Li; Yang Deng; Wai Lam; Lidong Bing; |
485 | Dynamic Regularization in UDA for Transformers in Multimodal Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work focuses on two key challenges in multimodal machine learning. The first is finding efficient ways to combine information from different data types. The second is that often, one modality (e.g., text) is stronger and more relevant, making it difficult to identify meaningful patterns in the weaker modality (e.g., image). |
Ivonne Monter-Aldana; Adrian Pastor Lopez Monroy; Fernando Sanchez-Vega; |
486 | Conflicts, Villains, Resolutions: Towards Models of Narrative Media Framing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite increasing interest in the automatic detection of media frames in NLP, the problem is typically simplified as single-label classification and adopts a topic-like view on frames, evading modelling the broader document-level narrative. In this work, we revisit a widely used conceptualization of framing from the communication sciences which explicitly captures elements of narratives, including conflict and its resolution, and integrate it with the narrative framing of key entities in the story as heroes, victims or villains. |
Lea Frermann; Jiatong Li; Shima Khanehzar; Gosia Mikolajczak; |
487 | BgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present bgGLUE (Bulgarian General Language Understanding Evaluation), a benchmark for evaluating language models on Natural Language Understanding (NLU) tasks in Bulgarian. |
Momchil Hardalov; Pepa Atanasova; Todor Mihaylov; Galia Angelova; Kiril Simov; Petya Osenova; Veselin Stoyanov; Ivan Koychev; Preslav Nakov; Dragomir Radev; |
488 | DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Augmented only by self-generated pseudo text, generation models over-exploit the previously learned text space and fail to explore a larger one, suffering from a restricted generalization boundary and limited controllability. In this work, we propose DuNST, a novel ST framework to tackle these problems. |
Yuxi Feng; Xiaoyuan Yi; Xiting Wang; Laks V.S. Lakshmanan; Xing Xie; |
489 | What Does The Failure to Reason with “Respectively” in Zero/Few-Shot Settings Tell Us About Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a controlled synthetic dataset WikiResNLI and a naturally occurring dataset NatResNLI to encompass various explicit and implicit realizations of “respectively”. |
Ruixiang Cui; Seolhwa Lee; Daniel Hershcovich; Anders Søgaard; |
490 | BLIND: Bias Removal With No Demographics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce BLIND, a method for bias removal with no prior knowledge of the demographics in the dataset. |
Hadas Orgad; Yonatan Belinkov; |
491 | How Do Humans Perceive Adversarial Text? A Reality Check on The Validity and Naturalness of Word-based Adversarial Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This entails that adversarial perturbations would not pass any human quality gate and do not represent real threats to human-checked NLP systems. To bypass this limitation and enable proper assessment (and later, improvement) of NLP model robustness, we have surveyed 378 human participants about the perceptibility of text adversarial examples produced by state-of-the-art methods. |
Salijona Dyrmishi; Salah Ghamizi; Maxime Cordy; |
492 | Soft Alignment Objectives for Robust Adaptation of Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces novel training objectives built upon a semantic similarity of the predicted tokens to the reference. |
Michal Štefánik; Marek Kadlcik; Petr Sojka; |
493 | The CRINGE Loss: Learning What Language Not to Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Growing evidence shows that even with very large amounts of positive training data, issues remain that can be alleviated with relatively small amounts of negative data: examples of what the model should not do. In this work, we propose a novel procedure to train with such data called the “CRINGE” loss (ContRastive Iterative Negative GEneration). |
Leonard Adolphs; Tianyu Gao; Jing Xu; Kurt Shuster; Sainbayar Sukhbaatar; Jason Weston; |
494 | Modeling User Satisfaction Dynamics in Dialogue Via Hawkes Process Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In order to fully simulate users, it is crucial to take satisfaction dynamics into account. To fill this gap, we propose a new estimator ASAP (sAtisfaction eStimation via HAwkes Process) that treats user satisfaction across turns as an event sequence and employs a Hawkes process to effectively model the dynamics in this sequence. |
Fanghua Ye; Zhiyuan Hu; Emine Yilmaz; |
495 | Towards Identifying Fine-Grained Depression Symptoms from Memes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we conduct a focused study on depression disorder and introduce a new task of identifying fine-grained depressive symptoms from memes. |
Shweta Yadav; Cornelia Caragea; Chenye Zhao; Naincy Kumari; Marvin Solberg; Tanmay Sharma; |
496 | SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. |
Suwon Shon; Siddhant Arora; Chyi-Jiunn Lin; Ankita Pasad; Felix Wu; Roshan S Sharma; Wei-Lun Wu; Hung-yi Lee; Karen Livescu; Shinji Watanabe; |
497 | My Side, Your Side and The Evidence: Discovering Aligned Actor Groups and The Narratives They Weave Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider a more feasible proxy task: Identify the distinct sets of aligned story actors responsible for sustaining the issue-specific narratives. |
Pavan Holur; David Chong; Timothy Tangherlini; Vwani Roychowdhury; |
498 | Characterizing and Measuring Linguistic Dataset Drift Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose three dimensions of linguistic dataset drift: vocabulary, structural, and semantic drift. |
Tyler Chang; Kishaloy Halder; Neha Anna John; Yogarshi Vyas; Yassine Benajiba; Miguel Ballesteros; Dan Roth; |
499 | WebCPM: Interactive Web Search for Chinese Long-form Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce WebCPM, the first Chinese LFQA dataset. |
Yujia Qin; Zihan Cai; Dian Jin; Lan Yan; Shihao Liang; Kunlun Zhu; Yankai Lin; Xu Han; Ning Ding; Huadong Wang; Ruobing Xie; Fanchao Qi; Zhiyuan Liu; Maosong Sun; Jie Zhou; |
500 | Synthesize, Prompt and Transfer: Zero-shot Conversational Question Generation with Pre-trained Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a more realistic and less explored setting, Zero-shot Conversational Question Generation (ZeroCQG), which requires no human-labeled conversations for training. |
Hongwei Zeng; Bifan Wei; Jun Liu; Weiping Fu; |
501 | FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In FormNetV2, we introduce a centralized multimodal graph contrastive learning strategy to unify self-supervised pre-training for all modalities in one loss. |
Chen-Yu Lee; Chun-Liang Li; Hao Zhang; Timothy Dozat; Vincent Perot; Guolong Su; Xiang Zhang; Kihyuk Sohn; Nikolay Glushnev; Renshen Wang; Joshua Ainslie; Shangbang Long; Siyang Qin; Yasuhisa Fujii; Nan Hua; Tomas Pfister; |
502 | MixCE: Training Autoregressive Language Models By Mixing Forward and Reverse Cross-Entropies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Hence, we propose learning with MixCE, an objective that mixes the forward and reverse cross-entropies. |
Shiyue Zhang; Shijie Wu; Ozan Irsoy; Steven Lu; Mohit Bansal; Mark Dredze; David Rosenberg; |
503 | Knowledgeable Parameter Efficient Tuning Network for Commonsense Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple knowledgeable parameter efficient tuning network to couple PLMs with external knowledge for commonsense question answering. |
Ziwang Zhao; Linmei Hu; Hanyu Zhao; Yingxia Shao; Yequan Wang; |
504 | BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems. |
Mingda Chen; Paul-Ambroise Duquenne; Pierre Andrews; Justine Kao; Alexandre Mourachko; Holger Schwenk; Marta R. Costa-jussà; |
505 | NLPositionality: Characterizing Design Biases of Datasets and Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce NLPositionality, a framework for characterizing design biases and quantifying the positionality of NLP datasets and models. |
Sebastin Santy; Jenny Liang; Ronan Le Bras; Katharina Reinecke; Maarten Sap; |
506 | Backpack Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Backpacks: a new neural architecture that marries strong modeling performance with an interface for interpretability and control. |
John Hewitt; John Thickstun; Christopher Manning; Percy Liang; |
507 | WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community. |
Virginia Felkner; Ho-Chun Herbert Chang; Eugene Jang; Jonathan May; |
508 | Grounded Multimodal Named Entity Recognition on Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing MNER studies only extract entity-type pairs in text, which is useless for multimodal knowledge graph construction and insufficient for entity disambiguation. To solve these issues, in this work, we introduce a Grounded Multimodal Named Entity Recognition (GMNER) task. |
Jianfei Yu; Ziyan Li; Jieming Wang; Rui Xia; |
509 | Preserving Commonsense Knowledge from Pre-trained Language Models Via Causal Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the causal view, we propose a unified objective for fine-tuning to retrieve the causality back. |
Junhao Zheng; Qianli Ma; Shengjie Qiu; Yue Wu; Peitian Ma; Junlong Liu; Huawen Feng; Xichen Shang; Haibin Chen; |
510 | Translation-Enhanced Multilingual Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide two key contributions. 1) Relying on a multilingual multi-modal encoder, we provide a systematic empirical study of standard methods used in cross-lingual NLP when applied to mTTI: Translate Train, Translate Test, and Zero-Shot Transfer. 2) We propose Ensemble Adapter (EnsAd), a novel parameter-efficient approach that learns to weigh and consolidate the multilingual text knowledge within the mTTI framework, mitigating the language gap and thus improving mTTI performance. |
Yaoyiran Li; Ching-Yun Chang; Stephen Rawls; Ivan Vulic; Anna Korhonen; |
511 | Benchmarking Large Language Model Capabilities for Conditional Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we discuss how to adapt existing application-specific generation benchmarks to PLMs and provide an in-depth, empirical study of the limitations and capabilities of PLMs in natural language generation tasks along dimensions such as scale, architecture, input and output language. |
Joshua Maynez; Priyanka Agrawal; Sebastian Gehrmann; |
512 | LilGym: Natural Language Visual Reasoning with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present lilGym, a new benchmark for language-conditioned reinforcement learning in visual environments. |
Anne Wu; Kiante Brantley; Noriyuki Kojima; Yoav Artzi; |
513 | Unsupervised Melody-to-Lyrics Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. |
Yufei Tian; Anjali Narayan-Chen; Shereen Oraby; Alessandra Cervone; Gunnar Sigurdsson; Chenyang Tao; Wenbo Zhao; Tagyoung Chung; Jing Huang; Nanyun Peng; |
514 | Causality-aware Concept Extraction Based on Knowledge-guided Prompting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, through the lens of a Structural Causal Model (SCM), we propose equipping the PLM-based extractor with a knowledge-guided prompt as an intervention to alleviate concept bias. |
Siyu Yuan; Deqing Yang; Jinxi Liu; Shuyu Tian; Jiaqing Liang; Yanghua Xiao; Rui Xie; |
515 | Span-level Aspect-based Sentiment Analysis Via Table Filling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel span-level model for Aspect-Based Sentiment Analysis (ABSA), which aims at identifying the sentiment polarity of the given aspect. |
Mao Zhang; Yongxin Zhu; Zhen Liu; Zhimin Bao; Yunfei Wu; Xing Sun; Linli Xu; |
516 | Limitations of Language Models in Arithmetic and Symbolic Induction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the end, we introduce LMs with a tutor, which demonstrates every single step of teaching. |
Jing Qian; Hong Wang; Zekun Li; Shiyang Li; Xifeng Yan; |
517 | EEL: Efficiently Encoding Lattices for Reranking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore an approach for reranking hypotheses by using Transformers to efficiently encode lattices of generated outputs, a method we call EEL. |
Prasann Singhal; Jiacheng Xu; Xi Ye; Greg Durrett; |
518 | CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose CLAPSpeech, a cross-modal contrastive pre-training framework that learns from the prosody variance of the same text token under different contexts. |
Zhenhui Ye; Rongjie Huang; Yi Ren; Ziyue Jiang; Jinglin Liu; Jinzheng He; Xiang Yin; Zhou Zhao; |
519 | Revisiting Cross-Lingual Summarization: A Corpus-based Study and A New Benchmark with Improved Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing cross-lingual summarization (CLS) work constructs CLS corpora by simply and directly translating pre-annotated summaries from one language to another, which can contain errors from both summarization and translation processes. To address this issue, we propose ConvSumX, a cross-lingual conversation summarization benchmark, through a new annotation schema that explicitly considers source input context. |
Yulong Chen; Huajian Zhang; Yijie Zhou; Xuefeng Bai; Yueguan Wang; Ming Zhong; Jianhao Yan; Yafu Li; Judy Li; Xianchao Zhu; Yue Zhang; |
520 | Learning Dynamic Contextualised Word Embeddings Via Template-based Temporal Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a method for learning DCWEs by time-adapting a pretrained Masked Language Model (MLM) using time-sensitive templates. |
Xiaohang Tang; Yi Zhou; Danushka Bollegala; |
521 | How Poor Is The Stimulus? Evaluating Hierarchical Generalization in Neural Networks Trained on Child-directed Speech Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Is this preference due to a learning bias for hierarchical structure, or due to more general biases that interact with hierarchical cues in children's linguistic input? We explore these possibilities by training LSTMs and Transformers – two types of neural networks without a hierarchical bias – on data similar in quantity and content to children's linguistic input: text from the CHILDES corpus. |
Aditya Yedetore; Tal Linzen; Robert Frank; R. Thomas McCoy; |
522 | GanLM: Encoder-Decoder Pre-training with An Auxiliary Discriminator Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the idea of Generative Adversarial Networks (GANs), we propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator, unifying the ability of language understanding and generation in a single model. |
Jian Yang; Shuming Ma; Li Dong; Shaohan Huang; Haoyang Huang; Yuwei Yin; Dongdong Zhang; Liqun Yang; Furu Wei; Zhoujun Li; |
523 | Linear Guardedness and Its Implications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the impact of this removal on the behavior of downstream classifiers trained on the modified representations is not fully understood. In this work, we formally define the notion of linear guardedness as the inability of an adversary to predict the concept directly from the representation, and study its implications. |
Shauli Ravfogel; Yoav Goldberg; Ryan Cotterell; |
524 | Searching for Needles in A Haystack: On The Role of Incidental Bilingualism in PaLM's Translation Capability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a mixed-method approach to measure and understand incidental bilingualism at scale. |
Eleftheria Briakou; Colin Cherry; George Foster; |
525 | Open Set Relation Extraction Via Unknown-Aware Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an unknown-aware training method, regularizing the model by dynamically synthesizing negative instances that can provide the missing supervision signals. |
Jun Zhao; Xin Zhao; WenYu Zhan; Qi Zhang; Tao Gui; Zhongyu Wei; Yun Wen Chen; Xiang Gao; Xuanjing Huang; |
526 | Learning to Imagine: Visually-Augmented Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to utilize visual information for composition in the same manner as humans. |
Tianyi Tang; Yushuo Chen; Yifan Du; Junyi Li; Wayne Xin Zhao; Ji-Rong Wen; |
527 | Generating Hashtags for Short-form Videos with Guided Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Both of these properties cannot be easily modeled with classification approaches. To bridge this gap, we formulate SVHR as a generation task that better represents how hashtags are created naturally. |
Tiezheng Yu; Hanchao Yu; Davis Liang; Yuning Mao; Shaoliang Nie; Po-Yao Huang; Madian Khabsa; Pascale Fung; Yi-Chia Wang; |
528 | NEUROSTRUCTURAL DECODING: Neural Text Generation with Structural Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While most approaches for conditional text generation have primarily focused on lexical constraints, they often struggle to effectively incorporate syntactic constraints, which provide a richer language for approximating semantic constraints. We address this gap by introducing NeuroStructural Decoding, a new decoding algorithm that incorporates syntactic constraints to further improve the quality of the generated text. |
Mohaddeseh Bastan; Mihai Surdeanu; Niranjan Balasubramanian; |
529 | The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an active learning approach that exploits the strengths of both human and machine translations by iteratively adding small batches of human translations into the machine-translated training set. |
Zhuang Li; Lizhen Qu; Philip Cohen; Raj Tumuluri; Gholamreza Haffari; |
530 | Ideology Prediction from Scarce and Biased Supervision: Learn to Disregard The “What” and Focus on The “How”! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs. |
Chen Chen; Dylan Walker; Venkatesh Saligrama; |
531 | Unsupervised Extractive Summarization of Emotion Triggers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We instead pursue unsupervised systems that extract triggers from text. First, we introduce CovidET-EXT, augmenting (Zhan et al., 2022)'s abstractive dataset (in the context of the COVID-19 crisis) with extractive triggers. Second, we develop new unsupervised learning models that can jointly detect emotions and summarize their triggers. |
Tiberiu Sosea; Hongli Zhan; Junyi Jessy Li; Cornelia Caragea; |
532 | Document-Level Event Argument Extraction With A Chain Reasoning Paradigm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce T-norm fuzzy logic for optimization, which permits end-to-end learning and shows promise for integrating the expressiveness of logical reasoning with the generalization of neural networks. |
Jian Liu; Chen Liang; Jinan Xu; Haoyan Liu; Zhe Zhao; |
533 | Pre-training Multi-party Dialogue Models with Latent Discourse Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, due to the lack of explicitly annotated discourse labels in multi-party dialogue corpora, previous works fail to scale up the pre-training process by putting aside the unlabeled multi-party conversational data for nothing. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model by unsupervised latent variable inference methods. |
Yiyang Li; Xinting Huang; Wei Bi; Hai Zhao; |
534 | Interpreting Positional Information in Perspective of Word Order Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although several studies have attempted to improve positional encoding and investigate the influence of word order perturbation, it remains unclear how positional encoding impacts NLP models from the perspective of word order. In this paper, we aim to shed light on this problem by analyzing the working mechanism of the attention module and investigating the root cause of its inability to encode positional information. |
Zhang Xilong; Liu Ruochen; Liu Jin; Liang Xuefeng; |
535 | I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation Highlight: We introduce I2D2, a novel commonsense distillation framework that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale teacher model with two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities. |
Chandra Bhagavatula; Jena D. Hwang; Doug Downey; Ronan Le Bras; Ximing Lu; Lianhui Qin; Keisuke Sakaguchi; Swabha Swayamdipta; Peter West; Yejin Choi; |
536 | More Than Classification: A Unified Framework for Event Temporal Relation Extraction Highlight: In this paper, we propose a unified event temporal relation extraction framework, which transforms temporal relations into logical expressions of time points and completes the ETRE by predicting the relations between certain time point pairs. |
Quzhe Huang; Yutong Hu; Shengqi Zhu; Yansong Feng; Chang Liu; Dongyan Zhao; |
537 | Multi-Source Test-Time Adaptation As Dueling Bandits for Extractive Question Answering Highlight: In this work, we study multi-source test-time model adaptation from user feedback, where K distinct models are established for adaptation. |
Hai Ye; Qizhe Xie; Hwee Tou Ng; |
538 | Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery Highlight: In this paper, we propose a decoupled prototype learning framework (DPL) to decouple pseudo label disambiguation and representation learning. |
Yutao Mou; Xiaoshuai Song; Keqing He; Chen Zeng; Pei Wang; Jingang Wang; Yunsen Xian; Weiran Xu; |
539 | DecompEval: Evaluating Generated Texts As Unsupervised Decomposed Question Answering Highlight: Furthermore, existing metrics only provide an evaluation score for each dimension without revealing the evidence to interpret how this score is obtained. To deal with these challenges, we propose a simple yet effective metric called DecompEval. |
Pei Ke; Fei Huang; Fei Mi; Yasheng Wang; Qun Liu; Xiaoyan Zhu; Minlie Huang; |
540 | Backdooring Neural Code Search Highlight: This may impact downstream software (e.g., stock trading systems and autonomous driving) and cause financial loss and/or life-threatening incidents. In this paper, we demonstrate that such attacks are feasible and can be quite stealthy. |
Weisong Sun; Yuchen Chen; Guanhong Tao; Chunrong Fang; Xiangyu Zhang; Quanjun Zhang; Bin Luo; |
541 | Concise Answers to Complex Questions: Summarization of Long-form Answers Highlight: Together, we present the first study on summarizing long-form answers, taking a step forward for QA agents that can provide answers at multiple granularities. |
Abhilash Potluri; Fangyuan Xu; Eunsol Choi; |
542 | Towards Better Entity Linking with Multi-View Enhanced Distillation Highlight: Aiming at learning entity representations that can match divergent mentions, this paper proposes a Multi-View Enhanced Distillation (MVD) framework, which can effectively transfer knowledge of multiple fine-grained and mention-relevant parts within entities from cross-encoders to dual-encoders. |
Yi Liu; Yuan Tian; Jianxun Lian; Xinlong Wang; Yanan Cao; Fang Fang; Wen Zhang; Haizhen Huang; Weiwei Deng; Qi Zhang; |
543 | A Measure-Theoretic Characterization of Tight Language Models Highlight: In order to characterize the notion of leakage more precisely, this paper offers a measure-theoretic treatment of language modeling. |
Li Du; Lucas Torroba Hennigen; Tiago Pimentel; Clara Meister; Jason Eisner; Ryan Cotterell; |
544 | PAED: Zero-Shot Persona Attribute Extraction in Dialogues Highlight: Although there is a public dataset for triplet-based persona attribute extraction from conversations, its automatically generated labels present many issues, including unspecific relations and inconsistent annotations. We fix such issues by leveraging more reliable text-label matching criteria to generate high-quality data for persona attribute extraction. |
Luyao Zhu; Wei Li; Rui Mao; Vlad Pandelea; Erik Cambria; |
545 | PromptRank: Unsupervised Keyphrase Extraction Using Prompt Highlight: To this end, in this paper, we propose a simple yet effective unsupervised approach, PromptRank, based on a PLM with an encoder-decoder architecture. |
Aobo Kong; Shiwan Zhao; Hao Chen; Qicheng Li; Yong Qin; Ruiqi Sun; Xiaoyan Bai; |