Paper Digest: EMNLP 2023 Highlights

https://www.paperdigest.org


1, Is ChatGPT A General-Purpose Natural Language Processing Task Solver?
Chengwei Qin; Aston Zhang; Zhuosheng Zhang; Jiaao Chen; Michihiro Yasunaga; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we empirically analyze the zero-shot learning ability of ChatGPT by evaluating it on 20 popular NLP datasets covering 7 representative task categories.


2, Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation
Chengwei Qin; Chen Chen; Shafiq Joty;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA), which enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.


3, FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min; Kalpesh Krishna; Xinxi Lyu; Mike Lewis; Wen-tau Yih; Pang Koh; Mohit Iyyer; Luke Zettlemoyer; Hannaneh Hajishirzi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce FACTSCORE, a new evaluation that breaks a generation into a series of atomic facts and computes the percentage of atomic facts supported by a reliable knowledge source.
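A minimal, runnable sketch of the metric itself: precision over atomic facts. The `extract_atomic_facts` and `is_supported` helpers below are toy stand-ins for the paper's LLM-based fact splitter and retrieval-backed verifier.

```python
def extract_atomic_facts(generation: str) -> list[str]:
    # Toy stand-in: the paper uses an LLM to split text into atomic facts.
    return [s.strip() for s in generation.split(".") if s.strip()]

def is_supported(fact: str, knowledge: str) -> bool:
    # Toy stand-in: the paper verifies facts against retrieved passages
    # from a reliable knowledge source such as Wikipedia.
    return fact.lower() in knowledge.lower()

def factscore(generation: str, knowledge: str) -> float:
    facts = extract_atomic_facts(generation)
    return sum(is_supported(f, knowledge) for f in facts) / max(len(facts), 1)

knowledge = "Marie Curie won the Nobel Prize in Physics in 1903."
gen = "Marie Curie won the Nobel Prize in Physics in 1903. She was born in Vienna."
print(factscore(gen, knowledge))  # 0.5: one of two atomic facts is supported
```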


4, Automatic Prompt Optimization with "Gradient Descent" and Beam Search
Reid Pryzant; Dan Iter; Jerry Li; Yin Lee; Chenguang Zhu; Michael Zeng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Large Language Models (LLMs) have shown impressive performance as general-purpose agents, but their abilities remain highly dependent on prompts that are hand-written with onerous trial-and-error effort. We propose a simple and nonparametric solution to this problem, Prompt Optimization with Textual Gradients (ProTeGi), which is inspired by numerical gradient descent to automatically improve prompts, assuming access to training data and an LLM API.
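A sketch of the loop this describes, assuming hypothetical `llm(text) -> text` and `score(prompt, examples) -> accuracy` callables: natural-language critiques play the role of gradients, and beam search keeps the best prompt candidates.

```python
def textual_gradient_step(llm, prompt, failures):
    # The "gradient": an LLM critique of the prompt in light of its errors,
    # followed by an LLM edit of the prompt against that critique.
    critique = llm(f"Prompt: {prompt}\nErrors: {failures}\nWhy might this prompt fail?")
    return llm(f"Prompt: {prompt}\nCritique: {critique}\nRewrite the prompt to fix this.")

def protegi(llm, seed_prompt, dev_set, score, beam_width=4, steps=5, edits=3):
    # score(prompt, examples) -> accuracy in [0, 1] on the given examples.
    beam = [seed_prompt]
    for _ in range(steps):
        candidates = list(beam)
        for p in beam:
            failures = [ex for ex in dev_set if score(p, [ex]) < 1.0][:4]
            candidates += [textual_gradient_step(llm, p, failures)
                           for _ in range(edits)]
        beam = sorted(candidates, key=lambda p: score(p, dev_set),
                      reverse=True)[:beam_width]
    return beam[0]
```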


5, Language Models with Rationality
Nora Kassner; Oyvind Tafjord; Ashish Sabharwal; Kyle Richardson; Hinrich Schuetze; Peter Clark;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This lack of interpretability is a growing impediment to widespread use of LLMs. To address this, our goals are to make model beliefs and their inferential relationships explicit, and to resolve inconsistencies that may exist, so that answers are supported by interpretable chains of reasoning drawn from a consistent network of beliefs.


6, Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
Canwen Xu; Daya Guo; Nan Duan; Julian McAuley;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a pipeline that can automatically generate a high-quality multi-turn chat corpus by leveraging ChatGPT to engage in a conversation with itself.
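A sketch of the self-chat idea under stated assumptions (`llm(text) -> text` is a hypothetical chat-completion wrapper; the paper seeds conversations with questions from sources such as Quora and Stack Overflow):

```python
def self_chat(llm, seed_question, turns=3):
    # One model plays both sides of the dialogue; the transcript is fed
    # back in so each reply conditions on the full conversation so far.
    transcript = [("user", seed_question)]
    for _ in range(turns):
        history = "\n".join(f"{role}: {text}" for role, text in transcript)
        transcript.append(("assistant", llm(history + "\nassistant:")))
        history = "\n".join(f"{role}: {text}" for role, text in transcript)
        transcript.append(("user", llm(history + "\nuser:")))
    return transcript
```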


7, Reasoning with Language Model Is Planning with World Model
Shibo Hao; Yi Gu; Haodi Ma; Joshua Hong; Zhen Wang; Daisy Wang; Zhiting Hu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To overcome the limitations, we propose a new LLM reasoning framework, Reasoning via Planning (RAP).


8, Revisiting Machine Translation for Cross-lingual Classification
Mikel Artetxe; Vedanuj Goswami; Shruti Bhosale; Angela Fan; Luke Zettlemoyer;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.


9, API-Assisted Code Generation for Question Answering on Varied Table Structures
Yihan Cao; Shuyi Chen; Ryan Liu; Zhiruo Wang; Daniel Fried;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In response, this paper introduces a unified TableQA framework that: (1) provides a unified representation for structured tables as multi-index Pandas data frames, (2) uses Python as a powerful querying language, and (3) uses few-shot prompting to translate NL questions into Python programs, which are executable on Pandas data frames.
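For illustration, here is the kind of multi-index representation and generated program the framework describes (the data and question are made up for the example):

```python
import pandas as pd

# A hierarchical table represented as a multi-index data frame.
df = pd.DataFrame(
    {"population": [8.4, 3.9, 2.7, 1.6]},
    index=pd.MultiIndex.from_tuples(
        [("USA", "New York"), ("USA", "Los Angeles"),
         ("Canada", "Toronto"), ("Canada", "Montreal")],
        names=["country", "city"],
    ),
)

# NL question: "What is the largest city population in Canada?"
# A program an LLM might be prompted to generate:
answer = df.loc["Canada", "population"].max()
print(answer)  # 2.7
```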


10, Navigating The Grey Area: How Expressions of Uncertainty and Overconfidence Affect Language Models
Kaitlyn Zhou; Dan Jurafsky; Tatsunori Hashimoto;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: The increased deployment of LMs for real-world tasks involving knowledge and facts makes it important to understand model epistemology: what LMs think they know, and how their attitudes toward that knowledge are affected by language use in their inputs. Here, we study an aspect of model epistemology: how epistemic markers of certainty, uncertainty, or evidentiality like "I'm sure it's", "I think it's", or "Wikipedia says it's" affect models, and whether they contribute to model failures.


11, SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
Potsawee Manakul; Adian Liusie; Mark Gales;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check the responses of black-box models in a zero-resource fashion, i.e., without an external database.
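The intuition is that a model hallucinating a fact will contradict itself across stochastic samples. A minimal unigram-overlap variant of the consistency check (the paper also explores BERTScore-, QA-, n-gram-, and NLI-based measures; `llm` is an assumed sampling callable):

```python
def selfcheck(llm, prompt, response_sentences, n_samples=5):
    # Higher score = less consistent with resampled outputs = more suspect.
    samples = [llm(prompt) for _ in range(n_samples)]
    scores = []
    for sent in response_sentences:
        words = set(sent.lower().split())
        overlap = [len(words & set(s.lower().split())) / max(len(words), 1)
                   for s in samples]
        scores.append(1.0 - sum(overlap) / len(overlap))
    return scores
```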


12, C-STS: Conditional Semantic Textual Similarity
Ameet Deshpande; Carlos Jimenez; Howard Chen; Vishvak Murahari; Victoria Graf; Tanmay Rajpurohit; Ashwin Kalyan; Danqi Chen; Karthik Narasimhan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, it is an inherently ambiguous task, with the sentence similarity depending on the specific aspect of interest. We resolve this ambiguity by proposing a novel task called conditional STS (C-STS) which measures similarity conditioned on an aspect elucidated in natural language (hereon, condition).


13, Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay; Jason Wei; Hyung Chung; Vinh Tran; David So; Siamak Shakeri; Xavier Garcia; Steven Zheng; Jinfeng Rao; Aakanksha Chowdhery; Denny Zhou; Donald Metzler; Slav Petrov; Neil Houlsby; Quoc Le; Mostafa Dehghani;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we continue training a baseline language model, PaLM, with UL2R, introducing a new set of models at 8B, 62B, and 540B scale which we call U-PaLM.


14, RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation
Fengji Zhang; Bei Chen; Yue Zhang; Jacky Keung; Jin Liu; Daoguang Zan; Yi Mao; Jian-Guang Lou; Weizhu Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose RepoCoder, a simple, generic, and effective framework to address the challenge.


15, Active Retrieval Augmented Generation
Zhengbao Jiang; Frank Xu; Luyu Gao; Zhiqing Sun; Qian Liu; Jane Dwivedi-Yu; Yiming Yang; Jamie Callan; Graham Neubig;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we provide a generalized view of active retrieval augmented generation, methods that actively decide when and what to retrieve across the course of the generation.
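A sketch of one such active scheme, in the spirit of the paper's FLARE method: draft the next sentence, and if the model is unconfident, retrieve with the draft as the query and regenerate. `lm` and `retriever` are assumed callables, not the paper's interfaces.

```python
def active_rag(lm, retriever, question, max_sents=8, threshold=0.6):
    # lm(question, context, answer_so_far) -> (next_sentence, min_token_prob)
    # retriever(query) -> list of evidence passages
    answer, context = "", []
    for _ in range(max_sents):
        sent, conf = lm(question, context, answer)      # tentative next sentence
        if sent and conf < threshold:                   # unconfident: retrieve
            context = retriever(sent)                   # the draft is the query
            sent, conf = lm(question, context, answer)  # regenerate with evidence
        if not sent:
            break
        answer = (answer + " " + sent).strip()
    return answer
```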


16, MEGA: Multilingual Evaluation of Generative AI
Kabir Ahuja; Harshita Diddee; Rishav Hada; Millicent Ochieng; Krithika Ramesh; Prachi Jain; Akshay Nambi; Tanuja Ganu; Sameer Segal; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present a thorough analysis of the performance of models across languages and tasks and discuss challenges in improving the performance of generative LLMs on low-resource languages.


17, CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion
Xingwei He; Yeyun Gong; A-Long Jin; Hang Zhang; Anlei Dong; Jian Jiao; Siu Yiu; Nan Duan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query.


18, Document-Level Machine Translation with Large Language Models
Longyue Wang; Chenyang Lyu; Tianbo Ji; Zhirui Zhang; Dian Yu; Shuming Shi; Zhaopeng Tu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modelling Abilities, where we further probe discourse knowledge encoded in LLMs and shed light on impacts of training techniques on discourse modeling.


19, We're Afraid Language Models Aren't Modeling Ambiguity
Alisa Liu; Zhaofeng Wu; Julian Michael; Alane Suhr; Peter West; Alexander Koller; Swabha Swayamdipta; Noah Smith; Yejin Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We capture ambiguity in a sentence through its effect on entailment relations with another sentence, and collect AmbiEnt, a linguist-annotated benchmark of 1,645 examples with diverse kinds of ambiguity.


20, CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations
Myra Cheng; Tiziano Piccardi; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Moreover, there is growing concern that these LLM simulations are flattened caricatures of the personas that they aim to simulate, failing to capture the multidimensionality of people and perpetuating stereotypes. To bridge these gaps, we present CoMPosT, a framework to characterize LLM simulations using four dimensions: Context, Model, Persona, and Topic.


21, Answering Questions By Meta-Reasoning Over Multiple Chains of Thought
Ori Yoran; Tomer Wolfson; Ben Bogin; Uri Katz; Daniel Deutch; Jonathan Berant;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce Multi-Chain Reasoning (MCR), an approach which prompts large language models to meta-reason over multiple chains of thought, rather than aggregate their answers.


22, Reward-Augmented Decoding: Efficient Controlled Text Generation With A Unidirectional Reward Model
Haikang Deng; Colin Raffel;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce Reward-Augmented Decoding (RAD), a text generation procedure that uses a small unidirectional reward model to encourage a language model to generate text that has certain properties.
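A greedy one-step sketch of the idea (the paper samples from the reward-adjusted distribution rather than taking the argmax; `lm_topk` and `reward_model` are assumed callables):

```python
def rad_step(lm_topk, reward_model, prefix, beta=1.0, k=20):
    # lm_topk(prefix) -> [(token, logprob), ...] for the top-k next tokens.
    # reward_model(text) -> scalar; a unidirectional reward model lets
    # prefix activations be cached and reused across decoding steps.
    candidates = lm_topk(prefix)[:k]
    rescored = [(tok, lp + beta * reward_model(prefix + tok))
                for tok, lp in candidates]
    return max(rescored, key=lambda pair: pair[1])[0]
```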


23, ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering Over Knowledge Graph
Jinhao Jiang; Kun Zhou; Xin Zhao; Yaliang Li; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Despite their effectiveness, due to the divergence in model architecture, the PLM and GNN are not closely integrated, limiting knowledge sharing and fine-grained feature interactions. To address this, we aim to simplify the above two-module approach and develop a more capable PLM that can directly support subgraph reasoning for KGQA, namely ReasoningLM.


24, StructGPT: A General Framework for Large Language Model to Reason Over Structured Data
Jinhao Jiang; Kun Zhou; Zican Dong; Keming Ye; Xin Zhao; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we aim to improve the reasoning ability of large language models (LLMs) over structured data in a unified way.


25, Contrastive Learning for Inference in Dialogue
Etsuko Ishii; Yan Xu; Bryan Wilie; Ziwei Ji; Holy Lovenia; Willy Chung; Pascale Fung;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we analyze the behavior of the models based on the task difficulty defined by the semantic information gap, which distinguishes inductive and deductive reasoning.


26, LM Vs LM: Detecting Factual Errors Via Cross Examination
Roi Cohen; May Hamri; Mor Geva; Amir Globerson;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Inspired by truth-seeking mechanisms in law, we propose a factuality evaluation framework for LMs that is based on cross-examination.


27, Query2doc: Query Expansion with Large Language Models
Liang Wang; Nan Yang; Furu Wei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper introduces a simple yet effective query expansion approach, denoted as query2doc, to improve both sparse and dense retrieval systems.
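The method is small enough to sketch in a few lines; the prompt wording below is illustrative, and repeating the query preserves its weight relative to the much longer pseudo-document in sparse (BM25-style) retrieval.

```python
def query2doc(llm, query, n_repeat=5):
    # Ask the LLM to write a plausible answer passage, then expand the
    # query with it; repeating the query keeps it from being swamped by
    # the pseudo-document in term-weighting retrievers.
    pseudo_doc = llm(f"Write a passage that answers the following query.\n"
                     f"Query: {query}\nPassage:")
    return " ".join([query] * n_repeat) + " " + pseudo_doc
```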


28, XLM-V: Overcoming The Vocabulary Bottleneck in Multilingual Masked Language Models
Davis Liang; Hila Gonen; Yuning Mao; Rui Hou; Naman Goyal; Marjan Ghazvininejad; Luke Zettlemoyer; Madian Khabsa;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce a new approach for scaling to very large multilingual vocabularies by de-emphasizing token sharing between languages with little lexical overlap and assigning vocabulary capacity to achieve sufficient coverage for each individual language.


29, WiCE: Real-World Entailment for Claims in Wikipedia
Ryo Kamoi; Tanya Goyal; Juan Rodriguez; Greg Durrett;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia.


30, TaskWeb: Selecting Better Source Tasks for Multi-task NLP
Joongwon Kim; Akari Asai; Gabriel Ilharco; Hannaneh Hajishirzi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we investigate whether knowing task relationships via pairwise task transfer improves choosing one or more source tasks that help to learn a new target task.


31, Query Rewriting in Retrieval-Augmented Large Language Models
Xinbei Ma; Yeyun Gong; Pengcheng He; Hai Zhao; Nan Duan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: From the perspective of query rewriting, this work introduces a new framework, Rewrite-Retrieve-Read, which replaces the previous retrieve-then-read pipeline for retrieval-augmented LLMs.


32, G-Eval: NLG Evaluation Using Gpt-4 with Better Human Alignment
Yang Liu; Dan Iter; Yichong Xu; Shuohang Wang; Ruochen Xu; Chenguang Zhu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present G-Eval, a framework of using large language models with chain-of-thoughts (CoT) and a form-filling paradigm, to assess the quality of NLG outputs.


33, TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
Zorik Gekhman; Jonathan Herzig; Roee Aharoni; Chen Elkind; Idan Szpektor;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Alternatively, large language models (LLMs) have recently shown promising results in directly evaluating generative tasks, but are too computationally expensive for practical use. Motivated by these limitations, we introduce TrueTeacher, a method for generating synthetic data by annotating diverse model-generated summaries using an LLM.


34, Poisoning Retrieval Corpora By Injecting Adversarial Passages
Zexuan Zhong; Ziqing Huang; Alexander Wettig; Danqi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose a novel attack for dense retrieval systems in which a malicious user generates a small number of adversarial passages by perturbing discrete tokens to maximize similarity with a provided set of training queries.


35, MQuAKE: Assessing Knowledge Editing in Language Models Via Multi-Hop Questions
Zexuan Zhong; Zhengxuan Wu; Christopher Manning; Christopher Potts; Danqi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present a benchmark MQuAKE (Multi-hop Question Answering for Knowledge Editing) comprising multi-hop questions that assess whether edited models correctly answer questions where the answer should change as an entailed consequence of edited facts.


36, FactKB: Generalizable Factuality Evaluation Using Language Models Enhanced with Factual Knowledge
Shangbin Feng; Vidhisha Balachandran; Yuyang Bai; Yulia Tsvetkov;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose FactKB, a simple new approach to factuality evaluation that is generalizable across domains, in particular with respect to entities and relations.


37, Batch Prompting: Efficient Inference with Large Language Model APIs
Zhoujun Cheng; Jungo Kasai; Tao Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose batch prompting, a simple yet effective prompting approach that enables the LLM to run inference in batches, instead of one sample at a time.
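A sketch of how batching might be laid out and parsed, assuming an `llm(text) -> text` callable and an indexed answer format:

```python
def batch_prompt(llm, exemplars, questions):
    # exemplars: (question, answer) pairs shown as in-context demonstrations
    # in the same indexed style the batched questions use.
    demo = "".join(f"Q[{i}]: {q}\nA[{i}]: {a}\n" for i, (q, a) in enumerate(exemplars))
    batch = "".join(f"Q[{i}]: {q}\n" for i, q in enumerate(questions))
    out = llm(demo + batch)  # model answers as "A[0]: ...", "A[1]: ...", ...
    answers = {}
    for line in out.splitlines():
        if line.startswith("A[") and "]:" in line:
            idx, ans = line[2:].split("]:", 1)
            answers[int(idx)] = ans.strip()
    return [answers.get(i, "") for i in range(len(questions))]
```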


38, Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Mor Geva; Jasmijn Bastings; Katja Filippova; Amir Globerson;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: While previous work looked into where factual associations are stored, little is known about how they are retrieved internally during inference. We investigate this question through the lens of information flow.


39, The Troubling Emergence of Hallucination in Large Language Models - An Extensive Definition, Quantification, and Prescriptive Remediations
Vipula Rawte; Swagata Chakraborty; Agnibh Pathak; Anubhav Sarkar; S.M Towhidul Islam Tonmoy; Aman Chadha; Amit Sheth; Amitava Das;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In conclusion, we propose two solution strategies for mitigating hallucinations.


40, MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding
Steven Wang; Antoine Scardigli; Leonard Tang; Wei Chen; Dmitry Levkin; Anya Chen; Spencer Ball; Thomas Woodside; Oliver Zhang; Dan Hendrycks;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations.


41, AnyTOD: A Programmable Task-Oriented Dialog System
Jeffrey Zhao; Yuan Cao; Raghav Gupta; Harrison Lee; Abhinav Rastogi; Mingqiu Wang; Hagen Soltau; Izhak Shafran; Yonghui Wu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose AnyTOD, an end-to-end, zero-shot task-oriented dialog (TOD) system capable of zero-shot adaptation onto unseen tasks or domains.


42, PALS: Personalized Active Learning for Subjective Tasks in NLP
Kamil Kanclerz; Konrad Karanowski; Julita Bielaniewicz; Marcin Gruza; Piotr Milkowski; Jan Kocon; Przemyslaw Kazienko;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present novel Personalized Active Learning techniques for Subjective NLP tasks (PALS) to either reduce the cost of the annotation process or to boost the learning effect.


43, Reading Order Matters: Information Extraction from Visually-rich Documents By Token Path Prediction
Chong Zhang; Ya Guo; Yi Tu; Huan Chen; Jinyang Tang; Huijia Zhu; Qi Zhang; Tao Gui;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To address the reading order issue, we introduce Token Path Prediction (TPP), a simple prediction head to predict entity mentions as token sequences within documents.


44, Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
Jiacheng Liu; Wenya Wang; Dianzhuo Wang; Noah Smith; Yejin Choi; Hannaneh Hajishirzi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Today's language models can be remarkably intelligent yet still produce text that contains trivial commonsense errors. Therefore, we seek a retrospective verification approach that can reflect on the commonsense plausibility of the machine text, and introduce Vera, a general-purpose model that learns to estimate the commonsense plausibility of declarative statements.


45, Exchange-of-Thought: Enhancing Large Language Model Capabilities Through Cross-Model Communication
Zhangyue Yin; Qiushi Sun; Cheng Chang; Qipeng Guo; Junqi Dai; Xuanjing Huang; Xipeng Qiu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Despite this progress, their reasoning is often constrained by their intrinsic understanding, lacking external insights. To address this, we propose Exchange-of-Thought (EoT), a novel framework that enables cross-model communication during problem-solving.


46, Evaluating Object Hallucination in Large Vision-Language Models
Yifan Li; Yifan Du; Kun Zhou; Jinpeng Wang; Xin Zhao; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To investigate it, this work presents the first systematic study on object hallucination of LVLMs.


47, Rethinking The Evaluation for Conversational Recommendation in The Era of Large Language Models
Xiaolei Wang; Xinyu Tang; Xin Zhao; Jingyuan Wang; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we embark on an investigation into the utilization of ChatGPT for CRSs, revealing the inadequacy of the existing evaluation protocol.


48, Meta-Learning Online Adaptation of Language Models
Nathan Hu; Eric Mitchell; Christopher Manning; Chelsea Finn;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: That is, the gradient signal from important tokens representing factual information is drowned out by the gradient from inherently noisy tokens, suggesting that a dynamic, context-aware learning rate may be beneficial. We therefore propose learning which tokens to upweight.


49, HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Junyi Li; Xiaoxue Cheng; Xin Zhao; Jian-Yun Nie; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To understand what types of content and to what extent LLMs are apt to hallucinate, we introduce the Hallucination Evaluation for Large Language Models (HaluEval) benchmark, a large collection of generated and human-annotated hallucinated samples for evaluating the performance of LLMs in recognizing hallucination. To generate these samples, we propose a ChatGPT-based two-step framework, i.e., sampling-then-filtering.


50, Enhancing Generative Retrieval with Reinforcement Learning from Relevance Feedback
Yujia Zhou; Zhicheng Dou; Ji-Rong Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Nevertheless, this approach faces two fundamental challenges: (i) a discrepancy between the token-level probabilistic optimization and the broader document-level relevance estimation; (ii) an overemphasis on top-1 results at the expense of overall ranking quality. To tackle these challenges, we propose a generative retrieval model with reinforcement learning from relevance feedback, which aims to align token-level docid generation with document-level relevance estimation.


51, Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
Katherine Tian; Eric Mitchell; Allan Zhou; Archit Sharma; Rafael Rafailov; Huaxiu Yao; Chelsea Finn; Christopher Manning;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, the most widely-used LMs are fine-tuned with reinforcement learning from human feedback (RLHF-LMs), and some studies have suggested that RLHF-LMs produce conditional probabilities that are very poorly calibrated. In light of this perceived weakness, we conduct a broad evaluation of methods for extracting confidence scores from RLHF-LMs.


52, A Cheaper and Better Diffusion Language Model with Soft-Masked Noise
Jiaao Chen; Aston Zhang; Mu Li; Alex Smola; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: For example, the commonly used Gaussian noise cannot handle discrete corruption well, and the objectives in continuous spaces fail to be stable for textual data in the diffusion process, especially when the dimension is high. To alleviate these issues, we introduce Masked-Diffuse LM, a novel diffusion model for language modeling with lower training cost and better performance, inspired by linguistic features of languages.


53, Unlearn What You Want to Forget: Efficient Unlearning for LLMs
Jiaao Chen; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: As a result, the ability to easily remove data related to individual users from such models, without deteriorating their predictive quality after the removal, becomes increasingly important. To address this, we propose an unlearning framework that can efficiently update LLMs without retraining the whole model after data removal, by introducing lightweight unlearning layers, learned with a selective teacher-student objective, into the transformers.


54, CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou; Uri Alon; Sumit Agarwal; Graham Neubig;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose CodeBERTScore: an evaluation metric for code generation, which builds on BERTScore (Zhang et al., 2020).
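The BERTScore recipe it builds on reduces to greedy matching of contextual token embeddings. A NumPy sketch (the paper encodes candidate and reference code with a code-pretrained encoder such as CodeBERT; any (tokens × dim) embedding matrices work here):

```python
import numpy as np

def bertscore_f1(cand_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    # cand_emb, ref_emb: (n_tokens, dim) contextual embeddings of the
    # candidate and reference code.
    cand = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    ref = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = cand @ ref.T                  # pairwise cosine similarity
    precision = sim.max(axis=1).mean()  # best match per candidate token
    recall = sim.max(axis=0).mean()     # best match per reference token
    return 2 * precision * recall / (precision + recall)
```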


55, NLI4CT: Multi-Evidence Natural Language Inference for Clinical Trial Reports
Mael Jullien; Marco Valentino; Hannah Frost; Paul O'Regan; Dónal Landers; Andre Freitas;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present a novel resource to advance research on NLI for reasoning on CTRs.


56, Instructed Language Models with Retrievers Are Powerful Entity Linkers
Zilin Xiao; Ming Gong; Jie Wu; Xingyao Zhang; Linjun Shou; Daxin Jiang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose several methods of equipping language models with EL ability, including (i) a sequence-to-sequence EL training objective with instruction tuning, and (ii) a novel generative EL framework based on a lightweight potential-mention retriever that frees the model from heavy and non-parallelizable decoding, achieving a 4× speedup without compromising linking metrics.


57, Privacy Implications of Retrieval-Based Language Models
Yangsibo Huang; Samyak Gupta; Zexuan Zhong; Kai Li; Danqi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present the first study of privacy risks in retrieval-based LMs, particularly kNN-LMs.


58, Enabling Large Language Models to Generate Text with Citations
Tianyu Gao; Howard Yen; Jiatong Yu; Danqi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, our aim is to allow LLMs to generate text with citations, improving their factual correctness and verifiability.


59, Doolittle: Benchmarks and Corpora for Academic Writing Formalization
Shizhe Diao; Yongyu Lei; Liangming Pan; Tianqing Fang; Wangchunshu Zhou; Sedrick Keh; Min-Yen Kan; Tong Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose a more general task, Academic Writing Formalization (AWF), to improve the overall quality of formal academic writing at the paragraph level.


60, Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu; Xinyi Wang; Yujie Lu; Tsu-Jui Fu; Xin Wang; Miguel Eckstein; William Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Despite the advancements of T2I models, a common issue encountered by users is the need for repetitive editing of input prompts in order to receive a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models, such as GPT-k, we investigate the potential of utilizing such models to improve the prompt editing process for T2I generation.


61, Knowledge Rumination for Pre-trained Language Models
Yunzhi Yao; Peng Wang; Shengyu Mao; Chuanqi Tan; Fei Huang; Huajun Chen; Ningyu Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fail to fully utilize it when applied to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from an external corpus.


62, Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao; Peng Wang; Bozhong Tian; Siyuan Cheng; Zhoubo Li; Shumin Deng; Huajun Chen; Ningyu Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.


63, Conceptor-Aided Debiasing of Large Language Models
Li Yifei; Lyle Ungar; João Sedoc;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose two methods of applying conceptors: (1) bias subspace projection by post-processing via the conceptor NOT operation; and (2) a new architecture, conceptor-intervened BERT (CI-BERT), which explicitly incorporates the conceptor projection into all layers during training.


64, Sparse Low-rank Adaptation of Pre-trained Language Models
Ning Ding; Xingtai Lv; Qiaosen Wang; Yulin Chen; Bowen Zhou; Zhiyuan Liu; Maosong Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Recognizing the need for more flexible adaptation, we extend the methodology of LoRA to an innovative approach we call sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
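A minimal PyTorch sketch of the idea as described: a LoRA update with a learnable gate vector between the down- and up-projections, sparsified by an L1 proximal step so that ranks can be pruned during adaptation. Names and initialization details are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SoRALayer(nn.Module):
    """LoRA with a gating vector g on the rank dimension; an L1 proximal
    step drives gate entries to exactly zero, pruning ranks on the fly."""
    def __init__(self, d_in, d_out, r=16):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))        # up-projection
        self.g = nn.Parameter(torch.ones(r))                # rank gates

    def forward(self, x, base_out):
        # base_out is the frozen pretrained layer's output for x.
        return base_out + ((x @ self.A.T) * self.g) @ self.B.T

    @torch.no_grad()
    def prox_l1(self, lam, lr):
        # Soft-thresholding: the proximal operator of lr * lam * ||g||_1.
        self.g.copy_(torch.sign(self.g) *
                     torch.clamp(self.g.abs() - lr * lam, min=0.0))
```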


65, Adapting Language Models to Compress Contexts
Alexis Chevalier; Alexander Wettig; Anirudh Ajith; Danqi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose to adapt pre-trained LMs into AutoCompressors.


66, We Are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields
Jan Philip Wahle; Terry Ruas; Mohamed Abdalla; Bela Gipp; Saif Mohammad;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we quantify the degree of influence between 23 fields of study and NLP (on each other).


67, Universal Self-Adaptive Prompting
Xingchen Wan; Ruoxi Sun; Hootan Nakhost; Hanjun Dai; Julian Eisenschlos; Sercan Arik; Tomas Pfister;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, while highly coveted and the most general setting, zero-shot performance in LLMs is still typically weaker due to the lack of guidance and the difficulty of applying existing automatic prompt design methods to general tasks when ground-truth labels are unavailable. In this study, we address this by presenting Universal Self-Adaptive Prompting (USP), an automatic prompt design approach specifically tailored for zero-shot learning (while compatible with few-shot).


68, Syntactic Substitutability As Unsupervised Dependency Syntax
Jasper Jian; Siva Reddy;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Syntax is a latent hierarchical structure which underpins the robust and compositional nature of human language. In this work, we explore the hypothesis that syntactic dependencies can be represented in language model attention distributions and propose a new method to induce these structures theory-agnostically.


69, Composable Text Controls in Latent Space with ODEs
Guangyi Liu; Zeyu Feng; Yuan Gao; Zichao Yang; Xiaodan Liang; Junwei Bao; Xiaodong He; Shuguang Cui; Zhen Li; Zhiting Hu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper proposes a new efficient approach for composable text operations in the compact latent space of text.


70, CoAnnotating: Uncertainty-Guided Work Allocation Between Human and Large Language Models for Data Annotation
Minzhi Li; Taiwei Shi; Caleb Ziems; Min-Yen Kan; Nancy Chen; Zhengyuan Liu; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose CoAnnotating, a novel paradigm for Human-LLM co-annotation of unstructured texts at scale.


71, Task-Agnostic Low-Rank Adapters for Unseen English Dialects
Zedian Xiao; William Held; Yanchen Liu; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, prior work on dialects struggles to generalize to evolving and emerging dialects in a scalable manner. To fill this gap, our method, HyperLoRA, leverages expert linguistic knowledge to enable resource-efficient adaptation via hypernetworks.


72, Impressions: Visual Semiotics and Aesthetic Impact Understanding
Julia Kruk; Caleb Ziems; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present Impressions, a novel dataset through which to investigate the semiotics of images, and how specific visual features and design choices can elicit specific emotions, thoughts and beliefs.


73, DADA: Dialect Adaptation Via Dynamic Aggregation of Linguistic Rules
Yanchen Liu; William Held; Diyi Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters which handle specific linguistic features.


74, Language and Mental Health: Measures of Emotion Dynamics from Text As Linguistic Biosocial Markers
Daniela Teodorescu; Tiffany Cheng; Alona Fyshe; Saif Mohammad;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Here, for the first time, we study the relationship between tweet emotion dynamics and mental health disorders.


75, Contrastive Learning of Sentence Embeddings from Scratch
Junlei Zhang; Zhenzhong Lan; Junxian He;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Acquiring such training data can be challenging due to copyright restrictions, data distribution issues, and messy formats, among other factors. To address these issues, we present SynCSE, a contrastive learning framework that trains sentence embeddings with synthetic data.


76, Specialist or Generalist? Instruction Tuning for Specific NLP Tasks
Chufan Shi; Yixuan Su; Cheng Yang; Yujiu Yang; Deng Cai;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate whether incorporating broad-coverage generalist instruction tuning can contribute to building a specialist model.


77, Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
Boxin Wang; Wei Ping; Peng Xu; Lawrence McAfee; Zihan Liu; Mohammad Shoeybi; Yi Dong; Oleksii Kuchaiev; Bo Li; Chaowei Xiao; Anima Anandkumar; Bryan Catanzaro;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval? To answer it, we perform a comprehensive study on a scalable pre-trained retrieval-augmented LM (i.e., RETRO) compared with standard GPT and retrieval-augmented GPT incorporated at fine-tuning or inference stages.


78, Exploring The Impact of Model Scaling on Parameter-Efficient Tuning
Yusheng Su; Chi-Min Chan; Jiali Cheng; Yujia Qin; Yankai Lin; Shengding Hu; Zonghan Yang; Ning Ding; Xingzhi Sun; Guotong Xie; Zhiyuan Liu; Maosong Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Hence, we hypothesize that model scaling mitigates the impact of design differences on PET methods. To investigate this hypothesis, we introduce a more flexible PET method called Arbitrary PET (APET).


79, Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and The Case of Information Extraction
Martin Josifoski; Marija Sakota; Maxime Peyrard; Robert West;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This work shows that useful data can be synthetically generated even for tasks that cannot be solved directly by LLMs: for problems with structured outputs, it is possible to prompt an LLM to perform the task in the reverse direction, by generating plausible input text for a target output structure.


80, Spoiler Detection As Semantic Text Matching
Ryan Tran; Canwen Xu; Julian McAuley;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This is primarily because the definition of a spoiler varies depending on the viewer's progress in the show, and conventional spoiler detection methods lack the granularity to capture this complexity. To tackle this challenge, we propose the task of spoiler matching, which involves assigning an episode number to a spoiler given a specific TV show.


81, InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions
Bodhisattwa Majumder; Zexue He; Julian McAuley;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We explore two interactive setups with a frozen predictive model and show that users who can provide feedback achieve a better and fairer balance between task performance and bias mitigation.


82, Editing Common Sense in Transformers
Anshita Gupta; Debanjan Mondal; Akshay Sheshadri; Wenlong Zhao; Xiang Li; Sarah Wiegreffe; Niket Tandon;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate whether commonsense judgments are causally associated with localized, editable parameters in Transformers, and we provide an affirmative answer.


83, Aligning Large Language Models Through Synthetic Feedback
Sungdong Kim; Sanghwan Bae; Jamin Shin; Soyoung Kang; Donghyun Kwak; Kang Yoo; Minjoon Seo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose a novel alignment learning framework with synthetic feedback not dependent on extensive human annotations and proprietary LLMs.


84, Three Stream Based Multi-level Event Contrastive Learning for Text-Video Event Extraction
Jiaqi Li; Chuanyi Zhang; Miaozeng Du; Dehai Min; Yongrui Chen; Guilin Qi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We observe that the same event triggers correspond to similar motion trajectories, which are hardly affected by the background noise. Motivated by this, we propose a Three Stream Multimodal Event Extraction framework (TSEE) that simultaneously utilizes the features of text sequence and video appearance, as well as the motion representations, to enhance the event extraction capacity.


85, SOUL: Towards Sentiment and Opinion Understanding of Language
Yue Deng; Wenxuan Zhang; Sinno Pan; Lidong Bing;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, despite the success of pre-trained language models in this area, they often fall short of capturing the broader complexities of sentiment analysis. To address this issue, we propose a new task called Sentiment and Opinion Understanding of Language (SOUL).


86, KNN-LM Does Not Improve Open-ended Text Generation
Shufan Wang; Yixiao Song; Andrew Drozdov; Aparna Garimella; Varun Manjunatha; Mohit Iyyer;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we study the generation quality of interpolation-based retrieval-augmented language models (LMs).
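For context, interpolation-based retrieval-augmented LMs of the kNN-LM family compute the next-token distribution roughly as follows (a NumPy sketch; the datastore lookup is assumed to have already returned neighbor distances and continuation token ids):

```python
import numpy as np

def knn_lm(p_lm, neighbor_dists, neighbor_tokens, vocab_size, lam=0.25, t=1.0):
    # p_knn: softmax over negative distances to retrieved datastore entries,
    # with mass accumulated onto each neighbor's continuation token.
    w = np.exp(-np.asarray(neighbor_dists, dtype=float) / t)
    w /= w.sum()
    p_knn = np.zeros(vocab_size)
    for weight, tok in zip(w, neighbor_tokens):
        p_knn[tok] += weight
    # Final distribution: p = lam * p_knn + (1 - lam) * p_lm.
    return lam * p_knn + (1.0 - lam) * np.asarray(p_lm)
```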


87, The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models Via Chain-of-Thought Fine-Tuning
Seungone Kim; Se Joo; Doyoung Kim; Joel Jang; Seonghyeon Ye; Jamin Shin; Minjoon Seo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we aim to equip smaller LMs with the step-by-step reasoning capability by instruction tuning with CoT rationales.


88, PTP: Boosting Stability and Performance of Prompt Tuning with Perturbation-Based Regularizer
Lichang Chen; Jiuhai Chen; Heng Huang; Minhao Cheng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose a new algorithm, called Prompt Tuning with Perturbation-based regularizer (PTP), which can not only alleviate training instability dramatically but also boost the performance of prompt tuning.


89, Explore-Instruct: Enhancing Domain-Specific Instruction Coverage Through Active Exploration
Fanqi Wan; Xinting Huang; Tao Yang; Xiaojun Quan; Wei Bi; Shuming Shi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, existing data employed for such tuning often exhibit an inadequate coverage of individual domains, limiting the scope for nuanced comprehension and interactions within these areas. To address this deficiency, we propose Explore-Instruct, a novel approach to enhance the data coverage to be used in domain-specific instruction-tuning through active exploration via Large Language Models (LLMs).


90, TheoremQA: A Theorem-driven Question Answering Dataset
Wenhu Chen; Ming Yin; Max Ku; Pan Lu; Yixin Wan; Xueguang Ma; Jianyu Xu; Xinyi Wang; Tony Xia;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce TheoremQA, the first theorem-driven question-answering dataset designed to evaluate AI models' capabilities to apply theorems to solve challenging science problems.


91, MemeCap: A Dataset for Captioning and Interpreting Memes
EunJeong Hwang; Vered Shwartz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present the task of meme captioning and release a new dataset, MemeCap.


92, Building Real-World Meeting Summarization Systems Using Large Language Models: A Practical Perspective
Md Tahmid Rahman Laskar; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper studies how to effectively build meeting summarization systems for real-world usage using large language models (LLMs).


93, Character-LLM: A Trainable Agent for Role-Playing
Yunfan Shao; Linyang Li; Junqi Dai; Xipeng Qiu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, and Julius Caesar.


94, Sparse Universal Transformer
Shawn Tan; Yikang Shen; Zhenfang Chen; Aaron Courville; Chuang Gan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This is mainly because scaling UT parameters is more compute- and memory-intensive than scaling up a VT. This paper proposes the Sparse Universal Transformer (SUT), which leverages Sparse Mixture of Experts (SMoE) to reduce UT's computational complexity while retaining its parameter efficiency and generalization ability.


95, Larger Probes Tell A Different Story: Extending Psycholinguistic Datasets Via In-Context Learning
Namrata Shivagunde; Vladislav Lialin; Anna Rumshisky;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we introduce new, larger datasets for negation (NEG-1500-SIMP) and role reversal (ROLE-1500) inspired by psycholinguistic studies.


96, RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation
Yue Zhang; Leyang Cui; Enbo Zhao; Wei Bi; Shuming Shi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce RobustGEC, a benchmark designed to evaluate the context robustness of GEC systems.


97, Symbolic Planning and Code Generation for Grounded Dialogue
Justin Chiu; Wenting Zhao; Derek Chen; Saujas Vaduguru; Alexander Rush; Daniel Fried;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, LLMs have had limited applicability in grounded task-oriented dialogue as they are difficult to steer toward task objectives and fail to handle novel grounding. We present a modular and interpretable grounded dialogue system that addresses these shortcomings by composing LLMs with a symbolic planner and grounded code execution.


98, Outlier Suppression+: Accurate Quantization of Large Language Models By Equivalent and Effective Shifting and Scaling
Xiuying Wei; Yunchen Zhang; Yuhang Li; Xiangguo Zhang; Ruihao Gong; Jinyang Guo; Xianglong Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We observe that these outliers are concentrated in specific channels and are asymmetric across channels. To address this issue, we propose the Outlier Suppression+ (OS+) framework, which contains the channel-wise shifting for asymmetry and channel-wise scaling for concentration.
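A numeric sketch of the two operations named here, under the simplifying assumption of per-tensor symmetric quantization after the channel-wise transform (in the paper, the shift and scale are migrated into adjacent layers so the network output is unchanged):

```python
import numpy as np

def shift_scale_quantize(x: np.ndarray, n_bits=8):
    # x: (tokens, channels) activations with asymmetric, concentrated outliers.
    z = (x.max(axis=0) + x.min(axis=0)) / 2       # channel-wise shift (asymmetry)
    s = np.abs(x - z).max(axis=0).clip(min=1e-5)  # channel-wise scale (concentration)
    x_eq = (x - z) / s                            # equivalent outlier-suppressed tensor
    step = 1.0 / (2 ** (n_bits - 1) - 1)
    x_q = np.round(x_eq / step) * step            # uniform symmetric quantization
    return x_q, z, s                              # z, s get folded into other layers
```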


99, Controlling Pre-trained Language Models for Grade-Specific Text Simplification
Sweta Agrawal; Marine Carpuat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we conduct an empirical study to understand how different control mechanisms impact the adequacy and simplicity of text simplification systems.


100, Do All Languages Cost The Same? Tokenization in The Era of Commercial Language Models
Orevaoghene Ahia; Sachin Kumar; Hila Gonen; Jungo Kasai; David Mortensen; Noah Smith; Yulia Tsvetkov;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: What constitutes a token, however, is training data and model dependent with a large variance in the number of tokens required to convey the same information in different languages. In this work, we analyze the effect of this non-uniformity on the fairness of an API's pricing policy across languages.


101, Evaluating Evaluation Metrics: A Framework for Analyzing NLG Evaluation Metrics Using Measurement Theory
Ziang Xiao; Susu Zhang; Vivian Lai; Q. Vera Liao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Recognizing the limitations of existing automatic metrics and the noise in how current human evaluation is conducted, we propose MetricEval, a framework informed by measurement theory, the foundation of educational test design, for conceptualizing and evaluating the reliability and validity of NLG evaluation metrics.


102, Improving Language Models' Meaning Understanding and Consistency By Learning Conceptual Roles from Dictionary
Myeongjun Jang; Thomas Lukasiewicz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To this end, we propose a practical approach that alleviates the inconsistent behaviour issue by fundamentally improving PLMs' meaning awareness.


103, Consistency Analysis of ChatGPT
Myeongjun Jang; Thomas Lukasiewicz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour, focusing specifically on semantic consistency and the properties of negation, symmetric, and transitive consistency.


104, Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs Without Fine-tuning
Ximing Lu; Faeze Brahman; Peter West; Jaehun Jung; Khyathi Chandu; Abhilasha Ravichander; Prithviraj Ammanabrolu; Liwei Jiang; Sahana Ramnath; Nouha Dziri; Jillian Fisher; Bill Lin; Skyler Hallinan; Lianhui Qin; Xiang Ren; Sean Welleck; Yejin Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose Inference-time Policy Adapters (IPA), which efficiently tailors a language model such as GPT-3 without fine-tuning it.


105, KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
Sehyun Choi; Tianqing Fang; Zhaowei Wang; Yangqiu Song;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Unfortunately, this method incurs high training costs and may cause catastrophic forgetting for multi-tasking models. To overcome these limitations, we propose a knowledge-constrained decoding method called KCTS (Knowledge-Constrained Tree Search), which guides a frozen LM to generate text aligned with the reference knowledge at each decoding step using a knowledge classifier score and MCTS (Monte-Carlo Tree Search).


106, Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination By Evaluation Benchmarks
Alon Jacovi; Avi Caciularu; Omer Goldman; Yoav Goldberg;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Assuming that all relevant actors value clean test data and will cooperate to mitigate data contamination, what can be done? We propose three strategies that can make a difference: (1) Test data made public should be encrypted with a public key and licensed to disallow derivative distribution; (2) demand training exclusion controls from closed API holders, and protect your test data by refusing to evaluate without them; (3) avoid data which appears with its solution on the internet, and release the web-page context of internet-derived data along with the data.


107, Focus Your Attention (with Adaptive IIR Filters)
Shahar Lutati; Itamar Zimerman; Lior Wolf;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present a new layer in which dynamic (i.e., input-dependent) Infinite Impulse Response (IIR) filters of order two are used to process the input sequence prior to applying conventional attention.


108, DetGPT: Detect What You Need Via Reasoning
Renjie Pi; Jiahui Gao; Shizhe Diao; Rui Pan; Hanze Dong; Jipeng Zhang; Lewei Yao; Jianhua Han; Hang Xu; Lingpeng Kong; Tong Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce a new paradigm for object detection that we call reasoning-based object detection.


109, Q2d: Turning Questions Into Dialogs to Teach Models How to Search
Yonatan Bitton; Shlomi Cohen-Ganor; Ido Hakimi; Yoad Lewenberg; Roee Aharoni; Enav Weinreb;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions.


110, ReTAG: Reasoning Aware Table to Analytic Text Generation
Deepanway Ghosal; Preksha Nema; Aravindan Raghuveer;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Through analysis of popular table-to-text benchmarks (ToTTo (Parikh et al., 2020) and InfoTabs (Gupta et al., 2020)), we observe that generating the ideal summary requires multiple types of reasoning, coupled with access to knowledge beyond the scope of the table. To address this gap, we propose ReTAG, a table and reasoning aware model that uses vector quantization to infuse different types of analytical reasoning into the output.


111, MoT: Memory-of-Thought Enables ChatGPT to Self-Improve
Xiaonan Li; Xipeng Qiu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose a framework, MoT, to let the LLM self-improve through Memory-of-Thoughts, without annotated datasets and parameter updates.


112, UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation
Daixuan Cheng; Shaohan Huang; Junyu Bi; Yuefeng Zhan; Jianfeng Liu; Yujing Wang; Hao Sun; Furu Wei; Weiwei Deng; Qi Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose UPRISE (Universal Prompt Retrieval for Improving zero-Shot Evaluation), which tunes a lightweight and versatile retriever that automatically retrieves prompts for a given zero-shot task input.


113, Do Transformers Parse While Predicting The Masked Word?
Haoyu Zhao; Abhishek Panigrahi; Rong Ge; Sanjeev Arora;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Some doubts have been raised about whether the models are doing parsing or only some computation weakly correlated with it. Concretely: (a) Is it possible to explicitly describe transformers with realistic embedding dimensions, numbers of heads, etc. that are capable of doing parsing, or even approximate parsing? (b) Why do pre-trained models capture parsing structure? This paper takes a step toward answering these questions in the context of generative modeling with PCFGs. We show that masked language models like BERT or RoBERTa of moderate sizes can approximately execute the Inside-Outside algorithm for the English PCFG (Marcus et al., 1993).


114, SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Hyunwoo Kim; Jack Hessel; Liwei Jiang; Peter West; Ximing Lu; Youngjae Yu; Pei Zhou; Ronan Bras; Malihe Alikhani; Gunhee Kim; Maarten Sap; Yejin Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Data scarcity has been a long-standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale, high-quality social dialogue dataset.


115, Explaining with Contrastive Phrasal Highlighting: A Case Study in Assisting Humans to Detect Translation Differences
Eleftheria Briakou; Navita Goyal; Marine Carpuat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce a technique to generate contrastive phrasal highlights that explain the predictions of a semantic divergence model via phrase alignment guided erasure.


116, APrompt: Attention Prompt Tuning for Efficient Adaptation of Pre-trained Language Models
Qifan Wang; Yuning Mao; Jingang Wang; Hanchao Yu; Shaoliang Nie; Sinong Wang; Fuli Feng; Lifu Huang; Xiaojun Quan; Zenglin Xu; Dongfang Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose a novel Attention Prompt tuning method, namely APrompt, for efficient adaptation of pre-trained language models.


117, Cognitive Dissonance: Why Do Language Model Outputs Disagree with Internal Representations of Truthfulness?
Kevin Liu; Stephen Casper; Dylan Hadfield-Menell; Jacob Andreas;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We identify three different classes of disagreement, which we term confabulation, deception, and heterogeneity.


118, Can We Edit Multimodal Large Language Models?
Siyuan Cheng; Bozhong Tian; Qingbin Liu; Xi Chen; Yongheng Wang; Huajun Chen; Ningyu Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we focus on editing multimodal Large Language Models (LLMs).


119, Active Instruction Tuning: Improving Cross-Task Generalization By Training on Prompt Sensitive Tasks
Po-Nien Kung; Fan Yin; Di Wu; Kai-Wei Chang; Nanyun Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We discover that training on ambiguous (prompt-uncertain) tasks improves generalization while training on difficult (prompt-certain and low-probability) tasks offers no benefit, underscoring the importance of task selection for instruction tuning.


120, HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
David Dale; Elena Voita; Janice Lam; Prangthip Hansanti; Christophe Ropers; Elahe Kalbassi; Cynthia Gao; Loic Barrault; Marta Costa-jussà;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we release an annotated dataset for the hallucination and omission phenomena covering 18 translation directions with varying resource levels and scripts.


121, Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies
Zhengxuan Wu; Alex Tamkin; Isabel Papadimitriou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To disentangle the impact of different factors like syntactic similarity and vocabulary similarity, we propose a set of controlled transfer studies: we systematically transform the language of the GLUE benchmark, altering one axis of crosslingual variation at a time, and then measure the resulting drops in a pretrained model's downstream performance.


122, UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry; Parsa Kavehzadeh; Do Long; Enamul Hoque; Shafiq Joty;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose several chart-specific pretraining tasks that include: (i) low-level tasks to extract the visual elements (e.g., bars, lines) and data from charts, and (ii) high-level tasks to acquire chart understanding and reasoning skills.


123, Reformulating NLP Tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia
Dimitris Gkoumas; Matthew Purver; Maria Liakata;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Here, we automatically learn linguistic disorder patterns by making use of a moderately-sized pre-trained language model and forcing it to focus on reformulated natural language processing (NLP) tasks and associated linguistic patterns.


124, A Digital Language Coherence Marker for Monitoring Dementia
Dimitris Gkoumas; Adam Tsakalidis; Maria Liakata;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Here we propose methods to capture language coherence as a cost-effective, human-interpretable digital marker for monitoring cognitive changes in people with dementia.


125, Ties Matter: Meta-Evaluating Modern Metrics with Pairwise Accuracy and Tie Calibration
Daniel Deutsch; George Foster; Markus Freitag;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose instead to meta-evaluate metrics with a version of pairwise accuracy that gives metrics credit for correctly predicting ties, in combination with a tie calibration procedure that automatically introduces ties into metric scores, enabling fair comparison between metrics that do and do not predict ties.
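To make the tie-credit idea concrete, here is a minimal sketch (our reading, not the released implementation) of pairwise accuracy that counts a pair as correct when the metric and the human agree on the ordering, with a threshold eps turning near-equal metric scores into ties; tie calibration then amounts to searching over eps:

```python
from itertools import combinations

def sign_with_ties(delta, eps=0.0):
    if abs(delta) <= eps:
        return 0                       # difference small enough: call it a tie
    return 1 if delta > 0 else -1

def tie_aware_pairwise_accuracy(human, metric, eps):
    pairs = list(combinations(range(len(human)), 2))
    correct = sum(
        sign_with_ties(human[i] - human[j]) ==
        sign_with_ties(metric[i] - metric[j], eps)
        for i, j in pairs
    )
    return correct / len(pairs)

human  = [3, 3, 1, 2]                  # human scores; ties are meaningful
metric = [0.71, 0.70, 0.20, 0.55]
# a crude grid search standing in for the paper's tie-calibration procedure
best_eps = max((e / 100 for e in range(0, 20)),
               key=lambda e: tie_aware_pairwise_accuracy(human, metric, e))
```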


126, ReCEval: Evaluating Reasoning Chains Via Correctness and Informativeness
Archiki Prasad; Swarnadeep Saha; Xiang Zhou; Mohit Bansal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Specifically, we propose ReCEval (Reasoning Chain Evaluation), a framework that evaluates reasoning chains via two key properties: (1) correctness, i.e., each step makes a valid inference based on information contained within the step, preceding steps, and input context, and (2) informativeness, i.e., each step provides new information that is helpful towards deriving the generated answer.


127, Skill-Based Few-Shot Selection for In-Context Learning
Shengnan An; Bo Zhou; Zeqi Lin; Qiang Fu; Bei Chen; Nanning Zheng; Weizhu Chen; Jian-Guang Lou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose **Skill-KNN**, a skill-based few-shot selection method for in-context learning.


128, IfQA: A Dataset for Open-domain Question Answering Under Counterfactual Presuppositions
Wenhao Yu; Meng Jiang; Peter Clark; Ashish Sabharwal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Although counterfactual reasoning is a fundamental aspect of intelligence, the lack of large-scale counterfactual open-domain question-answering (QA) benchmarks makes it difficult to evaluate and improve models on this ability. To address this void, we introduce the first such dataset, named IfQA, where each question is based on a counterfactual presupposition via an "if" clause.


129, Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting
Xi Ye; Greg Durrett;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.


130, GlobalBench: A Benchmark for Global Progress in Natural Language Processing
Yueqi Song; Simran Khanuja; Pengfei Liu; Fahim Faisal; Alissa Ostapenko; Genta Winata; Alham Aji; Samuel Cahyawijaya; Yulia Tsvetkov; Antonios Anastasopoulos; Graham Neubig;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To track and further incentivize the global development of equitable language technology, we introduce GlobalBench.


131, UDAPDR: Unsupervised Domain Adaptation Via LLM Prompting and Distillation of Rerankers
Jon Saad-Falcon; Omar Khattab; Keshav Santhanam; Radu Florian; Martin Franz; Salim Roukos; Avirup Sil; Md Sultan; Christopher Potts;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, such datasets are often unavailable, and their utility for real-world applications can diminish quickly due to domain shifts. To address this challenge, we develop and motivate a method for using large language models (LLMs) to generate large numbers of synthetic queries cheaply.


132, Byte Pair Encoding for Symbolic Music
Nathan Fradet; Nicolas Gutowski; Fabien Chhel; Jean-Pierre Briot;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we show that Byte Pair Encoding, a compression technique widely used for natural language, significantly decreases the sequence length while increasing the vocabulary size.
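For readers who know BPE only from text, the toy loop below shows the mechanism on symbolic-music sequences (illustrative only; the Pitch_/Vel_/Dur_ token names are hypothetical stand-ins for a real music vocabulary): each merge replaces the most frequent adjacent pair with a new token, which is exactly what shortens sequences while growing the vocabulary:

```python
from collections import Counter

def bpe_merge_once(seqs):
    """One BPE step: merge the most frequent adjacent token pair."""
    pairs = Counter(p for s in seqs for p in zip(s, s[1:]))
    if not pairs:
        return seqs, None
    (a, b), _ = pairs.most_common(1)[0]
    merged = a + "+" + b               # the new vocabulary entry
    out = []
    for s in seqs:
        t, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and (s[i], s[i + 1]) == (a, b):
                t.append(merged); i += 2
            else:
                t.append(s[i]); i += 1
        out.append(t)
    return out, merged

seqs = [["Pitch_60", "Vel_90", "Dur_8", "Pitch_64", "Vel_90", "Dur_8"]]
for _ in range(3):                     # a few merges; real training does many
    seqs, new_token = bpe_merge_once(seqs)
```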


133, Incorporating Structured Representations Into Pretrained Vision & Language Models Using Scene Graphs
Roei Herzig; Alon Mendelson; Leonid Karlinsky; Assaf Arbelle; Rogerio Feris; Trevor Darrell; Amir Globerson;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Here we ask whether small SG datasets can provide sufficient information for enhancing structured understanding of pretrained VLMs. We show that it is indeed possible to improve VLMs when learning from SGs by integrating components that incorporate structured information into both visual and textual representations.


134, Can We Edit Factual Knowledge By In-Context Learning?
Ce Zheng; Lei Li; Qingxiu Dong; Yuxuan Fan; Zhiyong Wu; Jingjing Xu; Baobao Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Inspired by in-context learning (ICL), a new paradigm based on demonstration contexts without parameter updating, we explore whether ICL can edit factual knowledge.
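A hedged sketch of what editing-by-ICL can look like (the prompt wording and helper below are ours, not the paper's): the edited fact and a few demonstrations are placed in the context of a frozen model, so no parameters are updated:

```python
def build_edit_prompt(new_fact, demos, query):
    """Compose an in-context knowledge-editing prompt (hypothetical format)."""
    lines = [f"New fact: {f}\nQ: {q}\nA: {a}" for f, q, a in demos]
    lines.append(f"New fact: {new_fact}\nQ: {query}\nA:")
    return "\n\n".join(lines)

demos = [("The capital of X is Y.", "What is the capital of X?", "Y")]
prompt = build_edit_prompt(
    "The president of Acme Corp is Jane Doe.",   # the edited fact (made up)
    demos,
    "Who is the president of Acme Corp?",
)
# `prompt` would then be sent to a frozen LLM; its weights never change.
```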


135, Merging Experts Into One: Improving Computational Efficiency of Mixture of Experts
Shwai He; Run-Ze Fan; Liang Ding; Li Shen; Tianyi Zhou; Dacheng Tao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Can we retain the advantages of adding more experts without substantially increasing the computational costs? In this paper, we first demonstrate the superiority of selecting multiple experts and then propose a computation-efficient approach called Merging Experts into One (MEO), which reduces the computation cost to that of a single expert.


136, SummEdits: Measuring LLM Ability at Factual Reasoning Through The Lens of Summarization
Philippe Laban; Wojciech Kryscinski; Divyansh Agarwal; Alexander Fabbri; Caiming Xiong; Shafiq Joty; Chien-Sheng Wu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, a closer analysis reveals issues with existing evaluation benchmarks, affecting evaluation precision. To address this, we propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.


137, Mitigating Temporal Misalignment By Discarding Outdated Facts
Michael Zhang; Eunsol Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To mitigate the effects of temporal misalignment, we propose fact duration prediction: the task of predicting how long a given fact will remain true.


138, Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models
Daman Arora; Himanshu Singh; Mausam;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In response, we present JEEBench, a considerably more challenging benchmark dataset for evaluating the problem solving abilities of LLMs.


139, Noisy Exemplars Make Large Language Models More Robust: A Domain-Agnostic Behavioral Analysis
Hongyi Zheng; Abulhair Saparov;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, there is little existing work investigating the robustness of LLMs with few-shot prompting techniques. Therefore, we introduce a systematic approach to test the robustness of LLMs in multi-hop reasoning tasks via domain-agnostic perturbations.


140, Conversational Semantic Parsing Using Dynamic Context Graphs
Parag Jain; Mirella Lapata;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper we consider the task of conversational semantic parsing over general-purpose knowledge graphs (KGs) with millions of entities and thousands of relation types.


141, Enhancing Textbooks with Visuals from The Web for Improved Learning
Janvijay Singh; Vilém Zouhar; Mrinmaya Sachan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we investigate the effectiveness of vision-language models to automatically enhance textbooks with images from the web.


142, Enhancing Biomedical Lay Summarisation with External Knowledge Graphs
Tomas Goldsack; Zhihao Zhang; Chen Tang; Carolina Scarton; Chenghua Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Using both automatic and human evaluations, we systematically investigate the effectiveness of three different approaches for incorporating knowledge graphs within lay summarisation models, with each method targeting a distinct area of the encoder-decoder model architecture.


143, 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
Zehan Wang; Haifeng Huang; Yang Zhao; Linjun Li; Xize Cheng; Yichen Zhu; Aoxiong Yin; Zhou Zhao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose a relation-aware one-stage framework, named 3D Relative Position-aware Network (3DRP-Net), which can effectively capture the relative spatial relationships between objects and enhance object attributes.


144, Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation
Da Yin; Xiao Liu; Fan Yin; Ming Zhong; Hritik Bansal; Jiawei Han; Kai-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data.


145, MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter
Zhiyuan Liu; Sihang Li; Yanchen Luo; Hao Fei; Yixin Cao; Kenji Kawaguchi; Xiang Wang; Tat-Seng Chua;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, they inherently lack 2D graph perception, a critical ability of human professionals in comprehending molecules' topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter.


146, GD-COMET: A Geo-Diverse Commonsense Inference Model
Mehar Bhatia; Vered Shwartz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present GD-COMET, a geo-diverse version of the COMET commonsense inference model.


147, Crossing The Threshold: Idiomatic Machine Translation Through Retrieval Augmentation and Loss Weighting
Emmy Liu; Aditi Chaudhary; Graham Neubig;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To improve translation of natural idioms, we introduce two straightforward yet effective techniques: the strategic upweighting of training loss on potentially idiomatic sentences, and using retrieval-augmented models.


148, Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi; Grzegorz Chrupala; Willem Zuidema; Afra Alishahi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Transformers have become a key architecture in speech processing, but our understanding of how they build up representations of acoustic and linguistic structure is limited. In this study, we address this gap by investigating how measures of "context-mixing" developed for text models can be adapted and applied to models of spoken language.


149, Non-autoregressive Streaming Transformer for Simultaneous Translation
Zhengrui Ma; Shaolei Zhang; Shoutao Guo; Chenze Shao; Min Zhang; Yang Feng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To address those issues, we propose non-autoregressive streaming Transformer (NAST) which comprises a unidirectional encoder and a non-autoregressive decoder with intra-chunk parallelism.


150, SeqXGPT: Sentence-Level AI-Generated Text Detection
Pengyu Wang; Linyang Li; Ke Ren; Botian Jiang; Dong Zhang; Xipeng Qiu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT based on convolution and self-attention networks.


151, Knowledge-Augmented Language Model Verification
Jinheon Baek; Soyeong Jeong; Minki Kang; Jong Park; Sung Hwang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To overcome these, we propose to verify the output and the knowledge of the knowledge-augmented LMs with a separate verifier, which is a small LM that is trained to detect those two types of errors through instruction-finetuning.


152, Explicit Planning Helps Language Models in Logical Reasoning
Hongyu Zhao; Kangrui Wang; Mo Yu; Hongyuan Mei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose LEAP, a novel system that uses language models to perform multi-step logical reasoning and incorporates explicit planning into the inference procedure.


153, API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs
Minghao Li; Yingxiu Zhao; Bowen Yu; Feifan Song; Hangyu Li; Haiyang Yu; Zhoujun Li; Fei Huang; Yongbin Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs' ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a groundbreaking benchmark, specifically designed for tool-augmented LLMs.


154, Towards Interpretable Mental Health Analysis with Large Language Models
Kailai Yang; Shaoxiong Ji; Tianlin Zhang; Qianqian Xie; Ziyan Kuang; Sophia Ananiadou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, existing relevant studies bear several limitations, including inadequate evaluations, a lack of prompting strategies, and little exploration of LLMs for explainability. To bridge these gaps, we comprehensively evaluate the mental health analysis and emotional reasoning ability of LLMs on 11 datasets across 5 tasks.


155, Label Words Are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Lean Wang; Lei Li; Damai Dai; Deli Chen; Hao Zhou; Fandong Meng; Jie Zhou; Xu Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate the working mechanism of ICL through an information flow lens.


156, A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents
Benjamin Newman; Luca Soldaini; Raymond Fok; Arman Cohan; Kyle Lo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we use language models to rewrite snippets from scientific documents to be read on their own.


157, Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Kent Chang; Mackenzie Cramer; Sandeep Soni; David Bamman;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query.


158, Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media
Shubham Mittal; Megha Sundriyal; Preslav Nakov;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Despite its importance to journalists and human fact-checkers, it remains a severely understudied problem, and the scarce research on this topic so far has only focused on English. Here we aim to bridge this gap by creating a novel dataset, X-CLAIM, consisting of 7K real-world claims collected from numerous social media platforms in five Indian languages and English.


159, Detecting Propaganda Techniques in Code-Switched Social Media Text
Muhammad Salman; Asif Hanif; Shady Shehata; Preslav Nakov;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Code-switching combines different languages within the same text, which poses a challenge for automatic systems. Considering this premise, we propose a novel task of detecting propaganda techniques in code-switched text.


160, LLM-powered Data Augmentation for Enhanced Cross-lingual Performance
Chenxi Whitehouse; Monojit Choudhury; Alham Aji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited.


161, On The Challenges of Using Black-Box APIs for Toxicity Evaluation in Research
Luiza Pozzobon; Beyza Ermis; Patrick Lewis; Sara Hooker;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Our findings suggest that research that relied on inherited automatic toxicity scores to compare models and techniques may have resulted in inaccurate findings.


162, Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization
Tianshi Che; Ji Liu; Yang Zhou; Jiaxiang Ren; Jiwen Zhou; Victor Sheng; Huaiyu Dai; Dejing Dou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper proposes a Parameter-efficient prompt Tuning approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and effective FL of LLMs.


163, Appraising The Potential Uses and Harms of LLMs for Medical Systematic Reviews
Hye Yun; Iain Marshall; Thomas Trikalinos; Byron Wallace;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We conducted 16 interviews with international systematic review experts to characterize the perceived utility and risks of LLMs in the specific context of medical evidence reviews.


164, Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU
Fajri Koto; Nurul Aisyah; Haonan Li; Timothy Baldwin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we introduce IndoMMLU, the first multi-task language understanding benchmark for Indonesian culture and languages, which consists of questions from primary school to university entrance exams in Indonesia.


165, IEKG: A Commonsense Knowledge Graph for Idiomatic Expressions
Ziheng Zeng; Kellen Cheng; Srihari Nanniyur; Jianing Zhou; Suma Bhat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Unlike prior works that enable IE comprehension through fine-tuning PTLMs with sentences containing IEs, in this work, we construct IEKG, a commonsense knowledge graph for figurative interpretations of IEs.


166, Generating Data for Symbolic Language with Large Language Models
Jiacheng Ye; Chengzu Li; Lingpeng Kong; Tao Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose SymGen which utilizes LLMs for generating various annotation-expensive symbolic language data.


167, Text Encoders Bottleneck Compositionality in Contrastive Vision-language Models
Amita Kamath; Jack Hessel; Kai-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We first curate CompPrompts, a set of increasingly compositional image captions that VL models should be able to capture (e.g., single object, to object+property, to multiple interacting objects). Then, we train text-only recovery probes that aim to reconstruct captions from single-vector text representations produced by several VL models.


168, What's "up" with Vision-language Models? Investigating Their Struggle with Spatial Reasoning
Amita Kamath; Jack Hessel; Kai-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We evaluate 18 VL models, finding that all perform poorly, e.g., BLIP finetuned on VQAv2, which nears human parity on VQAv2, achieves 56% accuracy on our benchmarks vs. humans at 99%.


169, An Integrated Search System for Korea Weather Data
Jinkyung Jo; Dayeon Ki; Soyoung Yoon; Minjoon Seo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce WeatherSearch, an integrated search system deployed at the Korea Meteorological Administration (KMA).


170, Guideline Learning for In-Context Information Extraction
Chaoxu Pang; Yixuan Cao; Qiang Ding; Ping Luo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a Guideline Learning (GL) framework for In-context IE which reflectively learns and follows guidelines.


171, RainProof: An Umbrella to Shield Text Generator from Out-Of-Distribution Data
Maxime Darrin; Pablo Piantanida; Pierre Colombo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we focus on leveraging soft-probabilities in a black-box framework, i.e., we can access the soft-predictions but not the internal states of the model.


172, Continually Improving Extractive QA Via Human Feedback
Ge Gao; Hung-Ting Chen; Yoav Artzi; Eunsol Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We study continually improving an extractive question answering (QA) system via human user feedback.


173, BERTie Bott's Every Flavor Labels: A Tasty Introduction to Semantic Role Labeling for Galician
Micaella Bruton; Meriem Beloucif;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we leverage existing corpora, WordNet, and dependency parsing to build the first Galician dataset for training semantic role labeling systems in an effort to expand available NLP resources.


174, Clembench: Using Game Play to Evaluate Chat-Optimized Language Models As Conversational Agents
Kranti Chalamalasetti; Jana Götze; Sherzod Hakimov; Brielen Madureira; Philipp Sadler; David Schlangen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: As a proof of concept, this paper investigates five interaction settings, showing that current chat-optimised LLMs are, to an extent, capable of following game-play instructions.


175, Distance-Based Propagation for Efficient Knowledge Graph Reasoning
Harry Shomer; Yao Ma; Juanhui Li; Bo Wu; Charu Aggarwal; Jiliang Tang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Though there are a few recent attempts to address this through learnable path pruning, they often sacrifice the performance to gain efficiency. In this work, we identify two intrinsic limitations of these methods that affect the efficiency and representation quality.


176, Multilingual Pixel Representations for Translation and Effective Cross-lingual Transfer
Elizabeth Salesky; Neha Verma; Philipp Koehn; Matt Post;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce and demonstrate how to effectively train multilingual machine translation models with pixel representations.


177, Hybrid Inverted Index Is A Robust Accelerator for Dense Retrieval
Peitian Zhang; Zheng Liu; Shitao Xiao; Zhicheng Dou; Jing Yao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present the Hybrid Inverted Index (HI2), where the embedding clusters and salient terms work collaboratively to accelerate dense retrieval.


178, Harnessing LLMs for Temporal Data - A Study on Explainable Financial Time Series Forecasting
Xinli Yu; Zheng Chen; Yanbin Lu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: The study demonstrates LLMs' ability to generate well-reasoned decisions by leveraging cross-sequence information and extracting insights from text and price time series.


179, MAggretriever: A Simple Yet Effective Approach to Zero-Shot Multilingual Dense Retrieval
Sheng-Chieh Lin; Amin Ahmad; Jimmy Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we introduce mAggretriever, which effectively leverages semantic and lexical features from pre-trained multilingual transformers (e.g., mBERT and XLM-R) for dense retrieval.


180, Indicative Summarization of Long Discussions
Shahbaz Syed; Dominik Schwabe; Khalid Khatib; Martin Potthast;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper presents a novel unsupervised approach that uses large language models (LLMs) to generate indicative summaries for long discussions that essentially serve as tables of contents.


181, A Training-Free Debiasing Framework with Counterfactual Reasoning for Conversational Emotion Detection
Geng Tu; Ran Jing; Bin Liang; Min Yang; Kam-Fai Wong; Ruifeng Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, previous studies in ERC generally focus on capturing context-sensitive and speaker-sensitive dependencies, ignoring unintended dataset biases, which hampers generalization and fairness in ERC. To address this issue, we propose a Training-Free Debiasing framework (TFD) that operates during prediction without additional training.


182, A Mechanistic Interpretation of Arithmetic Reasoning in Language Models Using Causal Mediation Analysis
Alessandro Stolfo; Yonatan Belinkov; Mrinmaya Sachan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture. In order to improve our understanding of this aspect of language models, we present a mechanistic interpretation of Transformer-based LMs on arithmetic questions using a causal mediation analysis framework.


183, Let's Sample Step By Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
Pranjal Aggarwal; Aman Madaan; Yiming Yang; Mausam;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In response, we introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question using a lightweight stopping criterion.
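The sketch below shows the general shape of such adaptive sampling (simplified; the paper's stopping criterion is more principled than the fixed-margin rule assumed here): instead of always drawing a fixed self-consistency budget, sampling stops once one answer dominates:

```python
import random
from collections import Counter

def adaptive_consistency(sample_fn, max_samples=40, margin=0.95):
    """Draw answers one at a time; stop early when the vote stabilizes."""
    counts = Counter()
    for n in range(1, max_samples + 1):
        counts[sample_fn()] += 1
        top = counts.most_common(1)[0][1]
        # hypothetical stopping rule: stop once the leading answer holds a
        # `margin` share of the votes after at least 5 samples
        if n >= 5 and top / n >= margin:
            break
    return counts.most_common(1)[0][0], n   # majority answer, samples used

# toy "LLM": returns A most of the time, B occasionally
answer, used = adaptive_consistency(lambda: random.choice("AAAAB"))
```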


184, Enhancing Chat Language Models By Scaling High-quality Instructional Conversations
Ning Ding; Yulin Chen; Bokai Xu; Yujia Qin; Shengding Hu; Zhiyuan Liu; Maosong Sun; Bowen Zhou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper aims to push the upper bound of open-source models further.


185, Knowledge Graph Compression Enhances Diverse Commonsense Generation
EunJeong Hwang; Veronika Thost; Vered Shwartz; Tengfei Ma;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, due to the large coverage and, consequently, vast scale of ConceptNet, the extracted subgraphs may contain loosely related, redundant and irrelevant information, which can introduce noise into the model. We propose to address this by applying a differentiable graph compression algorithm that focuses on the relevant knowledge for the task.


186, Decoding The Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting
Chenkai Sun; Jinning Li; Yi Fung; Hou Chan; Tarek Abdelzaher; ChengXiang Zhai; Heng Ji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, existing approaches have limited exploration of how to best process and utilize these important features. To address this gap, we propose a novel framework, named SocialSense, that leverages a large language model to induce a belief-centered graph on top of an existing social network, along with graph-based propagation to capture social dynamics.


187, Non-Programmers Can Label Programs Indirectly Via Active Examples: A Case Study with Text-to-SQL
Ruiqi Zhong; Charlie Snell; Dan Klein; Jason Eisner;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex).


188, Goal-Driven Explainable Clustering Via Language Descriptions
Zihan Wang; Jingbo Shang; Ruiqi Zhong;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a new task formulation, "Goal-Driven Clustering with Explanations" (GoalEx), which represents both the goal and the explanations as free-form language descriptions.


189, Grammar-Constrained Decoding for Structured NLP Tasks Without Finetuning
Saibo Geng; Martin Josifoski; Maxime Peyrard; Robert West;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general.
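The core GCD mechanism can be sketched in a few lines (a simplification under our assumptions; `allowed_next` stands in for an incremental grammar parser): at each step, tokens the grammar cannot accept next are masked out of the logits before the next token is chosen:

```python
import math

def gcd_step(logits, vocab, allowed_next):
    """One grammar-constrained decoding step: mask, then pick greedily."""
    masked = [l if tok in allowed_next else -math.inf
              for l, tok in zip(logits, vocab)]
    # greedy choice restricted to grammar-legal continuations
    return vocab[max(range(len(vocab)), key=lambda i: masked[i])]

vocab = ["{", "}", '"name"', ":", '"Ada"']
allowed = {"{"}                 # e.g., a JSON grammar must open an object first
token = gcd_step([0.2, 1.5, 0.3, 0.1, 0.9], vocab, allowed)   # -> "{"
```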


190, Does The Correctness of Factual Knowledge Matter for Factual Knowledge-Enhanced Pre-trained Language Models?
Boxi Cao; Qiaoyu Tang; Hongyu Lin; Xianpei Han; Le Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we introduce a counterfactual-based analysis framework to explore the causal effects of factual knowledge injection on the performance of language models within the pretrain-finetune paradigm.


191, When Language Models Fall in Love: Animacy Processing in Transformer Language Models
Michael Hanna; Yonatan Belinkov; Sandro Pezzelle;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Like previous studies, we find that LMs behave much like humans when presented with entities whose animacy is typical. However, we also show that even when presented with stories about atypically animate entities, such as a peanut in love, LMs adapt: they treat these entities as animate, though they do not adapt as well as humans.


192, Accelerating Toeplitz Neural Network with Constant-time Inference Complexity
Zhen Qin; Yiran Zhong;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we aim to combine the strengths of TNNs and SSMs by converting TNNs to SSMs during inference, thereby enabling TNNs to achieve the same constant inference complexities as SSMs.


193, Mirages. On Anthropomorphism in Dialogue Systems
Gavin Abercrombie; Amanda Curry; Tanvi Dinkar; Verena Rieser; Zeerak Talat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we discuss the linguistic factors that contribute to the anthropomorphism of dialogue systems and the harms that can arise thereof, including reinforcing gender stereotypes and conceptions of acceptable language.


194, PK-ICR: Persona-Knowledge Interactive Multi-Context Retrieval for Grounded Dialogue
Minsik Oh; Joosung Lee; Jiwei Li; Guoyin Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We develop a novel grounding retrieval method that utilizes all contexts of dialogue simultaneously.


195, Did You Mean...? Confidence-based Trade-offs in Semantic Parsing
Elias Stengel-Eskin; Benjamin Van Durme;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose the DidYouMean system which better balances usability and safety by rephrasing low-confidence inputs.


196, STAIR: Learning Sparse Text and Image Representation in Grounded Tokens
Chen Chen; Bowen Zhang; Liangliang Cao; Jiguang Shen; Tom Gunter; Albin Jose; Alexander Toshev; Yantao Zheng; Jonathon Shlens; Ruoming Pang; Yinfei Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we show that it is possible to build a sparse semantic representation that is as powerful as, or even better than, dense representations.


197, Where to Start? Analyzing The Potential Value of Intermediate Models
Leshem Choshen; Elad Venezian; Shachar Don-Yehiya; Noam Slonim; Yoav Katz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Such a model, finetuned on some source dataset, may provide a better starting point for a new finetuning process on a desired target dataset. Here, we perform a systematic analysis of this intertraining scheme, over a wide range of English classification tasks.


198, INFORM: Information ENtropy Based Multi-step Reasoning FOR Large Language Models
Chuyue Zhou; Wangjie You; Juntao Li; Jing Ye; Kehai Chen; Min Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose a novel approach by introducing information entropy (IE) as a criterion for CoT prompt selection.


199, DecipherPref: Analyzing Influential Factors in Human Preference Judgments Via GPT-4
Yebowen Hu; Kaiqiang Song; Sangwoo Cho; Xiaoyang Wang; Hassan Foroosh; Fei Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we conduct an in-depth examination of a collection of pairwise human judgments released by OpenAI.


200, Rethinking Negative Pairs in Code Search
Haochen Li; Xin Zhou; Anh Luu; Chunyan Miao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: As an example, for a quick-sort query, a bubble-sort implementation is less "negative" than a file-saving function. In this paper, we tackle the above problems by proposing a simple yet effective Soft-InfoNCE loss that inserts weight terms into InfoNCE.
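A hedged PyTorch sketch of the idea (the weighting scheme and function below are illustrative, not the paper's exact formulation): per-negative weights enter the InfoNCE denominator, so semantically close negatives are penalized less than unrelated ones:

```python
import torch
import torch.nn.functional as F

def soft_info_nce(query, pos, negs, neg_weights, tau=0.05):
    """query: (d,), pos: (d,), negs: (n, d), neg_weights: (n,) in [0, 1]."""
    s_pos = F.cosine_similarity(query, pos, dim=0) / tau
    s_neg = F.cosine_similarity(query.unsqueeze(0), negs, dim=1) / tau
    # weight terms inserted into the InfoNCE denominator
    denom = s_pos.exp() + (neg_weights * s_neg.exp()).sum()
    return -(s_pos - denom.log())       # -log p(positive | candidates)

q, p = torch.randn(128), torch.randn(128)
negs, w = torch.randn(10, 128), torch.rand(10)  # low w = "soft" negative
loss = soft_info_nce(q, p, negs, w)
```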


201, ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu; Rongjie Huang; Xuan Lin; Wenqiang Xu; Maozong Zheng; Hong Chen; Jinzheng He; Zhou Zhao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose ViT-TTS, the first visual TTS model with scalable diffusion transformers.


202, Evaluating Cross-Domain Text-to-SQL Models and Benchmarks
Mohammadreza Pourreza; Davood Rafiei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, accurately matching a model-generated SQL query to a reference SQL query in a benchmark fails for various reasons, such as underspecified natural language queries, inherent assumptions in both model-generated and reference queries, and the non-deterministic nature of SQL output under certain conditions. In this paper, we conduct an extensive study of several prominent cross-domain text-to-SQL benchmarks and re-evaluate some of the top-performing models within these benchmarks, by both manually evaluating the SQL queries and rewriting them in equivalent expressions.


203, Robust Prompt Optimization for Large Language Models Against Distribution Shifts
Moxin Li; Wenjie Wang; Fuli Feng; Yixin Cao; Jizhi Zhang; Tat-Seng Chua;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this light, we propose a new problem of robust prompt optimization for LLMs against distribution shifts, which requires that a prompt optimized over the labeled source group simultaneously generalize to an unlabeled target group. To solve this problem, we propose the Generalized Prompt Optimization framework, which incorporates the unlabeled data from the target group into prompt optimization.


204, GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding
Zekun Li; Wenxuan Zhou; Yao-Yi Chiang; Muhao Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper introduces GeoLM, a geospatially grounded language model that enhances the understanding of geo-entities in natural language.


205, Can LLMs Facilitate Interpretation of Pre-trained Language Models?
Basel Mousi; Nadir Durrani; Fahim Dalvi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose using a large language model, ChatGPT, as an annotator to enable fine-grained interpretation analysis of pre-trained language models.


206, Enhancing Code-Switching for Cross-lingual SLU: A Unified View of Semantic and Grammatical Coherence
Zhihong Zhu; Xuxin Cheng; Zhiqi Huang; Dongsheng Chen; Yuexian Zou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We ascribe this lack to two issues: (1) randomly replacing code-switched tokens with equal probability and (2) disregarding token-level dependency within each language. To tackle these issues, in this paper, we propose a novel method termed SoGo, for zero-shot cross-lingual SLU.


207, Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with The GeNTE Corpus
Andrea Piergentili; Beatrice Savoldi; Dennis Fucci; Matteo Negri; Luisa Bentivogli;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Based on GeNTE, we then overview existing reference-based evaluation approaches, highlight their limits, and propose a reference-free method more suitable to assess gender-neutral translation.


208, Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models
Di Wu; Wasi Ahmad; Kai-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Regarding decoding, we demonstrate that while greedy search achieves strong F1 scores, it lags in recall compared with sampling-based methods. Based on these insights, we propose DeSel, a likelihood-based decode-select algorithm for seq2seq PLMs.


209, HistAlign: Improving Context Dependency in Language Generation By Aligning with History
David Wan; Shiyue Zhang; Mohit Bansal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, we find that even with training, the performance gain stemming from the cache component of current cache-LMs is suboptimal due to the misalignment between the current hidden states and those stored in the memory. In this work, we present HistAlign, a new training approach to ensure good cache alignment such that the model receives useful signals from the history.


210, Data Factors for Better Compositional Generalization
Xiang Zhou; Yichen Jiang; Mohit Bansal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, in contrast to this poor performance, state-of-the-art models trained on larger and more general datasets show better generalization ability. In this work, to reconcile this inconsistency, we conduct an empirical analysis by training Transformer models on a variety of training sets with different data factors, including dataset scale, pattern complexity, example difficulty, etc.


211, Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models
Jirui Qi; Raquel Fernández; Arianna Bisazza;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: With the ultimate goal of ensuring that users with different language backgrounds obtain consistent feedback from the same model, we study the cross-lingual consistency (CLC) of factual knowledge in various multilingual PLMs.


212, Do LLMs Understand Social Knowledge? Evaluating The Sociability of Large Language Models with SocKET Benchmark
Minje Choi; Jiaxin Pei; Sagar Kumar; Chang Shu; David Jurgens;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Here, we introduce a new theory-driven benchmark, SocKET, that contains 58 NLP tasks testing social knowledge, which we group into five categories: humor & sarcasm, offensiveness, sentiment & emotion, social factors, and trustworthiness.


213, Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search
Xiang Geng; Yu Zhang; Zhejian Lai; Shuaijie She; Wei Zou; Shimin Tao; Hao Yang; Jiajun Chen; Shujian Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, pseudo data solutions are less satisfying in unsupervised scenarios because the pseudo labels are inaccurate or the pseudo translations differ from the real ones. To address these problems, we propose to generate pseudo data using the MT model with constrained beam search (CBSQE).


214, IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing Interactive Machine Translation Systems
Xu Huang; Zhirui Zhang; Ruize Gao; Yichao Du; Lemao Liu; Guoping Huang; Shuming Shi; Jiajun Chen; Shujian Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present IMTLab, an open-source end-to-end interactive machine translation (IMT) system platform that enables researchers to quickly build IMT systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems.


215, Beyond Factuality: A Comprehensive Evaluation of Large Language Models As Knowledge Generators
Liang Chen; Yang Deng; Yatao Bian; Zeyu Qin; Bingzhe Wu; Tat-Seng Chua; Kam-Fai Wong;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge. In light of this, we introduce CONNER, a COmpreheNsive kNowledge Evaluation fRamework, designed to systematically and automatically evaluate generated knowledge from six important perspectives: Factuality, Relevance, Coherence, Informativeness, Helpfulness, and Validity.


216, Simplicity Level Estimate (SLE): A Learned Reference-Less Metric for Sentence Simplification
Liam Cripwell; Joël Legrand; Claire Gardent;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a new learned evaluation metric, SLE, which focuses on simplicity, outperforming almost all existing metrics in terms of correlation with human judgements.


217, The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models
Jingyuan Qi; Zhiyang Xu; Ying Shen; Minqian Liu; Di Jin; Qifan Wang; Lifu Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Inspired by the human cognitive process, we propose SOCRATIC QUESTIONING, a divide-and-conquer style algorithm that mimics the recursive thinking process.


218, Towards A Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models
Yifan Hou; Jiaoda Li; Yu Fei; Alessandro Stolfo; Wangchunshu Zhou; Guangtao Zeng; Antoine Bosselut; Mrinmaya Sachan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, it is unclear whether LMs perform these tasks by cheating with answers memorized from the pretraining corpus or via a multi-step reasoning mechanism. In this paper, we try to answer this question by exploring a mechanistic interpretation of LMs for multi-step reasoning tasks.


219, A Diachronic Perspective on User Trust in AI Under Uncertainty
Shehzaad Dhuliawala; Vilém Zouhar; Mennatallah El-Assady; Mrinmaya Sachan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, modern NLP systems are seldom calibrated and are often confidently incorrect about their predictions, which violates users' mental model and erodes their trust. In this work, we design a study where users bet on the correctness of an NLP system, and use it to study the evolution of user trust as a response to these trust-eroding events and how the user trust is rebuilt as a function of time after these events.


220, Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models
Jianwei Li; Qi Lei; Wei Cheng; Dongkuan Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: As humans step into the era of large language models, these issues become increasingly prominent. This paper proposes that the robustness of language models is proportional to the extent of pre-trained knowledge they encompass.


221, From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation
Jiaxin Ge; Sanjay Subramanian; Trevor Darrell; Boyi Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Addressing the challenge of adapting pre-trained vision-language models for generating insightful explanations for visual reasoning tasks with limited annotations, we present ReVisE: a Recursive Visual Explanation algorithm.


222, Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy
Sarah Wiegreffe; Matthew Finlayson; Oyvind Tafjord; Peter Clark; Ashish Sabharwal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Are there direct ways of reducing it, and does doing so improve task performance? We propose a mathematical formalism for SFC which allows us to quantify and bound its impact for the first time.
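To make the quantity at issue concrete, the sketch below (variable names and numbers are ours) measures how much next-token probability mass lands on the valid answer choices, with the common renormalization over choices shown alongside:

```python
import math

def choice_mass(logprobs, choices):
    """logprobs: dict token -> log p; choices: the valid answer tokens."""
    return sum(math.exp(logprobs[c]) for c in choices)
    # 1 - mass is captured by surface-form rivals outside the choice set

def renormalized_scores(logprobs, choices):
    z = choice_mass(logprobs, choices)
    return {c: math.exp(logprobs[c]) / z for c in choices}

lp = {"A": math.log(0.30), "B": math.log(0.10), "Paris": math.log(0.45)}
print(choice_mass(lp, ["A", "B"]))          # 0.4: most mass lies elsewhere
print(renormalized_scores(lp, ["A", "B"]))  # {'A': 0.75, 'B': 0.25}
```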


223, INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback
Wenda Xu; Danqing Wang; Liangming Pan; Zhenqiao Song; Markus Freitag; William Wang; Lei Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Although recent learned metrics show high correlation with human judgement, these metrics do not provide explicit explanation of their verdict, nor associate the scores with defects in the generated text. To address this limitation, we present INSTRUCTSCORE, a fine-grained explainable evaluation metric for text generation.


224, Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Laura Cabello; Emanuele Bugliarello; Stephanie Brandl; Desmond Elliott;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we define gender bias as our case study.


225, "Mistakes Help Us Grow": Facilitating and Evaluating Growth Mindset Supportive Language in Classrooms
Kunal Handa; Margarett Clapper; Jessica Boyle; Rose Wang; Diyi Yang; David Yeager; Dorottya Demszky;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We explore whether large language models (LLMs) can provide automated, personalized coaching to support teachers' use of GMSL.


226, Model-tuning Via Prompts Makes NLP Models Adversarially Robust
Mrigank Raman; Pratyush Maini; J Kolter; Zachary Lipton; Danish Pruthi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we demonstrate surprising gains in adversarial robustness enjoyed by Model-tuning Via Prompts (MVP), an alternative method of adapting to downstream tasks.


227, Random Entity Quantization for Parameter-Efficient Compositional Knowledge Graph Representation
Jiaang Li; Quan Wang; Yi Liu; Licheng Zhang; Zhendong Mao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We refer to the process of obtaining corresponding codewords of each entity as entity quantization, for which previous works have designed complicated strategies. Surprisingly, this paper shows that simple random entity quantization can achieve similar results to current strategies.
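A minimal sketch of the idea (our construction; the sizes and the mean-pooling composition are assumptions): each entity gets a fixed random subset of a small codebook, and its embedding is composed from those codeword embeddings, with no learned selection strategy at all:

```python
import numpy as np

rng = np.random.default_rng(0)
num_entities, codebook_size, codes_per_entity, dim = 1000, 64, 8, 32
codebook = rng.normal(size=(codebook_size, dim))   # shared codeword embeddings

# random codeword assignment: the "quantization" step, done once and frozen
entity_codes = np.stack([
    rng.choice(codebook_size, size=codes_per_entity, replace=False)
    for _ in range(num_entities)
])

def entity_embedding(e):
    """Compose an entity vector from its randomly assigned codewords."""
    return codebook[entity_codes[e]].mean(axis=0)

v = entity_embedding(42)   # (32,) vector from far fewer parameters than
                           # one embedding row per entity would require
```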


228, Understanding The Inner-workings of Language Models Through Representation Dissimilarity
Davis Brown; Charles Godfrey; Nicholas Konz; Jonathan Tu; Henry Kvinge;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work we show that representation dissimilarity measures, which are functions that measure the extent to which two models' internal representations differ, can be a valuable tool for gaining insight into the mechanics of language models.


229, Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
Haoqi Zheng; Qihuang Zhong; Liang Ding; Zhiliang Tian; Xin Niu; Changjian Wang; Dongsheng Li; Dacheng Tao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification, which can generate more adaptive and model-friendly pseudo samples for the model training.


230, Zero-shot Sharpness-Aware Quantization for Pre-trained Language Models
Miaoxi Zhu; Qihuang Zhong; Li Shen; Liang Ding; Juhua Liu; Bo Du; Dacheng Tao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Most of the cutting-edge zero-shot quantization methods primarily 1) apply to computer vision tasks, and 2) neglect the overfitting problem in the generative adversarial learning process, leading to sub-optimal performance. Motivated by this, we propose a novel zero-shot sharpness-aware quantization (ZSAQ) framework for the zero-shot quantization of various PLMs.


231, NL2TL: Transforming Natural Languages to Temporal Logics Using Large Language Models
Yongchao Chen; Rujul Gandhi; Yang Zhang; Chuchu Fan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose an accurate and generalizable transformation framework of English instructions from NL to TL, exploring the use of Large Language Models (LLMs) at multiple stages.


232, A Simple Baseline for Knowledge-Based Visual Question Answering
Alexandros Xenos; Themos Stafylakis; Ioannis Patras; Georgios Tzimiropoulos;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Our main contribution in this paper is to propose a much simpler and readily reproducible pipeline which, in a nutshell, is based on efficient in-context learning by prompting LLaMA (1 and 2) using question-informative captions as contextual information.


233, SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation
Elizabeth Clark; Shruti Rijhwani; Sebastian Gehrmann; Joshua Maynez; Roee Aharoni; Vitaly Nikolaev; Thibault Sellam; Aditya Siddhant; Dipanjan Das; Ankur Parikh;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we introduce SEAHORSE, a dataset for multilingual, multifaceted summarization evaluation.


234, Deciphering Stereotypes in Pre-Trained Language Models
Weicheng Ma; Henry Scheible; Brian Wang; Goutham Veeramachaneni; Pratim Chowdhary; Alan Sun; Andrew Koulogeorge; Lili Wang; Diyi Yang; Soroush Vosoughi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper addresses the issue of demographic stereotypes present in Transformer-based pre-trained language models (PLMs) and aims to deepen our understanding of how these biases are encoded in these models.


235, Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Yang Chen; Hexiang Hu; Yi Luan; Haitian Sun; Soravit Changpinyo; Alan Ritter; Ming-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we introduce InfoSeek, a visual question answering dataset tailored for information-seeking questions that cannot be answered with only common sense knowledge.


236, Length Does Matter: Summary Length Can Bias Summarization Metrics
Xiaobo Guo; Soroush Vosoughi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: The results indicate that most metrics tend to favor longer summaries, even after accounting for other factors. To address this issue, we introduce a Bayesian normalization technique that effectively diminishes this bias.


237, Exploring Distributional Shifts in Large Language Models for Code Analysis
Shushan Arakelyan; Rocktim Das; Yi Mao; Xiang Ren;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We systematically study how three large language models with code capabilities - CodeT5, Codex, and ChatGPT - generalize to out-of-domain data.


238, Towards Building More Robust NER Datasets: An Empirical Study on NER Dataset Bias from A Dataset Difficulty View
Ruotian Ma; Xiaolei Wang; Xin Zhou; Qi Zhang; Xuanjing Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Previous research attributes the robustness problem to the existence of NER dataset bias, where simpler and regular entity patterns induce shortcut learning. In this work, we bring new insights into this problem by comprehensively investigating the NER dataset bias from a dataset difficulty view.


239, Hallucination Detection for Generative Large Language Models By Bayesian Sequential Estimation
Xiaohua Wang; Yuliang Yan; Longtao Huang; Xiaoqing Zheng; Xuanjing Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce a unique framework that leverages statistical decision theory and Bayesian sequential analysis to optimize the trade-off between costs and benefits during the hallucination detection process.
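A minimal sketch of the sequential idea (illustrative only, not the authors' implementation): repeatedly sample evidence checks for a claim and update a Beta posterior over the support rate, stopping as soon as the verdict is confident enough. Here check_claim() is a hypothetical stand-in for a real evidence check such as retrieval plus NLI.

import random
from scipy.stats import beta as beta_dist

def check_claim():
    # Stand-in oracle: replace with a real evidence check (retrieval + NLI, etc.).
    return random.random() < 0.7

def sequential_test(accept=0.95, reject=0.05, max_checks=50):
    a, b = 1.0, 1.0                          # Beta(1, 1) prior over support rate p
    for n in range(1, max_checks + 1):
        if check_claim():
            a += 1.0                         # one more supporting observation
        else:
            b += 1.0                         # one more contradicting observation
        p_supported = beta_dist.sf(0.5, a, b)   # posterior P(p > 0.5)
        if p_supported >= accept:
            return "supported", n            # stop early: enough evidence
        if p_supported <= reject:
            return "hallucinated", n
    return "undecided", max_checks

print(sequential_test())

Stopping early is the point: easy claims are settled after a few checks, and the expensive verification budget is spent only on borderline ones.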


240, Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Kellin Pelrine; Anne Imouza; Camille Thibault; Meilina Reksoprodjo; Caleb Gupta; Joel Christoph; Jean-François Godbout; Reihaneh Rabbany;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose focusing on generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible.


241, Exploring Discourse Structure in Document-level Machine Translation
Xinyu Hu; Xiaojun Wan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a more sound paragraph-to-paragraph translation mode and explore whether discourse structure can improve DocMT.


242, Evaluation Metrics in The Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks
Andrea Sottana; Bin Liang; Kai Zou; Zheng Yuan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We aim to improve the understanding of current models' performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation.


243, How Does Generative Retrieval Scale to Millions of Passages?
Ronak Pradeep; Kai Hui; Jai Gupta; Adam Lelkes; Honglei Zhuang; Jimmy Lin; Donald Metzler; Vinh Tran;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We conduct the first empirical study of generative retrieval techniques across various corpus scales, ultimately scaling up to the entire MS MARCO passage ranking task with a corpus of 8.8M passages.


244, EtiCor: Corpus for Analyzing LLMs for Etiquettes
Ashutosh Dwivedi; Pradhyumna Lavania; Ashutosh Modi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose EtiCor, an Etiquettes Corpus containing texts about social norms from five different regions across the globe.


245, ViStruct: Visual Structural Knowledge Extraction Via Curriculum Guided Code-Vision Representation
Yangyi Chen; Xingyao Wang; Manling Li; Derek Hoiem; Heng Ji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present ViStruct, a training framework to learn VLMs for effective visual structural knowledge extraction.


246, Multilingual Holistic Bias: Extending Descriptors and Patterns to Unveil Demographic Biases in Languages at Scale
Marta Costa-jussà; Pierre Andrews; Eric Smith; Prangthip Hansanti; Christophe Ropers; Elahe Kalbassi; Cynthia Gao; Daniel Licht; Carleigh Wood;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce a multilingual extension of the HolisticBias dataset, the largest English template-based taxonomy of textual people references: Multilingual HolisticBias.


247, NORMSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi Fung; Tuhin Chakrabarty; Hao Guo; Owen Rambow; Smaranda Muresan; Heng Ji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Most computational research on norms has focused on a single culture and on manually built datasets from non-conversational settings. We address these limitations by proposing a new framework, NormSage, to automatically extract culture-specific norms from multi-lingual conversations.


248, JASMINE: Arabic GPT Models for Few-Shot Learning
El Moatez Billah Nagoudi; Muhammad Abdul-Mageed; AbdelRahim Elmadany; Alcides Inciarte; Md Tawkat Islam Khondaker;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Using our novel benchmark, we evaluate JASMINE extensively, showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark to interested researchers, along with code for experimenting with them.


249, Set Learning for Generative Information Extraction
Jiangnan Li; Yice Zhang; Bin Liang; Kam-Fai Wong; Ruifeng Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Consequently, this formalization introduces a potential order bias, which can impair model learning. Targeting this issue, this paper proposes a set learning approach that considers multiple permutations of structured objects to optimize set probability approximately.
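As a rough illustration of the set-learning idea under stated assumptions (seq_nll() is a hypothetical stand-in for a seq2seq model's negative log-likelihood), the training loss can average over permutations of the target tuple set so that no single linearization order is privileged:

from itertools import permutations

def seq_nll(source, target):
    # Stand-in: replace with the model's NLL of `target` given `source`.
    return float(len(target))

def set_loss(source, tuples, max_perms=6):
    # Linearize several permutations of the tuple set and average the NLL,
    # approximating the probability of the set rather than one fixed order.
    losses = []
    for i, perm in enumerate(permutations(tuples)):
        if i >= max_perms:
            break
        target = " ; ".join(perm)            # one linearization of the set
        losses.append(seq_nll(source, target))
    return sum(losses) / len(losses)

triples = ["(battery, quality, positive)", "(screen, size, negative)"]
print(set_loss("The battery is great but the screen is small.", triples))

Capping the number of sampled permutations keeps the cost tractable when the extracted set is large.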


250, What Comes Next? Evaluating Uncertainty in Neural Text Generators Against Human Production Variability
Mario Giulianelli; Joris Baan; Wilker Aziz; Raquel Fernández; Barbara Plank;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: For each test input, we measure the generator's calibration to human production variability. Following this instance-level approach, we analyse NLG models and decoding strategies, demonstrating that probing a generator with multiple samples and, when possible, multiple references, provides the level of detail necessary to gain understanding of a model's representation of uncertainty.


251, Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? A Study on Several Typical Tasks
Xianzhi Li; Samuel Chan; Xiaodan Zhu; Yulong Pei; Zhiqiang Ma; Xiaomo Liu; Sameena Shah;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we conduct empirical studies and provide experimental evidence of their performance on a wide variety of financial text analytical problems, using eight benchmark datasets from five categories of tasks.


252, ART: Rule BAsed FutuRe-inference DeducTion
Mengze Li; Tianqi Zhao; Bai Jionghao; Baoyi He; Jiaxu Miao; Wei Ji; Zheqi Lv; Zhou Zhao; Shengyu Zhang; Wenqiao Zhang; Fei Wu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we introduce rule bAsed futuRe-inference deducTion (ART), which aims at deducing the correct future event based on the visual phenomenon (a video) and the rule-based premises, along with an explanation of the reasoning process.


253, Learning to Describe for Predicting Zero-shot Drug-Drug Interactions
Fangqi Zhu; Yongqi Zhang; Lei Chen; Bing Qin; Ruifeng Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we introduce a new problem setup as zero-shot DDI prediction that deals with the case of new drugs.


254, Learning Preference Model for LLMs Via Automatic Preference Data Generation
Shijia Huang; Jianqiao Zhao; Yanyang Li; Liwei Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose learning the preference model for LLMs via automatic preference data generation (AutoPM).


255, Bridging The Gap Between Synthetic and Authentic Images for Multimodal Machine Translation
Wenyu Guo; Qingkai Fang; Dong Yu; Yang Feng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Consequently, using authentic images for training and synthetic images for inference can introduce a distribution shift, resulting in performance degradation during inference. To tackle this challenge, in this paper, we feed both synthetic and authentic images to the MMT model.


256, Pushdown Layers: Encoding Recursive Structure in Transformer Language Models
Shikhar Murty; Pratyusha Sharma; Jacob Andreas; Christopher Manning;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This work introduces Pushdown Layers, a new self-attention layer that models recursive state via a stack tape that tracks estimated depths of every token in an incremental parse of the observed prefix.


257, BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases
Yiming Zhang; Sravani Nanduri; Liwei Jiang; Tongshuang Wu; Maarten Sap;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This can lead to subtle toxicity being missed, and seemingly toxic but harmless content being over-detected. We introduce BiasX, a framework that enhances content moderation setups with free-text explanations of statements' implied social biases, and explore its effectiveness through a large-scale crowdsourced user study.


258, Don't Take This Out of Context!: On The Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola; Xuhui Zhou; Elizabeth Clark; Maarten Sap;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate integrating the preceding textual context into both the rewriting and evaluation stages of stylistic text rewriting, and introduce a new composite contextual evaluation metric CtxSimFit that combines similarity to the original sentence with contextual cohesiveness.


259, FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Hyunwoo Kim; Melanie Sclar; Xuhui Zhou; Ronan Bras; Gunhee Kim; Yejin Choi; Maarten Sap;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering.


260, LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Shih-yang Liu; Zechun Liu; Xijie Huang; Pingcheng Dong; Kwang-Ting Cheng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose LLM-FP4 for quantizing both weights and activations in large language models (LLMs) down to 4-bit floating-point values, in a post-training manner.
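A minimal sketch of what 4-bit floating-point quantization means in practice, assuming an E2M1 format (1 sign, 2 exponent, 1 mantissa bit) and a simple per-tensor scale; this illustrates the representation, not the paper's full method:

import numpy as np

# Representable magnitudes of an E2M1 FP4 format (1 sign, 2 exp, 1 mantissa bits).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fp4_quantize(x):
    # Per-tensor scale so the largest magnitude maps onto the largest FP4 value.
    scale = max(np.abs(x).max() / FP4_GRID.max(), 1e-12)
    scaled = np.abs(x) / scale
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)  # nearest grid point
    return np.sign(x) * FP4_GRID[idx] * scale

w = np.array([0.03, -0.7, 1.2, -2.9, 0.0])
print(fp4_quantize(w))

Unlike uniform integer grids, the FP4 grid is denser near zero and sparser at large magnitudes, which tends to suit the long-tailed distributions of LLM weights and activations.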


261, Instruct and Extract: Instruction Tuning for On-Demand Information Extraction
Yizhu Jiao; Ming Zhong; Sha Li; Ruining Zhao; Siru Ouyang; Heng Ji; Jiawei Han;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, when it comes to information extraction, a classic task in natural language processing, most task-specific systems cannot align well with long-tail ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users.


262, AutoTrial: Prompting Language Models for Clinical Trial Design
Zifeng Wang; Cao Xiao; Jimeng Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a method named AutoTrial to aid the design of clinical eligibility criteria using language models.


263, A Unified View of Evaluation Metrics for Structured Prediction
Yunmo Chen; William Gantt; Tongfei Chen; Aaron White; Benjamin Van Durme;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present a conceptual framework that unifies a variety of evaluation metrics for different structured prediction tasks (e.g., event and relation extraction, syntactic and semantic parsing).


264, WordArt Designer: User-Driven Artistic Typography Synthesis Using Large Language Models
Jun-Yan He; Zhi-Qi Cheng; Chenyang Li; Jingdong Sun; Wangmeng Xiang; Xianhui Lin; Xiaoyang Kang; Zengke Jin; Yusen Hu; Bin Luo; Yifeng Geng; Xuansong Xie;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM).


265, QTSumm: Query-Focused Summarization Over Tabular Data
Yilun Zhao; Zhenting Qi; Linyong Nan; Boyu Mi; Yixin Liu; Weijin Zou; Simeng Han; Ruizhe Chen; Xiangru Tang; Yumo Xu; Dragomir Radev; Arman Cohan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Motivated by this, we define a new query-focused table summarization task, where text generation models have to perform human-like reasoning and analysis over the given table to generate a tailored summary. We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables covering diverse topics.


266, Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios
Yilun Zhao; Haowei Zhang; Shengyun Si; Linyong Nan; Xiangru Tang; Arman Cohan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios.


267, CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks
Mete Ismayilzada; Debjit Paul; Syrielle Montariol; Mor Geva; Antoine Bosselut;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present CRoW, a manually-curated, multi-task benchmark that evaluates the ability of models to apply commonsense reasoning in the context of six real-world NLP tasks.


268, CRAB: Assessing The Strength of Causal Relationships Between Real-world Events
Angelika Romanou; Syrielle Montariol; Debjit Paul; Leo Laugier; Karl Aberer; Antoine Bosselut;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present CRAB, a new Causal Reasoning Assessment Benchmark designed to evaluate causal understanding of events in real-world narratives.


269, SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables
Xinyuan Lu; Liangming Pan; Qian Liu; Preslav Nakov; Min-Yen Kan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present SCITAB, a challenging evaluation dataset consisting of 1.2K scientific claims.


270, CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding
Yixiao Ma; Yueyue Wu; Weihang Su; Qingyao Ai; Yiqun Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Specifically, these models may not fully capture the underlying legal features in legal case documents. To address this issue, we propose CaseEncoder, a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases.


271, LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following
Cheng-Fu Yang; Yen-Chun Chen; Jianwei Yang; Xiyang Dai; Lu Yuan; Yu-Chiang Wang; Kai-Wei Chang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This lack of generalizability is due to the agent's insensitivity to subtle changes in natural language instructions. To mitigate this issue, we propose explicitly aligning the agent's hidden states with the instructions via contrastive learning.


272, Counter Turing Test (CT2): AI-Generated Text Detection Is Not As Easy As You May Think - Introducing AI Detectability Index (ADI)
Megha Chakraborty; S.M Towhidul Islam Tonmoy; S M Mehedi Zaman; Shreya Gautam; Tanay Kumar; Krish Sharma; Niyar Barman; Chandan Gupta; Vinija Jain; Aman Chadha; Amit Sheth; Amitava Das;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Thus, to establish a quantifiable spectrum facilitating the evaluation and ranking of LLMs according to their detectability levels, we propose the AI Detectability Index (ADI).


273, FACTIFY3M: A Benchmark for Multimodal Fact Verification with Explainability Through 5W Question-Answering
Megha Chakraborty; Khushbu Pahwa; Anku Rani; Shreyas Chatterjee; Dwip Dalal; Harshit Dave; Ritvik G; Preethi Gurumurthy; Adarsh Mahor; Samahriti Mukherjee; Aditya Pakala; Ishan Paul; Janvita Reddy; Arghya Sarkar; Kinjal Sensharma; Aman Chadha; Amit Sheth; Amitava Das;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Despite progress in automatic text-based fact verification (e.g., FEVER, LIAR), the research community lacks substantial effort in multimodal fact verification. To address this gap, we introduce FACTIFY 3M, a dataset of 3 million samples that pushes the boundaries of the domain of fact verification via a multimodal fake news dataset, in addition to offering explainability through the concept of 5W question-answering.


274, Simple and Effective Input Reformulations for Translation
Brian Yu; Hansen Lillemark; Kurt Keutzer;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we reformulate inputs during finetuning for challenging translation tasks, leveraging model strengths from pretraining in novel ways to improve downstream performance.


275, The Sentiment Problem: A Critical Survey Towards Deconstructing Sentiment Analysis
Pranav Venkit; Mukund Srinath; Sanjana Gautam; Saranya Venkatraman; Vipul Gupta; Rebecca Passonneau; Shomir Wilson;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Our study exposes a lack of explicit definitions and frameworks for characterizing sentiment, resulting in potential challenges and biases. To tackle this issue, we propose an ethics sheet encompassing critical inquiries to guide practitioners in ensuring equitable utilization of SA.


276, PAC-tuning: Fine-tuning Pre-trained Language Models with PAC-driven Perturbed Gradient Descent
Guangliang Liu; Zhiyu Xue; Xitong Zhang; Kristen Johnson; Rongrong Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, adding these regularizations necessitates heavy tuning of the hyperparameters of optimization algorithms, such as the popular Adam optimizer. In this paper, we propose a two-stage fine-tuning method, PAC-tuning, to address this optimization challenge.


277, Unveiling The Implicit Toxicity in Large Language Models
Jiaxin Wen; Pei Ke; Hao Sun; Zhexin Zhang; Chengfei Li; Jinfeng Bai; Minlie Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: While recent studies primarily focus on probing toxic outputs that can be easily detected with existing toxicity classifiers, we show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via simple zero-shot prompting.


278, Re3Dial: Retrieve, Reorganize and Rescale Conversations for Long-Turn Open-Domain Dialogue Pre-training
Jiaxin Wen; Hao Zhou; Jian Guan; Jie Zhou; Minlie Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Most dialogues in existing pre-training corpora contain fewer than three turns of dialogue. To alleviate this issue, we propose the Retrieve, Reorganize and Rescale framework (Re3Dial), which can automatically construct billion-scale long-turn dialogues by reorganizing existing short-turn ones.


279, Multi-Source Probing for Open-Domain Conversational Understanding
Yuanxi Li; Hao Zhou; Jie Zhou; Minlie Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this study, we propose a Multi-Source Probing (MSP) method to probe the dialogue comprehension abilities of open-domain dialogue models.


280, Building Multi-domain Dialog State Trackers from Single-domain Dialogs
Qi Zhu; Zheng Zhang; Xiaoyan Zhu; Minlie Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a divide-and-conquer (DAC) DST paradigm and a multi-domain dialog synthesis framework, which makes building multi-domain DST models from single-domain dialogs possible.


281, SPT: Learning to Selectively Insert Prompts for Better Prompt Tuning
Wei Zhu; Ming Tan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose a novel framework, Selective Prompt Tuning (SPT), that learns to select the proper prompt layers by inserting a prompt controlled by a learnable probabilistic gate at each intermediate layer.
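A toy PyTorch sketch of the gating idea (hypothetical and simplified, not the SPT code): a learnable sigmoid gate softly scales a prompt prepended to a layer's input, so training can learn how strongly each layer should use its prompt.

import torch
import torch.nn as nn

class PromptGate(nn.Module):
    def __init__(self, prompt_len, hidden_size):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_size) * 0.02)
        self.gate_logit = nn.Parameter(torch.zeros(1))   # learnable gate per layer

    def forward(self, hidden):  # hidden: (batch, seq, hidden_size)
        g = torch.sigmoid(self.gate_logit)               # soft gate in (0, 1)
        prompt = self.prompt.unsqueeze(0).expand(hidden.size(0), -1, -1)
        # Scale the prompt by the gate: near 0 the layer effectively skips it.
        # (A simplification of the probabilistic gate described in the paper.)
        return torch.cat([g * prompt, hidden], dim=1)

gate = PromptGate(prompt_len=4, hidden_size=16)
out = gate(torch.randn(2, 10, 16))
print(out.shape)  # torch.Size([2, 14, 16])

After training, layers whose gates stay near zero can drop their prompts entirely, yielding the selective insertion the method is named for.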


282, Enhancing Structured Evidence Extraction for Fact Verification
Zirui Wu; Nan Hu; Yansong Feng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a simple but effective method to enhance the extraction of structured evidence by leveraging the row and column semantics of tables.


283, UniMath: A Foundational and Multimodal Mathematical Reasoner
Zhenwen Liang; Tianyu Yang; Jipeng Zhang; Xiangliang Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: While significant progress has been made in natural language processing (NLP), existing methods exhibit limitations in effectively interpreting and processing diverse mathematical modalities. Therefore, we introduce UniMath, a versatile and unified system designed for multimodal mathematical reasoning tasks.


284, SLOG: A Structural Generalization Benchmark for Semantic Parsing
Bingzhi Li; Lucia Donatelli; Alexander Koller; Tal Linzen; Yuekun Yao; Najoung Kim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce SLOG, a semantic parsing dataset that extends COGS (Kim and Linzen, 2020) with 17 structural generalization cases.


285, Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering
Wang Zhu; Jesse Thomason; Robin Jia;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose Chain-of-Questions, a framework that trains a model to robustly answer multistep questions by generating and answering sub-questions.


286, Memory-Based Invariance Learning for Out-of-Domain Text Classification
Chen Jia; Yue Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Specifically, we augment the original feature space using key-value memory and employ a meta-learning-based approach to enhance the quality of the invariant representations.


287, Hi-ArG: Exploring The Integration of Hierarchical Argumentation Graphs in Language Pretraining
Jingcong Liang; Rong Ye; Meng Han; Qi Zhang; Ruofei Lai; Xinyu Zhang; Zhao Cao; Xuanjing Huang; Zhongyu Wei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose the Hierarchical Argumentation Graph (Hi-ArG), a new structure to organize arguments.


288, Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation
Jiayu Lin; Rong Ye; Meng Han; Qi Zhang; Ruofei Lai; Xinyu Zhang; Zhao Cao; Xuanjing Huang; Zhongyu Wei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present the ArgTersely benchmark for sentence-level counter-argument generation, drawing from a manually annotated dataset from the ChangeMyView debate forum.


289, When Do Decompositions Help for Machine Reading?
Kangda Wei; Dawn Lawrie; Benjamin Van Durme; Yunmo Chen; Orion Weller;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We find that decompositions can be helpful in zero or limited-data settings, giving several points of improvement in exact match.


290, Towards Conceptualization of "Fair Explanation": Disparate Impacts of Anti-Asian Hate Speech Explanations on Content Moderators
Tin Nguyen; Jiannan Xu; Aayushi Roy; Hal Daumé III; Marine Carpuat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose to characterize what constitutes an explanation that is itself "fair": an explanation that does not adversely impact specific populations.


291, Expand, Highlight, Generate: RL-driven Document Generation for Passage Reranking
Arian Askari; Mohammad Aliannejadi; Chuan Meng; Evangelos Kanoulas; Suzan Verberne;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a new perspective of data augmentation: generating synthetic documents from queries.


292, E-THERAPIST: I Suggest You to Cultivate A Mindset of Positivity and Nurture Uplifting Thoughts
Kshitij Mishra; Priyanshu Priya; Manisha Burja; Asif Ekbal;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Focusing on this objective, we propose e-THERAPIST, a novel polite interpersonal psychotherapy dialogue system to address issues like depression, anxiety, schizophrenia, etc.


293, PHD: Pixel-Based Language Modeling of Historical Documents
Nadav Borenstein; Phillip Rust; Desmond Elliott; Isabelle Augenstein;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Due to the scarcity of real historical scans, we propose a novel method for generating synthetic scans to resemble real historical documents.


294, Towards LLM-driven Dialogue State Tracking
Yujie Feng; Zexin Lu; Bo Liu; Liming Zhan; Xiao-Ming Wu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we conduct an initial examination of ChatGPT's capabilities in DST.


295, LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Huiqiang Jiang; Qianhui Wu; Chin-Yew Lin; Yuqing Yang; Lili Qiu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To accelerate model inference and reduce cost, this paper presents LLMLingua, a coarse-to-fine prompt compression method that involves a budget controller to maintain semantic integrity under high compression ratios, a token-level iterative compression algorithm to better model the interdependence between compressed contents, and an instruction tuning based method for distribution alignment between language models.
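A heavily simplified sketch of the coarse-to-fine intuition: drop the tokens a small language model finds most predictable and keep the informative ones. token_surprisals() is a hypothetical stand-in for per-token negative log-probability under a small LM; the real method additionally uses a budget controller and iterative compression.

import random

def token_surprisals(tokens):
    # Stand-in: replace with -log p(token | prefix) from a small LM.
    random.seed(0)
    return [random.uniform(0.1, 5.0) for _ in tokens]

def compress_prompt(text, keep_ratio=0.5):
    tokens = text.split()
    surprisal = token_surprisals(tokens)
    k = max(1, int(len(tokens) * keep_ratio))
    # Keep the k most informative (highest-surprisal) tokens, in original order.
    keep = set(sorted(range(len(tokens)), key=lambda i: -surprisal[i])[:k])
    return " ".join(t for i, t in enumerate(tokens) if i in keep)

print(compress_prompt("Please summarize the following report about quarterly revenue growth"))

The compressed prompt is shorter, so the expensive target LLM processes fewer tokens per call, which is where the inference speedup and cost savings come from.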


296, Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration
Yiquan Wu; Siying Zhou; Yifei Liu; Weiming Lu; Xiaozhong Liu; Yating Zhang; Changlong Sun; Fei Wu; Kun Kuang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose the precedent-enhanced LJP framework (PLJP), a system that leverages the strength of both LLM and domain models in the context of precedents.


297, The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment
Jared Fernandez; Jacob Kahn; Clara Na; Yonatan Bisk; Emma Strubell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We denote this phenomenon as the framework tax, and observe that the disparity is growing as hardware speed increases over time. In this work, we examine this phenomenon through a series of case studies analyzing the effects of model design decisions, framework paradigms, and hardware platforms on total model latency.


298, Once Is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling
Yuanhang Yang; Shiyi Qi; Chuanyi Liu; Qifan Wang; Cuiyun Gao; Zenglin Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To this end, this paper introduces a novel paradigm TopicAns for efficient sentence pair modeling.


299, Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation
Bashar Alhafni; Go Inoue; Christian Khairallah; Nizar Habash;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we present the first results on Arabic GEC using two newly developed Transformer-based pretrained sequence-to-sequence models.


300, On Bilingual Lexicon Induction with Large Language Models
Yaoyiran Li; Anna Korhonen; Ivan Vulic;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Inspired by the global paradigm shift in NLP towards Large Language Models (LLMs), we examine the potential of the latest generation of LLMs for the development of bilingual lexicons.


301, Multi-teacher Distillation for Multilingual Spelling Correction
Jingfen Zhang; Xuan Guo; Sravan Bodapati; Christopher Potts;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: For services that are deployed around the world, this poses a significant challenge for multilingual NLP: spelling errors need to be caught and corrected in all languages, and even in queries that use multiple languages. In this paper, we tackle this challenge using multi-teacher distillation.


302, Simple Temporal Adaptation to Changing Label Sets: Hashtag Prediction Via Dense KNN
Niloofar Mireshghallah; Nikolai Vogler; Junxian He; Omar Florez; Ahmed El-Kishky; Taylor Berg-Kirkpatrick;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we study temporal adaptation through the task of longitudinal hashtag prediction and propose a non-parametric dense retrieval technique, which does not require re-training, as a simple but effective solution.
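A minimal sketch of the non-parametric idea: because prediction is pure retrieval plus voting, new (post, hashtag) pairs can be appended to the datastore as the label set drifts over time, with no re-training. embed() is a hypothetical stand-in for a trained text encoder.

import numpy as np

def embed(text):
    # Stand-in: replace with a real sentence encoder; unit-normalized output.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

datastore = [("big game tonight", "#sports"), ("new phone drop", "#tech"),
             ("election results", "#politics"), ("match highlights", "#sports")]
keys = np.stack([embed(t) for t, _ in datastore])

def predict_hashtag(query, k=3):
    sims = keys @ embed(query)                 # cosine similarity (unit vectors)
    top = np.argsort(-sims)[:k]                # k nearest neighbors
    votes = [datastore[i][1] for i in top]
    return max(set(votes), key=votes.count)    # majority vote over neighbors

print(predict_hashtag("watching the game"))

Temporal adaptation then reduces to datastore maintenance: stale pairs age out and fresh hashtags enter simply by being added.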


303, STINMatch: Semi-Supervised Semantic-Topological Iteration Network for Financial Risk Detection Via News Label Diffusion
Xurui Li; Yue Qin; Rui Zhu; Tianqianjin Lin; Yongming Fan; Yangyang Kang; Kaisong Song; Fubang Zhao; Changlong Sun; Haixu Tang; Xiaozhong Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, unaffordable large-scale annotation as well as training data sparsity hinder the full exploitation of commercial news in risk detection. To address this problem, we propose a semi-supervised Semantic-Topological Iteration Network, STINMatch, along with a news-enterprise knowledge graph (NEKG) to support enhanced risk detection.


304, CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning Without Full Large Language Model
Kaiyan Zhang; Ning Ding; Biqing Qi; Xuekai Zhu; Xinwei Long; Bowen Zhou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Simultaneously, we note subtle but potentially significant changes in representation and intermediate predictions across the layers. Inspired by these observations, we propose CRaSh, involving Clustering, Removing, and Sharing, a training-free strategy to derive improved emulators from LLMs.


305, CS2W: A Chinese Spoken-to-Written Style Conversion Dataset with Multiple Conversion Types
Zishan Guo; Linhao Yu; Minghui Xu; Renren Jin; Deyi Xiong;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Unfortunately, the availability of datasets for this is limited. To address this issue, we present CS2W, a Chinese Spoken-to-Written style conversion dataset comprising 7,237 spoken sentences extracted from transcribed conversational texts.


306, Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality
Harman Singh; Pengchuan Zhang; Qifan Wang; Mengjiao Wang; Wenhan Xiong; Jingfei Du; Yu Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we consider the scene graph parsed from text as a proxy for the image scene graph and propose a graph decomposition and augmentation framework along with a coarse-to-fine contrastive learning objective between images and text that aligns sentences of various complexities to the same image.


307, Discourse Structures Guided Fine-grained Propaganda Identification
Yuanyuan Lei; Ruihong Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we aim to identify propaganda in political news at two fine-grained levels: sentence-level and token-level.


308, Fidelity-Enriched Contrastive Search: Reconciling The Faithfulness-Diversity Trade-Off in Text Generation
Wei-Lin Chen; Cheng-Kuang Wu; Hsin-Hsi Chen; Chung-Chi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we address the hallucination problem commonly found in natural language generation tasks.


309, Location-Aware Visual Question Generation with Lightweight Models
Nicholas Suwono; Justin Chen; Tun Hung; Ting-Hao Huang; I-Bin Liao; Yung-Hui Li; Lun-Wei Ku; Shao-Hua Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Specifically, we represent such location-aware information with surrounding images and a GPS coordinate. To tackle this task, we present a dataset generation pipeline that leverages GPT-4 to produce diverse and sophisticated questions.


310, GPT-RE: In-context Learning for Relation Extraction Using Large Language Models
Zhen Wan; Fei Cheng; Zhuoyuan Mao; Qianying Liu; Haiyue Song; Jiwei Li; Sadao Kurohashi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose GPT-RE to successfully address the aforementioned issues by (1) incorporating task-aware representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic.


311, Sociocultural Norm Similarities and Differences Via Situational Alignment and Explainable Textual Entailment
Sky CH-Wang; Arkadiy Saakyan; Oliver Li; Zhou Yu; Smaranda Muresan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Here, we propose a novel approach to discover and compare descriptive social norms across Chinese and American cultures.


312, RAPL: A Relation-Aware Prototype Learning Approach for Few-Shot Document-Level Relation Extraction
Shiao Meng; Xuming Hu; Aiwei Liu; Shuang Li; Fukun Ma; Yawen Yang; Lijie Wen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a relation-aware prototype learning method for FSDLRE to strengthen the relational semantics of prototype representations.


313, When The Majority Is Wrong: Modeling Annotator Disagreement for Subjective Tasks
Eve Fleisig; Rediet Abebe; Dan Klein;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Thus, a crucial problem in hate speech detection is determining if a statement is offensive to the demographic group that it targets, when that group may be a small fraction of the annotator pool. We construct a model that predicts individual annotator ratings on potentially offensive text and combines this information with the predicted target group of the text to predict the ratings of target group members.


314, Characterizing Mechanisms for Factual Recall in Language Models
Qinan Yu; Jack Merullo; Ellie Pavlick;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: On a dataset that queries for knowledge of world capitals, we investigate both distributional and mechanistic determinants of LM behavior in such situations.


315, DNA: Denoised Neighborhood Aggregation for Fine-grained Category Discovery
Wenbin An; Feng Tian; Wenkai Shi; Yan Chen; Qinghua Zheng; QianYing Wang; Ping Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose Denoised Neighborhood Aggregation (DNA), a self-supervised framework that encodes semantic structures of data into the embedding space.


316, Context Compression for Auto-regressive Transformers with Sentinel Tokens
Siyu Ren; Qi Jia; Kenny Zhu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose a plug-and-play approach that is able to incrementally compress the intermediate activation of a specified span of tokens into compact ones, thereby reducing both memory and computational cost when processing subsequent context.


317, Quantifying Character Similarity with Vision Transformers
Xinmei Yang; Abhishek Arora; Shao-Yu Jheng; Melissa Dell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This study develops an extensible way to measure character substitution costs for OCR'ed documents, by employing large-scale self-supervised training of vision transformers (ViT) with augmented digital fonts.


318, BanglaAbuseMeme: A Dataset for Bengali Abusive Meme Classification
Mithun Das; Animesh Mukherjee;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: The problem becomes more challenging in a low-resource setting (e.g., Bengali memes, i.e., images with Bengali text embedded in them) because of the absence of benchmark datasets on which AI models could be trained. In this paper, we bridge this gap by building a Bengali meme dataset.


319, TrojanSQL: SQL Injection Against Natural Language Interface to Database
Jinchuan Zhang; Yan Zhou; Binyuan Hui; Yaxin Liu; Ziming Li; Songlin Hu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: By proposing TrojanSQL, a backdoor-based SQL injection framework for text-to-SQL systems, we show how state-of-the-art text-to-SQL parsers can be easily misled to produce harmful SQL statements that can invalidate user queries or compromise sensitive information about the database.


320, How Do Languages Influence Each Other? Studying Cross-lingual Data Sharing During LM Fine-tuning
Rochelle Choenni; Dan Garrette; Ekaterina Shutova;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Yet, it remains unclear to what extent, and under which conditions, languages rely on each other's data. To answer this question, we use TracIn (Pruthi et al., 2020), a training data attribution (TDA) method, to retrieve training samples from multilingual data that are most influential for test predictions in a given language.
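For intuition, TracIn approximates the influence of a training example z on a test example z' as the sum over saved checkpoints w_i of the learning rate times the dot product of the two loss gradients: influence(z, z') = sum_i eta_i * grad L(w_i, z) . grad L(w_i, z'). A toy sketch (illustrative; real usage loads actual training checkpoints):

import torch
import torch.nn as nn

def flat_grad(model, loss):
    # Gradient of `loss` w.r.t. all parameters, flattened into one vector.
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.reshape(-1) for g in grads])

def tracin_influence(checkpoints, lrs, make_model, train_ex, test_ex, loss_fn):
    score = 0.0
    for state, lr in zip(checkpoints, lrs):
        model = make_model()
        model.load_state_dict(state)           # restore one training checkpoint
        g_train = flat_grad(model, loss_fn(model, *train_ex))
        g_test = flat_grad(model, loss_fn(model, *test_ex))
        score += lr * torch.dot(g_train, g_test).item()
    return score

make_model = lambda: nn.Linear(4, 1)
loss_fn = lambda m, x, y: nn.functional.mse_loss(m(x), y)
ckpts = [make_model().state_dict() for _ in range(3)]   # stand-in checkpoints
x, y = torch.randn(1, 4), torch.randn(1, 1)
print(tracin_influence(ckpts, [0.1, 0.1, 0.1], make_model, (x, y), (x, y), loss_fn))

A large positive score means gradient steps on the training example repeatedly reduced the loss on the test example, which is the sense in which one language's data can "help" another's predictions.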


321, Exploring Chain of Thought Style Prompting for Text-to-SQL
Chang-Yu Tai; Ziru Chen; Tianshu Zhang; Xiang Deng; Huan Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we hypothesize that a crucial aspect of LLMs to improve for text-to-SQL parsing is their multi-step reasoning ability.


322, ClimateBERT-NetZero: Detecting and Assessing Net Zero and Reduction Targets
Tobias Schimanski; Julia Bingler; Mathias Kraus; Camilla Hyslop; Markus Leippold;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Public and private actors struggle to assess the vast amounts of information about sustainability commitments made by various institutions. To address this problem, we create a novel tool for automatically detecting corporate and national net zero and reduction targets in three steps.


323, Primacy Effect of ChatGPT
Yiwei Wang; Yujun Cai; Muhao Chen; Yuxuan Liang; Bryan Hooi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer.
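One simple way to probe such a positional bias, sketched below with a hypothetical ask_llm() stand-in for an actual ChatGPT call: shuffle the label order across trials and measure how often the model returns whatever label happened to be listed first; a rate well above 1/num_labels indicates a primacy effect.

import random

LABELS = ["entailment", "neutral", "contradiction"]

def ask_llm(text, ordered_labels):
    # Stand-in: a mock model that picks the first listed label 60% of the
    # time; replace with a real API call that lists labels in `ordered_labels`.
    return ordered_labels[0] if random.random() < 0.6 else random.choice(ordered_labels)

def first_label_rate(texts, trials=200):
    hits = 0
    for _ in range(trials):
        order = random.sample(LABELS, k=len(LABELS))   # shuffled label order
        if ask_llm(random.choice(texts), order) == order[0]:
            hits += 1
    return hits / trials   # ~1/len(LABELS) would mean no positional bias

print(first_label_rate(["A man is eating.", "The cat sleeps."]))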


324, Rethinking and Improving Multi-task Learning for End-to-end Speech Translation
Yuhao Zhang; Chen Xu; Bei Li; Hao Chen; Tong Xiao; Chunliang Zhang; Jingbo Zhu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we investigate the consistency between different tasks, considering different times and modules.


325, MailEx: Email Event and Argument Extraction
Saurabh Srivastava; Gaurav Singh; Shou Matsumoto; Ali Raz; Paulo Costa; Joshua Poore; Ziyu Yao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we present the first dataset, MailEx, for performing event extraction from conversational email threads.


326, Multilingual Large Language Models Are Not (Yet) Code-Switchers
Ruochen Zhang; Samuel Cahyawijaya; Jan Christian Blaise Cruz; Genta Winata; Alham Aji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification.


327, ALCUNA: Large Language Models Meet New Knowledge
Xunjian Yin; Baizhou Huang; Xiaojun Wan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we address the lack of benchmarks to evaluate LLMs' ability to handle new knowledge, an important and challenging aspect in the rapidly evolving world.


328, Models See Hallucinations: Evaluating The Factuality in Video Captioning
Hui Liu; Xiaojun Wan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we conduct the first human evaluation of the factuality in video captioning and annotate two factuality datasets.


329, Self-ICL: Zero-Shot In-Context Learning with Self-Generated Demonstrations
Wei-Lin Chen; Cheng-Kuang Wu; Yun-Nung Chen; Hsin-Hsi Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we introduce Self-ICL, a simple framework which bootstraps LMs' intrinsic capabilities to perform zero-shot ICL.


330, CRT-QA: A Dataset of Complex Reasoning Question Answering Over Tabular Data
Zhehao Zhang; Xitao Li; Yan Gao; Jian-Guang Lou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we first establish a comprehensive taxonomy of reasoning and operation types for tabular data analysis. Then, we construct a complex reasoning QA dataset over tabular data, named CRT-QA dataset (Complex Reasoning QA over Tabular data), with the following unique features: (1) it is the first Table QA dataset with multi-step operation and informal reasoning; (2) it contains fine-grained annotations on questions' directness, composition types of sub-questions, and human reasoning paths which can be used to conduct a thorough investigation on LLMs' reasoning ability; (3) it contains a collection of unanswerable and indeterminate questions that commonly arise in real-world situations.


331, Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding
Sangmin Bae; Jongwoo Ko; Hwanjun Song; Se-Young Yun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Consequently, we propose a Fast and Robust Early-Exiting (FREE) framework, which incorporates a shallow-deep module and synchronized parallel decoding.


332, The Benefits of Label-Description Training for Zero-Shot Text Classification
Lingyu Gao; Debanjan Ghosh; Kevin Gimpel;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a simple way to further improve zero-shot accuracies with minimal effort.


333, Crystal: Introspective Reasoners Reinforced with Self-Feedback
Jiacheng Liu; Ramakanth Pasunuru; Hannaneh Hajishirzi; Yejin Choi; Asli Celikyilmaz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a novel method to develop an introspective commonsense reasoner, Crystal.


334, Reducing Sequence Length By Predicting Edit Spans with Large Language Models
Masahiro Kaneko; Naoaki Okazaki;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper proposes predicting edit spans for the source text for local sequence transduction tasks.


335, CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models
Benjamin Minixhofer; Jonas Pfeiffer; Ivan Vulic;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we systematically study decompounding, the task of splitting compound words into their constituents, at a wide scale.


336, Revisiting The Optimality of Word Lengths
Tiago Pimentel; Clara Meister; Ethan Wilcox; Kyle Mahowald; Ryan Cotterell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we show that Piantadosi et al.'s derivation does not minimize CCH's cost, but rather a lower bound, which we term CCH-lower.


337, Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Benjamin Muller; John Wieting; Jonathan Clark; Tom Kwiatkowski; Sebastian Ruder; Livio Soares; Roee Aharoni; Jonathan Herzig; Xinyi Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We find that Natural Language Inference models and PaLM 2 fine-tuned on a very small amount of attribution data can accurately detect attribution. With these models, we improve the attribution level of a cross-lingual QA system.


338, Information Value: Measuring Utterance Predictability As Distance from Plausible Alternatives
Mario Giulianelli; Sarenne Wallbridge; Raquel Fernández;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce a method to obtain interpretable estimates of information value using neural text generators, and exploit their psychometric predictive power to investigate the dimensions of predictability that drive human comprehension behaviour.


339, Target-to-Source Augmentation for Aspect Sentiment Triplet Extraction
Yice Zhang; Yifan Yang; Meng Li; Bin Liang; Shiwei Chen; Ruifeng Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, applying these methods to fine-grained tasks like ASTE poses challenges in generating diverse augmented samples while maintaining alignment between modified sentences and original labels. Therefore, this paper proposes a target-to-source augmentation approach for ASTE.


340, Stance Detection on Social Media with Background Knowledge
Ang Li; Bin Liang; Jingqian Zhao; Bowen Zhang; Min Yang; Ruifeng Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate stance detection from a novel perspective, where the background knowledge of the targets is taken into account for better stance detection.


341, Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs
Simone Conia; Min Li; Daniel Lee; Umar Minhas; Ihab Ilyas; Yunyao Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English languages.


342, Is ChatGPT Good at Search? Investigating Large Language Models As Re-Ranking Agents
Weiwei Sun; Lingyong Yan; Xinyu Ma; Shuaiqiang Wang; Pengjie Ren; Zhumin Chen; Dawei Yin; Zhaochun Ren;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we first investigate generative LLMs such as ChatGPT and GPT-4 for relevance ranking in IR. Surprisingly, our experiments reveal that properly instructed LLMs can deliver competitive, even superior results to state-of-the-art supervised methods on popular IR benchmarks.


343, Content- and Topology-Aware Representation Learning for Scientific Multi-Literature
Kai Zhang; Kaisong Song; Yangyang Kang; Xiaozhong Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose SMRC2, which extends representation learning to the multi-document level.


344, Countering Misinformation Via Emotional Response Generation
Daniel Russo; Shane Kaszefski-Yaschuk; Jacopo Staiano; Marco Guerini;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Accordingly, significant effort has been made to automate the use of fact-checker material in social correction; however, no previous work has tried to integrate it with the style and pragmatics that are commonly employed in social media communication. To fill this gap, we present VerMouth, the first large-scale dataset comprising roughly 12 thousand claim-response pairs (linked to debunking articles), accounting for both SMP-style and basic emotions, two factors which have a significant role in misinformation credibility and spreading.


345, Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems
Tianyuan Shi; Liangzhi Li; Zijian Lin; Tao Yang; Xiaojun Quan; Qifan Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Taking inspiration from open-domain question answering, we propose a retriever-generator architecture that harnesses a retriever to retrieve pertinent knowledge and a generator to generate system responses.


346, Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning Across Languages
Libo Qin; Qiguang Chen; Fuxuan Wei; Shijue Huang; Wanxiang Che;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we introduce cross-lingual prompting (CLP), aiming to improve zero-shot CoT reasoning across languages.


347, Dancing Between Success and Failure: Edit-level Simplification Evaluation Using SALSA
David Heineman; Yao Dou; Mounica Maddela; Wei Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Large language models (e.g., GPT-4) are uniquely capable of producing highly rated text simplification, yet current human evaluation methods fail to provide a clear understanding of systems' specific strengths and weaknesses. To address this limitation, we introduce SALSA, an edit-based human annotation framework that enables holistic and fine-grained text simplification evaluation.


348, End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions
Libo Qin; Wenbo Pan; Qiguang Chen; Lizi Liao; Zhou Yu; Yue Zhang; Wanxiang Che; Min Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a thorough review and provide a unified perspective to summarize existing approaches as well as recent trends to advance the development of EToD research.


349, Tree Prompting: Efficient Task Adaptation Without Fine-Tuning
Chandan Singh; John Morris; Alexander Rush; Jianfeng Gao; Yuntian Deng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Tree Prompting is an approach to prompting which builds a decision tree of prompts, linking multiple prompt-LM calls together to solve a task.
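
A minimal sketch of the idea with a hand-built two-level tree; the node layout and the `llm` callable are illustrative assumptions, not the authors' API:

    # Sketch: internal nodes are prompts; the LM's verbalized answer picks
    # the branch to follow; leaves are task labels.
    TREE = {
        "prompt": "Does the review mention price? Answer yes or no.",
        "children": {
            "yes": {
                "prompt": "Is the price discussed positively? Answer yes or no.",
                "children": {"yes": "positive", "no": "negative"},
            },
            "no": "neutral",
        },
    }

    def tree_prompt(llm, text, node=TREE):
        while isinstance(node, dict):
            answer = llm(f"{node['prompt']}\nText: {text}").strip().lower()
            # Fall back to the first branch on an unexpected answer.
            node = node["children"].get(answer, next(iter(node["children"].values())))
        return node  # a leaf label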


350, Improving Diversity of Demographic Representation in Large Language Models Via Collective-Critiques and Self-Voting
Preethi Lahoti; Nicholas Blumm; Xiao Ma; Raghavendra Kotikalapudi; Sahitya Potluri; Qijun Tan; Hansa Srinivasan; Ben Packer; Ahmad Beirami; Alex Beutel; Jilin Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we formalize the problem of diversity of representation in LLM generations.


351, Bridging Information-Theoretic and Geometric Compression in Language Models
Emily Cheng; Corentin Kervadec; Marco Baroni;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose analyzing compression in (pre-trained) LMs from two points of view: geometric and information-theoretic.


352, Interactive Text-to-SQL Generation Via Editable Step-by-Step Explanations
Yuan Tian; Zheng Zhang; Zheng Ning; Toby Li; Jonathan Kummerfeld; Tianyi Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors.


353, Better Quality Pre-training Data and T5 Models for African Languages
Akintunde Oladipo; Mofetoluwa Adeyemi; Orevaoghene Ahia; Abraham Owodunni; Odunayo Ogundepo; David Adelani; Jimmy Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this study, we highlight the importance of enhancing the quality of pretraining data in multilingual language models.


354, Prompt As Triggers for Backdoor Attack: Examining The Vulnerability in Language Models
Shuai Zhao; Jinming Wen; Anh Luu; Junbo Zhao; Jie Fu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we propose ProAttack, a novel and efficient method for performing clean-label backdoor attacks based on the prompt, which uses the prompt itself as a trigger.


355, CorefPrompt: Prompt-based Event Coreference Resolution By Measuring Event Type and Argument Compatibilities
Sheng Xu; Peifeng Li; Qiaoming Zhu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Most previous studies adopt the "encoding first, then scoring" framework, making the coreference judgment rely on event encoding. Furthermore, current methods struggle to leverage human-summarized ECR rules, e.g., coreferential events should have the same event type, to guide the model. To address these two issues, we propose a prompt-based approach, CorefPrompt, to transform ECR into a cloze-style MLM (masked language model) task.


356, FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models
Konstantin Dobler; Gerard de Melo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose FOCUS - **F**ast **O**verlapping Token **C**ombinations **U**sing **S**parsemax, a novel embedding initialization method that effectively initializes the embedding matrix for a new tokenizer based on information in the source model's embedding matrix.
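
A rough sketch of the overlap-based initialization under simplifying assumptions: softmax stands in for sparsemax, and `aux_vec` stands in for the auxiliary static embeddings used to compare source and target tokens:

    import numpy as np

    # Sketch: copy embeddings for tokens shared by both tokenizers; build
    # each genuinely new token as a similarity-weighted mix of the
    # overlapping tokens' source embeddings.
    def init_embeddings(src_emb, new_vocab, aux_vec):
        overlap = [t for t in new_vocab if t in src_emb]
        out = {}
        for tok in new_vocab:
            if tok in src_emb:
                out[tok] = src_emb[tok]
            else:
                sims = np.array([aux_vec[tok] @ aux_vec[o] for o in overlap])
                w = np.exp(sims - sims.max())
                w /= w.sum()  # softmax here; FOCUS uses sparsemax
                out[tok] = sum(wi * src_emb[o] for wi, o in zip(w, overlap))
        return out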


357, NeuSTIP: A Neuro-Symbolic Model for Link and Time Prediction in Temporal Knowledge Graphs
Ishaan Singh; Navdeep Kaur; Garima Gaur; Mausam;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In response, we propose a novel NS model for TKGC called NeuSTIP, which performs link prediction and time interval prediction in a TKG.


358, ZGUL: Zero-shot Generalization to Unseen Languages Using Multi-source Ensembling of Language Adapters
Vipul Rathore; Rajdeep Dhingra; Parag Singla; Mausam;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We tackle the problem of zero-shot cross-lingual transfer in NLP tasks via the use of language adapters (LAs).


359, TacoPrompt: A Collaborative Multi-Task Prompt Learning Method for Self-Supervised Taxonomy Completion
Hongyuan Xu; Ciyi Liu; Yuhang Niu; Yunong Chen; Xiangrui Cai; Yanlong Wen; Xiaojie Yuan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To address the aforementioned limitations, we propose TacoPrompt, a Collaborative Multi-Task Prompt Learning Method for Self-Supervised Taxonomy Completion.


360, Non-Autoregressive Math Word Problem Solver with Unified Tree Structure
Yi Bin; Mengqun Han; Wenhao Shi; Lei Wang; Yang Yang; See-Kiong Ng; Heng Shen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: The multiple solution variants depicting different possible solving procedures for the same input problem raise two issues: 1) they make it hard for the model to learn the mapping between the input and output spaces effectively, and 2) they cause valid expression variants to be incorrectly judged as wrong during evaluation. To address these issues, we introduce a unified tree structure to represent a solution expression, where the elements are permutable and identical for all the expression variants.


361, Detecting Spoilers in Movie Reviews with External Movie Knowledge and User Networks
Heng Wang; Wenqian Zhang; Yuyang Bai; Zhaoxuan Tan; Shangbin Feng; Qinghua Zheng; Minnan Luo;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In light of these challenges, we first curate a large-scale network-based spoiler detection dataset LCS and a comprehensive and up-to-date movie knowledge base UKM. We then propose MVSD, a novel spoiler detection model that takes into account the external knowledge about movies and user activities on movie review platforms.


362, Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering
Fangkai Yang; Pu Zhao; Zezhong Wang; Lu Wang; Bo Qiao; Jue Zhang; Mohit Garg; Qingwei Lin; Saravan Rajmohan; Dongmei Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, centered around Microsoft products and IT technical problems encountered by customers.


363, TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models
Jing Xiong; Jianhao Shen; Ye Yuan; Haiming Wang; Yichun Yin; Zhengying Liu; Lin Li; Zhijiang Guo; Qingxing Cao; Yinya Huang; Chuanyang Zheng; Xiaodan Liang; Ming Zhang; Qun Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proof but also evaluates a generative LM's reasoning ability on formulas and capability to manipulate, group, and factor number terms.


364, IBADR: An Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU Models
Xiaoyue Wang; Xin Liu; Lijie Wang; Yaoxiang Wang; Jinsong Su; Hua Wu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose IBADR, an Iterative Bias-Aware Dataset Refinement framework, which debiases NLU models without predefining biased features.


365, The Curious Case of Hallucinatory (Un)answerability: Finding Truths in The Hidden States of Over-Confident Large Language Models
Aviv Slobodkin; Omer Goldman; Avi Caciularu; Ido Dagan; Shauli Ravfogel;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we explore the behavior of LLMs when presented with (un)answerable queries.


366, Counting The Bugs in ChatGPT's Wugs: A Multilingual Investigation Into The Morphological Capabilities of A Large Language Model
Leonie Weissweiler; Valentin Hofmann; Anjali Kantharuban; Anna Cai; Ritam Dutt; Amey Hengle; Anubha Kabra; Atharva Kulkarni; Abhishek Vijayakumar; Haofei Yu; Hinrich Schuetze; Kemal Oflazer; David Mortensen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We apply a version of Berko's (1958) wug test to ChatGPT, using novel, uncontaminated datasets for the four examined languages.


367, Bridging Background Knowledge Gaps in Translation with Automatic Explicitation
HyoJung Han; Jordan Boyd-Graber; Marine Carpuat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This work introduces techniques for automatically generating explicitations, motivated by WikiExpl: a dataset that we collect from Wikipedia and annotate with human translators.


368, DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models
Chengcheng Han; Xiaowei Du; Che Zhang; Yixin Lian; Xiang Li; Ming Gao; Baoyuan Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose Dialogue-guided Chain-of-Thought (DialCoT) to improve the reasoning capabilities of SLMs, with the aim of generating intermediate reasoning steps in a dialogue format to guide the model to the final answer.


369, ALDi: Quantifying The Arabic Level of Dialectness of Text
Amr Keleg; Sharon Goldwater; Walid Magdy;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce the AOC-ALDi dataset (derived from the AOC dataset), containing 127,835 sentences (17% from news articles and 83% from user comments on those articles) which are manually labeled with their level of dialectness.


370, Self-Improvement of Non-autoregressive Model Via Sequence-Level Distillation
Yusheng Liao; Shuyang Jiang; Yiqi Li; Yu Wang; Yanfeng Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a method called Sequence-Level Self-Distillation (SLSD), which aims to generate distilled data by the NAT model itself, eliminating the need for additional teacher networks.


371, Contextual Interaction for Argument Post Quality Assessment
Yiran Wang; Xuanang Chen; Ben He; Le Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: By incorporating this approach, we aim to enhance the assessment of argument quality by effectively distinguishing between arguments with subtle differences in quality.


372, Improving Image Captioning Via Predicting Structured Concepts
Ting Wang; Weidong Chen; Yuanhe Tian; Yan Song; Zhendong Mao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a structured concept predictor (SCP) to predict concepts and their structures, then integrate them into captioning, thereby enhancing the contribution of visual signals via concepts and further using their relations to distinguish cross-modal semantics for better description generation.


373, Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation
Tianqi Zhong; Quan Wang; Jingxuan Han; Yongdong Zhang; Zhendong Mao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This limitation hinders the effectiveness of decoding methods in achieving high levels of controllability. To address this problem, we propose a novel lightweight decoding framework named Air-Decoding.


374, E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation
Fengyi Fu; Lei Zhang; Quan Wang; Zhendong Mao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a novel emotion correlation enhanced empathetic dialogue generation framework, which comprehensively realizes emotion correlation learning, utilization, and supervision.


375, ATFormer: A Learned Performance Model with Transfer Learning Across Devices for Deep Learning Tensor Programs
Yang Bai; Wenqian Zhao; Shuo Yin; Zixiao Wang; Bei Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper presents ATFormer, a simple yet efficient design with attention-inspired modules to accurately predict the performance of optimized operators by capturing global and long-range dependencies within a complete scheduling space.


376, Small Language Models Fine-tuned to Coordinate Larger Language Models Improve Complex Reasoning
Gurusha Juneja; Subhabrata Dutta; Soumen Chakrabarti; Sunny Manchanda; Tanmoy Chakraborty;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce DaSLaM, which uses a decomposition generator to decompose complex problems into subproblems that require fewer reasoning steps.


377, Prototype-based HyperAdapter for Sample-Efficient Multi-task Tuning
Hao Zhao; Jie Fu; Zhaofeng He;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Despite their success, most existing methods adapt to each task independently without considering knowledge transfer between tasks, and are limited to low-data regimes. To overcome this issue, we propose Prototype-based HyperAdapter (PHA), a novel framework built on adapter tuning and hypernetworks.


378, Large Language Models Can Self-Improve
Jiaxin Huang; Shixiang Gu; Le Hou; Yuexin Wu; Xuezhi Wang; Hongkun Yu; Jiawei Han;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets.


379, The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions
Siru Ouyang; Shuohang Wang; Yang Liu; Ming Zhong; Yizhu Jiao; Dan Iter; Reid Pryzant; Chenguang Zhu; Heng Ji; Jiawei Han;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper provides a comprehensive analysis of the divergence between academic research in NLP and the needs of real-world NLP applications via a large-scale collection of user-GPT conversations.


380, GLEN: General-Purpose Event Detection for Thousands of Types
Sha Li; Qiusi Zhan; Kathryn Conger; Martha Palmer; Heng Ji; Jiawei Han;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To make event extraction systems more accessible, we build a general-purpose event detection dataset GLEN, which covers 205K event mentions with 3,465 different types, making it more than 20x larger in ontology than today's largest event dataset.


381, PIEClass: Weakly-Supervised Text Classification with Prompting and Noise-Robust Iterative Ensemble Training
Yunyi Zhang; Minhao Jiang; Yu Meng; Yu Zhang; Jiawei Han;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose a new method, PIEClass, consisting of two modules: (1) a pseudo label acquisition module that uses zero-shot prompting of pre-trained language models (PLM) to get pseudo labels based on contextualized text understanding beyond static keyword matching, and (2) a noise-robust iterative ensemble training module that iteratively trains classifiers and updates pseudo labels by utilizing two PLM fine-tuning methods that regularize each other.
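
A minimal sketch of the first module only (pseudo-label acquisition via zero-shot prompting); the verbalizer, threshold, and `mask_fill_scores` scorer are illustrative assumptions:

    # Sketch: score label words at a [MASK] slot and keep only
    # high-confidence predictions as pseudo labels.
    VERBALIZER = {"sports": "sports", "business": "business", "tech": "technology"}

    def pseudo_label(mask_fill_scores, text, threshold=0.7):
        prompt = f"{text} This article is about [MASK]."
        scores = mask_fill_scores(prompt, list(VERBALIZER.values()))
        word, conf = max(scores.items(), key=lambda kv: kv[1])
        if conf < threshold:
            return None  # leave low-confidence documents unlabeled
        return {v: k for k, v in VERBALIZER.items()}[word]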


382, CombLM: Adapting Black-Box Language Models Through Small Fine-Tuned Models
Aitor Ormazabal; Mikel Artetxe; Eneko Agirre;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we present a lightweight method for adapting large LMs to new domains and tasks, assuming no access to their weights or intermediate activations.


383, Language Model Is Suitable for Correction of Handwritten Mathematical Expressions Recognition
Zui Chen; Jiaqi Han; Chaofan Yang; Yi Zhou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This article investigates the distinctive language characteristics of LaTeX mathematical expressions, revealing two key observations: 1) the presence of explicit structural symbols, and 2) the treatment of symbols, particularly letters, as minimal units with context-dependent semantics, representing variables or constants. Rooted in these properties, we propose that language models have the potential to synchronously and complementarily provide both structural and semantic information, making them suitable for correction of HMER.


384, POE: Process of Elimination for Multiple Choice Reasoning
Chenkai Ma; Xinya Du;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To this end, we present the Process of Elimination (POE), a two-step scoring method.
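
A minimal sketch of elimination followed by prediction; `score_option` stands in for an LM scoring call, and the mean-score cutoff is an illustrative choice:

    # Sketch: step 1 eliminates implausible options, step 2 predicts among
    # the survivors only.
    def process_of_elimination(score_option, question, options):
        scores = {o: score_option(question, o, options) for o in options}
        mean = sum(scores.values()) / len(scores)
        survivors = [o for o in options if scores[o] >= mean]
        # Re-score with eliminated options removed from the prompt so the
        # model reasons only over plausible candidates.
        final = {o: score_option(question, o, survivors) for o in survivors}
        return max(final, key=final.get)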


385, A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems
Songbo Hu; Han Zhou; Moy Yuan; Milan Gritta; Guchun Zhang; Ignacio Iacobacci; Anna Korhonen; Ivan Vulic;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Beyond providing a series of insights into the performance disparities of ToD systems in different languages, our analyses offer practical tips on how to approach ToD data collection and system development for new languages.


386, Program Translation Via Code Distillation
Yufan Huang; Mengnan Qi; Yongqiang Yao; Maoquan Wang; Bin Gu; Colin Clement; Neel Sundaresan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper we propose a novel model called Code Distillation (CoDist) whereby we capture the semantic and structural equivalence of code in a language agnostic intermediate representation.


387, SUT: Active Defects Probing for Transcompiler Models
Mengnan Qi; Yufan Huang; Maoquan Wang; Yongqiang Yao; Zihan Liu; Bin Gu; Colin Clement; Neel Sundaresan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Metrics like BLEU, CodeBLEU, and computation accuracy may not expose these issues. In this paper, we introduce new metrics for programming language translation that address these basic syntax errors.


388, MT2: Towards A Multi-Task Machine Translation Model with Translation-Specific In-Context Learning
Chunyou Li; Mingtong Liu; Hongxiao Zhang; Yufeng Chen; Jinan Xu; Ming Zhou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Most previous work uses separate models or methods to solve these tasks, which is not conducive to knowledge transfer across tasks and increases the complexity of system construction. In this work, we explore the potential of pre-trained language models for machine translation tasks and propose a Multi-Task Machine Translation (MT2) model to integrate these translation tasks.


389, Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers
Chen Tang; Shun Wang; Tomas Goldsack; Chenghua Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers.


390, Compressing Context to Enhance Inference Efficiency of Large Language Models
Yucheng Li; Bo Dong; Frank Guerin; Chenghua Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper proposes a method called Selective Context that enhances the inference efficiency of LLMs by identifying and pruning redundancy in the input context to make the input more compact.
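
A minimal sketch of redundancy pruning via self-information, assuming a hypothetical `token_logprob(prefix, token)` from a small causal LM; pruning individual tokens and the 0.5 keep-ratio are simplifications:

    # Sketch: tokens the LM already finds predictable (low self-information)
    # are pruned; surprising tokens are kept.
    def prune_context(token_logprob, tokens, keep_ratio=0.5):
        info = [-token_logprob(tokens[:i], t) for i, t in enumerate(tokens)]
        k = max(1, int(len(tokens) * keep_ratio))
        cutoff = sorted(info, reverse=True)[k - 1]
        return [t for t, s in zip(tokens, info) if s >= cutoff]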


391, A Diachronic Analysis of Paradigm Shifts in NLP Research: When, How, and Why?
Aniket Pramanick; Yufang Hou; Saif Mohammad; Iryna Gurevych;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this study, we propose a systematic framework for analyzing the evolution of research topics in a scientific field using causal discovery and inference techniques.


392, Learning from Mistakes Via Cooperative Study Assistant for Large Language Models
Danqing Wang; Lei Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose Study Assistant for Large LAnguage Model (SALAM), a novel framework with an auxiliary agent to assist the main LLM in learning from mistakes through interactive cooperation.


393, ClusterLLM: Large Language Models As A Guide for Text Clustering
Yuwei Zhang; Zihan Wang; Jingbo Shang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce ClusterLLM, a novel text clustering framework that leverages feedback from an instruction-tuned large language model, such as ChatGPT.
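
A minimal sketch of the kind of triplet query such a framework can pose to an instruction-tuned model; the `llm` callable and prompt wording are assumptions:

    # Sketch: ask which of two candidates is closer to the anchor; the
    # answers can then supervise a small embedder or clustering granularity.
    def triplet_feedback(llm, anchor, b, c):
        reply = llm(
            "Which sentence is more similar to the anchor?\n"
            f"Anchor: {anchor}\nChoice 1: {b}\nChoice 2: {c}\n"
            "Answer 1 or 2."
        )
        return b if "1" in reply else c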


394, Can Language Models Laugh at YouTube Short-form Videos?
Dayoon Ko; Sangho Lee; Gunhee Kim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We curate a user-generated dataset of 10K multimodal funny videos from YouTube, called ExFunTube.


395, MRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images
Keighley Overbay; Jaewoo Ahn; Fatemeh Pesaran zadeh; Joonsuk Park; Gunhee Kim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To this end, we present mRedditSum, the first multimodal discussion summarization dataset.


396, PRESTO: A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs
Rahul Goel; Waleed Ammar; Aditya Gupta; Siddharth Vashishtha; Motoki Sano; Faiz Surani; Max Chang; HyunJeong Choe; David Greene; Chuan He; Rattima Nitisaroj; Anna Trukhina; Shachi Paul; Pararth Shah; Rushin Shah; Zhou Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To enable research on some of the more challenging aspects of parsing realistic conversations, we introduce PRESTO, a public dataset of over 550K contextual multilingual conversations between humans and virtual assistants.


397, Condensing Multilingual Knowledge with Lightweight Language-Specific Modules
Haoran Xu; Weiting Tan; Shuyue Li; Yunmo Chen; Benjamin Van Durme; Philipp Koehn; Kenton Murray;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present Language-specific Matrix Synthesis (LMS), a novel method that addresses the issue.


398, PROSE: A Pronoun Omission Solution for Chinese-English Spoken Language Translation
Ke Wang; Xiutian Zhao; Yanghui Li; Wei Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To alleviate the negative impact introduced by pro-drop, we propose Mention-Aware Semantic Augmentation, a novel approach that leverages the semantic embedding of dropped pronouns to augment training pairs.


399, M3Seg: A Maximum-Minimum Mutual Information Paradigm for Unsupervised Topic Segmentation in ASR Transcripts
Ke Wang; Xiutian Zhao; Yanghui Li; Wei Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose M3Seg, a novel Maximum-Minimum Mutual information paradigm for linear topic segmentation without using any parallel data.


400, GEMINI: Controlling The Sentence-Level Summary Style in Abstractive Text Summarization
Guangsheng Bao; Zebin Ou; Yue Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: These techniques are flexible and thus difficult to imitate with any single method. To address this issue, we propose an adaptive model, GEMINI, that integrates a rewriter and a generator to mimic the sentence rewriting and abstracting techniques, respectively.


401, Optimizing Retrieval-augmented Reader Models Via Token Elimination
Moshe Berchansky; Peter Izsak; Avi Caciularu; Ido Dagan; Moshe Wasserblat;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we analyze the contribution and necessity of all the retrieved passages to the performance of reader models, and propose eliminating some of the retrieved information, at the token level, that might not contribute essential information to the answer generation process.


402, Learning Retrieval Augmentation for Personalized Dialogue Generation
Qiushi Huang; Shuai Fu; Xubo Liu; Wenwu Wang; Tom Ko; Yu Zhang; Lilian Tang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, persona profiles, a prevalent setting in current personalized dialogue datasets, are typically composed of merely four to five sentences and may not offer comprehensive descriptions of the agent's persona, posing a challenge for generating truly personalized dialogues. To handle this problem, we propose Learning Retrieval Augmentation for Personalized DialOgue Generation (LAPDOG), which studies the potential of leveraging external knowledge for persona dialogue generation.


403, Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation
Mateusz Lango; Ondrej Dusek;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we explore a new way to mitigate hallucinations by combining the probabilistic output of a generator language model (LM) with the output of a special "text critic" classifier, which guides the generation by assessing the match between the input data and the text generated so far.
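
A minimal sketch of one decoding step under this scheme; `lm_next_probs` and `critic_prob` are stand-ins for the generator and the critic, and multiplying the two probabilities is one simple combination rule:

    # Sketch: rescore the generator's top next-token candidates by how well
    # the extended text still matches the input data, per the critic.
    def critic_step(lm_next_probs, critic_prob, data, prefix, top_k=20):
        probs = lm_next_probs(prefix)  # dict: token -> p_LM(token | prefix)
        candidates = sorted(probs, key=probs.get, reverse=True)[:top_k]
        rescored = {
            tok: probs[tok] * critic_prob(data, prefix + [tok])
            for tok in candidates
        }
        return max(rescored, key=rescored.get)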


404, Image Manipulation Via Multi-Hop Instructions - A New Dataset and Weakly-Supervised Neuro-Symbolic Approach
Harman Singh; Poorva Garg; Mohit Gupta; Kevin Shah; Ashish Goswami; Satyam Modi; Arnab Mondal; Dinesh Khandelwal; Dinesh Garg; Parag Singla;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We create a new dataset for the task, and extensive experiments demonstrate that NeuroSIM is highly competitive with or beats SOTA baselines that make use of supervised data for manipulation.


405, Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Emanuele Bugliarello; Aida Nematzadeh; Lisa Hendricks;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we take a step further and explore how we can tap into supervision from small-scale visual relation data.


406, Large Language Models Are Temporal and Causal Reasoners for Video Question Answering
Dohwan Ko; Ji Lee; Woo-Young Kang; Byungseok Roh; Hyunwoo Kim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we develop LLaMA-VQA by applying Flipped-VQA to LLaMA, and it outperforms both LLMs-based and non-LLMs-based models on five challenging VideoQA benchmarks.


407, KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing
Seonmin Koo; Chanjun Park; Jinsung Kim; Jaehyung Seo; Sugyeong Eo; Hyeonseok Moon; Heuiseok Lim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Conventional evaluation metrics for ASR systems produce a singular aggregate score, which is insufficient for understanding specific system vulnerabilities. Therefore, we aim to address the limitations of the previous ASR evaluation methods by introducing the Korean Error Explainable Benchmark Dataset for ASR and Post-processing (KEBAP).


408, Post-hoc Utterance Refining Method By Entity Mining for Faithful Knowledge Grounded Conversations
Yoonna Jang; Suhyune Son; Jeongwoo Lee; Junyoung Son; Yuna Hur; Jungwoo Lim; Hyeonseok Moon; Kisu Yang; Heuiseok Lim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In particular, entity-level hallucination that causes critical misinformation and undesirable conversation is one of the major concerns. To address this issue, we propose a post-hoc refinement method called REM.


409, Parameter-Efficient Language Model Tuning with Active Learning in Low-Resource Settings
Josip Jukic; Jan Snajder;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present an empirical study of PEFT behavior with AL in low-resource settings for text classification tasks.


410, CHEF in The Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
Jaehyung Seo; Hyeonseok Moon; Jaewook Lee; Sugyeong Eo; Chanjun Park; Heuiseok Lim;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: The complexity of morphological variations allows for diverse sentence forms based on the syntactic-semantic integration of functional morphemes (i.e., affixes) with lexical morphemes (i.e., roots). With this in mind, we propose CHEF, a method replicating the morphological transformations inherent in sentences based on lexical and functional morpheme combinations through generative data augmentation.


411, Comparing Styles Across Languages
Shreya Havaldar; Matthew Pressimone; Eric Wong; Lyle Ungar;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce an explanation framework to extract stylistic differences from multilingual LMs and compare styles across languages.


412, Benchmarking and Improving Text-to-SQL Generation Under Ambiguity
Adithya Bhaskar; Tushar Tomar; Ashutosh Sathe; Sunita Sarawagi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose LogicalBeam, a new decoding algorithm that navigates the SQL logic space using a blend of plan-based template generation and constrained infilling.


413, ReSee: Responding Through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue
Haoqin Tu; Yitong Li; Fei Mi; Zhongliang Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We propose to explicitly split the visual knowledge into finer granularity ("turn-level" and "entity-level").


414, Multilingual K-Nearest-Neighbor Machine Translation
David Stap; Christof Monz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, these improvements have been limited to high-resource language pairs, with large datastores, and remain a challenge for low-resource languages. In this paper, we address this issue by combining representations from multiple languages into a single datastore.
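
A minimal sketch of a shared datastore pooling entries from several language pairs; brute-force search stands in for the approximate nearest-neighbor index a real system would use:

    import numpy as np

    # Sketch: (decoder hidden state -> target token) entries from any
    # language pair go into one datastore, queried at decoding time.
    class MultilingualDatastore:
        def __init__(self):
            self.keys, self.values = [], []

        def add(self, hidden_states, target_tokens):
            self.keys.extend(hidden_states)
            self.values.extend(target_tokens)

        def knn(self, query, k=8):
            dists = np.linalg.norm(np.stack(self.keys) - query, axis=1)
            return [self.values[i] for i in np.argsort(dists)[:k]]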


415, Make Every Example Count: On The Stability and Utility of Self-Influence for Learning from Noisy NLP Datasets
Irina Bejan; Artem Sokolov; Katja Filippova;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We study the fitness of task-agnostic self-influence scores of training examples for data cleaning, analyze their efficacy in capturing naturally occurring outliers, and investigate to what extent self-influence based data cleaning can improve downstream performance in machine translation, question answering and text classification, building up on recent approaches to self-influence calculation and automated curriculum learning.


416, MarkQA: A Large Scale KBQA Dataset with Numerical Reasoning
Xiang Huang; Sitao Cheng; Yuheng Bao; Shanshan Huang; Yuzhong Qu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we focus on the complex numerical reasoning in KBQA, and propose a new task, NR-KBQA, which necessitates the ability to perform both multi-hop reasoning and numerical reasoning.


417, Learning to Rank Context for Named Entity Recognition Using A Synthetic Dataset
Arthur Amalvy; Vincent Labatut; Richard Dufour;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Instead, we propose to generate a synthetic context retrieval training dataset using Alpaca, an instruction-tuned large language model (LLM).


418, Paraphrase Types for Generation and Detection
Jan Philip Wahle; Bela Gipp; Terry Ruas;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language. This paper introduces two new tasks to address this shortcoming by considering paraphrase types: specific linguistic perturbations at particular text positions.


419, DueT: Image-Text Contrastive Transfer Learning with Dual-adapter Tuning
Taku Hasegawa; Kyosuke Nishida; Koki Maeda; Kuniko Saito;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper presents DueT, a novel transfer learning method for vision and language models built by contrastive learning.


420, Empower Nested Boolean Logic Via Self-Supervised Curriculum Learning
Hongqiu Wu; Linfeng Liu; Hai Zhao; Min Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We find that pre-trained language models, even large language models, behave like a random selector in the face of multi-nested boolean logic, a task that humans can handle with ease. To empower language models with this fundamental capability, this paper proposes a new self-supervised learning method, Curriculum Logical Reasoning (Clr), in which we augment the training data with nested boolean logic chains step by step and program the training to progress gradually from simpler logical patterns to harder ones.
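
A minimal sketch of generating nested boolean training chains of increasing depth; the and/or/not grammar and the depth-ordered curriculum are the obvious illustrative choices:

    import random

    # Sketch: depth-0 expressions are bare truth values; each extra level
    # wraps subexpressions in and/or/not, yielding harder examples.
    def nested_boolean(depth):
        if depth == 0:
            v = random.choice([True, False])
            return str(v), v
        op = random.choice(["and", "or", "not"])
        if op == "not":
            expr, val = nested_boolean(depth - 1)
            return f"not ({expr})", not val
        (l, lv), (r, rv) = nested_boolean(depth - 1), nested_boolean(depth - 1)
        val = (lv and rv) if op == "and" else (lv or rv)
        return f"({l}) {op} ({r})", val

    # Curriculum: simpler patterns first, then deeper nesting.
    curriculum = [nested_boolean(d) for d in (1, 2, 3)]
    for expr, label in curriculum:
        assert eval(expr) == label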


421, Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries
Ashish Mittal; Sunita Sarawagi; Preethi Jyothi; George Saon; Gakuto Kurata;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a novel inference algorithm that improves the prediction of state-of-the-art ASR models using nearest-neighbor-based matching on an inference-time word list.


422, Dr ChatGPT Tell Me What I Want to Hear: How Different Prompts Impact Health Answer Correctness
Bevan Koopman; Guido Zuccon;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper investigates the significant impact different prompts have on the behaviour of ChatGPT when used for health information seeking.


423, ToolWriter: Question Specific Tool Synthesis for Tabular Data
Carlos Gemmell; Jeff Dalton;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Unlike humans who use programmatic tools like filters to transform data before processing, language models in TQA process tables directly, resulting in information loss as table size increases. In this paper we propose ToolWriter to generate query specific programs and detect when to apply them to transform tables and align them with the TQA model's capabilities.
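
A minimal sketch of the generate-then-apply idea with a stubbed `llm`; executing model-written code via eval is for illustration only and would need sandboxing in practice:

    # Sketch: synthesize a question-specific row filter, apply it to shrink
    # the table, and fall back to the full table if it over-filters.
    def apply_tool(llm, question, table):
        code = llm(
            "Write a Python lambda taking `row` (a dict) and returning True "
            f"for rows needed to answer: {question!r}. "
            f"Columns: {list(table[0])}"
        )
        keep = eval(code)  # e.g. "lambda row: row['year'] == '2020'"
        filtered = [row for row in table if keep(row)]
        return filtered or table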


424, Large Language Models Are Complex Table Parsers
Bowen Zhao; Changkai Ji; Yuejie Zhang; Wen He; Yingwen Wang; Qing Wang; Rui Feng; Xiaobo Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose to incorporate GPT-3.5 to address such challenges, in which complex tables are reconstructed into tuples and specific prompt designs are employed for dialogues.


425, Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT
Xiaoshuai Song; Keqing He; Pei Wang; Guanting Dong; Yutao Mou; Jingang Wang; Yunsen Xian; Xunliang Cai; Weiran Xu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Through a series of analytical experiments, we further summarize and discuss the challenges faced by LLMs, including clustering, domain-specific understanding, and cross-domain in-context learning scenarios.


426, Chinese Lexical Substitution: Dataset and Method
Jipeng Qiang; Kang Liu; Ying Li; Yun Li; Yi Zhu; Yun-Hao Yuan; Xiaocheng Hu; Xiaoye Ouyang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Existing lexical substitution (LS) benchmarks were collected by asking human annotators to think of substitutes from memory, resulting in benchmarks with limited coverage and relatively small scales. To overcome this problem, we propose a novel annotation method to construct an LS dataset based on human and machine collaboration.


427, Biomedical Named Entity Recognition Via Dictionary-based Synonym Generalization
Zihao Fu; Yixuan Su; Zaiqiao Meng; Nigel Collier;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we propose a novel Synonym Generalization (SynGen) framework that recognizes the biomedical concepts contained in the input text using span-based predictions.


428, Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning Via Compositional Operations
James Huang; Wenlin Yao; Kaiqiang Song; Hongming Zhang; Muhao Chen; Dong Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings that supports compositional sentence operations in the embedding space.


429, More Than Spoken Words: Nonverbal Message Extraction and Generation
Dian Yu; Xiaoyang Wang; Wanshun Chen; Nan Du; Longyue Wang; Haitao Mi; Dong Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper introduces the task of extracting NMs in written text and generating NMs for spoken text.


430, Mitigating Backdoor Poisoning Attacks Through The Lens of Spurious Correlation
Xuanli He; Qiongkai Xu; Jun Wang; Benjamin Rubinstein; Trevor Cohn;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper posits that backdoor poisoning attacks exhibit a spurious correlation between simple text features and classification labels, and accordingly, proposes methods for mitigating spurious correlation as means of defence.


431, BUSTER: A "BUSiness Transaction Entity Recognition" Dataset
Andrea Zugarini; Andrew Zamai; Marco Ernandes; Leonardo Rigutini;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To support industry-oriented research, we present BUSTER, a BUSiness Transaction Entity Recognition dataset.


432, Query-as-context Pre-training for Dense Passage Retrieval
Xing Wu; Guangyuan Ma; Wanhui Qian; Zijia Lin; Songlin Hu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Thus, this paper proposes query-as-context pre-training, a simple yet effective pre-training technique to alleviate the issue.


433, CT-GAT: Cross-Task Generative Adversarial Attack Based on Transferability
Minxuan Lv; Chengwei Dai; Kun Li; Wei Zhou; Songlin Hu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose a novel approach that directly constructs adversarial examples by extracting transferable features across various tasks.


434, Rationale-Enhanced Language Models Are Better Continual Relation Learners
Weimin Xiong; Yifan Song; Peiyi Wang; Sujian Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To address the issue, we introduce rationales, i.e., the explanations of relation classification results generated by Large Language Models (LLMs), into the CRE task.


435, Hierarchical Pretraining on Multimodal Electronic Health Records
Xiaochen Wang; Junyu Luo; Jiaqi Wang; Ziyi Yin; Suhan Cui; Yuan Zhong; Yaqing Wang; Fenglong Ma;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, in the medical domain, existing pretrained models on electronic health records (EHR) fail to capture the hierarchical nature of EHR data, limiting their generalization capability across diverse downstream tasks using a single pretrained model. To tackle this challenge, this paper introduces a novel, general, and unified pretraining framework called MedHMP, specifically designed for hierarchically multimodal EHR data.


436, Identifying Informational Sources in News Articles
Alexander Spangher; Nanyun Peng; Emilio Ferrara; Jonathan May;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Modeling when, how and why sources get used together in stories can help us better understand the information we consume and even help journalists with the task of producing it. In this work, we take steps toward this goal by constructing the largest and widest-ranging annotated dataset, to date, of informational sources used in news writing.


437, A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation
Giuseppe Attanasio; Flor Plaza del Arco; Debora Nozza; Anne Lauscher;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In MT, this might lead to misgendered translations, resulting, among other harms, in the perpetuation of stereotypes and prejudices. In this work, we address this gap by investigating whether and to what extent such models exhibit gender bias in machine translation and how we can mitigate it.


438, Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Seraphina Goldfarb-Tarrant; Björn Ross; Adam Lopez;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study, and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.


439, Can You Follow Me? Testing Situational Understanding for ChatGPT
Chenghao Yang; Allyson Ettinger;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Previous works have identified certain SU limitations in non-chatbot Large Language Models (LLMs), but the extent and causes of these limitations are not well understood, and the capabilities of current chat-based models in this domain have not been explored. In this work we tackle these questions, proposing a novel synthetic environment for SU testing that allows controlled and systematic testing of SU in chat-oriented models, through assessment of models' ability to track and enumerate environment states.


440, SimCSE++: Improving Contrastive Learning for Sentence Embeddings from Two Perspectives
Jiahao Xu; Wei Shao; Lihui Chen; Lemao Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Specifically, for the first perspective, we identify that the dropout noise from negative pairs affects the model's performance. Therefore, we propose a simple yet effective method to deal with this type of noise.


441, Question Answering As Programming for Solving Time-Sensitive Questions
Xinyu Zhu; Cheng Yang; Bei Chen; Siheng Li; Jian-Guang Lou; Yujiu Yang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This can be attributed to the LLMs' inability to perform rigorous reasoning based on surface-level text semantics. To overcome this limitation, rather than requiring LLMs to directly answer the question, we propose a novel approach where we reframe the Question Answering task as Programming (QAaP).
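
A minimal sketch of the reframing: instead of answering in free text, the model would emit structured facts and a small program, and the answer comes from executing it. The facts and helper below are invented for illustration:

    from datetime import date

    # Sketch: hypothetical facts parsed by the LLM as (entity, role, start, end).
    facts = [
        ("A. Smith", "CEO", date(2010, 1, 1), date(2015, 6, 30)),
        ("B. Jones", "CEO", date(2015, 7, 1), date(2021, 3, 1)),
    ]

    def who_held_role_on(role, when):
        # Symbolic date comparison replaces surface-text temporal reasoning.
        for entity, r, start, end in facts:
            if r == role and start <= when <= end:
                return entity
        return "unknown"

    print(who_held_role_on("CEO", date(2018, 5, 1)))  # -> B. Jones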


442, Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective
Tianyu Liu; Afra Amini; Mrinmaya Sachan; Ryan Cotterell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Such tasks, in general, require exhaustive pair-wise comparisons of tokens, thus having a quadratic runtime complexity in the length of the string. We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by casting the relation between tokens as a partial order over the string.


443, On The Representational Capacity of Recurrent Neural Language Models
Franz Nowak; Anej Svete; Li Du; Ryan Cotterell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This work investigates the computational expressivity of language models (LMs) based on recurrent neural networks (RNNs).


444, Recurrent Neural Language Models As Probabilistic Finite-state Automata
Anej Svete; Ryan Cotterell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, LMs do not describe unweighted formal languages; rather, they define probability distributions over strings. In this work, we study what classes of such probability distributions RNN LMs can represent, which allows us to make more direct statements about their capabilities.


445, Analysing State-Backed Propaganda Websites: A New Dataset and Linguistic Study
Freddy Heppell; Kalina Bontcheva; Carolina Scarton;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: The main contribution of this paper for the NLP community is in the novel dataset which enables studies of disinformation networks, and the training of NLP tools for disinformation detection.


446, A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video
Keito Kudo; Haruki Nagasawa; Jun Suzuki; Nobuyuki Shimizu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper proposes a practical multimodal video summarization task setting and a dataset to train and evaluate the task.


447, Natural Disaster Tweets Classification Using Multimodal Data
Mohammad Basit; Bashir Alam; Zubaida Fatima; Salman Shaikh;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we have explored different models which can lead to the development of a system that deals with multimodal datasets and can perform sequential hierarchical classification.


448, DSI++: Updating Transformer Memory with New Documents
Sanket Mehta; Jai Gupta; Yi Tay; Mostafa Dehghani; Vinh Tran; Jinfeng Rao; Marc Najork; Emma Strubell; Donald Metzler;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we introduce DSI++, a continual learning challenge for DSI with the goal of continuously indexing new documents while being able to answer queries related to both previously and newly indexed documents.


449, Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions
Lucie-Aimée Kaffee; Arnav Arora; Isabelle Augenstein;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Currently, only a few comments explicitly mention those policies: 20% of the English ones, but as few as 2% of the German and Turkish comments. To aid in this process of understanding how content is moderated, we construct a novel multilingual dataset of Wikipedia editor discussions along with their reasoning in three languages.


450, From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning
Zheyuan Zhang; Shane Storks; Fengyuan Hu; Sungryull Sohn; Moontae Lee; Honglak Lee; Joyce Chai;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Cognitive psychology theorizes that humans are capable of utilizing fast and intuitive *heuristic* thinking to make decisions based on past experience, then rationalizing the decisions through slower and deliberative *analytic* reasoning. We incorporate these interlinked dual processes in fine-tuning and in-context learning with PLMs, applying them to two language understanding tasks that require coherent physical commonsense reasoning.


451, Explaining Interactions Between Text Spans
Sagnik Choudhury; Pepa Atanasova; Isabelle Augenstein;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Most notably, there is a lack of annotations capturing the human decision-making process with respect to the necessary interactions for informed decision-making in such tasks. To bridge this gap, we introduce SpanEx, a multi-annotator dataset of human span interaction explanations for two NLU tasks: NLI and FC.


452, CoF-CoT: Enhancing Large Language Models with Coarse-to-Fine Chain-of-Thought Prompting for Multi-domain NLU Tasks
Hoang Nguyen; Ye Liu; Chenwei Zhang; Tao Zhang; Philip Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Motivated by multi-step reasoning of LLMs, we propose Coarse-to-Fine Chain-of-Thought (CoF-CoT) approach that breaks down NLU tasks into multiple reasoning steps where LLMs can learn to acquire and leverage essential concepts to solve tasks from different granularities.


453, Revisiting Automated Topic Model Evaluation with Large Language Models
Dominik Stammbach; Vilém Zouhar; Alexander Hoyle; Mrinmaya Sachan; Elliott Ash;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Automatically evaluating their output and determining the optimal number of topics are both longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models (LLMs) for these tasks.


454, Let GPT Be A Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation
Zhenwen Liang; Wenhao Yu; Tanmay Rajpurohit; Peter Clark; Xiangliang Zhang; Ashwin Kalyan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models.


455, ToViLaG: Your Visual-Language Generative Model Is Also An Evildoer
Xinpeng Wang; Xiaoyuan Yi; Han Jiang; Shanlin Zhou; Zhihua Wei; Xing Xie;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: On such a basis, we benchmarked the toxicity of a diverse spectrum of VLGMs and discovered that some models do more evil than expected while some are more vulnerable to infection, underscoring the necessity of VLGMs detoxification. Therefore, we develop an innovative bottleneck-based detoxification method.


456, Longtriever: A Pre-trained Long Text Encoder for Dense Document Retrieval
Junhan Yang; Zheng Liu; Chaozhuo Li; Guangzhong Sun; Xing Xie;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, a novel retrieval model Longtriever is proposed to embrace three core challenges of long document retrieval: substantial computational cost, incomprehensive document understanding, and scarce annotations.


457, Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge
Te-Lin Wu; Yu Zhou; Nanyun Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: While existing works approach this problem from a pure vision perspective, we investigate to what extent the textual modality (i.e., task instructions) and its interaction with the visual modality can be beneficial. Specifically, we propose to improve phrase grounding models' ability to localize the active objects by: (1) learning the role of "objects undergoing change" and extracting them accurately from the instructions, (2) leveraging pre- and post-conditions of the objects during actions, and (3) recognizing the objects more robustly with descriptional knowledge.


458, Harnessing Black-Box Control to Boost Commonsense in LM's Generation
Yufei Tian; Felix Zhang; Nanyun Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way).


459, Gender Biases in Automatic Evaluation Metrics for Image Captioning
Haoyi Qiu; Zi-Yi Dou; Tianlu Wang; Asli Celikyilmaz; Nanyun Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we conduct a systematic study of gender biases in model-based automatic evaluation metrics for image captioning tasks.


460, ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos
Te-Lin Wu; Zi-Yi Dou; Qingyuan Hu; Yu Hou; Nischal Chandra; Marjorie Freedman; Ralph Weischedel; Nanyun Peng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Among them, they only cover reasoning over synthetic environments or specific types of events (e.g., traffic collisions), making it hard to reliably benchmark model generalization ability in diverse real-world scenarios and reasoning dimensions. To overcome these limitations, we develop a video question answering dataset, ACQUIRED: it consists of 3.9K annotated videos, encompassing a wide range of event types and incorporating both first- and third-person viewpoints, which ensures a focus on real-world diversity.


461, HyperNetwork-based Decoupling to Improve Model Generalization for Few-Shot Relation Extraction
Liang Zhang; Chulun Zhou; Fandong Meng; Jinsong Su; Yidong Chen; Jie Zhou;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: By investigating the class separation of an FSRE model, we find that model upper layers are prone to learn relation-specific knowledge. Therefore, in this paper, we propose a HyperNetwork-based Decoupling approach to improve the generalization of FSRE models.


462, HutCRS: Hierarchical User-Interest Tracking for Conversational Recommender System
Mingjie Qian; Yongsen Zheng; Jinghui Qin; Liang Lin;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Furthermore, these methods assume that users like all attributes of the target item and dislike those unrelated to it, which can introduce bias in attribute-level feedback and impede the system's ability to accurately identify the target item. To address these issues, we propose a more realistic, user-friendly, and explainable CRS framework called Hierarchical User-Interest Tracking for Conversational Recommender System (HutCRS).


463, Bias Neutralization in Non-Parallel Texts: A Cyclic Approach with Auxiliary Guidance
Karthic Madanagopal; James Caverlee;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Toward expanding the reach of bias neutralization, we propose in this paper a new approach called FairBalance.


464, Causal Reasoning Through Two Cognition Layers for Improving Generalization in Visual Question Answering
Trang Nguyen; Naoaki Okazaki;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Moreover, diverse interpretations of the input lead to various modes of answer generation, highlighting the role of causal reasoning between the interpreting and answering steps in VQA. Through this lens, we propose Cognitive pathways VQA (CopVQA), which improves multimodal predictions by emphasizing causal reasoning factors.


465, DocumentNet: Bridging The Data Gap in Document Pre-training
Lijun Yu; Jin Miao; Xiaoyu Sun; Jiayi Chen; Alexander Hauptmann; Hanjun Dai; Wei Wei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a method to collect massive-scale and weakly labeled data from the web to benefit the training of VDER models.


466, Dual-Channel Span for Aspect Sentiment Triplet Extraction
Pan Li; Ping Li; Kai Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, most existing span-based approaches suffer from enumerating all possible spans, which can introduce too much noise into sentiment triplet extraction. To ease this burden, we propose a dual-channel span generation method to coherently constrain the search space of span candidates.


467, Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories
Suyu Ge; Chenyan Xiong; Corby Rosset; Arnold Overwijk; Jiawei Han; Paul Bennett;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we improve the zero-shot generalization ability of language models via Mixture-Of-Memory Augmentation (MoMA), a mechanism that retrieves augmentation documents from multiple information corpora (external memories), with the option to "plug in" unseen memory at inference time.


468, "Fifty Shades of Bias": Normative Ratings of Gender Bias in GPT Generated English Text
Rishav Hada; Agrima Seth; Harshita Diddee; Kalika Bali;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Specifically, we create the first dataset of GPT-generated English text with normative ratings of gender bias.


469, Unsupervised Grammatical Error Correction Rivaling Supervised Methods
Hannan Cao; Liping Yuan; Yuchen Zhang; Hwee Tou Ng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we employ the Break-It-Fix-It (BIFI) method to build an unsupervised GEC system.


470, Merging Generated and Retrieved Knowledge for Open-Domain QA
Yunxiang Zhang; Muhammad Khalifa; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Lu Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Based on the intuition that answers supported by both sources are more likely to be correct, we propose COMBO, a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework, to effectively leverage the two sources of information.


471, PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation
Gaurav Sahu; Olga Vechtomova; Dzmitry Bahdanau; Issam Laradji;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose a method to generate more helpful augmented data by utilizing the LLM?s abilities to follow instructions and perform few-shot classifications.


472, Transfer-Free Data-Efficient Multilingual Slot Labeling
Evgeniia Razumovskaia; Ivan Vulic; Anna Korhonen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To mitigate the inherent data scarcity issue, current research on multilingual ToD assumes that sufficient English-language annotated data are always available for particular tasks and domains, and thus operates in a standard cross-lingual transfer setup. In this work, we depart from this often unrealistic assumption.


473, 4 and 7-bit Labeling for Projective and Non-Projective Dependency Trees
Carlos Gómez-Rodríguez; Diego Roca; David Vilares;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word.
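
A hypothetical sketch of the general recipe (the paper defines the exact 4- and 7-bit semantics; the four boolean features below are illustrative assumptions): derive one fixed-width bit label per word from its head index, so parsing reduces to sequence labeling over a tiny label set.

```python
# Hypothetical sketch, not the paper's exact encoding: one fixed-width bit
# label per word, derived from 0-based head indices (-1 marks the root).

def bit_labels(heads):
    """heads[i] = 0-based head of word i, or -1 for the root."""
    n = len(heads)
    labels = []
    for i, h in enumerate(heads):
        head_is_left = h != -1 and h < i                         # head direction
        has_left = any(heads[j] == i for j in range(i))          # left dependents?
        has_right = any(heads[j] == i for j in range(i + 1, n))  # right dependents?
        if h == -1:                          # outermost dependent of its head
            outermost = True                 # on its own side of the head
        elif h < i:
            outermost = all(heads[j] != h for j in range(i + 1, n))
        else:
            outermost = all(heads[j] != h for j in range(i))
        labels.append((head_is_left, outermost, has_left, has_right))
    return labels

# Toy sentence "She reads books": She <- reads (root) -> books
print(bit_labels([1, -1, 1]))
```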


474, Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System
Weizhou Shen; Yingqi Gao; Canbin Huang; Fanqi Wan; Xiaojun Quan; Wei Bi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose the application of maximal marginal likelihood to train a perceptive retriever by utilizing signals from response generation for supervision.


475, Disentangling Transformer Language Models As Superposed Topic Models
Jia Peng Lim; Hady Lauw;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLMs as superposed NTMs by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLMs, potentially mapping individual neurons to multiple coherent topics.


476, MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation
Zexue He; Yu Wang; An Yan; Yao Liu; Eric Chang; Amilcare Gentili; Julian McAuley; Chun-Nan Hsu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present MedEval, a multi-level, multi-task, and multi-domain medical benchmark to facilitate the development of language models for healthcare.


477, Polar Ducks and Where to Find Them: Enhancing Entity Linking with Duck Typing and Polar Box Embeddings
Mattia Atzeni; Mikhail Plekhanov; Frederic Dreyer; Nora Kassner; Simone Merello; Louis Martin; Nicola Cancedda;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Entity linking methods based on dense retrieval are widely adopted in large-scale applications for their efficiency, but they can fall short of generative models, as they are sensitive to the structure of the embedding space. To address this issue, this paper introduces DUCK, an approach to infusing structural information in the space of entity representations, using prior knowledge of entity types.


478, Quantifying The Redundancy Between Prosody and Text
Lukas Wolf; Tiago Pimentel; Evelina Fedorenko; Ryan Cotterell; Alex Warstadt; Ethan Wilcox; Tamar Regev;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We use large language models (LLMs) to estimate how much information is redundant between prosody and the words themselves.


479, A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models
Yi Zhou; Jose Camacho-Collados; Danushka Bollegala;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To study the relationship between model factors and the social biases learned by an MLM, as well as the downstream task performance of the model, we conduct a comprehensive study over 39 pretrained MLMs covering different model sizes, training objectives, tokenization methods, training data domains and languages.


480, Token Prediction As Implicit Classification to Identify LLM-Generated Text
Yutian Chen; Hao Kang; Vivian Zhai; Liangze Li; Rita Singh; Bhiksha Raj;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This paper introduces a novel approach for identifying the possible large language models (LLMs) involved in text generation.
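
The highlight frames detection as reading a classification signal off the model's own token predictions. A hedged sketch of that general recipe, scoring label words from the next-token distribution; the prompt template, label words, and gpt2 checkpoint are placeholders, not the authors' setup.

```python
# Sketch: classification as implicit next-token prediction. The prompt and
# label words are illustrative assumptions, not the paper's configuration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
LABELS = {" human": "human-written", " machine": "machine-generated"}

def classify(text):
    ids = tok(f"Text: {text}\nSource:", return_tensors="pt").input_ids
    with torch.no_grad():
        next_logits = model(ids).logits[0, -1]  # scores for the next token
    scores = {name: next_logits[tok.encode(word)[0]].item()
              for word, name in LABELS.items()}
    return max(scores, key=scores.get)

print(classify("The quick brown fox jumps over the lazy dog."))
```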


481, Assessing Step-by-Step Reasoning Against Lexical Negation: A Case Study on Syllogism
Mengyu Ye; Tatsuki Kuribayashi; Jun Suzuki; Goro Kobayashi; Hiroaki Funayama;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we inspect the step-by-step reasoning ability of LLMs with a focus on negation, which is a core linguistic phenomenon that is difficult to process.


482, Efficient Transformer Knowledge Distillation: A Performance Review
Nathan Brown; Ashton Williamson; Tahj Anderson; Logan Lawrence;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we provide an evaluation of model compression via knowledge distillation on efficient attention transformers.


483, Text2Topic: Multi-Label Text Classification System for Efficient Topic Detection in User Generated Content with Zero-Shot Capabilities
Fengjun Wang; Moran Beladev; Ofri Kleinfeld; Elina Frayerman; Tal Shachar; Eran Fainman; Karen Lastmann Assaraf; Sarai Mizrachi; Benjamin Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose Text to Topic (Text2Topic), which achieves high multi-label classification performance by employing a Bi-Encoder Transformer architecture that combines text and topic embeddings via concatenation, subtraction, and multiplication.
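
A minimal sketch of the stated embedding combination, assuming precomputed sentence-level text and topic embeddings; the linear scoring head and the dimensionality are illustrative assumptions.

```python
# Sketch of combining bi-encoder outputs by concatenation, subtraction, and
# multiplication before a relevance classifier (head design is assumed).
import torch
import torch.nn as nn

class TextTopicScorer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # [text; topic; text - topic; text * topic] -> one relevance logit
        self.classifier = nn.Linear(4 * dim, 1)

    def forward(self, text_emb, topic_emb):
        feats = torch.cat(
            [text_emb, topic_emb, text_emb - topic_emb, text_emb * topic_emb],
            dim=-1,
        )
        return self.classifier(feats)

scorer = TextTopicScorer(dim=384)
text, topic = torch.randn(8, 384), torch.randn(8, 384)
print(scorer(text, topic).shape)  # torch.Size([8, 1])
```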


484, IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions
Zhebin Zhang; Xinyu Zhang; Yuanhang Ren; Saijiang Shi; Meng Han; Yongkang Wu; Ruofei Lai; Zhao Cao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning.


485, Introducing Rhetorical Parallelism Detection: A New Task with Datasets, Metrics, and Baselines
Stephen Bothwell; Justin DeBenedetto; Theresa Crnkovich; Hildegund Muller; David Chiang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Despite the ubiquity of parallelism, the field of natural language processing has seldom investigated it, missing a chance to better understand the nature of the structure, meaning, and intent that humans convey. To address this, we introduce the task of rhetorical parallelism detection.


486, Efficient Algorithms for Recognizing Weighted Tree-Adjoining Languages
Alexandra Butoi; Tim Vieira; Ryan Cotterell; David Chiang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: These four formalisms are equivalent to tree-adjoining grammars (TAG), linear indexed grammars (LIG), pushdown-adjoining automata (PAA), and embedded pushdown automata (EPDA). We define semiring-weighted versions of the above two-level formalisms, and we design new algorithms for computing their stringsums (the weight of all derivations of a string) and allsums (the weight of all derivations).


487, Gatekeeper to Save COGS and Improve Efficiency of Text Prediction
Nidhi Tiwari; Sneha Kola; Milos Milunovic; Si-qing Chen; Marjan Slavkovski;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We propose a model gatekeeper (GK) that stops, at the client-application level, LLM calls that would result in incorrect predictions.


488, Revisiting Sparse Retrieval for Few-shot Entity Linking
Yulin Chen; Zhenran Xu; Baotian Hu; Min Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: For training the extractor, we propose a distant supervision method to automatically generate training data based on overlapping tokens between mention contexts and entity descriptions.
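
A hedged sketch of what overlap-based distant supervision could look like: context tokens that also occur in the entity description receive weak positive labels. Tokenization, stopword filtering, and the binary label scheme are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: weak keyword labels from token overlap between a mention context
# and an entity description (preprocessing choices are assumptions).

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is"}

def weak_labels(mention_context, entity_description):
    desc_tokens = {t.lower() for t in entity_description.split()} - STOPWORDS
    # 1 = token overlaps with the entity description, 0 = otherwise
    return [(t, int(t.lower() in desc_tokens)) for t in mention_context.split()]

ctx = "The striker joined Barcelona after the 2021 season"
desc = "Barcelona is a professional football club based in Spain"
print(weak_labels(ctx, desc))
```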


489, QUDeval: The Evaluation of Questions Under Discussion Discourse Parsing
Yating Wu; Ritika Mangla; Greg Durrett; Junyi Jessy Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This work introduces the first framework for the automatic evaluation of QUD parsing, instantiating the theoretical constraints of QUD in a concrete protocol.


490, Elaborative Simplification As Implicit Questions Under Discussion
Yating Wu; William Sheffield; Kyle Mahowald; Junyi Jessy Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This view fails to account for elaborative simplification, where new information is added into the simplified text. This paper proposes to view elaborative simplification through the lens of the Question Under Discussion (QUD) framework, providing a robust way to investigate what writers elaborate upon, how they elaborate, and how elaborations fit into the discourse context by viewing elaborations as explicit answers to implicit questions.


491, Taxonomy Expansion for Named Entity Recognition
Karthikeyan K; Yogarshi Vyas; Jie Ma; Giovanni Paolini; Neha John; Shuai Wang; Yassine Benajiba; Vittorio Castelli; Dan Roth; Miguel Ballesteros;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, this is an extremely laborious task. To remedy this, we propose a novel approach called Partial Label Model (PLM) that uses only partially annotated datasets.


492, Continual Named Entity Recognition Without Catastrophic Forgetting
Duzhen Zhang; Wei Cong; Jiahua Dong; Yahan Yu; Xiuyi Chen; Yonggang Zhang; Zhen Fang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce a pooled feature distillation loss that skillfully navigates the trade-off between retaining knowledge of old entity types and acquiring new ones, thereby more effectively mitigating the problem of catastrophic forgetting.


493, Mirror: A Universal Framework for Various Information Extraction Tasks
Tong Zhu; Junfei Ren; Zijian Yu; Mengsong Wu; Guoliang Zhang; Xiaoye Qu; Wenliang Chen; Zhefeng Wang; Baoxing Huai; Min Zhang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To this end, we reorganize IE problems into unified multi-slot tuples and propose a universal framework for various IE tasks, namely Mirror.


494, Text Rendering Strategies for Pixel Language Models
Jonas Lotz; Elizabeth Salesky; Phillip Rust; Desmond Elliott;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we investigate four approaches to rendering text in the PIXEL model (Rust et al., 2023), and find that simple character bigram rendering brings improved performance on sentence-level tasks without compromising performance on token-level or multilingual tasks.


495, CQE: A Comprehensive Quantity Extractor
Satya Almasian; Vivian Kazakova; Philipp Göldner; Michael Gertz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Interestingly, compared to other information extraction approaches, only a few works describe methods for properly extracting and representing quantities in text. In this paper, we present such a comprehensive quantity extraction framework for text data.


496, TLM: Token-Level Masking for Transformers
Yangjun Wu; Kebin Fang; Dongxiang Zhang; Han Wang; Hao Zhang; Gang Chen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a new regularization scheme that operates at the token level rather than the structure level to reduce overfitting.
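
A generic sketch of token-level masking as a regularizer, assuming the common (batch, 1, 1, seq_len) self-attention mask layout; the masking probability and the choice to always keep the first position are illustrative, not the paper's exact scheme.

```python
# Sketch: randomly hide a fraction of token positions from self-attention
# during training, as a token-level (rather than structure-level) regularizer.
import torch

def token_level_attention_mask(batch, seq_len, mask_prob=0.1):
    # True = position may be attended to, False = dropped for this step
    keep = torch.rand(batch, seq_len) >= mask_prob
    keep[:, 0] = True  # never mask everything: keep the first token
    # broadcastable to (batch, heads, query_len, key_len)
    return keep[:, None, None, :]

mask = token_level_attention_mask(batch=2, seq_len=8)
print(mask.shape, mask.dtype)
```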


497, Prompting with Pseudo-Code Instructions
Mayank Mishra; Prince Kumar; Riyaz Bhat; Rudra Murthy; Danish Contractor; Srikanth Tamilselvam;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we explore if prompting via pseudo-code instructions helps improve the performance of pre-trained language models.


498, Multilingual Simplification of Medical Texts
Sebastian Joseph; Kathryn Kazanas; Keziah Reina; Vishnesh Ramanathan; Wei Xu; Byron Wallace; Junyi Jessy Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce MultiCochrane, the first sentence-aligned multilingual text simplification dataset for the medical domain in four languages: English, Spanish, French, and Farsi.


499, Towards Noise-Tolerant Speech-Referring Video Object Segmentation: Bridging Speech and Text
Xiang Li; Jinglu Wang; Xiaohao Xu; Muqiao Yang; Fan Yang; Yizhou Zhao; Rita Singh; Bhiksha Raj;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this study, we investigate the prominent HCI task, Referring Video Object Segmentation (R-VOS), which aims to segment and track objects using linguistic references.


500, Reduce Human Labor On Evaluating Conversational Information Retrieval System: A Human-Machine Collaboration Approach
Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: It is imperative to invest significant effort into researching more labor-effective methods for evaluating CIR systems. To address this challenge, we take the first step to involve active testing in CIR evaluation and propose a novel method, called HomCoE.


501, Towards Effective Automatic Debt Collection with Persona Awareness
Tong Zhang; Junhong Liu; Chen Huang; Jia Liu; Hongru Liang; Zujie Wen; Wenqiang Lei;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we take the first step towards comprehensively investigating the significance of debtor personas and present a successful commercial practice on automatic debt collection agents.


502, VLIS: Unimodal Language Models Guide Multimodal Language Generation
Jiwan Chung; Youngjae Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, existing vision-language models face challenges in tasks that require complex linguistic understanding. To address this issue, we introduce Visual-Language models as Importance Sampling weights (VLIS), a novel framework that combines the visual conditioning capability of vision-language models with the language understanding of unimodal text-only language models without further training.
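
One plausible instantiation of the importance-sampling idea (hedged; the paper defines the exact weights): keep the text-only LM's next-token distribution and re-weight it by how much the image shifts the VLM's token likelihoods. The fusion coefficient alpha and the log-ratio form are assumptions.

```python
# Sketch: fuse a text-only LM with a VLM at the next-token level, using the
# VLM's image-conditioned vs. image-free log-probs as importance weights.
import torch

def fuse_next_token_logprobs(logp_text_lm, logp_vlm_with_image,
                             logp_vlm_no_image, alpha=1.0):
    image_evidence = logp_vlm_with_image - logp_vlm_no_image  # pointwise weight
    return torch.log_softmax(logp_text_lm + alpha * image_evidence, dim=-1)

vocab = 32000
lp_text = torch.log_softmax(torch.randn(vocab), dim=-1)
lp_cond = torch.log_softmax(torch.randn(vocab), dim=-1)
lp_uncond = torch.log_softmax(torch.randn(vocab), dim=-1)
print(fuse_next_token_logprobs(lp_text, lp_cond, lp_uncond).shape)
```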


503, Reading Books Is Great, But Not If You Are Driving! Visually Grounded Reasoning About Defeasible Commonsense Norms
Seungju Han; Junhyeok Kim; Jack Hessel; Liwei Jiang; Jiwan Chung; Yejin Son; Yejin Choi; Youngjae Yu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We construct a new multimodal benchmark for studying commonsense norms: NormLens.


504, ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness
Jan Cegin; Jakub Simko; Peter Brusilovsky;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: For some of these tasks, models like ChatGPT can potentially substitute human workers. In this study, we investigate whether this is the case for the task of paraphrase generation for intent classification.


505, Preserving Privacy Through Dememorization: An Unlearning Technique For Mitigating Memorization Risks In Language Models
Aly Kassem; Omar Mahmoud; Sherif Saad;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, these methods have limitations regarding the number of protected samples, limited privacy types, and potentially lower-quality generative models. To tackle this challenge more effectively, we propose "DeMem," a novel unlearning approach that utilizes an efficient reinforcement learning feedback loop via proximal policy optimization.


506, CLEME: Debiasing Multi-reference Evaluation for Grammatical Error Correction
Jingheng Ye; Yinghui Li; Qingyu Zhou; Yangning Li; Shirong Ma; Hai-Tao Zheng; Ying Shen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, mainstream evaluation metrics, i.e., reference-based metrics, introduce bias into the multi-reference evaluation by extracting edits without considering the presence of multiple references. To overcome this issue, we propose Chunk-LE Multi-reference Evaluation (CLEME), designed to evaluate GEC systems in the multi-reference evaluation setting.


507, MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
Dominik Macko; Robert Moro; Adaku Uchendu; Jason Lucas; Michiharu Yamashita; Matúš Pikuliak; Ivan Srba; Thai Le; Dongwon Lee; Jakub Simko; Maria Bielikova;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: This is also reflected in the available benchmarks, which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE, a novel benchmarking dataset for multilingual machine-generated text detection comprising 74,081 authentic and machine-generated texts in 11 languages (ar, ca, cs, de, en, es, nl, pt, ru, uk, and zh) generated by 8 multilingual LLMs.


508, Multi-Source Multi-Type Knowledge Exploration and Exploitation for Dialogue Generation
Xuanfan Ni; Hongliang Dai; Zhaochun Ren; Piji Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To harness the knowledge storage of LLMs, we propose a framework named KnowEE that explores multi-source multi-type knowledge from LLMs by leveraging diverse datasets and then exploits the obtained knowledge for response generation.


509, Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents
Jannis Vamvas; Rico Sennrich;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We formulate recognizing semantic differences (RSD) as a token-level regression task and study three unsupervised approaches that rely on a masked language model.


510, BRAINTEASER: Lateral Thinking Puzzles for Large Language Models
Yifan Jiang; Filip Ilievski; Kaixin Ma; Zhivar Sourati;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: While such vertical thinking tasks have been relatively popular, lateral thinking puzzles have received little attention. To bridge this gap, we devise BrainTeaser: a multiple-choice Question Answering task designed to test the model's ability to exhibit lateral thinking and defy default commonsense associations.


511, Pre-training Language Models for Comparative Reasoning
Mengxia Yu; Zhihan Zhang; Wenhao Yu; Meng Jiang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we propose a novel framework to pre-train language models for enhancing their abilities of comparative reasoning over texts.


512, Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation
Jian Wang; Yi Cheng; Dongding Lin; Chak Leong; Wenjie Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, by formulating a <dialogue act, topic> pair as the conversation target, we explore a novel problem of personalized target-oriented dialogue by considering personalization during the target accomplishment process.


513, VIBE: Topic-Driven Temporal Adaptation for Twitter Classification
Yuji Zhang; Jing Li; Wenjie Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Most prior work focused on continued pretraining or knowledge updating, which may compromise their performance on noisy social media data. To tackle this issue, we reflect feature change via modeling latent topic evolution and propose a novel model, VIBE: Variational Information Bottleneck for Evolutions.


514, Self-Detoxifying Language Models Via Toxification Reversal
Chak Leong; Yi Cheng; Jiashuo Wang; Jian Wang; Wenjie Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we propose a more lightweight approach that enables the PLM itself to achieve "self-detoxification".


515, Establishing Trustworthiness: Rethinking Tasks and Model Evaluation
Robert Litschko; Max Müller-Eberstein; Rob van der Goot; Leon Weber-Genzel; Barbara Plank;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: At the same time, LLMs are being deployed in more real-world scenarios, including previously unforeseen zero-shot setups, increasing the need for trustworthy and reliable systems. Therefore, we argue that it is time to rethink what constitutes tasks and model evaluation in NLP, and pursue a more holistic view on language, placing trustworthiness at the center.


516, ACTOR: Active Learning with Annotator-specific Classification Heads to Embrace Human Label Variation
Xinpeng Wang; Barbara Plank;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We show that in the active learning setting, a multi-head model performs significantly better than a single-head model in terms of uncertainty estimation.


517, LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
Zhiqiang Hu; Lei Wang; Yihuai Lan; Wanyu Xu; Ee-Peng Lim; Lidong Bing; Xing Xu; Soujanya Poria; Roy Lee;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Among the various fine-tuning methods, adapter-based parameter-efficient fine-tuning (PEFT) is undoubtedly one of the most attractive topics, as it requires fine-tuning only a few external parameters instead of the entire LLM while achieving comparable or even better performance. To enable further research on PEFT methods for LLMs, this paper presents LLM-Adapters, an easy-to-use framework that integrates various adapters into LLMs and can execute adapter-based PEFT methods for different tasks.


518, CoSyn: Detecting Implicit Hate Speech in Online Conversations Using A Context Synergized Hyperbolic Network
Sreyan Ghosh; Manan Suri; Purva Chiniya; Utkarsh Tyagi; Sonal Kumar; Dinesh Manocha;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we present CoSyn, a context synergized neural network that explicitly incorporates user- and conversational-context for detecting implicit hate speech in online conversations.


519, Improving Dialogue Discourse Parsing Via Reply-to Structures of Addressee Recognition
Yaxin Fan; Feng Jiang; Peifeng Li; Fang Kong; Qiaoming Zhu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To alleviate data sparsity, previous studies have adopted multitasking approaches to jointly learn dialogue discourse parsing with related tasks (e.g., reading comprehension) that require additional human annotation, thus limiting their generality. In this paper, we propose a multitasking framework that integrates dialogue discourse parsing with a neighboring task, addressee recognition.


520, DALE: Generative Data Augmentation for Low-Resource Legal NLP
Sreyan Ghosh; Chandra Kiran Reddy Evuru; Sonal Kumar; S Ramaneswaran; S Sakshi; Utkarsh Tyagi; Dinesh Manocha;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We present DALE, a novel and effective generative Data Augmentation framework for low-resource LEgal NLP.


521, APoLLo : Unified Adapter and Prompt Learning for Vision Language Models
Sanjoy Chowdhury; Sayan Nag; Dinesh Manocha;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present APoLLo, a unified multi-modal approach that combines Adapter and Prompt learning for Vision-Language models.


522, Video-Helpful Multimodal Machine Translation
Yihang Li; Shuichiro Shimizu; Chenhui Chu; Sadao Kurohashi; Wei Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: We introduce EVA (Extensive training set and Video-helpful evaluation set for Ambiguous subtitles translation), an MMT dataset containing 852k Japanese-English parallel subtitle pairs, 520k Chinese-English parallel subtitle pairs, and corresponding video clips collected from movies and TV episodes.


523, From Multilingual Complexity to Emotional Clarity: Leveraging Commonsense to Unveil Emotions in Code-Mixed Dialogues
Shivani Kumar; Ramaneswaran S; Md Akhtar; Tanmoy Chakraborty;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Recognizing that emotional intelligence encompasses a comprehension of worldly knowledge, we propose an innovative approach that integrates commonsense information with dialogue context to facilitate a deeper understanding of emotions.


524, Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization
Pengzhi Gao; Liwen Zhang; Zhongjun He; Hua Wu; Haifeng Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we introduce MuSR: a one-for-all Multilingual Sentence Representation model that supports 223 languages.
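
A hedged sketch of one form cross-lingual consistency regularization can take: add a term that pulls the representations of parallel sentences together on top of the usual training loss. The cosine form and the weighting coefficient are illustrative assumptions, not necessarily the paper's regularizer.

```python
# Sketch: consistency term over parallel sentence pairs (form is assumed).
import torch
import torch.nn.functional as F

def consistency_loss(emb_src, emb_tgt, task_loss, lam=0.1):
    # 1 - cosine similarity, averaged over the batch of parallel pairs
    cos = F.cosine_similarity(emb_src, emb_tgt, dim=-1)
    return task_loss + lam * (1.0 - cos).mean()

src, tgt = torch.randn(16, 512), torch.randn(16, 512)
print(consistency_loss(src, tgt, task_loss=torch.tensor(2.3)))
```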


525, Understanding Compositional Data Augmentation in Typologically Diverse Morphological Inflection
Farhan Samir; Miikka Silfverberg;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we aim to shed light on the theoretical aspects of the data augmentation strategy StemCorrupt, a method that generates synthetic examples by randomly substituting stem characters in existing gold standard training examples.


526, What Else Do I Need to Know? The Effect of Background Information on Users' Reliance on QA Systems
Navita Goyal; Eleftheria Briakou; Amanda Liu; Connor Baumler; Claire Bonial; Jeffrey Micher; Clare Voss; Marine Carpuat; Hal Daumé III;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we study how users interact with QA systems in the absence of sufficient information to assess their predictions.


527, Prompting Is Not A Substitute for Probability Measurements in Large Language Models
Jennifer Hu; Roger Levy;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this study, we compare metalinguistic prompting and direct probability measurements as ways of measuring models' linguistic knowledge.
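
A minimal sketch of the direct-measurement side of this comparison with an off-the-shelf causal LM (gpt2 is only a placeholder): sum the log-probabilities the model assigns to the continuation's tokens, rather than asking the model a metalinguistic question about them.

```python
# Sketch: direct probability measurement with a causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def continuation_logprob(prefix, continuation):
    # continuation should start with a space so BPE token boundaries line up
    prefix_len = tok(prefix, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prefix + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = torch.log_softmax(model(full_ids).logits, dim=-1)
    total = 0.0
    for pos in range(prefix_len, full_ids.shape[1]):
        # logits at position pos-1 predict the token at position pos
        total += logprobs[0, pos - 1, full_ids[0, pos]].item()
    return total

print(continuation_logprob("The keys to the cabinet", " are on the table"))
```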


528, End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga-Gomez; Zhaocheng Huang; Xing Niu; Rohit Paturi; Sundararajan Srinivasan; Prashant Mathur; Brian Thompson; Marcello Federico;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this paper, we tackle single-channel multi-speaker conversational ST with an end-to-end and multi-task training model, named Speaker-Turn Aware Conversational Speech Translation, that combines automatic speech recognition, speech translation and speaker turn detection using special tokens in a serialized labeling format.


529, Hidding The Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection
Xinlin Peng; Ying Zhou; Ben He; Le Sun; Yingfei Sun;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Although several detectors have been proposed to address these concerns, their effectiveness against adversarial perturbations, specifically in the context of student essay writing, remains largely unexplored. This paper aims to bridge this gap by constructing AIG-ASAP, an AI-generated student essay dataset, employing a range of text perturbation methods that are expected to generate high-quality essays while evading detection.


530, CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation
Philipp Borchert; Jochen De Weerdt; Kristof Coussement; Arno De Caigny; Marie-Francine Moens;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We introduce CORE, a dataset for few-shot relation classification (RC) focused on company relations and business entities.


531, What to Read in A Contract? Party-Specific Summarization of Legal Obligations, Entitlements, and Prohibitions
Abhilasha Sancheti; Aparna Garimella; Balaji Srinivasan; Rachel Rudinger;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we propose a new task of party-specific extractive summarization for legal contracts to facilitate faster reviewing and improved comprehension of rights and duties.


532, Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling
Hai Yu; Chong Deng; Qinglin Zhang; Jiaqing Liu; Qian Chen; Wen Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Therefore, this paper enhances supervised models' ability to capture coherence from both logical-structure and semantic-similarity perspectives, proposing Topic-aware Sentence Structure Prediction (TSSP) and Contrastive Semantic Similarity Learning (CSSL) to further improve topic segmentation performance.


533, LLM4Vis: Explainable Visualization Recommendation Using ChatGPT
Lei Wang; Songheng Zhang; Yun Wang; Ee-Peng Lim; Yong Wang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: To address this research gap, we propose LLM4Vis, a novel ChatGPT-based prompting approach to perform visualization recommendation and return human-like explanations using very few demonstration examples.


534, Enhancing Computation Efficiency in Large Language Models Through Weight and Activation Quantization
Janghwan Lee; Minsoo Kim; Seungcheol Baek; Seok Hwang; Wonyong Sung; Jungwook Choi;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present two innovative techniques: activation-quantization-aware scaling (AQAS) and sequence-length-aware calibration (SLAC) to enhance PTQ by considering the combined effects on weights and activations and aligning calibration sequence lengths to target tasks.


535, Effects of Sub-word Segmentation on Performance of Transformer Language Models
Jue Hou; Anisia Katinskaia; Anh-Duc Vu; Roman Yangarber;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation, Morfessor and StateMorph.


536, Understanding The Effect of Model Compression on Social Bias in Large Language Models
Gustavo Gonçalves; Emma Strubell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We perform a carefully controlled study of the impact of model compression via quantization and knowledge distillation on measures of social bias in LLMs.


537, Unraveling Feature Extraction Mechanisms in Neural Networks
Xiaobing Sun; Jiaxi Li; Wei Lu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose a theoretical approach based on Neural Tangent Kernels (NTKs) to investigate such mechanisms.


538, To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
Sireesh Gururaja; Amanda Bertsch; Clara Na; David Widder; Emma Strubell;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we seek to understand how to shape our future by better understanding our past.


539, MILDSum: A Novel Benchmark Dataset for Multilingual Summarization of Indian Legal Case Judgments
Debtanu Datta; Shubham Soni; Rajdeep Mukherjee; Saptarshi Ghosh;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: While prior research primarily focuses on summarizing legal case judgments in their source languages, this study presents a pioneering effort toward cross-lingual summarization of English legal documents into Hindi, the most frequently spoken Indian language.


540, Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Zeyu Liu; Tim Dettmers; Xi Lin; Veselin Stoyanov; Xian Li;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we analyze two major design choices of S-FFN: the memory block (a.k.a. expert) size and the memory block selection method, under a general conceptual framework of sparse neural memory.


541, Learning The Visualness of Text Using Large Vision-Language Models
Gaurav Verma; Ryan Rossi; Christopher Tensmeyer; Jiuxiang Gu; Ani Nenkova;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To this end, we curate a dataset of 3,620 English sentences and their visualness scores provided by multiple human annotators. We also propose a fine-tuning strategy that adapts large vision-language models like CLIP by modifying the model's contrastive learning objective to map text identified as non-visual to a common NULL image while matching visual text to their corresponding images in the document.


542, Solving Hard Analogy Questions with Relation Embedding Chains
Nitesh Kumar; Steven Schockaert;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: A common strategy is to rely on knowledge graphs (KGs) such as ConceptNet, and to model the relation between two concepts as a set of paths. However, KGs are limited to a fixed set of relation types, and they are incomplete and often noisy. Another strategy is to distill relation embeddings from a fine-tuned language model. However, this is less suitable for words that are only indirectly related and it does not readily allow us to incorporate structured domain knowledge. In this paper, we aim to combine the best of both worlds.


543, Revisiting Block-based Quantisation: What Is Important for Sub-8-bit LLM Inference?
Cheng Zhang; Jianyi Cheng; Ilia Shumailov; George Constantinides; Yiren Zhao;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we explore the statistical and learning properties of the LLM layer and attribute the bottleneck of LLM quantisation to numerical scaling offsets.
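
For background, a generic block-based quantisation sketch (not the paper's exact scheme): sharing one scale per fixed-size block confines the distortion from an outlier to its own block, which is one concrete way numerical scaling offsets matter at sub-8-bit precision.

```python
# Sketch: per-block symmetric quantisation with one shared scale per block.
import torch

def block_quantise(w, block_size=64, bits=8):
    flat = w.flatten()
    pad = (-flat.numel()) % block_size         # pad to a whole number of blocks
    flat = torch.cat([flat, flat.new_zeros(pad)])
    blocks = flat.view(-1, block_size)
    qmax = 2 ** (bits - 1) - 1
    scales = blocks.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(blocks / scales), -qmax - 1, qmax)
    return q.to(torch.int8), scales            # integer codes + per-block scales

w = torch.randn(300)
q, s = block_quantise(w)
print(q.shape, s.shape)  # torch.Size([5, 64]) torch.Size([5, 1])
```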


544, M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis
Fei Zhao; Chunhui Li; Zhen Wu; Yawen Ouyang; Jianbing Zhang; Xinyu Dai;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: Although some work attempts to filter low-quality noise images by setting thresholds, relying on thresholds will inevitably filter out a lot of useful image information. Therefore, in this work, we focus on whether the negative impact of noisy images can be reduced without modifying the data.


545, AdapterDistillation: Non-Destructive Task Composition with Knowledge Distillation
Junjie Wang; Yicheng Chen; Wangshu Zhang; Sen Hu; Teng Xu; Jing Zheng;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: However, adding an extra fusion layer to implement knowledge composition not only increases the inference time but also is non-scalable for some applications. To avoid these issues, we propose a two-stage knowledge distillation algorithm called AdapterDistillation.


546, Beyond Shared Vocabulary: Increasing Representational Word Similarities Across Languages for Multilingual Machine Translation
Di Wu; Christof Monz;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: However, when word overlap is small, e.g., when languages use different writing systems, transfer is inhibited. In this paper, we propose a re-parameterized method for building embeddings to alleviate this problem.


547, Bootstrapping Small & High Performance Language Models with Unmasking-Removal Training Policy
Yahan Yang; Elior Sulem; Insup Lee; Dan Roth;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: BabyBERTa, a language model trained on small-scale child-directed speech while none of the words are unmasked during training, has been shown to achieve a level of grammaticality comparable to that of RoBERTa-base, which is trained on 6,000 times more words and 15 times more parameters. Relying on this promising result, we explore in this paper the performance of BabyBERTa-based models in downstream tasks, focusing on Semantic Role Labeling (SRL) and two Extractive Question Answering tasks, with the aim of building more efficient systems that rely on less data and smaller models.


548, TOD-Flow: Modeling The Structure of Task-Oriented Dialogues
Sungryull Sohn; Yiwei Lyu; Anthony Liu; Lajanugen Logeswaran; Dong-Ki Kim; Dongsub Shim; Honglak Lee;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: While recent advances have capitalized on pre-trained language models (PLMs), they exhibit limitations regarding transparency and controllability. To address these challenges, we propose a novel approach focusing on inferring the TOD-flow graph from dialogue data annotated with dialog acts, uncovering the underlying task structure in the form of a graph.


549, Are All Steps Equally Important? Benchmarking Essentiality Detection in Event Processes
Haoyu Wang; Hongming Zhang; Yueguan Wang; Yuqian Deng; Muhao Chen; Dan Roth;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: A critical but overlooked challenge in understanding an event process lies in the fact that the step events are not equally important to the central goal. In this paper, we seek to fill this gap by studying how well current models can understand the essentiality of different step events towards a goal event.


550, Adaptive Policy with Wait-k Model for Simultaneous Translation
Libo Zhao; Kai Fan; Wei Luo; Wu Jing; Shushu Wang; Ziqian Zeng; Zhongqiang Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this study, we propose a more flexible approach by decoupling the adaptive policy model from the translation model.
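
For context, a sketch of the fixed wait-k schedule that such adaptive policies generalize: read k source tokens up front, then alternate WRITE and READ until the target is complete. This is a generic rendition of wait-k, not the paper's decoupled adaptive policy.

```python
# Sketch: the classic wait-k read/write schedule for simultaneous translation.

def wait_k_schedule(src_len, tgt_len, k=3):
    actions, read, written = [], 0, 0
    while written < tgt_len:
        if read < min(k + written, src_len):
            actions.append("READ")    # consume one more source token
            read += 1
        else:
            actions.append("WRITE")   # emit one target token
            written += 1
    return actions

# READ x3, then alternate WRITE/READ until the source runs out
print(wait_k_schedule(src_len=6, tgt_len=5, k=3))
```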


551, Training Simultaneous Speech Translation with Robust and Random Wait-k-Tokens Strategy
Linlin Zhang; Kai Fan; Jiajun Bu; Zhongqiang Huang;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: Subsequently, to optimize the SimulST task, we propose a robust and random wait-k-tokens strategy.


552, Comparing Biases and The Impact of Multilingual Training Across Multiple Languages
Sharon Levy; Neha John; Ling Liu; Yogarshi Vyas; Jie Ma; Yoshinari Fujinuma; Miguel Ballesteros; Vittorio Castelli; Dan Roth;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: We present a bias analysis across Italian, Chinese, English, Hebrew, and Spanish on the downstream sentiment analysis task to observe whether specific demographics are viewed more positively.


553, CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Low Resource With Contrastive Learning
Xiaoming Liu; Zhaohan Zhang; Yichen Wang; Hang Pu; Yu Lan; Chao Shen;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this paper, we present a coherence-based contrastive learning model named CoCo to detect possible MGT under low-resource scenarios.


554, A Picture Is Worth A Thousand Words: Language Models Plan from Pixels
Anthony Liu; Lajanugen Logeswaran; Sungryull Sohn; Honglak Lee;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: In this work, we explore the use of pre-trained language models (PLMs) to reason about plan sequences from text instructions in embodied visual environments.


555, Graph Meets LLM: A Novel Approach to Collaborative Filtering for Robust Conversational Understanding
Zheng Chen; Ziyan Jiang; Fan Yang; Eunah Cho; Xing Fan; Xiaojiang Huang; Yanbin Lu; Aram Galstyan;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: This paper introduces our Collaborative Query Rewriting approach, which utilizes underlying topological information to assist in rewriting defective queries arising from unseen user interactions.


556, Failures Pave The Way: Enhancing Large Language Models Through Tuning-free Rule Accumulation
Zeyuan Yang; Peng Li; Yang Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   Related Code   View
Highlight: In this work, we propose our Tuning-free Rule Accumulation (TRAN) framework, which guides LLMs in improving their performance by learning from previous mistakes.


557, Learn and Consolidate: Continual Adaptation for Zero-Shot and Multilingual Neural Machine Translation
Kaiyu Huang; Peng Li; Junpeng Liu; Maosong Sun; Yang Liu;
Related Papers   Related Patents   Related Grants   Related Venues   Related Experts   View
Highlight: To this end, we propose a two-stage approach that encourages original models to acquire language-agnostic multilingual representations from new data, and preserves the model architecture without introducing parameters.


558, MeaeQ: Mount Model Extraction Attacks with Efficient Queries
Chengwei Dai; Minxuan Lv; Kun Li; Wei Zhou;
Related Papers   Related Patents