Paper Digest: Recent Papers on Transformer

July 1, 2020April 19, 2024 admin

Paper Digest Team extracted all recent Transformer (NLP) related papers on our radar, and generated highlight sentences for them. The results are then sorted by relevance & date. In addition to this ‘static’ page, we also provide a real-time version of this article, which has more coverage and is updated in real time to include the most recent updates on this topic.

Based in New York, Paper Digest is dedicated to producing high-quality text analysis results that people can acturally use on a daily basis. Since 2018, we have been serving users across the world with a number of exclusive services to track, search, review and rewrite scientific literature.

You are welcome to follow us on Twitter and Linkedin to get updated with new conference digests.

Paper Digest Team
New York City, New York, 10017
team@paperdigest.org

TABLE 1: Paper Digest: Recent Papers on Transformer

	Paper	Author(s)	Source	Date
1	Large Language Models in Targeted Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate the use of decoder-based generative transformers for extracting sentiment towards the named entities in Russian news articles.	Nicolay Rusnachenko; Anton Golubev; Natalia Loukachevitch;	arxiv-cs.CL	2024-04-18
2	EmrQA-msquad: A Medical Dataset Structured with The SQuAD V2.0 Framework, Enriched with EmrQA Medical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key solution involves integrating specialized medical datasets and creating dedicated datasets. This strategic approach enhances the accuracy of QAS, contributing to advancements in clinical decision-making and medical research.	Jimenez Eladio; Hao Wu;	arxiv-cs.CL	2024-04-18
3	X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer As Meta Multi-Agent Reinforcement Learner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a persisting issue remains: how to obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model for cross-city meta multi-agent traffic signal control, named as X-Light: We input the full Markov Decision Process trajectories, and the Lower Transformer aggregates the states, actions, rewards among the target intersection and its neighbors within a city, and the Upper Transformer learns the general decision trajectories across different cities.	HAOYUAN JIANG et. al.	arxiv-cs.AI	2024-04-18
4	MIDGET: Music Conditioned 3D Dance Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a MusIc conditioned 3D Dance GEneraTion model, named MIDGET based on Dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) model and Motion Generative Pre-Training (GPT) model to generate vibrant and highquality dances that match the music rhythm.	Jinwu Wang; Wei Mao; Miaomiao Liu;	arxiv-cs.SD	2024-04-18
5	Augmenting Emotion Features in Irony Detection with Large Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces a novel method for irony detection, applying Large Language Models (LLMs) with prompt-based learning to facilitate emotion-centric text augmentation.	Yucheng Lin; Yuhan Xia; Yunfei Long;	arxiv-cs.CL	2024-04-18
6	Transformer Tricks: Removing Weights for Skipless Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: He and Hofmann (arXiv:2311.01906) detailed a skipless transformer without the V and P (post-attention projection) linear layers, which reduces the total number of weights. …	Nils Graef;	arxiv-cs.LG	2024-04-18
7	Octopus V3: Technical Report for On-device Sub-billion Multimodal AI Agent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a multimodal model that incorporates the concept of functional token specifically designed for AI agent applications.	Wei Chen; Zhiyuan Li;	arxiv-cs.CL	2024-04-17
8	CAUS: A Dataset for Question Generation Based on Human Cognition Leveraging Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the CAUS (Curious About Uncertain Scene) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties.	Minjung Shin; Donghyun Kim; Jeh-Kwang Ryu;	arxiv-cs.AI	2024-04-17
9	Cross-Problem Learning for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants.	ZHUOYI LIN et. al.	arxiv-cs.AI	2024-04-17
10	In-Context Learning State Vector with Inner and Momentum Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this gap by presenting a comprehensive analysis of these compressed vectors, drawing parallels to the parameters trained with gradient descent, and introduce the concept of state vector.	DONGFANG LI et. al.	arxiv-cs.CL	2024-04-17
11	MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show how to build small models that have GPT-4-level performance but for 400x lower cost.	Liyan Tang; Philippe Laban; Greg Durrett;	arxiv-cs.CL	2024-04-16
12	Search Beyond Queries: Training Smaller Language Models for Web Interactions Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This highlights the need for intelligent web navigation agents capable of formulating queries and navigating web pages according to users’ high-level intents. In response to this need, this work introduces a Grounded Language Agent for Intelligent Web Interactions, called GLAINTEL.	Moghis Fereidouni; A. B. Siddique;	arxiv-cs.CL	2024-04-16
13	CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an attribution-oriented Chain-of-Thought reasoning method to enhance the accuracy of attributions.	Moshe Berchansky; Daniel Fleischer; Moshe Wasserblat; Peter Izsak;	arxiv-cs.CL	2024-04-16
14	Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios.	PEIYUAN ZHI et. al.	arxiv-cs.RO	2024-04-15
15	AIGeN: An Adversarial Approach for Instruction Generation in VLN Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose AIGeN, a novel architecture inspired by Generative Adversarial Networks (GANs) that produces meaningful and well-formed synthetic instructions to improve navigation agents’ performance.	Niyati Rawal; Roberto Bigazzi; Lorenzo Baraldi; Rita Cucchiara;	arxiv-cs.CV	2024-04-15
16	Transformers, Contextualism, and Polysemy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, I argue that we an extract from the way the transformer architecture works a picture of the relationship between context and meaning.	Jumbly Grindrod;	arxiv-cs.CL	2024-04-15
17	Demonstration of DB-GPT: Next Generation Data Interaction System Empowered By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interaction tasks to enhance user experience and accessibility.	SIQIAO XUE et. al.	arxiv-cs.AI	2024-04-15
18	Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore GPT-4V’s capabilities in the insurance domain.	Chenwei Lin; Hanjia Lyu; Jiebo Luo; Xian Xu;	arxiv-cs.CV	2024-04-15
19	Few-shot Name Entity Recognition on StackOverflow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: StackOverflow, with its vast question repository and limited labeled examples, raise an annotation challenge for us. We address this gap by proposing RoBERTa+MAML, a few-shot named entity recognition (NER) method leveraging meta-learning.	Xinwei Chen; Kun Li; Tianyou Song; Jiangjian Guo;	arxiv-cs.CL	2024-04-14
20	TransformerFAM: Feedback Attention Is Working Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs. We propose Feedback Attention Memory (FAM), a novel Transformer architecture that leverages a feedback loop to enable the network to attend to its own latent representations.	Dongseong Hwang; Weiran Wang; Zhuoyuan Huo; Khe Chai Sim; Pedro Moreno Mengibar;	arxiv-cs.LG	2024-04-14
21	Foundational GPT Model for MEG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose two classes of deep learning foundational models that can be trained using forecasting of unlabelled MEG.	Richard Csaky; Mats W. J. van Es; Oiwi Parker Jones; Mark Woolrich;	arxiv-cs.LG	2024-04-14
22	Understanding The Role of Temperature in Diverse Question Generation By GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a preliminary study of the effect of GPT’s temperature parameter on the diversity of GPT4-generated questions.	ARAV AGARWAL et. al.	arxiv-cs.CL	2024-04-14
23	Large Language Models for Mobile GUI Text Input Generation: An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We collected 114 UI pages from 62 open-source Android apps and extracted contextual information from the UI pages to construct prompts for LLMs to generate text inputs.	CHENHUI CUI et. al.	arxiv-cs.SE	2024-04-13
24	Pre-training Small Base LMs with Fewer Tokens Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate Inheritune in a slightly different setting where we train small LMs utilizing larger LMs and their full pre-training dataset.	Sunny Sanyal; Sujay Sanghavi; Alexandros G. Dimakis;	arxiv-cs.CL	2024-04-12
25	Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to prove it, we introduce a new task, Logically Equivalent Code Selection, which necessitates the selection of logically equivalent code from a candidate set, given a query code.	MENGNAN QI et. al.	arxiv-cs.PL	2024-04-12
26	Transfer Learning Study of Motion Transformer-based Trajectory Predictions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Since models that can cover all system setups and design domains at once are not yet foreseeable, model adaptation plays a central role. Therefore, a simulation-based study on transfer learning techniques is conducted on basis of a transformer-based model.	Lars Ullrich; Alex McMaster; Knut Graichen;	arxiv-cs.LG	2024-04-12
27	MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike typical video action recognition, Dynamic Facial Expression Recognition (DFER) does not involve distinct moving targets but relies on localized changes in facial muscles. Addressing this distinctive attribute, we propose a Multi-Scale Spatio-temporal CNN-Transformer network (MSSTNet).	Linhuang Wang; Xin Kang; Fei Ding; Satoshi Nakagawa; Fuji Ren;	arxiv-cs.CV	2024-04-12
28	CreativEval: Evaluating Creativity of LLM-Based Hardware Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this research gap, we present CreativeEval, a framework for evaluating the creativity of LLMs within the context of generating hardware designs.	Matthew DeLorenzo; Vasudev Gohil; Jeyavijayan Rajendran;	arxiv-cs.CL	2024-04-12
29	Constrained C-Test Generation Via Mixed-Integer Programming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a novel method to generate C-Tests; a deviated form of cloze tests (a gap filling exercise) where only the last part of a word is turned into a gap.	Ji-Ung Lee; Marc E. Pfetsch; Iryna Gurevych;	arxiv-cs.CL	2024-04-12
30	Small Models Are (Still) Effective Cross-Domain Argument Extractors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, detailed explorations of these techniques’ ability to actually enable this transfer are lacking. In this work, we provide such a study, exploring zero-shot transfer using both techniques on six major EAE datasets at both the sentence and document levels.	William Gantt; Aaron Steven White;	arxiv-cs.CL	2024-04-12
31	Measuring Geographic Diversity of Foundation Models with A Natural Language–based Geo-guessing Experiment on GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented.	Zilong Liu; Krzysztof Janowicz; Kitty Currier; Meilin Shi;	arxiv-cs.CY	2024-04-11
32	Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore how large language models (LLMs) can be used to generate and evaluate multiple-choice reading comprehension items.	Andreas Säuberli; Simon Clematide;	arxiv-cs.CL	2024-04-11
33	Reflectance Estimation for Proximity Sensing By Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we verify that 1) LLMs such as GPT-3.5 and GPT-4 can estimate an object’s reflectance using only text as input; and 2) VLMs such as CLIP can increase their generalization capabilities in reflectance estimation from images.	Masashi Osada; Gustavo A. Garcia Ricardez; Yosuke Suzuki; Tadahiro Taniguchi;	arxiv-cs.RO	2024-04-11
34	From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3, etc) can do linear and non-linear regression when given in-context examples, without any additional training or gradient updates.	Robert Vacareanu; Vlad-Andrei Negru; Vasile Suciu; Mihai Surdeanu;	arxiv-cs.CL	2024-04-11
35	On Training Data Influence of GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents GPTfluence, a novel approach that leverages a featurized simulation to assess the impact of training examples on the training dynamics of GPT models.	QINGYI LIU et. al.	arxiv-cs.CL	2024-04-11
36	LLM Agents Can Autonomously Exploit One-day Vulnerabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems.	Richard Fang; Rohan Bindu; Akul Gupta; Daniel Kang;	arxiv-cs.CR	2024-04-11
37	Remembering Transformer for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Neural networks encounter the challenge of Catastrophic Forgetting (CF) in continual learning, where new task knowledge interferes with previously learned knowledge. We propose Remembering Transformer, inspired by the brain’s Complementary Learning Systems (CLS), to tackle this issue.	Yuwei Sun; Jun Sakuma; Ryota Kanai;	arxiv-cs.LG	2024-04-11
38	Simpler Becomes Harder: Do LLMs Exhibit A Coherent Behavior on Simplified Corpora? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Text simplification seeks to improve readability while retaining the original content and meaning. Our study investigates whether pre-trained classifiers also maintain such coherence by comparing their predictions on both original and simplified inputs.	Miriam Anschütz; Edoardo Mosca; Georg Groh;	arxiv-cs.CL	2024-04-10
39	From Model-centered to Human-Centered: Revision Distance As A Metric for Text Evaluation in LLMs-based Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Evaluating large language models (LLMs) is fundamental, particularly in the context of practical applications.	YONGQIANG MA et. al.	arxiv-cs.CL	2024-04-10
40	Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose \textbf{FormulaGPT}, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data.	YANJIE LI et. al.	arxiv-cs.LG	2024-04-09
41	InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, this research advances the patch division paradigm by introducing a novel extension: dynamic resolution with automatic patch configuration.	XIAOYI DONG et. al.	arxiv-cs.CV	2024-04-09
42	Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a computational framework for characterizing multimodal long-form summarization and investigate the behavior of Claude 2.0/2.1, GPT-4/3.5, and Command.	Tianyu Cao; Natraj Raman; Danial Dervovic; Chenhao Tan;	arxiv-cs.CL	2024-04-09
43	PetKaz at SemEval-2024 Task 8: Can Linguistics Capture The Specifics of LLM-generated Text? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present our submission to the SemEval-2024 Task 8 Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, focusing on the detection of machine-generated texts (MGTs) in English.	Kseniia Petukhova; Roman Kazakov; Ekaterina Kochmar;	arxiv-cs.CL	2024-04-08
44	OPSD: An Offensive Persian Social Media Dataset and Its Baseline Evaluations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While numerous datasets in the English language exist in this domain, few equivalent resources are available for Persian language. To address this gap, this paper introduces two offensive datasets.	MEHRAN SAFAYANI et. al.	arxiv-cs.CL	2024-04-08
45	Use of A Structured Knowledge Base Enhances Metadata Curation By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper investigates the potential of large language models (LLMs), specifically GPT-4, to improve adherence to metadata standards.	SOWMYA S. SUNDARAM et. al.	arxiv-cs.AI	2024-04-08
46	Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these models demonstrate remarkable performance on general datasets, they can struggle in specialized domains such as medicine, where unique domain-specific terminologies, domain-specific abbreviations, and varying document structures are common. This paper explores strategies for adapting these models to domain-specific requirements, primarily through continuous pre-training on domain-specific data.	AHMAD IDRISSI-YAGHIR et. al.	arxiv-cs.CL	2024-04-08
47	Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to compare the performance of GPT with traditional deep learning models (Long Short-Term Memory (LSTM) and Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT)) in extracting acupoint-related location relations and assess the impact of pretraining and fine-tuning on GPT’s performance.	YIMING LI et. al.	arxiv-cs.CL	2024-04-08
48	PagPassGPT: Pattern Guided Password Guessing Via Generative Pretrained Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Amidst the surge in deep learning-based password guessing models, challenges of generating high-quality passwords and reducing duplicate passwords persist. To address these challenges, we present PagPassGPT, a password guessing model constructed on Generative Pretrained Transformer (GPT).	XINGYU SU et. al.	arxiv-cs.CR	2024-04-07
49	Clinical Trials Protocol Authoring Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report embarks on a mission to revolutionize clinical trial protocol development through the integration of advanced AI technologies.	Morteza Maleki;	arxiv-cs.CE	2024-04-07
50	Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a novel approach, Joint Visual and Text Prompting (VTPrompt), that employs fine-grained visual information to enhance the capability of MLLMs in VQA, especially for object-oriented perception.	SONGTAO JIANG et. al.	arxiv-cs.CL	2024-04-06
51	Scope Ambiguities in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite this, there has been little research into how modern large language models treat them. In this paper, we investigate how different versions of certain autoregressive language models — GPT-2, GPT-3/3.5, Llama 2 and GPT-4 — treat scope ambiguous sentences, and compare this with human judgments.	Gaurav Kamath; Sebastian Schuster; Sowmya Vajjala; Siva Reddy;	arxiv-cs.CL	2024-04-05
52	Outlier-Efficient Hopfield Layers for Large Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathtt{OutEffHop}$) and use it to address the outlier-induced challenge of quantizing gigantic transformer-based models.	JERRY YAO-CHIEH HU et. al.	arxiv-cs.LG	2024-04-04
53	SDPose: Tokenized Pose Estimation Via Circulation-Guide Self-Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Those transformer-based models that require fewer resources are prone to under-fitting due to their smaller scale and thus perform notably worse than their larger counterparts. Given this conundrum, we introduce SDPose, a new self-distillation method for improving the performance of small transformer-based models.	SICHEN CHEN et. al.	arxiv-cs.CV	2024-04-04
54	Evaluating LLMs at Detecting Errors in LLM Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs.	RYO KAMOI et. al.	arxiv-cs.CL	2024-04-04
55	Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs) and revealed the vulnerable safeguards of LLMs. Besides, some methods are not limited to the …	SHUO CHEN et. al.	arxiv-cs.LG	2024-04-04
56	NLP at UC Santa Cruz at SemEval-2024 Task 5: Legal Answer Validation Using Few-Shot Multi-Choice QA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present two approaches to solving the task of legal answer validation, given an introduction to the case, a question and an answer candidate.	Anish Pahilajani; Samyak Rajesh Jain; Devasha Trivedi;	arxiv-cs.CL	2024-04-03
57	Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional approaches approximate the value functions as piecewise linear convex functions by incrementally accumulating subgradient cutting planes from the primal and dual solutions of stagewise subproblems. Recognizing these limitations, we introduce TranSDDP, a novel Transformer-based stagewise decomposition algorithm.	Chanyeong Kim; Jongwoong Park; Hyunglip Bae; Woo Chang Kim;	arxiv-cs.LG	2024-04-03
58	BCAmirs at SemEval-2024 Task 4: Beyond Words: A Multimodal and Multilingual Exploration of Persuasion in Memes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by this, our team engaged in SemEval-2024 Task 4, a hierarchical multi-label classification task designed to identify rhetorical and psychological persuasion techniques embedded within memes. To tackle this problem, we introduced a caption generation step to assess the modality gap and the impact of additional semantic information from images, which improved our result.	Amirhossein Abaskohi; Amirhossein Dabiriaghdam; Lele Wang; Giuseppe Carenini;	arxiv-cs.CL	2024-04-03
59	GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo.	Ali Pesaranghader; Nikhil Verma; Manasa Bharadwaj;	arxiv-cs.CL	2024-04-03
60	Task Agnostic Architecture for Algorithm Induction Via Implicit Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Taking this trend of generalization to the extreme suggests the possibility of a single deep network architecture capable of solving all tasks. This position paper aims to explore developing such a unified architecture and proposes a theoretical framework of how it could be constructed.	Sahil J. Sindhi; Ignas Budvytis;	arxiv-cs.LG	2024-04-03
61	UTeBC-NLP at SemEval-2024 Task 9: Can LLMs Be Lateral Thinkers? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore prompt engineering methods: chain of thoughts (CoT) and direct prompting, enhancing with informative descriptions, and employing contextualizing prompts using a retrieval augmented generation (RAG) pipeline.	Pouya Sadeghi; Amirhossein Abaskohi; Yadollah Yaghoobzadeh;	arxiv-cs.CL	2024-04-03
62	ASTRA: An Action Spotting TRAnsformer for Soccer Videos Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches.	Artur Xarles; Sergio Escalera; Thomas B. Moeslund; Albert Clapés;	arxiv-cs.CV	2024-04-02
63	Toward Informal Language Processing: Knowledge of Slang in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using movie subtitles, we construct a dataset that supports evaluation on a diverse set of tasks pertaining to automatic processing of slang.	Zhewei Sun; Qian Hu; Rahul Gupta; Richard Zemel; Yang Xu;	arxiv-cs.CL	2024-04-02
64	WcDT: World-centric Diffusion Transformer for Traffic Scene Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (a.k.a., diffusion models) and transformers.	Chen Yang; Aaron Xuxiang Tian; Dong Chen; Tianyu Shi; Arsalan Heydarian;	arxiv-cs.CV	2024-04-02
65	Generative AI-Based Text Generation Methods Using Pre-Trained GPT-2 Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through analysis of greedy search, beam search, top-k sampling, top-p sampling, contrastive searching, and locally typical searching, this work has provided valuable insights into the strengths, weaknesses, and potential applications of each method.	ROHIT PANDEY et. al.	arxiv-cs.CL	2024-04-02
66	SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose SGSH–a simple and effective framework to Stimulate GPT-3.5 with Skeleton Heuristics to enhance KBQG.	SHASHA GUO et. al.	arxiv-cs.CL	2024-04-02
67	Release of Pre-Trained Models for The Japanese Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI democratization aims to create a world in which the average person can utilize AI techniques.	KEI SAWADA et. al.	arxiv-cs.CL	2024-04-02
68	Collapse of Self-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In various fields of knowledge creation, including science, new ideas often build on pre-existing information. In this work, we explore this concept within the context of language models.	David Herel; Tomas Mikolov;	arxiv-cs.CL	2024-04-02
69	Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks.	AMIRHOSSEIN ABASKOHI et. al.	arxiv-cs.CL	2024-04-02
70	RAT: Retrieval-Augmented Transformer for Click-Through Rate Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Predicting click-through rates (CTR) is a fundamental task for Web applications, where a key issue is to devise effective models for feature interactions.	YUSHEN LI et. al.	arxiv-cs.IR	2024-04-02
71	METAL: Towards Multilingual Meta-Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose a framework for an end-to-end assessment of LLMs as evaluators in multilingual scenarios.	Rishav Hada; Varun Gumma; Mohamed Ahmed; Kalika Bali; Sunayana Sitaram;	arxiv-cs.CL	2024-04-02
72	BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In existing research, detection of high level plagiarism is still a challenge due to the lack of high quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of research.	Jiarong Xian; Jibao Yuan; Peiwei Zheng; Dexian Chen;	arxiv-cs.CL	2024-04-01
73	Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we explore the potential of zero-shot Large Multimodal Models (LMMs) in the domain of drone perception.	Christian Limberg; Artur Gonçalves; Bastien Rigault; Helmut Prendinger;	arxiv-cs.CV	2024-04-01
74	TWIN-GPT: Digital Twins for Clinical Trials Via Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a large language model-based digital twin creation approach, called TWIN-GPT.	YUE WANG et. al.	arxiv-cs.LG	2024-04-01
75	Unveiling Divergent Inductive Biases of LLMs on Temporal Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the adeptness of Large Language Models (LLMs) in discerning patterns and relationships from data, their inherent comprehension of temporal dynamics remains a formidable challenge. This research meticulously explores these intrinsic challenges within LLMs, with a specific emphasis on evaluating the performance of GPT-3.5 and GPT-4 models in the analysis of temporal data.	Sindhu Kishore; Hangfeng He;	arxiv-cs.CL	2024-04-01
76	What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the capabilities of the transformer architecture with varying depth.	Xingwu Chen; Difan Zou;	arxiv-cs.LG	2024-04-01
77	Large Language Model Evaluation Via Multi AI Agents: Preliminary Results Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite extensive efforts to examine LLMs from various perspectives, there is a noticeable lack of multi-agent AI models specifically designed to evaluate the performance of different LLMs. To address this gap, we introduce a novel multi-agent AI model that aims to assess and compare the performance of various LLMs.	Zeeshan Rasheed; Muhammad Waseem; Kari Systä; Pekka Abrahamsson;	arxiv-cs.SE	2024-04-01
78	LLM-RadJudge: Achieving Radiologist-Level Evaluation for X-Ray Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study proposes a novel evaluation framework using large language models (LLMs) to compare radiology reports for assessment.	Zilong Wang; Xufang Luo; Xinyang Jiang; Dongsheng Li; Lili Qiu;	arxiv-cs.CL	2024-04-01
79	Advancing AI with Integrity: Ethical Challenges and Solutions in Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify and address ethical issues through empirical studies.	Richard Kimera; Yun-Seon Kim; Heeyoul Choi;	arxiv-cs.CL	2024-04-01
80	Syntactic Robustness for LLM-based Code Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on prompts that ask for code that generates solutions to variables in an equation, when given coefficients of the equation as input.	Laboni Sarker; Mara Downing; Achintya Desai; Tevfik Bultan;	arxiv-cs.SE	2024-04-01
81	Condition-Aware Neural Network for Controlled Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Condition-Aware Neural Network (CAN), a new method for adding control to image generative models.	HAN CAI et. al.	arxiv-cs.CV	2024-04-01
82	CHOPS: CHat with CustOmer Profile Systems for Customer Service with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a practical dataset, the CPHOS-dataset, which includes a database, guiding files, and QA pairs collected from CPHOS, an online platform that facilitates the organization of simulated Physics Olympiads for high school teachers and students.	JINGZHE SHI et. al.	arxiv-cs.CL	2024-03-31
83	EvoCodeBench: An Evolving Code Generation Benchmark Aligned with Real-World Code Repositories Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a new benchmark – EvoCodeBench to address the preceding problems, which has three primary advances.	Jia Li; Ge Li; Xuanming Zhang; Yihong Dong; Zhi Jin;	arxiv-cs.CL	2024-03-31
84	Jetsons at FinNLP 2024: Towards Understanding The ESG Impact of A News Article Using Transformer-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task.	PARAG PRAVIN DAKLE et. al.	arxiv-cs.CL	2024-03-30
85	TRABSA: Interpretable Sentiment Analysis of Tweets Using Attention-based BiLSTM and Twitter-RoBERTa Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TRABSA, a hybrid framework integrating transformer-based architectures, attention mechanisms, and BiLSTM networks to address this.	Md Abrar Jahin; Md Sakib Hossain Shovon; M. F. Mridha;	arxiv-cs.CL	2024-03-30
86	Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune Your Model Unless You Have Access to GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a PEFT method to improve the consistency of LLMs by merging adapters that were fine-tuned separately using triplet and language modelling objectives.	Aryo Pradipta Gema; Giwon Hong; Pasquale Minervini; Luke Daines; Beatrice Alex;	arxiv-cs.CL	2024-03-30
87	Cross-lingual Named Entity Corpus for Slavic Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a corpus manually annotated with named entities for six Slavic languages – Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian.	Jakub Piskorski; Michał Marcińczuk; Roman Yangarber;	arxiv-cs.CL	2024-03-30
88	A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In pursuit of suitable data augmentation methods, this study explores both established legacy approaches and contemporary practices such as Large Language Models (LLM), including GPT in Hate Speech detection.	Md Saroar Jahan; Mourad Oussalah; Djamila Romaissa Beddia; Jhuma kabir Mim; Nabil Arhab;	arxiv-cs.CL	2024-03-30
89	Spread Your Wings: A Radial Strip Transformer for Image Deblurring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Radial Strip Transformer (RST), which is a transformer-based architecture that restores the blur images in a polar coordinate system instead of a Cartesian one.	DUOSHENG CHEN et. al.	arxiv-cs.CV	2024-03-30
90	Transformer Based Pluralistic Image Completion with Reduced Information Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer. To mitigate these issues, we propose a new transformer based framework called PUT.	QIANKUN LIU et. al.	arxiv-cs.CV	2024-03-30
91	Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive.	Ahmad Diab; Rr. Nefriana; Yu-Ru Lin;	arxiv-cs.CL	2024-03-29
92	Shallow Cross-Encoders for Low-Latency Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, keeping search latencies low is important for user satisfaction and energy usage. In this paper, we show that weaker shallow transformer models (i.e., transformers with a limited number of layers) actually perform better than full-scale models when constrained to these practical low-latency settings since they can estimate the relevance of more documents in the same time budget.	Aleksandr V. Petrov; Sean MacAvaney; Craig Macdonald;	arxiv-cs.IR	2024-03-29
93	ReALM: Reference Resolution As Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates how LLMs can be used to create an extremely effective system to resolve references of various types, by showing how reference resolution can be converted into a language modeling problem, despite involving forms of entities like those on screen that are not traditionally conducive to being reduced to a text-only modality.	JOEL RUBEN ANTONY MONIZ et. al.	arxiv-cs.CL	2024-03-29
94	ChatGPT V.s. Media Bias: A Comparative Study of GPT-3.5 and Fine-tuned Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can ChatGPT detect media bias? This study seeks to answer this question by leveraging the Media Bias Identification Benchmark (MBIB) to assess ChatGPT’s competency in distinguishing six categories of media bias, juxtaposed against fine-tuned models such as BART, ConvBERT, and GPT-2.	Zehao Wen; Rabih Younes;	arxiv-cs.CL	2024-03-29
95	Decision Mamba: Reinforcement Learning Via Sequence Modeling with Selective State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of Mamba as a valuable tool for improving the efficacy of Transformer-based models in reinforcement learning scenarios.	Toshihiro Ota;	arxiv-cs.LG	2024-03-28
96	Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator’s behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Turbo) can ingest and generate, via a framework we call Keypoint Action Tokens (KAT).	Norman Di Palo; Edward Johns;	arxiv-cs.RO	2024-03-28
97	Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we deeply explore the mechanisms employed by Transformer-based language models in factual recall tasks.	ANG LV et. al.	arxiv-cs.CL	2024-03-28
98	Generate Then Retrieve: Conversational Response Retrieval Using LLMs As Answer and Query Generators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose three different methods for generating multiple queries to enhance the retrieval.	Zahra Abbasiantaeb; Mohammad Aliannejadi;	arxiv-cs.IR	2024-03-28
99	BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles.	ELLIOT BOLTON et. al.	arxiv-cs.CL	2024-03-27
100	Reshaping Free-Text Radiology Notes Into Structured Reports With Generative Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a pipeline to extract information from free-text radiology reports, that fits with the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma.	LAURA BERGOMI et. al.	arxiv-cs.CL	2024-03-27
101	PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students’ physics labs.	Ehsan Latif; Ramviyas Parasuraman; Xiaoming Zhai;	arxiv-cs.RO	2024-03-27
102	SemRoDe: Macro Adversarial Training to Learn Representations That Are Robust to Word-Level Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach called Semantic Robust Defence (SemRoDe), a Macro Adversarial Training strategy to enhance the robustness of LMs.	Brian Formento; Wenjie Feng; Chuan Sheng Foo; Luu Anh Tuan; See-Kiong Ng;	arxiv-cs.CL	2024-03-27
103	RankMamba: Benchmarking Mamba’s Document Ranking Performance in The Era of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine \mamba’s efficacy through the lens of a classical IR task — document ranking.	Zhichao Xu;	arxiv-cs.IR	2024-03-27
104	3P-LLM: Probabilistic Path Planning Using Large Language Model for Autonomous Robot Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research assesses the feasibility of using LLM (GPT-3.5-turbo chatbot by OpenAI) for robotic path planning.	Ehsan Latif;	arxiv-cs.RO	2024-03-27
105	MALBERT: Is A Compact Multilingual BERT Model Still Worth It? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Within the current trend of Pretained Language Models (PLM), emerge more and more criticisms about the ethical andecological impact of such models. In this article, considering these critical remarks, we propose to focus on smallermodels, such as compact models like ALBERT, which are more ecologically virtuous than these PLM.	Christophe Servan; Sahar Ghannay; Sophie Rosset;	arxiv-cs.AI	2024-03-27
106	AcTED: Automatic Acquisition of Typical Event Duration for Semi-supervised Temporal Commonsense QA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a voting-driven semi-supervised approach to automatically acquire the typical duration of an event and use it as pseudo-labeled data.	FELIX VIRGO et. al.	arxiv-cs.CL	2024-03-27
107	Evaluating Large Language Models for Health-Related Text Classification Tasks with Public Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We developed three approaches for leveraging LLMs for text classification: employing LLMs as zero-shot classifiers, us-ing LLMs as annotators to annotate training data for supervised classifiers, and utilizing LLMs with few-shot examples for augmentation of manually annotated data.	Yuting Guo; Anthony Ovadje; Mohammed Ali Al-Garadi; Abeed Sarker;	arxiv-cs.CL	2024-03-27
108	A Survey on Large Language Models from Concept to Implementation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent advancements in Large Language Models (LLMs), particularly those built on Transformer architectures, have significantly broadened the scope of natural language processing (NLP) applications, transcending their initial use in chatbot technology. This paper investigates the multifaceted applications of these models, with an emphasis on the GPT series.	Chen Wang; Jin Zhao; Jiaqi Gong;	arxiv-cs.CL	2024-03-27
109	The Topos of Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide a theoretical analysis of the expressivity of the transformer architecture through the lens of topos theory.	Mattia Jacopo Villani; Peter McBurney;	arxiv-cs.LG	2024-03-27
110	Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach named Hierarchical Light Transformer Ensembles (HLT-Ens), aimed at efficiently training an ensemble of Transformer architectures using a novel hierarchical loss function.	Adrien Lafage; Mathieu Barbier; Gianni Franchi; David Filliat;	arxiv-cs.CV	2024-03-26
111	Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by the support of two phases: BM25 Pre-ranking and BERT-based Re-ranking.	HAI-LONG NGUYEN et. al.	arxiv-cs.CL	2024-03-26
112	LLMs in HCI Data Work: Bridging The Gap Between Information Retrieval and Responsible Research Practices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Efficient and accurate information extraction from scientific papers is significant in the rapidly developing human-computer interaction research in the literature review process.	Neda Taghizadeh Serajeh; Iman Mohammadi; Vittorio Fuccella; Mattia De Rosa;	arxiv-cs.HC	2024-03-26
113	MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the empirical findings, we propose a novel LLM-based Multi-Agent framework for GitHub Issue reSolution, MAGIS, consisting of four kinds of agents customized for the software evolution: Manager, Repository Custodian, Developer, and Quality Assurance Engineer agents.	Wei Tao; Yucheng Zhou; Wenqiang Zhang; Yu Cheng;	arxiv-cs.SE	2024-03-26
114	Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design a task for testing lexical-syntactic flexibility — the degree to which models can generalize over words in a construction with a non-prototypical part of speech.	David R. Mortensen; Valentina Izrailevitch; Yunze Xiao; Hinrich Schütze; Leonie Weissweiler;	arxiv-cs.CL	2024-03-26
115	GPT-4 Understands Discourse at Least As Well As Humans Do Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief …	Thomas Shultz; Jamie Wise; Ardavan Salehi Nobandegani;	arxiv-cs.CL	2024-03-25
116	Decoding Probing: Revealing Internal Linguistic Structures in Neural Language Models Using Minimal Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by cognitive neuroscience studies, we introduce a novel `decoding probing’ method that uses minimal pairs benchmark (BLiMP) to probe internal linguistic characteristics in neural language models layer by layer.	Linyang He; Peili Chen; Ercong Nie; Yuanning Li; Jonathan R. Brennan;	arxiv-cs.CL	2024-03-25
117	CYGENT: A Cybersecurity Conversational Agent with Log Summarization Powered By GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to the escalating cyber-attacks in the modern IT and IoT landscape, we developed CYGENT, a conversational agent framework powered by GPT-3.5 turbo model, designed to aid system administrators in ensuring optimal performance and uninterrupted resource availability.	Prasasthy Balasubramanian; Justin Seby; Panos Kostakos;	arxiv-cs.CR	2024-03-25
118	Grammatical Vs Spelling Error Correction: An Investigation Into The Responsiveness of Transformer-based Language Models Using BART and MarianMT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims at analyzing different kinds of error that occurs in text documents.	Rohit Raju; Peeta Basa Pati; SA Gandheesh; Gayatri Sanjana Sannala; Suriya KS;	arxiv-cs.CL	2024-03-25
119	A Transformer Approach for Electricity Price Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel approach to electricity price forecasting (EPF) using a pure Transformer model.	Oscar Llorente; Jose Portela;	arxiv-cs.LG	2024-03-24
120	Can Language Models Pretend Solvers? Logic Code Simulation with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we introduce a pioneering LLM-based code simulation technique, Dual Chains of Logic (DCoL).	MINYU CHEN et. al.	arxiv-cs.AI	2024-03-24
121	Using Large Language Models for OntoClean-based Ontology Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology.	Yihang Zhao; Neil Vetter; Kaveh Aryan;	arxiv-cs.AI	2024-03-23
122	VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the purpose of future research, CafeBERT is made publicly available for research purposes.	Phong Nguyen-Thuan Do; Son Quoc Tran; Phu Gia Hoang; Kiet Van Nguyen; Ngan Luu-Thuy Nguyen;	arxiv-cs.CL	2024-03-23
123	LlamBERT: Large-scale Low-cost Data Annotation in NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present LlamBERT, a hybrid approach that leverages LLMs to annotate a small subset of large, unlabeled databases and uses the results for fine-tuning transformer encoders like BERT and RoBERTa.	Bálint Csanády; Lajos Muzsai; Péter Vedres; Zoltán Nádasdy; András Lukács;	arxiv-cs.CL	2024-03-23
124	MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 – Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection.	SADIYA SAYARA CHOWDHURY PUSPO et. al.	arxiv-cs.CL	2024-03-22
125	ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work presents ParFormer as an enhanced transformer architecture that allows the incorporation of different token mixers into a single stage, hence improving feature extraction capabilities.	NOVENDRA SETYAWAN et. al.	arxiv-cs.CV	2024-03-22
126	Comprehensive Evaluation and Insights Into The Use of Large Language Models in The Automation of Behavior-Driven Development Acceptance Test Formulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this manuscript, we propose a novel approach to enhance BDD practices using large language models (LLMs) to automate acceptance test generation.	SHANTHI KARPURAPU et. al.	arxiv-cs.SE	2024-03-22
127	Can Large Language Models Explore In-context? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making.	Akshay Krishnamurthy; Keegan Harris; Dylan J. Foster; Cyril Zhang; Aleksandrs Slivkins;	arxiv-cs.LG	2024-03-22
128	GPT-Connect: Interaction Between Text-Driven Human Motion Generator and 3D Scenes in A Training-free Manner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, intuitively training a separate scene-aware motion generator in a supervised way can require a large amount of motion samples to be troublesomely collected and annotated in a large scale of different 3D scenes. To handle this task rather in a relatively convenient manner, in this paper, we propose a novel GPT-connect framework.	Haoxuan Qu; Ziyan Guo; Jun Liu;	arxiv-cs.CV	2024-03-22
129	On Zero-Shot Counterspeech Generation By LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hatespeech – counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performances of four LLMs namely GPT-2, DialoGPT, ChatGPT and FlanT5 in zero-shot settings for counterspeech generation, which is the first of its kind.	Punyajoy Saha; Aalok Agrawal; Abhik Jana; Chris Biemann; Animesh Mukherjee;	arxiv-cs.CL	2024-03-22
130	Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish.	Lanxin Xu; Shuo Wang;	arxiv-cs.CV	2024-03-22
131	K-Act2Emo: Korean Commonsense Knowledge Graph for Indirect Emotional Expression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In many literary texts, emotions are indirectly conveyed through descriptions of actions, facial expressions, and appearances, necessitating emotion inference for narrative understanding. In this paper, we introduce K-Act2Emo, a Korean commonsense knowledge graph (CSKG) comprising 1,900 indirect emotional expressions and the emotions inferable from them.	Kyuhee Kim; Surin Lee; Sangah Lee;	arxiv-cs.CL	2024-03-21
132	LLM-based Extraction of Contradictions from Patents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper goes one step further, as it presents a method to extract TRIZ contradictions from patent texts based on Prompt Engineering using a generative Large Language Model (LLM), namely OpenAI’s GPT-4.	Stefan Trapp; Joachim Warschat;	arxiv-cs.CL	2024-03-21
133	Extracting Emotion Phrases from Tweets Using BART Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we applied an approach to sentiment analysis based on a question-answering framework.	Mahdi Rezapour;	arxiv-cs.CL	2024-03-20
134	Ax-to-Grind Urdu: Benchmark Dataset for Urdu Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we curate and contribute the first largest publicly available dataset for Urdu FND, Ax-to-Grind Urdu, to bridge the identified gaps and limitations of existing Urdu datasets in the literature.	Sheetal Harris; Jinshuo Liu; Hassan Jalil Hadi; Yue Cao;	arxiv-cs.CL	2024-03-20
135	Open Access NAO (OAN): A ROS2-based Software Framework for HRI Applications with The NAO Robot Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new software framework for HRI experimentation with the sixth version of the common NAO robot produced by the United Robotics Group.	Antonio Bono; Kenji Brameld; Luigi D’Alfonso; Giuseppe Fedele;	arxiv-cs.RO	2024-03-20
136	AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional methods for integrating such multimodal information often stumble, leading to less-than-ideal outcomes in the task of facial action unit detection. To overcome these shortcomings, we propose a novel approach utilizing audio-visual multimodal data.	JUN YU et. al.	arxiv-cs.CV	2024-03-20
137	Retina Vision Transformer (RetinaViT): Introducing Scaled Patches Into Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Humans see low and high spatial frequency components at the same time, and combine the information from both to form a visual scene. Drawing on this neuroscientific inspiration, we propose an altered Vision Transformer architecture where patches from scaled down versions of the input image are added to the input of the first Transformer Encoder layer.	Yuyang Shu; Michael E. Bain;	arxiv-cs.CV	2024-03-20
138	Generating Automatic Feedback on UI Mockups with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the potential of using large language models for automatic feedback.	Peitong Duan; Jeremy Warner; Yang Li; Bjoern Hartmann;	arxiv-cs.HC	2024-03-19
139	Automated Data Curation for Robust Language Model Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an automated data curation pipeline CLEAR (Confidence-based LLM Evaluation And Rectification) for instruction tuning datasets, that can be used with any LLM and fine-tuning procedure.	Jiuhai Chen; Jonas Mueller;	arxiv-cs.CL	2024-03-19
140	Encode Once and Decode in Parallel: Efficient Transformer Decoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and question-answering tasks where multiple outputs are required of a single input.	BO-RU LU et. al.	arxiv-cs.CL	2024-03-19
141	TT-BLIP: Enhancing Fake News Detection Using BLIP and Tri-Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an end-to-end model called TT-BLIP that applies the bootstrapping language-image pretraining for unified vision-language understanding and generation (BLIP) for three types of information: BERT and BLIP\textsubscript{Txt} for text, ResNet and BLIP\textsubscript{Img} for images, and bidirectional BLIP encoders for multimodal information.	Eunjee Choi; Jong-Kook Kim;	arxiv-cs.LG	2024-03-19
142	Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive study on the application of GPT-4, a large language model, for automatic information extraction from UK Employment Tribunal (UKET) cases.	Joana Ribeiro de Faria; Huiyuan Xie; Felix Steffek;	arxiv-cs.CL	2024-03-19
143	Navigating Compiler Errors with AI Assistance — A Study of GPT Hints in An Introductory Programming Course Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We examined the efficacy of AI-assisted learning in an introductory programming course at the university level by using a GPT-4 model to generate personalized hints for compiler errors within a platform for automated assessment of programming assignments.	Maciej Pankiewicz; Ryan S. Baker;	arxiv-cs.SE	2024-03-19
144	LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is not aligned with the prompt compression objective. To address these issues, we propose a data distillation procedure to derive knowledge from an LLM to compress prompts without losing crucial information, and meantime, introduce an extractive text compression dataset.	ZHUOSHI PAN et. al.	arxiv-cs.CL	2024-03-19
145	CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We describe the dataset and benchmark naive, traditional, and Transformer models.	Korbinian Randl; John Pavlopoulos; Aron Henriksson; Tony Lindgren;	arxiv-cs.CL	2024-03-18
146	Shifting The Lens: Detecting Malware in Npm Ecosystem with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this study is to assist security analysts in identifying malicious packages through the empirical study of large language models (LLMs) to detect potential malware in the npm ecosystem.	Nusrat Zahan; Philipp Burckhardt; Mikola Lysenko; Feross Aboukhadijeh; Laurie Williams;	arxiv-cs.CR	2024-03-18
147	Fusion Transformer with Object Mask Guidance for Image Forgery Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce OMG-Fuser, a fusion transformer-based network designed to extract information from various forensic signals to enable robust image forgery detection and localization.	Dimitrios Karageorgiou; Giorgos Kordopatis-Zilos; Symeon Papadopoulos;	arxiv-cs.CV	2024-03-18
148	Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its impressive capabilities, the financial cost associated with GPT-4V’s inference presents a substantial barrier for its wide use. To address this challenge, our work introduces Collage Prompting, a budget-friendly prompting approach that concatenates multiple images into a single visual input.	Siyu Xu; Yunke Wang; Daochang Liu; Chang Xu;	arxiv-cs.CV	2024-03-18
149	Emotion Detection with Transformers: A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the application of transformer-based models for emotion classification on text data.	Mahdi Rezapour;	arxiv-cs.CL	2024-03-18
150	An Empirical Study on JIT Defect Prediction Based on BERT-style Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we perform a systematic empirical study to understand the impact of the settings of the fine-tuning process on BERT-style pre-trained model for JIT defect prediction.	Yuxiang Guo; Xiaopeng Gao; Bo Jiang;	arxiv-cs.SE	2024-03-17
151	Embracing The Generative AI Revolution: Advancing Tertiary Education in Cybersecurity with GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the impact of GPTs, specifically ChatGPT, on tertiary education in cybersecurity, and provided recommendations for universities to adapt their curricula to meet the evolving needs of the industry.	Raza Nowrozy; David Jam;	arxiv-cs.CY	2024-03-17
152	Reasoning in Transformers – Mitigating Spurious Correlations and Reasoning Shortcuts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we investigate to what extent transformers can be trained to a) approximate reasoning in propositional logic while b) avoiding known reasoning shortcuts via spurious correlations in the training data.	Daniel Enström; Viktor Kjellberg; Moa Johansson;	arxiv-cs.LG	2024-03-17
153	From Words to Routes: Applying Large Language Models to Vehicle Routing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The success of LLMs in these tasks leads us to wonder: What is the ability of LLMs to solve vehicle routing problems (VRPs) with natural language task descriptions? In this work, we study this question in three steps.	Zhehui Huang; Guangyao Shi; Gaurav S. Sukhatme;	arxiv-cs.CL	2024-03-15
154	ATOM: Asynchronous Training of Massive Models for Deep Learning in A Decentralized Environment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce \atom, a resilient distributed training framework designed for asynchronous training of vast models in a decentralized setting using cost-effective hardware, including consumer-grade GPUs and Ethernet.	Xiaofeng Wu; Jia Rao; Wei Chen;	arxiv-cs.DC	2024-03-15
155	GiT: Towards Generalist Vision Transformer Through Universal Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a simple, yet effective framework, called GiT, simultaneously applicable for various vision tasks only with a vanilla ViT.	HAIYANG WANG et. al.	arxiv-cs.CV	2024-03-14
156	Evaluating LLMs for Gender Disparities in Notable Persons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect hallucinated responses or to altogether decline to even answer prompt at all.	Lauren Rhue; Sofie Goethals; Arun Sundararajan;	arxiv-cs.CL	2024-03-14
157	Reality Bites: Assessing The Realism of Driving Scenarios with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification.	Jiahui Wu; Chengjie Lu; Aitor Arrieta; Tao Yue; Shaukat Ali;	arxiv-cs.SE	2024-03-14
158	AI on AI: Exploring The Utility of GPT As An Expert Annotator of AI Publications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our results indicate that with effective prompt engineering, chatbots can be used as reliable data annotators even where subject-area expertise is required. To evaluate the utility of chatbot-annotated datasets on downstream classification tasks, we train a new classifier on GPT-labeled data and compare its performance to the arXiv-trained model.	Autumn Toney-Wails; Christian Schoeberl; James Dunham;	arxiv-cs.CL	2024-03-14
159	Fisher Mask Nodes for Language Model Merging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce a novel model merging method for Transformers, combining insights from previous work in Fisher-weighted averaging and the use of Fisher information in model pruning.	Thennal D K; Ganesh Nathan; Suchithra M S;	arxiv-cs.CL	2024-03-14
160	Sabiá-2: A New Generation of Portuguese Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Sabi\’a-2, a family of large language models trained on Portuguese texts.	Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira; Ramon Pires;	arxiv-cs.CL	2024-03-14
161	Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Targeting at VL PEFT tasks, we propose a family of operations, called routing functions, to enhance VL alignment in the low-rank bottlenecks.	Tingyu Qu; Tinne Tuytelaars; Marie-Francine Moens;	arxiv-cs.CV	2024-03-14
162	FBPT: A Fully Binary Point Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices.	Zhixing Hou; Yuzhang Shang; Yan Yan;	arxiv-cs.CV	2024-03-14
163	ViTCN: Vision Transformer Contrastive Network For Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Zhang et al proposed a dataset called RAVEN which can be used to test Machine Learning model abstract reasoning ability. In this paper, we purposed Vision Transformer Contrastive Network which build on previous work with the Contrastive Perceptual Inference network (CoPiNet), which set a new benchmark for permutationinvariant models Raven Progressive Matrices by incorporating contrast effects from psychology, cognition, and education, and extends this foundation by leveraging the cutting-edge Vision Transformer architecture.	Bo Song; Yuanhao Xu; Yichao Wu;	arxiv-cs.CV	2024-03-14
164	A Continued Pretrained LLM Approach for Automatic Medical Note Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HEAL, the first continuously trained 13B LLaMA2-based LLM that is purpose-built for medical conversations and measured on automated scribing.	DONG YUAN et. al.	arxiv-cs.CL	2024-03-13
165	GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored By Compliance, Context and Attribute Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security.	Raza Nowrozy; Khandakar Ahmed; Hua Wang;	arxiv-cs.CY	2024-03-13
166	Do Language Models Care About Text Quality? Evaluating Web-Crawled Corpora Across 11 Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare four of the currently most relevant large, web-crawled corpora (CC100, MaCoCu, mC4 and OSCAR) across eleven lower-resourced European languages.	RIK VAN NOORD et. al.	arxiv-cs.CL	2024-03-13
167	Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore training open-source small multimodal models (SMMs) to bridge biomedical competency gaps for unmet clinical needs.	JUAN MANUEL ZAMBRANO CHAVES et. al.	arxiv-cs.CL	2024-03-12
168	The Future of Document Indexing: GPT and Donut Revolutionize Table of Content Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model.	Degaga Wolde Feyisa; Haylemicheal Berihun; Amanuel Zewdu; Mahsa Najimoghadam; Marzieh Zare;	arxiv-cs.IR	2024-03-12
169	ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing studies are devoted to designing vision-specific transformers to solve the above problems, which introduce additional pre-training costs. Therefore, we present a plain, pre-training-free, and feature-enhanced ViT backbone with Convolutional Multi-scale feature interaction, named ViT-CoMer, which facilitates bidirectional interaction between CNN and transformer.	Chunlong Xia; Xinliang Wang; Feng Lv; Xin Hao; Yifeng Shi;	arxiv-cs.CV	2024-03-12
170	Rethinking Generative Large Language Model Evaluation for Semantic Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we introduce an RWQ-Elo rating system, engaging 24 LLMs such as GPT-4, GPT-3.5, Google-Gemini-Pro and LLaMA-1/-2, in a two-player competitive format, with GPT-4 serving as the judge.	Fangyun Wei; Xi Chen; Lin Luo;	arxiv-cs.CL	2024-03-12
171	SIFiD: Reassess Summary Factual Inconsistency Detection with LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we reassess summary inconsistency detection with LLMs, comparing the performances of GPT-3.5 and GPT-4.	JIUDING YANG et. al.	arxiv-cs.CL	2024-03-12
172	In-context Learning Enables Multimodal Large Language Models to Classify Cancer Pathology Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates.	DYKE FERBER et. al.	arxiv-cs.CV	2024-03-12
173	Rethinking ASTE: A Minimalist Tagging Scheme Alongside Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches to ASTE often complicate the task with additional structures or external data. In this research, we propose a novel tagging scheme and employ a contrastive learning approach to mitigate these challenges.	Qiao Sun; Liujia Yang; Minghao Ma; Nanyang Ye; Qinying Gu;	arxiv-cs.CL	2024-03-12
174	GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present GPT Reddit Dataset (GRiD), a novel Generative Pretrained Transformer (GPT)-generated text detection dataset designed to assess the performance of detection models in identifying generated responses from ChatGPT.	Zubair Qazi; William Shiao; Evangelos E. Papalexakis;	arxiv-cs.CL	2024-03-12
175	Development of A Reliable and Accessible Caregiving Language Model (CaLM) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focused on caregivers of individuals with Alzheimer’s Disease Related Dementias.	BAMBANG PARMANTO et. al.	arxiv-cs.CL	2024-03-11
176	QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, significant challenges arise during post-training linear quantization, leading to noticeable reductions in inference accuracy. Our study focuses on uncovering the underlying causes of these accuracy drops and proposing a quantization-friendly fine-tuning method, \textbf{QuantTune}.	JIUN-MAN CHEN et. al.	arxiv-cs.CV	2024-03-11
177	Stealing Part of A Production Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2.	NICHOLAS CARLINI et. al.	arxiv-cs.CR	2024-03-11
178	Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Which we use in another set of transformer encoder layers to learn the inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo, and GPT-J) with the hierarchical framework of MESc and compare them with their standalone performance on legal texts.	Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki;	arxiv-cs.CL	2024-03-11
179	LLMs Still Can’t Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard’s Capacity to Handle Object-Oriented Programming Assignments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we experimented with three prominent LLMs – GPT-3.5, GPT-4, and Bard – to solve real-world OOP exercises used in educational settings, subsequently validating their solutions using an Automatic Assessment Tool (AAT).	Bruno Pereira Cipriano; Pedro Alves;	arxiv-cs.SE	2024-03-10
180	MoST: Motion Style Transformer Between Diverse Action Contents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangles style from content and generates a plausible motion with transferred style from a source motion.	Boeun Kim; Jungho Kim; Hyung Jin Chang; Jin Young Choi;	arxiv-cs.CV	2024-03-10
181	GPT As Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.	HAO LU et. al.	arxiv-cs.CV	2024-03-09
182	TsGT: Stochastic Time Series Modeling With Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we go in a different direction by introducing tsGT, a stochastic time series model built on a general-purpose transformer architecture.	ŁUKASZ KUCIŃSKI et. al.	arxiv-cs.LG	2024-03-08
183	Decomposing Vision-based LLM Predictions for Auto-Evaluation with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel evaluation framework to judge the capabilities of vision-language LLMs in generating accurate summaries of CT-based abnormalities.	QINGQING ZHU et. al.	arxiv-cs.AI	2024-03-08
184	To Err Is Human, But Llamas Can Learn It Too Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs).	Agnes Luhtaru; Taido Purason; Martin Vainikko; Maksym Del; Mark Fishel;	arxiv-cs.CL	2024-03-08
185	Will GPT-4 Run DOOM? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that GPT-4’s reasoning and planning capabilities extend to the 1993 first-person shooter Doom.	Adrian de Wynter;	arxiv-cs.CL	2024-03-08
186	RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore how iterative revising a chain of thoughts with the help of information retrieval significantly improves large language models’ reasoning and generation ability in long-horizon generation tasks, while hugely mitigating hallucination.	ZIHAO WANG et. al.	arxiv-cs.CL	2024-03-08
187	The Impact of Quantization on The Robustness of Transformer-based Text Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the effect of quantization on the robustness of Transformer-based models.	Seyed Parsa Neshaei; Yasaman Boreshban; Gholamreza Ghassem-Sani; Seyed Abolghasem Mirroshandel;	arxiv-cs.CL	2024-03-08
188	An In-depth Evaluation of GPT-4 in Sentence Simplification with Error-based Human Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We design an error-based human annotation framework to assess the GPT-4’s simplification capabilities.	Xuanxin Wu; Yuki Arase;	arxiv-cs.CL	2024-03-07
189	Feedback-Generation for Programming Exercises With GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ever since Large Language Models (LLMs) and related applications have become broadly available, several studies investigated their potential for assisting educators and supporting students in higher education.	Imen Azaiz; Natalie Kiesler; Sven Strickroth;	arxiv-cs.AI	2024-03-07
190	Federated Recommendation Via Hybrid Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we propose GPT-FedRec, a federated recommendation framework leveraging ChatGPT and a novel hybrid Retrieval Augmented Generation (RAG) mechanism.	Huimin Zeng; Zhenrui Yue; Qian Jiang; Dong Wang;	arxiv-cs.IR	2024-03-07
191	Assessing The Aesthetic Evaluation Capabilities of GPT-4 with Vision: Insights from Group and Individual Assessments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been recognized that large language models demonstrate high performance on various intellectual tasks.	Yoshia Abe; Tatsuya Daikoku; Yasuo Kuniyoshi;	arxiv-cs.AI	2024-03-06
192	Whodunit: Classifying Code As Human Authored or GPT-4 Generated — A Case Study on CodeChef Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study shows that code stylometry is a promising approach for distinguishing between GPT-4 generated code and human-authored code.	Oseremen Joy Idialu; Noble Saji Mathews; Rungroj Maipradit; Joanne M. Atlee; Mei Nagappan;	arxiv-cs.SE	2024-03-06
193	Probabilistic Topic Modelling with Transformer Representations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling.	Arik Reuter; Anton Thielmann; Christoph Weisser; Benjamin Säfken; Thomas Kneib;	arxiv-cs.LG	2024-03-06
194	Designing Informative Metrics for Few-Shot Example Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a complexity-based prompt selection approach for sequence tagging tasks.	Rishabh Adiga; Lakshminarayanan Subramanian; Varun Chandrasekaran;	arxiv-cs.CL	2024-03-06
195	Can Large Language Models Do Analytical Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the cutting-edge Large Language Model with analytical reasoning on sports.	YEBOWEN HU et. al.	arxiv-cs.CL	2024-03-06
196	Rapidly Developing High-quality Instruction Data and Evaluation Benchmark for Large Language Models with Minimal Human Effort: A Case Study on Japanese Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead of following the popular practice of directly translating existing English resources into Japanese (e.g., Japanese-Alpaca), we propose an efficient self-instruct method based on GPT-4.	YIKUN SUN et. al.	arxiv-cs.CL	2024-03-06
197	Japanese-English Sentence Translation Exercises Dataset for Automatic Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the task of automatic assessment of Sentence Translation Exercises (STEs), that have been used in the early stage of L2 language learning.	NAOKI MIURA et. al.	arxiv-cs.CL	2024-03-05
198	AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys.	Anjalee De Silva; Janaka L. Wijekoon; Rashini Liyanarachchi; Rrubaa Panchendrarajan; Weranga Rajapaksha;	arxiv-cs.AI	2024-03-05
199	InjectTST: A Transformer Method of Injecting Global Information Into Independent Channels for Long Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the problem, an injection method for global information into channel-independent Transformer, InjectTST, is proposed in this paper.	CE CHI et. al.	arxiv-cs.LG	2024-03-05
200	Solving The Bongard-logo Problem By Modeling A Probabilistic Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces PMoC, a tailored probability model for the Bongard-Logo problem, achieving high reasoning accuracy by constructing independent probability models.	Ruizhuo Song; Beiming Yuan;	arxiv-cs.CV	2024-03-05
201	PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These setups constrain evaluation methods (e.g., predefined action space), architectural choices (e.g., only generative models), and overlook the linguistic nuances essential for realistic analysis. To tackle this, we present PARADISE, an abductive reasoning task using Q\&A format on practical procedural text sourced from wikiHow.	Arda Uzunoglu; Abdalfatah Rashid Safa; Gözde Gül Şahin;	arxiv-cs.CL	2024-03-05
202	Design2Code: How Far Are We From Automating Front-End Engineering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This can enable a new paradigm of front-end development, in which multimodal LLMs might directly convert visual designs into code implementations. In this work, we formalize this as a Design2Code task and conduct comprehensive benchmarking.	Chenglei Si; Yanzhe Zhang; Zhengyuan Yang; Ruibo Liu; Diyi Yang;	arxiv-cs.CL	2024-03-05
203	Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled By GPT-4 for Enhanced Interpretability and Public Engagement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: And the public requires complex techniques to inquiry and understand socio-cultural and institutional factors, often hinders the public’s understanding of flood risks. To overcome these challenges, our study introduces an innovative solution: a customized AI Assistant powered by the GPT-4 Large Language Model.	Rafaela Martelo; Ruo-Qian Wang;	arxiv-cs.AI	2024-03-05
204	Evolution Transformer: In-Context Evolutionary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: An alternative promising approach is to leverage data and directly discover powerful optimization principles via meta-optimization. In this work, we follow such a paradigm and introduce Evolution Transformer, a causal Transformer architecture, which can flexibly characterize a family of Evolution Strategies.	Robert Tjarko Lange; Yingtao Tian; Yujin Tang;	arxiv-cs.AI	2024-03-05
205	JMI at SemEval 2024 Task 3: Two-step Approach for Multimodal ECAC Using In-context Learning with GPT and Instruction-tuned Llama Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents our system development for SemEval-2024 Task 3: The Competition of Multimodal Emotion Cause Analysis in Conversations.	Mohammed Abbas Ansari; Chandni Saxena; Tanvir Ahmad;	arxiv-cs.CL	2024-03-05
206	PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In today’s landscape where role-play is a common strategy when using LLMs, our research highlights the need for caution, as models that adopt specific personas with personalities potentially also alter their reasoning abilities in an unexpected manner.	FIONA ANTING TAN et. al.	arxiv-cs.CL	2024-03-04
207	Using LLMs for The Extraction and Normalization of Product Attribute Values Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For our experiments, we introduce the WDC Product Attribute-Value Extraction (WDC PAVE) dataset.	Nick Baumann; Alexander Brinkmann; Christian Bizer;	arxiv-cs.CL	2024-03-04
208	Transformer for Times Series: An Application to The S&P500 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The transformer models have been extensively used with good results in a wide area of machine learning applications including Large Language Models and image generation.	Pierre Brugiere; Gabriel Turinici;	arxiv-cs.AI	2024-03-04
209	Improving The Validity of Automatically Generated Feedback Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we address both problems of automatically generating and evaluating feedback while considering both correctness and alignment.	Alexander Scarlatos; Digory Smith; Simon Woodhead; Andrew Lan;	arxiv-cs.CL	2024-03-02
210	LM4OPT: Unveiling The Potential of Large Language Models in Formulating Mathematical Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly evolving field of natural language processing, the translation of linguistic descriptions into mathematical formulation of optimization problems presents a formidable challenge, demanding intricate understanding and processing capabilities from Large Language Models (LLMs). This study compares prominent LLMs, including GPT-3.5, GPT-4, and Llama-2-7b, in zero-shot and one-shot settings for this task.	Tasnim Ahmed; Salimur Choudhury;	arxiv-cs.CL	2024-03-02
211	LAB: Large-Scale Alignment for ChatBots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training.	SHIVCHANDER SUDALAIRAJ et. al.	arxiv-cs.CL	2024-03-01
212	Surveying The Dead Minds: Historical-Psychological Text Analysis with Contextualized Construct Representation (CCR) for Classical Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a pipeline for historical-psychological text analysis in classical Chinese.	Yuqi Chen; Sixuan Li; Ying Li; Mohammad Atari;	arxiv-cs.CL	2024-03-01
213	A Systematic Evaluation of Large Language Models for Generating Programming Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most LeetCode and GeeksforGeeks coding contests evaluated in this study, GPT-4 employing the optimal prompt strategy outperforms 85 percent of human participants.	Wenpin Hou; Zhicheng Ji;	arxiv-cs.SE	2024-03-01
214	VisionLLaMA: A Unified LLaMA Interface for Vision Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Can the same transformer be used to process 2D images? In this paper, we answer this question by unveiling a LLaMA-like vision transformer in plain and pyramid forms, termed VisionLLaMA, which is tailored for this purpose.	Xiangxiang Chu; Jianlin Su; Bo Zhang; Chunhua Shen;	arxiv-cs.CV	2024-03-01
215	NewsBench: Systematic Evaluation of LLMs for Writing Proficiency and Safety Adherence in Chinese Journalistic Editorial Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents NewsBench, a novel benchmark framework developed to evaluate the capability of Large Language Models (LLMs) in Chinese Journalistic Writing Proficiency (JWP) and their Safety Adherence (SA), addressing the gap between journalistic ethics and the risks associated with AI utilization.	MIAO LI et. al.	arxiv-cs.CL	2024-02-29
216	Here’s A Free Lunch: Sanitizing Backdoored Models with Model Merge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our experiments, we explore various models (BERT-Base, RoBERTa-Large, Llama2-7B, and Mistral-7B) and datasets (SST-2, OLID, AG News, and QNLI).	ANSH ARORA et. al.	arxiv-cs.CL	2024-02-29
217	RL-GPT: Integrating Reinforcement Learning and Code-as-policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To seamlessly integrate both modalities, we introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.	SHAOTENG LIU et. al.	arxiv-cs.AI	2024-02-29
218	PeLLE: Encoder-based Language Models for Brazilian Portuguese Based on Open Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we present PeLLE, a family of large language models based on the RoBERTa architecture, for Brazilian Portuguese, trained on curated, open data from the Carolina corpus.	GUILHERME LAMARTINE DE MELLO et. al.	arxiv-cs.CL	2024-02-29
219	PROC2PDDL: Open-Domain Planning Representations from Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent approaches have used language models to predict a planning domain definition (e.g., PDDL) but have only been evaluated in closed-domain simulated environments. To address this, we present Proc2PDDL , the first dataset containing open-domain procedural texts paired with expert-annotated PDDL representations.	TIANYI ZHANG et. al.	arxiv-cs.CL	2024-02-29
220	Query-OPT: Optimizing Inference of Large Language Models Via Multi-Query Instructions in Meeting Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, repeated calls to the LLM inference endpoints would significantly increase the costs of using them in production, making LLMs impractical for many real-world use cases. To address this problem, in this paper, we investigate whether combining the queries for the same input context in a single prompt to minimize repeated calls can be successfully used in meeting summarization.	Md Tahmid Rahman Laskar; Elena Khasanova; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN;	arxiv-cs.CL	2024-02-29
221	Can GPT Improve The State of Prior Authorization Via Guideline Based Automated Question Answering? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster.	Shubham Vatsal; Ayush Singh; Shabnam Tafreshi;	arxiv-cs.CL	2024-02-28
222	A Language Model Based Framework for New Concept Placement in Ontologies Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In all steps, we propose to leverage neural methods, where we apply embedding-based methods and contrastive learning with Pre-trained Language Models (PLMs) such as BERT for edge search, and adapt a BERT fine-tuning-based multi-label Edge-Cross-encoder, and Large Language Models (LLMs) such as GPT series, FLAN-T5, and Llama 2, for edge selection.	Hang Dong; Jiaoyan Chen; Yuan He; Yongsheng Gao; Ian Horrocks;	arxiv-cs.CL	2024-02-27
223	Conformer: Embedding Continuous Attention in Vision Transformer for Weather Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although Transformers-based models have shown remarkable potential in weather forecasting, Transformers are discrete models which limit their ability to learn the continuous spatio-temporal features of the dynamical weather system. We address this issue with Conformer, a spatio-temporal Continuous Vision Transformer for weather forecasting.	Hira Saleem; Flora Salim; Cormac Purcell;	arxiv-cs.LG	2024-02-27
224	Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks characterized by the possibility to control the problem difficulty with two parameters.	Flavio Petruzzellis; Alberto Testolin; Alessandro Sperduti;	arxiv-cs.CL	2024-02-27
225	Fraud Detection with Binding Global and Local Relational Interaction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel framework called Relation-Aware GNN with transFormer (RAGFormer) which simultaneously embeds local and global features into a target node.	HAOLIN LI et. al.	arxiv-cs.LG	2024-02-27
226	Variational Learning Is Effective for Large Deep Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show several new use cases of IVON where we improve fine-tuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data.	YUESONG SHEN et. al.	arxiv-cs.LG	2024-02-27
227	Can GPT-4 Identify Propaganda? Annotation and Detection of Propaganda Spans in News Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The majority of the recent initiatives targeting medium to low-resource languages produced relatively small annotated datasets, with a skewed distribution, posing challenges for the development of sophisticated propaganda detection models. To address this challenge, we carefully develop the largest propaganda dataset to date, ArPro, comprised of 8K paragraphs from newspaper articles, labeled at the text span level following a taxonomy of 23 propagandistic techniques.	Maram Hasanain; Fatema Ahmed; Firoj Alam;	arxiv-cs.CL	2024-02-27
228	Latent Attention for Linear Time Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The time complexity of the standard attention mechanism in a transformer scales quadratically with the length of the sequence. We introduce a method to reduce this to linear scaling with time, based on defining attention via latent vectors.	Rares Dolga; Marius Cobzarenco; David Barber;	arxiv-cs.CL	2024-02-27
229	CAPT: Category-level Articulation Estimation from A Single Point Cloud Using Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose CAPT: category-level articulation estimation from a point cloud using Transformer.	Lian Fu; Ryoichi Ishikawa; Yoshihiro Sato; Takeshi Oishi;	arxiv-cs.CV	2024-02-27
230	Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose temporal logic Specification-conditioned Decision Transformer (SDT), a novel framework that harnesses the expressive power of signal temporal logic (STL) to specify complex temporal rules that an agent should follow and the sequential modeling capability of Decision Transformer (DT).	Zijian Guo; Weichao Zhou; Wenchao Li;	arxiv-cs.LG	2024-02-27
231	GeoLLM: Extracting Geospatial Knowledge from Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks.	Anonymous Authors;	iclr	2024-02-26
232	Asymmetry in Low-Rank Adapters of Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices.	JIACHENG ZHU et. al.	arxiv-cs.LG	2024-02-26
233	Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative pre-trained models have demonstrated remarkable effectiveness in language and vision domains by learning useful representations. In this paper, we extend the scope of this effectiveness by showing that visual robot manipulation can significantly benefit from large-scale video generative pre-training.	HONGTAO WU et. al.	iclr	2024-02-26
234	Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the effect of code on enhancing LLMs’ reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter.	AOJUN ZHOU et. al.	iclr	2024-02-26
235	The Reversal Curse: LLMs Trained on “A Is B” Fail to Learn “B Is A” Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e. if A is B occurs, B is A is more likely to occur). We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as Uriah Hawthorne is the composer of Abyssal Melodies and showing that they fail to correctly answer Who composed *Abyssal Melodies?	LUKAS BERGLUND et. al.	iclr	2024-02-26
236	Transformer Fusion with Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a systematic approach for fusing two or more transformer-based networks exploiting Optimal Transport to (soft-)align the various architectural components.	MORITZ IMFELD et. al.	iclr	2024-02-26
237	Xformer: Hybrid X-Shaped Transformer for Image Denoising Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a hybrid X-shaped vision Transformer, named Xformer, which performs notably on image denoising tasks.	Anonymous Authors;	iclr	2024-02-26
238	PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces PixArt-$\alpha$, a Transformer-based T2I diffusion model whose image generation quality is competitive with state-of-the-art image generators (e.g., Imagen, SDXL, and even Midjourney), reaching near-commercial application standards.	JUNSONG CHEN et. al.	iclr	2024-02-26
239	Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We used an ablation study to show that joint training on neuronal responses and behavior boosted performance, highlighting the model’s ability to associate behavioral and neural representations in an unsupervised manner.	Antonis Antoniades; Yiyi Yu; Joe S Canzano; William Yang Wang; Spencer Smith;	iclr	2024-02-26
240	The Devil Is in The Neurons: Interpreting and Mitigating Social Biases in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to unveil the mystery of social bias inside language models by introducing the concept of {\sc Social Bias Neurons}.	YAN LIU et. al.	iclr	2024-02-26
241	Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SUFO, a systematic framework that enhances interpretability of fine-tuned transformer feature spaces.	Anonymous Authors;	iclr	2024-02-26
242	Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time.	Anonymous Authors;	iclr	2024-02-26
243	Graph Transformers on EHRs: Better Representation Improves Downstream Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose GT-BEHRT, a new approach that leverages temporal visit embeddings extracted from a graph transformer and uses a BERT-based model to obtain more robust patient representations, especially on longer EHR sequences.	Raphael Poulain; Rahmatollah Beheshti;	iclr	2024-02-26
244	The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity.	Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Re;	iclr	2024-02-26
245	Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures.	Liu Yang; Kangwook Lee; Robert D Nowak; Dimitris Papailiopoulos;	iclr	2024-02-26
246	DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Conventional training-based methods have limitations in flexibility, particularly when adapting to new domains, and they often lack explanatory power. To address this gap, we propose a novel training-free detection strategy called Divergent N-Gram Analysis (DNA-GPT).	XIANJUN YANG et. al.	iclr	2024-02-26
247	Masked Distillation Advances Self-Supervised Transformer Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper proposes a masked image modelling (MIM) based self-supervised neural architecture search method specifically designed for vision transformers, termed as MaskTAS, which completely avoids the expensive costs of data labeling inherited from supervised learning.	CAIXIA YAN et. al.	iclr	2024-02-26
248	A Multi-Level Framework for Accelerating Training Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration.	Longwei Zou; Han Zhang; Yangdong Deng;	iclr	2024-02-26
249	Dual-Space Optimization: Improved Molecule Sequence Design By Latent Prompt Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Designing molecules with desirable properties, such as drug-likeliness and high binding affinities towards protein targets, is a challenging problem. In this paper, we propose the Dual-Space Optimization (DSO) method that integrates latent space sampling and data space selection to solve this problem.	DEQIAN KONG et. al.	arxiv-cs.LG	2024-02-26
250	Learning Unsupervised World Models for Autonomous Driving Via Discrete Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we propose a novel world modeling approach that first tokenizes sensor observations with VQVAE, then predicts the future via discrete diffusion.	Anonymous Authors;	iclr	2024-02-26
251	Is Self-Repair A Silver Bullet for Code Generation? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze Code Llama, GPT-3.5 and GPT-4’s ability to perform self-repair on problems taken from HumanEval or APPS, finding that when the cost of carrying out repair is taken into account gains are often modest, vary a lot between subsets of the data, and are sometimes not present at all.	Theo X. Olausson; Jeevana Priya Inala; Chenglong Wang; Jianfeng Gao; Armando Solar-Lezama;	iclr	2024-02-26
252	Simplifying Transformer Blocks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we ask to what extent the standard transformer block can be simplified?	Bobby He; Thomas Hofmann;	iclr	2024-02-26
253	Linear Attention Is (maybe) All You Need (to Understand Transformer Optimization) IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, the results obtained in this paper suggest that a simple linearized Transformer model could actually be a valuable, realistic abstraction for understanding Transformer optimization.	KWANGJUN AHN et. al.	iclr	2024-02-26
254	Prometheus: Inducing Evaluation Capability in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose PROMETHEUS a fully open-source LLM that is on par with GPT-4’s evaluation capabilities when the appropriate reference materials (reference answer, score rubric) are accompanied.For this purpose, we construct a new dataset – FEEDBACK COLLECTION – that consists of 1K fine-grained score rubrics, 20K instructions, and 100K natural language feedback generated by GPT-4.	SEUNGONE KIM et. al.	iclr	2024-02-26
255	MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The recently released GPT-4 Code Interpreter has demonstrated remarkable proficiency in solving challenging math problems, primarily attributed to its ability to seamlessly reason with natural language, generate code, execute code, and continue reasoning based on the execution output. In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities.	KE WANG et. al.	iclr	2024-02-26
256	MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We believe that the enhanced multi-modal generation capabilities of GPT-4 stem from the utilization of sophisticated large language models (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen advanced LLM, Vicuna, using one projection layer.	Deyao Zhu; Jun Chen; Xiaoqian Shen; Xiang Li; Mohamed Elhoseiny;	iclr	2024-02-26
257	Test-Time Training on Nearest Neighbors for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We build a large-scale distributed index based on text embeddings of the Pile dataset.	Anonymous Authors;	iclr	2024-02-26
258	Massive Editing for Large Language Model Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameter using the normal equation.	Chenmien Tan; Ge Zhang; Jie Fu;	iclr	2024-02-26
259	An LLM Can Fool Itself: A Prompt-Based Adversarial Attack Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes an efficient tool to audit the LLM’s adversarial robustness via a prompt-based adversarial attack (PromptAttack).	XILIE XU et. al.	iclr	2024-02-26
260	Linearity of Relation Decoding in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations: relations between words and their synonyms, entities and their attributes, etc. We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation.	Anonymous Authors;	iclr	2024-02-26
261	AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures.	Anonymous Authors;	iclr	2024-02-26
262	Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring The Design of Next-generation Neuromorphic Chips Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a general Transformer-based SNN architecture, termed as “Meta-SpikeFormer, whose goals are: (1) Lower-power, supports the spike-driven paradigm that there is only sparse addition in the network; (2) Versatility, handles various vision tasks; (3) High-performance, shows overwhelming performance advantages over CNN-based SNNs; (4) Meta-architecture, provides inspiration for future next-generation Transformer-based neuromorphic chip designs.	Anonymous Authors;	iclr	2024-02-26
263	Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing (e.g., mathematical, causal) reasoning tasks.	Anonymous Authors;	iclr	2024-02-26
264	ITransformer: Inverted Transformers Are Effective for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we reflect on the competent duties of Transformer components and repurpose the Transformer architecture without any adaptation on the basic components.	YONG LIU et. al.	iclr	2024-02-26
265	CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.	XUE WANG et. al.	iclr	2024-02-26
266	Tailoring Self-Rationalizers with Multi-Reward Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we enable small-scale LMs (∼200x smaller than GPT-3) to generate rationales that not only improve downstream task performance, but are also more plausible, consistent, and diverse, assessed both by automatic and human evaluation.	SAHANA RAMNATH et. al.	iclr	2024-02-26
267	From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This groundbreaking study explores the expanse of Large Language Models (LLMs), such as Generative Pre-Trained Transformer (GPT) and Bidirectional Encoder Representations from Transformers (BERT) across varied domains ranging from technology, finance, healthcare to education.	PRAVNEET KAUR et. al.	arxiv-cs.CL	2024-02-25
268	Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Models such as BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation. These pre-trained models serve as robust bases for various tasks including semantic understanding, intelligent writing, and reasoning, paving the way for a more generalized form of artificial intelligence.	Shuning Huo; Yafei Xiang; Hanyi Yu; Mengran Zhu; Yulu Gong;	arxiv-cs.CL	2024-02-25
269	HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer.	Yang Li; Ya-Feng Liu;	arxiv-cs.IT	2024-02-25
270	Increasing SAM Zero-Shot Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without Human Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study develops and evaluates a novel multimodal medical image zero-shot segmentation algorithm named Text-Visual-Prompt SAM (TV-SAM) without any manual annotations.	ZEKUN JIANG et. al.	arxiv-cs.CV	2024-02-24
271	SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection.	Ayan Datta; Aryan Chandramania; Radhika Mamidi;	arxiv-cs.CL	2024-02-24
272	Towards Efficient Active Learning in NLP Via Pretrained Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications.	Artem Vysogorets; Achintya Gopal;	arxiv-cs.LG	2024-02-23
273	ArabianGPT: Native Arabic GPT-based Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of transformer-based models within the ArabianLLM suite designed explicitly for Arabic.	Anis Koubaa; Adel Ammar; Lahouari Ghouti; Omar Najar; Serry Sibaee;	arxiv-cs.CL	2024-02-23
274	Advancing Parameter Efficiency in Fine-tuning Via Representation Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the promising performance of current PEFT methods, they present challenges in hyperparameter selection, such as determining the rank of LoRA or Adapter, or specifying the length of soft prompts. In addressing these challenges, we propose a novel approach to fine-tuning neural models, termed Representation EDiting (RED), which scales and biases the representation produced at each layer.	MULING WU et. al.	arxiv-cs.LG	2024-02-23
275	Multimodal Transformer With A Low-Computational-Cost Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, multimodal Transformers significantly suffer from a quadratic complexity of the multi-head attention with the input sequence length, especially as the number of modalities increases. To address this, we introduce Low-Cost Multimodal Transformer (LoCoMT), a novel multimodal attention mechanism that aims to reduce computational cost during training and inference with minimal performance loss.	Sungjin Park; Edward Choi;	arxiv-cs.LG	2024-02-23
276	A First Look at GPT Apps: Landscape and Vulnerability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, in this work, we conduct a pioneering exploration of GPT stores, aiming to study vulnerabilities and plagiarism within GPT applications.	ZEJUN ZHANG et. al.	arxiv-cs.CR	2024-02-23
277	Self-Supervised Pre-Training for Table Structure Recognition Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we resolve the issue by proposing a self-supervised pre-training (SSP) method for TSR transformers.	ShengYun Peng; Seongmin Lee; Xiaojing Wang; Rajarajeswari Balasubramaniyan; Duen Horng Chau;	arxiv-cs.CV	2024-02-23
278	A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the potential of large language models to generate patient summaries based on doctors’ notes and study the effect of training data on the faithfulness and quality of the generated summaries.	STEFAN HEGSELMANN et. al.	arxiv-cs.CL	2024-02-23
279	RoboScript: Code Generation for Free-Form Manipulation Tasks Across Real and Simulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, previous studies put much effort into general common sense reasoning and task planning capabilities of large-scale language or multi-modal models, relatively little effort on ensuring the deployability of generated code on real robots, and other fundamental components of autonomous robot systems including robot perception, motion planning, and control. To bridge this “ideal-to-real” gap, this paper presents \textbf{RobotScript}, a platform for 1) a deployable robot manipulation pipeline powered by code generation; and 2) a code generation benchmark for robot manipulation tasks in free-form natural language.	JUNTING CHEN et. al.	arxiv-cs.RO	2024-02-22
280	OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code.	TIANYU ZHENG et. al.	arxiv-cs.SE	2024-02-22
281	Prompting A Pretrained Transformer Can Be A Universal Approximator Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of transformer models, our theoretical understanding of these fine-tuning methods remains limited. A …	Aleksandar Petrov; Philip H. S. Torr; Adel Bibi;	arxiv-cs.LG	2024-02-22
282	Whose LLM Is It Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through a comprehensive linguistic analysis, we compare the vocabulary, Part-Of-Speech (POS) distribution, dependency distribution, and sentiment of texts generated by three of the most popular LLMS today (GPT-3.5, GPT-4, and Bard) to diverse inputs.	Ariel Rosenfeld; Teddy Lazebnik;	arxiv-cs.CL	2024-02-22
283	Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper highlights the best practices of the PGI, Persona, Grouping, and Intelligence, method, a strategic framework that achieved a remarkable error rate of only 3,15 percent across 4,000 responses generated by GPT in response to a real business challenge.	Aline Ioste;	arxiv-cs.CL	2024-02-21
284	Do Efficient Transformers Really Save Computation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer.	KAI YANG et. al.	arxiv-cs.LG	2024-02-21
285	TransGOP: Transformer-Based Gaze Object Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper introduces Transformer into the fields of gaze object prediction and proposes an end-to-end Transformer-based gaze object prediction method named TransGOP.	Binglu Wang; Chenxi Guo; Yang Jin; Haisheng Xia; Nian Liu;	arxiv-cs.CV	2024-02-21
286	On The Expressive Power of A Variant of The Looped Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide theoretical evidence of the expressive power of the AlgoFormer in solving some challenging problems, mirroring human-designed algorithms.	YIHANG GAO et. al.	arxiv-cs.LG	2024-02-21
287	Towards Understanding Counseling Conversations: Domain Knowledge and Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a systematic approach to examine the efficacy of domain knowledge and large language models (LLMs) in better representing conversations between a crisis counselor and a help seeker.	Younghun Lee; Dan Goldwasser; Laura Schwab Reese;	arxiv-cs.CL	2024-02-21
288	An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present an optimized, fine-tuned transformer-based DistilBERT model designed for the detection of phishing emails.	Mohammad Amaz Uddin; Iqbal H. Sarker;	arxiv-cs.LG	2024-02-21
289	Knowledge Graph Enhanced Large Language Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of postedit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME.	MENGQI ZHANG et. al.	arxiv-cs.CL	2024-02-21
290	SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: So in this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, that is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy.	PRAKAMYA MISHRA et. al.	arxiv-cs.CL	2024-02-21
291	Beyond Hate Speech: NLP’s Challenges and Opportunities in Uncovering Dehumanizing Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper evaluates the performance of cutting-edge NLP models, including GPT-4, GPT-3.5, and LLAMA-2, in identifying dehumanizing language.	Hezhao Zhang; Lasana Harris; Nafise Sadat Moosavi;	arxiv-cs.CL	2024-02-21
292	Are ELECTRA’s Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We notice a significant drop in performance when using the ELECTRA discriminator’s last layer in comparison to earlier layers. We explore this drop and devise a way to repair ELECTRA’s embeddings, proposing a novel truncated model fine-tuning (TMFT) method.	Ivan Rep; David Dukić; Jan Šnajder;	arxiv-cs.CL	2024-02-20
293	Transformer Tricks: Precomputing The First Layer Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This micro-paper describes a trick to speed up inference of transformers with RoPE (such as LLaMA, Mistral, PaLM, and Gemma). For these models, a large portion of the first …	Nils Graef;	arxiv-cs.LG	2024-02-20
294	How Easy Is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The remarkable advancements in Multimodal Large Language Models (MLLMs) have not rendered them immune to challenges, particularly in the context of handling deceptive information in prompts, thus producing hallucinated responses under such conditions. To quantitatively assess this vulnerability, we present MAD-Bench, a carefully curated benchmark that contains 850 test samples divided into 6 categories, such as non-existent objects, count of objects, spatial relationship, and visual confusion.	Yusu Qian; Haotian Zhang; Yinfei Yang; Zhe Gan;	arxiv-cs.CV	2024-02-20
295	Can Large Language Models Be Used to Provide Psychological Counselling? An Analysis of GPT-4-Generated Responses Using Role-play Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For this study, we collected counseling dialogue data via role-playing scenarios involving expert counselors, and the utterances were annotated with the intentions of the counselors.	Michimasa Inaba; Mariko Ukiyo; Keiko Takamizo;	arxiv-cs.CL	2024-02-20
296	Can GNN Be Good Adapter for LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs.	XUANWEN HUANG et. al.	arxiv-cs.CL	2024-02-20
297	The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Compared to work on monolingual (English) in-context learning, multilingual in-context learning is under-explored, and we lack an in-depth understanding of the role of demonstrations in this context. To address this gap, we conduct a multidimensional analysis of multilingual in-context learning, experimenting with 5 models from different model families, 9 datasets covering classification and generation tasks, and 56 typologically diverse languages.	MIAORAN ZHANG et. al.	arxiv-cs.CL	2024-02-20
298	RhythmFormer: Extracting RPPG Signals Based on Hierarchical Temporal Periodic Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG.	Bochao Zou; Zizheng Guo; Jiansheng Chen; Huimin Ma;	arxiv-cs.CV	2024-02-20
299	Advancing GenAI Assisted Programming–A Comparative Study on Prompt Efficiency and Code Quality Between GPT-4 and GLM-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study aims to explore the best practices for utilizing GenAI as a programming tool, through a comparative analysis between GPT-4 and GLM-4.	Angus Yang; Zehan Li; Jie Li;	arxiv-cs.SE	2024-02-20
300	Meta Ranking: Less Capable Language Models Are Capable for Single Response Judgement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To enable less capable LLMs to effectively judge the reliability of individual responses, we propose a novel method named $\textit{Meta}$ $\textit{Ranking}$ (MR).	ZIJUN LIU et. al.	arxiv-cs.CL	2024-02-19
301	Your Large Language Model Is Secretly A Fairness Proponent and You Should Prompt It Like One Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response to this, we validate that prompting LLMs with specific roles can allow LLMs to express diverse viewpoints. Building on this insight and observation, we develop FairThinking, a pipeline designed to automatically generate roles that enable LLMs to articulate diverse perspectives for fair expressions.	TIANLIN LI et. al.	arxiv-cs.CL	2024-02-19
302	End-to-end Multilingual Fact-checking at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we describe how you can perform end-to-end fact-checking in over 100 languages using Factiverse AI models.	Vinay Setty;	arxiv-cs.CL	2024-02-19
303	Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a circuit discovery framework alternative to activation patching.	ZHENGFU HE et. al.	arxiv-cs.LG	2024-02-19
304	Query-Based Adversarial Prompt Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks.	Jonathan Hayase; Ema Borevkovic; Nicholas Carlini; Florian Tramèr; Milad Nasr;	arxiv-cs.CL	2024-02-19
305	DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the effectiveness of LLMs for solving code-repair task.	BERKAY BERABI et. al.	arxiv-cs.CR	2024-02-19
306	Reflect-RL: Two-Player Online RL Fine-Tuning for LMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, only a few works attempted to directly train the LMs within interactive decision-making environments. We aim to create an effective mechanism to fine-tune LMs with online reinforcement learning (RL) in these environments.	Runlong Zhou; Simon S. Du; Beibin Li;	arxiv-cs.LG	2024-02-19
307	A Critical Evaluation of AI Feedback for Aligning Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that the improvements of the RL step are virtually entirely due to the widespread practice of using a weaker teacher model (e.g. GPT-3.5) for SFT data collection than the critic (e.g., GPT-4) used for AI feedback generation.	ARCHIT SHARMA et. al.	arxiv-cs.LG	2024-02-19
308	Creating A Fine Grained Entity Type Taxonomy Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of GPT-4 and its advanced iteration, GPT-4 Turbo, in autonomously developing a detailed entity type taxonomy.	Michael Gunn; Dohyun Park; Nidhish Kamath;	arxiv-cs.CL	2024-02-19
309	Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conclusion: In this paper, we show that while GPT-4 is superior to open-source models in zero-shot report labeling, the implementation of few-shot prompting can bring open-source models on par with GPT-4.	FELIX J. DORFNER et. al.	arxiv-cs.CL	2024-02-19
310	Enhancing Large Language Models for Text-to-Testcase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design.	Saranya Alagarsamy; Chakkrit Tantithamthavorn; Chetan Arora; Aldeida Aleti;	arxiv-cs.SE	2024-02-19
311	Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers.	Anna Martin-Boyle; Aahan Tyagi; Marti A. Hearst; Dongyeop Kang;	arxiv-cs.CL	2024-02-19
312	Acquiring Clean Language Models from Backdoor Poisoned Datasets By Downscaling Frequency Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the learning mechanisms of backdoor LMs in the frequency space by Fourier analysis.	Zongru Wu; Zhuosheng Zhang; Pengzhou Cheng; Gongshen Liu;	arxiv-cs.CL	2024-02-19
313	Evaluation of ChatGPT’s Smart Contract Auditing Capabilities Based on Chain of Thought Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of enhancing smart contract security audits using the GPT-4 model.	Yuying Du; Xueyan Tang;	arxiv-cs.CR	2024-02-19
314	Decoding News Narratives: A Critical Analysis of Large Language Models in Framing Bias Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work contributes to the expanding research on the applicability of LLMs in social sciences by examining the performance of GPT-3.5 Turbo, GPT-4, and Flan-T5 models in detecting framing bias in news headlines through zero-shot, few-shot, and explainable prompting methods.	Valeria Pastorino; Jasivan A. Sivakumar; Nafise Sadat Moosavi;	arxiv-cs.CL	2024-02-18
315	LongAgent: Scaling Language Models to 128k Context Through Multi-Agent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose \textsc{LongAgent}, a method based on multi-agent collaboration, which scales LLMs (e.g., LLaMA) to a context of 128K and demonstrates potential superiority in long-text processing compared to GPT-4.	JUN ZHAO et. al.	arxiv-cs.CL	2024-02-18
316	Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we propose a two-stage instruction tuning framework, in which VLMs are firstly finetuned on Vision-Flan and further tuned on GPT-4 synthesized data. We find this two-stage tuning framework significantly outperforms the traditional single-stage visual instruction tuning framework and achieves the state-of-the-art performance across a wide range of multi-modal evaluation benchmarks.	ZHIYANG XU et. al.	arxiv-cs.CL	2024-02-18
317	A Curious Case of Searching for The Correlation Between Training Data and Adversarial Robustness of Transformer Textual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we want to prove that there is also a strong correlation between training data and model robustness.	Cuong Dang; Dung D. Le; Thai Le;	arxiv-cs.LG	2024-02-18
318	Exploring ChatGPT for Next-generation Information Retrieval: Opportunities and Challenges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike the traditional supervised learning approach in IR tasks, ChatGPT challenges existing paradigms, bringing forth new challenges and opportunities regarding text quality assurance, model bias, and efficiency. This paper seeks to examine the impact of ChatGPT on IR tasks and offer insights into its potential future developments.	Yizheng Huang; Jimmy Huang;	arxiv-cs.IR	2024-02-17
319	Reasoning Before Comparison: LLM-Enhanced Semantic Similarity Metrics for Domain Specialized Text Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage LLM to enhance the semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU.	SHAOCHEN XU et. al.	arxiv-cs.CL	2024-02-17
320	Can Large Language Models Perform Relation-based Argument Mining? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that general-purpose Large Language Models (LLMs), appropriately primed and prompted, can significantly outperform the best performing (RoBERTa-based) baseline.	Deniz Gorur; Antonio Rago; Francesca Toni;	arxiv-cs.CL	2024-02-17
321	Can Separators Improve Chain-of-Thought Prompting? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by human cognition, we introduce CoT-Sep, a novel method that strategically employs separators at the end of each exemplar in CoT prompting.	Yoonjeong Park; Hyunjin Kim; Chanyeol Choi; Junseong Kim; Jy-yong Sohn;	arxiv-cs.CL	2024-02-16
322	Linear Transformers with Learnable Kernel Functions Are Better In-Context Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Mirroring the Transformer’s in-context adeptness, it became a strong contender in the field. In our work, we present a singular, elegant alteration to the Based kernel that amplifies its In-Context Learning abilities evaluated with the Multi-Query Associative Recall task and overall language modeling process, as demonstrated on the Pile dataset.	YAROSLAV AKSENOV et. al.	arxiv-cs.LG	2024-02-16
323	In Search of Needles in A 11M Haystack: Recurrent Memory Finds What LLMs Miss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To evaluate different approaches, we introduce BABILong, a new benchmark designed to assess model capabilities in extracting and processing distributed facts within extensive texts.	YURI KURATOV et. al.	arxiv-cs.CL	2024-02-16
324	Enhancing ESG Impact Type Identification Through Early Fusion and Multilingual Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches.	Hariram Veeramani; Surendrabikram Thapa; Usman Naseem;	arxiv-cs.CL	2024-02-16
325	Inference to The Best Explanation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes IBE-Eval, a framework inspired by philosophical accounts on Inference to the Best Explanation (IBE) to advance the interpretation and evaluation of LLMs’ explanations.	Dhairya Dalal; Marco Valentino; André Freitas; Paul Buitelaar;	arxiv-cs.CL	2024-02-16
326	Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios. To address this gap, we introduce a new benchmark, Conan, designed for extracting and analysing intricate character relation graphs from detective narratives.	RUNCONG ZHAO et. al.	arxiv-cs.CL	2024-02-16
327	Fine-tuning Large Language Model (LLM) Artificial Intelligence Chatbots in Ophthalmology and LLM-based Evaluation Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, qualitative analysis and the glaucoma sub-analysis revealed clinical inaccuracies in the LLM-generated responses, which were appropriately identified by the GPT-4 evaluation.	TING FANG TAN et. al.	arxiv-cs.AI	2024-02-15
328	Leveraging Large Language Models for Enhanced NLP Task Performance Through Knowledge Distillation and Optimized Training Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our approach presents a scalable methodology that reduces manual annotation costs and increases efficiency, making it especially pertinent in resource-limited and closed-network environments.	Yining Huang; Keke Tang; Meilian Chen;	arxiv-cs.CL	2024-02-14
329	API Pack: A Massive Multilingual Dataset for API Call Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce API Pack, a multilingual dataset featuring over one million instruction-API call pairs aimed at advancing large language models’ API call generation capabilities.	Zhen Guo; Adriana Meza Soria; Wei Sun; Yikang Shen; Rameswar Panda;	arxiv-cs.CL	2024-02-14
330	An Analysis of Language Frequency and Error Correction for Esperanto Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current Grammar Error Correction (GEC) initiatives tend to focus on major languages, with less attention given to low-resource languages like Esperanto. In this article, we begin to bridge this gap by first conducting a comprehensive frequency analysis using the Eo-GP dataset, created explicitly for this purpose.	Junhong Liang;	arxiv-cs.CL	2024-02-14
331	Research and Application of Transformer Based Anomaly Detection Model: A Literature Review Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To inspire research on Transformer-based anomaly detection, this review offers a fresh perspective on the concept of anomaly detection.	Mingrui Ma; Lansheng Han; Chunjie Zhou;	arxiv-cs.LG	2024-02-14
332	Changes By Butterflies: Farsighted Forecasting with Group Reservoir Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in Chaos: (1) the extensive historical sequences and (2) the sensitivity to initial conditions.	Md Kowsher; Jia Xu;	arxiv-cs.LG	2024-02-14
333	Pyramid Attention Network for Medical Image Registration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The advent of deep-learning-based registration networks has addressed the time-consuming challenge in traditional iterative methods.However, the potential of current registration networks for comprehensively capturing spatial relationships has not been fully explored, leading to inadequate performance in large-deformation image registration.The pure convolutional neural networks (CNNs) neglect feature enhancement, while current Transformer-based networks are susceptible to information redundancy.To alleviate these issues, we propose a pyramid attention network (PAN) for deformable medical image registration.Specifically, the proposed PAN incorporates a dual-stream pyramid encoder with channel-wise attention to boost the feature representation.Moreover, a multi-head local attention Transformer is introduced as decoder to analyze motion patterns and generate deformation fields.Extensive experiments on two public brain magnetic resonance imaging (MRI) datasets and one abdominal MRI dataset demonstrate that our method achieves favorable registration performance, while outperforming several CNN-based and Transformer-based registration networks.Our code is publicly available at https://github.com/JuliusWang-7/PAN.	Zhuoyuan Wang; Haiqiao Wang; Yi Wang;	arxiv-cs.CV	2024-02-14
334	GPT-4’s Assessment of Its Performance in A USMLE-based Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates GPT-4’s assessment of its performance in healthcare applications.	UTTAM DHAKAL et. al.	arxiv-cs.AI	2024-02-14
335	L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a language agent with chain-of-3D-thoughts (L3GO), an inference-time approach that can reason about part-based 3D mesh generation of unconventional objects that current data-driven diffusion models struggle with.	YUTARO YAMADA et. al.	arxiv-cs.AI	2024-02-14
336	Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Background: Large language models (LLMs) such as OpenAI’s GPT-4 or Google’s PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for curbside consults.	Gioele Barabucci; Victor Shia; Eugene Chu; Benjamin Harack; Nathan Fu;	arxiv-cs.AI	2024-02-13
337	Eliciting Personality Traits in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we use a novel elicitation approach using prompts derived from common interview questions, as well as prompts designed to elicit particular Big Five personality traits to examine whether the models were susceptible to trait-activation like humans are, to measure their personality based on the language used in their outputs.	Airlie Hilliard; Cristian Munoz; Zekun Wu; Adriano Soares Koshiyama;	arxiv-cs.CL	2024-02-13
338	Grounding LLMs For Robot Task Planning Using Closed-loop State Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an innovative planning algorithm that integrates LLMs into the robotics context, enhancing task-focused execution and success rates.	Vineet Bhat; Ali Umut Kaypak; Prashanth Krishnamurthy; Ramesh Karri; Farshad Khorrami;	arxiv-cs.RO	2024-02-13
339	The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in A Prospective Cardiac Rehabilitation Setting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigated the viability of using Large Language Models (LLMs) for triggering and personalizing content for Just-in-Time Adaptive Interventions (JITAIs) in digital health.	DAVID HAAG et. al.	arxiv-cs.HC	2024-02-13
340	IMUOptimize: A Data-Driven Approach to Optimal IMU Placement for Human Pose Estimation with Transformer Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel approach for predicting human poses using IMU data, diverging from previous studies such as DIP-IMU, IMUPoser, and TransPose, which use up to 6 IMUs in conjunction with bidirectional RNNs.	Varun Ramani; Hossein Khayami; Yang Bai; Nakul Garg; Nirupam Roy;	arxiv-cs.LG	2024-02-13
341	Measuring and Controlling Instruction (In)Stability in Language Model Dialogs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To combat attention decay and instruction drift, we propose a lightweight method called split-softmax, which compares favorably against two strong baselines.	KENNETH LI et. al.	arxiv-cs.CL	2024-02-13
342	Auditing Counterfire: Evaluating Advanced Counterargument Generation with Evidence and Style Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We audited counter-arguments generated by large language models (LLMs), focusing on their ability to generate evidence-based and stylistic counter-arguments to posts from the …	Preetika Verma; Kokil Jaidka; Svetlana Churina;	arxiv-cs.CL	2024-02-13
343	Addressing Cognitive Bias in Medical Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we developed BiasMedQA, a benchmark for evaluating cognitive biases in LLMs applied to medical tasks.	SAMUEL SCHMIDGALL et. al.	arxiv-cs.CL	2024-02-12
344	Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We next present the M2-BERT retrieval encoder, an 80M parameter state-space encoder model built from the Monarch Mixer architecture, capable of scaling to documents up to 32K tokens long. We describe a pretraining data mixture which allows this encoder to process both short and long context sequences, and a finetuning approach that adapts this base model to retrieval with only single-sample batches.	Jon Saad-Falcon; Daniel Y. Fu; Simran Arora; Neel Guha; Christopher Ré;	arxiv-cs.IR	2024-02-12
345	Investigating The Impact of Data Contamination of Large Language Models in Text-to-SQL Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in the Text-to-SQL code-generating tasks.	FEDERICO RANALDI et. al.	arxiv-cs.CL	2024-02-12
346	Lissard: Long and Simple Sequential Reasoning Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Lissard, a benchmark comprising seven tasks whose goal is to assess the ability of models to process and generate wide-range sequence lengths, requiring repetitive procedural execution.	Mirelle Bueno; Roberto Lotufo; Rodrigo Nogueira;	arxiv-cs.CL	2024-02-12
347	Transformer Mechanisms Mimic Frontostriatal Gating Operations When Trained on Human Working Memory Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze the mechanisms that emerge within a vanilla attention-only Transformer trained on a simple sequence modeling task inspired by a task explicitly designed to study working memory gating in computational cognitive neuroscience.	Aaron Traylor; Jack Merullo; Michael J. Frank; Ellie Pavlick;	arxiv-cs.AI	2024-02-12
348	Enhancing Programming Error Messages in Real Time with Generative AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We extend this work by implementing feedback from ChatGPT for all programs submitted to our automated assessment tool, Athene, providing help for compiler, run-time, and logic errors.	BAILEY KIMMEL et. al.	arxiv-cs.HC	2024-02-12
349	Enhancing Multi-Criteria Decision Analysis with AI: Integrating Analytic Hierarchy Process and GPT-4 for Automated Decision Support Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our study presents a new framework that incorporates the Analytic Hierarchy Process (AHP) and Generative Pre-trained Transformer 4 (GPT-4) large language model (LLM), bringing novel approaches to cybersecurity Multiple-criteria Decision Making (MCDA).	Igor Svoboda; Dmytro Lande;	arxiv-cs.AI	2024-02-11
350	Leveraging AI to Advance Science and Computing Education Across Africa: Progress, Challenges, and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this book chapter, we describe our works developing and deploying AI in Education tools in Africa: (1) SuaCode, an AI-powered app that enables Africans to learn to code using their smartphones, (2) AutoGrad, an automated grading, and feedback tool for graphical and interactive coding assignments, (3) a tool for code plagiarism detection that shows visual evidence of plagiarism, (4) Kwame, a bilingual AI teaching assistant for coding courses, (5) Kwame for Science, a web-based AI teaching assistant that provides instant answers to students’ science questions and (6) Brilla AI, an AI contestant for the National Science and Maths Quiz competition.	George Boateng;	arxiv-cs.CY	2024-02-11
351	Gemini Goes to Med School: Exploring The Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our analysis revealed that Gemini is highly susceptible to hallucinations, overconfidence, and knowledge gaps, which indicate risks if deployed uncritically.	Ankit Pal; Malaikannan Sankarasubbu;	arxiv-cs.CL	2024-02-10
352	Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection Via Retrieval-Augmented GPT-4 and LLaMA Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges.	MAREK ŠUPPA et. al.	arxiv-cs.CL	2024-02-09
353	UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction.	Yansong Ning; Hao Liu;	arxiv-cs.AI	2024-02-09
354	Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges,we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ).	YANJUN ZHAO et. al.	arxiv-cs.LG	2024-02-08
355	FACT-GPT: Fact-Checking Augmentation Via Claim Matching with LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking.	Eun Cheol Choi; Emilio Ferrara;	arxiv-cs.CL	2024-02-08
356	Efficient Models for The Detection of Hate, Abuse and Profanity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is unacceptable in civil discourse.The detection of Hate, Abuse and Profanity in text is a vital component of creating civil and unbiased LLMs, which is needed not only for English, but for all languages. In this article, we briefly describe the creation of HAP detectors and various ways of using them to make models civil and acceptable in the output they generate.	Christoph Tillmann; Aashka Trivedi; Bishwaranjan Bhattacharjee;	arxiv-cs.CL	2024-02-08
357	Model Editing with Canonical Examples Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce model editing with canonical examples, a setting in which (1) a single learning example is provided per desired behavior, (2) evaluation is performed exclusively out-of-distribution, and (3) deviation from an initial model is strictly limited.	JOHN HEWITT et. al.	arxiv-cs.CL	2024-02-08
358	Limits of Transformer Language Models on Learning Algorithmic Compositions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce two new tasks demanding the composition of several discrete sub-tasks.	JONATHAN THOMM et. al.	arxiv-cs.LG	2024-02-08
359	Traditional Machine Learning Models and Bidirectional Encoder Representations From Transformer (BERT)-Based Automatic Classification of Tweets About Eating Disorders: Algorithm Development and Validation Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: Our goal was to identify efficient machine learning models for categorizing tweets related to eating disorders.	José Alberto Benítez-Andrades; José-Manuel Alija-Pérez; Maria-Esther Vidal; Rafael Pastor-Vargas; María Teresa García-Ordás;	arxiv-cs.CL	2024-02-08
360	Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduced three visual related tasks, i.e. caption classification, pairwise captioning, and culture tag selection, to systematically delve into fine-grained visual cultural evaluation.	YONG CAO et. al.	arxiv-cs.CL	2024-02-08
361	Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an approach for building a Named Entity Recognition (NER) model built upon a Bidirectional Encoder Representations from Transformers (BERT) architecture, specifically utilizing the SlovakBERT model.	Bibiána Lajčinová; Patrik Valábek; Michal Spišiak;	arxiv-cs.CL	2024-02-08
362	AttnLRP: Attention-Aware Layer-wise Relevance Propagation for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively.	REDUAN ACHTIBAT et. al.	arxiv-cs.CL	2024-02-08
363	Long Is More for Alignment: A Simple But Tough-to-Beat Baseline for Instruction Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the OpenLLM benchmarks that test factual knowledge.	Hao Zhao; Maksym Andriushchenko; Francesco Croce; Nicolas Flammarion;	arxiv-cs.CL	2024-02-07
364	Opening The AI Black Box: Program Synthesis Via Mechanistic Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code.	ERIC J. MICHAUD et. al.	arxiv-cs.LG	2024-02-07
365	In-Context Principle Learning from Mistakes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Nonetheless, all ICL-based approaches only learn from correct input-output pairs. In this paper, we revisit this paradigm, by learning more from the few given input-output examples.	TIANJUN ZHANG et. al.	arxiv-cs.CL	2024-02-07
366	Improving Cross-Domain Low-Resource Text Generation Through LLM Post-Editing: A Programmer-Interpreter Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, the editing strategies in these methods are not optimally designed for text-generation tasks. To address these limitations, we propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.	Zhuang Li; Levon Haroutunian; Raj Tumuluri; Philip Cohen; Gholamreza Haffari;	arxiv-cs.CL	2024-02-07
367	Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To the best of our knowledge, we are the first to introduce SNNs into the realm of rPPG, proposing a hybrid neural network (HNN) model, the Spiking-PhysFormer, aimed at reducing power consumption.	MINGXUAN LIU et. al.	arxiv-cs.CV	2024-02-07
368	Grandmaster-Level Chess Without Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with supervised learning on a dataset of 10 million chess games.	ANIAN RUOSS et. al.	arxiv-cs.LG	2024-02-06
369	The Use of A Large Language Model for Cyberbullying Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several machine learning (ML) algorithms have been proposed for this purpose.	Bayode Ogunleye; Babitha Dharmaraj;	arxiv-cs.CL	2024-02-06
370	The Hedgehog & The Porcupine: Expressive Linear Attentions with Softmax Mimicry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We thus propose Hedgehog, a learnable linear attention that retains the spiky and monotonic properties of softmax attention while maintaining linear complexity.	Michael Zhang; Kush Bhatia; Hermann Kumbong; Christopher Ré;	arxiv-cs.LG	2024-02-06
371	CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation derived from CEHR-BERT, enabling us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format.	CHAO PANG et. al.	arxiv-cs.LG	2024-02-06
372	Behind The Screen: Investigating ChatGPT’s Dark Personality Traits and Conspiracy Beliefs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: ChatGPT is notorious for its intransparent behavior. This paper tries to shed light on this, providing an in-depth analysis of the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4.	Erik Weber; Jérôme Rutinowski; Markus Pauly;	arxiv-cs.CL	2024-02-06
373	Self-Discover: Large Language Models Self-Compose Reasoning Structures Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods.	PEI ZHOU et. al.	arxiv-cs.AI	2024-02-05
374	MobilityGPT: Enhanced Human Mobility Modeling with A GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we reformat human mobility modeling as an autoregressive generation task, leveraging Generative Pre-trained Transformer (GPT). To ensure its controllable generation to alleviate the above challenges, we propose a geospatially-aware generative model, MobilityGPT.	Ammar Haydari; Dongjie Chen; Zhengfeng Lai; Chen-Nee Chuah;	arxiv-cs.LG	2024-02-05
375	Conversation Reconstruction Attack Against GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We will responsibly disclose our findings to the suppliers of related large language models.	Junjie Chu; Zeyang Sha; Michael Backes; Yang Zhang;	arxiv-cs.CR	2024-02-05
376	Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods.	Lingxiao Zhao; Xueying Ding; Leman Akoglu;	arxiv-cs.LG	2024-02-05
377	Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To investigate the effect language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa.	SREEJAN KUMAR et. al.	arxiv-cs.AI	2024-02-05
378	UniMem: Towards A Unified View of Long-Context Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We reformulate 16 existing methods based on UniMem and analyze four representative methods: Transformer-XL, Memorizing Transformer, RMT, and Longformer into equivalent UniMem forms to reveal their design principles and strengths. Based on these analyses, we propose UniMix, an innovative approach that integrates the strengths of these algorithms.	JUNJIE FANG et. al.	arxiv-cs.CL	2024-02-05
379	Harnessing PubMed User Query Logs for Post Hoc Explanations of Recommended Similar Articles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our major contribution is building PubCLogs by repurposing 5.6 million pairs of coclicked articles from PubMed’s user query logs.	Ashley Shin; Qiao Jin; James Anibal; Zhiyong Lu;	arxiv-cs.IR	2024-02-05
380	Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we present an automated transformer decoding algorithm that integrates Monte Carlo tree-search for lookahead, guiding the transformer to produce compilable, functionally correct, and PPA-optimized code.	MATTHEW DELORENZO et. al.	arxiv-cs.LG	2024-02-05
381	Identifying Reasons for Contraceptive Switching from Real-World Data Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that GPT-4 can accurately extract reasons for contraceptive switching, outperforming baseline BERT-based models with microF1 scores of 0.849 and 0.881 for contraceptive start and stop extraction, respectively.	BRENDA Y. MIAO et. al.	arxiv-cs.CL	2024-02-05
382	Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper describes a novel approach for automated ICD coding, combining several ideas from previous related work.	Gonçalo Gomes; Isabel Coutinho; Bruno Martins;	arxiv-cs.CL	2024-02-05
383	SWAG: Storytelling With Action Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Storytelling With Action Guidance (SWAG), a novel approach to storytelling with LLMs.	Zeeshan Patel; Karim El-Refai; Jonathan Pei; Tianle Li;	arxiv-cs.CL	2024-02-05
384	Evaluating Large Language Models in Analysing Classroom Dialogue Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the application of Large Language Models (LLMs), specifically GPT-4, in the analysis of classroom dialogue, a crucial research task for both teaching diagnosis and quality improvement.	Yun Long; Haifeng Luo; Yu Zhang;	arxiv-cs.CL	2024-02-04
385	Key-Graph Transformer for Image Restoration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, the self-attention mechanism in transformers is prone to considering unnecessary global cues from unrelated objects or regions, introducing computational inefficiencies. In response to these challenges, we introduce the Key-Graph Transformer (KGT) in this paper.	BIN REN et. al.	arxiv-cs.CV	2024-02-04
386	DenseFormer: Enhancing Information Flow in Transformers Via Depth Weighted Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The transformer architecture by Vaswani et al. (2017) is now ubiquitous across application domains, from natural language processing to speech processing and image understanding. We propose DenseFormer, a simple modification to the standard architecture that improves the perplexity of the model without increasing its size — adding a few thousand parameters for large-scale models in the 100B parameters range.	Matteo Pagliardini; Amirkeivan Mohtashami; Francois Fleuret; Martin Jaggi;	arxiv-cs.CL	2024-02-04
387	A Graph Is Worth $K$ Words: Euclideanizing Graph Using Pure Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce GraphsGPT, featuring a Graph2Seq encoder that transforms non-Euclidean graphs into learnable graph words in a Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from graph words to ensure information equivalence.	ZHANGYANG GAO et. al.	arxiv-cs.LG	2024-02-04
388	Timer: Transformers for Time Series Analysis at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To change the current practices of training small models on specific datasets from scratch, this paper aims at an early development of large time series models (LTSM).	YONG LIU et. al.	arxiv-cs.LG	2024-02-04
389	GPT-4V As Traffic Assistant: An In-depth Look at Vision Language Model on Complex Traffic Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The advent of large vision-language models (VLMs) such as GPT-4V, has introduced innovative approaches to addressing this issue. In this paper, we explore the ability of GPT-4V with a set of representative traffic incident videos and delve into the model’s capacity of understanding these complex traffic situations.	Xingcheng Zhou; Alois C. Knoll;	arxiv-cs.CV	2024-02-03
390	Spin: An Efficient Secure Computation Framework with GPU Acceleration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention that is the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNNs training and Transformer inference without sacrificing security.	WUXUAN JIANG et. al.	arxiv-cs.CR	2024-02-03
391	EffiBench: Benchmarking The Efficiency of Automatically Generated Code Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents EffiBench, a benchmark with 1,000 efficiency-critical coding problems for assessing the efficiency of code generated by code generation models.	Dong Huang; Jie M. Zhang; Yuhao Qing; Heming Cui;	arxiv-cs.SE	2024-02-03
392	User Intent Recognition and Satisfaction with Large Language Models: A User Study with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on a fine-grained intent taxonomy and intent-based prompt reformulations, we analyze (1) the quality of intent recognition and (2) user satisfaction with answers from intent-based prompt reformulations for two recent ChatGPT models, GPT-3.5 Turbo and GPT-4 Turbo.	Anna Bodonhelyi; Efe Bozkir; Shuo Yang; Enkelejda Kasneci; Gjergji Kasneci;	arxiv-cs.HC	2024-02-03
393	Data Quality Matters: Suicide Intention Detection on Social Media Posts Using A RoBERTa-CNN Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on identifying suicidal intentions in SuicideWatch Reddit posts and present a novel approach to suicide detection using the cutting-edge RoBERTa-CNN model, a variant of RoBERTa (Robustly optimized BERT approach).	Emily Lin; Jian Sun; Hsingyu Chen; Mohammad H. Mahoor;	arxiv-cs.CL	2024-02-03
394	ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its multiple benefits, this framework generally can only capture small-range feature dependency for the convolutional layer with the local receptive field, which makes it difficult to learn global shape information from the limited information provided by scribble annotations. To address this issue, this paper proposes a new CNN-Transformer hybrid solution for scribble-supervised medical image segmentation called ScribFormer.	ZIHAN LI et. al.	arxiv-cs.CV	2024-02-02
395	LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatGPT and other general large language models (LLMs) have achieved remarkable success, but they have also raised concerns about the misuse of AI-generated texts.	RONGSHENG WANG et. al.	arxiv-cs.CL	2024-02-02
396	Can LLMs Perform Structured Graph Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Keeping the trade-off between LLM compatibility and structure complexity in mind, we design various graph reasoning tasks as a proxy to semi-structured tasks in this paper, in order to test the ability to navigate through representations beyond plain text in various LLMs.	Palaash Agrawal; Shavak Vasania; Cheston Tan;	arxiv-cs.CL	2024-02-02
397	MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024.	AMRITA GANGULY et. al.	arxiv-cs.CL	2024-02-02
398	COMET: Generating Commit Messages Using Delta Graph Context Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages.	Abhinav Reddy Mandli; Saurabhsingh Rajput; Tushar Sharma;	arxiv-cs.SE	2024-02-02
399	Faster Inference of Integer SWIN Transformer By Removing The GELU Activation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we improve upon the inference latency of the state-of-the-art methods by removing the floating-point operations, which are associated with the GELU activation in Swin Transformer.	Mohammadreza Tayaranian; Seyyed Hasan Mozafari; James J. Clark; Brett Meyer; Warren Gross;	arxiv-cs.CV	2024-02-02
400	Ultra Fast Transformers on FPGAs for Particle Physics Experiments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we have implemented critical components of a transformer model, such as multi-head attention and softmax layers.	ZHIXING JIANG et. al.	arxiv-cs.LG	2024-02-01
401	Generation, Distillation and Evaluation of Motivational Interviewing-Style Reflections with A Foundational Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a method for distilling the generation of reflections from a Foundational Language Model (GPT-4) into smaller models.	ANDREW BROWN et. al.	arxiv-cs.CL	2024-02-01
402	Understanding The Expressive Power and Mechanisms of Transformer for Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory.	Mingze Wang; Weinan E;	arxiv-cs.LG	2024-02-01
403	Comparative Study of Large Language Model Architectures on Frontier Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance.	Junqi Yin; Avishek Bose; Guojing Cong; Isaac Lyngaas; Quentin Anthony;	arxiv-cs.DC	2024-02-01
404	Benefits of Transformer: In-Context Learning in Linear Regression Tasks with Unstructured Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In practice, it is observed that transformer-based models can learn concepts in context in the inference stage.	Yue Xing; Xiaofeng Lin; Namjoon Suh; Qifan Song; Guang Cheng;	arxiv-cs.LG	2024-02-01
405	Self-Supervised Contrastive Pre-Training for Multivariate Point Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new paradigm for self-supervised learning for multivariate point processes using a transformer encoder.	Xiao Shou; Dharmashankar Subramanian; Debarun Bhattacharjya; Tian Gao; Kristin P. Bennet;	arxiv-cs.LG	2024-02-01
406	COA-GPT: Generative Pre-trained Transformers for Accelerated Course of Action Development in Military Operations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process. Addressing this challenge, this study introduces COA-GPT, a novel algorithm employing Large Language Models (LLMs) for rapid and efficient generation of valid COAs.	Vinicius G. Goecks; Nicholas Waytowich;	arxiv-cs.AI	2024-02-01
407	Supporting Anticipatory Governance Using LLMs: Evaluating and Aligning Large Language Models with The News Media to Anticipate The Negative Impacts of AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we leverage news media, a diverse data source that is rich with normative assessments of emerging technologies, to formulate a taxonomy of impacts to act as a baseline for comparing against.	Mowafak Allaham; Nicholas Diakopoulos;	arxiv-cs.CL	2024-01-31
408	Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Gyan AI Paramanu (atom), a family of novel language models for Indian languages.	Mitodru Niyogi; Arnab Bhattacharya;	arxiv-cs.CL	2024-01-31
409	Global-Liar: Factuality of LLMs Over Time and Geographic Regions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ‘Global-Liar,’ a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases.	Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy;	arxiv-cs.CL	2024-01-31
410	Towards Scalable Robotic Intervention of Children with Autism Spectrum Disorder Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a social robot capable of verbally interacting with children with Autism Spectrum Disorder (ASD).	Ruchik Mishra; Karla Conn Welch;	arxiv-cs.RO	2024-01-31
411	Towards AI-Assisted Synthesis of Verified Dafny Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we demonstrate how to improve two pretrained models’ proficiency in the Dafny verified programming language.	Md Rakib Hossain Misu; Cristina V. Lopes; Iris Ma; James Noble;	arxiv-cs.SE	2024-01-31
412	Evaluating The Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Identifying defeaters arguments that refute these ACs is essential for improving the robustness and confidence in ACs. To automate this task, we introduce a novel method that leverages the capabilities of GPT-4 Turbo, an advanced Large Language Model (LLM) developed by OpenAI, to identify defeaters within ACs formalized using the Eliminative Argumentation (EA) notation.	KIMYA KHAKZAD SHAHANDASHTI et. al.	arxiv-cs.SE	2024-01-31
413	ConSmax: Hardware-Friendly Alternative Softmax with Learnable Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Apart from the non-linearity, the low arithmetic intensity greatly reduces the processing parallelism, which becomes the bottleneck especially when dealing with a longer context. To address this challenge, we propose Constant Softmax (ConSmax), a software-hardware co-design as an efficient Softmax alternative.	SHIWEI LIU et. al.	arxiv-cs.AR	2024-01-31
414	Mitigating The Problem of Strong Priors in LMs with Context Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply it to eleven models including GPT-2, GPT-3, Llama 2, and Mistral on four tasks, and find improvements in 41/44.	Raymond Douglas; Andis Draguns; Tomáš Gavenčiak;	arxiv-cs.CL	2024-01-31
415	Towards Efficient and Reliable LLM Serving: A Real-World Workload Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we find that the absence of reliable workload data for evaluating LLM serving systems impacts the quality of service (QoS) and reliability in industrial deployments.	YUXIN WANG et. al.	arxiv-cs.DC	2024-01-31
416	Scavenging Hyena: Distilling Transformers Into Long Convolution Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a pioneering approach to address the efficiency concerns associated with LLM pre-training, proposing the use of knowledge distillation for cross-architecture transfer.	Tokiniaina Raharison Ralambomihanta; Shahrad Mohammadzadeh; Mohammad Sami Nur Islam; Wassim Jabbour; Laurence Liang;	arxiv-cs.CL	2024-01-30
417	Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network.	Khadejaa Alshehri; Areej Alhothali; Nahed Alowidi;	arxiv-cs.CL	2024-01-30
418	Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish Language.	Savas Yildirim;	arxiv-cs.CL	2024-01-30
419	SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we proposed a subarray-level processing-in-memory architecture named SAL-PIM, HBM-based PIM architecture for the end-to-end acceleration of transformer-based text generation.	Wontak Han; Hyunjun Cho; Donghyuk Kim; Joo-Young Kim;	arxiv-cs.AR	2024-01-30
420	CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Contextual and Attentional feature Fusions enhanced Convolutional Neural Network (CNN) and Transformer hybrid network (CAFCT) model for liver tumor segmentation.	Ming Kang; Chee-Ming Ting; Fung Fung Ting; Raphaël Phan;	arxiv-cs.CV	2024-01-30
421	Conditional and Modal Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we probe the extent to which a dozen LLMs are able to distinguish logically correct inferences from logically fallacious ones.	Wesley H. Holliday; Matthew Mandelkern;	arxiv-cs.CL	2024-01-30
422	Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the art LLM created by OpenAI.	KENTA IZUMI et. al.	arxiv-cs.CL	2024-01-29
423	Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a framework that allows for maintaining generalizability, and enhances the performance on the downstream task by utilizing task-specific context attribution.	Stepan Tytarenko; Mohammad Ruhul Amin;	arxiv-cs.CL	2024-01-29
424	PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the lack of specialized, high-quality benchmark impeded their development and precise evaluation. To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs).	YUXUAN SUN et. al.	arxiv-cs.CV	2024-01-29
425	You Tell Me: A Dataset of GPT-4-Based Behaviour Change Support Conversations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Conversational agents are increasingly used to address emotional needs on top of information needs. One use case of increasing interest are counselling-style mental health and …	Selina Meyer; David Elsweiler;	arxiv-cs.HC	2024-01-29
426	TQCompressor: Improving Tensor Decomposition Methods in Neural Networks Via Permutations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce TQCompressor, a novel method for neural network model compression with improved tensor decompositions.	V. ABRONIN et. al.	arxiv-cs.LG	2024-01-29
427	Leveraging Professional Radiologists’ Expertise to Enhance LLMs’ Evaluation for Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Experimental results show that our Detailed GPT-4 (5-shot) model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our Regressed GPT-4 model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin.	QINGQING ZHU et. al.	arxiv-cs.CL	2024-01-29
428	UnMASKed: Quantifying Gender Biases in Masked Language Models Through Linguistically Informed Job Market Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluated six prominent models: BERT, RoBERTa, DistilBERT, BERT-multilingual, XLM-RoBERTa, and DistilBERT-multilingual.	Iñigo Parra;	arxiv-cs.CL	2024-01-28
429	A New Method for Vehicle Logo Recognition Based on Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement real-time VLR using Swin Transformer and fine-tune it for optimal performance.	Yang Li; Doudou Zhang; Jianli Xiao;	arxiv-cs.CV	2024-01-27
430	Semantics of Multiword Expressions in Transformer-Based Models: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Addressing this gap, we provide the first in-depth survey of MWE processing with transformer models. We overall find that they capture MWE semantics inconsistently, as shown by reliance on surface patterns and memorized information.	Filip Miletić; Sabine Schulte im Walde;	arxiv-cs.CL	2024-01-27
431	Large Language Model for Vulnerability Detection: Emerging Results and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the effectiveness of LLMs in detecting software vulnerabilities is largely unexplored. This paper aims to bridge this gap by exploring how LLMs perform with various prompts, particularly focusing on two state-of-the-art LLMs: GPT-3.5 and GPT-4.	Xin Zhou; Ting Zhang; David Lo;	arxiv-cs.SE	2024-01-27
432	Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we find that coding fidelity improves considerably when the LLM is prompted to give rationale justifying its coding decisions (chain-of-thought reasoning). We present these and other findings along with a set of best practices for adapting traditional codebooks for LLMs.	Zackary Okun Dunivin;	arxiv-cs.CL	2024-01-26
433	Dynamic Transformer Architecture for Continual Learning of Multimodal Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a transformer-based CL framework focusing on learning tasks that involve both vision and language, known as Vision-and-Language (VaL) tasks.	Yuliang Cai; Mohammad Rostami;	arxiv-cs.CV	2024-01-26
434	From GPT-4 to Gemini and Beyond: Assessing The Landscape of MLLMs on Generalizability, Trustworthiness and Causality Through Four Modalities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents.	CHAOCHAO LU et. al.	arxiv-cs.CV	2024-01-26
435	Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution.	CHENG QIAN et. al.	arxiv-cs.CL	2024-01-25
436	Relative Value Biases in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Studies of reinforcement learning in humans and animals have demonstrated a preference for options that yielded relatively better outcomes in the past, even when those options are associated with lower absolute reward.	William M. Hayes; Nicolas Yax; Stefano Palminteri;	arxiv-cs.CL	2024-01-25
437	MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We train a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings.	PATRICK LEE et. al.	arxiv-cs.CL	2024-01-25
438	Evaluating GPT-3.5’s Awareness and Summarization Abilities for European Constitutional Texts with Shared Topics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, using the renowned GPT-3.5, we leverage generative large language models to understand constitutional passages that transcend national boundaries.	Candida M. Greco; A. Tagarelli;	arxiv-cs.CL	2024-01-25
439	(Chat)GPT V BERT: Dawn of Justice for Semantic Change Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we specifically focus on the temporal problem of semantic change, and evaluate their ability to solve two diachronic extensions of the Word-in-Context (WiC) task: TempoWiC and HistoWiC.	Francesco Periti; Haim Dubossarsky; Nina Tahmasebi;	arxiv-cs.CL	2024-01-25
440	Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It employs a proposed labeling approach to categorize the text and subsequently fine-tunes the Longformer model.	Richard Kimera; Daniela N. Rim; Joseph Kirabira; Ubong Godwin Udomah; Heeyoul Choi;	arxiv-cs.CL	2024-01-25
441	Dynamic Long-Term Time-Series Forecasting Via Meta Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes Meta-Transformer Networks (MANTRA) to deal with the dynamic long-term time-series forecasting tasks.	MUHAMMAD ANWAR MA’SUM et. al.	arxiv-cs.LG	2024-01-25
442	Automated Root Causing of Cloud Incidents Using In-Context Learning with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the high cost of fine-tuning LLM, we propose an in-context learning approach for automated root causing, which eliminates the need for fine-tuning.	XUCHAO ZHANG et. al.	arxiv-cs.CL	2024-01-24
443	A Comparative Study of Zero-shot Inference with Large Language Models and Supervised Modeling in Breast Cancer Pathology Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations.	MADHUMITA SUSHIL et. al.	arxiv-cs.CL	2024-01-24
444	Unmasking and Quantifying Racial Bias of Large Language Models in Medical Report Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite few attempts in the past, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios with much higher survival rates.	Yifan Yang; Xiaoyu Liu; Qiao Jin; Furong Huang; Zhiyong Lu;	arxiv-cs.CL	2024-01-24
445	Discovering Mathematical Formulas from Data Via GPT-guided Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT).	YANJIE LI et. al.	arxiv-cs.LG	2024-01-24
446	Can GPT-3.5 Generate and Code Discharge Summaries? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We report micro- and macro-F1 scores on the full codeset, generation codes, and their families.	MATÚŠ FALIS et. al.	arxiv-cs.CL	2024-01-24
447	ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces ConTextual, a novel benchmark comprising instructions designed explicitly to evaluate LMMs’ ability to perform context-sensitive text-rich visual reasoning.	Rohan Wadhawan; Hritik Bansal; Kai-Wei Chang; Nanyun Peng;	arxiv-cs.CV	2024-01-24
448	Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective: We sought to evaluate the performance of LLMs in the complex clinical context of adult critical care medicine using systematic and comprehensible analytic methods, including clinician annotation and adjudication.	DARREN LIU et. al.	arxiv-cs.CL	2024-01-24
449	Convolutional Initialization for Data-Efficient Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we investigate whether this inductive bias can be reinterpreted as an initialization bias within a vision transformer network.	Jianqiao Zheng; Xueqian Li; Simon Lucey;	arxiv-cs.CV	2024-01-23
450	TAT-LLM: A Specialized Language Model for Discrete Reasoning Over Tabular and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e.g. SEC filings), where discrete reasoning capabilities are often required.	FENGBIN ZHU et. al.	arxiv-cs.CL	2024-01-23
451	Contrastive Learning in Distilled Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models yet to perform well on Semantic Textual Similarity, and may be too large to be deployed as lightweight edge applications. We seek to apply a suitable contrastive learning method based on the SimCSE paper, to a model architecture adapted from a knowledge distillation based model, DistilBERT, to address these two issues.	Valerie Lim; Kai Wen Ng; Kenneth Lim;	arxiv-cs.CL	2024-01-22
452	Enhancing In-context Learning Via Linear Probe Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input.	MOMIN ABBAS et. al.	arxiv-cs.CL	2024-01-22
453	Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on Subtask A & B. Each subtask is supported by three datasets for training, development, and testing.	FENG XIONG et. al.	arxiv-cs.CL	2024-01-22
454	Freely Long-Thinking Transformer (FraiLT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Freely Long-Thinking Transformer (FraiLT) is an improved transformer model designed to enhance processing capabilities without scaling up size. It utilizes a recursive approach, iterating over a subset of layers multiple times, and introduces iteration encodings to maintain awareness across these cycles.	Akbay Tabak;	arxiv-cs.LG	2024-01-21
455	Revolutionizing Finance with LLMs: An Overview of Applications and Insights Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we provide a comprehensive overview of the emerging integration of LLMs into various financial tasks.	HUAQIN ZHAO et. al.	arxiv-cs.CL	2024-01-21
456	Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the Hourglass Diffusion Transformer (HDiT), an image generative model that exhibits linear scaling with pixel count, supporting training at high-resolution (e.g. $1024 \times 1024$) directly in pixel-space.	KATHERINE CROWSON et. al.	arxiv-cs.CV	2024-01-21
457	CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Free-text radiology reports present a rich data source for various medical tasks, but effectively labeling these texts remains challenging.	JAWOOK GU et. al.	arxiv-cs.CL	2024-01-21
458	P2DT: Mitigating Forgetting in Task-incremental Learning with Progressive Prompt Decision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we propose a novel solution – the Progressive Prompt Decision Transformer (P2DT).	Zhiyuan Wang; Xiaoyang Qu; Jing Xiao; Bokui Chen; Jianzong Wang;	arxiv-cs.LG	2024-01-21
459	Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a fault protection mechanism that incurs zero space cost.	BINGBING LI et. al.	arxiv-cs.LG	2024-01-21
460	Drop Your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The usage of additional Transformer-based decoders also incurs significant computational costs. In this study, we aim to shed light on this issue by revealing that masked auto-encoder (MAE) pre-training with enhanced decoding significantly improves the term coverage of input tokens in dense representations, compared to vanilla BERT checkpoints.	Guangyuan Ma; Xing Wu; Zijia Lin; Songlin Hu;	arxiv-cs.IR	2024-01-20
461	Enhancing Large Language Models for Clinical Decision Support By Incorporating Clinical Practice Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Methods We develop three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP).	DAVID ONIANI et. al.	arxiv-cs.CL	2024-01-20
462	Visualization Generation with Large Language Models: An Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the capability of a large language model to generate visualization specifications on the task of natural language to visualization (NL2VIS).	GUOZHENG LI et. al.	arxiv-cs.HC	2024-01-20
463	Unfair TOS: An Automated Approach Using Customized BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research paper, we present SOTA(State of The Art) results on unfair clause detection from ToS documents based on unprecedented custom BERT Fine-tuning in conjunction with SVC(Support Vector Classifier).	Bathini Sai Akash; Akshara Kupireddy; Lalita Bhanu Murthy;	arxiv-cs.CL	2024-01-20
464	Cross-lingual Editing in Multilingual Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For more comprehensive information, the dataset used in this research and the associated code are publicly available at the following URL\url{https://github.com/lingo-iitgn/XME}.	Himanshu Beniwal; Kowsik Nandagopan D; Mayank Singh;	arxiv-cs.CL	2024-01-19
465	Mining Experimental Data from Materials Science Literature with Large Language Models: An Evaluation Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel methodology for the comparative analysis of intricate material expressions, emphasising the standardisation of chemical formulas to tackle the complexities inherent in materials science information assessment.	Luca Foppiano; Guillaume Lambard; Toshiyuki Amagasa; Masashi Ishii;	arxiv-cs.CL	2024-01-19
466	Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow.	Mia Mohammad Imran;	arxiv-cs.SE	2024-01-19
467	Speech Swin-Transformer: Exploring A Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In speech signals, emotional information is distributed across different scales of speech features, e.\,g., word, phrase, and utterance. Drawing above inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate multi-scale emotion features for speech emotion recognition (SER), called Speech Swin-Transformer.	YONG WANG et. al.	arxiv-cs.CL	2024-01-19
468	Custom Developer GPT for Ethical AI Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main goal of this project is to create a new software artefact: a custom Generative Pre-trained Transformer (GPT) for developers to discuss and solve ethical issues through AI engineering.	Lauren Olson;	arxiv-cs.SE	2024-01-19
469	ChatQA: Building GPT-4 Level Conversational QA Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce ChatQA, a family of conversational question answering (QA) models that obtain GPT-4 level accuracies.	ZIHAN LIU et. al.	arxiv-cs.CL	2024-01-18
470	Gender Bias in Machine Translation and The Era of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies.	Eva Vanmassenhove;	arxiv-cs.CL	2024-01-18
471	Improving The Accuracy of Analog-Based In-Memory Computing Accelerators Post-Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose two Post-Training (PT) optimization methods to improve accuracy after training is performed.	COREY LAMMIE et. al.	arxiv-cs.ET	2024-01-18
472	Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article provides a step-by-step generalizable guideline to identify and classify different forms of racist discourse in large corpora. In our approach, we start by conceptualizing racism and its different manifestations.	Diana Davila Gordillo; Joan Timoneda; Sebastian Vallejo Vera;	arxiv-cs.CL	2024-01-17
473	Efficient Slot Labelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a lightweight method which performs on par or better than the state-of-the-art PLM-based methods, while having almost 10x less trainable parameters.	Vladimir Vlasov;	arxiv-cs.CL	2024-01-17
474	GPT in Sheep’s Clothing: The Risk of Customized GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to raise awareness of the fact that GPTs can be used maliciously, posing privacy and security risks to their users.	SAGIV ANTEBI et. al.	arxiv-cs.CR	2024-01-17
475	Improving Classification Performance With Human Feedback: Label A Few, We Label The Rest Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By employing Large Language Models (LLMs) such as GPT-3.5, BERT, and SetFit, we aim to analyze the efficacy of using a limited number of labeled examples to substantially improve model accuracy.	Natan Vidra; Thomas Clifford; Katherine Jijo; Eden Chung; Liang Zhang;	arxiv-cs.LG	2024-01-17
476	Land Cover Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We compare convolutional neural networks (CNN) against transformer-based methods, showcasing their applications and advantages in LC studies.	Antonio Rangel; Juan Terven; Diana M. Cordova-Esparza; E. A. Chavez-Urbiola;	arxiv-cs.CV	2024-01-17
477	Human Vs. LMMs: Exploring The Discrepancy in Emoji Interpretation and Usage in Digital Communication Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications.	Hanjia Lyu; Weihong Qi; Zhongyu Wei; Jiebo Luo;	arxiv-cs.CV	2024-01-16
478	Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This communication bottleneck exacerbates the already complex computational landscape, hindering the efficient utilization of high-performance computing resources. In this paper, we propose a lightweight optimization technique called ExFlow, to largely accelerate the inference of these MoE models.	Jinghan Yao; Quentin Anthony; Aamir Shafi; Hari Subramoni; Dhabaleswar K.;	arxiv-cs.LG	2024-01-16
479	Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (27.3%), most prominent in image comprehension (21.6%). Regardless of GPT-4V’s high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such models into clinical workflows.	QIAO JIN et. al.	arxiv-cs.CV	2024-01-16
480	MADA: Meta-Adaptive Optimizers Through Hyper-gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Meta-Adaptive Optimizers (MADA), a unified optimizer framework that can generalize several known optimizers and dynamically learn the most suitable one during training.	KAAN OZKARA et. al.	arxiv-cs.LG	2024-01-16
481	RAG Vs Fine-tuning: Pipelines, Tradeoffs, and A Case Study on Agriculture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a pipeline for fine-tuning and RAG, and present the tradeoffs of both for multiple popular LLMs, including Llama2-13B, GPT-3.5, and GPT-4.	ANGELS BALAGUER et. al.	arxiv-cs.CL	2024-01-16
482	Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embeddings spaces of pre-trained language models (LMs).	Saptarshi Sengupta; Connor Heaton; Prasenjit Mitra; Soumalya Sarkar;	arxiv-cs.CL	2024-01-15
483	Cascaded Cross-Modal Transformer for Audio-Textual Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To attain superior classification performance, we propose to harness the inherent value of multimodal representations by transcribing speech using automatic speech recognition (ASR) models and translating the transcripts into different languages via pretrained translation models.	Nicolae-Catalin Ristea; Andrei Anghel; Radu Tudor Ionescu;	arxiv-cs.CL	2024-01-15
484	Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we tackle the challenge of classifying the object category in point clouds, which previous works like PointCLIP struggle to address due to the inherent limitations of the CLIP architecture.	Qi Sun; Xiao Cui; Wengang Zhou; Houqiang Li;	arxiv-cs.CV	2024-01-15
485	Transformer-based Approach for Ethereum Price Prediction Using Crosscurrency Correlation and Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The model employs a transformer architecture for several setups from single-feature scenarios to complex configurations incorporating volume, sentiment, and correlated cryptocurrency prices.	Shubham Singh; Mayur Bhat;	arxiv-cs.LG	2024-01-15
486	Selene: Pioneering Automated Proof in Software Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Selene in this paper, which is the first project-level automated proof benchmark constructed based on the real-world industrial-level project of the seL4 operating system microkernel.	Lichen Zhang; Shuai Lu; Nan Duan;	arxiv-cs.SE	2024-01-15
487	SemEval-2017 Task 4: Sentiment Analysis in Twitter Using BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper uses the BERT model, which is a transformer-based architecture, to solve task 4A, English Language, Sentiment Analysis in Twitter of SemEval2017.	Rupak Kumar Das; Dr. Ted Pedersen;	arxiv-cs.CL	2024-01-15
488	Enhancing Robustness of LLM-Synthetic Text Detectors for Academic Writing: A Comprehensive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a comprehensive analysis of the impact of prompts on the text generated by LLMs and highlight the potential lack of robustness in one of the current state-of-the-art GPT detectors.	Zhicheng Dou; Yuchen Guo; Ching-Chun Chang; Huy H. Nguyen; Isao Echizen;	arxiv-cs.CL	2024-01-15
489	The Chronicles of RAG: The Retriever, The Chunk and The Generator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given all these challenges, every day a new technique to improve RAG appears, making it unfeasible to experiment with all combinations for your problem. In this context, this paper presents good practices to implement, optimize, and evaluate RAG for the Brazilian Portuguese language, focusing on the establishment of a simple pipeline for inference and experiments.	PAULO FINARDI et. al.	arxiv-cs.LG	2024-01-15
490	Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work provides full architecture details for each model and a methodical way to assess their performance in Bengali depressive text categorization using zero-shot and few-shot learning techniques.	AHMADUL KARIM CHOWDHURY et. al.	arxiv-cs.CL	2024-01-14
491	Killer Apps: Low-Speed, Large-Scale AI Weapons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the concept of AI weapons, their deployment, detection, and potential countermeasures.	Philip Feldman; Aaron Dant; James R. Foulds;	arxiv-cs.CY	2024-01-14
492	MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel map-guided GPT-based agent, dubbed MapGPT, which introduces an online linguistic-formed map to encourage the global exploration.	JIAQI CHEN et. al.	arxiv-cs.AI	2024-01-14
493	Leveraging The Power of Transformers for Guilt Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the applicability of three transformer-based language models for detecting guilt in text and compare their performance for general emotion detection and guilt detection.	Abdul Gafar Manuel Meque; Jason Angel; Grigori Sidorov; Alexander Gelbukh;	arxiv-cs.CL	2024-01-14
494	Active Learning for NLP with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work investigates the accuracy and cost of using LLMs (GPT-3.5 and GPT-4) to label samples on 3 different datasets.	Xuesong Wang;	arxiv-cs.CL	2024-01-14
495	A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a multi-stage prompting approach (MSP) for the generation of multiple choice questions (MCQs), harnessing the capabilities of GPT models such as text-davinci-003 and GPT-4, renowned for their excellence across various NLP tasks.	Subhankar Maity; Aniket Deroy; Sudeshna Sarkar;	arxiv-cs.CL	2024-01-13
496	How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety By Humanizing LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective to jailbreak LLMs as human-like communicators, to explore this overlooked intersection between everyday language interaction and AI safety.	YI ZENG et. al.	arxiv-cs.CL	2024-01-12
497	An Investigation of Structures Responsible for Gender Bias in BERT and DistilBERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A crucial issue is the fairness of the predictions made by both PLMs and their distilled counterparts. In this paper, we propose an empirical exploration of this problem by formalizing two questions: (1) Can we identify the neural mechanism(s) responsible for gender bias in BERT (and by extension DistilBERT)?	Thibaud Leteno; Antoine Gourru; Charlotte Laclau; Christophe Gravier;	arxiv-cs.CL	2024-01-12
498	Mission: Impossible Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we develop a set of synthetic impossible languages of differing complexity, each designed by systematically altering English data with unnatural word orders and grammar rules.	Julie Kallini; Isabel Papadimitriou; Richard Futrell; Kyle Mahowald; Christopher Potts;	arxiv-cs.CL	2024-01-12
499	From Automation to Augmentation: Large Language Models Elevating Essay Scoring Landscape Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study investigates the effectiveness of Large Language Models (LLMs), specifically GPT-4 and fine-tuned GPT-3.5, as tools for AES.	CHANGRONG XIAO et. al.	arxiv-cs.CL	2024-01-12
500	Swin Transformer-Based CSI Feedback for Massive MIMO Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Swin Transformer-based autoencoder network called SwinCFNet for the CSI feedback task.	JIAMING CHENG et. al.	arxiv-cs.IT	2024-01-12
501	Intention Analysis Makes LLMs A Good Jailbreak Defender Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present a simple yet highly effective defense strategy, i.e., Intention Analysis ($\mathbb{IA}$).	Yuqi Zhang; Liang Ding; Lefei Zhang; Dacheng Tao;	arxiv-cs.CL	2024-01-12
502	Transformer for Object Re-Identification: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering the trending unsupervised Re-ID, we propose a new Transformer baseline, UntransReID, achieving state-of-the-art performance on both single-/cross modal tasks.	MANG YE et. al.	arxiv-cs.CV	2024-01-12
503	Mapping Transformer Leveraged Embeddings for Cross-Lingual Document Representation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This research focuses on representing documents across languages by using Transformer Leveraged Document Representations (TLDRs) that are mapped to a cross-lingual domain.	Tsegaye Misikir Tashu; Eduard-Raul Kontos; Matthia Sabatelli; Matias Valdenegro-Toro;	arxiv-cs.CL	2024-01-12
504	DevEval: Evaluating Code Generation in Practical Software Projects Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new benchmark named DevEval, aligned with Developers’ experiences in practical projects.	JIA LI et. al.	arxiv-cs.SE	2024-01-12
505	Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we show that Zephyr-7b presents a consistently viable alternative, overcoming key limitations of commonly used approaches like Llama-2 and GPT-3.5.	Tyler Vergho; Jean-Francois Godbout; Reihaneh Rabbany; Kellin Pelrine;	arxiv-cs.CL	2024-01-12
506	YOLO-Former: YOLO Shakes Hand With ViT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The results demonstrate the effectiveness of the proposed approach, with a mean average precision (mAP) of 85.76\% on the Pascal VOC dataset, while maintaining high prediction speed with a frame rate of 10.85 frames per second. The contribution of this work lies in the demonstration of how the innovative combination of these two state-of-the-art techniques can lead to further improvements in the field of object detection.	Javad Khoramdel; Ahmad Moori; Yasamin Borhani; Armin Ghanbarzadeh; Esmaeil Najafi;	arxiv-cs.CV	2024-01-11
507	Prompt-based Mental Health Screening from Social Media Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text.	Wesley Ramos dos Santos; Ivandre Paraboni;	arxiv-cs.CL	2024-01-11
508	The Benefits of A Concise Chain of Thought on Problem-Solving in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce Concise Chain-of-Thought (CCoT) prompting.	Matthew Renze; Erhan Guven;	arxiv-cs.CL	2024-01-10
509	Monte Carlo Tree Search for Recipe Generation Using GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose RecipeMC, a text generation method using GPT-2 that relies on Monte Carlo Tree Search (MCTS).	Karan Taneja; Richard Segal; Richard Goodwin;	arxiv-cs.CL	2024-01-10
510	EmMixformer: Mix Transformer for Eye Movement Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although deep neural networks, such as convolutional neural network (CNN), have recently achieved promising performance, current solutions fail to capture local and global temporal dependencies within eye movement data. To overcome this problem, we propose in this paper a mixed transformer termed EmMixformer to extract time and frequency domain information for eye movement recognition.	HUAFENG QIN et. al.	arxiv-cs.CV	2024-01-10
511	OTAS: An Elastic Transformer Serving System Via Token Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce OTAS, the first elastic serving system specially tailored for transformer models by exploring lightweight token management.	JINYU CHEN et. al.	arxiv-cs.DC	2024-01-10
512	DebugBench: Evaluating Debugging Capability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Previous evaluations of LLMs’ debugging ability are significantly limited by the risk of data leakage, the scale of the dataset, and the variety of tested bugs. To overcome these deficiencies, we introduce `DebugBench’, an LLM debugging benchmark consisting of 4,253 instances.	RUNCHU TIAN et. al.	arxiv-cs.SE	2024-01-09
513	DepressionEmo: A Novel Dataset for Multilabel Classification of Depression Emotions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel dataset named DepressionEmo designed to detect 8 emotions associated with depression by 6037 examples of long Reddit user posts.	Abu Bakar Siddiqur Rahman; Hoang-Thang Ta; Lotfollah Najjar; Azad Azadmanesh; Ali Saffet Gönül;	arxiv-cs.CL	2024-01-09
514	An Assessment on Comprehending Mental Health Through Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, across various applications, an outstanding question involves the capacity of large language models to comprehend expressions of human mental health conditions in natural language. This study presents an initial evaluation of large language models in addressing this gap.	Mihael Arcan; David-Paul Niland; Fionn Delahunty;	arxiv-cs.CL	2024-01-09
515	MARG: Multi-Agent Review Generation for Scientific Papers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion.	Mike D’Arcy; Tom Hope; Larry Birnbaum; Doug Downey;	arxiv-cs.CL	2024-01-08
516	Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Audio and video are two most common modalities in the mainstream media platforms, e.g., YouTube. To learn from multimodal videos effectively, in this work, we propose a novel …	Wentao Zhu;	ArXiv	2024-01-08
517	Distortions in Judged Spatial Relations in Large Language Models: The Dawn of Natural Language Geographic Data? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a benchmark for assessing the capability of Large Language Models (LLMs) to discern intercardinal directions between geographic locations and apply it to three prominent LLMs: GPT-3.5, GPT-4, and Llama-2.	Nir Fulman; Abdulkadir Memduhoğlu; Alexander Zipf;	arxiv-cs.CL	2024-01-08
518	MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based LLMs, including recent state-of-the-art open-source models. We propose that to unlock the potential of SSMs for scaling, they should be combined with MoE.	Maciej Pióro; Kamil Ciebiera; Krystian Król; Jan Ludziejewski; Sebastian Jaszczur;	arxiv-cs.LG	2024-01-08
519	Mixtral of Experts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model.	ALBERT Q. JIANG et. al.	arxiv-cs.LG	2024-01-08
520	GPT-4V(ision) Is A Human-Aligned Evaluator for Text-to-3D Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents an automatic, versatile, and human-aligned evaluation metric for text-to-3D generative models.	TONG WU et. al.	arxiv-cs.CV	2024-01-08
521	GloTSFormer: Global Video Text Spotting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Global Video Text Spotting Transformer GloTSFormer to model the tracking problem as global associations and utilize the Gaussian Wasserstein distance to guide the morphological correlation between frames.	Han Wang; Yanjie Wang; Yang Li; Can Huang;	arxiv-cs.CV	2024-01-08
522	InFoBench: Evaluating Instruction Following Ability in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models’ (LLMs) ability to follow instructions.	YIWEI QIN et. al.	arxiv-cs.CL	2024-01-07
523	CharPoet: A Chinese Classical Poetry Generation System Based on Token-free LLM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) improve content control by allowing unrestricted user instructions, but the token-by-token generation process frequently makes format errors. Motivated by this, we propose CharPoet, a Chinese classical poetry generation system based on token-free LLM, which provides effective control over both format and content.	Chengyue Yu; Lei Zang; Jiaotuan Wang; Chenyi Zhuang; Jinjie Gu;	arxiv-cs.CL	2024-01-07
524	RoBERTurk: Adjusting RoBERTa for Turkish Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We pretrain RoBERTa on a Turkish corpora using BPE tokenizer. Our model outperforms BERTurk family models on the BOUN dataset for the POS task while resulting in underperformance …	Nuri Tas;	arxiv-cs.CL	2024-01-07
525	Using Large Language Models to Assess Tutors’ Performance in Reacting to Students Making Math Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the capacity of generative AI to evaluate real-life tutors’ performance in responding to students making math errors.	Sanjit Kakarla; Danielle Thomas; Jionghao Lin; Shivang Gupta; Kenneth R. Koedinger;	arxiv-cs.HC	2024-01-06
526	PIXAR: Auto-Regressive Language Modeling in Pixel Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce PIXAR, the first pixel-based autoregressive LLM that performs text generation.	Yintao Tai; Xiyang Liao; Alessandro Suglia; Antonio Vergari;	arxiv-cs.CL	2024-01-06
527	PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a parameter efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets.	GANG LIU et. al.	arxiv-cs.CL	2024-01-05
528	Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we carry out the preliminary and comprehensive case study of utilizing GPT-4V for marine analysis.	ZIQIANG ZHENG et. al.	arxiv-cs.CL	2024-01-04
529	Re-evaluating The Memory-balanced Pipeline Parallelism: BPipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it suffers from imbalanced memory consumption, leading to insufficient memory utilization. The BPipe technique was proposed to address this issue and has proven effective in the GPT-3 model.	MINCONG HUANG et. al.	arxiv-cs.LG	2024-01-04
530	Shayona@SMM4H23: COVID-19 Self Diagnosis Classification Using BERT and LightGBM Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes approaches and results for shared Task 1 and 4 of SMMH4-23 by Team Shayona.	Rushi Chavda; Darshan Makwana; Vraj Patel; Anupam Shukla;	arxiv-cs.CL	2024-01-04
531	Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model.	Himmet Toprak Kesgin; Mehmet Fatih Amasyali;	arxiv-cs.CL	2024-01-03
532	MULTI-CASE: A Transformer-based Ethics-aware Multimodal Investigative Intelligence Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the challenge of multimodal analytics, we present MULTI-CASE, a holistic visual analytics framework tailored towards ethics-aware and multimodal intelligence exploration, designed in collaboration with domain experts.	Maximilian T. Fischer; Yannick Metz; Lucas Joos; Matthias Miller; Daniel A. Keim;	arxiv-cs.HC	2024-01-03
533	Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the prevalence of violent language on incels.is.	Daniel Matter; Miriam Schirmer; Nir Grinberg; Jürgen Pfeffer;	arxiv-cs.SI	2024-01-03
534	GPT-4V(ision) Is A Generalist Web Agent, If Grounded IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore the potential of LMMs like GPT-4V as a generalist web agent that can follow natural language instructions to complete tasks on any given website.	Boyuan Zheng; Boyu Gou; Jihyung Kil; Huan Sun; Yu Su;	arxiv-cs.IR	2024-01-03
535	Revisiting Zero-Shot Abstractive Summarization in The Era of Large Language Models from The Perspective of Position Bias Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We characterize and study zero-shot abstractive summarization in Large Language Models (LLMs) by measuring position bias, which we propose as a general formulation of the more restrictive lead bias phenomenon studied previously in the literature.	Anshuman Chhabra; Hadi Askari; Prasant Mohapatra;	arxiv-cs.CL	2024-01-03
536	MLPs Compass: What Is Learned When MLPs Are Combined with PLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by recent efforts that prove Multilayer-Perceptrons (MLPs) modules achieving robust structural capture capabilities, even outperforming Graph Neural Networks (GNNs), this paper aims to quantify whether simple MLPs can further enhance the already potent ability of PLMs to capture linguistic information.	LI ZHOU et. al.	arxiv-cs.CL	2024-01-03
537	SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is largely due to the multitude of nonlinear operations in the Transformer architecture, which are not well-suited to SMPC and difficult to circumvent or optimize effectively. To address this concern, we introduce an advanced optimization framework called SecFormer, to achieve fast and accurate PPI for Transformer models.	JINGLONG LUO et. al.	arxiv-cs.LG	2024-01-01
538	U²-Former: Nested U-Shaped Transformer for Image Restoration Via Multi-View Contrastive Learning Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: While Transformer has achieved remarkable performance in various high-level vision tasks, it is still challenging to exploit the full potential of Transformer in image …	XIN FENG et. al.	IEEE Transactions on Circuits and Systems for Video …	2024-01-01
539	A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce LogicAsker, an automatic approach that comprehensively evaluates and improves the logical reasoning abilities of LLMs under a set of atomic reasoning skills based on propositional and predicate logic.	YUXUAN WAN et. al.	arxiv-cs.SE	2024-01-01
540	Detection of Post-COVID-19-related Pulmonary Diseases in X-ray Images Using Vision Transformer-based Neural Network Related Papers Related Patents Related Grants Related Venues Related Experts View	Anzhelika Mezina; Radim Burget;	Biomed. Signal Process. Control.	2024-01-01
541	Modified Distance Protection for Transmission Line with Hexagonal Phase-shifting Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	F. Aboshady;	International Journal of Electrical Power & Energy Systems
542	Opening A Pandora’s Box: Things You Should Know in The Era of Custom GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a comprehensive analysis of the security and privacy issues arising from the custom GPT platform.	GUANHONG TAO et. al.	arxiv-cs.CR	2023-12-31
543	GraphGPT: Graph Learning with Generative Pre-trained Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce \textit{GraphGPT}, a novel model for Graph learning by self-supervised Generative Pre-training Transformers.	Qifang Zhao; Weidong Ren; Tianyu Li; Xiaoxiao Xu; Hong Liu;	arxiv-cs.LG	2023-12-31
544	A Two-stream Hybrid CNN-Transformer Network for Skeleton-based Human Interaction Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Two-stream Hybrid CNN-Transformer Network (THCT-Net), which exploits the local specificity of CNN and models global dependencies through the Transformer.	Ruoqi Yin; Jianqin Yin;	arxiv-cs.CV	2023-12-31
545	Advancing TTP Analysis: Harnessing The Power of Encoder-Only and Decoder-Only Language Models with Retrieval Augmented Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The state-of-the-art LLMs have shown to be prone to hallucination by providing inaccurate information, which is problematic in critical domains like cybersecurity. Therefore, we propose the use of Retrieval Augmented Generation (RAG) techniques to extract relevant contexts for each cyberattack procedure for decoder-only LLMs (without fine-tuning).	Reza Fayyazi; Rozhina Taghdimi; Shanchieh Jay Yang;	arxiv-cs.CR	2023-12-30
546	Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a framework to procedurally generate numerical questions and puzzles, and compare the results with and without the application of several red teaming techniques.	ALEKSANDER BUSZYDLIK et. al.	arxiv-cs.CL	2023-12-30
547	MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining.	JACOB PORTES et. al.	arxiv-cs.CL	2023-12-29
548	FlashVideo: A Framework for Swift Inference in Text-to-Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces FlashVideo, a novel framework tailored for swift Text-to-Video generation.	Bin Lei; le Chen; Caiwen Ding;	arxiv-cs.CV	2023-12-29
549	ClST: A Convolutional Transformer Framework for Automatic Modulation Recognition By Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, insufficient training signal data in complicated channel environments and large-scale DL models are critical factors that make DL methods difficult to deploy in practice. Aiming to these problems, we propose a novel neural network named convolution-linked signal transformer (ClST) and a novel knowledge distillation method named signal knowledge distillation (SKD).	Dongbin Hou; Lixin Li; Wensheng Lin; Junli Liang; Zhu Han;	arxiv-cs.LG	2023-12-28
550	DB-GPT: Empowering Database Interactions with Private Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility.	SIQIAO XUE et. al.	arxiv-cs.DB	2023-12-28
551	BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose BEAt tracking Streaming Transformer (BEAST), an online joint beat and downbeat tracking system based on the streaming Transformer.	Chih-Cheng Chang; Li Su;	arxiv-cs.SD	2023-12-28
552	SentinelLMs: Encrypted Input Adaptation and Fine-tuning of Language Models for Private and Secure Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, this introduces two fundamental risks: (a) the transmission of user inputs to the server via the network gives rise to interception vulnerabilities, and (b) privacy concerns emerge as organizations that deploy such models store user data with restricted context. To address this, we propose a novel method to adapt and fine-tune transformer-based language models on passkey-encrypted user-specific text.	Abhijit Mishra; Mingda Li; Soham Deo;	arxiv-cs.CR	2023-12-28
553	On The Rate of Convergence of An Over-parametrized Transformer Classifier Learned By Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here it is not only important what kind of models these network can approximate, or how they can generalize their knowledge learned by choosing the best possible approximation to a concrete data set, but also how well optimization of such transformer network based on concrete data set works. In this article we consider all these three different aspects simultaneously and show a theoretical upper bound on the missclassification probability of a transformer network fitted to the observed data.	Michael Kohler; Adam Krzyzak;	arxiv-cs.LG	2023-12-28
554	Evaluating The Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by Gemini Pro model), in undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico.	Sabino Miranda; Obdulia Pichardo-Lagunas; Bella Martínez-Seis; Pierre Baldi;	arxiv-cs.CL	2023-12-28
555	SCTNet: Single-Branch CNN with Transformer Semantic Information for Real-Time Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the additional branch incurs undesirable computational overhead and slows inference speed. To eliminate this dilemma, we propose SCTNet, a single branch CNN with transformer semantic information for real-time segmentation.	ZHENGZE XU et. al.	arxiv-cs.CV	2023-12-28
556	Gemini Pro Defeated By GPT-4V: Evidence from Education Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study compared the classification performance of Gemini Pro and GPT-4V in educational settings. Employing visual question answering (VQA) techniques, the study examined both …	Gyeong-Geon Lee; Ehsan Latif; Lehong Shi; Xiaoming Zhai;	ArXiv	2023-12-27
557	Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces 26 guiding principles designed to streamline the process of querying and prompting large language models.	Sondos Mahmoud Bsharat; Aidar Myrzakhan; Zhiqiang Shen;	arxiv-cs.CL	2023-12-26
558	SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SecQA, a novel dataset tailored for evaluating the performance of Large Language Models (LLMs) in the domain of computer security.	Zefang Liu;	arxiv-cs.CL	2023-12-25
559	Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the effectiveness of integrating an increasing number of heterogeneous methods.	Jia Cheng Hu; Roberto Cavicchioli; Giulia Berardinelli; Alessandro Capotondi;	arxiv-cs.CL	2023-12-25
560	Fairness-Aware Structured Pruning in Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle the …	Abdelrahman Zayed; Goncalo Mordido; Samira Shabanian; Ioana Baldini; Sarath Chandar;	arxiv-cs.CL	2023-12-23
561	Building Real-World Meeting Summarization Systems Using Large Language Models: A Practical Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper studies how to effectively build meeting summarization systems for real-world usage using large language models (LLMs).	Md Tahmid Rahman Laskar; Xue-Yong Fu; Cheng Chen; Shashi Bhushan TN;	emnlp	2023-12-22
562	TheoremQA: A Theorem-driven Question Answering Dataset IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce TheoremQA, the first theorem-driven question-answering dataset designed to evaluate AI models� capabilities to apply theorems to solve challenging science problems.	WENHU CHEN et. al.	emnlp	2023-12-22
563	Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval? To answer it, we perform a comprehensive study on a scalable pre-trained retrieval-augmented LM (i. e. , RETRO) compared with standard GPT and retrieval-augmented GPT incorporated at fine-tuning or inference stages.	BOXIN WANG et. al.	emnlp	2023-12-22
564	Bootstrapping Small & High Performance Language Models with Unmasking-Removal Training Policy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: BabyBERTa, a language model trained on small-scale child-directed speech while none of the words are unmasked during training, has been shown to achieve a level of grammaticality comparable to that of RoBERTa-base, which is trained on 6,000 times more words and 15 times more parameters. Relying on this promising result, we explore in this paper the performance of BabyBERTa-based models in downstream tasks, focusing on Semantic Role Labeling (SRL) and two Extractive Question Answering tasks, with the aim of building more efficient systems that rely on less data and smaller models.	Yahan Yang; Elior Sulem; Insup Lee; Dan Roth;	emnlp	2023-12-22
565	API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, three pivotal questions remain unanswered: (1) How effective are current LLMs in utilizing tools? (2) How can we enhance LLMs? ability to utilize tools? (3) What obstacles need to be overcome to leverage tools? To address these questions, we introduce API-Bank, a groundbreaking benchmark, specifically designed for tool-augmented LLMs.	MINGHAO LI et. al.	emnlp	2023-12-22
566	Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families.	Andrea Wen-Yi; David Mimno;	emnlp	2023-12-22
567	Do Transformers Parse While Predicting The Masked Word? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Some doubts have been raised whether the models are doing parsing or only some computation weakly correlated with it. Concretely: (a) Is it possible to explicitly describe transformers with realistic embedding dimensions, number of heads, etc. that are capable of doing parsing ? or even approximate parsing? (b) Why do pre-trained models capture parsing structure? This paper takes a step toward answering these questions in the context of generative modeling with PCFGs. We show that masked language models like BERT or RoBERTa of moderate sizes can approximately execute the Inside-Outside algorithm for the English PCFG (Marcus et al. , 1993).	Haoyu Zhao; Abhishek Panigrahi; Rong Ge; Sanjeev Arora;	emnlp	2023-12-22
568	LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation.	THEO OLAUSSON et. al.	emnlp	2023-12-22
569	JASMINE: Arabic GPT Models for Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using our novel benchmark, we evaluate JASMINE extensively showing powerful performance intrinsically as well as in few-shot learning on a wide range of NLP tasks. We aim to responsibly release our models and evaluation benchmark with interested researchers, along with code for experimenting with them.	El Moatez Billah Nagoudi; Muhammad Abdul-Mageed; AbdelRahim Elmadany; Alcides Inciarte; Md Tawkat Islam Khondaker;	emnlp	2023-12-22
570	MRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present mRedditSum, the first multimodal discussion summarization dataset.	Keighley Overbay; Jaewoo Ahn; Fatemeh Pesaran zadeh; Joonsuk Park; Gunhee Kim;	emnlp	2023-12-22
571	Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data.	DA YIN et. al.	emnlp	2023-12-22
572	Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the advancements of T2I models, a common issue encountered by users is the need for repetitive editing of input prompts in order to receive a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models, such as GPT-k, we investigate the potential of utilizing such models to improve the prompt editing process for T2I generation.	WANRONG ZHU et. al.	emnlp	2023-12-22
573	Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task.	Felipe Urrutia; Cristian Calderon; Valentin Barriere;	emnlp	2023-12-22
574	GPT-RE: In-context Learning for Relation Extraction Using Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose GPT-RE to successfully address the aforementioned issues by (1) incorporating task-aware representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic.	ZHEN WAN et. al.	emnlp	2023-12-22
575	Zero-Shot Multi-Label Topic Inference with Sentence Encoders and LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conducted a comprehensive study with the latest Sentence Encoders and Large Language Models (LLMs) on the challenging task of �definition-wild zero-shot topic inference�, where users define or provide the topics of interest in real-time.	Souvika Sarkar; Dongji Feng; Shubhra Kanti Karmaker Santu;	emnlp	2023-12-22
576	NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this servingtime requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires 10-6% of the Transformer�s FLOPs per document and can be served using commodity CPUs.	Livio Soares; Daniel Gillick; Jeremy Cole; Tom Kwiatkowski;	emnlp	2023-12-22
577	LLM-powered Data Augmentation for Enhanced Cross-lingual Performance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited.	Chenxi Whitehouse; Monojit Choudhury; Alham Aji;	emnlp	2023-12-22
578	Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: i. e, generating large-scale harmful and misleading content). To combat this emerging risk of LLMs, we propose a novel �Fighting Fire with Fire*� (F3) strategy that harnesses modern LLMs� generative and emergent reasoning capabilities to counter human-written and LLM-generated disinformation.	JASON LUCAS et. al.	emnlp	2023-12-22
579	The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides a comprehensive analysis of the divergence between academic research in NLP and the needs of real-world NLP applications via a large-scale collection of user-GPT conversations.	SIRU OUYANG et. al.	emnlp	2023-12-22
580	Understanding The Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a comprehensive analytical model for estimating the performance of a spatial LLM accelerator, taking into account the on-chip compute and memory resources available on an FPGA.	HONGZHENG CHEN et. al.	arxiv-cs.LG	2023-12-22
581	Harnessing Black-Box Control to Boost Commonsense in LM�s Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i. e. , producing a plausible output that incorporates a list of concepts in a meaningful way).	Yufei Tian; Felix Zhang; Nanyun Peng;	emnlp	2023-12-22
582	Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios.	YILUN ZHAO et. al.	emnlp	2023-12-22
583	Automatic Transcription of Handwritten Old Occitan Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an innovative HTR approach that leverages the Transformer architecture for recognizing handwritten Old Occitan language.	Esteban Arias; Vallari Pai; Matthias Sch�ffel; Christian Heumann; Matthias Aenmacher;	emnlp	2023-12-22
584	Disentangling Transformer Language Models As Superposed Topic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to its hypothesised superposition properties, the final logits originating from the residual path are considered uninterpretable. Therefore, we posit that we can interpret TLM as superposed NTM by proposing a novel weight-based, model-agnostic and corpus-agnostic approach to search and disentangle decoder-only TLM, potentially mapping individual neurons to multiple coherent topics.	Jia Peng Lim; Hady Lauw;	emnlp	2023-12-22
585	Let GPT Be A Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models.	ZHENWEN LIANG et. al.	emnlp	2023-12-22
586	Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present novel experimental insights into the resilience of LLMs, particularly GPT-4, when subjected to extensive character-level permutations.	Qi Cao; Takeshi Kojima; Yutaka Matsuo; Yusuke Iwasawa;	emnlp	2023-12-22
587	Towards Detecting Cascades of Biased Medical Claims on Twitter Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a machine learning framework that uses two models in tandem: RoBERTa to detect medical claims and DistilBERT to classify bias.	Libby Tiderman; Juan Sanchez Mercedes; Fiona Romanoschi; Fabricio Murai;	arxiv-cs.SI	2023-12-22
588	Refining GPT-3 Embeddings with A Siamese Structure for Technical Post Duplicate Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we attempt to employ and refine the GPT-3 embeddings for the duplicate detection task.	Xingfang Wu; Heng Li; Nobukazu Yoshioka; Hironori Washizaki; Foutse Khomh;	arxiv-cs.SE	2023-12-22
589	Effects of Sub-word Segmentation on Performance of Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation � Morfessor and StateMorph.	Jue Hou; Anisia Katinskaia; Anh-Duc Vu; Roman Yangarber;	emnlp	2023-12-22
590	Investigating Efficiently Extending Transformers for Long Input Summarization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on our findings, we introduce PEGASUS-X, an extension of the PEGASUS model with additional long input pretraining to handle inputs of up to 16K tokens, which achieves strong performance on long input summarization tasks comparable with much larger models.	Jason Phang; Yao Zhao; Peter Liu;	emnlp	2023-12-22
591	Document-Level Machine Translation with Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The study focuses on three aspects: 1) Effects of Context-Aware Prompts, where we investigate the impact of different prompts on document-level translation quality and discourse phenomena; 2) Comparison of Translation Models, where we compare the translation performance of ChatGPT with commercial MT systems and advanced document-level MT methods; 3) Analysis of Discourse Modelling Abilities, where we further probe discourse knowledge encoded in LLMs and shed light on impacts of training techniques on discourse modeling.	LONGYUE WANG et. al.	emnlp	2023-12-22
592	Evaluation Metrics in The Era of GPT-4: Reliably Evaluating Large Language Models on Sequence to Sequence Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We aim to improve the understanding of current models� performance by providing a preliminary and hybrid evaluation on a range of open and closed-source generative LLMs on three NLP benchmarks: text summarisation, text simplification and grammatical error correction (GEC), using both automatic and human evaluation.	Andrea Sottana; Bin Liang; Kai Zou; Zheng Yuan;	emnlp	2023-12-22
593	Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose focusing on generalization, uncertainty, and how to leverage recent large language models, in order to create more practical tools to evaluate information veracity in contexts where perfect classification is impossible.	KELLIN PELRINE et. al.	emnlp	2023-12-22
594	Explicit Planning Helps Language Models in Logical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LEAP, a novel system that uses language models to perform multi-step logical reasoning and incorporates explicit planning into the inference procedure.	Hongyu Zhao; Kangrui Wang; Mo Yu; Hongyuan Mei;	emnlp	2023-12-22
595	Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs Without Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose Inference-time Policy Adapters (IPA), which efficiently tailors a language model such as GPT-3 without fine-tuning it.	XIMING LU et. al.	emnlp	2023-12-22
596	Large Language Models Are Biased to Overestimate Profoundness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statements and the prompting technique used.	Eugenio Herrera-Berg; Tom�s Browne; Pablo Le�n-Villagr�; Marc-Llu�s Vives; Cristian Calderon;	emnlp	2023-12-22
597	Knowledge Rumination for Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fails to fully utilize them when applying to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize that related latent knowledge without retrieving them from the external corpus.	YUNZHI YAO et. al.	emnlp	2023-12-22
598	EELBERT: Tiny Models Through Dynamic Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EELBERT, an approach for compression of transformer-based models (e. g. , BERT), with minimal impact on the accuracy of downstream tasks.	Gabrielle Cohn; Rishika Agarwal; Deepanshu Gupta; Siddharth Patwardhan;	emnlp	2023-12-22
599	Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we carry out a data archaeology to infer books that are known to ChatGPT and GPT-4 using a name cloze membership inference query.	Kent Chang; Mackenzie Cramer; Sandeep Soni; David Bamman;	emnlp	2023-12-22
600	Gemini Vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The rapidly evolving sector of Multi-modal Large Language Models (MLLMs) is at the forefront of integrating linguistic and visual processing in artificial intelligence. This paper …	ZHANGYANG QI et. al.	ArXiv	2023-12-22
601	Editing Common Sense in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether commonsense judgments are causally associated with localized, editable parameters in Transformers, and we provide an affirmative answer.	ANSHITA GUPTA et. al.	emnlp	2023-12-22
602	IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning.	ZHEBIN ZHANG et. al.	emnlp	2023-12-22
603	Exploring The Boundaries of GPT-4 in Radiology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models.	QIANCHU LIU et. al.	emnlp	2023-12-22
604	Unsupervised Auditory and Semantic Entrainment Models with Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an unsupervised deep learning framework that derives meaningful representation from textual features for developing semantic entrainment.	Jay Kejriwal; Stefan Benus; Lina M. Rojas-Barahona;	arxiv-cs.CL	2023-12-22
605	Sparse Universal Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is mainly because scaling UT parameters is more compute and memory intensive than scaling up a VT. This paper proposes the Sparse Universal Transformer (SUT), which leverages Sparse Mixture of Experts (SMoE) to reduce UT�s computation complexity while retaining its parameter efficiency and generalization ability.	Shawn Tan; Yikang Shen; Zhenfang Chen; Aaron Courville; Chuang Gan;	emnlp	2023-12-22
606	Beware of Model Collapse! Fast and Stable Test-time Adaptation for Robust Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we delve into why TTA causes model collapse and find that the imbalanced label distribution inherent in QA is the reason for it.	Yi Su; Yixin Ji; Juntao Li; Hai Ye; Min Zhang;	emnlp	2023-12-22
607	Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the ArgTersely benchmark for sentence-level counter-argument generation, drawing from a manually annotated dataset from the ChangeMyView debate forum.	JIAYU LIN et. al.	emnlp	2023-12-22
608	Outlier Suppression+: Accurate Quantization of Large Language Models By Equivalent and Effective Shifting and Scaling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We observe that these outliers are concentrated in specific channels and are asymmetric across channels. To address this issue, we propose the Outlier Suppression+ (OS+) framework, which contains the channel-wise shifting for asymmetry and channel-wise scaling for concentration.	XIUYING WEI et. al.	emnlp	2023-12-22
609	Cabbage Sweeter Than Cake? Analysing The Potential of Large Language Models for Learning Conceptual Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These quality dimensions are usually learned from human judgements, which means that applications of conceptual spaces tend to be limited to narrow domains (e. g. modelling colour or taste). Encouraged by recent findings about the ability of Large Language Models (LLMs) to learn perceptually grounded representations, we explore the potential of such models for learning conceptual spaces.	Usashi Chatterjee; Amit Gajbhiye; Steven Schockaert;	emnlp	2023-12-22
610	SAMRank: Unsupervised Keyphrase Extraction Using Self-Attention Map in BERT and GPT-2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel unsupervised keyphrase extraction approach, called SAMRank, which uses only a self-attention map in a pre-trained language model (PLM) to determine the importance of phrases.	Byungha Kang; Youhyun Shin;	emnlp	2023-12-22
611	Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite its importance to journalists and human fact-checkers, it remains a severely understudied problem, and the scarce research on this topic so far has only focused on English. Here we aim to bridge this gap by creating a novel dataset, X-CLAIM, consisting of 7K real-world claims collected from numerous social media platforms in five Indian languages and English.	Shubham Mittal; Megha Sundriyal; Preslav Nakov;	emnlp	2023-12-22
612	Axiomatic Preference Modeling for Longform Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The contributions of this work include: training a standalone preference model that can score human- and LLM-generated answers on the same scale; developing an axiomatic framework for generating training data pairs tailored to certain principles; and showing that a small amount of axiomatic signals can help small models outperform GPT-4 in preference scoring.	Corby Rosset; Guoqing Zheng; Victor Dibia; Ahmed Awadallah; Paul Bennett;	emnlp	2023-12-22
613	Retrofitting Light-weight Language Models for Emotions Using Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel retrofitting method to induce emotion aspects into pre-trained language models (PLMs) such as BERT and RoBERTa.	Sapan Shah; Sreedhar Reddy; Pushpak Bhattacharyya;	emnlp	2023-12-22
614	Large Language Models Are Complex Table Parsers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to incorporate GPT-3. 5 to address such challenges, in which complex tables are reconstructed into tuples and specific prompt designs are employed for dialogues.	BOWEN ZHAO et. al.	emnlp	2023-12-22
615	Conceptor-Aided Debiasing of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two methods of applying conceptors (1) bias subspace projection by post-processing by the conceptor NOT operation; and (2) a new architecture, conceptor-intervened BERT (CI-BERT), which explicitly incorporates the conceptor projection into all layers during training.	Li Yifei; Lyle Ungar; Jo�o Sedoc;	emnlp	2023-12-22
616	INSTRUCTSCORE: Towards Explainable Text Generation Evaluation with Automatic Feedback IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although recent learned metrics show high correlation with human judgement, these metrics do not provide explicit explanation of their verdict, nor associate the scores with defects in the generated text. To address this limitation, we present INSTRUCTSCORE, a fine-grained explainable evaluation metric for text generation.	WENDA XU et. al.	emnlp	2023-12-22
617	Exploiting Novel GPT-4 APIs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that fine-tuning a model on as few as 15 harmful examples or 100 benign examples can remove core safeguards from GPT-4, enabling a range of harmful outputs.	Kellin Pelrine; Mohammad Taufeeque; Michał Zając; Euan McLean; Adam Gleave;	arxiv-cs.CR	2023-12-21
618	Efficacy of Machine-Generated Instructions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large instruction-tuned language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks.	Samaksh Gulati; Anshit Verma; Manoj Parmar; Palash Chaudhary;	arxiv-cs.CL	2023-12-21
619	Generative Pretraining at Scale: Transformer-Based Encoding of Transactional Behavior for Fraud Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce an innovative autoregressive model leveraging Generative Pretrained Transformer (GPT) architectures, tailored for fraud detection in payment systems.	Ze Yu Zhao; Zheng Zhu; Guilin Li; Wenhan Wang; Bo Wang;	arxiv-cs.LG	2023-12-21
620	ChatGPT As A Commenter to The News: Can LLMs Generate Human-like Opinions? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research we investigate to what extent GPT-3.5 can generate human-like comments on Dutch news articles.	Rayden Tseng; Suzan Verberne; Peter van der Putten;	arxiv-cs.CL	2023-12-21
621	Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI’s LLM with Open Source SLMs in Production Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a systematic evaluation methodology and a characterization of modern open-source SLMs and their trade-offs when replacing proprietary LLMs for a real-world product feature.	CHANDRA IRUGALBANDARA et. al.	arxiv-cs.SE	2023-12-20
622	Automated DevOps Pipeline Generation for Code Repositories Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a detailed investigation into the use of Large Language Models (LLMs) specifically, GPT 3.5 and GPT 4 to generate and evaluate GitHub Action workflows for DevOps tasks.	Deep Mehta; Kartik Rawool; Subodh Gujar; Bowen Xu;	arxiv-cs.SE	2023-12-20
623	Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: A Focused Study on Chemical Entities of Biological Interest Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML).	EMILY GROVES et. al.	arxiv-cs.LG	2023-12-20
624	AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their advancements, challenges in balancing code snippet generation with effective test case generation and execution persist. To address these issues, this paper introduces Multi-Agent Assistant Code Generation (AgentCoder), a novel solution comprising a multi-agent framework with specialized agents: the programmer agent, the test designer agent, and the test executor agent.	Dong Huang; Qingwen Bu; Jie M. Zhang; Michael Luck; Heming Cui;	arxiv-cs.CL	2023-12-20
625	Advancing SQL Injection Detection for High-Speed Data Centers: A Novel Approach Using Cascaded NLP Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel cascade SQLi detection method, blending classical and transformer-based NLP models, achieving a 99.86% detection accuracy with significantly lower computational demands-20 times faster than using transformer-based models alone.	KASIM TASDEMIR et. al.	arxiv-cs.CR	2023-12-20
626	HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As the number of hardware attacks on Internet of Things (IoT) devices continues to rapidly increase, we present the Hardware Vulnerability to Weakness Mapping (HW-V2W-Map) Framework, which is a Machine Learning (ML) framework focusing on hardware vulnerabilities and IoT security.	YU-ZHENG LIN et. al.	arxiv-cs.CR	2023-12-20
627	Can Transformers Learn Sequential Function Classes In Context? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel sliding window sequential function class and employ toy-sized transformers with a GPT-2 architecture to conduct our experiments.	Ryan Campbell; Emma Guo; Evan Hu; Reya Vir; Ethan Hsiao;	arxiv-cs.LG	2023-12-19
628	Founder-GPT: Self-play to Evaluate The Founder-Idea Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research introduces an innovative evaluation method for the founder-idea fit in early-stage startups, utilizing advanced large language model techniques to assess founders’ profiles against their startup ideas to enhance decision-making.	Sichao Xiong; Yigit Ihlamur;	arxiv-cs.CL	2023-12-19
629	A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a preliminary exploration of Gemini Pro’s visual understanding proficiency, which comprehensively covers four domains: fundamental perception, advanced cognition, challenging vision tasks, and various expert capacities.	CHAOYOU FU et. al.	arxiv-cs.CV	2023-12-19
630	Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we evaluate the conversational reasoning capabilities of the current state-of-the-art LLM (GPT-4) on knowledge graphs (KGs).	Yuxuan Huang; Lida Shi; Anqi Liu; Hao Xu;	arxiv-cs.CL	2023-12-18
631	APIDocBooster: An Extract-Then-Abstract Framework Leveraging Large Language Models for Augmenting API Documentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce APIDocBooster, an extract-then-abstract framework that seamlessly fuses the advantages of both extractive (i.e., enabling faithful summaries without length limitation) and abstractive summarization (i.e., producing coherent and concise summaries).	CHENGRAN YANG et. al.	arxiv-cs.SE	2023-12-18
632	Time-Transformer: Integrating Local and Global Features for Better Time Series Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Most existing generative models have failed to effectively learn both the local and global properties of time series data. To address this open problem, we propose a novel time series generative model named ‘Time-Transformer AAE’, which consists of an adversarial autoencoder (AAE) and a newly designed architecture named ‘Time-Transformer’ within the decoder.	YUANSAN LIU et. al.	arxiv-cs.LG	2023-12-18
633	MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, most existing methods neglect the crucial significance of LLMs utilizing external tools and model collaboration. To address these challenges, we introduce MAC-SQL, a novel LLM-based multi-agent collaborative framework.	BING WANG et. al.	arxiv-cs.CL	2023-12-18
634	Stronger Graph Transformer with Regularized Attention Scores Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel version of edge regularization technique that alleviates the need for Positional Encoding and ultimately alleviate GT’s out of memory issue.	Eugene Ku;	arxiv-cs.LG	2023-12-18
635	A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage chiplet-based heterogeneous integration (HI) to design a high-performance and energy-efficient multi-chiplet platform to accelerate transformer workloads.	Harsh Sharma; Pratyush Dhingra; Janardhan Rao Doppa; Umit Ogras; Partha Pratim Pande;	arxiv-cs.AR	2023-12-18
636	An In-depth Look at Gemini’s Language Abilities Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we do an in-depth exploration of Gemini’s language abilities, making two contributions. First, we provide a third-party, objective comparison of the abilities of the OpenAI GPT and Google Gemini models with reproducible code and fully transparent results.	SYEDA NAHIDA AKTER et. al.	arxiv-cs.CL	2023-12-18
637	An Evaluation of GPT-4V and Gemini in Online VQA Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct fine-grained analysis by generating seven types of metadata for nearly 2,000 visual questions, such as image type and the required image processing capabilities.	Mengchen Liu; Chongyan Chen; Danna Gurari;	arxiv-cs.CV	2023-12-17
638	AI Gender Bias, Disparities, and Fairness: Does Training Data Matter? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study employs three distinct techniques for bias analysis: Scoring accuracy difference to evaluate bias, mean score gaps by gender (MSG) to evaluate disparity, and Equalized Odds (EO) to evaluate fairness.	Ehsan Latif; Xiaoming Zhai; Lei Liu;	arxiv-cs.CY	2023-12-17
639	Can Persistent Homology Whiten Transformer-based Black-box Models? A Case Study on BERT Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we propose Optimus BERT Compression and Explainability (OBCE), a methodology to bring explainability to BERT models using persistent homology, aiming to measure the importance of each neuron by studying the topological characteristics of their outputs.	Luis Balderas; Miguel Lastra; José M. Benítez;	arxiv-cs.LG	2023-12-17
640	Cross-Domain Robustness of Transformer-based Keyphrase Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore the effectiveness of abstractive text summarization models for keyphrase selection.	Anna Glazkova; Dmitry Morozov;	arxiv-cs.CL	2023-12-17
641	Multi-Label Classification of COVID-Tweets Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tried three different models-(a) Supervised BERT-large-uncased, (b) Supervised HateXplain model, and (c) Zero-Shot GPT-3.5 Turbo model.	Aniket Deroy; Subhankar Maity;	arxiv-cs.CL	2023-12-17
642	Decoding Concerns: Multi-label Classification of Vaccine Sentiments in Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However the situation involves a mix of perspectives, with skepticism towards vaccines prevailing for various reasons such as political dynamics, apprehensions about side effects, and more. The paper addresses the challenge of comprehensively understanding and categorizing these diverse concerns expressed in the context of vaccination.	Somsubhra De; Shaurya Vats;	arxiv-cs.SI	2023-12-17
643	T2M-HiFiGPT: Generating High Quality Human Motion from Textual Descriptions with Residual Discrete Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we introduce T2M-HiFiGPT, a novel conditional generative framework for synthesizing human motion from textual descriptions.	Congyi Wang;	arxiv-cs.CV	2023-12-17
644	HyperPIE: Hyperparameter Information Extraction from Scientific Publications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we formalize and tackle hyperparameter information extraction (HyperPIE) as an entity recognition and relation extraction task.	Tarek Saier; Mayumi Ohta; Takuto Asakura; Michael Färber;	arxiv-cs.CL	2023-12-17
645	A Comparative Analysis of Large Language Models for Code Documentation Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for generation of code documentation.	Shubhang Shekhar Dvivedi; Vyshnav Vijay; Sai Leela Rahul Pujari; Shoumik Lodh; Dhruv Kumar;	arxiv-cs.SE	2023-12-16
646	Cross-Linguistic Offensive Language Detection: BERT-Based Analysis of Bengali, Assamese, & Bodo Conversational Hateful Content from Social Media Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we used BERT models, including XML-Roberta, L3-cube, IndicBERT, BenglaBERT, and BanglaHateBERT.	Jhuma Kabir Mim; Mourad Oussalah; Akash Singhal;	arxiv-cs.CL	2023-12-16
647	SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the SPT system to fine-tune Transformer-based models efficiently by introducing sparsity.	Yuntao Gui; Xiao Yan; Peiqi Yin; Han Yang; James Cheng;	arxiv-cs.DC	2023-12-16
648	DeepArt: A Benchmark to Advance Fidelity Research in AI-Generated Content Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper explores the image synthesis capabilities of GPT-4, a leading multi-modal large language model.	Wentao Wang; Xuanyao Huang; Tianyang Wang; Swalpa Kumar Roy;	arxiv-cs.CV	2023-12-16
649	Algorithms for Automatic Intents Extraction and Utterances Classification for Goal-oriented Dialogue Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A method for preprocessing dialog data sets in JSON format is described.	Leonid Legashev; Alexander Shukhman; Arthur Zhigalov;	arxiv-cs.AI	2023-12-15
650	Distilling Large Language Models for Matching Patients to Clinical Trials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are significant challenges associated with using closed-source proprietary LLMs like GPT-3.5 in practical healthcare applications, such as cost, privacy and reproducibility concerns. To address these issues, this study presents the first systematic examination of the efficacy of both proprietary (GPT-3.5, and GPT-4) and open-source LLMs (LLAMA 7B,13B, and 70B) for the task of patient-trial matching.	Mauro Nievas; Aditya Basu; Yanshan Wang; Hrituraj Singh;	arxiv-cs.AI	2023-12-15
651	LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the possibility of Language Adaptation for LLaMA models, explicitly focusing on addressing the challenge of Italian Language coverage.	PIERPAOLO BASILE et. al.	arxiv-cs.CL	2023-12-15
652	Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present BinSum, a comprehensive benchmark and dataset of over 557K binary functions and introduce a novel method for prompt synthesis and optimization.	Xin Jin; Jonathan Larson; Weiwei Yang; Zhiqiang Lin;	arxiv-cs.CR	2023-12-15
653	GPT-doctor: Customizing Large Language Models for Medical Consultation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel framework to customize LLMs for general business contexts that aims to achieve three fundamental objectives simultaneously: (1) aligning conversational patterns, (2) integrating in-depth domain knowledge, and (3) embodying the soft skills and core principles.	Wen Wang; Zhenyue Zhao; Tianshu Sun;	arxiv-cs.CY	2023-12-15
654	3DAxiesPrompts: Unleashing The 3D Spatial Task Capabilities of GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks.	DINGNING LIU et. al.	arxiv-cs.AI	2023-12-15
655	Exploring Automatic Text Simplification of German Narrative Documents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we apply transformer-based Natural Language Generation (NLG) techniques to the problem of text simplification.	Thorben Schomacker; Tillmann Dönicke; Marina Tropmann-Frick;	arxiv-cs.CL	2023-12-15
656	Red AI? Inconsistent Responses from GPT3.5 Models on Political Issues in The US and China Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We posed the same question about high-profile political issues in the United States and China to GPT in both English and simplified Chinese, and our analysis of the bilingual responses revealed that GPT’s bilingual models’ political knowledge (content) and the political attitude (sentiment) are significantly more inconsistent on political issues in China.	Di Zhou; Yinxian Zhang;	arxiv-cs.CL	2023-12-15
657	Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, future superhuman models will behave in complex ways too difficult for humans to reliably evaluate; humans will only be able to weakly supervise superhuman models. We study an analogy to this problem: can weak model supervision elicit the full capabilities of a much stronger model?	COLLIN BURNS et. al.	arxiv-cs.CL	2023-12-14
658	TinyGSM: Achieving >80% on GSM8k with Small Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce \texttt{TinyGSM}, a synthetic dataset of 12.3M grade school math problems paired with Python solutions, generated fully by GPT-3.5.	BINGBIN LIU et. al.	arxiv-cs.LG	2023-12-14
659	VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Vision-Language Generative Pre-trained Transformer (VL-GPT), a transformer model proficient at concurrently perceiving and generating visual and linguistic data.	JINGUO ZHU et. al.	arxiv-cs.CV	2023-12-14
660	Heterogeneous Graph Neural Architecture Search with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a new GPT-4 based HGNAS model to improve the search efficiency and search accuracy of HGNAS.	Haoyuan Dong; Yang Gao; Haishuai Wang; Hong Yang; Peng Zhang;	arxiv-cs.AI	2023-12-14
661	No-Skim: Towards Efficiency Robustness Evaluation on Skimming-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose No-Skim, a general framework to help the owners of skimming-based LLM to understand and measure the robustness of their acceleration scheme.	Shengyao Zhang; Mi Zhang; Xudong Pan; Min Yang;	arxiv-cs.CR	2023-12-14
662	Weaving Pathways for Justice with GPT: LLM-driven Automated Drafting of Interactive Legal Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe 3 approaches to automating the completion of court forms: a generative AI approach that uses GPT-3 to iteratively prompt the user to answer questions, a constrained template-driven approach that uses GPT-4-turbo to generate a draft of questions that are subject to human review, and a hybrid method.	Quinten Steenhuis; David Colarusso; Bryce Willey;	arxiv-cs.AI	2023-12-14
663	Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address increasing compute demand from recent multi-model workloads with heavy models like large language models, we propose to deploy heterogeneous chiplet-based multi-chip module (MCM)-based accelerators.	Mohanad Odema; Hyoukjun Kwon; Mohammad Abdullah Al Faruque;	arxiv-cs.AR	2023-12-14
664	Dietary Assessment with Multimodal ChatGPT: A Systematic Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the application of multimodal ChatGPT within the realm of dietary assessment.	FRANK P. -W. LO et. al.	arxiv-cs.CV	2023-12-13
665	Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Further, the concept drift phenomenon of API calls is prominent. To tackle these issues, we introduce a prompt engineering-assisted malware dynamic analysis using GPT-4.	Pei Yan; Shunquan Tan; Miaohui Wang; Jiwu Huang;	arxiv-cs.CR	2023-12-13
666	Native Language Identification with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first experiments on Native Language Identification (NLI) using LLMs such as GPT-4.	Wei Zhang; Alexandre Salle;	arxiv-cs.CL	2023-12-12
667	Exploring Large Language Models to Facilitate Variable Autonomy for Human-Robot Teaming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recognizing the significance of this development, this paper explores the integration of Large Language Models (LLMs) like Generative pre-trained transformer (GPT) into human-robot teaming environments to facilitate variable autonomy through the means of verbal human-robot communication. In this paper, we introduce a novel framework for such a GPT-powered multi-robot testbed environment, based on a Unity Virtual Reality (VR) setting.	Younes Lakhnati; Max Pascher; Jens Gerken;	arxiv-cs.HC	2023-12-12
668	How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our findings delineate GPT-4V’s capability boundaries in distribution shifts, shedding light on its strengths and limitations across various scenarios. Importantly, this investigation contributes to our understanding of how AI foundation models generalize to distribution shifts, offering pivotal insights into their adaptability and robustness.	ZHONGYI HAN et. al.	arxiv-cs.LG	2023-12-12
669	Abusive Span Detection for Vietnamese Narrative Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, there are limited studies on applying natural language processing (NLP) in this field in Vietnam. Therefore, we aim to contribute by building a human-annotated Vietnamese dataset for detecting abusive content in Vietnamese narrative texts.	Nhu-Thanh Nguyen; Khoa Thi-Kim Phan; Duc-Vu Nguyen; Ngan Luu-Thuy Nguyen;	arxiv-cs.CL	2023-12-12
670	Scaling Culture in Blockchain Gaming: Generative AI and Pseudonymous Engagement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Managing rapidly growing decentralized gaming communities brings unique challenges at the nexus of cultural economics and technology. This paper introduces a streamlined analytical framework that utilizes Large Language Models (LLMs), in this instance open-access generative pre-trained transformer (GPT) models, offering an efficient solution with deeper insights into community dynamics.	Henrik Axelsen; Sebastian Axelsen; Valdemar Licht; Jason Potts;	arxiv-cs.HC	2023-12-12
671	Towards Equipping Transformer with The Ability of Systematic Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT.	Chen Huang; Peixin Qin; Wenqiang Lei; Jiancheng Lv;	arxiv-cs.CL	2023-12-12
672	Transformer Attractors for Robust and Efficient End-to-End Neural Diarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to replace EDA with a transformer-based attractor calculation (TA) module.	Lahiru Samarakoon; Samuel J. Broughton; Marc Harkönen; Ivan Fung;	arxiv-cs.SD	2023-12-11
673	Revisiting The Role of Label Smoothing in Enhanced Text Sentiment Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To fill in the gap, this article performs a set of in-depth analyses on eight datasets for text sentiment classification and three deep learning architectures: TextCNN, BERT, and RoBERTa, under two learning schemes: training from scratch and fine-tuning.	Yijie Gao; Shijing Si; Hua Luo; Haixia Sun; Yugui Zhang;	arxiv-cs.CL	2023-12-11
674	Survey on Foundation Models for Prognostics and Health Management in Industrial Cyber-Physical Systems Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Industrial Cyber-Physical Systems (ICPS) integrate the disciplines of computer science, communication technology, and engineering, and have emerged as integral components of …	Ruonan Liu; Quanhu Zhang; Te Han;	arxiv-cs.AI	2023-12-11
675	SM70: A Large Language Model for Medical Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We are introducing SM70, a 70 billion-parameter Large Language Model that is specifically designed for SpassMed’s medical devices under the brand name ‘JEE1’ (pronounced as G1 and means ‘Life’).	Anubhav Bhatti; Surajsinh Parmar; San Lee;	arxiv-cs.CL	2023-12-11
676	U-MixFormer: UNet-like Transformer with Mix-Attention for Efficient Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we propose a novel transformer decoder, U-MixFormer, built upon the U-Net structure, designed for efficient semantic segmentation.	Seul-Ki Yeom; Julian von Klitzing;	arxiv-cs.CV	2023-12-11
677	From Text to Motion: Grounding GPT-4 in A Humanoid Robot Alter3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We report the development of Alter3, a humanoid robot capable of generating spontaneous motion using a Large Language Model (LLM), specifically GPT-4.	Takahide Yoshida; Atsushi Masumori; Takashi Ikegami;	arxiv-cs.RO	2023-12-11
678	AI Control: Improving Safety Despite Intentional Subversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we develop and evaluate pipelines of safety techniques (protocols) that are robust to intentional subversion.	Ryan Greenblatt; Buck Shlegeris; Kshitij Sachan; Fabien Roger;	arxiv-cs.LG	2023-12-11
679	Gated Linear Attention Transformers with Hardware-Efficient Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work describes a hardware-efficient algorithm for linear attention that trades off memory movement against parallelizability.	Songlin Yang; Bailin Wang; Yikang Shen; Rameswar Panda; Yoon Kim;	arxiv-cs.LG	2023-12-11
680	Generative Large Language Models Are All-purpose Text Analytics Engines: Text-to-text Learning Is All Your Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Objective To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning. Methods We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using GPT-3 architecture and trained with up to 20 billion parameters.	CHENG PENG et. al.	arxiv-cs.CL	2023-12-10
681	Image and Data Mining in Reticular Chemistry Using GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the remarkable ability of GPT-4V to navigate and obtain complex data for metal-organic frameworks, especially from graphical sources.	ZHILING ZHENG et. al.	arxiv-cs.AI	2023-12-09
682	A Review of Hybrid and Ensemble in Deep Learning for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This review presents a comprehensive exploration of hybrid and ensemble deep learning models within Natural Language Processing (NLP), shedding light on their transformative potential across diverse tasks such as Sentiment Analysis, Named Entity Recognition, Machine Translation, Question Answering, Text Classification, Generation, Speech Recognition, Summarization, and Language Modeling.	Jianguo Jia; Wen Liang; Youzhi Liang;	arxiv-cs.AI	2023-12-09
683	FP8-BERT: Post-Training Quantization for Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we empirically validate the effectiveness of FP8 as a way to do Post-Training Quantization without significant loss of accuracy, with a simple calibration and format conversion process.	Jianwei Li; Tianchi Zhang; Ian En-Hsu Yen; Dongkuan Xu;	arxiv-cs.AI	2023-12-09
684	Sim-GPT: Text Similarity Via GPT Annotated Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Due to the lack of a large collection of high-quality labeled sentence pairs with textual similarity scores, existing approaches for Semantic Textual Similarity (STS) mostly rely on unsupervised techniques or training signals that are only partially correlated with textual similarity, e.g., NLI-based datasets. To tackle this issue, in this paper, we propose the strategy of measuring text similarity via GPT annotated data (Sim-GPT for short).	SHUHE WANG et. al.	arxiv-cs.CL	2023-12-09
685	Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We, team Z-AGI Labs, conduct a comprehensive comparative analysis of hate speech classification across five distinct languages: Bengali, Assamese, Bodo, Sinhala, and Gujarati.	Nikhil Narayan; Mrutyunjay Biswal; Pramod Goyal; Abhranta Panigrahi;	arxiv-cs.CL	2023-12-09
686	GPT-4 and Safety Case Generation: An Exploratory Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While their potential is undeniable across various domains, this paper sets out on a captivating expedition to investigate their uncharted territory, the exploration of generating safety cases. In this paper, our primary objective is to delve into the existing knowledge base of GPT-4, focusing specifically on its understanding of the Goal Structuring Notation (GSN), a well-established notation allowing to visually represent safety cases.	Mithila Sivakumar; Alvine Boaye Belle; Jinjun Shan; Kimya Khakzad Shahandashti;	arxiv-cs.SE	2023-12-09
687	Illicit Darkweb Classification Via Natural-language Processing: Classifying Illicit Content of Webpages Based on Textual Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims at expanding previous works done in the context of illegal activities classification, performing three different steps.	Giuseppe Cascavilla; Gemma Catolino; Mirella Sangiovanni;	arxiv-cs.IR	2023-12-08
688	Leveraging Transformer-based Language Models to Automate Requirements Satisfaction Assessment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we leverage recent advances in natural language processing to deliver significantly more accurate results.	Amrit Poudel; Jinfeng Lin; Jane Cleland-Huang;	arxiv-cs.SE	2023-12-07
689	GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recently, GPT-4 with Vision (GPT-4V) has demonstrated remarkable visual capabilities across various tasks, but its performance in emotion recognition has not been fully evaluated. To bridge this gap, we present the quantitative evaluation results of GPT-4V on 21 benchmark datasets covering 6 tasks: visual sentiment analysis, tweet sentiment analysis, micro-expression recognition, facial emotion recognition, dynamic facial emotion recognition, and multimodal emotion recognition.	ZHENG LIAN et. al.	arxiv-cs.CV	2023-12-07
690	Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From our comprehensive evaluations, we distilled 10 effective prompt engineering techniques, each fortifying GPT-4V’s medical acumen.	PENGCHENG CHEN et. al.	arxiv-cs.CL	2023-12-07
691	User-Aware Prefix-Tuning Is A Good Learner for Personalized Image Captioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, they need to update the entire caption model parameters when meeting new samples, which is time-consuming and calculation-intensive. To address this challenge, we propose a novel personalized image captioning framework that leverages user context to consider personality factors.	Xuan Wang; Guanhong Wang; Wenhao Chai; Jiayu Zhou; Gaoang Wang;	arxiv-cs.CV	2023-12-07
692	On Sarcasm Detection with OpenAI GPT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the applications of the Generative Pretrained Transformer (GPT) models, including GPT-3, InstructGPT, GPT-3.5, and GPT-4, in detecting sarcasm in natural language.	Montgomery Gole; Williams-Paul Nwadiugwu; Andriy Miranskyy;	arxiv-cs.CL	2023-12-07
693	JAMMIN-GPT: Text-based Improvisation Using LLMs in Ableton Live Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a system that allows users of Ableton Live to create MIDI-clips by naming them with musical descriptions.	Sven Hollowell; Tashi Namgyal; Paul Marshall;	arxiv-cs.SD	2023-12-06
694	Transformer-Powered Surrogates Close The ICF Simulation-Experiment Gap with Extremely Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel transformer-powered approach for enhancing prediction accuracy in multi-modal output scenarios, where sparse experimental data is supplemented with simulation data.	MATTHEW L. OLSON et. al.	arxiv-cs.LG	2023-12-06
695	Not All Large Language Models (LLMs) Succumb to The Reversal Curse: A Comparative Study of Deductive Logical Reasoning in BERT and GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study, we examined a bidirectional LLM, BERT, and found that it is immune to the reversal curse.	Jingye Yang; Da Wu; Kai Wang;	arxiv-cs.CL	2023-12-06
696	A Text-to-Text Model for Multilingual Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the majority of these models are limited in their capabilities due to their encoder-only architecture, which restricts the number and types of labels in downstream tasks. Addressing these limitations, this study presents the first pre-trained model with encoder-decoder architecture for offensive language identification with text-to-text transformers (T5) trained on two large offensive language identification datasets; SOLID and CCTK.	Tharindu Ranasinghe; Marcos Zampieri;	arxiv-cs.CL	2023-12-06
697	DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose DocBinFormer (Document Binarization Transformer), a novel two-level vision transformer (TL-ViT) architecture based on vision transformers for effective document image binarization.	Risab Biswas; Swalpa Kumar Roy; Ning Wang; Umapada Pal; Guang-Bin Huang;	arxiv-cs.CV	2023-12-06
698	Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, it raises the concern that the current research findings only hold for GPT models but not LLM in general. In this work, we lift this pre-condition and build for the first time effective listwise rerankers without any form of dependency on GPT.	Xinyu Zhang; Sebastian Hofstätter; Patrick Lewis; Raphael Tang; Jimmy Lin;	arxiv-cs.CL	2023-12-05
699	Empathy and Distress Detection Using Ensembles of Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents our approach for the WASSA 2023 Empathy, Emotion and Personality Shared Task.	Tanmay Chavan; Kshitij Deshpande; Sheetal Sonawane;	arxiv-cs.CL	2023-12-05
700	RankZephyr: Effective and Robust Zero-Shot Listwise Reranking Is A Breeze! Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the gap between open-source and closed models persists, with reliance on proprietary, non-transparent models constraining reproducibility. Addressing this gap, we introduce RankZephyr, a state-of-the-art, open-source LLM for listwise zero-shot reranking.	Ronak Pradeep; Sahel Sharifymoghaddam; Jimmy Lin;	arxiv-cs.IR	2023-12-05
701	A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyzed the capability of GPT-4 to produce multiple-choice questions (MCQs) aligned with specific learning objectives (LOs) from Python programming classes in higher education.	JACOB DOUGHTY et. al.	arxiv-cs.CY	2023-12-05
702	Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By adopting a divide-and-conquer strategy, we propose a new framework named Fine-Grained Human Motion Diffusion Model (FG-MDM) for human motion generation.	Xu Shi; Chuanchen Luo; Junran Peng; Hongwen Zhang; Yunlian Sun;	arxiv-cs.CV	2023-12-05
703	Jellyfish: A Large Language Model for Data Preprocessing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Whereas the use of LLMs has sparked interest in devising universal solutions to DP, recent initiatives in this domain typically rely on GPT APIs, raising inevitable data breach concerns. Unlike these approaches, we consider instruction-tuning local LLMs (7 – 13B models) as universal DP ask solver.	Haochen Zhang; Yuyang Dong; Chuan Xiao; Masafumi Oyamada;	arxiv-cs.AI	2023-12-04
704	MKA: A Scalable Medical Knowledge Assisted Mechanism for Generative Models on Medical Conversation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address the limitation, a scalable Medical Knowledge Assisted mechanism, MKA, is proposed in this paper.	Ke Liang; Sifan Wu; Jiayi Gu;	arxiv-cs.CL	2023-12-04
705	Explore, Select, Derive, and Recall: Augmenting LLM with Human-like Memory for Mobile Task Automation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to the inherent unreliability and high operational cost of LLMs, their practical applicability is quite limited. To address these issues, this paper introduces MobileGPT, an innovative LLM-based mobile task automator equipped with a human-like app memory.	SUNJAE LEE et. al.	arxiv-cs.HC	2023-12-04
706	Automatic Report Generation for Histopathology Images Using Pre-trained Vision Transformers and BERT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we show that using an existing pre-trained Vision Transformer (ViT) to encode 4096×4096 sized patches of the Whole Slide Image (WSI) and a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model for language modeling-based decoder for report generation, we can build a performant and portable report generation mechanism that takes into account the whole high resolution image.	Saurav Sengupta; Donald E. Brown;	arxiv-cs.CV	2023-12-03
707	On Significance of Subword Tokenization for Low Resource and Efficient Named Entity Recognition: A Case Study in Marathi Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on NER for low-resource language and present our case study in the context of the Indian language Marathi.	HARSH CHAUDHARI et. al.	arxiv-cs.CL	2023-12-03
708	Harnessing The Power of Prompt-based Techniques for Generating School-Level Questions Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions.	Subhankar Maity; Aniket Deroy; Sudeshna Sarkar;	arxiv-cs.CL	2023-12-02
709	Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning – offering an automated and novel approach to deployable drone swarm choreography.	AORAN JIAO et. al.	arxiv-cs.RO	2023-12-02
710	From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the integration of Large Language Models (LLMs)–like GPT-4–with human expertise to enhance text analysis of stakeholder interviews regarding K-12 education policy within one U.S. state.	Alex Liu; Min Sun;	arxiv-cs.HC	2023-12-02
711	A Ripple in Time: A Discontinuity in American History Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this note we use the State of the Union Address (SOTU) dataset from Kaggle to make some surprising (and some not so surprising) observations pertaining to the general timeline of American history, and the character and nature of the addresses themselves.	Alexander Kolpakov; Igor Rivin;	arxiv-cs.CL	2023-12-02
712	MCRformer: Morphological Constraint Reticular Transformer for 3D Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View	JUN YU LI et. al.	Expert Syst. Appl.	2023-12-01
713	QTN: Quaternion Transformer Network for Hyperspectral Image Classification Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Numerous state-of-the-art transformer-based techniques with self-attention mechanisms have recently been demonstrated to be quite effective in the classification of hyperspectral …	Xiaofei Yang; Weijia Cao; Yao Lu; Yicong Zhou;	IEEE Transactions on Circuits and Systems for Video …	2023-12-01
714	TNN-IDS: Transformer Neural Network-based Intrusion Detection System for MQTT-enabled IoT Networks Related Papers Related Patents Related Grants Related Venues Related Experts View	SAFI ULLAH et. al.	Comput. Networks	2023-12-01
715	Quick Back-Translation for Unsupervised Machine Translation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a two-for-one improvement to Transformer back-translation: Quick Back-Translation (QBT).	Benjamin Brimacombe; Jiawei Zhou;	arxiv-cs.CL	2023-12-01
716	Applying Large Language Models and Chain-of-Thought for Automatic Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigates the application of large language models (LLMs), specifically GPT-3.5 and GPT-4, with Chain-of-Though (CoT) in the automatic scoring of student-written responses to science assessments.	Gyeong-Geon Lee; Ehsan Latif; Xuansheng Wu; Ninghao Liu; Xiaoming Zhai;	arxiv-cs.CL	2023-11-30
717	CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new critique generation model called CritiqueLLM, which includes a dialogue-based prompting method for high-quality referenced / reference-free evaluation data.	PEI KE et. al.	arxiv-cs.CL	2023-11-30
718	OmniMotionGPT: Animal Motion Generation with Limited Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our paper aims to generate diverse and realistic animal motion sequences from textual descriptions, without a large-scale animal text-motion dataset.	ZHANGSIHAO YANG et. al.	arxiv-cs.CV	2023-11-30
719	MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths.	LINFENG DU et. al.	arxiv-cs.LG	2023-11-30
720	Mavericks at ArAIEval Shared Task: Towards A Safer Digital Space — Transformer Ensemble Models Tackling Deception and Persuasion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we highlight our approach for the Arabic AI Tasks Evaluation (ArAiEval) Shared Task 2023.	Sudeep Mangalvedhekar; Kshitij Deshpande; Yash Patwardhan; Vedant Deshpande; Ravindra Murumkar;	arxiv-cs.CL	2023-11-30
721	Autonomous Agents in Software Development: A Vision Paper Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have shown in our initial experimental analysis for simple software (e.g., Snake Game, Tic-Tac-Toe, Notepad) that multiple GPT agents can produce high-quality code and document it carefully.	ZEESHAN RASHEED et. al.	arxiv-cs.SE	2023-11-30
722	TransOpt: Transformer-based Representation Learning for Optimization Problem Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a representation of optimization problem instances using a transformer-based neural network architecture trained for the task of problem classification of the 24 problem classes from the Black-box Optimization Benchmarking (BBOB) benchmark.	Gjorgjina Cenikj; Gašper Petelin; Tome Eftimov;	arxiv-cs.LG	2023-11-29
723	Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This ubiquitous layer of language models is often overlooked. We find that similarities between these input embeddings are highly interpretable and that the geometry of these embeddings differs between model families.	Andrea W Wen-Yi; David Mimno;	arxiv-cs.CL	2023-11-29
724	TurkishBERTweet: Fast and Reliable Large Language Model for Social Media Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Wide us of this language on social media platforms such as Twitter, Instagram, or Tiktok and strategic position of the country in the world politics makes it appealing for the social network researchers and industry. To address this need, we introduce TurkishBERTweet, the first large scale pre-trained language model for Turkish social media built using almost 900 million tweets.	Ali Najafi; Onur Varol;	arxiv-cs.CL	2023-11-29
725	Extrapolatable Transformer Pre-training for Ultra Long Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present Timely Generative Pre-trained Transformer (TimelyGPT).	Ziyang Song; Qincheng Lu; Hao Xu; David L. Buckeridge; Yue Li;	arxiv-cs.LG	2023-11-29
726	Improving The Robustness of Transformer-based Large Language Models with Dynamic Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method called dynamic attention, tailored for the transformer architecture, to enhance the inherent robustness of the model itself against various adversarial attacks.	LUJIA SHEN et. al.	arxiv-cs.CL	2023-11-29
727	LLVMs4Protest: Harnessing The Power of Large Language and Vision Models for Deciphering Protests in The News Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: UCLA-protest project contains labeled imagery data with information such as protest, violence, and sign.	Yongjun Zhang;	arxiv-cs.CV	2023-11-29
728	Biomedical Knowledge Graph-enhanced Prompt Generation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce a task-agnostic Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging the massive biomedical KG SPOKE with LLMs such as Llama-2-13b, GPT-3.5-Turbo and GPT-4, to generate meaningful biomedical text rooted in established knowledge.	KARTHIK SOMAN et. al.	arxiv-cs.CL	2023-11-28
729	General-Purpose Vs. Domain-Adapted Large Language Models for Extraction of Structured Data from Chest Radiology Reports Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, variability in style limits usage. Study compares system using domain-adapted language model (RadLing) and general-purpose LLM (GPT-4) in extracting relevant features from chest radiology reports and standardizing them to common data elements (CDEs).	ALI H. DHANALIWALA et. al.	arxiv-cs.CL	2023-11-28
730	Scaling Political Texts with ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop and validate a new approach by positioning British party manifestos on the economic, social, and immigration policy dimensions and tweets by members of the US Congress on the left-right ideological spectrum.	Gaël Le Mens; Aina Gallego;	arxiv-cs.CL	2023-11-28
731	RoKEPG: RoBERTa and Knowledge Enhancement for Prescription Generation of Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a RoBERTa and Knowledge Enhancement model for Prescription Generation of Traditional Chinese Medicine (RoKEPG).	Hua Pu; Jiacong Mi; Shan Lu; Jieyue He;	arxiv-cs.CL	2023-11-28
732	GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper does not present a novel method.	WENHAO WU et. al.	arxiv-cs.CV	2023-11-27
733	BERT Goes Off-Topic: Investigating The Domain Transfer Challenge Using Genre Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For example, a genre classifier trained on \textit{political} topics often fails when tested on documents about \textit{sport} or \textit{medicine}. In this work, we quantify this phenomenon empirically with a large corpus and a large set of topics.	Dmitri Roussinov; Serge Sharoff;	arxiv-cs.CL	2023-11-27
734	MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain.	ZEMING CHEN et. al.	arxiv-cs.CL	2023-11-27
735	OccWorld: Learning A 3D Occupancy World Model for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes.	WENZHAO ZHENG et. al.	arxiv-cs.CV	2023-11-27
736	Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Medprompt, based on a composition of several prompting strategies.	HARSHA NORI et. al.	arxiv-cs.CL	2023-11-27
737	Comparative Analysis of ChatGPT, GPT-4, and Microsoft Bing Chatbots for GRE Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This research paper presents an analysis of how well three artificial intelligence chatbots: Bing, ChatGPT, and GPT-4, perform when answering questions from standardized tests.	Mohammad Abu-Haifa; Bara’a Etawi; Huthaifa Alkhatatbeh; Ayman Ababneh;	arxiv-cs.CL	2023-11-26
738	Machine-Generated Text Detection Using Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research focuses on the crucial challenge of discerning text produced by Large Language Models (LLMs) from human-generated text, which holds significance for various applications. With ongoing discussions about attaining a model with such functionality, we present supporting evidence regarding the feasibility of such models.	Raghav Gaggar; Ashish Bhagchandani; Harsh Oza;	arxiv-cs.CL	2023-11-26
739	GPT-4V Takes The Wheel: Promises and Challenges for Pedestrian Behavior Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate GPT-4V(ision) on publicly available pedestrian datasets: JAAD and WiDEVIEW.	Jia Huang; Peng Jiang; Alvika Gautam; Srikanth Saripalli;	arxiv-cs.CV	2023-11-24
740	LLamol: A Dynamic Multi-Conditional Generative Transformer for De Novo Molecular Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Generative models have demonstrated substantial promise in Natural Language Processing (NLP) and have found application in designing molecules, as seen in General Pretrained Transformer (GPT) models. In our efforts to develop such a tool for exploring the organic chemical space in search of potentially electro-active compounds, we present LLamol, a single novel generative transformer model based on the LLama 2 architecture, which was trained on a 13M superset of organic compounds drawn from diverse public sources.	Niklas Dobberstein; Astrid Maass; Jan Hamaekers;	arxiv-cs.LG	2023-11-24
741	GPT Struct Me: Probing GPT Models on Narrative Entity Extraction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such effectiveness raises a pertinent question: Can these models be leveraged for the extraction of structured information? In this work, we address this question by evaluating the capabilities of two state-of-the-art language models — GPT-3 and GPT-3.5, commonly known as ChatGPT — in the extraction of narrative entities, namely events, participants, and temporal expressions.	Hugo Sousa; Nuno Guimarães; Alípio Jorge; Ricardo Campos;	arxiv-cs.CL	2023-11-24
742	GeoViT: A Versatile Vision Transformer Architecture for Geospatial Image Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce GeoViT, a compact vision transformer model adept in processing satellite imagery for multimodal segmentation, classification, and regression tasks targeting CO2 and NO2 emissions.	Madhav Khirwar; Ankur Narang;	arxiv-cs.CV	2023-11-24
743	Benchmarking Large Language Models for Log Analysis, Security, and Interpretation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The best-performing fine-tuned sequence classification model (DistilRoBERTa) outperforms the current state-of-the-art; with an average F1-Score of 0.998 across six datasets from both web application and system log sources. To achieve this, we propose and implement a new experimentation pipeline (LLM4Sec) which leverages LLMs for log analysis experimentation, evaluation, and analysis.	Egil Karlsen; Xiao Luo; Nur Zincir-Heywood; Malcolm Heywood;	arxiv-cs.NI	2023-11-24
744	CMed-GPT: Prompt Tuning for Entity-Aware Chinese Medical Dialogue Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing medical dialogue models are mostly based on BERT and pre-trained on English corpora, but there is a lack of high-performing models on the task of Chinese medical dialogue generation. To solve the above problem, this paper proposes CMed-GPT, which is the GPT pre-training language model based on Chinese medical domain text.	Zhijie Qu; Juan Li; Zerui Ma; Jianqiang Li;	arxiv-cs.CL	2023-11-24
745	AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively evaluate large vision-language models in open-ended video question answering.	Xiuyuan Chen; Yuan Lin; Yuchen Zhang; Weiran Huang;	arxiv-cs.CV	2023-11-24
746	A Cross Attention Approach to Diagnostic Explainability Using Clinical Practice Guidelines for Depression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an application-specific language model called ProcesS knowledge-infused cross ATtention (PSAT), which incorporates CPGs when computing attention.	SUMIT DALAL et. al.	arxiv-cs.AI	2023-11-23
747	Evaluating GPT-4’s Vision Capabilities on Brazilian University Admission Exams Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, existing studies often overlook questions that require the integration of visual comprehension, thus compromising the full spectrum and complexity inherent in real-world scenarios. To address this gap, we present a comprehensive framework to evaluate language models on entrance exams, which incorporates both textual and visual elements.	Ramon Pires; Thales Sales Almeida; Hugo Abonizio; Rodrigo Nogueira;	arxiv-cs.CL	2023-11-23
748	Towards Explainable Strategy Templates Using NLP Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By leveraging traditional Natural Language Processing (NLP) techniques and Large Language Models (LLMs) equipped with Transformers, we outline how parts of DRL strategies composed of parts within strategy templates can be transformed into user-friendly, human-like English narratives. To achieve this, we present a top-level algorithm that involves parsing mathematical expressions of strategy templates, semantically interpreting variables and structures, generating rule-based primary explanations, and utilizing a Generative Pre-trained Transformer (GPT) model to refine and contextualize these explanations.	Pallavi Bagga; Kostas Stathis;	arxiv-cs.AI	2023-11-23
749	MLLM-Bench, Evaluating Multi-modal LLMs Using GPT-4V Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing automatic evaluation methodologies on multi-modal large language models rely on objective queries that have standard answers, inadequately addressing the nuances of creative and associative multi-modal tasks. To address this, we introduce MLLM-Bench, an innovative benchmark inspired by Vicuna, spanning a diverse array of scenarios, including Perception, Understanding, Applying, Analyzing, Evaluating, and Creation along with the ethical consideration.	WENTAO GE et. al.	arxiv-cs.CL	2023-11-23
750	Surpassing GPT-4 Medical Coding with A Two-Stage Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage.	Zhichao Yang; Sanjit Singh Batra; Joel Stremmel; Eran Halperin;	arxiv-cs.CL	2023-11-22
751	Current Topological and Machine Learning Applications for Bias Detection in Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, few previous studies investigate large language model embeddings and geometric models of biased text data to understand geometry’s impact on bias modeling accuracy. To overcome this issue, this study utilizes the RedditBias database to analyze textual biases.	COLLEEN FARRELLY et. al.	arxiv-cs.CY	2023-11-22
752	Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper employs different transformer models, and train for Named Entity Recognition (NER) in the context of Australian construction SCRM.	Milad Baghalzadeh Shishehgarkhaneh; Robert C. Moehler; Yihai Fang; Amer A. Hijazi; Hamed Aboutorab;	arxiv-cs.CL	2023-11-22
753	Comparison of Pipeline, Sequence-to-sequence, and GPT Models for End-to-end Relation Extraction: Experiments with The Rare Disease Use-case Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to compare three prevailing paradigms for E2ERE using a complex dataset focused on rare diseases involving discontinuous and nested entities.	Shashank Gupta; Xuguang Ai; Ramakanth Kavuluru;	arxiv-cs.CL	2023-11-22
754	White-Box Transformers Via Sparse Rate Reduction: Compression Is All There Is? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces.	YAODONG YU et. al.	arxiv-cs.LG	2023-11-21
755	Visual Analytics for Generative Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel visual analytical framework to support the analysis of transformer-based generative networks.	RAYMOND LI et. al.	arxiv-cs.CL	2023-11-21
756	Detecting Out-of-distribution Text Using Topological Features of Transformer-based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate our proposed TDA-based approach for out-of-distribution detection on BERT, a transformer-based language model, and compare the to a more traditional OOD approach based on BERT CLS embeddings. We found that our TDA approach outperforms the CLS embedding approach at distinguishing in-distribution data (politics and entertainment news articles from HuffPost) from far out-of-domain samples (IMDB reviews), but its effectiveness deteriorates with near out-of-domain (CNN/Dailymail) or same-domain (business news articles from HuffPost) datasets.	Andres Pollano; Anupam Chaudhuri; Anj Simmons;	arxiv-cs.CL	2023-11-21
757	InterPrompt: Interpretable Prompting for Interrelated Interpersonal Risk Factors in Reddit Posts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce an Interpretable Prompting (InterPrompt)} method to boost the attention mechanism by fine-tuning the GPT-3 model.	MSVPJ Sathvik; Surjodeep Sarkar; Chandni Saxena; Sunghwan Sohn; Muskan Garg;	arxiv-cs.CL	2023-11-21
758	NERIF: GPT-4V for Automatic Scoring of Drawn Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We randomly selected a set of balanced data (N = 900) that includes student-drawn models for six modeling assessment tasks.	Gyeong-Geon Lee; Xiaoming Zhai;	arxiv-cs.AI	2023-11-21
759	GPT4Motion: Scripting Physical Motions in Text-to-Video Generation Via Blender-Oriented GPT Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they usually encounter high computational costs and often struggle to produce videos with coherent physical motions. To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion models to enhance the quality of video synthesis.	JIAXI LV et. al.	arxiv-cs.CV	2023-11-21
760	Looped Transformers Are Better at Learning Learning Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the absence of an inherent iterative structure in the transformer architecture presents a challenge in emulating the iterative algorithms, which are commonly employed in traditional machine learning methods. To address this, we propose the utilization of looped transformer architecture and its associated training methodology, with the aim of incorporating iterative characteristics into the transformer architectures.	Liu Yang; Kangwook Lee; Robert Nowak; Dimitris Papailiopoulos;	arxiv-cs.LG	2023-11-21
761	Extracting Definienda in Mathematical Scholarly Articles with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the development of transformer-based natural language processing applications, we pose the problem as (a) a token-level classification task using fine-tuned pre-trained transformers; and (b) a question-answering task using a generalist large language model (GPT).	Shufan Jiang; Pierre Senellart;	arxiv-cs.AI	2023-11-21
762	PhayaThaiBERT: Enhancing A Pretrained Thai Language Model with Unassimilated Loanwords Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While WangchanBERTa has become the de facto standard in transformer-based Thai language modeling, it still has shortcomings in regard to the understanding of foreign words, most notably English words, which are often borrowed without orthographic assimilation into Thai in many contexts. We identify the lack of foreign vocabulary in WangchanBERTa’s tokenizer as the main source of these shortcomings.	Panyut Sriwirote; Jalinee Thapiang; Vasan Timtong; Attapol T. Rutherford;	arxiv-cs.CL	2023-11-21
763	Interpretation of The Transformer and Improvement of The Extractor Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: It has been over six years since the Transformer architecture was put forward. Surprisingly, the vanilla Transformer architecture is still widely used today. One reason is that …	Zhe Chen;	arxiv-cs.LG	2023-11-21
764	From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This gap is addressed with the release of multimodal vision language models, such as GPT-4V, enabling AI to impact many more types of tasks. In light of these advancements, this paper presents a comprehensive evaluation of GPT-4V, a vision language model, across a wide spectrum of engineering design tasks, categorized into four main areas: Conceptual Design, System-Level and Detailed Design, Manufacturing and Inspection, and Engineering Education Tasks.	CYRIL PICARD et. al.	arxiv-cs.AI	2023-11-21
765	Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer-based Large Language Models (LLMs) have been applied in diverse areas such as knowledge bases, human interfaces, and dynamic agents, and marking a stride towards achieving Artificial General Intelligence (AGI).	YUNPENG HUANG et. al.	arxiv-cs.CL	2023-11-20
766	A Novel Transformer-based Approach for Soil Temperature Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a novel approach using transformer models for the purpose of forecasting soil temperature prediction.	Muhammet Mucahit Enes Yurtsever; Ayhan Kucukmanisa; Zeynep Hilal Kilimci;	arxiv-cs.LG	2023-11-20
767	GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a pipeline that enhances a general-purpose Vision Language Model, GPT-4V(ision), by integrating observations of human actions to facilitate robotic manipulation.	Naoki Wake; Atsushi Kanehira; Kazuhiro Sasabuchi; Jun Takamatsu; Katsushi Ikeuchi;	arxiv-cs.RO	2023-11-20
768	Which AI Technique Is Better to Classify Requirements? An Experiment with SVM, LSTM, and ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, Large Language Models like ChatGPT have demonstrated remarkable proficiency in various Natural Language Processing tasks.	Abdelkarim El-Hajjami; Nicolas Fafin; Camille Salinesi;	arxiv-cs.AI	2023-11-20
769	Inspecting Explainability of Transformer Models with Additional Statistical Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Transformer becomes more popular in the vision domain in recent years so there is a need for finding an effective way to interpret the Transformer model by visualizing it.	Hoang C. Nguyen; Haeil Lee; Junmo Kim;	arxiv-cs.CV	2023-11-19
770	GPT in Data Science: A Practical Exploration of Model Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to elucidate and express the factors and assumptions guiding GPT-4’s model selection recommendations.	Nathalia Nascimento; Cristina Tavares; Paulo Alencar; Donald Cowan;	arxiv-cs.AI	2023-11-19
771	Assessing Prompt Injection Risks in 200+ Custom GPTs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Through prompt injection, an adversary can not only extract the customized system prompts but also access the uploaded files. This paper provides a first-hand analysis of the prompt injection, alongside the evaluation of the possible mitigation of such attacks.	Jiahao Yu; Yuhang Wu; Dong Shu; Mingyu Jin; Xinyu Xing;	arxiv-cs.CR	2023-11-19
772	Partially Randomizing Transformer Weights for Dialogue Response Diversity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose the \underline{Pa}rtially \underline{Ra}ndomized trans\underline{Former} (PaRaFormer), a simple extension of the transformer which involves freezing the weights of selected layers after random initialization.	Jing Yang Lee; Kong Aik Lee; Woon-Seng Gan;	arxiv-cs.CL	2023-11-17
773	Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we focus on attention heads, a major component of the Transformer architecture, and propose a bias analysis framework to explore and identify a small set of biased heads that are found to contribute to a PLM’s stereotypical bias.	Yi Yang; Hanyu Duan; Ahmed Abbasi; John P. Lalor; Kar Yan Tam;	arxiv-cs.CL	2023-11-17
774	Towards Autonomous Hypothesis Verification Via Language Models with Minimal Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Research automation efforts usually employ AI as a tool to automate specific tasks within the research process.	Shiro Takagi; Ryutaro Yamauchi; Wataru Kumagai;	arxiv-cs.AI	2023-11-16
775	GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As part of the full dataset, we introduce a new, manually curated subset StatCodeSearch that focuses on R, a popular but so far underrepresented programming language that is often used by researchers outside the field of computer science.	ANDOR DIERA et. al.	arxiv-cs.CL	2023-11-16
776	On Retrieval Augmentation and The Limitations of Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we rule out one previously posited possibility — the softmax bottleneck.	TING-RUI CHIANG et. al.	arxiv-cs.CL	2023-11-16
777	Self-Contradictory Reasoning Evaluation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate self-contradictory (Self-Contra) reasoning, where the model reasoning does not support predictions.	Ziyi Liu; Isabelle Lee; Yongkang Du; Soumya Sanyal; Jieyu Zhao;	arxiv-cs.CL	2023-11-16
778	Advancements in Generative AI: A Comprehensive Review of GANs, GPT, Autoencoders, Diffusion Model, and Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This advancement in Generative AI presents a wealth of exciting opportunities and, simultaneously, unprecedented challenges. Throughout this paper, we have explored these state-of-the-art models, the diverse array of tasks they can accomplish, the challenges they pose, and the promising future of Generative Artificial Intelligence.	STAPHORD BENGESI et. al.	arxiv-cs.LG	2023-11-16
779	Diagnosing and Debiasing Corpus-Based Political Bias and Insults in GPT2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We aim to contribute to the ongoing effort of investigating the ethical and social implications of human-AI interaction.	Ambri Ma; Arnav Kumar; Brett Zeligson;	arxiv-cs.CL	2023-11-16
780	Multi-View Spectrogram Transformer for Respiratory Sound Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a Multi-View Spectrogram Transformer (MVST) is proposed to embed different views of time-frequency characteristics into the vision transformer.	Wentao He; Yuchen Yan; Jianfeng Ren; Ruibin Bai; Xudong Jiang;	arxiv-cs.SD	2023-11-16
781	Generative AI for Hate Speech Detection: Evaluation and Findings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this chapter, we provide a review of relevant methods, experimental setups and evaluation of this approach.	Sagi Pendzel; Tomer Wullach; Amir Adler; Einat Minkov;	arxiv-cs.CL	2023-11-16
782	LOKE: Linked Open Knowledge Extraction for Automated Knowledge Graph Construction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through this analysis and a qualitative analysis of sentence extractions via all methods, we found that LOKE-GPT extractions are of high utility for the KGC task and suitable for use in semi-automated extraction settings.	Jamie McCusker;	arxiv-cs.CL	2023-11-15
783	We Demand Justice!: Towards Social Context Grounding of Political Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose two challenging datasets that require an understanding of the real-world context of the text.	Rajkumar Pujari; Chengfei Wu; Dan Goldwasser;	arxiv-cs.CL	2023-11-15
784	Jailbreaking GPT-4V Via Self-Adversarial Attacks with System Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through carefully designed dialogue, we successfully extract the internal system prompts of GPT-4V. This finding indicates potential exploitable security risks in MLLMs; 2) Based on the acquired system prompts, we propose a novel MLLM jailbreaking attack method termed SASP (Self-Adversarial Attack via System Prompt).	Yuanwei Wu; Xiang Li; Yixin Liu; Pan Zhou; Lichao Sun;	arxiv-cs.CR	2023-11-15
785	Llamas Know What GPTs Don’t Show: Surrogate Models for Confidence Estimation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: To maintain user trust, large language models (LLMs) should signal low confidence on examples where they are incorrect, instead of misleading the user. The standard approach of …	Vaishnavi Shrivastava; Percy Liang; Ananya Kumar;	arxiv-cs.CL	2023-11-15
786	Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, only proprietary models such as GPT-3.5 and GPT-4 can recognize nonsensical guidelines, which we hypothesize is due to more sophisticated alignment methods.	Marcio Fonseca; Shay B. Cohen;	arxiv-cs.CL	2023-11-15
787	Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels concerning the verifiability and factual inconsistencies found in LLM outputs.	YUXIA WANG et. al.	arxiv-cs.CL	2023-11-15
788	MELA: Multilingual Evaluation of Linguistic Acceptability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent benchmarks for Large Language Models (LLMs) have mostly focused on application-driven tasks such as complex reasoning and code generation, and this has led to a scarcity in purely linguistic evaluation of LLMs. Against this background, we introduce Multilingual Evaluation of Linguistic Acceptability — MELA, the first multilingual benchmark on linguistic acceptability with 48K samples covering 10 languages from a diverse set of language families.	ZIYIN ZHANG et. al.	arxiv-cs.CL	2023-11-15
789	Automated Title and Abstract Screening for Scoping Reviews Using The GPT-4 Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This manuscript introduces GPTscreenR, a package for the R statistical programming language that uses the GPT-4 Large Language Model (LLM) to automatically screen sources.	David Wilkins;	arxiv-cs.CL	2023-11-14
790	How Good Are Large Language Models on African Languages? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an analysis of three popular large language models (mT0, LLaMa 2, and GPT-4) on five tasks (news topic classification, sentiment classification, machine translation, question answering, and named entity recognition) across 30 African languages, spanning different language families and geographical regions.	Jessica Ojo; Kelechi Ogueji; Pontus Stenetorp; David I. Adelani;	arxiv-cs.CL	2023-11-14
791	Exploring Semi-supervised Hierarchical Stacked Encoder for Legal Judgement Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The lengthy and non-uniform document structure poses an even greater challenge in extracting information for decision prediction. In this work, we explore and propose a two-level classification mechanism; both supervised and unsupervised; by using domain-specific pre-trained BERT to extract information from long documents in terms of sentence embeddings further processing with transformer encoder layer and use unsupervised clustering to extract hidden labels from these embeddings to better predict a judgment of a legal case.	Nishchal Prasad; Mohand Boughanem; Taoufiq Dkaki;	arxiv-cs.CL	2023-11-14
792	Memory-efficient Stochastic Methods for Memory-based Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers, which are often used for long-range context problems.	Vishwajit Kumar Vishnu; C. Chandra Sekhar;	arxiv-cs.LG	2023-11-14
793	Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we explore and evaluate updating LLM used for candidate recommendation during the learning of the text based game as well to mitigate the reliance on the human annotated gameplays, which are costly to acquire.	Arjun Vaithilingam Sudhakar; Prasanna Parthasarathi; Janarthanan Rajendran; Sarath Chandar;	arxiv-cs.CL	2023-11-13
794	Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction Using Cogtale Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While some existing work focus on evaluating large language models performance on retrieving and answering questions from documents, assessing the LLMs performance on QA types that require exact answer selection from predefined options and numerical extraction is yet to be fully assessed. In this paper, we specifically focus on this underexplored context and conduct empirical analysis of LLMs (GPT-4 and GPT-3.5) on question types, including single-choice, yes-no, multiple-choice, and number extraction questions from documents in zero-shot setting.	ZAFARYAB RASOOL et. al.	arxiv-cs.IR	2023-11-13
795	A Comprehensive Evaluation of GPT-4V on Knowledge-Intensive Visual Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, the true challenge lies in the domain of knowledge-intensive VQA tasks, which necessitate not just recognition of visual elements, but also a deep comprehension of the visual information in conjunction with a vast repository of learned knowledge. To uncover such capabilities of MLMs, particularly the newly introduced GPT-4V, we provide an in-depth evaluation from three perspectives: 1) Commonsense Knowledge, which assesses how well models can understand visual cues and connect to general knowledge; 2) Fine-grained World Knowledge, which tests the model’s skill in reasoning out specific knowledge from images, showcasing their proficiency across various specialized fields; 3) Comprehensive Knowledge with Decision-making Rationales, which examines model’s capability to provide logical explanations for its inference, facilitating a deeper analysis from the interpretability perspective.	YUNXIN LI et. al.	arxiv-cs.CL	2023-11-13
796	Interaction Is All You Need? A Study of Robots Ability to Understand and Execute Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to address a critical challenge in robotics, which is enabling them to operate seamlessly in human environments through natural language interactions.	Kushal Koshti; Nidhir Bhavsar;	arxiv-cs.RO	2023-11-13
797	The Impact of Large Language Models on Scientific Discovery: A Preliminary Study Using GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this report, we delve into the performance of LLMs within the context of scientific discovery, focusing on GPT-4, the state-of-the-art language model.	Microsoft Research AI4Science; Microsoft Azure Quantum;	arxiv-cs.CL	2023-11-13
798	Speech-based Slot Filling Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the potential application of LLMs to slot filling with noisy ASR transcriptions, via both in-context learning and task-specific fine-tuning.	GUANGZHI SUN et. al.	arxiv-cs.CL	2023-11-13
799	MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several new LLMs have been introduced recently, necessitating their evaluation on non-English languages. This study aims to perform a thorough evaluation of the non-English capabilities of SoTA LLMs (GPT-3.5-Turbo, GPT-4, PaLM2, Gemini-Pro, Mistral, Llama2, and Gemma) by comparing them on the same set of multilingual datasets.	SANCHIT AHUJA et. al.	arxiv-cs.CL	2023-11-13
800	GPT-4V(ision) As A Social Media Analysis Engine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore GPT-4V(ision)’s capabilities for social multimedia analysis.	HANJIA LYU et. al.	arxiv-cs.CV	2023-11-13
801	GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present MM-Navigator, a GPT-4V-based agent for the smartphone graphical user interface (GUI) navigation task.	AN YAN et. al.	arxiv-cs.CV	2023-11-13
802	Towards Understanding The Geospatial Skills of ChatGPT: Taking A Geographic Information Systems (GIS) Exam Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This paper examines the performance of ChatGPT, a large language model (LLM), in a geographic information systems (GIS) exam. As LLMs like ChatGPT become increasingly prevalent in …	Peter Mooney; Wencong Cui; Boyuan Guan; L. Juhász;	Proceedings of the 6th ACM SIGSPATIAL International …	2023-11-13
803	LT-ViT: A Vision Transformer for Multi-label Chest X-ray Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we have developed LT-ViT, a transformer that utilizes combined attention between image tokens and randomly initialized auxiliary tokens that represent labels.	Umar Marikkar; Sara Atito; Muhammad Awais; Adam Mahdi;	arxiv-cs.CV	2023-11-13
804	Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts.	Melanie Mitchell; Alessandro B. Palmarini; Arseny Moskvichev;	arxiv-cs.AI	2023-11-13
805	SpectralGPT: Spectral Remote Sensing Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While most foundation models are tailored to effectively process RGB images for various visual tasks, there is a noticeable gap in research focused on spectral data, which offers valuable information for scene understanding, especially in remote sensing (RS) applications. To fill this gap, we created for the first time a universal RS foundation model, named SpectralGPT, which is purpose-built to handle spectral RS images using a novel 3D generative pretrained transformer (GPT).	DANFENG HONG et. al.	arxiv-cs.CV	2023-11-13
806	Retrieval and Generative Approaches for A Pregnancy Chatbot in Nepali with Stemmed and Non-Stemmed Data : A Comparative Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To provide pregnancy-related information a health domain chatbot has been proposed and this work explores two different NLP-based approaches for developing the chatbot.	Sujan Poudel; Nabin Ghimire; Bipesh Subedi; Saugat Singh;	arxiv-cs.CL	2023-11-12
807	Evaluation of GPT-4 for Chest X-ray Impression Generation: A Reader Study on Performance and Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our study we explored and analyzed the generative abilities of GPT-4 for Chest X-ray impression generation.	SEBASTIAN ZIEGELMAYER et. al.	arxiv-cs.CL	2023-11-12
808	Attention for Causal Relationship Discovery from Biological Neural Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: For transformer models trained to forecast neuronal population dynamics, we show that the cross attention module effectively captures the causal relationship among neurons, with an accuracy equal or superior to that for the most popular Granger causality analysis method.	ZIYU LU et. al.	arxiv-cs.LG	2023-11-12
809	Traffic Sign Recognition Using Local Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a new novel model that blends the advantages of both convolutional and transformer-based networks for traffic sign recognition.	Ali Farzipour; Omid Nejati Manzari; Shahriar B. Shokouhi;	arxiv-cs.CV	2023-11-11
810	NewsGPT: ChatGPT Integration for Robot-Reporter Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper a novel system is proposed that integrates AI’s generative pretrained transformer (GPT) model with the Pepper robot, with the aim of improving the robot’s natural language understanding and response generation capabilities for enhanced social interactions.	Abdelhadi Hireche; Abdelkader Nasreddine Belkacem; Sadia Jamil; Chao Chen;	arxiv-cs.RO	2023-11-11
811	Controllable Topic-Focused Abstractive Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a new Transformer-based architecture capable of producing topic-focused summaries.	Seyed Ali Bahrainian; Martin Jaggi; Carsten Eickhoff;	arxiv-cs.CL	2023-11-11
812	Heaps’ Law in GPT-Neo Large Language Model Emulated Corpora Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While this law has been validated in diverse human-authored text corpora, its applicability to large language model generated text remains unexplored. This study addresses this gap, focusing on the emulation of corpora using the suite of GPT-Neo large language models.	Uyen Lai; Gurjit S. Randhawa; Paul Sheridan;	arxiv-cs.CL	2023-11-10
813	Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to leverage a Transformer-based architecture with attention layers to automatically capture feature interactions.	HUAN GUI et. al.	arxiv-cs.IR	2023-11-10
814	ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose ChiMed-GPT, a new benchmark LLM designed explicitly for Chinese medical domain, with enlarged context length to 4,096 tokens and undergoes a comprehensive training regime with pre-training, SFT, and RLHF.	Yuanhe Tian; Ruyi Gan; Yan Song; Jiaxing Zhang; Yongdong Zhang;	arxiv-cs.CL	2023-11-10
815	Argumentation Element Annotation Modeling Using XLNet Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study demonstrates the effectiveness of XLNet, a transformer-based language model, for annotating argumentative elements in persuasive essays.	Christopher Ormerod; Amy Burkhardt; Mackenzie Young; Sue Lottridge;	arxiv-cs.CL	2023-11-10
816	Holistic Evaluation of GPT-4V for Biomedical Imaging Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we present a large-scale evaluation probing GPT-4V’s capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial …	ZHENG LIU et. al.	ArXiv	2023-11-10
817	Establishing Performance Baselines in Fine-Tuning, Retrieval-Augmented Generation and Soft-Prompting for Non-Specialist LLM Users Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we tested an unmodified version of GPT 3.5, a fine-tuned version, and the same unmodified model when given access to a vectorised RAG database, both in isolation and in combination with a basic, non-algorithmic soft prompt.	JENNIFER DODGSON et. al.	arxiv-cs.IR	2023-11-10
818	The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present an investigation into the anisotropy dynamics and intrinsic dimension of embeddings in transformer architectures, focusing on the dichotomy between encoders and decoders.	ANTON RAZZHIGAEV et. al.	arxiv-cs.CL	2023-11-10
819	Accuracy of A Vision-Language Model on Challenging Medical Cases Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Background: General-purpose large language models that utilize both text and images have not been evaluated on a diverse array of challenging medical cases.	Thomas Buckley; James A. Diao; Adam Rodman; Arjun K. Manrai;	arxiv-cs.CV	2023-11-09
820	LogShield: A Transformer-based APT Detection System Leveraging Self-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, existing state-of-the-art techniques that use system provenance graphs, lack a data processing framework generalized across datasets for optimal performance. For mitigating this limitation as well as exploring the effectiveness of transformer-based language models, this paper proposes LogShield, a framework designed to detect APT attack patterns leveraging the power of self-attention in transformers.	Sihat Afnan; Mushtari Sadia; Shahrear Iqbal; Anindya Iqbal;	arxiv-cs.CR	2023-11-09
821	On The Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore the model’s abilities to understand and reason about driving scenes, make decisions, and ultimately act in the capacity of a driver.	LICHENG WEN et. al.	arxiv-cs.CV	2023-11-09
822	Large Language Models and Prompt Engineering for Biomedical Query Focused Multi-Document Summarisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper reports on the use of prompt engineering and GPT-3.5 for biomedical query-focused multi-document summarisation.	Diego Mollá;	arxiv-cs.CL	2023-11-09
823	Deep Natural Language Feature Learning for Interpretable Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a general method to break down a main complex task into a set of intermediary easier sub-tasks, which are formulated in natural language as binary questions related to the final target task.	Felipe Urrutia; Cristian Buc; Valentin Barriere;	arxiv-cs.CL	2023-11-09
824	Future Lens: Anticipating Subsequent Tokens from A Single Hidden State Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: More concretely, in this paper we ask: Given a hidden (internal) representation of a single token at position $t$ in an input, can we reliably anticipate the tokens that will appear at positions $\geq t + 2$?	Koyena Pal; Jiuding Sun; Andrew Yuan; Byron C. Wallace; David Bau;	arxiv-cs.CL	2023-11-08
825	Massive Editing for Large Language Models Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as the least square problem, subsequently updating the LM parameters using the normal equation.	Chenmien Tan; Ge Zhang; Jie Fu;	arxiv-cs.CL	2023-11-08
826	GeoFormer: Predicting Human Mobility Using Generative Pre-trained Transformer (GPT) Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present the GeoFormer, a decoder-only transformer model adapted from the GPT architecture to forecast human mobility.	Aivin V. Solatorio;	arxiv-cs.LG	2023-11-08
827	Large GPT-like Models Are Bad Babies: A Closer Look at The Relationship Between Linguistic Competence and Psycholinguistic Measures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find a positive correlation between LM size and performance on all three challenge tasks, with different preferences for model width and depth in each of the tasks.	Julius Steuer; Marius Mosbach; Dietrich Klakow;	arxiv-cs.CL	2023-11-08
828	Evaluating Multiple Large Language Models in Pediatric Ophthalmology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: DESIGN, SETTING, AND PARTICIPANTS This survey study assessed three LLMs, namely ChatGPT (GPT-3.5), GPT-4, and PaLM2, were assessed alongside three human cohorts: medical students, postgraduate students, and attending physicians, in their ability to answer questions related to pediatric ophthalmology.	JASON HOLMES et. al.	arxiv-cs.CL	2023-11-07
829	Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our objective is to explore the potential of extending LMMs from vision and language tasks to recommendation tasks.	PEILIN ZHOU et. al.	arxiv-cs.IR	2023-11-07
830	Neuro-GPT: Towards A Foundation Model for EEG Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model.	WENHUI CUI et. al.	arxiv-cs.LG	2023-11-07
831	Modelling Sentiment Analysis: LLMs and Data Augmentation Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper provides different approaches for a binary sentiment classification on a small training dataset.	Guillem Senabre Prades;	arxiv-cs.CL	2023-11-07
832	Nexus at ArAIEval Shared Task: Fine-Tuning Arabic Language Models for Propaganda and Disinformation Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The ArAIEval shared task aims to further research on these particular issues within the context of the Arabic language. In this paper, we discuss our participation in these shared tasks.	Yunze Xiao; Firoj Alam;	arxiv-cs.CL	2023-11-06
833	Towards A Transformer-Based Reverse Dictionary Model for Quality Estimation of Definitions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we compare different transformer-based models for solving the reverse dictionary task and explore their use in the context of a serious game called The Dictionary Game.	Julien Guité-Vinet; Alexandre Blondin Massé; Fatiha Sadat;	arxiv-cs.CL	2023-11-06
834	Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate how bias transfers through an AI writing support pipeline.	THIEMO WAMBSGANSS et. al.	arxiv-cs.CL	2023-11-06
835	Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While GPT-4V(ision) impressively models both visual and textual information simultaneously, it’s hallucination behavior has not been systematically assessed. To bridge this gap, we introduce a new benchmark, namely, the Bias and Interference Challenges in Visual Language Models (Bingo).	CHENHANG CUI et. al.	arxiv-cs.LG	2023-11-06
836	A Simple Yet Efficient Ensemble Approach for AI-generated Text Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, it is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text. In this paper, we propose a simple yet efficient solution to this problem by ensembling predictions from multiple constituent LLMs.	HARIKA ABBURI et. al.	arxiv-cs.CL	2023-11-06
837	Evaluating The Potential of Leading Large Language Models in Reasoning Biology Questions Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advances in Large Language Models (LLMs) have presented new opportunities for integrating Artificial General Intelligence (AGI) into biological research and education. This …	XINYU GONG et. al.	ArXiv	2023-11-05
838	Extraction of Atypical Aspects from Customer Reviews: Datasets and Experiments with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Correspondingly, in this paper we introduce the task of detecting atypical aspects in customer reviews.	Smita Nannaware; Erfan Al-Hossami; Razvan Bunescu;	arxiv-cs.CL	2023-11-05
839	GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the potential of VQA-oriented GPT-4V in the recently popular visual Anomaly Detection (AD) and is the first to conduct qualitative and quantitative evaluations on the popular MVTec AD and VisA datasets. Considering that this task requires both image-/pixel-level evaluations, the proposed GPT-4V-AD framework contains three components: \textbf{\textit{1)}} Granular Region Division, \textbf{\textit{2)}} Prompt Designing, \textbf{\textit{3)}} Text2Segmentation for easy quantitative evaluation, and have made some different attempts for comparative analysis.	JIANGNING ZHANG et. al.	arxiv-cs.CV	2023-11-05
840	Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes The Lead Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study explores the use of GPT-4V(ision), a powerful visual-linguistic model, to address anomaly detection tasks in a generic manner.	Yunkang Cao; Xiaohao Xu; Chen Sun; Xiaonan Huang; Weiming Shen;	arxiv-cs.CV	2023-11-05
841	TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce TokenMotion (TMNet), which employs a transformer-based model to enhance VCOD by extracting motion-guided features using a learnable token selection.	ZIFAN YU et. al.	arxiv-cs.CV	2023-11-04
842	Ultra-Long Sequence Distributed Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformer with long sequences.	XIAO WANG et. al.	arxiv-cs.DC	2023-11-04
843	Rotation Invariant Transformer for Recognizing Object in UAVs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In particular, our solution wins the first place in the UAV-based person re-recognition track in the Multi-Modal Video Reasoning and Analyzing Competition held in ICCV 2021.	Shuoyi Chen; Mang Ye; Bo Du;	arxiv-cs.CV	2023-11-04
844	TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, their effectiveness in specialized domains, such as Traditional Chinese Medicine, requires comprehensive evaluation. To address the above issues, we propose a novel domain specific TCMDA (TCM Domain Adaptation) approach, efficient pre-training with domain-specific corpus.	Guoxing Yang; Jianyu Shi; Zan Wang; Xiaohong Liu; Guangyu Wang;	arxiv-cs.CL	2023-11-03
845	An Empirical Study of Benchmarking Chinese Aspect Sentiment Quad Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To expand capacity, we construct two large Chinese ASQP datasets crawled from multiple online platforms.	Junxian Zhou; Haiqin Yang; Ye Junpeng; Yuxuan He; Hao Mou;	arxiv-cs.CL	2023-11-03
846	Grounded Intuition of GPT-Vision’s Abilities with Scientific Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We use our technique to examine alt text generation for scientific figures, finding that GPT-Vision is particularly sensitive to prompting, counterfactual text in images, and relative spatial relationships.	Alyssa Hwang; Andrew Head; Chris Callison-Burch;	arxiv-cs.CL	2023-11-03
847	Not All Layers Are Equally As Important: Every Layer Counts BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel modification of the transformer architecture, tailored for the data-efficient pretraining of language models.	Lucas Georges Gabriel Charpentier; David Samuel;	arxiv-cs.CL	2023-11-03
848	Multi-scale Time-stepping of Partial Differential Equations with Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we utilize the transformer architecture, the backbone of numerous state-of-the-art AI models, to learn the dynamics of physical systems as the mixing of spatial patterns learned by a convolutional autoencoder.	AmirPouya Hemmasian; Amir Barati Farimani;	arxiv-cs.LG	2023-11-03
849	Inclusiveness Matters: A Large-Scale Analysis of User Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage user feedback from three popular online sources, Reddit, Google Play Store, and Twitter, for 50 of the most popular apps in the world to reveal the inclusiveness-related concerns from end users.	Nowshin Nawar Arony; Ze Shi Li; Bowen Xu; Daniela Damian;	arxiv-cs.SE	2023-11-02
850	GPT-4V(ision) As A Generalist Evaluator for Vision-Language Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We employ two evaluation methods, single-answer grading and pairwise comparison, using GPT-4V.	XINLU ZHANG et. al.	arxiv-cs.CV	2023-11-02
851	Copilot4D: Learning Unsupervised World Models for Autonomous Driving Via Discrete Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, we propose Copilot4D, a novel world modeling approach that first tokenizes sensor observations with VQVAE, then predicts the future via discrete diffusion.	LUNJUN ZHANG et. al.	arxiv-cs.CV	2023-11-02
852	Efficient Vision Transformer for Accurate Traffic Sign Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this task is complicated by suboptimal traffic images affected by factors such as camera movement, adverse weather conditions, and inadequate lighting. This study specifically focuses on traffic sign detection methods and introduces the application of the Transformer model, particularly the Vision Transformer variants, to tackle this task.	Javad Mirzapour Kaleybar; Hooman Khaloo; Avaz Naghipour;	arxiv-cs.CV	2023-11-02
853	Enhancing Social Network Hate Detection Using Back Translation and GPT-3 Augmentations During Training and Test-time Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View	SEFFI COHEN et. al.	Inf. Fusion	2023-11-01
854	FTransCNN: Fusing Transformer and A CNN Based on Fuzzy Logic for Uncertain Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View	WEIPING DING et. al.	Inf. Fusion	2023-11-01
855	ADCT-Net: Adaptive Traffic Forecasting Neural Network Via Dual-graphic Cross-fused Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	JIANLEI KONG et. al.	Inf. Fusion	2023-11-01
856	Causal-ViT: Robust Vision Transformer By Causal Intervention Related Papers Related Patents Related Grants Related Venues Related Experts View	Wei Li; Zhixin Li; Xiwei Yang; Huifang Ma;	Eng. Appl. Artif. Intell.	2023-11-01
857	Multi-scale Feature Fusion and Transformer Network for Urban Green Space Segmentation from High-resolution Remote Sensing Images Related Papers Related Patents Related Grants Related Venues Related Experts View	YONG CHENG et. al.	Int. J. Appl. Earth Obs. Geoinformation	2023-11-01
858	The Development of LLMs for Embodied Navigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: A comprehensive list of studies in this survey is available at https://github.com/Rongtao-Xu/Awesome-LLM-EN	JINZHOU LIN et. al.	arxiv-cs.AI	2023-11-01
859	KD-Former: Kinematic and Dynamic Coupled Transformer Network for 3D Human Motion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View	JU DAI et. al.	Pattern Recognit.	2023-11-01
860	Are Large Language Models Reliable Judges? A Study on The Factuality Evaluation Capabilities of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we delve into the potential of LLMs as reliable assessors of factual consistency in summaries generated by text-generation models.	Xue-Yong Fu; Md Tahmid Rahman Laskar; Cheng Chen; Shashi Bhushan TN;	arxiv-cs.CL	2023-11-01
861	Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To alleviate the issue, we propose two attention alignment strategies via temperature scaling.	Ta-Chung Chi; Ting-Han Fan; Alexander I. Rudnicky;	arxiv-cs.CL	2023-11-01
862	Increasing The Performance of Cognitively Inspired Data-Efficient Language Models Via Implicit Structure Building Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we describe our submission to the BabyLM Challenge 2023 shared task on data-efficient language model (LM) pretraining (Warstadt et al., 2023).	Omar Momen; David Arps; Laura Kallmeyer;	arxiv-cs.CL	2023-10-31
863	Do Large Language Models Solve Verbal Analogies Like Children Do? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates whether large language models (LLMs) solve verbal analogies in A:B::C:?	Claire E. Stevenson; Mathilde ter Veen; Rochelle Choenni; Han L. J. van der Maas; Ekaterina Shutova;	arxiv-cs.CL	2023-10-31
864	Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: But such methods are often tested for high-resource languages such as English. In this work, we investigate whether these methods can compensate for data sparseness in low-resource languages, hypothesizing that they ought to be more effective for low-resource languages.	Luke Gessler; Nathan Schneider;	arxiv-cs.CL	2023-10-31
865	A Systematic Evaluation of GPT-4V’s Multimodal Capability for Medical Image Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work conducts an evaluation of GPT-4V’s multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding.	YINGSHU LI et. al.	arxiv-cs.CV	2023-10-31
866	Is GPT Powerful Enough to Analyze The Emotions of Memes? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project aims to explore the capabilities of GPT-3.5, a leading example of LLMs, in processing the sentiment analysis of Internet memes.	Jingjing Wang; Joshua Luo; Grace Yang; Allen Hong; Feng Luo;	arxiv-cs.CL	2023-10-31
867	Does GPT-4 Pass The Turing Test? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: AI models with the ability to masquerade as humans could have widespread societal consequences, and we analyse the effectiveness of different strategies and criteria for judging humanlikeness.	Cameron Jones; Benjamin Bergen;	arxiv-cs.AI	2023-10-31
868	A Systematic Review for Transformer-based Long-term Series Forecasting Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of deep learning has yielded noteworthy advancements in time series forecasting (TSF). Transformer architectures, in particular, have witnessed broad utilization and …	LIYILEI SU et. al.	arxiv-cs.LG	2023-10-31
869	Breaking The Token Barrier: Chunking and Convolution for Efficient Long Text Classification with BERT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a relatively simple extension to vanilla BERT architecture called ChunkBERT that allows finetuning of any pretrained models to perform inference on arbitrarily long text.	Aman Jaiswal; Evangelos Milios;	arxiv-cs.CL	2023-10-31
870	Efficient Classification of Student Help Requests in Programming Courses Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The accurate classification of student help requests with respect to the type of help being sought can enable the tailoring of effective responses. Automatically classifying such requests is non-trivial, but large language models (LLMs) appear to offer an accessible, cost-effective solution.	Jaromir Savelka; Paul Denny; Mark Liffiton; Brad Sheese;	arxiv-cs.CY	2023-10-30
871	Partial Tensorized Transformers for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the effect of tensor-train decomposition to improve the accuracy and compress transformer vision-language neural networks, namely BERT and ViT.	Subhadra Vadlamannati; Ryan Solgi;	arxiv-cs.CL	2023-10-30
872	MM-VID: Advancing Video Understanding with GPT-4V(ision) Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video …	KEVIN LIN et. al.	ArXiv	2023-10-30
873	Extracting User Needs with Chat-GPT for Dialogue Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In most conventional interactive recommendation systems, the language model is used only as a dialogue model, and there is a separate recommendation system.	Yugen Sato; Taisei Nakajima; Tatsuki Kawamoto; Tomohiro Takagi;	arxiv-cs.CY	2023-10-30
874	Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a new pipeline using ChatGPT instead of human experts to generate high-quality feedback data for improving factual consistency in the clinical note summarization task.	PRAKAMYA MISHRA et. al.	arxiv-cs.CL	2023-10-30
875	MIST: Medical Image Segmentation Transformer with Convolutional Attention Mixing (CAM) Decoder Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite being successful in medical image segmentation, transformers face limitations in capturing local contexts of pixels in multimodal dimensions. We propose a Medical Image Segmentation Transformer (MIST) incorporating a novel Convolutional Attention Mixing (CAM) decoder to address this issue.	Md Motiur Rahman; Shiva Shokouhmand; Smriti Bhatt; Miad Faezipour;	arxiv-cs.CV	2023-10-30
876	Multimodal ChatGPT for Medical Applications: An Experimental Study of GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we critically evaluate the capabilities of the state-of-the-art multimodal large language model, i.e., GPT-4 with Vision (GPT-4V), on Visual Question Answering (VQA) task.	ZHILING YAN et. al.	arxiv-cs.CV	2023-10-29
877	Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To do so, we propose a novel graph pre-training auto-encoder to obtain sentence embeddings by explicitly modelling intra-sentential distinctive features and inter-sentential cohesive features through sentence-word bipartite graphs.	QIANREN MAO et. al.	arxiv-cs.CL	2023-10-29
878	From Chatbots to PhishBots? — Preventing Phishing Scams Created Using ChatGPT, Google Bard and Claude Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the potential of using four popular commercially available LLMs, i.e., ChatGPT (GPT 3.5 Turbo), GPT 4, Claude, and Bard, to generate functional phishing attacks using a series of malicious prompts.	Sayak Saha Roy; Poojitha Thota; Krishna Vamsi Naragam; Shirin Nilizadeh;	arxiv-cs.CR	2023-10-29
879	Data Ambiguity Strikes Back: How Documentation Improves GPT’s Text-to-SQL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We have identified prevalent data ambiguities of value consistency, data coverage, and data granularity that affect tasks.	Zezhou Huang; Pavan Kalyan Damalapati; Eugene Wu;	arxiv-cs.DB	2023-10-28
880	Prompt-Engineering and Transformer-based Question Generation and Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we finetuned a pretrained distilBERT model on the SQuAD question answering dataset to generate questions.	Rubaba Amyeen;	arxiv-cs.CL	2023-10-28
881	OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce OffMix-3L, a novel offensive language identification dataset containing code-mixed data from three different languages.	Dhiman Goswami; Md Nishat Raihan; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri;	arxiv-cs.CL	2023-10-27
882	OpinSummEval: Revisiting Automated Evaluation for Opinion Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present OpinSummEval, a dataset comprising human judgments and outputs from 14 opinion summarization models.	Yuchen Shen; Xiaojun Wan;	arxiv-cs.CL	2023-10-27
883	GPT-4 Vision on Medical Image Classification — A Case Study on COVID-19 Dataset Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This technical report delves into the application of GPT-4 Vision (GPT-4V) in the nuanced realm of COVID-19 image classification, leveraging the transformative potential of …	RUIBO CHEN et. al.	arxiv-cs.CV	2023-10-27
884	FP8-LM: Training FP8 Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).	HOUWEN PENG et. al.	arxiv-cs.LG	2023-10-27
885	MultiScale Spectral-Spatial Convolutional Transformer for Hyperspectral Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a multiscale spectral-spatial convolutional Transformer (MultiscaleFormer) for hyperspectral image classification.	Zhiqiang Gong; Xian Zhou; Wen Yao;	arxiv-cs.CV	2023-10-27
886	Large Language Models for Aspect-based Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) offer unprecedented text completion capabilities.	Paul F. Simmering; Paavo Huoviala;	arxiv-cs.CL	2023-10-27
887	SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce SentMix-3L, a novel dataset for sentiment analysis containing code-mixed data between three languages Bangla, English, and Hindi.	Md Nishat Raihan; Dhiman Goswami; Antara Mahmud; Antonios Anastasopoulos; Marcos Zampieri;	arxiv-cs.CL	2023-10-27
888	Harnessing GPT-3.5-turbo for Rhetorical Role Prediction in Legal Cases Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a comprehensive study of one-stage elicitation techniques for querying a large pre-trained generative transformer (GPT-3.5-turbo) in the rhetorical role prediction task of legal cases.	Anas Belfathi; Nicolas Hernandez; Laura Monceaux;	arxiv-cs.CL	2023-10-26
889	MO-YOLO: End-to-End Multiple-Object Tracking Method with YOLO and Decoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Drawing inspiration from successful models like GPT, we present MO-YOLO, an efficient and computationally frugal end-to-end MOT model.	Liao Pan; Yang Feng; Wu Di; Liu Bo; Zhang Xingle;	arxiv-cs.CV	2023-10-26
890	ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions, such as ZeroQuant, offer dynamic quantization for models like BERT and GPT but overlook crucial memory-bounded operators and the complexities of per-token quantization. Addressing these gaps, we present a novel, fully hardware-enhanced robust optimized post-training W8A8 quantization framework, ZeroQuant-HERO.	ZHEWEI YAO et. al.	arxiv-cs.LG	2023-10-26
891	LightLM: A Lightweight Deep and Narrow Language Model for Generative Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents LightLM, a lightweight Transformer-based language model for generative recommendation.	Kai Mei; Yongfeng Zhang;	arxiv-cs.IR	2023-10-26
892	An Ensemble Method Based on The Combination of Transformers with Convolutional Neural Networks to Detect Artificially Generated Text Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Thanks to the state-of-the-art Large Language Models (LLMs), language generation has reached outstanding levels. These models are capable of generating high quality content, thus …	Vijini Liyanage; Davide Buscaldi;	arxiv-cs.CL	2023-10-26
893	FedPEAT: Convergence of Federated Learning, Parameter-Efficient Fine Tuning, and Emulator Assisted Tuning for Artificial Intelligence Foundation Models with Mobile Edge Computing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Emulator-Assisted Tuning (EAT) combined with Parameter-Efficient Fine-Tuning (PEFT) to form Parameter-Efficient Emulator-Assisted Tuning (PEAT).	Terence Jie Chua; Wenhan Yu; Jun Zhao; Kwok-Yan Lam;	arxiv-cs.LG	2023-10-26
894	Can Large Language Models Replace Humans in The Systematic Review Process? Evaluating GPT-4’s Efficacy in Screening and Extracting Data from Peer-reviewed and Grey Literature in Multiple Languages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Systematic reviews are vital for guiding practice, research, and policy, yet they are often slow and labour-intensive. Large language models (LLMs) could offer a way to speed up and automate systematic reviews, but their performance in such tasks has not been comprehensively evaluated against humans, and no study has tested GPT-4, the biggest LLM so far.	Qusai Khraisha; Sophie Put; Johanna Kappenberg; Azza Warraitch; Kristin Hadfield;	arxiv-cs.CL	2023-10-26
895	BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs’ Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a computation-efficient framework that steers a frozen Pre-Trained Language Model (PTLM) towards more commonsensical generation (i.e., producing a plausible output that incorporates a list of concepts in a meaningful way).	Yufei Tian; Felix Zhang; Nanyun Peng;	arxiv-cs.CL	2023-10-25
896	Can GPT Models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study explores the capabilities of prompt-driven Large Language Models (LLMs) like ChatGPT and GPT-4 in adhering to human guidelines for dialogue summarization.	Yongxin Zhou; Fabien Ringeval; François Portet;	arxiv-cs.CL	2023-10-25
897	Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM).	YONGXIN SHI et. al.	arxiv-cs.CV	2023-10-25
898	How Well Can Machine-generated Texts Be Identified and Can Language Models Be Trained to Avoid Identification? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We refined five separate language models to generate synthetic tweets, uncovering that shallow learning classification algorithms, like Naive Bayes, achieve detection accuracy between 0.6 and 0.8.	Sinclair Schneider; Florian Steuber; Joao A. G. Schneider; Gabi Dreo Rodosek;	arxiv-cs.CL	2023-10-25
899	CLEX: Continuous Length Extrapolation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Length extrapolation methods, although theoretically capable of extending the context window beyond the training sequence length, often underperform in practical long-context applications. To address these challenges, we propose Continuous Length EXtrapolation (CLEX) for LLMs.	Guanzheng Chen; Xin Li; Zaiqiao Meng; Shangsong Liang; Lidong Bing;	arxiv-cs.CL	2023-10-25
900	Decoding Stumpers: Large Language Models Vs. Human Problem-Solvers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the problem-solving capabilities of Large Language Models (LLMs) by evaluating their performance on stumpers, unique single-step intuition problems that pose challenges for human solvers but are easily verifiable.	Alon Goldstein; Miriam Havin; Roi Reichart; Ariel Goldstein;	arxiv-cs.CL	2023-10-25
901	Divide Et Impera: Multi-Transformer Architectures for Complex NLP-Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an approach in which complex tasks are divided into simpler subtasks.	Solveig Helland; Elena Gavagnin; Alexandre de Spindler;	arxiv-cs.CL	2023-10-25
902	Data Augmentation for Emotion Detection in Small Imbalanced Text Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Certain existing datasets are small, follow different emotion taxonomies and display imbalance in their emotion distribution. In this work, we studied the impact of data augmentation techniques precisely when applied to small imbalanced datasets, for which current state-of-the-art models (such as RoBERTa) under-perform.	Anna Koufakou; Diego Grisales; Ragy Costa de jesus; Oscar Fox;	arxiv-cs.CL	2023-10-25
903	An Early Evaluation of GPT-4V(ision) IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we evaluate different abilities of GPT-4V including visual understanding, language understanding, visual puzzle solving, and understanding of other modalities such as depth, thermal, video, and audio.	YANG WU et. al.	arxiv-cs.CL	2023-10-25
904	Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering.	Tofik Ali; Partha Pratim Roy;	arxiv-cs.CV	2023-10-25
905	Stochastic Latent Transformer: Efficient Modelling of Stochastically Forced Zonal Jets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce the ‘Stochastic Latent Transformer’, a probabilistic deep learning approach for efficient reduced-order modelling of stochastic partial differential equations (SPDEs).	Ira J. S. Shokar; Rich R. Kerswell; Peter H. Haynes;	arxiv-cs.LG	2023-10-25
906	On Efficient Training Algorithms For Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit three algorithms: layer stacking, layer dropping, and selective backpropagation.	Jean Kaddour; Oscar Key; Piotr Nawrot; Pasquale Minervini; Matt Kusner;	nips	2023-10-24
907	CAPP-130 : A Dataset of Chinese Application Privacy Policy Summarization and Interpretations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, research on Chinese application privacy policy summarization is currently almost nonexistent, and there is a lack of a high-quality corpus suitable for addressing readability issues. To tackle these challenges, we introduce a fine-grained CAPP-130 corpus and a TCSI-pp framework.	JINFEI LIU et. al.	nips	2023-10-24
908	A Language Model with Limited Memory Capacity Captures Interference in Human Sentence Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the present work, we develop a recurrent neural language model with a single self-attention head, which more closely parallels the memory system assumed by cognitive theories.	William Timkey; Tal Linzen;	arxiv-cs.CL	2023-10-24
909	Mathematical Capabilities of ChatGPT IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology.	SIMON FRIEDER et. al.	nips	2023-10-24
910	From Parameter-Efficient to Memory-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first investigate what is a key factor for the success of existing PEFT methods, and realize that it’s essential to preserve the PLM’s starting point when initializing a PEFT method. With this finding, we propose memory-efficient fine-tuning (MEFT) that inserts adapters into a PLM, preserving the PLM’s starting point and making it reversible without additional pre-training.	Baohao Liao; Shaomu Tan; Christof Monz;	nips	2023-10-24
911	ZipLM: Inference-Aware Structured Pruning of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The breakthrough performance of large language models (LLMs) comes with major computational footprints and high deployment costs. In this paper, we progress towards resolving this problem by proposing a novel structured compression approach for LLMs, called ZipLM.	Eldar Kurtić; Elias Frantar; Dan Alistarh;	nips	2023-10-24
912	PointGPT: Auto-regressively Generative Pre-training from Point Clouds Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Inspired by the advancements of the GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges associated with disorder properties, low information density, and task gaps.	GUANGYAN CHEN et. al.	nips	2023-10-24
913	DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives – including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness.	BOXIN WANG et. al.	nips	2023-10-24
914	TART: A Plug-and-play Transformer Module for Task-agnostic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This raises an intriguing question: Are LLMs actually capable of learning how to reason in a task-agnostic manner? We answer this in the affirmative and, as a proof of concept, propose TART which generically improves an LLM’s reasoning abilities using a synthetically trained reasoning module.	Kush Bhatia; Avanika Narayan; Christopher De Sa; Christopher Ré;	nips	2023-10-24
915	Using GPT-4 to Augment Unbalanced Data for Automatic Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Machine learning-based automatic scoring can be challenging if students’ responses are unbalanced across scoring categories, as it introduces uncertainty in the machine training process. To meet this challenge, we introduce a novel text data augmentation framework using GPT-4, a generative large language model, specifically tailored for unbalanced datasets in automatic scoring.	Luyang Fang; Gyeong-Geon Lee; Xiaoming Zhai;	arxiv-cs.CL	2023-10-24
916	Event Stream GPT: A Data Pre-processing and Modeling Library for Generative, Pre-trained Transformers Over Continuous-time Sequences of Complex Events Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite their potential, the adoption of foundation models in these domains has been hampered by the lack of suitable tools for model construction and evaluation. To bridge this gap, we introduce Event Stream GPT (ESGPT), an open-source library designed to streamline the end-to-end process for building GPTs for continuous-time event sequences.	Matthew McDermott; Bret Nestor; Peniel Argaw; Isaac S Kohane;	nips	2023-10-24
917	Self-Refine: Iterative Refinement with Self-Feedback IF:6 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement.	AMAN MADAAN et. al.	nips	2023-10-24
918	Visual Instruction Tuning IF:7 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding.	Haotian Liu; Chunyuan Li; Qingyang Wu; Yong Jae Lee;	nips	2023-10-24
919	SwiFT: Swin 4D FMRI Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The modeling of spatiotemporal brain dynamics from high-dimensional data, such as 4D functional MRI, is a formidable task in neuroscience. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from 4D functional brain MRI data in a memory and computation-efficient manner.	PETER KIM et. al.	nips	2023-10-24
920	Unlimiformer: Long-Range Transformers with Unlimited Length Input IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single $k$-nearest-neighbor ($k$NN) index, while the returned $k$NN distances are the attention dot-product scores.	Amanda Bertsch; Uri Alon; Graham Neubig; Matthew Gormley;	nips	2023-10-24
921	Evaluating Cognitive Maps in Large Language Models: No Emergent Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we make two major contributions. First, we propose CogEval, a Cognitive Science-Inspired protocol for Measurement and Evaluation for Large Language Models. Second, we use CogEval to systematically evaluate hypothesized latent abilities, cognitive maps and planning, across a number of LLMs (OpenAI GPT-4, GPT-3.5, and davinci-003, Anthropic Claude-1, Alpaca-7B, LLaMA-7B, and Bard) using tasks with established construct validity and absent from LLM training sets.	IDA MOMENNEJAD et. al.	nips	2023-10-24
922	RapidBERT: Pretraining BERT from Scratch for $20 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce RapidBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining.	JACOB PORTES et. al.	nips	2023-10-24
923	De Novo Drug Design Using Reinforcement Learning with Multiple GPT Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Although technologies such as transformer model and reinforcement learning have been applied in drug design, their potential has not been fully realized. Therefore, we propose MolRL-MGPT, a reinforcement learning algorithm with multiple GPT agents for drug molecular generation.	Xiuyuan Hu; Hao Zhang; Yang Zhao;	nips	2023-10-24
924	Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we explore Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension.	DAN FU et. al.	nips	2023-10-24
925	NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, graph neural network (GNN) based approaches still dominate the field of learning representation for the entire network. In this paper, we revisit Transformer and compare it with GNN to analyse the different architecture characteristics of them.	Yun Yi; Haokui Zhang; Rong Xiao; Nannan Wang; Xiaoyu Wang;	nips	2023-10-24
926	The Cambridge Law Corpus: A Corpus for Legal AI Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Cambridge Law Corpus (CLC), a corpus for legal AI research.	ANDREAS ÖSTLING et. al.	nips	2023-10-24
927	Scaling Laws for Language Encoding Models in FMRI IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI.	Richard Antonello; Aditya Vaidya; Alexander Huth;	nips	2023-10-24
928	Is ChatGPT A Good Multi-Party Conversation Solver? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we delve into the potential of generative LLMs such as ChatGPT and GPT-4 within the context of MPCs.	Chao-Hong Tan; Jia-Chen Gu; Zhen-Hua Ling;	arxiv-cs.CL	2023-10-24
929	Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose GRACE, a Lifelong Model Editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs.	Thomas Hartvigsen; Swami Sankaranarayanan; Hamid Palangi; Yoon Kim; Marzyeh Ghassemi;	nips	2023-10-24
930	What Indeed Can GPT Models Do in Chemistry? A Comprehensive Benchmark on Eight Tasks IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain.	TAICHENG GUO et. al.	nips	2023-10-24
931	Elastic Decision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Elastic Decision Transformer (EDT), a significant advancement over the existing Decision Transformer (DT) and its variants.	Yueh-Hua Wu; Xiaolong Wang; Masashi Hamaya;	nips	2023-10-24
932	Spike-driven Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: (1) Event-driven, no calculation is triggered when the input of Transformer is zero; (2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; (3) Self-attention with linear complexity at both token and channel dimensions; (4) The operations between spike-form Query, Key, and Value are mask and addition.	MAN YAO et. al.	nips	2023-10-24
933	Block-State Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a hybrid layer named Block-State Transformer (BST), that internally combines an SSM sublayer for long-range contextualization, and a Block Transformer sublayer for short-term representation of sequences.	JONATHAN PILAULT et. al.	nips	2023-10-24
934	Making Scalable Meta Learning Practical Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems.	SANG CHOE et. al.	nips	2023-10-24
935	Towards Efficient Pre-Trained Language Model Via Feature Correlation Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, our analysis indicates that the different relations within self-attention, as adopted in other works, involves more computation complexities and can easily be constrained by the number of heads, potentially leading to suboptimal solutions. To address these issues, we propose a novel approach that builds relationships directly from output features.	Kun Huang; Xin Guo; Meng Wang;	nips	2023-10-24
936	Understanding Code Semantics: An Evaluation of Transformer Models in Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Ultimately, our research aims to offer valuable insights into the inner workings of transformer-based LMs, enhancing their ability to understand code and contributing to more efficient software development practices and maintenance workflows.	Debanjan Mondal; Abhilasha Lodha; Ankita Sahoo; Beena Kumari;	arxiv-cs.LG	2023-10-24
937	Transformer-based Planning for Symbolic Regression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models primarily rely on supervised pretraining goals borrowed from text generation and overlook equation-specific objectives like accuracy and complexity. To address this, we propose TPSR, a Transformer-based Planning strategy for Symbolic Regression that incorporates Monte Carlo Tree Search into the transformer decoding process.	Parshin Shojaee; Kazem Meidani; Amir Barati Farimani; Chandan Reddy;	nips	2023-10-24
938	Likelihood-Based Diffusion Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we take the first steps towards closing the perplexity gap between autoregressive and diffusion-based language models, with the goal of building and releasing a diffusion model which outperforms the smallest widely-adopted autoregressive model (GPT-2 124M).	Ishaan Gulrajani; Tatsunori Hashimoto;	nips	2023-10-24
939	H3T: Efficient Integration of Memory Optimization and Parallelism for Large-scale Transformer Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework to automatically find an efficient integration of memory optimization and parallelism for High-Throughput Transformer Training (named H3T), which is rarely considered by existing efforts for training big Transformer-based models.	YUZHONG WANG et. al.	nips	2023-10-24
940	Transformers As Statisticians: Provable In-Context Learning with In-Context Algorithm Selection IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We establish this in theory by explicit constructions, and also observe this phenomenon experimentally. In theory, we construct two general mechanisms for algorithm selection with concrete examples: (1) Pre-ICL testing, where the transformer determines the right task for the given sequence (such as choosing between regression and classification) by examining certain summary statistics of the input sequence; (2) Post-ICL validation, where the transformer selects—among multiple base ICL algorithms (such as ridge regression with multiple regularization strengths)—a near-optimal one for the given sequence using a train-validation split.	Yu Bai; Fan Chen; Huan Wang; Caiming Xiong; Song Mei;	nips	2023-10-24
941	RealTime QA: What’s The Answer Right Now? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce RealTime QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version).	JUNGO KASAI et. al.	nips	2023-10-24
942	Neural Data Transformer 2: Multi-context Pretraining for Neural Spiking Activity Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We thus develop Neural Data Transformer 2 (NDT2), a spatiotemporal Transformer for neural spiking activity, and demonstrate pretraining can leverage motor BCI datasets that span sessions, subjects, and experimental tasks.	Joel Ye; Jennifer Collinger; Leila Wehbe; Robert Gaunt;	nips	2023-10-24
943	Dissecting In-Context Learning of Translations in GPTs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations.	Vikas Raunak; Hany Hassan Awadalla; Arul Menezes;	arxiv-cs.CL	2023-10-24
944	Global-correlated 3D-decoupling Transformer for Clothed Avatar Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Current methods exhibit limitations in performance, largely attributable to their dependence on insufficient 2D image features and inconsistent query methods. Owing to this, we present the Global-correlated 3D-decoupling Transformer for clothed Avatar reconstruction (GTA), a novel transformer-based architecture that reconstructs clothed human avatars from monocular images.	Zechuan Zhang; Li Sun; Zongxin Yang; Ling Chen; Yi Yang;	nips	2023-10-24
945	LoTR: Logic-Guided Transformer Reasoner for Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present LoTR: Logic-Guided Transformer Reasoner, a novel approach for HOI detection that leverages Transformer as the reasoner to infer feasible interactions between entities.	Liulei Li; Jianan Wei; Wenguan Wang; Yi Yang;	nips	2023-10-24
946	Fast Attention Requires Bounded Entries IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether faster algorithms are possible by \emph{implicitly} making use of the matrix $A$.	Josh Alman; Zhao Song;	nips	2023-10-24
947	Geometric Transformer with Interatomic Positional Encoding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, by designing Interatomic Positional Encoding (IPE) thatparameterizes atomic environments as Transformer’s positional encodings,we propose Geoformer, a novel geometric Transformer to effectively model molecular structures for various molecular property prediction.	YUSONG WANG et. al.	nips	2023-10-24
948	Evaluating The Knowledge Base Completion Potential of GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform a careful evaluation of GPT’s potential to complete the largest public KB: Wikidata.	Blerta Veseli; Simon Razniewski; Jan-Christoph Kalo; Gerhard Weikum;	arxiv-cs.CL	2023-10-23
949	Design of A Modified Transformer Architecture Based on Relative Position Coding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View	WENFENG ZHENG et. al.	International Journal of Computational Intelligence Systems	2023-10-23
950	LINC: A Neurosymbolic Approach for Logical Reasoning By Combining Language Models with First-Order Logic Provers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we investigate the validity of instead reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation.	THEO X. OLAUSSON et. al.	arxiv-cs.CL	2023-10-23
951	3M-TRANSFORMER: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on this research, we propose a new multimodal transformer-based architecture for predicting turn-taking in embodied, synchronized multi-perspective data.	Mehdi Fatan; Emanuele Mincato; Dimitra Pintzou; Mariella Dimiccoli;	arxiv-cs.CV	2023-10-23
952	Branch-Solve-Merge Improves Large Language Model Evaluation and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance can fall short, due to the model’s lack of coherence and inability to plan and decompose the problem. We propose Branch-Solve-Merge (BSM), a Large Language Model program (Schlag et al., 2023) for tackling such challenging natural language tasks.	SWARNADEEP SAHA et. al.	arxiv-cs.CL	2023-10-23
953	Generative Pre-trained Transformer for Vietnamese Community-based COVID-19 Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel approach by conducting a comparative analysis of different Transformers vs SOTA models in the community-based COVID-19 question answering dataset.	Tam Minh Vo; Khiem Vinh Tran;	arxiv-cs.CL	2023-10-23
954	TRAMS: Training-free Memory Selection for Long-range Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a plug-and-play strategy, known as TRAining-free Memory Selection (TRAMS), that selects tokens participating in attention calculation based on one simple metric.	Haofei Yu; Cunxiang Wang; Yue Zhang; Wei Bi;	arxiv-cs.CL	2023-10-23
955	GPT-4 As An Effective Zero-Shot Evaluator for Scientific Figure Captions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions.	TING-YAO HSU et. al.	arxiv-cs.CL	2023-10-23
956	Exploring The Boundaries of GPT-4 in Radiology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models.	QIANCHU LIU et. al.	arxiv-cs.CL	2023-10-23
957	Is ChatGPT A Game Changer for Geocoding — A Benchmark for Geocoding Address Parsing Techniques Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To ensure that the evaluation more accurately mirrors performance in real-world scenarios with diverse user input qualities and resolve the pressing need for a ‘gold standard’ evaluation dataset for geocoding systems, we introduce a benchmark dataset of low-quality address descriptions synthesized based on human input patterns mining from actual input logs of a geocoding system in production.	Zhengcong Yin; Diya Li; Daniel W. Goldberg;	arxiv-cs.CL	2023-10-22
958	InstructExcel: A Benchmark for Natural Language Instruction in Excel Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To do so we introduce a new large-scale benchmark, InstructExcel, created by leveraging the ‘Automate’ feature in Excel to automatically generate OfficeScripts from users’ actions.	JUSTIN PAYAN et. al.	arxiv-cs.CL	2023-10-22
959	A Pytorch Reproduction of Masked Generative Image Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this technical report, we present a reproduction of MaskGIT: Masked Generative Image Transformer, using PyTorch.	Victor Besnier; Mickael Chen;	arxiv-cs.CV	2023-10-22
960	CXR-LLAVA: A Multimodal Large Language Model for Interpreting Chest X-ray Images Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Purpose: This study aimed to develop an open-source multimodal large language model (CXR-LLAVA) for interpreting chest X-ray images (CXRs), leveraging recent advances in large language models (LLMs) to potentially replicate the image interpretation skills of human radiologists Materials and Methods: For training, we collected 592,580 publicly available CXRs, of which 374,881 had labels for certain radiographic abnormalities (Dataset 1) and 217,699 provided free-text radiology reports (Dataset 2).	Seowoo Lee; Jiwon Youn; Hyungjin Kim; Mansu Kim; Soon Ho Yoon;	arxiv-cs.CL	2023-10-22
961	Large Language Models Are Biased to Overestimate Profoundness Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We found a significant statement-to-statement correlation between the LLMs and humans, irrespective of the type of statements and the prompting technique used.	Eugenio Herrera-Berg; Tomás Vergara Browne; Pablo León-Villagrá; Marc-Lluís Vives; Cristian Buc Calderon;	arxiv-cs.CL	2023-10-22
962	Evaluating Spatial Understanding of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In extensive error analysis, we find that LLMs’ mistakes reflect both spatial and non-spatial factors.	Yutaro Yamada; Yihan Bao; Andrew K. Lampinen; Jungo Kasai; Ilker Yildirim;	arxiv-cs.CL	2023-10-22
963	Attention-Enhancing Backdoor Attacks Against BERT-based Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we directly target the interior structure of neural networks and the backdoor mechanism.	Weimin Lyu; Songzhu Zheng; Lu Pang; Haibin Ling; Chao Chen;	arxiv-cs.LG	2023-10-22
964	Is ChatGPT A Game Changer for Geocoding – A Benchmark for Geocoding Address Parsing Techniques Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The remarkable success of GPT models across various tasks, including toponymy recognition motivates us to assess the performance of the GPT-3 model in the geocoding address …	Zhengcong Yin; Diya Li; Daniel W. Goldberg;	Proceedings of the 2nd ACM SIGSPATIAL International …	2023-10-22
965	Towards Harmful Erotic Content Detection Through Coreference-Driven Contextual Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a hybrid neural and rule-based context-aware system that leverages coreference resolution to identify harmful contextual cues in erotic content.	Inez Okulska; Emilia Wiśnios;	arxiv-cs.CL	2023-10-22
966	Prompt-based Grouping Transformer for Nucleus Detection and Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel end-to-end nuclei detection and classification framework based on a grouping transformer-based classifier.	Junjia Huang; Haofeng Li; Weijun Sun; Xiang Wan; Guanbin Li;	arxiv-cs.CV	2023-10-22
967	HateRephrase: Zero- and Few-Shot Reduction of Hate Intensity in Online Posts Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose various evaluation metrics to measure the efficacy of the generated text and ensure the generated text has reduced hate intensity without drastically changing the semantic meaning of the original text.	Vibhor Agarwal; Yu Chen; Nishanth Sastry;	arxiv-cs.CL	2023-10-21
968	Exploring Driving Behavior for Autonomous Vehicles Based on Gramian Angular Field Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents the Gramian Angular Field Vision Transformer (GAF-ViT) model, designed to analyze AV driving behavior.	JUNWEI YOU et. al.	arxiv-cs.CV	2023-10-21
969	MMTF-DES: A Fusion of Multimodal Transformer Models for Desire, Emotion, and Sentiment Analysis of Social Media Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we have proposed a unified multimodal transformer-based framework with image-text pair settings to identify human desire, sentiment, and emotion.	Abdul Aziz; Nihad Karim Chowdhury; Muhammad Ashad Kabir; Abu Nowshed Chy; Md. Jawad Siddique;	arxiv-cs.CV	2023-10-21
970	GEMBA-MQM: Detecting Translation Quality Error Spans with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces GEMBA-MQM, a GPT-based evaluation metric designed to detect translation quality errors, specifically for the quality estimation setting without the need for human reference translations.	Tom Kocmi; Christian Federmann;	arxiv-cs.CL	2023-10-21
971	The Perils & Promises of Fact-checking with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Understanding the capacities and limitations of LLMs in fact-checking tasks is therefore essential for ensuring the health of our information ecosystem. Here, we evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions.	Dorian Quelle; Alexandre Bovet;	arxiv-cs.CL	2023-10-20
972	Foundation Model’s Embedded Representations May Detect Distribution Shift Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a case study for TL on the Sentiment140 dataset and show that many pre-trained foundation models encode different representations of Sentiment140’s manually curated test set $M$ from the automatically labeled training set $P$, confirming that a distribution shift has occurred.	Max Vargas; Adam Tsou; Andrew Engel; Tony Chiang;	arxiv-cs.LG	2023-10-20
973	Hunayn: Elevating Translation Beyond The Literal Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This project introduces an advanced English-to-Arabic translator surpassing conventional tools.	Nasser Almousa; Nasser Alzamil; Abdullah Alshehri; Ahmad Sait;	arxiv-cs.CL	2023-10-20
974	AllTogether: Investigating The Efficacy of Spliced Prompt for Web Navigation Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Overall, we believe our work provides valuable insights for future research in LLM-driven web agents.	Jiarun Liu; Wentao Hu; Chunhong Zhang;	arxiv-cs.CL	2023-10-20
975	She Had Cobalt Blue Eyes: Prompt Testing to Create Aligned and Sustainable Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a test suite of unique prompts to foster the development of aligned LLMs that are fair, safe, and robust.	Veronica Chatrath; Oluwanifemi Bamgbose; Shaina Raza;	arxiv-cs.CL	2023-10-20
976	BotChat: Evaluating LLMs’ Capabilities of Having Multi-Turn Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With different evaluation protocols, we come to substantially identical conclusions.	HAODONG DUAN et. al.	arxiv-cs.CL	2023-10-20
977	Ask Language Model to Clean Your Noisy Translation Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging the capabilities of large language models (LLMs), we observe their impressive abilities in noise removal.	Quinten Bolding; Baohao Liao; Brandon James Denis; Jun Luo; Christof Monz;	arxiv-cs.CL	2023-10-20
978	Plausibility Processing in Transformer Language Models: Focusing on The Role of Attention Heads in GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The goal of this paper is to explore how Transformer language models process semantic knowledge, especially regarding the plausibility of noun-verb relations.	Soo Hyun Ryu;	arxiv-cs.CL	2023-10-20
979	FMRT: Learning Accurate Feature Matching with Reconciliatory Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we propose Feature Matching with Reconciliatory Transformer (FMRT), a novel Transformer-based detector-free method that reconciles different features with multiple receptive fields adaptively and utilizes parallel networks to realize reliable positional encoding.	XINYU ZHANG et. al.	arxiv-cs.CV	2023-10-20
980	Product Attribute Value Extraction Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose different prompt templates for instructing LLMs about the target schema of the extraction, covering both zero-shot and few-shot scenarios.	Alexander Brinkmann; Roee Shraga; Christian Bizer;	arxiv-cs.CL	2023-10-19
981	Lost in Translation: When GPT-4V(ision) Can’t See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we draw inspiration from recent investigations into multilingualism and conduct a comprehensive analysis of model’s cross-modal interactions. We introduce a systematic framework that quantifies the capability disparities between different modalities in the multi-modal setting and provide a set of datasets designed for these evaluations.	Xiang Zhang; Senyu Li; Zijun Wu; Ning Shi;	arxiv-cs.CL	2023-10-19
982	Transformer-based Entity Legal Form Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose the application of Transformer-based language models for classifying entity legal forms from raw legal entity names.	ALEXANDER ARIMOND et. al.	arxiv-cs.CL	2023-10-19
983	Experimental Narratives: A Comparison of Human Crowdsourced Storytelling and AI Storytelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper proposes a framework that combines behavioral and computational experiments employing fictional prompts as a novel tool for investigating cultural artifacts and social biases in storytelling both by humans and generative AI.	Nina Begus;	arxiv-cs.CL	2023-10-19
984	AgentTuning: Enabling Generalized Agent Abilities for LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present AgentTuning, a simple and general method to enhance the agent abilities of LLMs while maintaining their general LLM capabilities.	AOHAN ZENG et. al.	arxiv-cs.CL	2023-10-19
985	3D-GPT: Procedural 3D Modeling with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To reduce workload, we introduce 3D-GPT, a framework utilizing large language models~(LLMs) for instruction-driven 3D modeling.	CHUNYI SUN et. al.	arxiv-cs.CV	2023-10-19
986	The Locality and Symmetry of Positional Encodings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we conduct a systematic study of positional encodings in \textbf{Bidirectional Masked Language Models} (BERT-style) , which complements existing work in three aspects: (1) We uncover the core function of PEs by identifying two common properties, Locality and Symmetry; (2) We show that the two properties are closely correlated with the performances of downstream tasks; (3) We quantify the weakness of current PEs by introducing two new probing tasks, on which current PEs perform poorly.	Lihu Chen; Gaël Varoquaux; Fabian M. Suchanek;	arxiv-cs.CL	2023-10-19
987	MultiCoNER V2: A Large Multilingual Dataset for Fine-grained and Noisy Named Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present MULTICONER V2, a dataset for fine-grained Named Entity Recognition covering 33 entity classes across 12 languages, in both monolingual and multilingual settings.	Besnik Fetahu; Zhiyu Chen; Sudipta Kar; Oleg Rokhlenko; Shervin Malmasi;	arxiv-cs.CL	2023-10-19
988	Anomaly Detection of Command Shell Sessions Based on DistilBERT: Unsupervised and Supervised Approaches Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we implement a comprehensive approach to detect anomalies in Unix shell sessions using a pretrained DistilBERT model, leveraging both unsupervised and supervised learning techniques to identify anomalous activity while minimizing data labeling.	Zefang Liu; John Buford;	arxiv-cs.CL	2023-10-19
989	Identifying and Adapting Transformer-Components Responsible for Gender Bias in An English Language Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study three methods for identifying causal relations between LM components and particular output: causal mediation analysis, automated circuit discovery and our novel, efficient method called DiffMask+ based on differential masking.	Abhijith Chintam; Rahel Beloch; Willem Zuidema; Michael Hanna; Oskar van der Wal;	arxiv-cs.CL	2023-10-19
990	MedAI Dialog Corpus (MEDIC): Zero-Shot Classification of Doctor and AI Responses in Health Consultations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we examine the efficacy of zero-shot learning models in classifying healthcare consultation responses from Doctors and AI systems.	Olumide E. Ojo; Olaronke O. Adebanji; Alexander Gelbukh; Hiram Calvo; Anna Feldman;	arxiv-cs.CL	2023-10-19
991	Evaluating The Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we evaluate the ability of large language models (LLMs) to perform multiple choice symbol binding (MCSB) for multiple choice question answering (MCQA) tasks in zero-shot, one-shot, and few-shot settings.	Duc-Vu Nguyen; Quoc-Nam Nguyen;	arxiv-cs.CL	2023-10-18
992	SHARCS: Efficient Transformers Through Routing with Dynamic Width Sub-networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SHARCS for adaptive inference that takes into account the hardness of input samples.	Mohammadreza Salehi; Sachin Mehta; Aditya Kusupati; Ali Farhadi; Hannaneh Hajishirzi;	arxiv-cs.LG	2023-10-18
993	VST++: Efficient and Stronger Visual Saliency Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To mitigate the computational costs of the VST model, we propose a Select-Integrate Attention (SIA) module, partitioning foreground into fine-grained segments and aggregating background information into a single coarse-grained token.	Nian Liu; Ziyang Luo; Ni Zhang; Junwei Han;	arxiv-cs.CV	2023-10-18
994	Understanding Video Transformers for Segmentation: A Survey of Application and Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, previous reviews of interpretability methods focused on transformers for classification, while analysis of video temporal dynamics modelling capabilities of video models received less attention. In this survey, we address the above with a thorough discussion of various categories of video segmentation, a component-wise discussion of the state-of-the-art transformer-based models, and a review of related interpretability methods.	Rezaul Karim; Richard P. Wildes;	arxiv-cs.CV	2023-10-18
995	Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension: Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms, achieves high hardware efficiency on GPUs, and scales sub-quadratically.	DANIEL Y. FU et. al.	arxiv-cs.LG	2023-10-18
996	From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the inherent capabilities of transformer models in learning arithmetic algorithms, such as addition and multiplication.	Shaoxiong Duan; Yining Shi;	arxiv-cs.LG	2023-10-18
997	On The Effectiveness of Creating Conversational Agent Personalities Through Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we report on the effectiveness of our efforts to tailor the personality and conversational style of a conversational agent based on GPT-3.5 and GPT-4 through prompts.	Heng Gu; Chadha Degachi; Uğur Genç; Senthil Chandrasegaran; Himanshu Verma;	arxiv-cs.HC	2023-10-17
998	Field-testing Items Using Artificial Intelligence: Natural Language Processing with Transformers Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Five thousand variations of the RoBERTa model, an artificially intelligent transformer that can understand text language, completed an English literacy exam with 29 …	Hotaka Maeda;	arxiv-cs.CL	2023-10-17
999	Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Set-of-Mark (SoM), a new visual prompting method, to unleash the visual grounding abilities of large multimodal models (LMMs), such as GPT-4V.	JIANWEI YANG et. al.	arxiv-cs.CV	2023-10-17
1000	KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Particularly, using LLMs for complex reasoning tasks on knowledge graphs (KGs) remains largely untouched. To address this, we propose KG-GPT, a multi-purpose framework leveraging LLMs for tasks employing KGs.	Jiho Kim; Yeonsu Kwon; Yohan Jo; Edward Choi;	arxiv-cs.CL	2023-10-17
1001	Large Language Model Prediction Capabilities: Evidence from A Real-World Forecasting Tournament Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Accurately predicting the future would be an important milestone in the capabilities of artificial intelligence. However, research on the ability of large language models to …	P. Schoenegger; Peter S. Park;	ArXiv	2023-10-17
1002	Probing The Creativity of Large Language Models: Can Models Produce Divergent Semantic Association? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The present study aims to investigate the creative thinking of large language models through a cognitive perspective.	Honghua Chen; Nai Ding;	arxiv-cs.CL	2023-10-17
1003	Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We evaluate Factored Verification, a simple automated method for detecting hallucinations in abstractive summaries.	Charlie George; Andreas Stuhlmüller;	arxiv-cs.CL	2023-10-16
1004	Battle of The Large Language Models: Dolly Vs LLaMA Vs Vicuna Vs Guanaco Vs Bard Vs ChatGPT — A Text-to-SQL Parsing Comparison Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As practitioners of Text-to-SQL parsing, we are grateful for their valuable contributions to open-source research.	SHUO SUN et. al.	arxiv-cs.CL	2023-10-16
1005	Enhanced Transformer Architecture for Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a novel structure of Transformer is proposed.	Woohyeon Moon; Taeyoung Kim; Bumgeun Park; Dongsoo Har;	arxiv-cs.CL	2023-10-16
1006	Fine-tuning ChatGPT for Automatic Scoring IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we fine-tuned GPT-3.5 on six assessment tasks with a diverse dataset of middle-school and high-school student responses and expert scoring.	Ehsan Latif; Xiaoming Zhai;	arxiv-cs.CL	2023-10-16
1007	Verbosity Bias in Preference Labeling By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, recent studies are investigating the replacement of human feedback with feedback from other LLMs named Reinforcement Learning from AI Feedback (RLAIF). We examine the biases that come along with evaluating LLMs with other LLMs and take a closer look into verbosity bias — a bias where LLMs sometimes prefer more verbose answers even if they have similar qualities.	Keita Saito; Akifumi Wachi; Koki Wataoka; Youhei Akimoto;	arxiv-cs.CL	2023-10-16
1008	Prompt Packer: Deceiving LLMs Through Compositional Instruction with Hidden Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce an innovative technique for obfuscating harmful instructions: Compositional Instruction Attacks (CIA), which refers to attacking by combination and encapsulation of multiple instructions.	Shuyu Jiang; Xingshu Chen; Rui Tang;	arxiv-cs.CL	2023-10-16
1009	Learning to Rank Context for Named Entity Recognition Using A Synthetic Dataset Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Instead, we propose to generate a synthetic context retrieval training dataset using Alpaca, an instructiontuned large language model (LLM).	Arthur Amalvy; Vincent Labatut; Richard Dufour;	arxiv-cs.CL	2023-10-16
1010	Battle of The Large Language Models: Dolly Vs LLaMA Vs Vicuna Vs Guanaco Vs Bard Vs ChatGPT – A Text-to-SQL Parsing Comparison Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The success of ChatGPT has ignited an AI race, with researchers striving to develop new large language models (LLMs) that can match or surpass the language understanding and …	SHUO SUN et. al.	Conference on Empirical Methods in Natural Language …	2023-10-16
1011	Text Summarization Using Large Language Models: A Comparative Study of MPT-7b-instruct, Falcon-7b-instruct, and OpenAI Chat-GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Leveraging Large Language Models (LLMs) has shown remarkable promise in enhancing summarization techniques.	Lochan Basyal; Mihir Sanghvi;	arxiv-cs.CL	2023-10-16
1012	Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we aim to assess the performance of OpenAI’s newest model, GPT-4V(ision), specifically in the realm of multimodal medical diagnosis.	CHAOYI WU et. al.	arxiv-cs.CV	2023-10-15
1013	GPT-Prompt Controlled Diffusion for Weakly-Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the quality of pseudo labels degrades significantly when the size of available dataset is limited. Thus, in this paper, we tackle this problem from a different view by introducing a novel approach called GPT-Prompt Controlled Diffusion (GPCD) for data augmentation.	Wangyu Wu; Tianhong Dai; Xiaowei Huang; Fei Ma; Jimin Xiao;	arxiv-cs.CV	2023-10-15
1014	Legend at ArAIEval Shared Task: Persuasion Technique Detection Using A Language-Agnostic Text Representation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we share our best performing submission to the Arabic AI Tasks Evaluation Challenge (ArAIEval) at ArabicNLP 2023.	OLUMIDE E. OJO et. al.	arxiv-cs.CL	2023-10-14
1015	One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method based on Hessian sensitivity-aware mixed sparsity pruning to prune LLMs to at least 50% sparsity without the need of any retraining.	HANG SHAO et. al.	arxiv-cs.CL	2023-10-14
1016	HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a hierarchical contrastive learning framework, HiCL, which considers local segment-level and global sequence-level relationships to improve training efficiency and effectiveness.	Zhuofeng Wu; Chaowei Xiao; VG Vinod Vydiswaran;	arxiv-cs.CL	2023-10-14
1017	Geo-knowledge-guided GPT Models Improve The Extraction of Location Descriptions from Disaster-related Social Media Messages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a method that fuses geo-knowledge of location descriptions and a Generative Pre-trained Transformer (GPT) model, such as ChatGPT and GPT-4.	YINGJIE HU et. al.	arxiv-cs.CY	2023-10-13
1018	PIM-GPT: A Hybrid Process-in-Memory Accelerator for Autoregressive Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a PIM accelerator, PIM-GPT, which achieves end-to-end acceleration of GPT inference with high performance and high energy efficiency.	Yuting Wu; Ziyu Wang; Wei D. Lu;	arxiv-cs.AR	2023-10-13
1019	Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in The Fight Against Misinformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching phase of fact-checking using Large Language Models (LLMs).	Eun Cheol Choi; Emilio Ferrara;	arxiv-cs.CL	2023-10-13
1020	Table-GPT: Table-tuned GPT for Diverse Table Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a new \emph{table-tuning} paradigm, where we continue to train/fine-tune language models like GPT-3.5 and ChatGPT, using diverse table-tasks synthesized from real tables as training data, with the goal of enhancing language models’ ability to understand tables and perform table tasks.	PENG LI et. al.	arxiv-cs.CL	2023-10-13
1021	GLoRE: Evaluating Logical Reasoning of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, there has been a scarcity of attempts to assess the logical reasoning capacities of these LLMs, an essential facet of natural language understanding. To encourage further investigation in this area, we introduce GLoRE, a meticulously assembled General Logical Reasoning Evaluation benchmark comprised of 12 datasets that span three different types of tasks.	HANMENG LIU et. al.	arxiv-cs.CL	2023-10-13
1022	DDMT: Denoising Diffusion Mask Transformer Models for Multivariate Time Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to the rapid increase in data scale and dimensionality, the issues of noise and Weak Identity Mapping (WIM) during time series reconstruction have become increasingly pronounced. To address this, we introduce a novel Adaptive Dynamic Neighbor Mask (ADNM) mechanism and integrate it with the Transformer and Denoising Diffusion Model, creating a new framework for multivariate time series anomaly detection, named Denoising Diffusion Mask Transformer (DDMT).	Chaocheng Yang; Tingyin Wang; Xuanhui Yan;	arxiv-cs.LG	2023-10-12
1023	LLM-augmented Preference Learning from Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work aims to serve as a first step towards using LLMs for the CPC task.	INWON KANG et. al.	arxiv-cs.CL	2023-10-12
1024	Can GPT Models Be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on Mock CFA Exams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this perspective, we hope this work paves the way for future studies to continue enhancing LLMs for financial reasoning through rigorous evaluation.	ETHAN CALLANAN et. al.	arxiv-cs.CL	2023-10-12
1025	Prometheus: Inducing Fine-grained Evaluation Capability in Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Prometheus, a fully open-source LLM that is on par with GPT-4’s evaluation capabilities when the appropriate reference materials (reference answer, score rubric) are accompanied.	SEUNGONE KIM et. al.	arxiv-cs.CL	2023-10-12
1026	Low-Resource Clickbait Spoiling for Indonesian Via Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our contributions include the construction of manually labeled clickbait spoiling corpus in Indonesian and an evaluation on using cross-lingual zero-shot question answering-based models to tackle clikcbait spoiling for low-resource language like Indonesian.	Ni Putu Intan Maharani; Ayu Purwarianti; Alham Fikri Aji;	arxiv-cs.CL	2023-10-12
1027	Detection and Prediction of Clopidogrel Treatment Failures Using Longitudinal Structured Electronic Health Records Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose machine learning algorithms to automatically detect and predict clopidogrel treatment failure using longitudinal structured electronic health records (EHR).	Samuel Kim; In Gu Sean Lee; Mijeong Irene Ban; Jane Chiang;	arxiv-cs.LG	2023-10-12
1028	Transformer Choice Net: A Transformer Neural Network for Choice Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we develop a transformer neural network architecture, the Transformer Choice Net, that is suitable for predicting multiple choices.	Hanzhao Wang; Xiaocheng Li; Kalyan Talluri;	arxiv-cs.LG	2023-10-12
1029	Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the task of prompting large language models to specialize in specific text prediction tasks can be challenging, particularly for designers without expertise in prompt engineering. To address this, we introduce Promptor, a conversational prompt generation agent designed to engage proactively with designers.	Junxiao Shen; John J. Dudley; Jingyao Zheng; Bill Byrne; Per Ola Kristensson;	arxiv-cs.CL	2023-10-12
1030	QASiNa: Religious Domain Question Answering Using Sirah Nabawiyah Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Question Answering Sirah Nabawiyah (QASiNa) dataset, a novel dataset compiled from Sirah Nabawiyah literatures in Indonesian language.	Muhammad Razif Rizqullah; Ayu Purwarianti; Alham Fikri Aji;	arxiv-cs.CL	2023-10-12
1031	Multiclass Classification of Policy Documents with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we test the prediction performance of an alternative strategy, which requires human involvement much less than full manual coding.	Erkan Gunes; Christoffer Koch Florczak;	arxiv-cs.CL	2023-10-12
1032	Ziya-Visual: Bilingual Large Vision-Language Model Via Multi-Task Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Ziya-Visual series, a set of bilingual large-scale vision-language models (LVLMs) designed to incorporate visual semantics into LLM for multi-modal dialogue.	JUNYU LU et. al.	arxiv-cs.CL	2023-10-12
1033	Multichannel Consecutive Data Cross-extraction with 1DCNN-attention for Diagnosis of Power Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The unutilized sequential data contains the significant temporal information reflecting the transformer condition. In light of this, the structure of multichannel consecutive data cross-extraction (MCDC) is proposed in this article in order to comprehensively exploit the intrinsic characteristic and evaluate the states of transformer.	Wei Zheng; Guogang Zhang; Chenchen Zhao; Qianqian Zhu;	arxiv-cs.LG	2023-10-11
1034	InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce Retro 48B, the largest LLM pretrained with retrieval.	BOXIN WANG et. al.	arxiv-cs.CL	2023-10-11
1035	Large Language Models Are Zero-Shot Time Series Forecasters IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Developing this approach, we find that large language models (LLMs) such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks. To facilitate this performance, we propose procedures for effectively tokenizing time series data and converting discrete distributions over tokens into highly flexible densities over continuous values.	Nate Gruver; Marc Finzi; Shikai Qiu; Andrew Gordon Wilson;	arxiv-cs.LG	2023-10-11
1036	3D TransUNet: Advancing Medical Image Segmentation Through Vision Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we extend the 2D TransUNet architecture to a 3D network by building upon the state-of-the-art nnU-Net architecture, and fully exploring Transformers’ potential in both the encoder and decoder design.	JIENENG CHEN et. al.	arxiv-cs.CV	2023-10-11
1037	Ethical Reasoning Over Moral Alignment: A Case and Framework for In-Context Ethical Policies in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this position paper, we argue that instead of morally aligning LLMs to specific set of ethical principles, we should infuse generic ethical reasoning capabilities into them so that they can handle value pluralism at a global scale.	Abhinav Rao; Aditi Khandelwal; Kumar Tanmay; Utkarsh Agarwal; Monojit Choudhury;	arxiv-cs.CL	2023-10-11
1038	Prompt Engineering or Fine Tuning: An Empirical Assessment of Large Language Models in Automated Software Engineering Tasks Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we investigate the effectiveness of state-of-the-art LLM, i.e., GPT-4, with three different prompting engineering techniques (i.e., basic prompting, in-context …	JIHO SHIN et. al.	ArXiv	2023-10-11
1039	Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper meticulously examines a simple transformer trained for Othello, extending prior research to enhance comprehension of the emergent world model of Othello-GPT.	Dean S. Hazineh; Zechen Zhang; Jeffery Chiu;	arxiv-cs.LG	2023-10-11
1040	A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the potential for using these techniques to improve upon prior approaches for automated bug triaging is not well studied or understood. Therefore, in this paper we offer one of the first investigations that fine-tunes transformer-based language models for the task of bug triaging on four open source datasets, spanning a collective 53 years of development history with over 400 developers and over 150 software project components.	Atish Kumar Dipongkor; Kevin Moran;	arxiv-cs.SE	2023-10-10
1041	Large Language Models for Propaganda Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Detecting propaganda through NLP in text is challenging due to subtle manipulation techniques and contextual dependencies. To address this issue, we investigate the effectiveness of modern Large Language Models (LLMs) such as GPT-3 and GPT-4 for propaganda detection.	Kilian Sprenkamp; Daniel Gordon Jones; Liudmila Zavolokina;	arxiv-cs.CL	2023-10-10
1042	Jaeger: A Concatenation-Based Multi-Transformer VQA Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although there has been encouraging progress in document-based question answering due to the utilization of large language and open-world prior models\cite{1}, several challenges persist, including prolonged response times, extended inference durations, and imprecision in matching. In order to overcome these challenges, we propose Jaegar, a concatenation-based multi-transformer VQA model.	Jieting Long; Zewei Shi; Penghao Jiang; Yidong Gan;	arxiv-cs.CL	2023-10-10
1043	SparseCoder: Advancing Source Code Analysis with Sparse Attention and Learned Token Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The emerging Transformer-based approaches, though achieving remarkable performance, struggle with long code sequences due to their self-attention mechanism, which scales quadratically with the sequence length. This paper introduces SparseCoder, an innovative approach incorporating sparse attention and learned token pruning (LTP) method (adapted from natural language processing) to address this limitation.	Xueqi Yang; Mariusz Jakubowski; Kelly Kang; Haojie Yu; Tim Menzies;	arxiv-cs.SE	2023-10-10
1044	QualiGPT: GPT As An Easy-to-use Tool for Qualitative Coding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce QualiGPT, a specialized tool designed after considering challenges associated with ChatGPT and qualitative analysis.	He Zhang; Chuhao Wu; Jingyi Xie; ChanMin Kim; John M. Carroll;	arxiv-cs.HC	2023-10-10
1045	An Experiment on An Automated Literature Survey of Data-driven Speech Enhancement Methods Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work explores the use of a generative pre-trained transformer (GPT) model to automate a literature survey of 116 articles on data-driven speech enhancement methods.	ARTHUR DOS SANTOS et. al.	arxiv-cs.SD	2023-10-09
1046	Foundation Models Meet Visualizations: Challenges and Opportunities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conversely, within FM4VIS, we highlight how foundation models can be utilized to advance the visualization field itself.	Weikai Yang; Mengchen Liu; Zheng Wang; Shixia Liu;	arxiv-cs.LG	2023-10-09
1047	Making Scalable Meta Learning Practical Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems.	SANG KEUN CHOE et. al.	arxiv-cs.LG	2023-10-09
1048	SC-Safety: A Multi-round Open-ended Question Adversarial Safety Benchmark for Large Language Models in Chinese Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To systematically assess the safety of Chinese LLMs, we introduce SuperCLUE-Safety (SC-Safety) – a multi-round adversarial benchmark with 4912 open-ended questions covering more than 20 safety sub-dimensions.	Liang Xu; Kangkang Zhao; Lei Zhu; Hang Xue;	arxiv-cs.CL	2023-10-09
1049	ViTs Are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this survey is to show the first use of ViTs in CV.	Md Sohag Mia; Abu Bakor Hayat Arnob; Abdu Naim; Abdullah Al Bary Voban; Md Shariful Islam;	arxiv-cs.CV	2023-10-09
1050	Abstractive Summarization of Large Document Collections Using GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a method of abstractive summarization designed to scale to document collections instead of individual documents.	Sengjie Liu; Christopher G. Healey;	arxiv-cs.AI	2023-10-09
1051	A Meta-Learning Perspective on Transformers for Causal Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Transformer architecture has become prominent in developing large causal language models.	Xinbo Wu; Lav R. Varshney;	arxiv-cs.LG	2023-10-09
1052	GPT-who: An Information Density-based Machine-Generated Text Detector Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose GPT-who, the first psycholinguistically-inspired domain-agnostic statistical detector.	Saranya Venkatraman; Adaku Uchendu; Dongwon Lee;	arxiv-cs.CL	2023-10-09
1053	Exploring The Maze of Multilingual Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present a comprehensive evaluation of three popular multilingual language models: mBERT, XLM-R, and GPT-3.	Sina Bagheri Nezhad; Ameeta Agrawal;	arxiv-cs.CL	2023-10-09
1054	GPT-4 As An Agronomist Assistant? Answering Agriculture Exams Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a comprehensive evaluation of popular LLMs, such as Llama 2 and GPT, on their ability to answer agriculture-related questions.	Bruno Silva; Leonardo Nunes; Roberto Estevão; Vijay Aski; Ranveer Chandra;	arxiv-cs.AI	2023-10-09
1055	GeoLLM: Extracting Geospatial Knowledge from Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we explore the question of whether the vast amounts of knowledge found in Internet language corpora, now compressed within large language models (LLMs), can be leveraged for geospatial prediction tasks.	ROHIN MANVI et. al.	arxiv-cs.CL	2023-10-09
1056	Quality Assurance of A GPT-based Sentiment Analysis System: Adversarial Review Data Generation and Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Results were thoroughly discussed from the perspective of AI quality assurance to present the quality analysis of an LLM model on generated adversarial textual data and the effectiveness of using SA on anomaly detection in data quality assurance.	Tinghui Ouyang; Hoang-Quoc Nguyen-Son; Huy H. Nguyen; Isao Echizen; Yoshiki Seo;	arxiv-cs.SE	2023-10-08
1057	Are Emily and Greg Still More Employable Than Lakisha and Jamal? Investigating Algorithmic Hiring Bias in The Era of ChatGPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We use contrastive input decoding on open-source LLMs to uncover potential sources of bias.	AKSHAJ KUMAR VELDANDA et. al.	arxiv-cs.CL	2023-10-08
1058	Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel framework, Tree-GPT, which incorporates Large Language Models (LLMs) into the forestry remote sensing data workflow, thereby enhancing the efficiency of data analysis.	Siqi Du; Shengjun Tang; Weixi Wang; Xiaoming Li; Renzhong Guo;	arxiv-cs.CV	2023-10-07
1059	Hybrid Recommendation System Using Graph Neural Network and BERT Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel model that utilizes a Graph Neural Network (GNN) in conjunction with sentence transformer embeddings to predict anime recommendations for different users.	Shashidhar Reddy Javaji; Krutika Sarode;	arxiv-cs.IR	2023-10-07
1060	DiffNAS: Bootstrapping Diffusion Models By Prompting for Better Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we meticulously analyze the diffusion model and engineer a base model search approach, denoted DiffNAS.	WENHAO LI et. al.	arxiv-cs.AI	2023-10-07
1061	LLM4VV: Developing LLM-Driven Testsuite for Compiler Validation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are a new and powerful tool for a wide span of applications involving natural language and demonstrate impressive code generation abilities. The goal of this work is to automatically generate tests and use these tests to validate and verify compiler implementations of a directive-based parallel programming paradigm, OpenACC.	Christian Munley; Aaron Jarmusch; Sunita Chandrasekaran;	arxiv-cs.AI	2023-10-07
1062	TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel framework, TEMPO, that can effectively learn time series representations.	DEFU CAO et. al.	arxiv-cs.LG	2023-10-07
1063	Transformer-Based Neural Surrogate for Link-Level Path Loss Prediction from Variable-Sized Maps Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a transformer-based neural network architecture that enables predicting link-level properties from maps of various dimensions and from sparse measurements.	THOMAS M. HEHN et. al.	arxiv-cs.LG	2023-10-06
1064	A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the comparative performance of these AI accelerators on large language models has not been previously studied. In this paper, we systematically study LLMs on multiple AI accelerators and GPUs and evaluate their performance characteristics for these models.	MURALI EMANI et. al.	arxiv-cs.PF	2023-10-06
1065	Copy Suppression: Comprehensively Understanding An Attention Head Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a single attention head in GPT-2 Small that has one main role across the entire training distribution.	Callum McDougall; Arthur Conmy; Cody Rushing; Thomas McGrath; Neel Nanda;	arxiv-cs.LG	2023-10-06
1066	Degradation-Aware Self-Attention Based Transformer for Blind Image Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a new degradation-aware self-attention-based Transformer model, where we incorporate contrastive learning into the Transformer network for learning the degradation representations of input images with unknown noise.	Qingguo Liu; Pan Gao; Kang Han; Ningzhong Liu; Wei Xiang;	arxiv-cs.CV	2023-10-06
1067	Coding By Design: GPT-4 Empowers Agile Model Driven Development Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Thus, we present a case-study showcasing a multi-agent simulation system of an Unmanned Vehicle Fleet.	Ahmed R. Sadik; Sebastian Brulin; Markus Olhofer;	arxiv-cs.SE	2023-10-06
1068	Analysis of The Reasoning with Redundant Information Provided Ability of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The study designed a modified version of the grade school math 8K (GSM-8K) dataset which has several variants focusing on different attributes of redundant information.	Wenbei Xie;	arxiv-cs.CL	2023-10-06
1069	Exploiting Transformer Activation Sparsity with Dynamic Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: At the same time, previous studies have revealed significant activation sparsity in these models, indicating the presence of redundant computations. In this paper, we propose Dynamic Sparsified Transformer Inference (DSTI), a method that radically reduces the inference cost of Transformer models by enforcing activation sparsity and subsequently transforming a dense model into its sparse Mixture of Experts (MoE) version.	Mikołaj Piórczyński; Filip Szatkowski; Klaudia Bałazy; Bartosz Wójcik;	arxiv-cs.LG	2023-10-06
1070	LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.	JIAMING WANG et. al.	arxiv-cs.SD	2023-10-06
1071	Self-Confirming Transformer for Locally Consistent Online Adaptation in Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes the self-confirming loss (SCL) in offline transformer training to address the online nonstationarity, which is motivated by the self-confirming equilibrium (SCE) in game theory.	Tao Li; Juan Guevara; Xinghong Xie; Quanyan Zhu;	arxiv-cs.LG	2023-10-06
1072	ALBERTA: ALgorithm-Based Error Resilience in Transformer Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel algorithm-based resilience framework called ALBERTA that allows us to perform end-to-end resilience analysis and protection of transformer-based architectures.	Haoxuan Liu; Vasu Singh; Michał Filipiuk; Siva Kumar Sastry Hari;	arxiv-cs.CR	2023-10-05
1073	Agent Instructs Large Language Models to Be General Zero-Shot Reasoners IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a method to improve the zero-shot reasoning abilities of large language models on general language understanding tasks.	Nicholas Crispino; Kyle Montgomery; Fankun Zeng; Dawn Song; Chenguang Wang;	arxiv-cs.CL	2023-10-05
1074	Reformulating Domain Adaptation of Large Language Models As Adapt-Retrieve-Revise Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a simple and effective domain adaptation framework for GPT-4 by reformulating generation as an \textbf{adapt-retrieve-revise} process.	Zhen wan; Yating Zhang; Yexiang Wang; Fei Cheng; Sadao Kurohashi;	arxiv-cs.CL	2023-10-05
1075	Quantized Transformer Language Model Implementations on Edge Devices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large-scale transformer-based models like the Bidirectional Encoder Representations from Transformers (BERT) are widely used for Natural Language Processing (NLP) applications, wherein these models are initially pre-trained with a large corpus with millions of parameters and then fine-tuned for a downstream NLP task.	MOHAMMAD WALI UR RAHMAN et. al.	arxiv-cs.CL	2023-10-05
1076	Automating Human Tutor-Style Programming Feedback: Leveraging GPT-4 Tutor Model for Hint Generation and GPT-3.5 Student Model for Hint Validation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we seek to push the limits of generative AI models toward providing high-quality programming hints and develop a novel technique, GPT4Hints-GPT3.5Val.	TUNG PHUNG et. al.	arxiv-cs.AI	2023-10-05
1077	Procedural Text Mining with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we investigate the usage of large language models (LLMs) in both zero-shot and in-context learning settings to tackle the problem of extracting procedures from unstructured PDF text in an incremental question-answering fashion.	Anisa Rula; Jennifer D’Souza;	arxiv-cs.CL	2023-10-05
1078	A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) are a special class of pretrained language models obtained by scaling model size, pretraining corpus and computation. LLMs, because of their large …	Katikapalli Subramanyam Kalyan;	ArXiv	2023-10-04
1079	How FaR Are Large Language Models From Agents with Theory-of-Mind? IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Thinking is for Doing.Humans can infer other people’s mental states from observations–an ability called Theory-of-Mind (ToM)–and subsequently act pragmatically on those …	PEI ZHOU et. al.	ArXiv	2023-10-04
1080	Can Language Models Employ The Socratic Method? Experiments with Code Debugging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a manually created dataset of multi-turn Socratic advice that is aimed at helping a novice programmer fix buggy solutions to simple computational problems.	Erfan Al-Hossami; Razvan Bunescu; Justin Smith; Ryan Teehan;	arxiv-cs.CL	2023-10-04
1081	Molecule Design By Latent Prompt Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by an existing software.	Deqian Kong; Yuhao Huang; Jianwen Xie; Ying Nian Wu;	arxiv-cs.LG	2023-10-04
1082	Can Large Language Models Be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large language models (LLMs) have achieved remarkable success across a wide spectrum of tasks; however, they still face limitations in scenarios that demand long-term planning and spatial reasoning. To facilitate this line of research, in this work, we propose a new benchmark, termed $\textbf{P}$ath $\textbf{P}$lanning from $\textbf{N}$atural $\textbf{L}$anguage ($\textbf{PPNL}$).	Mohamed Aghzal; Erion Plaku; Ziyu Yao;	arxiv-cs.CL	2023-10-04
1083	Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we are motivated to study building an LLM cascade to save the cost of using LLMs, particularly for performing reasoning (e.g., mathematical, causal) tasks.	Murong Yue; Jie Zhao; Min Zhang; Liang Du; Ziyu Yao;	arxiv-cs.CL	2023-10-04
1084	Low-Resource Languages Jailbreak GPT-4 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: AI safety training and red-teaming of large language models (LLMs) are measures to mitigate the generation of unsafe content. Our work exposes the inherent cross-lingual …	Zheng-Xin Yong; Cristina Menghini; Stephen H. Bach;	arxiv-cs.CL	2023-10-03
1085	Nugget 2D: Dynamic Contextual Compression for Scaling Decoder-only Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a solution based on dynamic contextual compression, which extends the Nugget approach of Qin & Van Durme (2023) from BERT-like frameworks to decoder-only LMs.	Guanghui Qin; Corby Rosset; Ethan C. Chau; Nikhil Rao; Benjamin Van Durme;	arxiv-cs.CL	2023-10-03
1086	De Novo Drug Design with Joint Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: De novo drug design requires simultaneously generating novel molecules outside of training data and predicting their target properties, making it a hard task for generative models. To address this, we propose Joint Transformer that combines a Transformer decoder, Transformer encoder, and a predictor in a joint generative model with shared weights.	Adam Izdebski; Ewelina Weglarz-Tomczak; Ewa Szczurek; Jakub M. Tomczak;	arxiv-cs.LG	2023-10-03
1087	Automatic Pair Construction for Contrastive Post-training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an automatic way to construct contrastive data for LLM, using preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4).	CANWEN XU et. al.	arxiv-cs.CL	2023-10-03
1088	MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has not been systematically studied. To bridge this gap, we present MathVista, a benchmark designed to combine challenges from diverse mathematical and visual tasks.	PAN LU et. al.	arxiv-cs.CV	2023-10-03
1089	Improving Drumming Robot Via Attention Transformer Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on the topic of drumming robots in entertainment.	Yang Yi; Zonghan Li;	arxiv-cs.RO	2023-10-03
1090	Can Large Language Models Provide Useful Feedback on Research Papers? A Large-scale Empirical Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the utility of LLM-generated feedback has not been systematically studied. To address this gap, we created an automated pipeline using GPT-4 to provide comments on the full PDFs of scientific papers.	WEIXIN LIANG et. al.	arxiv-cs.LG	2023-10-03
1091	HPC-GPT: Integrating Large Language Model for High-Performance Computing Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs), including the LLaMA model, have exhibited their efficacy across various general-domain natural language processing (NLP) tasks. However, their …	XIANZHONG DING et. al.	Proceedings of the SC ’23 Workshops of The International …	2023-10-03
1092	ResidualTransformer: Residual Low-Rank Learning with Weight-Sharing for Transformer Layers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to reduce model size by reparameterizing model weights across Transformer encoder layers and assuming a special weight composition and structure.	Yiming Wang; Jinyu Li;	arxiv-cs.CL	2023-10-03
1093	Instances Need More Care: Rewriting Prompts for Instances with LLMs in The Loop Yields Better Zero-Shot Performance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study introduces PRomPTed, an approach that optimizes the zero-shot prompts for individual task instances following an innovative manner of “LLMs in the loop”.	Saurabh Srivastava; Chengyue Huang; Weiguo Fan; Ziyu Yao;	arxiv-cs.CL	2023-10-03
1094	Who Is ChatGPT? Benchmarking LLMs’ Psychological Portrayal Using PsychoBench Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a framework, PsychoBench, for evaluating diverse psychological aspects of LLMs.	JEN-TSE HUANG et. al.	arxiv-cs.CL	2023-10-02
1095	Label Supervised LLaMA Finetuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a label-supervised adaptation for LLMs, which aims to finetuning the model with discriminant labels.	ZONGXI LI et. al.	arxiv-cs.CL	2023-10-02
1096	Can GPT-4 Replicate Empirical Software Engineering Research? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine LLMs’ abilities to perform replications of empirical software engineering research on new data.	JENNY T. LIANG et. al.	arxiv-cs.SE	2023-10-02
1097	GPT-Driver: Learning to Drive with GPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a simple yet effective approach that can transform the OpenAI GPT-3.5 model into a reliable motion planner for autonomous vehicles.	Jiageng Mao; Yuxi Qian; Junjie Ye; Hang Zhao; Yue Wang;	arxiv-cs.CV	2023-10-02
1098	Linear Attention Is (maybe) All You Need (to Understand Transformer Optimization) IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Consequently, the results obtained in this paper suggest that a simple linearized Transformer model could actually be a valuable, realistic abstraction for understanding Transformer optimization.	KWANGJUN AHN et. al.	arxiv-cs.LG	2023-10-02
1099	Efficient Remote Sensing Segmentation With Generative Adversarial Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most deep learning methods that achieve high segmentation accuracy require deep network architectures that are too heavy and complex to run on embedded devices with limited storage and memory space. To address this issue, this paper proposes an efficient Generative Adversarial Transfomer (GATrans) for achieving high-precision semantic segmentation while maintaining an extremely efficient size.	Luyi Qiu; Dayu Yu; Xiaofeng Zhang; Chenxiao Zhang;	arxiv-cs.CV	2023-10-02
1100	PolySketchFormer: Fast Transformers Via Sketching Polynomial Kernels Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent theoretical results indicate the intractability of sub-quadratic softmax attention approximation under reasonable complexity assumptions. This paper addresses this challenge by first demonstrating that polynomial attention with high degree can effectively replace softmax without sacrificing model quality.	Praneeth Kacham; Vahab Mirrokni; Peilin Zhong;	arxiv-cs.LG	2023-10-02
1101	ITran: A Novel Transformer-based Approach for Industrial Anomaly Detection and Localization Related Papers Related Patents Related Grants Related Venues Related Experts View	Xiangyu Cai; Ruliang Xiao; Zhixia Zeng; Ping Gong; Youcong Ni;	Eng. Appl. Artif. Intell.	2023-10-01
1102	Robust Sentiment Analysis for Low Resource Languages Using Data Augmentation Approaches: A Case Study in Marathi Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present an exhaustive study of data augmentation approaches for the low-resource Indic language Marathi.	AABHA PINGLE et. al.	arxiv-cs.CL	2023-10-01
1103	Natural Language Models for Data Visualization Utilizing NvBench Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Contributing to the progress in this area of research, we built natural language translation models to construct simplified versions of data and visualization queries in a language called Vega Zero. In this paper, we explore the design and performance of these sequence to sequence transformer based machine learning model architectures using large language models such as BERT as encoders to predict visualization commands from natural language queries, as well as apply available T5 sequence to sequence models to the problem for comparison.	Shuo Wang; Carlos Crespo-Quinones;	arxiv-cs.CL	2023-10-01
1104	TRAM: Benchmarking Temporal Reasoning for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce TRAM, a temporal reasoning benchmark composed of ten datasets, encompassing various temporal aspects of events such as order, arithmetic, frequency, and duration, designed to facilitate a comprehensive evaluation of the temporal reasoning capabilities of large language models (LLMs).	Yuqing Wang; Yun Zhao;	arxiv-cs.CL	2023-10-01
1105	BooookScore: A Systematic Exploration of Book-length Summarization in The Era of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present the first study of the coherence of LLM-based book-length summarizers implemented via two prompting workflows: (1) hierarchically merging chunk-level summaries, and (2) incrementally updating a running summary.	Yapei Chang; Kyle Lo; Tanya Goyal; Mohit Iyyer;	arxiv-cs.CL	2023-10-01
1106	RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs.	ZEKUN MOORE WANG et. al.	arxiv-cs.CL	2023-10-01
1107	Feature Fusion Convolution-Aided Transformer for Automatic Modulation Recognition Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Automatic Modulation Recognition (AMR) is becoming increasingly important due to its key role in wireless communications. In order to enrich the feature information and reduce …	MUTIAN HU et. al.	IEEE Communications Letters	2023-10-01
1108	TensorFormer: A Tensor-Based Multimodal Transformer for Multimodal Sentiment Analysis and Depression Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Sentiment analysis is an important research field aiming to extract and fuse sentimental information from human utterances. Due to the diversity of human sentiment, analyzing from …	Hao Sun; Yen-Wei Chen; Lanfen Lin;	IEEE Transactions on Affective Computing	2023-10-01
1109	PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The most advanced text-to-image (T2I) models require significant training costs (e.g., millions of GPU hours), seriously hindering the fundamental innovation for the AIGC …	JUNSONG CHEN et. al.	ArXiv	2023-09-30
1110	Question-Answering Model for Schizophrenia Symptoms and Their Impact on Daily Life Using Mental Health Forums Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The purpose of this paper is to present a new methodology for building a medical dataset and obtain a QA model for analysis of symptoms and impact on daily life for a specific disease domain.	Christian Internò; Eloisa Ambrosini;	arxiv-cs.LG	2023-09-30
1111	AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Given that vector graphics are typically encoded using low-level graphics primitives, generating them directly is difficult. To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures.	Jonas Belouadi; Anne Lauscher; Steffen Eger;	arxiv-cs.CL	2023-09-30
1112	Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel gaze-driven sentence simplification system designed to enhance reading comprehension while maintaining their focus on the content.	Taichi Higasa; Keitaro Tanaka; Qi Feng; Shigeo Morishima;	arxiv-cs.CL	2023-09-30
1113	Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional language models frequently face challenges in generalizing beyond their training data and are typically designed for a single task, often focusing on bias detection at the sentence level. To address this, we present the Contextualized Bi-Directional Dual Transformer (CBDT) \textcolor{green}{\faLeaf} classifier.	SHAINA RAZA et. al.	arxiv-cs.CL	2023-09-30
1114	Investigating The Efficacy of Large Language Models in Reflective Assessment Methods Through Chain of Thoughts Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The primary aim of this research is to assess how well four language models can grade reflective essays of third-year medical students.	Baphumelele Masikisiki; Vukosi Marivate; Yvette Hlope;	arxiv-cs.CL	2023-09-30
1115	RelBERT: Embedding Relations with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: As an alternative, we propose to extract relation embeddings from relatively small language models.	Asahi Ushio; Jose Camacho-Collados; Steven Schockaert;	arxiv-cs.CL	2023-09-30
1116	Graph Neural Architecture Search with GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we integrate GPT-4 into GNAS and propose a new GPT-4 based Graph Neural Architecture Search method (GPT4GNAS for short).	HAISHUAI WANG et. al.	arxiv-cs.LG	2023-09-30
1117	PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces PIXART-$\alpha$, a Transformer-based T2I diffusion model whose image generation quality is competitive with state-of-the-art image generators (e.g., Imagen, SDXL, and even Midjourney), reaching near-commercial application standards.	JUNSONG CHEN et. al.	arxiv-cs.CV	2023-09-30
1118	Split and Merge: Aligning Position Biases in Large Language Model Based Evaluators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these LLM-based evaluators exhibit position bias, or inconsistency, when used to evaluate candidate answers in pairwise comparisons, favoring either the first or second answer regardless of content. To address this limitation, we propose PORTIA, an alignment-based system designed to mimic human comparison strategies to calibrate position bias in a lightweight yet effective manner.	ZONGJIE LI et. al.	arxiv-cs.CL	2023-09-29
1119	An Evaluation of GPT Models for Phenotype Concept Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: With the significant shift in the use of large language models (LLMs) for most NLP tasks, we examine the performance of the latest Generative Pre-trained Transformer (GPT) models underpinning ChatGPT as a foundation for the tasks of clinical phenotyping and phenotype annotation.	TUDOR GROZA et. al.	arxiv-cs.CL	2023-09-29
1120	A Large Language Model Approach to Educational Survey Feedback Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper assesses the potential for the large language models (LLMs) GPT-4 and GPT-3.5 to aid in deriving insight from education feedback surveys.	Michael J. Parker; Caitlin Anderson; Claire Stone; YeaRim Oh;	arxiv-cs.CL	2023-09-29
1121	SCALE: Synergized Collaboration of Asymmetric Language Translation Engines Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce SCALE, a collaborative framework that connects compact Specialized Translation Models (STMs) and general-purpose Large Language Models (LLMs) as one unified translation engine.	XIN CHENG et. al.	arxiv-cs.CL	2023-09-29
1122	Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper delves into the applicability of GPT-4’s learned knowledge for imperfect information games. To achieve this, we introduce \textbf{Suspicion-Agent}, an innovative agent that leverages GPT-4’s capabilities for performing in imperfect information games.	JIAXIAN GUO et. al.	arxiv-cs.AI	2023-09-29
1123	The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs.	ZHENGYUAN YANG et. al.	arxiv-cs.CV	2023-09-29
1124	Large Language Model Soft Ideologization Via AI-Self-Consciousness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the implications of GPT soft ideologization through the use of AI-self-consciousness.	Xiaotian Zhou; Qian Wang; Xiaofeng Wang; Haixu Tang; Xiaozhong Liu;	arxiv-cs.CL	2023-09-28
1125	Transformer-VQ: Linear-Time Transformers Via Vector Quantization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Transformer-VQ, a decoder-only transformer computing softmax-based dense self-attention in linear time.	Lucas D. Lingle;	arxiv-cs.LG	2023-09-28
1126	TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces an integrated framework that leverages the capabilities of the Transformer model-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions.	Yang Zhao; Jiaxi Yang; Wenbo Wang; Helin Yang; Dusit Niyato;	arxiv-cs.LG	2023-09-28
1127	GPT-Fathom: Benchmarking Large Language Models to Decipher The Evolutionary Path Towards GPT-4 and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we introduce GPT-Fathom, an open-source and reproducible LLM evaluation suite built on top of OpenAI Evals.	SHEN ZHENG et. al.	arxiv-cs.CL	2023-09-28
1128	Can Language Models Learn to Listen? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a framework for generating appropriate facial responses from a listener in dyadic social interactions based on the speaker’s words.	EVONNE NG et. al.	iccv	2023-09-27
1129	Query Refinement Transformer for 3D Instance Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides,noise background queries interfere with proper scene perception and accurate instance segmentation. To address the above issues, we propose a Query Refinement Transformer termed QueryFormer.	Jiahao Lu; Jiacheng Deng; Chuxin Wang; Jianfeng He; Tianzhu Zhang;	iccv	2023-09-27
1130	Agglomerative Transformer for Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an agglomerative Transformer (AGER) that enables Transformer-based human-object interaction (HOI) detectors to flexibly exploit extra instance-level cues in a single-stage and end-to-end manner for the first time.	Danyang Tu; Wei Sun; Guangtao Zhai; Wei Shen;	iccv	2023-09-27
1131	3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose 3D-VisTA, a pre-trained Transformer for 3D Vis ion and Text Alignment that can be easily adapted to various downstream tasks.	ZIYU ZHU et. al.	iccv	2023-09-27
1132	Foreground-Background Distribution Modeling Transformer for Visual Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the feature learning of these Transformer-based trackers is easily disturbed by complex backgrounds. To address the above limitations, we propose a novel foreground-background distribution modeling transformer for visual object tracking (F-BDMTrack), including a fore-background agent learning (FBAL) module and a distribution-aware attention (DA2) module in a unified transformer architecture.	Dawei Yang; Jianfeng He; Yinchao Ma; Qianjin Yu; Tianzhu Zhang;	iccv	2023-09-27
1133	PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Generic image captions often miss visual details essential for the LM to answer visual questions correctly. To address this challenge, we propose PromptCap (Prompt-guided image Captioning), a captioning model designed to serve as a better connector between images and black-box LMs.	YUSHI HU et. al.	iccv	2023-09-27
1134	RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers by extracting point features globally.	JIUMING LIU et. al.	iccv	2023-09-27
1135	MSRA-SR: Image Super-resolution Transformer with Multi-scale Shared Representation Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an image super-resolution Transformer with Multi-scale Shared Representation Acquisition (MSRA-SR).	XIAOQIANG ZHOU et. al.	iccv	2023-09-27
1136	ProPainter: Improving Propagation and Transformer for Video Inpainting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, memory or computational constraints limit the temporal range of feature propagation and video Transformer, preventing exploration of correspondence information from distant frames. To address these issues, we propose an improved framework, called ProPainter, which involves enhanced ProPagation and an efficient Transformer.	Shangchen Zhou; Chongyi Li; Kelvin C.K. Chan; Chen Change Loy;	iccv	2023-09-27
1137	Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by the observations, we propose two Transformer variants: i) Context-Sharing Transformer (CST) that learns the global-shared contextual information within image frames with a lightweight computation.	YICHEN YUAN et. al.	iccv	2023-09-27
1138	MMST-ViT: Climate Change-aware Crop Yield Prediction Via Multi-Modal Spatial-Temporal Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we develop a deep learning-based solution, namely Multi-Modal Spatial-Temporal Vision Transformer (MMST-ViT), for predicting crop yields at the county level across the United States, by considering the effects of short-term meteorological variations during the growing season and the long-term climate change on crops.	FUDONG LIN et. al.	iccv	2023-09-27
1139	TextManiA: Enriching Visual Feature By Text-driven Manifold Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TextManiA, a text-driven manifold augmentation method that semantically enriches visual feature spaces, regardless of class distribution.	Moon Ye-Bin; Jisoo Kim; Hongyeob Kim; Kilho Son; Tae-Hyun Oh;	iccv	2023-09-27
1140	A Transformer-based Neural ODE for Dense Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View	Seyedalireza Khoshsirat; C. Kambhamettu;	Machine Vision and Applications	2023-09-27
1141	Part-Aware Transformer for Generalizable Person Re-identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We observe that while the global images of different IDs should have different features, their similar local parts (e.g., black backpack) are not bounded by this constraint. Motivated by this, we propose a pure Transformer model (termed Part-aware Transformer) for DG-ReID by designing a proxy task, named Cross-ID Similarity Learning (CSL), to mine local visual information shared by different IDs.	Hao Ni; Yuke Li; Lianli Gao; Heng Tao Shen; Jingkuan Song;	iccv	2023-09-27
1142	ActFormer: A GAN-based Transformer Towards General Action-Conditioned 3D Human Motion Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.	LIANG XU et. al.	iccv	2023-09-27
1143	TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a set of simple yet effective TOken REduction (TORE) strategies for Transformer-based Human Mesh Recovery from monocular images.	ZHIYANG DOU et. al.	iccv	2023-09-27
1144	Feature Modulation Transformer: Cross-Refinement of Global Representation Via High-Frequency Prior for Image Super-Resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our proposed solution, the cross-refinement adaptive feature modulation transformer (CRAFT), integrates the strengths of both convolutional and transformer structures.	Ao Li; Le Zhang; Yun Liu; Ce Zhu;	iccv	2023-09-27
1145	PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we first collaborate CLIP and GPT to be a unified 3D open-world learner, named as PointCLIP V2, which fully unleashes their potential for zero-shot 3D classification, segmentation, and detection.	XIANGYANG ZHU et. al.	iccv	2023-09-27
1146	Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we propose a novel query-based 3D detector called Clusterformer, our Clusterformer regards each object as a cluster of 3D space which mainly consists of the non-empty voxels belonging to the same object, and leverages the cluster to conduct the transformer decoder to generate the proposals from the sparse voxel features directly.	YU PEI et. al.	iccv	2023-09-27
1147	Skill Transformer: A Monolithic Policy for Mobile Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Skill Transformer, an approach for solving long-horizon robotic tasks by combining conditional sequence modeling and skill modularity.	Xiaoyu Huang; Dhruv Batra; Akshara Rai; Andrew Szot;	iccv	2023-09-27
1148	Vision Transformer Adapters for Generalizable Multitask Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the first multitasking vision transformer adapters that learn generalizable task affinities which can be applied to novel tasks and domains.	Deblina Bhattacharjee; Sabine Süsstrunk; Mathieu Salzmann;	iccv	2023-09-27
1149	AE-GPT: Using Large Language Models to Extract Adverse Events from Surveillance Reports-A Use Case with Influenza Vaccine Adverse Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, Large Language Models (LLMs) have shown promise in effectively identifying and cataloging AEs within clinical reports.	Yiming Li; Jianfu Li; Jianping He; Cui Tao;	arxiv-cs.CL	2023-09-27
1150	Trajectory Unified Transformer for Pedestrian Trajectory Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present Trajectory Unified TRansformer, called TUTR, which unifies the trajectory prediction components, social interaction and multimodal trajectory prediction, into a transformer encoder-decoder architecture to effectively remove the need for post-processing.	Liushuai Shi; Le Wang; Sanping Zhou; Gang Hua;	iccv	2023-09-27
1151	TransTIC: Transferring Transformer-based Image Compression from Human Perception to Machine Perception Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a transferable Transformer-based image compression framework, termed TransTIC.	YI-HSIN CHEN et. al.	iccv	2023-09-27
1152	Scalable Diffusion Models with Transformers IF:5 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches.	William Peebles; Saining Xie;	iccv	2023-09-27
1153	Dual Aggregation Transformer for Image Super-Resolution IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR.	ZHENG CHEN et. al.	iccv	2023-09-27
1154	MB-TaylorFormer: Multi-Branch Efficient Transformer Expanded By Taylor Formula for Image Dehazing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, the quadratic computational complexity of softmax-attention limits the wide application in image dehazing task, especially for high-resolution images. To address this issue, we propose a new Transformer variant, which applies the Taylor expansion to approximate the softmax-attention and achieves linear computational complexity.	YUWEI QIU et. al.	iccv	2023-09-27
1155	Supersonic: Learning to Generate Source Code Optimizations in C/C++ Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Supersonic, a neural approach targeting minor source code modifications for optimization.	Zimin Chen; Sen Fang; Martin Monperrus;	arxiv-cs.SE	2023-09-26
1156	Question-Answering Approach to Evaluating Legal Summaries Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel legal summarization evaluation framework that utilizes GPT-4 to generate a set of question-answer pairs that cover main points and information in the reference summary.	Huihui Xu; Kevin Ashley;	arxiv-cs.CL	2023-09-26
1157	A Simple Text to Video Model Via Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a general and simple text to video model based on Transformer.	Gang Chen;	arxiv-cs.CV	2023-09-26
1158	Watch Your Language: Large Language Models and Content Moderation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) have exploded in popularity due to their ability to perform a wide array of natural language tasks. Text-based content moderation is one LLM use case …	Deepak Kumar; Yousef AbuHashem; Z. Durumeric;	ArXiv	2023-09-25
1159	LogGPT: Log Anomaly Detection Via GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To fill up the gap, we propose LogGPT, a novel framework that employs GPT for log anomaly detection.	Xiao Han; Shuhan Yuan; Mohamed Trabelsi;	arxiv-cs.LG	2023-09-25
1160	When Automated Assessment Meets Automated Content Generation: Examining Text Quality in The Era of GPTs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To do so, we propose an analysis framework that encompasses essay scoring ML-models, human and ML-generated essays, and a statistical model that parsimoniously considers the impact of type of respondent, prompt genre, and the ML model used for assessment model.	MARIALENA BEVILACQUA et. al.	arxiv-cs.CL	2023-09-25
1161	Watch Your Language: Investigating Content Moderation with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we evaluate a suite of commodity LLMs on two common content moderation tasks: rule-based community moderation and toxic content detection.	Deepak Kumar; Yousef AbuHashem; Zakir Durumeric;	arxiv-cs.HC	2023-09-25
1162	Does The most Sinfully Decadent Cake Ever Taste Good? Answering Yes/No Questions from Figurative Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the robustness of Question Answering (QA) models on figurative text.	Geetanjali Rakshit; Jeffrey Flanigan;	arxiv-cs.CL	2023-09-24
1163	Seeing Is Not Always Believing: Invisible Collision Attack and Defence on Pre-Trained Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel framework for an invisible attack on PTMs with enhanced MD5 collision.	Minghang Deng; Zhong Zhang; Junming Shao;	arxiv-cs.CR	2023-09-24
1164	Embers of Autoregression: Understanding Large Language Models Through The Problem They Are Trained to Solve IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To test our predictions, we evaluate two LLMs (GPT-3.5 and GPT-4) on eleven tasks, and we find robust evidence that LLMs are influenced by probability in the ways that we have hypothesized.	R. Thomas McCoy; Shunyu Yao; Dan Friedman; Matthew Hardy; Thomas L. Griffiths;	arxiv-cs.CL	2023-09-24
1165	A Chat About Boring Problems: Studying GPT-based Text Normalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We create a new taxonomy of text normalization errors and apply it to results from GPT-3.5-Turbo and GPT-4.0. Through this new framework, we can identify strengths and weaknesses of GPT-based TN, opening opportunities for future work.	YANG ZHANG et. al.	arxiv-cs.CL	2023-09-23
1166	Probing The Moral Development of Large Language Models Through Defining Issues Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we measure the moral reasoning ability of LLMs using the Defining Issues Test – a psychometric instrument developed for measuring the moral development stage of a person according to the Kohlberg’s Cognitive Moral Development Model.	Kumar Tanmay; Aditi Khandelwal; Utkarsh Agarwal; Monojit Choudhury;	arxiv-cs.CL	2023-09-23
1167	Transformer-based Image Compression with Variable Image Quality Objectives Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a Transformer-based image compression system that allows for a variable image quality objective according to the user’s preference.	Chia-Hao Kao; Yi-Hsin Chen; Cheng Chien; Wei-Chen Chiu; Wen-Hsiao Peng;	arxiv-cs.CV	2023-09-22
1168	Investigating Large Language Models and Control Mechanisms to Improve Text Readability of Biomedical Abstracts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we investigate the ability of state-of-the-art large language models (LLMs) on the task of biomedical abstract simplification, using the publicly available dataset for plain language adaptation of biomedical abstracts (\textbf{PLABA}).	ZIHAO LI et. al.	arxiv-cs.CL	2023-09-22
1169	RBFormer: Improve Adversarial Robustness of Transformer By Robust Bias Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, our emphasis lies on investigating the intrinsic robustness of the structure rather than introducing novel defense measures against adversarial attacks.	HAO CHENG et. al.	arxiv-cs.CV	2023-09-22
1170	AMPLIFY:Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, if there are noises or aberrant features in the original samples, Mixup may propagate them to the augmented samples, leading to over-sensitivity of the model to these outliers . To solve this problem, this paper proposes a new Mixup method called AMPLIFY.	Leixin Yang; Yu Xiang;	arxiv-cs.LG	2023-09-22
1171	SPION: Layer-Wise Sparse Training of Transformer Via Convolutional Flood Filling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel sparsification scheme for the Transformer that integrates convolution filters and the flood filling method to efficiently capture the layer-wise sparse pattern in attention operations.	Bokyeong Yoon; Yoonsang Han; Gordon Euhyun Moon;	arxiv-cs.LG	2023-09-21
1172	On The Relationship Between Skill Neurons and Robustness in Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Based on experiments with RoBERTa, it has been suggested that Prompt Tuning activates specific neurons in the transformer’s feed-forward networks, that are highly predictive and selective for the given task. In this paper, we study the robustness of Prompt Tuning in relation to these skill neurons, using RoBERTa and T5.	Leon Ackermann; Xenia Ohmer;	arxiv-cs.CL	2023-09-21
1173	The Cambridge Law Corpus: A Dataset for Legal AI Research Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Cambridge Law Corpus (CLC), a dataset for legal AI research.	ANDREAS ÖSTLING et. al.	arxiv-cs.CL	2023-09-21
1174	Code Soliloquies for Accurate Calculations in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While GPT-4 presents impressive language processing capabilities, its limitations in fundamental mathematical reasoning curtail its efficacy for such subjects. To tackle this limitation, we introduce in this paper an innovative stateful prompt design.	SHASHANK SONKAR et. al.	arxiv-cs.CL	2023-09-21
1175	Constraints First: A New MDD-based Model to Generate Sentences Under Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new approach to generating strongly constrained texts.	Alexandre Bonlarron; Aurélie Calabrèse; Pierre Kornprobst; Jean-Charles Régin;	arxiv-cs.AI	2023-09-21
1176	The Reversal Curse: LLMs Trained on A Is B Fail to Learn B Is A IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is worth noting, however, that if A is B appears in-context, models can deduce the reverse relationship. We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as Uriah Hawthorne is the composer of Abyssal Melodies and showing that they fail to correctly answer Who composed Abyssal Melodies?	LUKAS BERGLUND et. al.	arxiv-cs.CL	2023-09-21
1177	Generative AI in Mafia-like Game Simulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we explore the efficacy and potential of Generative AI models, specifically focusing on their application in role-playing simulations exemplified through Spyfall, a renowned mafia-style game.	Munyeong Kim; Sungsu Kim;	arxiv-cs.AI	2023-09-20
1178	Sequence-to-Sequence Spanish Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper breaks new ground by introducing the implementation and evaluation of renowned encoder-decoder architectures exclusively pre-trained on Spanish corpora. Specifically, we present Spanish versions of BART, T5, and BERT2BERT-style models and subject them to a comprehensive assessment across various sequence-to-sequence tasks, including summarization, question answering, split-and-rephrase, dialogue, and translation.	Vladimir Araujo; Maria Mihaela Trusca; Rodrigo Tufiño; Marie-Francine Moens;	arxiv-cs.CL	2023-09-20
1179	The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.	ALEKSANDAR STANIĆ et. al.	arxiv-cs.LG	2023-09-20
1180	Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we capture features of time-series data with temporal convolutional embedding and Generative Pre-trained Transformer (GPT) to classify abnormal sequences from normal sequences using cross entropy loss.	Sewoong Lee; JinKyou Choi; Min Su Kim;	arxiv-cs.LG	2023-09-20
1181	An Evaluation of GPT-4 on The ETHICS Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report summarizes a short study of the performance of GPT-4 on the ETHICS dataset.	Sergey Rodionov; Zarathustra Amadeus Goertzel; Ben Goertzel;	arxiv-cs.CL	2023-09-19
1182	A Family of Pretrained Transformer Language Models for Russian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a collection of 13 Russian Transformer LMs, which spans encoder (ruBERT, ruRoBERTa, ruELECTRA), decoder (ruGPT-3), and encoder-decoder (ruT5, FRED-T5) architectures.	DMITRY ZMITROVICH et. al.	arxiv-cs.CL	2023-09-19
1183	Is GPT4 A Good Trader? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by these advancements, there has been a surge of interest among researchers to harness the capabilities of GPT-4 for the automated design of quantitative factors that do not overlap with existing factor libraries, with an aspiration to achieve alpha returns \cite{webpagequant}. In contrast to these work, this study aims to examine the fidelity of GPT-4’s comprehension of classic trading theories and its proficiency in applying its code interpreter abilities to real-world trading data analysis.	Bingzhe Wu;	arxiv-cs.AI	2023-09-19
1184	Rigorously Assessing Natural Language Explanations of Neurons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply our framework to the GPT-4-generated explanations of GPT-2 XL neurons of Bills et al. (2023) and show that even the most confident explanations have high error rates and little to no causal efficacy.	Jing Huang; Atticus Geiger; Karel D’Oosterlinck; Zhengxuan Wu; Christopher Potts;	arxiv-cs.CL	2023-09-19
1185	Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Tri-Distil-BERT, a multilingual model pre-trained on Bangla, English, and Hindi, and Mixed-Distil-BERT, a model fine-tuned on code-mixed data.	Md Nishat Raihan; Dhiman Goswami; Antara Mahmud;	arxiv-cs.CL	2023-09-18
1186	Evaluation of GPT-3 for Anti-Cancer Drug Sensitivity Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigated the potential of GPT-3 for the anti-cancer drug sensitivity prediction task using structured pharmacogenomics data across five tissue types and evaluated its performance with zero-shot prompting and fine-tuning paradigms.	Shaika Chowdhury; Sivaraman Rajaganapathy; Lichao Sun; James Cerhan; Nansu Zong;	arxiv-cs.LG	2023-09-18
1187	Q-Transformer: Scalable Offline Reinforcement Learning Via Autoregressive Q-Functions IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data.	YEVGEN CHEBOTAR et. al.	arxiv-cs.RO	2023-09-18
1188	Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, studies show that many of such synthesized codes contain vulnerabilities. We propose a novel vulnerability-constrained decoding approach to reduce the amount of vulnerable code generated by such models.	André Storhaug; Jingyue Li; Tianyuan Hu;	arxiv-cs.CR	2023-09-18
1189	Towards Ontology Construction with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a method for automatically constructing a concept hierarchy for a given domain by querying a large language model.	Maurice Funk; Simon Hosemann; Jean Christoph Jung; Carsten Lutz;	arxiv-cs.AI	2023-09-18
1190	Proposition from The Perspective of Chinese Language: A Chinese Proposition Classification Evaluation Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we put forward the concepts of explicit and implicit propositions and propose a comprehensive multi-level proposition classification system based on linguistics and logic.	CONGHUI NIU et. al.	arxiv-cs.CL	2023-09-18
1191	Facilitating NSFW Text Detection in Open-Domain Dialogue Systems Via Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has significantly lagged behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection.	Huachuan Qiu; Shuai Zhang; Hongliang He; Anqi Li; Zhenzhong Lan;	arxiv-cs.CL	2023-09-18
1192	Performance of The Pre-Trained Large Language Model GPT-4 on Automated Short Answer Grading Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Over the years, carefully trained models have achieved increasingly higher levels of performance.	Gerd Kortemeyer;	arxiv-cs.CL	2023-09-17
1193	Effective Image Tampering Localization Via Enhanced Transformer and Co-attention Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose an effective image tampering localization network (EITLNet) based on a two-branch enhanced transformer encoder with attention-based feature fusion.	Kun Guo; Haochen Zhu; Gang Cao;	arxiv-cs.CV	2023-09-17
1194	Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we take a look at the topological structure of neuronal activity in the brain of Chat-GPT’s foundation language model, and analyze it with respect to a metric representing the notion of fairness.	Stephen Fitz;	arxiv-cs.CL	2023-09-17
1195	RMDM: A Multilabel Fakenews Dataset for Vietnamese Evidence Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we present a novel and challenging multilabel Vietnamese dataset (RMDM) designed to assess the performance of large language models (LLMs), in verifying electronic information related to legal contexts, focusing on fake news as potential input for electronic evidence.	HAI-LONG NGUYEN et. al.	arxiv-cs.CL	2023-09-16
1196	Pedestrian Trajectory Prediction Using Dynamics-based Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a dynamics-based deep learning framework with a novel asymptotically stable dynamical system integrated into a Transformer-based model.	Honghui Wang; Weiming Zhi; Gustavo Batista; Rohitash Chandra;	arxiv-cs.RO	2023-09-16
1197	Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our study assesses LLMs’ proficiency in structuring tables and introduces a novel fine-tuning method, cognizant of data structures, to bolster their performance.	XIANGRU TANG et. al.	arxiv-cs.CL	2023-09-16
1198	Contrastive Decoding Improves Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that Contrastive Decoding — a simple, computationally light, and training-free text generation method proposed by Li et al 2022 — achieves large out-of-the-box improvements over greedy decoding on a variety of reasoning tasks.	Sean O’Brien; Mike Lewis;	arxiv-cs.CL	2023-09-16
1199	Has Sentiment Returned to The Pre-pandemic Level? A Sentiment Analysis Using U.S. College Subreddit Data from 2019 to 2022 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study aims to explore how people’s emotions have changed from the pre-pandemic during the pandemic to post-emergency period and whether it has returned to pre-pandemic level.	Tian Yan; Fang Liu;	arxiv-cs.CL	2023-09-15
1200	Large Language Models for Failure Mode Classification: An Investigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we present the first investigation into the effectiveness of Large Language Models (LLMs) for Failure Mode Classification (FMC).	Michael Stewart; Melinda Hodkiewicz; Sirui Li;	arxiv-cs.CL	2023-09-15
1201	Attention-Only Transformers and Implementing MLPs with Attention Heads Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We prove that an MLP neuron can be implemented by a masked attention head with internal dimension 1 so long as the MLP’s activation function comes from a restricted class including SiLU and close approximations of ReLU and GeLU.	Robert Huben; Valerie Morris;	arxiv-cs.LG	2023-09-15
1202	Structural Self-Supervised Objectives for Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the first part, we introduce three alternative pre-training objectives to BERT’s Masked Language Modeling (MLM), namely Random Token Substitution (RTS), Cluster-based Random Token Substitution (C-RTS), and Swapped Language Modeling (SLM).	Luca Di Liello;	arxiv-cs.CL	2023-09-15
1203	AlbNER: A Corpus for Named Entity Recognition in Albanian Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents AlbNER, a corpus of 900 sentences with labeled named entities, collected from Albanian Wikipedia articles.	Erion Çano;	arxiv-cs.CL	2023-09-15
1204	InvestLM: A Large Language Model for Investment Using Financial Domain Instruction Tuning Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: We present a new financial domain large language model, InvestLM, tuned on LLaMA-65B (Touvron et al., 2023), using a carefully curated instruction dataset related to financial …	Yi Yang; Yixuan Tang; K. Y. Tam;	ArXiv	2023-09-15
1205	OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The authors explain where OpenAI got the tax law example in its livestream demonstration of GPT-4, why GPT-4 got the wrong answer, and how it fails to reliably calculate taxes. …	Andrew Blair-Stanek; Nils Holzenberger; Benjamin Van Durme;	arxiv-cs.AI	2023-09-15
1206	CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In fact, anomalous behaviors harming long context extrapolation exist between Rotary Position Embedding (RoPE) and vanilla self-attention unveiled by our work. To address this issue, we propose a novel attention mechanism, CoCA (Collinear Constrained Attention).	SHIYI ZHU et. al.	arxiv-cs.LG	2023-09-15
1207	Text Classification of Cancer Clinical Trial Eligibility Criteria Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we focus on seven common exclusion criteria in cancer trials: prior malignancy, human immunodeficiency virus, hepatitis B, hepatitis C, psychiatric illness, drug/substance abuse, and autoimmune illness.	Yumeng Yang; Soumya Jayaraj; Ethan B Ludmir; Kirk Roberts;	arxiv-cs.CL	2023-09-14
1208	Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, the bottleneck lies in the linear projection layers of multi-head attention and feedforward networks, constituting a substantial portion of the model size and contributing significantly to computation, memory, and power usage. To address this bottleneck, we propose folding attention, a technique targeting these linear layers, significantly reducing model size and improving memory and power efficiency.	YANG LI et. al.	arxiv-cs.LG	2023-09-14
1209	STRec: Sparse Transformer for Sequential Recommendations Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: With the rapid evolution of transformer architectures, researchers are exploring their application in sequential recommender systems (SRSs) and presenting promising performance on …	CHENGXI LI et. al.	Proceedings of the 17th ACM Conference on Recommender …	2023-09-14
1210	Large Language Models Can Infer Psychological Dispositions of Social Media Users Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: As Large Language Models (LLMs) demonstrate increasingly human-like abilities in various natural language processing (NLP) tasks that are bound to become integral to personalized …	Heinrich Peters; Sandra C. Matz;	ArXiv	2023-09-13
1211	Traveling Words: A Geometric Interpretation of Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel geometric perspective that elucidates the inner mechanisms of transformer operations.	Raul Molina;	arxiv-cs.CL	2023-09-13
1212	Circuit Breaking: Removing Model Behaviors with Targeted Ablation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a novel approach to removing undesirable behaviors by ablating a small number of causal pathways between model components, with the intention of disabling the computational circuit responsible for the bad behavior.	Maximilian Li; Xander Davies; Max Nadeau;	arxiv-cs.CL	2023-09-12
1213	The Moral Machine Experiment on Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study utilized the Moral Machine framework to investigate the ethical decision-making tendencies of prominent LLMs, including GPT-3.5, GPT-4, PaLM 2, and Llama 2, comparing their responses to human preferences.	Kazuhiro Takemoto;	arxiv-cs.CL	2023-09-12
1214	Leveraging Large Language Models and Weak Supervision for Social Media Data Annotation: An Evaluation Using COVID-19 Self-reported Vaccination Tweets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, manual annotation of a large number of tweets is time-consuming and expensive. In this study, we evaluate the usage of Large Language Models, in this case GPT-4 (March 23 version), and weak supervision, to identify COVID-19 vaccine-related tweets, with the purpose of comparing performance against human annotators.	Ramya Tekumalla; Juan M. Banda;	arxiv-cs.CL	2023-09-12
1215	Exploring Large Language Models for Ontology Alignment Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: This work investigates the applicability of recent generative Large Language Models (LLMs), such as the GPT series and Flan-T5, to ontology alignment for identifying concept …	Yuan He; Jiaoyan Chen; Hang Dong; Ian Horrocks;	ArXiv	2023-09-12
1216	Unveiling The Potential of Large Language Models in Generating Semantic and Cross-language Clones Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our findings shed light on GPT-3’s strengths in code generation, offering insights into the potential applications and challenges of using advanced language models in software development.	PALASH R. ROY et. al.	arxiv-cs.SE	2023-09-12
1217	Characterizing Latent Perspectives of Media Houses Towards Public Figures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a zero-shot approach for non-extractive or generative characterizations of person entities from a corpus using GPT-2.	Sharath Srivatsa; Srinath Srinivasa;	arxiv-cs.CL	2023-09-12
1218	Comparing Llama-2 and GPT-3 LLMs for HPC Kernels Generation Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We evaluate the use of the open-source Llama-2 model for generating well-known, high-performance computing kernels (e.g., AXPY, GEMV, GEMM) on different parallel programming …	PEDRO VALERO-LARA et. al.	ArXiv	2023-09-12
1219	Zero-shot Learning with Minimum Instruction to Extract Social Determinants and Family History from Clinical Notes Using GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Demographics, Social determinants of health, and family history documented in the unstructured text within the electronic health records are increasingly being studied to understand how this information can be utilized with the structured data to improve healthcare outcomes. After the GPT models were released, many studies have applied GPT models to extract this information from the narrative clinical notes.	Neel Bhate; Ansh Mittal; Zhe He; Xiao Luo;	arxiv-cs.CL	2023-09-11
1220	Memory Injections: Correcting Multi-Hop Reasoning Failures During Inference in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads.	MANSI SAKARVADIA et. al.	arxiv-cs.CL	2023-09-11
1221	Strategic Behavior of Large Language Models: Game Structure Vs. Contextual Framing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Crucially, we extend our analysis to examine the role of contextual framing, such as diplomatic relations or casual friendships, in shaping the models’ decisions.	Nunzio Lorè; Babak Heydari;	arxiv-cs.GT	2023-09-11
1222	Detecting Natural Language Biases with Prompt-based Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this project, we want to explore the newly emerging field of prompt engineering and apply it to the downstream task of detecting LM biases.	Md Abdul Aowal; Maliha T Islam; Priyanka Mary Mammen; Sandesh Shetty;	arxiv-cs.CL	2023-09-11
1223	SparseSwin: Swin Transformer with Sparse Transformer Block Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper aims to reduce the number of parameters and in turn, made the transformer more efficient.	Krisna Pinasthika; Blessius Sheldo Putra Laksono; Riyandi Banovbi Putera Irsal; Syifa Hukma Shabiyya; Novanto Yudistira;	arxiv-cs.CV	2023-09-11
1224	NExT-GPT: Any-to-Any Multimodal LLM IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To fill the gap, we present an end-to-end general-purpose any-to-any MM-LLM system, NExT-GPT.	Shengqiong Wu; Hao Fei; Leigang Qu; Wei Ji; Tat-Seng Chua;	arxiv-cs.AI	2023-09-11
1225	Black-Box Analysis: GPTs Across Time in Legal Textual Entailment Task Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The evolution of Generative Pre-trained Transformer (GPT) models has led to significant advancements in various natural language processing applications, particularly in legal textual entailment. We present an analysis of GPT-3.5 (ChatGPT) and GPT-4 performances on COLIEE Task 4 dataset, a prominent benchmark in this domain.	Ha-Thanh Nguyen; Randy Goebel; Francesca Toni; Kostas Stathis; Ken Satoh;	arxiv-cs.CL	2023-09-11
1226	Long-Range Transformer Architectures for Document Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we explore multiple strategies to apply Transformer based models to long multi-page documents.	Thibault Douzon; Stefan Duffner; Christophe Garcia; Jérémy Espinas;	arxiv-cs.CL	2023-09-11
1227	CrisisTransformers: Pre-trained Language Models and Sentence Encoders for Crisis-related Social Media Texts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Advances in applications like text classification, semantic search, and clustering contribute to the effective processing of crisis-related texts, which is essential for emergency responders to gain a comprehensive view of a crisis event, whether historical or real-time. To address these gaps in crisis informatics literature, this study introduces CrisisTransformers, an ensemble of pre-trained language models and sentence encoders trained on an extensive corpus of over 15 billion word tokens from tweets associated with more than 30 crisis events, including disease outbreaks, natural disasters, conflicts, and other critical incidents.	Rabindra Lamsal; Maria Rodriguez Read; Shanika Karunasekera;	arxiv-cs.CL	2023-09-11
1228	Unified Contrastive Fusion Transformer for Multimodal Human Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a new multimodal fusion architecture, referred to as Unified Contrastive Fusion Transformer (UCFFormer) designed to integrate data with diverse distributions to enhance HAR performance.	Kyoung Ok Yang; Junho Koh; Jun Won Choi;	arxiv-cs.CV	2023-09-10
1229	Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose to combine the two approaches to perform ERC, as a means to obtain simpler and more interpretable Large Language Models-based classifiers.	Patrícia Pereira; Rui Ribeiro; Helena Moniz; Luisa Coheur; Joao Paulo Carvalho;	arxiv-cs.CL	2023-09-08
1230	Encoding Multi-Domain Scientific Papers By Ensembling Multiple CLS Tokens Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we argue that using multiple CLS tokens could make a Transformer better specialize to multiple scientific domains.	Ronald Seoh; Haw-Shiuan Chang; Andrew McCallum;	arxiv-cs.CL	2023-09-08
1231	From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We conduct a human preference study on 100 CNN DailyMail articles and find that that humans prefer GPT-4 summaries that are more dense than those generated by a vanilla prompt and almost as dense as human written summaries.	Griffin Adams; Alexander Fabbri; Faisal Ladhak; Eric Lehman; Noémie Elhadad;	arxiv-cs.CL	2023-09-08
1232	CNN Injected Transformer for Image Exposure Correction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a CNN Injected Transformer (CIT) to harness the individual strengths of CNN and Transformer simultaneously.	Shuning Xu; Xiangyu Chen; Binbin Song; Jiantao Zhou;	arxiv-cs.CV	2023-09-08
1233	Insights Into The Inner Workings of Transformer Models for Protein Function Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivation: We explored how explainable artificial intelligence (XAI) can help to shed light into the inner workings of neural networks for protein function prediction, by extending the widely used XAI method of integrated gradients such that latent representations inside of transformer models, which were finetuned to Gene Ontology term and Enzyme Commission number prediction, can be inspected too. Results: The approach enabled us to identify amino acids in the sequences that the transformers pay particular attention to, and to show that these relevant sequence parts reflect expectations from biology and chemistry, both in the embedding layer and inside of the model, where we identified transformer heads with a statistically significant correspondence of attribution maps with ground truth sequence annotations (e.g. transmembrane regions, active sites) across many proteins.	Markus Wenzel; Erik Grüner; Nils Strodthoff;	arxiv-cs.LG	2023-09-07
1234	ProPainter: Improving Propagation and Transformer for Video Inpainting IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Moreover, memory or computational constraints limit the temporal range of feature propagation and video Transformer, preventing exploration of correspondence information from distant frames. To address these issues, we propose an improved framework, called ProPainter, which involves enhanced ProPagation and an efficient Transformer.	Shangchen Zhou; Chongyi Li; Kelvin C. K. Chan; Chen Change Loy;	arxiv-cs.CV	2023-09-07
1235	Supervised Learning and Large Language Model Benchmarks on Mental Health Datasets: Cognitive Distortions and Suicidal Risks in Chinese Social Media Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In response, we introduce two novel annotated datasets from Chinese social media, focused on cognitive distortions and suicidal risk classification.	HONGZHI QI et. al.	arxiv-cs.CL	2023-09-07
1236	Zero-Shot Audio Captioning Via Audibility Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose three desiderata for captioning audio — (i) fluency of the generated text, (ii) faithfulness of the generated text to the input audio, and the somewhat related (iii) audibility, which is the quality of being able to be perceived based only on audio.	Tal Shaharabany; Ariel Shaulov; Lior Wolf;	arxiv-cs.SD	2023-09-07
1237	Enhancing Pipeline-Based Conversational Agents with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the capabilities of LLMs to enhance pipeline-based conversational agents during two phases: 1) in the design and development phase and 2) during operations.	MINA FOOSHERIAN et. al.	arxiv-cs.CL	2023-09-07
1238	GPT Can Solve Mathematical Problems Without A Calculator IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: With sufficient training data, a 2 billion-parameter language model can accurately perform multi-digit arithmetic operations with almost 100% accuracy without data leakage, significantly surpassing GPT-4 (whose multi-digit multiplication accuracy is only 4.3%).	ZHEN YANG et. al.	arxiv-cs.LG	2023-09-06
1239	Leveraging BERT Language Models for Multi-Lingual ESG Issue Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The Multi-Lingual ESG Issue Identification (ML-ESG) shared task encompasses the classification of news documents into 35 distinct ESG issue labels. In this study, we explored multiple strategies harnessing BERT language models to achieve accurate classification of news documents across these labels.	Elvys Linhares Pontes; Mohamed Benjannet; Lam Kim Ming;	arxiv-cs.CL	2023-09-05
1240	On The Planning, Search, and Memorization Capabilities of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the potential of the state-of-the-art large language model (GPT-4) for planning tasks.	Yunhao Yang; Anshul Tomar;	arxiv-cs.CL	2023-09-04
1241	Convolutional Sparse Filter with Data and Mechanism Fusion: A Few-shot Fault Diagnosis Method for Power Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	Jia Qin; Dongsheng Yang; Nan Wang; Xueqing Ni;	Eng. Appl. Artif. Intell.	2023-09-01
1242	Deep Internally Connected Transformer Hashing for Image Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View	Zijian Chao; Shuli Cheng; Yongming Li;	Knowl. Based Syst.	2023-09-01
1243	MSVT: Multiple Spatiotemporal Views Transformer for DeepFake Video Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recently, DeepFake videos have developed rapidly, causing new security issues in society. Due to the rough spatiotemporal view, existing video-based detection methods struggle to …	YANG YU et. al.	IEEE Transactions on Circuits and Systems for Video …	2023-09-01
1244	DPCTN: Dual Path Context-aware Transformer Network for Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View	Pengfei Song; Zhe Yang; Jinjiang Li; H. Fan;	Eng. Appl. Artif. Intell.	2023-09-01
1245	Classification of EEG Signals Using Transformer Based Deep Learning and Ensemble Models Related Papers Related Patents Related Grants Related Venues Related Experts View	M. Zeynali; Hadi Seyedarabi; R. Afrouzian;	Biomed. Signal Process. Control.	2023-09-01
1246	TransPhys: Transformer-based Unsupervised Contrastive Learning for Remote Heart Rate Measurement Related Papers Related Patents Related Grants Related Venues Related Experts View	Rui-Xuan Wang; Hong-mei Sun; Rong-Rong Hao; Ang Pan; Ruisheng Jia;	Biomed. Signal Process. Control.	2023-09-01
1247	CTransCNN: Combining Transformer and CNN in Multilabel Medical Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View	XIN WU et. al.	Knowl. Based Syst.	2023-09-01
1248	Attention Transformer Mechanism and Fusion-based Deep Learning Architecture for MRI Brain Tumor Classification System Related Papers Related Patents Related Grants Related Venues Related Experts View	Sadafossadat Tabatabaei; Khosro Rezaee; Min Zhu;	Biomed. Signal Process. Control.	2023-09-01
1249	Transformer-based Visual Object Tracking Via Fine-coarse Concatenated Attention and Cross Concatenated MLP Related Papers Related Patents Related Grants Related Venues Related Experts View	LONG GAO et. al.	Pattern Recognit.	2023-09-01
1250	Hybrid CNN-Transformer Model for Medical Image Segmentation with Pyramid Convolution and Multi-layer Perceptron Related Papers Related Patents Related Grants Related Venues Related Experts View	Xiaowei Liu; Yikun Hu; Jianguo Chen;	Biomed. Signal Process. Control.	2023-09-01
1251	Evaluating Protein Binding Interfaces with Transformer Networks Related Papers Related Patents Related Grants Related Venues Related Experts View	VITALII STEBLIANKIN et. al.	Nature Machine Intelligence	2023-09-01
1252	Why Do Universal Adversarial Attacks Work on Large Language Models?: Geometry Might Be The Answer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a novel geometric perspective explaining universal adversarial attacks on large language models.	Varshini Subhash; Anna Bialas; Weiwei Pan; Finale Doshi-Velez;	arxiv-cs.LG	2023-09-01
1253	On The Use of GPT-4 for Creating Goal Models: An Exploratory Study Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The emergence of large language models and conversational front-ends such as ChatGPT is revolutionizing many software engineering activities. The extent to which such technologies …	BOQI CHEN et. al.	2023 IEEE 31st International Requirements Engineering …	2023-09-01
1254	Multi-Modal Feature Pyramid Transformer for RGB-Infrared Object Detection Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: RGB-Infrared multi-modal object detection utilizes diverse and complementary information, showing some advantages in intelligent transportation field. The main challenge of …	Yaohui Zhu; Xiaoyu Sun; Miao Wang; Hua Huang;	IEEE Transactions on Intelligent Transportation Systems	2023-09-01
1255	Using Large Language Models to Automate Category and Trend Analysis of Scientific Articles: An Application in Ophthalmology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Purpose: In this paper, we present an automated method for article classification, leveraging the power of Large Language Models (LLM).	HINA RAJA et. al.	arxiv-cs.CL	2023-08-31
1256	BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present BioCoder, a benchmark developed to evaluate large language models (LLMs) in generating bioinformatics-specific code.	XIANGRU TANG et. al.	arxiv-cs.LG	2023-08-31
1257	Large Language Models for Semantic Monitoring of Corporate Disclosures: A Case Study on Korea’s Top 50 KOSPI Companies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the rapidly advancing domain of artificial intelligence, state-of-the-art language models such as OpenAI’s GPT-3.5-turbo and GPT-4 offer unprecedented opportunities for automating complex tasks. This research paper delves into the capabilities of these models for semantically analyzing corporate disclosures in the Korean context, specifically for timely disclosure.	Junwon Sung; Woojin Heo; Yunkyung Byun; Youngsam Kim;	arxiv-cs.CL	2023-08-31
1258	Response: Emergent Analogical Reasoning in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In our tests, GPT-3 fails to solve even the easiest variants of the problems presented in the original paper.	Damian Hodel; Jevin West;	arxiv-cs.CL	2023-08-30
1259	Hanayo: Harnessing Wave-like Pipeline Parallelism for Enhanced Large Model Training Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Hanayo, a wave-like pipeline parallelism strategy that boasts a concise structure and practical applicability, alongside a high-performance pipeline execution runtime to tackle the challenges of pipeline strategy implementation.	Ziming Liu; Shenggan Cheng; Haotian Zhou; Yang You;	arxiv-cs.DC	2023-08-30
1260	ToddlerBERTa: Exploiting BabyBERTa for Grammar Learning and Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ToddlerBERTa, a BabyBERTa-like language model, exploring its capabilities through five different models with varied hyperparameters.	Omer Veysel Cagatan;	arxiv-cs.CL	2023-08-30
1261	Large Language Models As Data Preprocessors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Alongside showcasing the inherent capabilities of LLMs, we highlight their limitations, particularly in terms of computational expense and inefficiency. We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques, coupled with traditional methods like contextualization and feature selection, to improve the performance and efficiency of these models.	Haochen Zhang; Yuyang Dong; Chuan Xiao; Masafumi Oyamada;	arxiv-cs.AI	2023-08-30
1262	DTrOCR: Decoder-only Transformer for Optical Character Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we propose a simpler and more effective method for text recognition, known as the Decoder-only Transformer for Optical Character Recognition (DTrOCR).	Masato Fujitake;	arxiv-cs.CV	2023-08-30
1263	SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we improve a recently-proposed spiking Transformer (i.e., Spikformer) to make it possible to process language tasks and propose a two-stage knowledge distillation method for training it, which combines pre-training by distilling knowledge from BERT with a large collection of unlabelled texts and fine-tuning with task-specific instances via knowledge distillation again from the BERT fine-tuned on the same training examples.	CHANGZE LV et. al.	arxiv-cs.CL	2023-08-29
1264	AutoDroid: LLM-powered Task Automation in Android IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce AutoDroid, a mobile task automation system capable of handling arbitrary tasks on any Android application without manual efforts.	HAO WEN et. al.	arxiv-cs.AI	2023-08-29
1265	Cognitive Effects in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The rapid adoption of this technology naturally raises questions about the possible biases such models might exhibit. In this work, we tested one of these models (GPT-3) on a range of cognitive effects, which are systematic patterns that are usually found in human cognitive tasks.	Jonathan Shaki; Sarit Kraus; Michael Wooldridge;	arxiv-cs.AI	2023-08-28
1266	Distilled GPT for Source Code Summarization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present an alternative: we train an open source model using sample output generated by GPT-3.5 in a process related to knowledge distillation.	Chia-Yi Su; Collin McMillan;	arxiv-cs.SE	2023-08-28
1267	Time-Frequency Transformer: A Novel Time Frequency Joint Learning Method for Speech Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer.	YONG WANG et. al.	arxiv-cs.SD	2023-08-28
1268	GADePo: Graph-Assisted Declarative Pooling Transformers for Document-Level Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current methods rely on text-based encoders and employ various hand-coded pooling heuristics to aggregate information from entity mentions and associated contexts. In this paper, we replace these rigid pooling functions with explicit graph relations by leveraging the intrinsic graph processing capabilities of the Transformer model.	Andrei C. Coman; Christos Theodoropoulos; Marie-Francine Moens; James Henderson;	arxiv-cs.CL	2023-08-28
1269	Breaking The Bank with ChatGPT: Few-Shot Text Classification for Finance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the use of conversational GPT models for easy and quick few-shot text classification in the financial domain using the Banking77 dataset.	Lefteris Loukas; Ilias Stogiannidis; Prodromos Malakasiotis; Stavros Vassos;	arxiv-cs.CL	2023-08-28
1270	Challenges of GPT-3-based Conversational Agents for Healthcare Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the challenges and risks of using GPT-3-based models for medical question-answering (MedQA).	Fabian Lechner; Allison Lahnala; Charles Welch; Lucie Flek;	arxiv-cs.CL	2023-08-28
1271	ANER: Arabic and Arabizi Named Entity Recognition Using Transformer-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present ANER, a web-based named entity recognizer for the Arabic, and Arabizi languages.	ABDELRAHMAN BODA SADALLAH et. al.	arxiv-cs.CL	2023-08-28
1272	Target-independent XLA Optimization Using Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We specifically focus on target-independent optimization XLA HLO pass ordering: our approach aims at finding the optimal sequence of compiler optimization passes, which is decoupled from target-dependent optimization.	Milan Ganai; Haichen Li; Theodore Enns; Yida Wang; Randy Huang;	arxiv-cs.LG	2023-08-28
1273	A Unified Transformer-based Network for Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel Unified Biosensor-Vision Multi-modal Transformer-based (UBVMT) method to classify emotions in an arousal-valence space by combining a 2D representation of an ECG/PPG signal with the face information.	Kamran Ali; Charles E. Hughes;	arxiv-cs.CV	2023-08-27
1274	Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this survey paper, we provide an examination of alternative open-sourced models of large GPTs, focusing on user-friendly and relatively small models that facilitate easier deployment and accessibility.	KAIYUAN GAO et. al.	arxiv-cs.CL	2023-08-27
1275	Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The goal of this work is to improve the efficiency and effectiveness of knowledge distillation, enabling the development of more efficient and accurate models for a range of natural language processing tasks.	Apoorv Dankar; Adeem Jassani; Kartikaeya Kumar;	arxiv-cs.CL	2023-08-26
1276	Translate Meanings, Not Just Words: IdiomKB’s Role in Optimizing Idiomatic Translation with Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a multilingual idiom KB (IdiomKB) developed using large LMs to address this.	SHUANG LI et. al.	arxiv-cs.CL	2023-08-26
1277	GPTCloneBench: A Comprehensive Benchmark of Semantic Clones and Cross-language Clones Using GPT-3 Model and SemanticCloneBench Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a comprehensive semantic clone and cross-language clone benchmark, GPTCloneBench by exploiting SemanticCloneBench and OpenAI’s GPT-3 model.	AJMAIN INQIAD ALAM et. al.	arxiv-cs.SE	2023-08-26
1278	A Wide Evaluation of ChatGPT on Affective Computing Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we widely study the capabilities of the ChatGPT models, namely GPT-4 and GPT-3.5, on 13 affective computing problems, namely aspect extraction, aspect polarity classification, opinion extraction, sentiment analysis, sentiment intensity ranking, emotions intensity ranking, suicide tendency detection, toxicity detection, well-being assessment, engagement measurement, personality assessment, sarcasm detection, and subjectivity detection.	Mostafa M. Amin; Rui Mao; Erik Cambria; Björn W. Schuller;	arxiv-cs.AI	2023-08-26
1279	An Efficient FPGA-Based Accelerator for Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These nonlinear computations do pose challenges for accelerator design. In this paper, to propose an efficient FPGA-based hardware accelerator for Swin Transformer, we focused on using different strategies to deal with these nonlinear calculations and efficiently handling MAC computations to achieve the best acceleration results.	Zhiyang Liu; Pengyu Yin; Zhenhua Ren;	arxiv-cs.AR	2023-08-26
1280	Rethinking Language Models As Symbolic Knowledge Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we provide an exhaustive evaluation of language models of varying sizes and capabilities.	Vishwas Mruthyunjaya; Pouya Pezeshkpour; Estevam Hruschka; Nikita Bhutani;	arxiv-cs.CL	2023-08-25
1281	Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we collect the first open-source dataset to evaluate safeguards in LLMs, and deploy safer open-source LLMs at a low cost.	Yuxia Wang; Haonan Li; Xudong Han; Preslav Nakov; Timothy Baldwin;	arxiv-cs.CL	2023-08-25
1282	Transforming The Output of Generative Pre-trained Transformer: The Influence of The PGI Framework on Attention Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel approach named Persona-Grouping-Intelligence (PGI), which has been crafted to tackle the challenges posed by GPT models when applied to real-world business issues.	Aline Ioste;	arxiv-cs.AI	2023-08-25
1283	Analyzing Sentiments Regarding ChatGPT Using Novel BERT: A Machine Learning Approach Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Chatbots are AI-powered programs designed to replicate human conversation. They are capable of performing a wide range of tasks, including answering questions, offering …	SUDHEESH R et. al.	Inf.	2023-08-25
1284	Prompting A Large Language Model to Generate Diverse Motivational Messages: A Comparison with Human-Written Messages Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large language models (LLMs) are increasingly capable and prevalent, and can be used to produce creative content.	Samuel Rhys Cox; Ashraf Abdul; Wei Tsang Ooi;	arxiv-cs.CL	2023-08-25
1285	Towards Hierarchical Regional Transformer-based Multiple Instance Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a Transformer-based multiple instance learning approach that replaces the traditional learned attention mechanism with a regional, Vision Transformer inspired self-attention mechanism. We present a method that fuses regional patch information to derive slide-level predictions and show how this regional aggregation can be stacked to hierarchically process features on different distance levels.	Josef Cersovsky; Sadegh Mohammadi; Dagmar Kainmueller; Johannes Hoehne;	arxiv-cs.CV	2023-08-24
1286	A Small and Fast BERT for Chinese Medical Punctuation Restoration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Considering a practical scenario, we propose a fast and light pre-trained model for Chinese medical punctuation restoration based on ‘pretraining and fine-tuning’ paradigm.	TONGTAO LING et. al.	arxiv-cs.CL	2023-08-24
1287	Financial News Analytics Using Fine-Tuned Llama 2 GPT Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the study, the model was fine-tuned for the following tasks: analysing a text from financial market perspectives, highlighting main points of a text, summarizing a text and extracting named entities with appropriate sentiments.	Bohdan M. Pavlyshenko;	arxiv-cs.CL	2023-08-24
1288	Multi-BERT for Embeddings for Recommendation System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel approach for generating document embeddings using a combination of Sentence-BERT (SBERT) and RoBERTa, two state-of-the-art natural language processing models.	Shashidhar Reddy Javaji; Krutika Sarode;	arxiv-cs.IR	2023-08-24
1289	View-target Relation-guided Unsupervised 2D Image-based 3D Model Retrieval Via Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	Jiacheng Chang; Lanyong Zhang; Zhuang Shao;	Multimedia Systems	2023-08-24
1290	KTrans: Knowledge-Aware Transformer for Binary Code Embedding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel Transformer-based approach, namely kTrans, to generate knowledge-aware binary code embedding.	WENYU ZHU et. al.	arxiv-cs.SE	2023-08-24
1291	Devising and Detecting Phishing: Large Language Models Vs. Smaller Human Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we compare the performance of phishing emails created automatically by GPT-4 and manually using the V-Triad.	Fredrik Heiding; Bruce Schneier; Arun Vishwanath; Jeremy Bernstein; Peter S. Park;	arxiv-cs.CR	2023-08-23
1292	Towards Lossless Head Pruning Through Automatic Peer Distillation for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on attention head pruning as head attention is a key component of the transformer-based language models and provides interpretable knowledge meaning.	BINGBING LI et. al.	ijcai	2023-08-23
1293	Are ChatGPT and GPT-4 Good Poker Players? — A Pre-Flop Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we put ChatGPT and GPT-4 through the poker test and evaluate their poker skills.	Akshat Gupta;	arxiv-cs.CL	2023-08-23
1294	GPTEval: A Survey on Assessments of ChatGPT and GPT-4 IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, a comprehensive review summarizing the collective assessment findings is lacking. The objective of this survey is to thoroughly analyze prior assessments of ChatGPT and GPT-4, focusing on its language and reasoning abilities, scientific knowledge, and ethical considerations.	Rui Mao; Guanyi Chen; Xulang Zhang; Frank Guerin; Erik Cambria;	arxiv-cs.AI	2023-08-23
1295	Simple Is Better and Large Is Not Enough: Towards Ensembling of Foundational Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We utilize BERT and define three other ensemble techniques: {Shallow, Semi, and Deep}, wherein the Deep-Ensemble introduces a knowledge-guided reinforcement learning approach.	Nancy Tyagi; Aidin Shiri; Surjodeep Sarkar; Abhishek Kumar Umrawal; Manas Gaur;	arxiv-cs.CL	2023-08-23
1296	Graph Propagation Transformer for Graph Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a novel transformer architecture for graph representation learning.	ZHE CHEN et. al.	ijcai	2023-08-23
1297	Hierarchical Transformer for Scalable Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, conventional sampling-based methods fail to capture necessary high-level contextual information, resulting in a significant loss of performance. In this paper, we introduce the Hierarchical Scalable Graph Transformer (HSGT) as a solution to these challenges.	Wenhao Zhu; Tianyu Wen; Guojie Song; Xiaojun Ma; Liang Wang;	ijcai	2023-08-23
1298	Diagnose Like A Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, unlike human pathologists who selectively observe specific regions of histopathology tissues under different magnifications, most methods do not incorporate multiple resolutions of the WSIs, hierarchically and attentively, thereby leading to a loss of focus on the WSIs and information from other resolutions. To resolve this issue, we propose a Hierarchical Attention-Guided Multiple Instance Learning framework to fully exploit the WSIs.	Conghao Xiong; Hao Chen; Joseph J.Y. Sung; Irwin King;	ijcai	2023-08-23
1299	Constraints First: A New MDD-based Model to Generate Sentences Under Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a new approach to generating strongly constrained texts.	Alexandre Bonlarron; Aurélie Calabrèse; Pierre Kornprobst; Jean-Charles Régin;	ijcai	2023-08-23
1300	MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to push the performance boundaries, we propose a novel Mixed-strategy Adversarial Training algorithm (MAT).	Zhehua Zhong; Tianyi Chen; Zhen Wang;	ijcai	2023-08-23
1301	Vision Transformer Adapters for Generalizable Multitask Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the first multitasking vision transformer adapters that learn generalizable task affinities which can be applied to novel tasks and domains.	Deblina Bhattacharjee; Sabine Süsstrunk; Mathieu Salzmann;	arxiv-cs.CV	2023-08-23
1302	CostFormer:Cost Transformer for Cost Aggregation in Multi-view Stereo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, another problem may occur due to the quadratically growing computational complexity caused by Transformer, resulting in memory overflow and inference latency. In this paper, we overcome these limits with an efficient Transformer-based cost aggregation network, namely CostFormer.	WEITAO CHEN et. al.	ijcai	2023-08-23
1303	Exploring The Effectiveness of GPT Models in Test-Taking: A Case Study of The Driver’s License Knowledge Test Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our research proposes a method that enables GPT models to answer questions by employing context from an information source not previously included in their training data.	Saba Rahimi; Tucker Balch; Manuela Veloso;	arxiv-cs.CL	2023-08-22
1304	An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces an effective end-to-end speaker identification model applied Transformer-based contextual model.	Harunori Kawano; Sota Shimizu;	arxiv-cs.SD	2023-08-22
1305	Evaluating Large Language Models on Graphs: Performance Insights and Comparative Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we evaluate the capabilities of four LLMs in addressing several analytical problems with graph data.	Chang Liu; Bo Wu;	arxiv-cs.AI	2023-08-22
1306	TurboViT: Generating Fast Vision Transformers Via Generative Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we explore the generation of fast vision transformer architecture designs via generative architecture search (GAS) to achieve a strong balance between accuracy and architectural and computational efficiency.	Alexander Wong; Saad Abbasi; Saeejith Nair;	arxiv-cs.CV	2023-08-22
1307	Tryage: Real-time, Intelligent Routing of User Prompts to Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose a context-aware routing system, Tryage, that leverages a language model router for optimal selection of expert models from a model library based on analysis of individual input prompts.	Surya Narayanan Hari; Matt Thomson;	arxiv-cs.LG	2023-08-22
1308	Vision Transformer Pruning Via Matrix Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We end up selected the Singular Value Decomposition as the method to achieve our goal by comparing the original accuracy scores in the original Github repository and the accuracy scores of using those matrix decomposition methods, including Singular Value Decomposition, four versions of QR Decomposition, and LU factorization.	Tianyi Sun;	arxiv-cs.CV	2023-08-21
1309	Zero- and Few-Shot Prompting with LLMs: A Comparative Study with Fine-tuned Models for Bangla Sentiment Analysis IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we present a sizeable manually annotated dataset encompassing 33,606 Bangla news tweets and Facebook comments.	MD. ARID HASAN et. al.	arxiv-cs.CL	2023-08-21
1310	Patch Is Not All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their inherent reliance on sequential input enforces the manual partitioning of images into patch sequences, which disrupts the image’s inherent structural and semantic continuity. To handle this, we propose a novel Pattern Transformer (Patternformer) to adaptively convert images to pattern sequences for Transformer input.	CHANGZHEN LI et. al.	arxiv-cs.CV	2023-08-21
1311	Analyzing Transformer Dynamics As Movement Through Embedding Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Towards that end, we propose framing Transformer dynamics as movement through embedding space.	Sumeet S. Singh;	arxiv-cs.LG	2023-08-21
1312	Large Language Models on Wikipedia-Style Survey Generation: An Evaluation in NLP Concepts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we examine the proficiency of LLMs in generating succinct survey articles specific to the niche field of NLP in computer science, focusing on a curated list of 99 topics.	FAN GAO et. al.	arxiv-cs.CL	2023-08-20
1313	How Good Are LLMs at Out-of-Distribution Detection? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper embarks on a pioneering empirical investigation of OOD detection in the domain of LLMs, focusing on LLaMA series ranging from 7B to 65B in size.	BO LIU et. al.	arxiv-cs.CL	2023-08-20
1314	Cantnlp@LT-EDI-2023: Homophobia/Transphobia Detection in Social Media Comments Using Spatio-Temporally Retrained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper describes our multiclass classification system developed as part of the LTEDI@RANLP-2023 shared task.	Sidney G. -J. Wong; Matthew Durward; Benjamin Adams; Jonathan Dunn;	arxiv-cs.CL	2023-08-20
1315	Can Large Language Models Find And Fix Vulnerable Software? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we evaluated the capability of Large Language Models (LLMs), particularly OpenAI’s GPT-4, in detecting software vulnerabilities, comparing their performance against traditional static code analyzers like Snyk and Fortify.	David Noever;	arxiv-cs.SE	2023-08-20
1316	GPT-in-the-Loop: Adaptive Decision-Making for Multiagent Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces the GPT-in-the-loop approach, a novel method combining the advanced reasoning capabilities of Large Language Models (LLMs) like Generative Pre-trained Transformers (GPT) with multiagent (MAS) systems.	Nathalia Nascimento; Paulo Alencar; Donald Cowan;	arxiv-cs.MA	2023-08-20
1317	Activation Addition: Steering Language Models Without Optimization IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We instead investigate activation engineering: modifying activations at inference-time to predictably alter model behavior.	ALEXANDER MATT TURNER et. al.	arxiv-cs.CL	2023-08-20
1318	East: Efficient and Accurate Secure Transformer Framework for Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a framework \emph{East} to enable efficient and accurate secure Transformer inference.	YUANCHAO DING et. al.	arxiv-cs.CR	2023-08-19
1319	Data-to-text Generation for Severely Under-Resourced Languages with GPT-3.5: A Bit of Help Needed from Google Translate Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: LLMs like GPT are great at tasks involving English which dominates in their training data. In this paper, we look at how they cope with tasks involving languages that are severely under-represented in their training data, in the context of data-to-text generation for Irish, Maltese, Welsh and Breton.	Michela Lorandi; Anya Belz;	arxiv-cs.CL	2023-08-19
1320	A Tailored Handwritten-Text-Recognition System for Medieval Latin Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we introduce an end-to-end pipeline, tailored to the medieval Latin dictionary, for locating, extracting, and transcribing the lemmas.	PHILIPP KOCH et. al.	arxiv-cs.CV	2023-08-18
1321	BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce BioMedGPT, an open multimodal generative pre-trained transformer (GPT) for biomedicine, to bridge the gap between the language of life and human natural language.	YIZHEN LUO et. al.	arxiv-cs.CE	2023-08-18
1322	How Susceptible Are LLMs to Logical Fallacies? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: More specifically, we present Logic Competence Measurement Benchmark (LOGICOM), a diagnostic benchmark to assess the robustness of LLMs against logical fallacies.	Amirreza Payandeh; Dan Pluth; Jordan Hosier; Xuesu Xiao; Vijay K. Gurbani;	arxiv-cs.CL	2023-08-18
1323	A Transformer-based Framework For Multi-variate Time Series: A Remaining Useful Life Prediction Use Case Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposed an encoder-transformer architecture-based framework for multivariate time series prediction for a prognostics use case.	Oluwaseyi Ogunfowora; Homayoun Najjaran;	arxiv-cs.LG	2023-08-18
1324	YORC: Yoruba Reading Comprehension Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we create YORC: a new multi-choice Yoruba Reading Comprehension dataset that is based on Yoruba high-school reading comprehension examination.	Anuoluwapo Aremu; Jesujoba O. Alabi; David Ifeoluwa Adelani;	arxiv-cs.CL	2023-08-18
1325	Accelerated Materials Language Processing Enabled By GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we develop generative pretrained transformer (GPT)-enabled pipelines where the complex architectures of prior MLP models are replaced with strategic designs of prompt engineering.	Jaewoong Choi; Byungju Lee;	arxiv-cs.CL	2023-08-18
1326	WizardMath: Empowering Mathematical Reasoning for Large Language Models Via Reinforced Evol-Instruct IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we present WizardMath, which enhances the mathematical reasoning abilities of Llama-2, by applying our proposed Reinforcement Learning from Evol-Instruct Feedback (RLEIF) method to the domain of math.	HAIPENG LUO et. al.	arxiv-cs.CL	2023-08-18
1327	Linearity of Relation Decoding in Transformer Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations: relations between words and their synonyms, entities and their attributes, etc. We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation.	EVAN HERNANDEZ et. al.	arxiv-cs.CL	2023-08-17
1328	MaScQA: A Question Answering Dataset for Investigating Materials Science Knowledge of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we curate a dataset of 650 challenging questions from the materials domain that require the knowledge and skills of a materials student who has cleared their undergraduate degree.	Mohd Zaki; N. M. Anoop Krishnan;	arxiv-cs.CL	2023-08-17
1329	Exploring Demonstration Ensembling for In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we explore Demonstration Ensembling (DENSE) as an alternative to simple concatenation.	Muhammad Khalifa; Lajanugen Logeswaran; Moontae Lee; Honglak Lee; Lu Wang;	arxiv-cs.CL	2023-08-17
1330	Boosting Logical Reasoning in Large Language Models Through A New Framework: The Graph of Thought IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Recent advancements in large-scale models, such as GPT-4, have showcased remarkable capabilities in addressing standard queries. However, when facing complex problems that require …	Bin Lei; pei-Hung Lin; Chunhua Liao; Caiwen Ding;	arxiv-cs.LG	2023-08-16
1331	Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Survey Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This survey focuses on transformer-based MTL architectures and, to the best of our knowledge, is novel in that it systematically analyses how transformer-based MTL in NLP fits into ML lifecycle phases.	LOVRE TORBARINA et. al.	arxiv-cs.CL	2023-08-16
1332	SkinDistilViT: Lightweight Vision Transformer for Skin Lesion Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We provide a production-specific solution to the skin cancer classification problem that matches human performance in melanoma identification by training a vision transformer on melanoma medical images annotated by experts.	Vlad-Constantin Lungu-Stan; Dumitru-Clementin Cercel; Florin Pop;	arxiv-cs.CV	2023-08-16
1333	Fast Training of NMT Model with Data Sorting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One potential area for improvement is to address the computation of empty tokens that the Transformer computes only to discard them later, leading to an unnecessary computational burden. To tackle this, we propose an algorithm that sorts translation sentence pairs based on their length before batching, minimizing the waste of computing power.	Daniela N. Rim; Kimera Richard; Heeyoul Choi;	arxiv-cs.CL	2023-08-16
1334	Attention Is Not All You Need Anymore Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, a family of drop-in replacements for the self-attention mechanism in the Transformer, called the Extractors, is proposed.	Zhe Chen;	arxiv-cs.LG	2023-08-15
1335	From Commit Message Generation to History-Aware Commit Message Completion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We argue that if we could shift the focus from commit message generation to commit message completion and use previous commit history as additional context, we could significantly improve the quality and the personal nature of the resulting commit messages. In this paper, we propose and evaluate both of these novel ideas.	ALEKSANDRA ELISEEVA et. al.	arxiv-cs.SE	2023-08-15
1336	Domain Adaptation for Deep Unit Test Case Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we leverage Transformer-based code models to generate unit tests with the help of Domain Adaptation (DA) at a project level.	Jiho Shin; Sepehr Hashtroudi; Hadi Hemmati; Song Wang;	arxiv-cs.SE	2023-08-15
1337	Block-Wise Encryption for Reliable Vision Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we focus on block-wise encryption for the vision transformer, and we introduce three applications: privacy-preserving image classification, access control, and the combined use of federated learning and encrypted images.	Hitoshi Kiya; Ryota Iijima; Teru Nagamori;	arxiv-cs.CR	2023-08-15
1338	Graph-Segmenter: Graph Transformer with Boundary-aware Attention for Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, since the relation modeling between windows was not the primary emphasis of previous work, it was not fully utilized. To address this issue, we propose a Graph-Segmenter, including a Graph Transformer and a Boundary-aware Attention module, which is an effective network for simultaneously modeling the more profound relation between windows in a global view and various pixels inside each window as a local one, and for substantial low-cost boundary adjustment.	Zizhang Wu; Yuanzhu Gan; Tianhao Xu; Fan Wang;	arxiv-cs.CV	2023-08-15
1339	Radio2Text: Streaming Speech Recognition Using MmWave Radio Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Radio2Text, the first mmWave-based system for streaming automatic speech recognition (ASR) with a vocabulary size exceeding 13,000 words.	Running Zhao; Jiangtao Yu; Hang Zhao; Edith C. H. Ngai;	arxiv-cs.SD	2023-08-15
1340	SST: A Simplified Swin Transformer-based Model for Taxi Destination Prediction Based on Existing Trajectory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simplified Swin Transformer (SST) structure that does not use the shifted window idea in the traditional Swin Transformer, as trajectory data is consecutive in nature.	Zepu Wang; Yifei Sun; Zhiyu Lei; Xincheng Zhu; Peng Sun;	arxiv-cs.CV	2023-08-14
1341	Playing with Words: Comparing The Vocabulary and Lexical Richness of ChatGPT and Humans Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we perform an initial comparison of the vocabulary and lexical richness of ChatGPT and humans when performing the same tasks.	Pedro Reviriego; Javier Conde; Elena Merino-Gómez; Gonzalo Martínez; José Alberto Hernández;	arxiv-cs.CL	2023-08-14
1342	Approximating Human-Like Few-shot Learning with GPT-based Compression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conceptualize the learning process as information compression.	Cynthia Huang; Yuqing Xie; Zhiying Jiang; Jimmy Lin; Ming Li;	arxiv-cs.AI	2023-08-14
1343	LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Adversarial prompts can bypass their safety measures. We propose LLM Self Defense, a simple approach to defend against these attacks by having an LLM screen the induced responses.	MANSI PHUTE et. al.	arxiv-cs.CL	2023-08-14
1344	An Ensemble Approach to Question Classification: Integrating Electra Transformer, GloVe, and LSTM Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents an innovative ensemble approach for question classification, combining the strengths of Electra, GloVe, and LSTM models.	Sanad Aburass; Osama Dorgham; Maha Abu Rumman;	arxiv-cs.CL	2023-08-13
1345	GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs Via Cipher IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we discover that chat in cipher can bypass the safety alignment techniques of LLMs, which are mainly conducted in natural languages.	YOULIANG YUAN et. al.	arxiv-cs.CL	2023-08-12
1346	ViGT: Proposal-free Video Grounding with A Learnable Token in The Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	Kun Li; Dan Guo; Meng Wang;	Science China Information Sciences	2023-08-11
1347	Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we present ZeroShotALI, a novel recommender system that leverages a state-of-the-art large language model (LLM) in conjunction with a domain-specifically optimized transformer-based text-matching solution.	LARS HILLEBRAND et. al.	arxiv-cs.CL	2023-08-11
1348	Identification of The Relevance of Comments in Codes Using Bag of Words and Transformer Based Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The paper presents the overview of the models and other significant findings on the training corpus.	Sruthi S; Tanmay Basu;	arxiv-cs.IR	2023-08-11
1349	Task Conditioned BERT for Joint Intent Detection and Slot-filling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Dialogue systems need to deal with the unpredictability of user intents to track dialogue state and the heterogeneity of slots to understand user preferences. In this paper we investigate the hypothesis that solving these challenges as one unified model will allow the transfer of parameter support data across the different tasks.	Diogo Tavares; Pedro Azevedo; David Semedo; Ricardo Sousa; João Magalhães;	arxiv-cs.CL	2023-08-11
1350	Large Language Models to Identify Social Determinants of Health in Electronic Health Records Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected from the electronic health records (EHR). This study researched the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented, and explored the role of synthetic clinical text for improving the extraction of these scarcely documented, yet extremely valuable, clinical data.	MARCO GUEVARA et. al.	arxiv-cs.CL	2023-08-11
1351	Large Language Models in Cryptocurrency Securities Cases: Can A GPT Model Meaningfully Assist Lawyers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study securities cases involving cryptocurrencies as one of numerous contexts where AI could support the legal process, studying GPT-3.5’s legal reasoning and ChatGPT’s legal drafting capabilities.	Arianna Trozze; Toby Davies; Bennett Kleinberg;	arxiv-cs.AI	2023-08-11
1352	Large Language Models in Cryptocurrency Securities Cases: Can ChatGPT Replace Lawyers? Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large Language Models (LLMs) could enhance access to the legal system. However, empirical research on their effectiveness in conducting legal tasks is scant. We study securities …	Arianna Trozze; Toby P Davies; Bennett Kleinberg;	ArXiv	2023-08-11
1353	Exploring Machine Learning and Transformer-based Approaches for Deceptive Text Classification: A Comparative Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study presents a comparative analysis of machine learning and transformer-based approaches for deceptive text classification.	Anusuya Krishnan;	arxiv-cs.CL	2023-08-10
1354	Adaptive Low Rank Adaptation of Segment Anything to Salient Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although SAM excels in general object segmentation, it lacks the intrinsic ability to detect salient objects, resulting in suboptimal performance in this domain. To address this challenge, we present the Segment Salient Object Model (SSOM), an innovative approach that adaptively fine-tunes SAM for salient object detection by harnessing the low-rank structure inherent in deep learning.	Ruikai Cui; Siyuan He; Shi Qiu;	arxiv-cs.CV	2023-08-10
1355	Bringing Order Into The Realm of Transformer-based Language Models for Artificial Intelligence and Law Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Like for other textual domains, TLMs have indeed pushed the state-of-the-art of AI approaches for many tasks of interest in the legal domain. Despite the first Transformer model being proposed about six years ago, there has been a rapid progress of this technology at an unprecedented rate, whereby BERT and related models represent a major reference, also in the legal domain.	Candida M. Greco; Andrea Tagarelli;	arxiv-cs.CL	2023-08-10
1356	Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a novel MTL model by combining both merits of deformable CNN and query-based Transformer with shared gating for multi-task learning of dense prediction.	YANGYANG XU et. al.	arxiv-cs.CV	2023-08-10
1357	Metacognitive Prompting Improves Understanding in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes.	Yuqing Wang; Yun Zhao;	arxiv-cs.CL	2023-08-10
1358	Testing GPT-4 with Wolfram Alpha and Code Interpreter Plug-ins on Math and Science Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This report describes a test of the large language model GPT-4 with the Wolfram Alpha and the Code Interpreter plug-ins on 105 original problems in science and math, at the high school and college levels, carried out in June-August 2023.	Ernest Davis; Scott Aaronson;	arxiv-cs.AI	2023-08-10
1359	Joint-Relation Transformer for Multi-Person Motion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose the Joint-Relation Transformer, which utilizes relation information to enhance interaction modeling and improve future motion prediction.	QINGYAO XU et. al.	arxiv-cs.CV	2023-08-09
1360	Performance Analysis of Transformer Based Models (BERT, ALBERT and RoBERTa) in Fake News Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we explore those transformer models and found that ALBERT outperformed other models with 87.6% accuracy, 86.9% precision, 86.9% F1-score, and 174.5 run-time (s/epoch) respectively.	Shafna Fitria Nur Azizah; Hasan Dwi Cahyono; Sari Widya Sihwi; Wisnu Widiarto;	arxiv-cs.CL	2023-08-09
1361	A Comparative Study of Open-Source Large Language Models, GPT-4 and Claude 2: Multiple-Choice Test Taking in Nephrology Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This study investigated the medical knowledge capability of LLMs, specifically in the context of internal medicine subspecialty multiple-choice test-taking ability.	SEAN WU et. al.	arxiv-cs.CL	2023-08-09
1362	PETformer: Long-term Time Series Forecasting Via Placeholder-enhanced Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Placeholder-enhanced Technique (PET) to enhance the computational efficiency and predictive accuracy of Transformer in LTSF tasks.	Shengsheng Lin; Weiwei Lin; Wentai Wu; Songbo Wang; Yongxiang Wang;	arxiv-cs.LG	2023-08-09
1363	Knowledge Transfer to Solve Split and Rephrase Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: The usage of large pre-trained language models like BERT and GPT has brought a transformative impact on various natural language processing (NLP) tasks in recent times. However, a …	A. B. Alajlouni; Jinlong Li;	2023 International Conference on Information Technology …	2023-08-09
1364	I-WAS: A Data Augmentation Method with GPT-2 for Simile Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing research on simile detection often relies on corpora that are limited in size and do not adequately represent the full range of simile forms. To address this issue, we propose a simile data augmentation method based on \textbf{W}ord replacement And Sentence completion using the GPT-2 language model.	Yongzhu Chang; Rongsheng Zhang; Jiashu Pu;	arxiv-cs.CL	2023-08-08
1365	Ahead of The Text: Leveraging Entity Preposition for Financial Relation Extraction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result of our methodology, we achieved the 1st place ranking on the competition’s public leaderboard.	Stefan Pasch; Dimitrios Petridis;	arxiv-cs.CL	2023-08-08
1366	A Computational Model for Predicting Customer Behaviors Using Transformer Adapted with Tabular Features Related Papers Related Patents Related Grants Related Venues Related Experts View	Khang Nguyen; T. N. Mai; H. Nguyen; Viet-Anh Nguyen;	International Journal of Computational Intelligence Systems	2023-08-08
1367	Accelerating LLM Inference with Staged Speculative Decoding IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel algorithm, staged speculative decoding, to accelerate LLM inference in small-batch, on-device scenarios.	Benjamin Spector; Chris Re;	arxiv-cs.AI	2023-08-08
1368	Topological Interpretations of GPT-3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is an experiential study of investigating a consistent method for deriving the correlation between sentence vector and semantic meaning of a sentence.	Tianyi Sun; Bradley Nelson;	arxiv-cs.CL	2023-08-07
1369	MedMine: Examining Pre-trained Language Models on Medication Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Such obstacles include their imbalanced performances on different entity types and clinical events. In this work, we examine current state-of-the-art pre-trained language models (PLMs) on such tasks, via fine-tuning including the monolingual model Med7 and multilingual large language model (LLM) XLM-RoBERTa.	Haifa Alrdahi; Lifeng Han; Hendrik Šuvalov; Goran Nenadic;	arxiv-cs.CL	2023-08-07
1370	When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Smart contracts are prone to various vulnerabilities, leading to substantial financial losses over time. Current analysis tools mainly target vulnerabilities with fixed control or …	YUQIANG SUN et. al.	ArXiv	2023-08-07
1371	GPTScan: Detecting Logic Vulnerabilities in Smart Contracts By Combining GPT with Program Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose GPTScan, the first tool combining GPT with static analysis for smart contract logic vulnerability detection.	YUQIANG SUN et. al.	arxiv-cs.CR	2023-08-07
1372	CORAL: Expert-Curated Medical Oncology Reports to Advance Language Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We developed a detailed schema for annotating textual oncology information, encompassing patient characteristics, tumor characteristics, tests, treatments, and temporality.	MADHUMITA SUSHIL et. al.	arxiv-cs.CL	2023-08-07
1373	Few-shot Medical Image Classification with Simple Shape and Texture Text Descriptors Using Vision-language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the usefulness of vision-language models (VLMs) and large language models for binary few-shot classification of medical images.	Michal Byra; Muhammad Febrian Rachmadi; Henrik Skibbe;	arxiv-cs.CV	2023-08-07
1374	Detecting Spells in Fantasy Literature with A Transformer Based Artificial Intelligence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we use BERT for context-based phrase recognition of magic spells in the Harry Potter novel series.	Marcel Moravek; Alexander Zender; Andreas Müller;	arxiv-cs.CL	2023-08-07
1375	An Attacker’s Dream? Exploring The Capabilities of ChatGPT for Developing Malware IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: We investigate the potential for abuse of recent AI advances by developing seven malware programs and two attack tools using ChatGPT, OpenAI Playground’s text-davinci-003 model, …	YIN MINN PA PA et. al.	Proceedings of the 16th Cyber Security Experimentation and …	2023-08-07
1376	Pre-Trained Large Language Models for Industrial Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we take HVAC (Heating, Ventilation, and Air Conditioning) building control as an example to examine the ability of GPT-4 (one of the first-tier foundation models) as the controller.	Lei Song; Chuheng Zhang; Li Zhao; Jiang Bian;	arxiv-cs.AI	2023-08-06
1377	PromptCARE: Prompt Copyright Protection By Watermark Injection and Verification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose PromptCARE, the first framework for prompt copyright protection through watermark injection and verification.	Hongwei Yao; Jian Lou; Kui Ren; Zhan Qin;	arxiv-cs.MM	2023-08-05
1378	Resource Management for GPT-based Model Deployed on Clouds: Challenges, Solutions, and Future Directions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To facilitate effective resource management, we introduce a comprehensive resource management framework and present resource scheduling algorithms specifically designed for the GPT-based model.	Yongkang Dang; Minxian Xu; Kejiang Ye;	arxiv-cs.DC	2023-08-05
1379	ConvFormer: Revisiting Transformer for Sequential User Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we re-examine Transformer-like architectures aiming to advance state-of-the-art performance.	HAO WANG et. al.	arxiv-cs.AI	2023-08-05
1380	MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper proposes a novel transformer-based framework that aims to enhance weakly supervised semantic segmentation (WSSS) by generating accurate class-specific object localization maps as pseudo labels.	LIAN XU et. al.	arxiv-cs.CV	2023-08-05
1381	TransformerLight: A Novel Sequence Modeling Based Traffic Signaling Mechanism Via Gated Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we formulate TSC as a sequence modeling problem with a sequence of Markov decision process described by states, actions, and rewards from the traffic environment.	QIANG WU et. al.	kdd	2023-08-04
1382	Adaptive Disentangled Transformer for Sequential Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of layer-wise disentanglement for Transformer architectures and propose the Adaptive Disentangled Transformer (ADT) framework, which is able to adaptively determine the optimal degree of disentanglement of attention heads within different layers.	Yipeng Zhang; Xin Wang; Hong Chen; Wenwu Zhu;	kdd	2023-08-04
1383	Explaining Relation Classification Models with Semantic Extents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce semantic extents, a concept to analyze decision patterns for the relation classification task.	Lars Klöser; Andre Büsgen; Philipp Kohl; Bodo Kraft; Albert Zündorf;	arxiv-cs.CL	2023-08-04
1384	Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a constraint-aware and ranking-distilled token pruning method ToP, which selectively removes unnecessary tokens as input sequence passes through layers, allowing the model to improve online inference speed while preserving accuracy.	JUNYAN LI et. al.	kdd	2023-08-04
1385	Text Is All You Need: Learning Language Representations for Sequential Recommendation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose to model user preferences and item features as language representations that can be generalized to new items and datasets.	JIACHENG LI et. al.	kdd	2023-08-04
1386	ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we benchmark OpenAI’s GPT-3.5-Turbo and GPT-4 LLMs which are the backbone of ChatGPT.	Saipraneeth Devunuri; Shirin Qiam; Lewis Lehe;	arxiv-cs.IR	2023-08-04
1387	Baby’s CoThought: Leveraging Large Language Models for Enhanced Reasoning in Compact Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a CoThought pipeline, which efficiently trains smaller baby language models (BabyLMs) by leveraging the Chain of Thought prompting of LLMs.	Zheyu Zhang; Han Yang; Bolei Ma; David Rügamer; Ercong Nie;	arxiv-cs.CL	2023-08-03
1388	ClassEval: A Manually-Crafted Benchmark for Evaluating LLMs on Class-level Code Generation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we make the first attempt to evaluate LLMs in a more challenging code generation scenario, i.e. class-level code generation.	XUEYING DU et. al.	arxiv-cs.CL	2023-08-03
1389	Does Correction Remain A Problem For Large Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The second experiment explores the notion of correction as a preparatory task for other NLP tasks, examining whether large language models can tolerate and perform adequately on texts containing certain levels of noise or errors. By addressing these experiments, we aim to shed light on the significance of correction in the era of large language models and its implications for various NLP applications.	Xiaowu Zhang; Xiaotian Zhang; Cheng Yang; Hang Yan; Xipeng Qiu;	arxiv-cs.CL	2023-08-03
1390	Local Large Language Models for Complex Structured Medical Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces an approach that combines the language reasoning capabilities of large language models (LLMs) with the benefits of local training to tackle complex, domain-specific tasks.	V. K. Cody Bumgardner; Aaron Mullen; Sam Armstrong; Caylin Hickey; Jeff Talbert;	arxiv-cs.CL	2023-08-03
1391	Is GPT-4 A Reliable Rater? Evaluating Consistency in GPT-4 Text Ratings Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: This study reports the Intraclass Correlation Coefficients of feedback ratings produced by OpenAI’s GPT-4, a large language model (LLM), across various iterations, time frames, …	Veronika Hackl; Alexandra Elena Müller; M. Granitzer; Maximilian Sailer;	ArXiv	2023-08-03
1392	Food Classification Using Joint Representation of Visual and Textual Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a multimodal classification framework that uses the modified version of EfficientNet with the Mish activation function for image classification, and the traditional BERT transformer-based network is used for text classification.	Prateek Mittal; Puneet Goyal; Joohi Chauhan;	arxiv-cs.CV	2023-08-03
1393	Mani-GPT: A Generative Model for Interactive Robotic Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this article, we propose Mani-GPT, a Generative Pre-trained Transformer (GPT) for interactive robotic manipulation.	Zhe Zhang; Wei Chai; Jiankun Wang;	arxiv-cs.RO	2023-08-03
1394	Exploring The Psychology of LLMs’ Moral and Legal Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ethical issues raised by LLMs and the need to align future versions makes it important to know how state of the art models reason about moral and legal issues. In this paper, we employ the methods of experimental psychology to probe into this question.	Guilherme F. C. F. Almeida; José Luiz Nunes; Neele Engelmann; Alex Wiegmann; Marcelo de Araújo;	arxiv-cs.AI	2023-08-02
1395	Leveraging Few-Shot Data Augmentation and Waterfall Prompting for Response Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper discusses our approaches for task-oriented conversational modelling using subjective knowledge, with a particular emphasis on response generation.	Lea Krause; Selene Báez Santamaría; Michiel van der Meer; Urja Khurana;	arxiv-cs.CL	2023-08-02
1396	Harnessing GPT-4 for Generation of Cybersecurity GRC Policies: A Focus on Ransomware Attack Mitigation IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View	TIMOTHY R. MCINTOSH et. al.	Comput. Secur.	2023-08-01
1397	Feature Pre-inpainting Enhanced Transformer for Video Inpainting Related Papers Related Patents Related Grants Related Venues Related Experts View	Guanxiao Li; Kecheng Zhang; Yu Su; Jingyu Wang;	Eng. Appl. Artif. Intell.	2023-08-01
1398	Multi-level Learning Counting Via Pyramid Vision Transformer and CNN Related Papers Related Patents Related Grants Related Venues Related Experts View	Jiayu Liu; He Li; Weihang Kong;	Eng. Appl. Artif. Intell.	2023-08-01
1399	Vision Transformer Meets Convolutional Neural Network for Plant Disease Classification Related Papers Related Patents Related Grants Related Venues Related Experts View	P. S. Thakur; Shubhangi Chaturvedi; P. Khanna; T. Sheorey; A. Ojha;	Ecol. Informatics	2023-08-01
1400	GPT Has Become Financially Literate: Insights from Financial Literacy Tests of GPT and A Preliminary Test of How People Use It As A Source of Advice Related Papers Related Patents Related Grants Related Venues Related Experts View	Paweł Niszczota; S. Abbas;	ArXiv	2023-08-01
1401	Relative-position Embedding Based Spatially and Temporally Decoupled Transformer for Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View	Yujun Ma; Ruili Wang;	Pattern Recognit.	2023-08-01
1402	Simultaneous Multistep Transformer Architecture for Model Predictive Control Related Papers Related Patents Related Grants Related Venues Related Experts View	Junho Park; Mohammadreza Babaei; Samuel Arce; Ashwin N. Venkat; J. Hedengren;	Comput. Chem. Eng.	2023-08-01
1403	TransCFD: A Transformer-based Decoder for Flow Field Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View	Jundou Jiang; Guanxiong Li; Yi Jiang; Laiping Zhang; Xiaogang Deng;	Eng. Appl. Artif. Intell.	2023-08-01
1404	Lung Nodule Detection in Chest CT Images Based on Vision Transformer Network with Bayesian Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View	Hassan Mkindu; Longwen Wu; Yaqin Zhao;	Biomed. Signal Process. Control.	2023-08-01
1405	Short-range Air Combat Maneuver Decision of UAV Swarm Based on Multi-agent Transformer Introducing Virtual Objects Related Papers Related Patents Related Grants Related Venues Related Experts View	Feilong Jiang; Minqiang Xu; Yuqing Li; Hutao Cui; Rixin Wang;	Eng. Appl. Artif. Intell.	2023-08-01
1406	C3DBed: Facial Micro-expression Recognition with Three-dimensional Convolutional Neural Network Embedding in Transformer Model Related Papers Related Patents Related Grants Related Venues Related Experts View	Hang Pan; Lun Xie; Zhiliang Wang;	Eng. Appl. Artif. Intell.	2023-08-01
1407	EEG-based Seizure Prediction Via Hybrid Vision Transformer and Data Uncertainty Learning Related Papers Related Patents Related Grants Related Venues Related Experts View	ZHIWEI DENG et. al.	Eng. Appl. Artif. Intell.	2023-08-01
1408	Multilingual Hope Speech Detection: A Robust Framework Using Transfer Learning of Fine-tuning RoBERTa Model Related Papers Related Patents Related Grants Related Venues Related Experts View	Muhammad Shahid Iqbal Malik; A. Nazarova; Mona Mamdouh Jamjoom; D. Ignatov;	J. King Saud Univ. Comput. Inf. Sci.	2023-08-01
1409	Surface Defect Detection and Classification of Steel Using An Efficient Swin Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View	WEI ZHU et. al.	Adv. Eng. Informatics	2023-08-01
1410	Fault Transfer Diagnosis of Rolling Bearings Across Multiple Working Conditions Via Subdomain Adaptation and Improved Vision Transformer Network IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View	Pengfei Liang; Z. Yu; Bin Wang; Xuefang Xu; Jiaye Tian;	Adv. Eng. Informatics	2023-08-01
1411	A Transformer Condition Recognition Method Based on Dissolved Gas Analysis Features Selection and Multiple Models Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View	XIAOHUI HAN et. al.	Eng. Appl. Artif. Intell.	2023-08-01
1412	ARIMA-AdaBoost Hybrid Approach for Product Quality Prediction in Advanced Transformer Manufacturing Related Papers Related Patents Related Grants Related Venues Related Experts View	Chun-Hua Chien; A. Trappey; C. Wang;	Adv. Eng. Informatics	2023-08-01
1413	A Benchmark for Understanding Dialogue Safety in Mental Health Support Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In real-world interactions, a model response deemed acceptable in casual conversations might have a negligible positive impact on users seeking mental health support. To address these limitations, this paper aims to develop a theoretically and factually grounded taxonomy that prioritizes the positive impact on help-seekers.	HUACHUAN QIU et. al.	arxiv-cs.CL	2023-07-31
1414	ChatMOF: An Autonomous AI System for Predicting and Generating Metal-Organic Frameworks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: ChatMOF is an autonomous Artificial Intelligence (AI) system that is built to predict and generate metal-organic frameworks (MOFs).	Yeonghun Kang; Jihan Kim;	arxiv-cs.CL	2023-07-31
1415	Contrastive Learning for API Aspect Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a novel approach – CLAA – for API aspect detection in API reviews that utilizes transformer models trained with a supervised contrastive loss objective function.	G. M. Shahariar; Tahmid Hasan; Anindya Iqbal; Gias Uddin;	arxiv-cs.SE	2023-07-31
1416	Classifying Multilingual Party Manifestos: Domain Transfer Across Country, Time, and Genre Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We explore the potential of domain transfer across geographical locations, languages, time, and genre in a large-scale database of political manifestos.	Matthias Aßenmacher; Nadja Sauter; Christian Heumann;	arxiv-cs.CL	2023-07-31
1417	Does Fine-tuning GPT-3 with The OpenAI API Leak Personally-identifiable Information? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we simulate a privacy attack on GPT-3 using OpenAI’s fine-tuning API.	ALBERT YU SUN et. al.	arxiv-cs.LG	2023-07-30
1418	Video Frame Interpolation with Flow Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Video Frame Interpolation Flow Transformer to incorporate motion dynamics from optical flows into the self-attention mechanism.	Pan Gao; Haoyue Tian; Jie Qin;	arxiv-cs.CV	2023-07-30
1419	Evaluating ChatGPT and GPT-4 for Visual Programming Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Generative AI has the potential to drastically improve the landscape of computing education by automatically generating personalized feedback and content. In particular, this …	A. Singla;	Proceedings of the 2023 ACM Conference on International …	2023-07-30
1420	MTD-GPT: A Multi-Task Decision-Making GPT Model for Autonomous Driving at Unsignalized Intersections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, achieving safe and accurate multi-task decision-making in complex scenarios, such as unsignalized intersections, remains a challenge for autonomous vehicles. This paper presents a novel approach to this issue with the development of a Multi-Task Decision-Making Generative Pre-trained Transformer (MTD-GPT) model.	Jiaqi Liu; Peng Hang; Xiao qi; Jianqiang Wang; Jian Sun;	arxiv-cs.RO	2023-07-29
1421	Differential Evolution Algorithm Based Hyper-Parameters Selection of Transformer Neural Network Model for Load Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we analyze the efficacy of the recently developed Transformer-based Neural Network model in Load forecasting.	Anuvab Sen; Arul Rhik Mazumder; Udayon Sen;	arxiv-cs.NE	2023-07-28
1422	Exploring Format Consistency for Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a framework named Unified Instruction Tuning (UIT), which calls OpenAI APIs for automatic format transfer among different instruction tuning datasets such as PromptSource, FLAN and CrossFit.	SHIHAO LIANG et. al.	arxiv-cs.CL	2023-07-28
1423	Fast Prototyping Next-Generation Accelerators for New ML Models Using MASE: ML Accelerator System Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an efficient and scalable approach for exploring accelerator systems to compute large ML models.	JIANYI CHENG et. al.	arxiv-cs.AR	2023-07-28
1424	A Critical Review of Large Language Models: Sensitivity, Bias, and The Path Toward Specialized AI Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It presents a critical review of Large Language Models (LLMs), addressing challenges related to bias and sensitivity.	Arash Hajikhani; Carolyn Cole;	arxiv-cs.CL	2023-07-28
1425	MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose MeMOTR, a long-term memory-augmented Transformer for multi-object tracking.	Ruopeng Gao; Limin Wang;	arxiv-cs.CV	2023-07-28
1426	VISU at WASSA 2023 Shared Task: Detecting Emotions in Reaction to News Stories Leveraging BERT and Stacked Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Emotion detection from complex dialogues is challenging and often requires context/domain understanding. Therefore in this research, we have focused on developing deep learning (DL) models using the combination of word embedding representations with tailored prepossessing strategies to capture the nuances of emotions expressed.	Vivek Kumar; Sushmita Singh; Prayag Tiwari;	arxiv-cs.CL	2023-07-27
1427	Metric-Based In-context Learning: A Case Study in Text Simplification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we conduct a case study on text simplification (TS) to investigate how to select the best and most robust examples for ICL.	Subha Vadlamannati; Gözde Gül Şahin;	arxiv-cs.CL	2023-07-27
1428	Improving Aspect-Based Sentiment with End-to-End Semantic Role Labeling Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a series of approaches aimed at enhancing the performance of Aspect-Based Sentiment Analysis (ABSA) by utilizing extracted semantic information from a Semantic Role Labeling (SRL) model.	Pavel Přibáň; Ondřej Pražák;	arxiv-cs.CL	2023-07-27
1429	New Interaction Paradigm for Complex EDA Software Leveraging GPT Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: SmartonAI is inspired by the HuggingGPT framework and employs large language models, such as GPT and BERT, to facilitate task planning and execution.	Boyu Han; Xinyu Wang; Yifan Wang; Junyu Yan; Yidong Tian;	arxiv-cs.SE	2023-07-27
1430	LLMediator: GPT-4 Assisted Online Dispute Resolution Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this article, we introduce LLMediator, an experimental platform designed to enhance online dispute resolution (ODR) by utilizing capabilities of state-of-the-art large language …	H. Westermann; Jaromír Šavelka; Karim Benyekhlef;	ArXiv	2023-07-27
1431	Evaluating Generative Models for Graph-to-Text Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we explore the capability of generative models to generate descriptive text from graph data in a zero-shot setting.	Shuzhou Yuan; Michael Färber;	arxiv-cs.CL	2023-07-27
1432	Mental-LLM: Leveraging Large Language Models for Mental Health Prediction Via Online Text Data IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we present a comprehensive evaluation of multiple LLMs on various mental health prediction tasks via online text data, including Alpaca, Alpaca-LoRA, FLAN-T5, GPT-3.5, and GPT-4.	XUHAI XU et. al.	arxiv-cs.CL	2023-07-26
1433	Comparative Analysis of Libraries for The Sentimental Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the project will use Five Python and R libraries NLTK, TextBlob, Vader, Transformers (GPT and BERT pretrained), and Tidytext will be used in the study to apply sentiment analysis techniques.	Wendy Ccoya; Edson Pinto;	arxiv-cs.CL	2023-07-26
1434	Unveiling Security, Privacy, and Ethical Concerns of ChatGPT IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By exploring the upgrade path from GPT-1 to GPT-4, discussing the model’s features, limitations, and potential applications, this study aims to shed light on the potential risks of integrating ChatGPT into our daily lives.	Xiaodong Wu; Ran Duan; Jianbing Ni;	arxiv-cs.CR	2023-07-26
1435	Look Ahead: Improving The Accuracy of Time-Series Forecasting By Previewing Future Time Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate a novel direction towards improving the forecasting performance even more, which is orthogonal to the aforementioned mainstreams as a model-agnostic scheme.	Seonmin Kim; Dong-Kyu Chae;	sigir	2023-07-25
1436	ARB: Advanced Reasoning Benchmark for Large Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We evaluate recent models such as GPT-4 and Claude on ARB and demonstrate that current models score well below 50% on more demanding tasks. In order to improve both automatic and assisted evaluation capabilities, we introduce a rubric-based evaluation approach, allowing GPT-4 to score its own intermediate reasoning steps.	TOMOHIRO SAWADA et. al.	arxiv-cs.CL	2023-07-25
1437	GPT-3 Models Are Few-Shot Financial Reasoners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We run several experiments with GPT-3 and find that a separate retrieval model and logic engine continue to be essential components to achieving SOTA performance in this task, particularly due to the precise nature of financial questions and the complex information stored in financial documents. With this understanding, our refined prompt-engineering approach on GPT-3 achieves near SOTA accuracy without any fine-tuning.	Raul Salles de Padua; Imran Qureshi; Mustafa U. Karakaplan;	arxiv-cs.CL	2023-07-25
1438	On Answer Position Bias in Transformers for Question Answering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we analyze the self-attention and embedding generation components of five Transformer-based models with different architectures and position embedding strategies.	Rafael Glater; Rodrygo L. T. Santos;	sigir	2023-07-25
1439	Is GPT A Computational Model of Emotion? Detailed Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: GPT-4 showed the highest performance in the initial study but fell short in the second, despite providing superior results after minor prompt engineering.	Ala N. Tak; Jonathan Gratch;	arxiv-cs.CL	2023-07-25
1440	LAPCA: Language-Agnostic Pretraining with Cross-Lingual Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While previous works used machine translation and iterative training, we present a novel approach to cross-lingual pretraining called LAPCA (language-agnostic pretraining with cross-lingual alignment). We train the LAPCA-LM model based on XLM-RoBERTa and łexa that significantly improves cross-lingual knowledge transfer for question answering and sentence retrieval on, e.g., XOR-TyDi and Mr. TyDi datasets, and in the zero-shot cross-lingual scenario performs on par with supervised methods, outperforming many of them on MKQA.	Dmitry Abulkhanov; Nikita Sorokin; Sergey Nikolenko; Valentin Malykh;	sigir	2023-07-25
1441	Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a novel content-based text-to-motion retrieval task, which aims at retrieving relevant motions based on a specified natural-language textual description.	Nicola Messina; Jan Sedmidubsky; Fabrizio Falchi; Tomás Rebok;	sigir	2023-07-25
1442	SimTDE: Simple Transformer Distillation for Sentence Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we introduce SimTDE, a simple knowledge distillation framework to compress sentence embeddings transformer models with minimal performance loss and significant size and latency reduction.	JIAN XIE et. al.	sigir	2023-07-25
1443	MaxSimE: Explaining Transformer-based Semantic Similarity Via Contextualized Best Matching Token Pairs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose MaxSimE, an explanation method for language models applied to measure semantic similarity.	Eduardo Brito; Henri Iser;	sigir	2023-07-25
1444	Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current fusion-based methods have the performance limitations due to the small receptive field of convolution and inadequate fusion of audio-visual features. To overcome these issues, we propose a novel \textbf{Au}dio-aware query-enhanced \textbf{TR}ansformer (AuTR) to tackle the task.	JINXIANG LIU et. al.	arxiv-cs.SD	2023-07-24
1445	A High Frequency Active Clamp Forward Converter with Coreless Transformer Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, a highly compact, low power $(\leq 10)$, high frequency (2 MHz) isolated active clamp forward converter, comprising a coreless Printed Circuit Board-based …	Reza Asrar Ghaderloo; Ali Parsa Sirat; A. Shoulaie;	2023 North American Power Symposium (NAPS)	2023-07-24
1446	Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Transformer-based detection and segmentation methods use a list of learned detection queries to retrieve information from the transformer network and learn to predict the location and category of one specific object from each query.	Yiming Cui; Linjie Yang; Haichao Yu;	arxiv-cs.CV	2023-07-23
1447	Gradient-Based Word Substitution for Obstinate Adversarial Examples Generation in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the problem of generating obstinate (over-stability) adversarial examples by word substitution in NLP, where input text is meaningfully changed but the model’s prediction does not, even though it should.	Yimu Wang; Peng Shi; Hongyang Zhang;	arxiv-cs.CL	2023-07-23
1448	Repformer: A Robust Shared-encoder Dual-pipeline Transformer for Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View	Fengwei Gu; Jun Lu; Chengtao Cai; Qidan Zhu; Zhaojie Ju;	Neural Computing and Applications	2023-07-22
1449	Identifying Misinformation on YouTube Through Transcript Contextual Analysis with Transformer Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce a novel methodology for video classification, focusing on the veracity of the content.	Christos Christodoulou; Nikos Salamanos; Pantelitsa Leonidou; Michail Papadakis; Michael Sirivianos;	arxiv-cs.CL	2023-07-22
1450	GPT-4 Can’t Reason IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, despite the genuinely impressive improvement, there are good reasons to be highly skeptical of GPT-4’s ability to reason. This position paper discusses the nature of reasoning; criticizes the current formulation of reasoning problems in the NLP community, as well as the way in which LLM reasoning performance is currently evaluated; introduces a small collection of 21 diverse reasoning problems; and performs a detailed qualitative evaluation of GPT-4’s performance on those problems.	Konstantine Arkoudas;	arxiv-cs.CL	2023-07-21
1451	AIGC Empowering Telecom Sector White Paper_chinese Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In the global craze of GPT, people have deeply realized that AI, as a transformative technology and key force in economic and social development, will bring great leaps and …	YE OUYANG et. al.	arxiv-cs.AI	2023-07-21
1452	Enhancing CLIP with GPT-4: Harnessing Visual Descriptions As Prompts IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: They can also be manipulated to provide visual information in any structure. In this work, we show that GPT-4 can be used to generate text that is visually descriptive and how this can be used to adapt CLIP to downstream tasks.	MAYUG MANIPARAMBIL et. al.	arxiv-cs.CV	2023-07-21
1453	DBAFormer: A Double-Branch Attention Transformer for Long-Term Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View	Ji Huang; Minbo Ma; Yongsheng Dai; Jie Hu; Shengdong Du;	Human-Centric Intelligent Systems	2023-07-20
1454	LLM Cognitive Judgements Differ From Human Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In the present work I examine GPT-3 and ChatGPT capabilities on an limited-data inductive reasoning task from the cognitive science literature.	Sotiris Lamprinidis;	arxiv-cs.CL	2023-07-20
1455	Meta-Transformer: A Unified Framework for Multimodal Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a framework, named Meta-Transformer, that leverages a $\textbf{frozen}$ encoder to perform multimodal perception without any paired multimodal training data.	YIYUAN ZHANG et. al.	arxiv-cs.CV	2023-07-20
1456	Addressing Compiler Errors: Stack Overflow or Large Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study systematically examines 100 compiler error messages from three sources to determine the most effective approach for programmers encountering compiler errors.	Patricia Widjojo; Christoph Treude;	arxiv-cs.SE	2023-07-20
1457	UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Preliminary results from experiments in the extension of pre-trained LMs as well as training from scratch show that this framework improves downstream performance on multiple biomedical and clinical Named Entity Recognition (NER) tasks.	Aidan Mannion; Thierry Chevalier; Didier Schwab; Lorraine Geouriot;	arxiv-cs.CL	2023-07-20
1458	Of Models and Tin Men: A Behavioural Economics Study of Principal-Agent Problems in AI Alignment Using Large-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: AI Alignment is often presented as an interaction between a single designer and an artificial agent in which the designer attempts to ensure the agent’s behavior is consistent with its purpose, and risks arise solely because of conflicts caused by inadvertent misalignment between the utility function intended by the designer and the resulting internal utility function of the agent. With the advent of agents instantiated with large-language models (LLMs), which are typically pre-trained, we argue this does not capture the essential aspects of AI safety because in the real world there is not a one-to-one correspondence between designer and agent, and the many agents, both artificial and human, have heterogeneous values.	Steve Phelps; Rebecca Ranson;	arxiv-cs.AI	2023-07-20
1459	A LLM Assisted Exploitation of AI-Guardian Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Large language models (LLMs) are now highly capable at a diverse range of tasks. This paper studies whether or not GPT-4, one such LLM, is capable of assisting researchers in the …	Nicholas Carlini;	ArXiv	2023-07-20
1460	An In-Depth Evaluation of Federated Learning on Biomedical Natural Language Processing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we evaluated FL on 2 biomedical NLP tasks encompassing 8 corpora using 6 LMs.	LE PENG et. al.	arxiv-cs.CL	2023-07-20
1461	Swin-Fusion: Swin-Transformer with Feature Fusion for Human Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View	Tiansheng Chen; L. Mo;	Neural Processing Letters	2023-07-20
1462	DP-TBART: A Transformer-based Autoregressive Model for Differentially Private Tabular Data Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present Differentially-Private TaBular AutoRegressive Transformer (DP-TBART), a transformer-based autoregressive model that maintains differential privacy and achieves performance competitive with marginal-based methods on a wide variety of datasets, capable of even outperforming state-of-the-art methods in certain settings.	Rodrigo Castellon; Achintya Gopal; Brian Bloniarz; David Rosenberg;	arxiv-cs.LG	2023-07-19
1463	SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a style control (SC) VALL-E model based on the neural codec language model (called VALL-E), which follows the structure of the generative pretrained transformer 3 (GPT-3).	Daegyeom Kim; Seongho Hong; Yong-Hoon Choi;	arxiv-cs.SD	2023-07-19
1464	Can Model Fusing Help Transformers in Long Document Classification? An Empirical Study Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this research, we explore on employing Model Fusing for long document classification while comparing the results with well-known BERT and Longformer architectures.	Damith Premasiri; Tharindu Ranasinghe; Ruslan Mitkov;	arxiv-cs.CL	2023-07-18
1465	Unveiling Gender Bias in Terms of Profession Across LLMs: Analyzing and Addressing Sociological Implications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Gender bias in artificial intelligence (AI) and natural language processing has garnered significant attention due to its potential impact on societal perceptions and biases. This research paper aims to analyze gender bias in Large Language Models (LLMs) with a focus on multiple comparisons between GPT-2 and GPT-3.5, some prominent language models, to better understand its implications.	Vishesh Thakur;	arxiv-cs.CL	2023-07-18
1466	Development of The ChatGPT, Generative Artificial Intelligence and Natural Large Language Models for Accountable Reporting and Use (CANGARU) Guidelines Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A few publishers and journals have recently created their own sets of rules; however, the absence of a unified approach may lead to a ‘Babel Tower Effect,’ potentially resulting in confusion rather than desired standardization. In response to this, we present the ChatGPT, Generative Artificial Intelligence, and Natural Large Language Models for Accountable Reporting and Use Guidelines (CANGARU) initiative, with the aim of fostering a cross-disciplinary global inclusive consensus on the ethical use, disclosure, and proper reporting of GAI/GPT/LLM technologies in academia.	GIOVANNI E. CACCIAMANI et. al.	arxiv-cs.AI	2023-07-18
1467	Automated Ableism: An Exploration of Explicit Disability Biases in Sentiment and Toxicity Analysis Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze sentiment analysis and toxicity detection models to detect the presence of explicit bias against people with disability (PWD).	Pranav Narayanan Venkit; Mukund Srinath; Shomir Wilson;	arxiv-cs.CL	2023-07-18
1468	U-shaped Transformer: Retain High Frequency Context in Time Series Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the low-pass characteristics of transformers and try to incorporate the advantages of MLP.	Qingkui Chen; Yiqin Zhang;	arxiv-cs.LG	2023-07-18
1469	How Is ChatGPT’s Behavior Changing Over Time? IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.	Lingjiao Chen; Matei Zaharia; James Zou;	arxiv-cs.CL	2023-07-18
1470	Scale-Aware Modulation Meet Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a new vision Transformer, Scale-Aware Modulation Transformer (SMT), that can handle various downstream tasks efficiently by combining the convolutional network and vision Transformer.	Weifeng Lin; Ziheng Wu; Jiayu Chen; Jun Huang; Lianwen Jin;	arxiv-cs.CV	2023-07-17
1471	Assessing The Quality of Multiple-Choice Questions Using GPT-4 and Rule-Based Methods Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we compared the performance of a rule-based method we developed to a machine-learning based method utilizing GPT-4 for the task of automatically assessing multiple-choice questions based on 19 common item-writing flaws.	Steven Moore; Huy A. Nguyen; Tianying Chen; John Stamper;	arxiv-cs.CL	2023-07-16
1472	SentimentGPT: Exploiting GPT for Advanced Sentiment Analysis and Its Departure from Current Machine Learning IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This study presents a thorough examination of various Generative Pretrained Transformer (GPT) methodologies in sentiment analysis, specifically in the context of Task 4 on the SemEval 2017 dataset.	Kiana Kheiri; Hamid Karimi;	arxiv-cs.CL	2023-07-16
1473	A Survey of Techniques for Optimizing Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Researchers have proposed techniques to optimize transformer inference at all levels of abstraction.	Krishna Teja Chitty-Venkata; Sparsh Mittal; Murali Emani; Venkatram Vishwanath; Arun K. Somani;	arxiv-cs.LG	2023-07-16
1474	The Potential and Pitfalls of Using A Large Language Model Such As ChatGPT or GPT-4 As A Clinical Assistant Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We performed two analyses using ChatGPT and GPT-4, one to identify patients with specific medical diagnoses using a real-world large electronic health record database and the other, in providing diagnostic assistance to healthcare workers in the prospective evaluation of hypothetical patients.	JINGQING ZHANG et. al.	arxiv-cs.CL	2023-07-16
1475	GeoGPT: Understanding and Processing Geospatial Tasks Through An Autonomous GPT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Several cases including geospatial data crawling, spatial query, facility siting, and mapping validate the effectiveness of our framework. Though limited cases are presented in this paper, GeoGPT can be further extended to various tasks by equipping with more GIS tools, and we think the paradigm of foundational plus professional implied in GeoGPT provides an effective way to develop next-generation GIS in this era of large foundation models.	Yifan Zhang; Cheng Wei; Shangyou Wu; Zhengting He; Wenhao Yu;	arxiv-cs.CL	2023-07-15
1476	Transformers Are Universal Predictors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find limits to the Transformer architecture for language modeling and show it has a universal prediction property in an information-theoretic sense.	Sourya Basu; Moulik Choraria; Lav R. Varshney;	arxiv-cs.LG	2023-07-15
1477	Leveraging Large Language Models for The Generation of Novel Metaheuristic Optimization Algorithms Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: In this paper, we investigate the potential of using Large Language Models (LLMs) such as GPT-4 to generate novel hybrid swarm intelligence optimization algorithms. We use the LLM …	Michal Pluhacek; Anezka Kazikova; T. Kadavy; Adam Viktorin; R. Šenkeřík;	Proceedings of the Companion Conference on Genetic and …	2023-07-15
1478	Improving BERT with Hybrid Pooling Network and Drop Mask Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a HybridBERT model which combines self-attention and pooling networks to encode different contextual features in each layer.	QIAN CHEN et. al.	arxiv-cs.CL	2023-07-14
1479	MorphPiece : A Linguistic Tokenizer for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: I propose a linguistically motivated tokenization scheme, MorphPiece, which is based partly on morphological segmentation of the underlying text.	Haris Jabbar;	arxiv-cs.CL	2023-07-14
1480	TriFormer: A Multi-modal Transformer Framework For Mild Cognitive Impairment Conversion Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To accurately predict the MCI conversion to stable MCI or progressive MCI, we propose Triformer, a novel transformer-based framework with three specialized transformers to incorporate multi-model data.	LINFENG LIU et. al.	arxiv-cs.CV	2023-07-14
1481	MaxSR: Image Super-Resolution Using Improved MaxViT Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Because transformer models have powerful representation capacity and the in-built self-attention mechanisms in transformer models help to leverage self-similarity prior in input low-resolution image to improve performance for single image super-resolution, we present a single image super-resolution model based on recent hybrid vision transformer of MaxViT, named as MaxSR.	Bincheng Yang; Gangshan Wu;	arxiv-cs.CV	2023-07-14
1482	Making Local Algorithms Efficiently Self-stabilizing in Arbitrary Asynchronous Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper deals with the trade-off between time, workload, and versatility in self-stabilization, a general and lightweight fault-tolerant concept in distributed computing.In this context, we propose a transformer that provides an asynchronous silent self-stabilizing version Trans(AlgI) of any terminating synchronous algorithm AlgI.	Stéphane Devismes; David Ilcinkas; Colette Johnen; Frédéric Mazoit;	arxiv-cs.DC	2023-07-13
1483	Retrieval Augmented Generation Using Engineering Design Knowledge Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present a data-driven method to identify explicit facts of the form – head entity :: relationship :: tail entity from patented artefact descriptions.	L Siddharth; Jianxi Luo;	arxiv-cs.CL	2023-07-13
1484	Negated Complementary Commonsense Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a model-agnostic methodology to improve the performance in negated complementary scenarios.	Navid Rezaei; Marek Z. Reformat;	arxiv-cs.CL	2023-07-13
1485	Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study how LLMs can be used to scale biomedical knowledge curation.	YU GU et. al.	arxiv-cs.CL	2023-07-12
1486	No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This trend has motivated research on efficient training algorithms designed to improve training, validation, and downstream performance faster than standard training. In this work, we revisit three categories of such algorithms: dynamic architectures (layer stacking, layer dropping), batch selection (selective backprop, RHO loss), and efficient optimizers (Lion, Sophia).	Jean Kaddour; Oscar Key; Piotr Nawrot; Pasquale Minervini; Matt J. Kusner;	arxiv-cs.LG	2023-07-12
1487	Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a new method called self-distilled quantization (SDQ) that minimizes accumulative quantization errors and outperforms baselines.	James O’ Neill; Sourav Dutta;	arxiv-cs.CL	2023-07-12
1488	SwiFT: Swin 4D FMRI Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing approaches for fMRI analysis utilize hand-crafted features, but the process of feature extraction risks losing essential information in fMRI scans. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from fMRI volumes in a memory and computation-efficient manner.	PETER YONGHO KIM et. al.	arxiv-cs.CV	2023-07-12
1489	Agreement Tracking for Multi-Issue Negotiation Dialogues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our work introduces the novel task of agreement tracking for two-party multi-issue negotiations, which requires continuous monitoring of agreements within a structured state space.	Amogh Mannekote; Bonnie J. Dorr; Kristy Elizabeth Boyer;	arxiv-cs.CL	2023-07-12
1490	Efficient Convolution and Transformer-Based Network for Video Frame Interpolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, a novel method integrating a transformer encoder and convolutional features is proposed.	Issa Khalifeh; Luka Murn; Marta Mrak; Ebroul Izquierdo;	arxiv-cs.CV	2023-07-12
1491	Large Language Models IF:4 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Artificial intelligence is making spectacular progress, and one of the best examples is the development of large language models (LLMs) such as OpenAI’s GPT series. In these lectures, written for readers with a background in mathematics or physics, we give a brief history and survey of the state of the art, and describe the underlying transformer architecture in detail.	Michael R. Douglas;	arxiv-cs.CL	2023-07-11
1492	Explaining Competitive-Level Programming Solutions Using LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we approach competitive-level programming problem-solving as a composite task of reasoning and code generation.	Jierui Li; Szymon Tworkowski; Yingying Wu; Raymond Mooney;	arxiv-cs.CL	2023-07-11
1493	ShredGP: Guitarist Style-Conditioned Tablature Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ShredGP, a GuitarPro tablature generative Transformer-based model conditioned to imitate the style of four distinct iconic electric guitarists.	PEDRO SARMENTO et. al.	arxiv-cs.SD	2023-07-11
1494	Argumentative Segmentation Enhancement for Legal Summarization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the argumentative segmentation, we propose a novel task of classifying argumentative segments of legal case decisions.	Huihui Xu; Kevin Ashley;	arxiv-cs.CL	2023-07-11
1495	Vacaspati: A Diverse Corpus of Bangla Literature Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The literary works are collected from various websites; only those works that are publicly available without copyright violations or restrictions are collected.	Pramit Bhattacharyya; Joydeep Mondal; Subhadip Maji; Arnab Bhattacharya;	arxiv-cs.CL	2023-07-11
1496	ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown Summary Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Abstract: The disruptive application of ChatGPT (GPT-3.5, GPT-4) to a variety of domains has become a topic of much discussion in the scientific community and society at large. Large …	M. Scanlon; F. Breitinger; C. Hargreaves; Jan-Niclas Hilgert; John W. Sheppard;	ArXiv	2023-07-10
1497	Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We show that Transformer extended with recurrence is a viable approach to learning to solve CSPs in an end-to-end manner, having clear advantages over state-of-the-art methods such as Graph Neural Networks, SATNet, and some neuro-symbolic models.	Zhun Yang; Adam Ishay; Joohyung Lee;	arxiv-cs.AI	2023-07-10
1498	Can Large Language Models Write Good Property-Based Tests? IF:3 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We call our approach PBT-GPT, and propose three different strategies of prompting the LLM for PBT.	Vasudev Vikram; Caroline Lemieux; Rohan Padhye;	arxiv-cs.SE	2023-07-10
1499	SimpleMTOD: A Simple Language Model for Multimodal Task-Oriented Dialogue with Symbolic Scene Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In-order to capture the semantics of visual scenes, we introduce both local and de-localized tokens for objects within a scene.	Bhathiya Hemanthage; Christian Dondrup; Phil Bartie; Oliver Lemon;	arxiv-cs.CL	2023-07-10
1500	Herding AI Cats: Lessons from Designing A Chatbot By Prompting GPT-3 IF:3 Summary Related Papers Related Patents Related Grants Related Venues Related Experts View Abstract: Prompting Large Language Models (LLMs) is an exciting new approach to designing chatbots. But can it improve LLM’s user experience (UX) reliably enough to power chatbot products? …	J. ZAMFIRESCU-PEREIRA et. al.	Proceedings of the 2023 ACM Designing Interactive Systems …	2023-07-10