ICML 2025 Papers with Code & Data
To help the community engage quickly with the presented research, we have compiled an index of accepted papers that have associated public code or data repositories; all of them are listed in the table below. The index was generated by an automated extraction process, so although we strive for completeness, some papers with public resources may have been missed. Please let us know if you discover additional papers that should be included. Note also that some code repositories may not be made fully public until the conference officially begins.
In addition to this index, we encourage readers to explore our related resources:
- ICML-2025 Papers & Highlights: curated summaries and key takeaways from this year’s conference.
- “Best Paper” Digest (ICML): a historical overview of the most influential ICML papers published since 2004.
This curated list is created by the Paper Digest Team. Paper Digest is an AI-powered research platform that delivers personalized, comprehensive daily digests of the latest research in your field, and also helps you read and write articles, get answers, conduct literature reviews, and generate research reports. Experience the full potential of our services today!
TABLE 1: ICML 2025 Papers with Code & Data
| # | Paper | Author(s) | Code |
|---|---|---|---|
| 1 | Synthesizing Software Engineering Data in A Test-Driven Manner. Highlight: We introduce **SWE-Flow**, a novel data synthesis framework grounded in Test-Driven Development (TDD). To facilitate further research, we release all code, datasets, models, and Docker images at [Github](https://github.com/Hambaobao/SWE-Flow). | Lei Zhang; Jiaxi Yang; Min Yang; Jian Yang; Mouxiang Chen; Jiajun Zhang; Zeyu Cui; Binyuan Hui; Junyang Lin; | code |
| 2 | PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models By Watching Stuff Drop. Highlight: This work studies the process of post-training these models for accurate world modeling through the lens of the simple, yet fundamental, physics task of modeling object freefall. | Chenyu Li; Oscar Michel; Xichen Pan; Sainan Liu; Mike Roberts; Saining Xie; | code |
| 3 | Prompt-to-Leaderboard: Prompt-Adaptive LLM Evaluations. Highlight: This averaging obscures user- and prompt-specific variations in model performance. To address this, we propose Prompt-to-Leaderboard (P2L), a method that produces leaderboards specific to a prompt or set of prompts. | Evan Frick; Connor Chen; Joseph Tennyson; Tianle Li; Wei-Lin Chiang; Anastasios Nikolas Angelopoulos; Ion Stoica; | code |
| 4 | PaperBench: Evaluating AI’s Ability to Replicate AI Research. Highlight: We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research. | Giulio Starace; Oliver Jaffe; Dane Sherburn; James Aung; Jun Shern Chan; Leon Maksin; Rachel Dias; Evan Mays; Benjamin Kinsella; Wyatt Thompson; Johannes Heidecke; Amelia Glaese; Tejal Patwardhan; | code |
| 5 | Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models. Highlight: In this work, we describe a system that uses vision-language models in a hierarchical structure, first reasoning over complex prompts and user feedback to deduce the most appropriate next step to fulfill the task, and then performing that step with low-level actions. | Lucy Xiaoyang Shi; brian ichter; Michael Robert Equi; Liyiming Ke; Karl Pertsch; Quan Vuong; James Tanner; Anna Walling; Haohuan Wang; Niccolo Fusai; Adrian Li-Bell; Danny Driess; Lachy Groom; Sergey Levine; Chelsea Finn; | code |
| 6 | Roll The Dice & Look Before You Leap: Going Beyond The Creative Limits of Next-token Prediction. Highlight: We design a suite of minimal algorithmic tasks that are a loose abstraction of _open-ended_ real-world tasks. This allows us to cleanly and controllably quantify the creative limits of present-day language models. | Vaishnavh Nagarajan; Chen Henry Wu; Charles Ding; Aditi Raghunathan; | code |
| 7 | xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference. Highlight: In this work, we introduce xLSTM 7B, a 7-billion-parameter LLM that combines xLSTM’s architectural benefits with targeted optimizations for fast and efficient inference. | Maximilian Beck; Korbinian Pöppel; Phillip Lippe; Richard Kurle; Patrick M Blies; Günter Klambauer; Sebastian Böck; Sepp Hochreiter; | code |
| 8 | SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation. Highlight: While significant progress has been made in robotic manipulation, existing approaches often fall short in generalizing to complex environmental variations and addressing memory-dependent tasks. To bridge this gap, we introduce **SAM2Act**, a multi-view robotic transformer-based policy that leverages multi-resolution upsampling with visual representations from a large-scale foundation model. | Haoquan Fang; Markus Grotz; Wilbert Pumacay; Yi Ru Wang; Dieter Fox; Ranjay Krishna; Jiafei Duan; | code |
| 9 | Any4: Learned 4-bit Numeric Representation for LLMs. Highlight: We present any4, a learned 4-bit weight quantization solution for large language models (LLMs) providing arbitrary numeric representations without requiring pre-processing of weights or activations. | Mostafa Elhoushi; Jeff Johnson; | code |
| 10 | Agent-as-a-Judge: Evaluate Agents with Agents. Highlight: These approaches either focus exclusively on final outcomes—ignoring the step-by-step nature of the thinking done by agentic systems—or require excessive manual labour. To address this, we introduce the **Agent-as-a-Judge** framework, wherein agentic systems are used to evaluate agentic systems. | Mingchen Zhuge; Changsheng Zhao; Dylan R. Ashley; Wenyi Wang; Dmitrii Khizbullin; Yunyang Xiong; Zechun Liu; Ernie Chang; Raghuraman Krishnamoorthi; Yuandong Tian; Yangyang Shi; Vikas Chandra; Jürgen Schmidhuber; | code |
| 11 | Context Is Key: A Benchmark for Forecasting with Essential Textual Information. Highlight: We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters, and propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark. | Andrew Robert Williams; Arjun Ashok; Étienne Marcotte; Valentina Zantedeschi; Jithendaraa Subramanian; Roland Riachi; James Requeima; Alexandre Lacoste; Irina Rish; Nicolas Chapados; Alexandre Drouin; | code |
| 12 | Taming Rectified Flow for Inversion and Editing. Highlight: Despite their robust generative capabilities, these models often struggle with inversion inaccuracies, which could further limit their effectiveness in downstream tasks such as image and video editing. To address this issue, we propose RF-Solver, a novel training-free sampler that effectively enhances inversion precision by mitigating the errors in the ODE-solving process of rectified flow. | Jiangshan Wang; Junfu Pu; Zhongang Qi; Jiayi Guo; Yue Ma; Nisha Huang; Yuxin Chen; Xiu Li; Ying Shan; | code |
| 13 | Detecting Strategic Deception with Linear Probes. Highlight: Monitoring outputs alone is insufficient, since the AI might produce seemingly benign outputs while its internal reasoning is misaligned. We thus evaluate if linear probes can robustly detect deception by monitoring model activations. | Nicholas Goldowsky-Dill; Bilal Chughtai; Stefan Heimersheim; Marius Hobbhahn; | code |
| 14 | MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance. Highlight: We present MimicMotion, a framework for generating high-quality human videos of arbitrary length using motion guidance. | Yuang Zhang; Jiaxi Gu; Li-Wen Wang; Han Wang; JunqiCheng; Yuefeng Zhu; FangYuan Zou; | code |
| 15 | SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models. Highlight: We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. | Yung-Sung Chuang; Benjamin Cohen-Wang; Zejiang Shen; Zhaofeng Wu; Hu Xu; Xi Victoria Lin; James R. Glass; Shang-Wen Li; Wen-tau Yih; | code |
| 16 | ShieldAgent: Shielding Agents Via Verifiable Safety Policy Reasoning. Highlight: More critically, existing guardrails for LLMs are not applicable due to the complex and dynamic nature of agents. To tackle these challenges, we propose ShieldAgent, the first guardrail agent designed to enforce explicit safety policy compliance for the action trajectory of other protected agents through logical reasoning. | Zhaorun Chen; Mintong Kang; Bo Li; | code |
| 17 | The Diffusion Duality. Highlight: However, they are typically outperformed by autoregressive models and masked diffusion models. In this work, we narrow this performance gap by leveraging a key insight: Uniform-state diffusion processes naturally emerge from an underlying Gaussian diffusion. | Subham Sekhar Sahoo; Justin Deschenaux; Aaron Gokaslan; Guanghan Wang; Justin T Chiu; Volodymyr Kuleshov; | code |
| 18 | History-Guided Video Diffusion. Highlight: However, we find two key challenges to guiding with variable-length history: architectures that only support fixed-size conditioning, and the empirical observation that CFG-style history dropout performs poorly. To address this, we propose the Diffusion Forcing Transformer (DFoT), a video diffusion architecture and theoretically grounded training objective that jointly enable conditioning on a flexible number of history frames. | Kiwhan Song; Boyuan Chen; Max Simchowitz; Yilun Du; Russ Tedrake; Vincent Sitzmann; | code |
| 19 | Spatial Reasoning with Denoising Models. Highlight: We introduce Spatial Reasoning Models (SRMs), a framework to perform reasoning over sets of continuous variables via denoising generative models. To measure this, we introduce a set of benchmark tasks that test the quality of complex reasoning in generative models and can quantify hallucination. | Christopher Wewer; Bartlomiej Pogodzinski; Bernt Schiele; Jan Eric Lenssen; | code |
| 20 | Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction. Highlight: We introduce Aguvis, a unified vision-based framework for autonomous GUI agents that directly operates on screen images, standardizes cross-platform interactions and incorporates structured reasoning via inner monologue. | Yiheng Xu; Zekun Wang; Junli Wang; Dunjie Lu; Tianbao Xie; Amrita Saha; Doyen Sahoo; Tao Yu; Caiming Xiong; | code |
| 21 | Weak-to-Strong Jailbreaking on Large Language Models. Highlight: In this paper, we propose the **weak-to-strong** jailbreaking attack, an efficient inference-time attack for aligned LLMs to produce harmful text. | Xuandong Zhao; Xianjun Yang; Tianyu Pang; Chao Du; Lei Li; Yu-Xiang Wang; William Yang Wang; | code |
| 22 | Effective and Efficient Masked Image Generation Models. Highlight: Building upon this insight, we carefully explore the design space of training and sampling, identifying key factors that contribute to both performance and efficiency. Based on the improvements observed during this exploration, we develop our model, referred to as **eMIGM**. | Zebin You; Jingyang Ou; Xiaolu Zhang; Jun Hu; JUN ZHOU; Chongxuan Li; | code |
| 23 | ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers Under Domain Shifts. Highlight: In this work, we introduce ExPLoRA, a highly effective technique to improve transfer learning of pre-trained vision transformers (ViTs) under domain shifts. | Samar Khanna; Medhanie Irgau; David B. Lobell; Stefano Ermon; | code |
| 24 | AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses. Highlight: We introduce AutoAdvExBench, a benchmark to evaluate if large language models (LLMs) can autonomously exploit defenses to adversarial examples. | Nicholas Carlini; Edoardo Debenedetti; Javier Rando; Milad Nasr; Florian Tramèr; | code |
| 25 | Massive Values in Self-Attention Modules Are The Key to Contextual Knowledge Understanding. Highlight: Large language models (LLMs) have achieved remarkable success in contextual knowledge understanding. In this paper, we show for the first time that these concentrated massive values consistently emerge in specific regions of attention queries (Q) and keys (K) while not having such patterns in values (V) in various modern transformer-based LLMs. | Mingyu Jin; Kai Mei; Wujiang Xu; Mingjie Sun; Ruixiang Tang; Mengnan Du; Zirui Liu; Yongfeng Zhang; | code |
| 26 | Mitigating Object Hallucination in Large Vision-Language Models Via Image-Grounded Guidance. Highlight: However, these approaches require either costly training or fine-tuning, or API access to proprietary LLMs for post-generation correction. In response to these limitations, we propose Mitigating hallucinAtion via image-gRounded guIdaNcE (MARINE), a framework that is both training-free and API-free. | Linxi Zhao; Yihe Deng; Weitong Zhang; Quanquan Gu; | code |
| 27 | FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching. Highlight: However, VAR encounters two primary challenges: (1) its complex and rigid scale design limits generalization in next-scale prediction, and (2) the generator’s dependence on a discrete tokenizer with the same complex scale structure restricts modularity and flexibility in updating the tokenizer. To address these limitations, we introduce FlowAR, a general next-scale prediction method featuring a streamlined scale design, where each subsequent scale is simply double the previous one. | Sucheng Ren; Qihang Yu; Ju He; Xiaohui Shen; Alan Yuille; Liang-Chieh Chen; | code |
| 28 | ProofAug: Efficient Neural Theorem Proving Via Fine-grained Proof Structure Analysis. Highlight: However, for proof synthesis with LLMs, previous work applies automation tools either only when explicitly invoked by the model or at a single granularity level, failing to fully exploit their power. To solve this issue, we propose ProofAug, a procedure that equips LLMs with automation methods at various granularities through fine-grained structure analysis of model-generated proof proposals. | Haoxiong Liu; Jiacheng Sun; Zhenguo Li; Andrew C Yao; | code |
| 29 | Improving The Diffusability of Autoencoders. Highlight: In this work, we perform a spectral analysis of modern autoencoders and identify inordinate high-frequency components in their latent spaces, which are especially pronounced in autoencoders with a large bottleneck channel size. | Ivan Skorokhodov; Sharath Girish; Benran Hu; Willi Menapace; Yanyu Li; Rameen Abdal; Sergey Tulyakov; Aliaksandr Siarohin; | code |
| 30 | TabICL: A Tabular Foundation Model for In-Context Learning on Large Data. Highlight: We introduce TabICL, a tabular foundation model for classification, pretrained on synthetic datasets with up to 60K samples and capable of handling 500K samples on affordable resources. | Jingang QU; David Holzmüller; Gaël Varoquaux; Marine Le Morvan; | code |
| 31 | CoMemo: LVLMs Need Image Context with Image Memory. Highlight: However, inherited LLM architectural designs introduce suboptimal characteristics for multimodal processing. First, LVLMs exhibit a bimodal distribution in attention allocation, leading to the progressive neglect of middle visual content as context expands. Second, conventional positional encoding schemes fail to preserve vital 2D structural relationships when processing dynamic high-resolution images. To address these limitations, we propose **CoMemo**, a dual-path architecture that combines a **Co**ntext image path with an image **Memo**ry path for visual processing, effectively alleviating visual information neglect. | Shi Liu; Weijie Su; Xizhou Zhu; Wenhai Wang; Jifeng Dai; | code |
| 32 | Temporal Query Network for Efficient Multivariate Time Series Forecasting. Highlight: In this paper, we propose a novel technique called Temporal Query (TQ) to more effectively capture multivariate correlations, thereby improving model performance in MTSF tasks. | Shengsheng Lin; Haojun Chen; Haijie Wu; Chunyun Qiu; Weiwei Lin; | code |
| 33 | MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design. Highlight: We explore quantization for MoE models and highlight two key insights: 1) linear blocks exhibit varying quantization sensitivity, and 2) divergent expert activation frequencies create heterogeneous computational characteristics. Based on these observations, we introduce MxMoE, a mixed-precision optimization framework for MoE models that considers both algorithmic and system perspectives. | Haojie Duanmu; Xiuhong Li; Zhihang Yuan; Size Zheng; Jiangfei Duan; Xingcheng Zhang; Dahua Lin; | code |
| 34 | An All-Atom Generative Model for Designing Protein Complexes. Highlight: Despite these developments, the study and modeling of multi-chain proteins remain largely uncharted, though they are vital for understanding biological functions. Recognizing the importance of these interactions, we introduce APM (all-Atom Protein generative Model), a model specifically designed for modeling multi-chain proteins. | Ruizhe Chen; Dongyu Xue; Xiangxin Zhou; Zaixiang Zheng; xiangxiang Zeng; Quanquan Gu; | code |
| 35 | Predictive Data Selection: The Data That Predicts Is The Data That Teaches. Highlight: In this work, we aim to directly estimate the contribution of data during pretraining and select pretraining data in an efficient manner. | KaShun SHUM; Yuzhen Huang; Hongjian Zou; dingqi; YiXuan Liao; Xiaoxin Chen; Qian Liu; Junxian He; | code |
| 36 | Sparsing Law: Towards Large Language Models with Greater Activation Sparsity. Highlight: In this paper, we address three underexplored research questions: (1) How can activation sparsity be measured more accurately? | Yuqi Luo; Chenyang Song; Xu Han; Yingfa Chen; Chaojun Xiao; Xiaojun Meng; Liqun Deng; Jiansheng Wei; Zhiyuan Liu; Maosong Sun; | code |
| 37 | From Feature Interaction to Feature Generation: A Generative Paradigm of CTR Prediction Models. Highlight: Unlike sequential recommendation, which naturally fits a generative next-item prediction paradigm, it is hard to formulate CTR models into this paradigm without an explicit feature order. Therefore, we propose a novel Supervised Feature Generation framework for CTR models, shifting from the discriminative feature interaction paradigm to the generative feature generation paradigm. | Mingjia Yin; Junwei Pan; Hao Wang; Ximei Wang; Shangyu Zhang; Jie Jiang; Defu Lian; Enhong Chen; | code |
| 38 | What If We Recaption Billions of Web Images with LLaMA-3? Highlight: However, large-scale investigations in this area remain predominantly closed-source. Our paper aims to bridge this community effort, leveraging the powerful and *open-sourced* LLaMA-3, a GPT-4 level LLM. | Xianhang Li; Haoqin Tu; Mude Hui; Zeyu Wang; Bingchen Zhao; Junfei Xiao; Sucheng Ren; Jieru Mei; Qing Liu; Huangjie Zheng; Yuyin Zhou; Cihang Xie; | code |
| 39 | TimeBridge: Non-Stationarity Matters for Long-term Time Series Forecasting. Highlight: In this paper, we propose TimeBridge, a novel framework designed to bridge the gap between non-stationarity and dependency modeling in long-term time series forecasting. | Peiyuan Liu; Beiliang Wu; Yifan Hu; Naiqi Li; Tao Dai; Jigang Bao; Shu-Tao Xia; | code |
| 40 | On Path to Multimodal Generalist: General-Level and General-Bench. Highlight: In this project, we introduce an evaluation framework to delineate the capabilities and behaviors of current multimodal generalists. To evaluate the comprehensive abilities of various generalists, we present a massive multimodal benchmark, **General-Bench**, which encompasses a broader spectrum of skills, modalities, formats, and capabilities, including over 700 tasks and 325,800 instances. | Hao Fei; Yuan Zhou; Juncheng Li; Xiangtai Li; Qingshan Xu; Bobo Li; Shengqiong Wu; Yaoting Wang; Junbao Zhou; Jiahao Meng; Qingyu Shi; Zhiyuan Zhou; Liangtao Shi; Minghe Gao; Daoan Zhang; Zhiqi Ge; Siliang Tang; Kaihang Pan; Yaobo Ye; Haobo Yuan; Tao Zhang; Weiming Wu; Tianjie Ju; Zixiang Meng; Shilin Xu; Liyu Jia; Wentao Hu; Meng Luo; Jiebo Luo; Tat-Seng Chua; Shuicheng YAN; Hanwang Zhang; | code |
| 41 | Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment. Highlight: In this paper, we introduce *preference embedding*, an approach that embeds responses into a latent space to capture intricate preference structures efficiently, achieving linear query complexity. | Yifan Zhang; Ge Zhang; Yue Wu; Kangping Xu; Quanquan Gu; | code |
| 42 | On The Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents. Highlight: To simulate faulty agents, we propose two approaches—AutoTransform and AutoInject—which introduce mistakes into the agents’ responses. | Jen-tse Huang; Jiaxu Zhou; Tailin Jin; Xuhui Zhou; Zixi Chen; Wenxuan Wang; Youliang Yuan; Michael Lyu; Maarten Sap; | code |
| 43 | Contrastive Private Data Synthesis Via Weighted Multi-PLM Fusion. Highlight: However, existing methods relying on pre-trained models for data synthesis often struggle in data-deficient scenarios, suffering from limited sample size, inevitable generation noise, and existing pre-trained model bias. To address these challenges, we propose a novel contr**A**stive private data **S**ynthesis via **W**eighted multiple **P**re-trained generative models framework, named **WASP**. | Tianyuan Zou; Yang Liu; Peng Li; Yufei Xiong; Jianqing Zhang; Jingjing Liu; Xiaozhou Ye; Ye Ouyang; Ya-Qin Zhang; | code |
| 44 | OR-Bench: An Over-Refusal Benchmark for Large Language Models. Highlight: This study proposes a novel method for automatically generating large-scale over-refusal datasets. | Justin Cui; Wei-Lin Chiang; Ion Stoica; Cho-Jui Hsieh; | code |
| 45 | TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization. Highlight: Using the obtained reward and Bradley-Terry model, this work establishes a framework of computable loss functions with token-level reward guidance for DPO, and proposes a practical reward guidance based on the induced DPO reward. | Mingkang Zhu; Xi Chen; Zhongdao Wang; Bei Yu; Hengshuang Zhao; Jiaya Jia; | code |
| 46 | Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond. Highlight: However, state-of-the-art unlearning methods face a critical vulnerability: they are susceptible to “relearning” the removed information from a small number of forget data points, known as relearning attacks. In this paper, we systematically investigate how to make unlearned models robust against such attacks. | Chongyu Fan; Jinghan Jia; Yihua Zhang; Anil Ramakrishna; Mingyi Hong; Sijia Liu; | code |
| 47 | MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents. Highlight: We present MELON (Masked re-Execution and TooL comparisON), a novel IPI defense. | Kaijie Zhu; Xianjun Yang; Jindong Wang; Wenbo Guo; William Yang Wang; | code |
| 48 | Elucidating The Design Space of Multimodal Protein Language Models. Highlight: In this paper, we systematically elucidate the design space of multimodal PLMs to overcome their limitations. | Cheng-Yen Hsieh; Xinyou Wang; Daiheng Zhang; Dongyu Xue; Fei Ye; Shujian Huang; Zaixiang Zheng; Quanquan Gu; | code |
| 49 | ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation. Highlight: This lack of context-awareness can lead to suboptimal performance, as the same action may hold different meanings depending on its surrounding context. To address this issue, we propose ActionPiece to explicitly incorporate context when tokenizing action sequences. | Yupeng Hou; Jianmo Ni; Zhankui He; Noveen Sachdeva; Wang-Cheng Kang; Ed H. Chi; Julian McAuley; Derek Zhiyuan Cheng; | code |
| 50 | H-Tuning: Toward Low-Cost and Efficient ECG-based Cardiovascular Disease Detection with Pre-Trained Models. Highlight: Here, we propose a holistic method (H-Tuning) for low-cost and efficient fine-tuning of pre-trained models on downstream datasets. | Rushuang Zhou; Yuanting Zhang; Yining Dong; | code |
| 51 | T1: Advancing Language Model Reasoning Through Reinforcement Learning and Inference Scaling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present T1 to scale RL by encouraging exploration and understand inference scaling. |
Zhenyu Hou; Xin Lv; Rui Lu; Jiajie Zhang; Yujiang Li; Zijun Yao; Juanzi Li; Jie Tang; Yuxiao Dong; | code |
| 52 | Unifying 2D and 3D Vision-Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel language-conditioned mask decoder shared across 2D and 3D modalities to ground objects effectively in both RGB and RGB-D images, outperforming box-based approaches. |
Ayush Jain; Alexander Swerdlow; Yuzhou Wang; Sergio Arnaud; Ada Martin; Alexander Sax; Franziska Meier; Katerina Fragkiadaki; | code |
| 53 | RStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present rStar-Math to demonstrate that small language models (SLMs) can rival or even surpass the math reasoning capability of OpenAI o1, without distillation from superior models. |
Xinyu Guan; Li Lyna Zhang; Yifei Liu; Ning Shang; Youran Sun; Yi Zhu; Fan Yang; Mao Yang; | code |
| 54 | Improving LLM Safety Alignment with Dual-Objective Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Direct preference optimization (DPO), a widely deployed alignment method, exhibits limitations in both experimental and theoretical contexts as its loss function proves suboptimal for refusal learning. Through gradient-based analysis, we identify these shortcomings and propose an improved safety alignment that disentangles DPO objectives into two components: (1) robust refusal training, which encourages refusal even when partial unsafe generations are produced, and (2) targeted unlearning of harmful knowledge. |
Xuandong Zhao; Will Cai; Tianneng Shi; David Huang; Licong Lin; Song Mei; Dawn Song; | code |
| 55 | MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose MMedPO, a novel multimodal medical preference optimization approach that considers the clinical relevance of preference samples to enhance Med-LVLM alignment. |
Kangyu Zhu; Peng Xia; Yun Li; Hongtu Zhu; Sheng Wang; Huaxiu Yao; | code |
| 56 | Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Alongside PhyGenBench, we propose a novel evaluation framework called PhyGenEval.We will release the data and codes at https://github.com/OpenGVLab/PhyGenBench |
Fanqing Meng; Jiaqi Liao; Xinyu Tan; Quanfeng Lu; Wenqi Shao; Kaipeng Zhang; Yu Cheng; Dianqi Li; Ping Luo; | code |
| 57 | Time-VLM: Exploring Multimodal Vision-Language Models for Augmented Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Conversely, vision captures intricate temporal patterns but lacks semantic context, limiting the complementary potential of these modalities. To address this, we propose Time-VLM, a novel multimodal framework that leverages pre-trained Vision-Language Models (VLMs) to bridge temporal, visual, and textual modalities for enhanced forecasting. |
Siru Zhong; Weilin Ruan; Ming Jin; Huan Li; Qingsong Wen; Yuxuan Liang; | code |
| 58 | Watch Out Your Album! On The Inadvertent Privacy Memorization in Multi-Modal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate how randomly generated task-irrelevant private content can become spuriously correlated with downstream objectives due to partial mini-batch training dynamics, thus causing inadvertent memorization. |
Tianjie Ju; Yi Hua; Hao Fei; Zhenyu Shao; Yubin Zheng; Haodong Zhao; Mong-Li Lee; Wynne Hsu; Zhuosheng Zhang; Gongshen Liu; | code |
| 59 | Heads Up! Large Language Models Can Perform Tasks Without Your Instruction Via Selective Attention Head Masking Highlight: In this paper, we investigate the modules inside LLMs and demonstrate that, by simply masking or retaining specific attention heads during inference, LLMs can exhibit specific task functionalities without requiring explicit instructions or modifications to the model parameters. |
Senyu Han; Hongchuan Zeng; Kai Yu; Lu Chen; | code |
| 60 | A General Framework for Inference-time Scaling and Steering of Diffusion Models Highlight: In this work, we propose FK steering, a framework for inference-time steering of diffusion models with reward functions. |
Raghav Singhal; Zachary Horvitz; Ryan Teehan; Mengye Ren; Zhou Yu; Kathleen McKeown; Rajesh Ranganath; | code |
| 61 | Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems Highlight: In this paper, we propose and formulate a new research area: automated failure attribution for LLM multi-agent systems. |
Shaokun Zhang; Ming Yin; Jieyu Zhang; Jiale Liu; Zhiguang Han; Jingyang Zhang; Beibin Li; Chi Wang; Huazheng Wang; Yiran Chen; Qingyun Wu; | code |
| 62 | Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection Highlight: In this paper, we start from a new perspective to excavate the reason behind the generalization failure in AIGI detection, named the asymmetry phenomenon, where a naively trained detector tends to overfit to the limited and monotonous fake patterns, causing the feature space to become highly constrained and low-ranked, which is shown to seriously limit expressivity and generalization. |
Zhiyuan Yan; Jiangming Wang; Peng Jin; Ke-Yue Zhang; Chengchun Liu; Shen Chen; Taiping Yao; Shouhong Ding; Baoyuan Wu; Li Yuan; | code |
| 63 | AssistanceZero: Scalably Solving Assistance Games Highlight: We present the first scalable approach to solving assistance games and apply it to a new, challenging Minecraft-based assistance game with over $10^{400}$ possible goals. |
Cassidy Laidlaw; Eli Bronstein; Timothy Guo; Dylan Feng; Lukas Berglund; Justin Svegliato; Stuart Russell; Anca Dragan; | code |
| 64 | NextCoder: Robust Adaptation of Code LMs to Diverse Code Edits Highlight: However, contemporary code language models (LMs) lack the ability to handle diverse types of code-edit requirements. In this work, we attempt to overcome this shortcoming through (1) a novel synthetic data generation pipeline and (2) a robust model adaptation algorithm. |
Tushar Aggarwal; Swayam Singh; Abhijeet Awasthi; Aditya Kanade; Nagarajan Natarajan; | code |
| 65 | OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction Highlight: We propose OTTER, a novel VLA architecture that leverages these existing alignments through explicit, text-aware visual feature extraction. |
Huang Huang; Fangchen Liu; Letian Fu; Tingfan Wu; Mustafa Mukadam; Jitendra Malik; Ken Goldberg; Pieter Abbeel; | code |
| 66 | TimeBase: The Power of Minimalism in Efficient Long-term Time Series Forecasting Highlight: In this paper, we introduce TimeBase, an ultra-lightweight network to harness the power of minimalism in LTSF. |
Qihe Huang; Zhengyang Zhou; Kuo Yang; Zhongchao Yi; Xu Wang; Yang Wang; | code |
| 67 | LieRE: Lie Rotational Positional Encodings Highlight: However, RoPE faces significant limitations beyond language processing: it is constrained to one-dimensional sequence data and, even with learnable phases, offers limited representational capacity. We address these challenges with Lie Relative Encodings (LieRE), which generalizes RoPE to high-dimensional rotation matrices by leveraging their Lie group structure. |
Sophie Ostmeier; Brian Axelrod; Maya Varma; Michael Moseley; Akshay S Chaudhari; Curtis Langlotz; | code |
| 68 | Parrot: Multilingual Visual Instruction Tuning Highlight: To address this, we propose Parrot, a novel approach that leverages textual guidance for visual token alignment at the language level. Additionally, we introduce the Massive Multilingual Multimodal Benchmark (MMMB), a new benchmark comprising 6 languages, 15 categories, and 12,000 questions, to assess multilingual capabilities. |
Hai-Long Sun; Da-Wei Zhou; Yang Li; Shiyin Lu; Chao Yi; Qing-Guo Chen; Zhao Xu; Weihua Luo; Kaifu Zhang; De-Chuan Zhan; Han-Jia Ye; | code |
| 69 | The Jailbreak Tax: How Useful Are Your Jailbreak Outputs? Highlight: In this paper, we ask whether the model outputs produced by existing jailbreaks are actually *useful*. Overall, our work proposes jailbreak utility as a new important metric in AI safety, and introduces benchmarks to evaluate existing and future jailbreaks. |
Kristina Nikolić; Luze Sun; Jie Zhang; Florian Tramèr; | code |
| 70 | AdvAgent: Controllable Blackbox Red-teaming on Web Agents Highlight: However, their access to sensitive resources and autonomous decision-making also introduce significant security risks, where successful attacks could lead to severe consequences. To systematically uncover these vulnerabilities, we propose AdvAgent, a black-box red-teaming framework for attacking web agents. |
Chejian Xu; Mintong Kang; Jiawei Zhang; Zeyi Liao; Lingbo Mo; Mengqi Yuan; Huan Sun; Bo Li; | code |
| 71 | (How) Do Language Models Track State? Highlight: We study state tracking in LMs trained or fine-tuned to compose permutations (i.e., to compute the order of a set of objects after a sequence of swaps). |
Belinda Z. Li; Zifan Carl Guo; Jacob Andreas; | code |
| 72 | NoLiMa: Long-Context Evaluation Beyond Literal Matching Highlight: However, in these benchmarks, models can exploit existing literal matches between the needle and haystack to simplify the task. To address this, we introduce NoLiMa, a benchmark extending NIAH with a carefully designed needle set, where questions and needles have minimal lexical overlap, requiring models to infer latent associations to locate the needle within the haystack. |
Ali Modarressi; Hanieh Deilamsalehy; Franck Dernoncourt; Trung Bui; Ryan A. Rossi; Seunghyun Yoon; Hinrich Schuetze; | code |
| 73 | Dendritic Localized Learning: Toward Biologically Plausible Algorithm Highlight: Although various alternative learning approaches have been proposed to address these issues, most either fail to satisfy all three criteria simultaneously or yield suboptimal results. Inspired by the dynamics and plasticity of pyramidal neurons, we propose Dendritic Localized Learning (DLL), a novel learning algorithm designed to overcome these challenges. |
Changze Lv; Jingwen Xu; Yiyang Lu; Xiaohua Wang; Zhenghua Wang; Zhibo Xu; Di Yu; Xin Du; Xiaoqing Zheng; Xuanjing Huang; | code |
| 74 | All-atom Diffusion Transformers: Unified Generative Modelling of Molecules and Materials Highlight: We introduce the All-atom Diffusion Transformer (ADiT), a unified latent diffusion framework for jointly generating both periodic materials and non-periodic molecular systems using the same model: (1) An autoencoder maps a unified, all-atom representation of molecules and materials to a shared latent embedding space; and (2) A diffusion model is trained to generate new latent embeddings that the autoencoder can decode to sample new molecules or materials. |
Chaitanya K. Joshi; Xiang Fu; Yi-Lun Liao; Vahe Gharakhanyan; Benjamin Kurt Miller; Anuroop Sriram; Zachary Ward Ulissi; | code |
| 75 | AnyEdit: Edit Any Knowledge Encoded in Language Models Highlight: These limitations arise from their reliance on editing a single token’s hidden state, a limitation we term the “efficacy barrier”. To solve this, we propose \textbf{AnyEdit}, a new autoregressive editing paradigm. |
Houcheng Jiang; Junfeng Fang; Ningyu Zhang; Mingyang Wan; Guojun Ma; Xiang Wang; Xiangnan He; Tat-Seng Chua; | code |
| 76 | VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters Highlight: Our key insight is that a visual masked autoencoder, pre-trained on the ImageNet dataset, can naturally be a numeric series forecaster. |
Mouxiang Chen; Lefei Shen; Zhuo Li; Xiaoyun Joy Wang; Jianling Sun; Chenghao Liu; | code |
| 77 | Sundial: A Family of Highly Capable Time Series Foundation Models Highlight: We introduce Sundial, a family of native, flexible, and scalable time series foundation models. |
Yong Liu; Guo Qin; Zhiyuan Shi; Zhi Chen; Caiyin Yang; Xiangdong Huang; Jianmin Wang; Mingsheng Long; | code |
| 78 | Efficient Federated Incomplete Multi-View Clustering Highlight: Federated multi-view clustering (FMVC) has emerged as a potential solution, but existing approaches suffer from substantial limitations, including excessive communication overhead, insufficient privacy protection, and inadequate handling of missing views. To address these issues, we propose Efficient Federated Incomplete Multi-View Clustering (EFIMVC), a novel framework that introduces a localized optimization strategy to significantly reduce communication costs while ensuring theoretical convergence. |
Suyuan Liu; Hao Yu; Hao Tan; KE LIANG; Siwei Wang; Shengju Yu; En Zhu; Xinwang Liu; | code |
| 79 | ReferSplat: Referring Segmentation in 3D Gaussian Splatting Highlight: To address these challenges, we propose ReferSplat, a framework that explicitly models 3D Gaussian points with natural language expressions in a spatially aware paradigm. To support research in this area, we construct the first R3DGS dataset, Ref-LERF. |
Shuting He; Guangquan Jie; Changshuo Wang; Yun Zhou; Shuming Hu; Guanbin Li; Henghui Ding; | code |
| 80 | Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers Highlight: In this paper, we present Unisolver, a novel Transformer model trained on diverse data and conditioned on diverse PDEs, aiming towards a universal neural PDE solver capable of solving a wide scope of PDEs. |
Hang Zhou; Yuezhou Ma; Haixu Wu; Haowen Wang; Mingsheng Long; | code |
| 81 | CommVQ: Commutative Vector Quantization for KV Cache Compression Highlight: Large Language Models (LLMs) are increasingly used in applications requiring long context lengths, but the key-value (KV) cache often becomes a memory bottleneck on GPUs as context grows. To address this, we propose Commutative Vector Quantization (CommVQ) to significantly reduce memory usage for long-context LLM inference. |
Junyan Li; Yang Zhang; Muhammad Yusuf Hassan; Talha Chafekar; Tianle Cai; Zhile Ren; Pengsheng Guo; Foroozan Karimzadeh; Colorado Reed; Chong Wang; Chuang Gan; | code |
| 82 | Improving Your Model Ranking on Chatbot Arena By Vote Rigging Highlight: However, this strategy is practically inefficient because there are over $190$ models on Chatbot Arena and on average only about 1% of new battles will involve $m\_{t}$. To overcome this, we propose an **omnipresent rigging** strategy, which exploits the fact that, under the Elo rating mechanism of Chatbot Arena, any new vote on a battle can influence the ranking of the target model $m\_{t}$, even if $m\_{t}$ is not directly involved in the battle. |
Rui Min; Tianyu Pang; Chao Du; Qian Liu; Minhao Cheng; Min Lin; | code |
| 83 | ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling Highlight: In this study, we introduce ALMTokenizer, a novel low-bitrate and semantically rich audio codec tokenizer for audio language models. |
Dongchao Yang; Songxiang Liu; Haohan Guo; Jiankun Zhao; Yuanyuan Wang; Helin Wang; Zeqian Ju; Xubo Liu; Xueyuan Chen; Xu Tan; Xixin Wu; Helen M. Meng; | code |
| 84 | Geometry Informed Tokenization of Molecules for Language Model Generation Highlight: Although tokenization of molecular graphs exists, that for 3D geometries is largely unexplored. Here, we attempt to bridge this gap by proposing a novel method which converts molecular geometries into SE(3)-invariant 1D discrete sequences. |
Xiner Li; Limei Wang; Youzhi Luo; Carl Edwards; Shurui Gui; Yuchao Lin; Heng Ji; Shuiwang Ji; | code |
| 85 | Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images Highlight: In this work, we propose **Querent**, *i.e.*, the **quer**y-awar**e** long co**nt**extual dynamic modeling framework, which achieves a theoretically bounded approximation of full self-attention while delivering practical efficiency. |
Zhengrui Guo; Qichen Sun; Jiabo MA; Lishuang Feng; Jinzhuo Wang; Hao Chen; | code |
| 86 | CASE-Bench: Context-Aware SafEty Benchmark for Large Language Models Highlight: Code and data used in the paper are available at https://anonymous.4open.science/r/CASEBench-D5DB. |
Guangzhi Sun; Xiao Zhan; Shutong Feng; Phil Woodland; Jose Such; | code |
| 87 | Do NOT Think That Much for 2+3=? On The Overthinking of Long Reasoning Models Highlight: Using a self-training paradigm, we propose strategies to mitigate overthinking, simplifying reasoning processes without compromising accuracy. |
Xingyu Chen; Jiahao Xu; Tian Liang; Zhiwei He; Jianhui Pang; Dian Yu; Linfeng Song; Qiuzhi Liu; Mengfei Zhou; Zhuosheng Zhang; Rui Wang; Zhaopeng Tu; Haitao Mi; Dong Yu; | code |
| 88 | DPO Meets PPO: Reinforced Token Optimization for RLHF Highlight: Despite the great successes of PPO in the alignment of state-of-the-art closed-source large language models (LLMs), its open-source implementation is still largely sub-optimal, as widely reported by numerous research studies. To address these issues, we introduce a framework that models RLHF problems as a Markov decision process (MDP), enabling the capture of fine-grained token-wise information. |
Han Zhong; Zikang Shan; Guhao Feng; Wei Xiong; Xinle Cheng; Li Zhao; Di He; Jiang Bian; Liwei Wang; | code |
| 89 | MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning Highlight: In this work, we present MENTOR, a method that improves both the *architecture* and *optimization* of RL agents. |
Suning Huang; Zheyu Aqa Zhang; Tianhai Liang; Yihan Xu; Zhehao Kou; Chenhao Lu; Guowei Xu; Zhengrong Xue; Huazhe Xu; | code |
| 90 | Visual Autoregressive Modeling for Image Super-Resolution Highlight: Building upon the tremendous success of autoregressive models in the language domain, we propose \textbf{VARSR}, a novel visual autoregressive modeling framework for ISR in the form of next-scale prediction. Furthermore, we collect large-scale data and design a training process to obtain robust generative priors. |
Yunpeng Qu; Kun Yuan; Jinhua Hao; Kai Zhao; Qizhi Xie; Ming Sun; Chao Zhou; | code |
| 91 | Probing Visual Language Priors in VLMs Highlight: Vision-Language Models (VLMs) may over-rely on visual language priors from their training data rather than true visual reasoning. To investigate this, we introduce ViLP, a benchmark featuring deliberately out-of-distribution images synthesized via image generation models and out-of-distribution Q\&A pairs. |
Tiange Luo; Ang Cao; Gunhee Lee; Justin Johnson; Honglak Lee; | code |
| 92 | Reward-Augmented Data Enhances Direct Preference Alignment of LLMs Highlight: We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset. |
Shenao Zhang; Zhihan Liu; Boyi Liu; Yufeng Zhang; Yingxiang Yang; Yongfei Liu; Liyu Chen; Tao Sun; Zhaoran Wang; | code |
| 93 | SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models Highlight: However, while uniform-precision quantization is computationally efficient, it often compromises model performance. To address this, we propose SliM-LLM, a salience-driven mixed-precision quantization framework that allocates bit-widths group-wise with high accuracy. |
Wei Huang; Haotong Qin; Yangdong Liu; Yawei Li; Qinshuo Liu; Xianglong Liu; Luca Benini; Michele Magno; Shiming Zhang; XIAOJUAN QI; | code |
| 94 | Trajectory World Models for Heterogeneous Environments Highlight: In this work, we explore pre-training world models for heterogeneous environments by addressing key transfer barriers in both data diversity and model flexibility. |
Shaofeng Yin; Jialong Wu; Siqiao Huang; Xingjian Su; Xu He; Jianye HAO; Mingsheng Long; | code |
| 95 | Simultaneous Multi-Robot Motion Planning with Projected Diffusion Models Highlight: To address these challenges, this work proposes **S**imultaneous **M**RMP **D**iffusion (SMD), a novel approach integrating constrained optimization into the diffusion sampling process to produce collision-free, kinematically feasible trajectories. Additionally, the paper introduces a comprehensive MRMP benchmark to evaluate trajectory planning algorithms across scenarios with varying robot densities, obstacle complexities, and motion constraints. |
Jinhao Liang; Jacob K Christopher; Sven Koenig; Ferdinando Fioretto; | code |
| 96 | Inductive Gradient Adjustment for Spectral Bias in Implicit Neural Representations Highlight: In this paper, we delve into the linear dynamics model of MLPs and theoretically identify the empirical Neural Tangent Kernel (eNTK) matrix as a reliable link between spectral bias and training dynamics. |
Kexuan Shi; Hai Chen; Leheng Zhang; Shuhang Gu; | code |
| 97 | Private Federated Learning Using Preference-Optimized Synthetic Data Highlight: Our key insight is that the private client feedback collected by prior DP synthetic data methods (Hou et al., 2024; Xie et al., 2024) can be viewed as a preference ranking. |
Charlie Hou; Mei-Yu Wang; Yige Zhu; Daniel Lazar; Giulia Fanti; | code |
| 98 | Memorization Sinks: Isolating Memorization During LLM Training Highlight: In this work, we put forward MemSinks, a new paradigm that promotes isolation of memorization by design. |
Gaurav Rohit Ghosal; Pratyush Maini; Aditi Raghunathan; | code |
| 99 | MATH-Perturb: Benchmarking LLMs’ Math Reasoning Abilities Against Hard Perturbations Highlight: Large language models have demonstrated impressive performance on challenging mathematical reasoning tasks, which has triggered the discussion of whether the performance is achieved by true reasoning capability or memorization. To investigate this question, prior work has constructed mathematical benchmarks when questions undergo simple perturbations, that is, modifications that still preserve the underlying reasoning patterns of the solutions. |
Kaixuan Huang; Jiacheng Guo; Zihao Li; Xiang Ji; Jiawei Ge; Wenzhe Li; Yingqing Guo; Tianle Cai; Hui Yuan; Runzhe Wang; Yue Wu; Ming Yin; Shange Tang; Yangsibo Huang; Chi Jin; Xinyun Chen; Chiyuan Zhang; Mengdi Wang; | code |
| 100 | Putnam-AXIOM: A Functional & Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs Highlight: We introduce Putnam-AXIOM, a benchmark of 522 university-level competition problems drawn from the prestigious William Lowell Putnam Mathematical Competition, and Putnam-AXIOM Variation, an unseen companion set of 100 functional variants generated by programmatically perturbing variables and constants. |
Aryan Gulati; Brando Miranda; Eric Chen; Emily Xia; Kai Fronsdal; Bruno de Moraes Dumont; Sanmi Koyejo; | code |
| 101 | MMInference: Accelerating Pre-filling for Long-Context Visual Language Models Via Modality-Aware Permutation Sparse Attention Highlight: However, the quadratic attention complexity during the pre-filling phase remains a significant obstacle to real-world deployment. To overcome this limitation, we introduce MMInference (Multimodality Million tokens Inference), a dynamic sparse attention method that accelerates the prefilling stage for long-context multi-modal inputs. |
Yucheng Li; Huiqiang Jiang; Chengruidong Zhang; Qianhui Wu; Xufang Luo; Surin Ahn; Amir H. Abdi; Dongsheng Li; Jianfeng Gao; Yuqing Yang; Lili Qiu; | code |
| 102 | Can Classic GNNs Be Strong Baselines for Graph-level Tasks? Simple Architectures Meet Excellence Highlight: In this study, we explore the untapped potential of GNNs through an enhanced framework, GNN+, which integrates six widely used techniques: edge feature integration, normalization, dropout, residual connections, feed-forward networks, and positional encoding, to effectively tackle graph-level tasks. |
Yuankai Luo; Lei Shi; Xiao-Ming Wu; | code |
| 103 | Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples Highlight: For example, supervised fine-tuning improves reasoning quality but requires vast labeled data, while reward-maximizing reinforcement learning finds top-reward solutions but neglects solution diversity. To fill this gap, we propose Flow of Reasoning (FoR), an efficient diversity-seeking LLM finetuning method aimed at improving reasoning quality and diversity with minimal data. |
Fangxu Yu; Lai Jiang; Haoqiang Kang; Shibo Hao; Lianhui Qin; | code |
| 104 | FG-CLIP: Fine-Grained Visual and Textual Alignment Highlight: To address this, we propose Fine-Grained CLIP (FG-CLIP), which enhances fine-grained understanding through three key innovations. We construct a comprehensive dataset, termed FineHARD, by integrating high-quality region-specific annotations with challenging fine-grained negative samples. |
Chunyu Xie; Bin Wang; Fanjing Kong; Jincheng Li; Dawei Liang; Gengshen Zhang; Dawei Leng; Yuhui Yin; | code |
| 105 | EARTH: Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph Highlight: However, existing deep-learning methods often overlook the dynamic nature of epidemics and fail to account for the specific mechanisms of disease transmission. In response to these challenges, we introduce an innovative end-to-end framework called Epidemiology-Aware Neural ODE with Continuous Disease Transmission Graph (EARTH) in this paper. |
Guancheng Wan; Zewen Liu; Xiaojun Shan; Max SY Lau; B. Aditya Prakash; Wei Jin; | code |
| 106 | How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence Highlight: Measuring dataset contamination thus becomes essential to ensure that performance evaluations genuinely reflect a model’s ability to generalize to unseen data, rather than relying on memorized examples. To address this problem, we propose Kernel Divergence Score (KDS), a novel method that evaluates dataset contamination by computing the divergence between the kernel similarity matrix of sample embeddings, before and after fine-tuning on the benchmark dataset. |
Hyeong Kyu Choi; Maxim Khanov; Hongxin Wei; Yixuan Li; | code |
| 107 | Componential Prompt-Knowledge Alignment for Domain Incremental Learning Highlight: This arises from the random positioning of knowledge components within prompts, where irrelevant component fusion introduces interference. To address this, we propose Componential Prompt-Knowledge Alignment (KA-Prompt), a novel prompt-based DIL method that introduces component-aware prompt-knowledge alignment during training, significantly improving both the learning and inference capacity of the model. |
Kunlun Xu; Xu Zou; Gang Hua; Jiahuan Zhou; | code |
| 108 | GradPS: Resolving Futile Neurons in Parameter Sharing Network for Multi-Agent Reinforcement Learning Highlight: We find that futile neurons, whose updates are canceled out by gradient conflicts among agents, lead to poor learning efficiency and diversity. To address this deficiency, we propose GradPS, a gradient-based PS method. |
Haoyuan Qin; Zhengzhu Liu; Chenxing Lin; Chennan Ma; Songzhu Mei; Siqi Shen; Cheng Wang; | code |
| 109 | Ultra-Resolution Adaptation with Ease Highlight: However, training models for high-resolution image generation remains challenging, particularly when training data and computational resources are limited. In this paper, we explore this practical problem from two key perspectives: data and parameter efficiency, and propose a set of key guidelines for ultra-resolution adaptation termed URAE. |
Ruonan Yu; Songhua Liu; Zhenxiong Tan; Xinchao Wang; | code |
| 110 | PARM: Multi-Objective Test-Time Alignment Via Preference-Aware Autoregressive Reward Model Highlight: Recently, GenARM (Xu et al., 2025) first independently trains Autoregressive Reward Models (ARMs) for each preference dimension without awareness of each other, then combines their outputs based on user-specific preference vectors during inference to achieve multi-objective test-time alignment, leading to two key limitations: the need for *multiple* ARMs increases the inference cost, and the *separate* training of ARMs causes the misalignment between the guided generation and the user preferences. To address these issues, we propose Preference-aware ARM (PARM), a *single* unified ARM trained across *all* preference dimensions. |
Baijiong Lin; Weisen Jiang; Yuancheng Xu; Hao Chen; Ying-Cong Chen; | code |
| 111 | FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making Highlight: In this work, we propose FOUNDER, a framework that integrates the generalizable knowledge embedded in FMs with the dynamic modeling capabilities of WMs to enable open-ended task solving in embodied environments in a reward-free manner. |
Yucen Wang; Rui Yu; Shenghua Wan; Le Gan; De-Chuan Zhan; | code |
| 112 | A Mixture-Based Framework for Guiding Diffusion Models Highlight: This work proposes a novel mixture approximation of these intermediate distributions. Since direct gradient-based sampling of these mixtures is infeasible due to intractable terms, we propose a practical method based on Gibbs sampling. |
Yazid Janati; Badr MOUFAD; Mehdi Abou El Qassime; Alain Oliviero Durmus; Eric Moulines; Jimmy Olsson; | code |
| 113 | CPCF: A Cross-Prompt Contrastive Framework for Referring Multimodal Large Language Models Highlight: However, these models often suffer from suboptimal performance due to incorrect responses tailored to misleading areas adjacent to or similar to the target region. This work introduces CPCF, a novel framework to address this issue and achieve superior results. |
Lanyun Zhu; Deyi Ji; Tianrun Chen; Haiyang Wu; De Wen Soh; Jun Liu; | code |
| 114 | Towards Graph Foundation Models: Learning Generalities Across Graphs Via Task-Trees Highlight: In contrast, discovering such generalities in graph-structured data, especially across heterogeneous graph tasks, remains an open challenge. To address this, we propose a novel approach to cross-task generalization in graphs via task-trees, which serve as unified learning instances aligning node-, edge-, and graph-level tasks. |
Zehong Wang; Zheyuan Zhang; Tianyi Ma; Nitesh V Chawla; Chuxu Zhang; Yanfang Ye; | code |
| 115 | Beyond Message Passing: Neural Graph Pattern Machine Highlight: In this paper, we introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. |
Zehong Wang; Zheyuan Zhang; Tianyi Ma; Nitesh V Chawla; Chuxu Zhang; Yanfang Ye; | code |
| 116 | MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems Highlight: In this paper, we simplify the process of building an MAS by reframing it as a generative language task, where the input is a user query and the output is a corresponding MAS. |
Rui Ye; Shuo Tang; Rui Ge; Yaxin Du; Zhenfei Yin; Siheng Chen; Jing Shao; | code |
| 117 | Probabilistic Group Mask Guided Discrete Optimization for Incremental Learning Highlight: However, existing approaches often disregard parameter dependencies, resulting in an over-reliance on newly allocated parameters. To address this issue, we propose Probabilistic Group Mask selection (PGM), a group-wise approach that captures parameter dependencies by exploring candidate masks within each group. |
Fengqiang Wan; Yang Yang; | code |
| 118 | CurvGAD: Leveraging Curvature for Enhanced Graph Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: CurvGAD introduces two parallel pipelines for enhanced anomaly interpretability: (1) Curvature-equivariant geometry reconstruction, which focuses exclusively on reconstructing the edge curvatures using a mixed-curvature, Riemannian encoder and Gaussian kernel-based decoder; and (2) Curvature-invariant structure and attribute reconstruction, which decouples structural and attribute anomalies from geometric irregularities by regularizing graph curvature under discrete Ollivier-Ricci flow, thereby isolating the non-geometric anomalies. |
Karish Grover; Geoffrey J. Gordon; Christos Faloutsos; | code |
| 119 | SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce **SyncMind**, a framework that systematically defines the *out-of-sync* problem faced by large language model (LLM) agents in collaborative software engineering (CSE). |
Xuehang Guo; Xingyao Wang; Yangyi Chen; Sha Li; Chi Han; Manling Li; Heng Ji; | code |
| 120 | Compositional Condition Question Answering in Tabular Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these, we introduce a new Compositional Condition Tabular Understanding method, called **CoCoTab**. |
Jun-Peng Jiang; Tao Zhou; De-Chuan Zhan; Han-Jia Ye; | code |
| 121 | Mixture of Lookup Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Their large parameter size still limits deployment, and offloading, which loads experts into VRAM only when needed, significantly increases inference latency. To address this, we propose Mixture of Lookup Experts (MoLE), a new MoE architecture that is efficient in both communication and VRAM usage. |
Shibo Jie; Yehui Tang; Kai Han; Yitong Li; Duyu Tang; Zhi-Hong Deng; Yunhe Wang; | code |
| 122 | Identifying and Understanding Cross-Class Features in Adversarial Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel perspective on studying AT through the lens of class-wise feature attribution. |
Zeming Wei; Steven Y. Guo; Yisen Wang; | code |
| 123 | QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While post-training compression methods are very popular, the question of obtaining even more accurate compressed models by *directly training* over such representations, i.e., *Quantization-Aware Training (QAT)*, is still open: for example, a recent study put the optimal bit-width at which models can be trained using QAT, while staying accuracy-competitive with standard FP16/BF16 precision, at 8-bit weights and activations. We advance this state-of-the-art via a new method called QuEST, for which we demonstrate optimality at 4 bits and stable convergence with weights and activations as low as 1 bit. |
Andrei Panferov; Jiale Chen; Soroush Tabesh; Mahdi Nikdan; Dan Alistarh; | code |
| 124 | Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that VDMs inherently produce visual representations that encompass both current static information and predicted future dynamics, thereby providing valuable guidance for robot action learning. Based on this hypothesis, we propose the Video Prediction Policy (VPP), which learns an implicit inverse dynamics model conditioned on predicted future representations inside VDMs. |
Yucheng Hu; Yanjiang Guo; Pengchao Wang; Xiaoyu Chen; Yen-Jen Wang; Jianke Zhang; Koushil Sreenath; Chaochao Lu; Jianyu Chen; | code |
| 125 | PertEval-scFM: Benchmarking Single-Cell Foundation Models for Perturbation Effect Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present PertEval-scFM, a standardized framework designed to evaluate models for perturbation effect prediction. |
Aaron Wenteler; Martina Occhetta; Nikhil Branson; Victor Curean; Magdalena Huebner; William Dee; William Connell; Siu Pui Chung; Alex Hawkins-Hooker; Yasha Ektefaie; César Miguel Valdez Córdova; Amaya Gallagher-Syed; | code |
| 126 | Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by relative representation similarity measures, we introduce Inference-Time Decomposition of Activation models (ITDAs). |
Patrick Leask; Neel Nanda; Noura Al Moubayed; | code |
| 127 | OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce OWLS, an open-access, reproducible suite of multilingual speech recognition and translation models spanning 0.25B to 18B parameters, with the 18B version being the largest speech model, to the best of our knowledge. |
William Chen; Jinchuan Tian; Yifan Peng; Brian Yan; Chao-Han Huck Yang; Shinji Watanabe; | code |
| 128 | DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Neurosymbolic learning enables the integration of symbolic reasoning with deep learning but faces significant challenges in scaling to complex symbolic programs, large datasets, or both. We introduce DOLPHIN, a framework that tackles these challenges by supporting neurosymbolic programs in Python, executing complex symbolic reasoning on the CPU while vectorizing probabilistic computations and gradient propagation on the GPU. |
Aaditya Naik; Jason Liu; Claire Wang; Amish Sethi; Saikat Dutta; Mayur Naik; Eric Wong; | code |
| 129 | Adjoint Sampling: Highly Scalable Diffusion Samplers Via Adjoint Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Adjoint Sampling, a highly scalable and efficient algorithm for learning diffusion processes that sample from unnormalized densities, or energy functions. |
Aaron J Havens; Benjamin Kurt Miller; Bing Yan; Carles Domingo-Enrich; Anuroop Sriram; Daniel S. Levine; Brandon M Wood; Bin Hu; Brandon Amos; Brian Karrer; Xiang Fu; Guan-Horng Liu; Ricky T. Q. Chen; | code |
| 130 | ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Quantizing all weight, activation and key-value (KV) cache tensors to 4-bit without significantly degrading generalizability is challenging, due to the high quantization error caused by extreme outliers in activations. To tackle this problem, we propose ResQ, a PTQ method that pushes the state-of-the-art further. |
Utkarsh Saxena; Sayeh Sharify; Kaushik Roy; Xin Wang; | code |
| 131 | Decomposition of Graphic Design with Unified Multimodal Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This task presents two core challenges: (1) predicting the attribute information (metadata) of each layer, and (2) recovering the occluded regions within overlapping layers to enable high-fidelity image reconstruction. To address this, we present the Decompose Layer Model (DeaM), a large unified multimodal model that integrates a conjoined visual encoder, a language model, and a condition-aware RGB-A decoder. |
Hui Nie; Zhao Zhang; Yutao Cheng; Maoke Yang; Gonglei Shi; Qingsong Xie; Jie Shao; Xinglong Wu; | code |
| 132 | DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One formulation of the structure elucidation task is the conditional *de novo* generation of molecular structure given a mass spectrum. Toward a more accurate and efficient scientific discovery pipeline for small molecules, we present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task. |
Montgomery Bohde; Mrunali Manjrekar; Runzhong Wang; Shuiwang Ji; Connor W. Coley; | code |
| 133 | Beyond The Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce rotation symmetry, a novel form of parameter space symmetry for transformers that generalizes permutation symmetry by rotating parameter matrices in self-attention layers. |
Binchi Zhang; Zaiyi Zheng; Zhengzhang Chen; Jundong Li; | code |
| 134 | SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To maximize sparsity while retaining essential information, we introduce a rank-based strategy to adaptively determine the sparsification ratio for each layer, alongside a token recycling method that compresses pruned tokens into more compact representations. |
Yuan Zhang; Chun-Kai Fan; Junpeng Ma; Wenzhao Zheng; Tao Huang; Kuan Cheng; Denis A Gudovskiy; Tomoyuki Okuno; Yohei Nakata; Kurt Keutzer; Shanghang Zhang; | code |
| 135 | Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we present a novel approach to generate large-scale, high-resolution canopy height maps over time. |
Jan Pauls; Max Zimmer; Berkant Turan; Sassan Saatchi; Philippe CIAIS; Sebastian Pokutta; Fabian Gieseke; | code |
| 136 | RUN: Reversible Unfolding Network for Concealed Object Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods often employ reversible strategies to concentrate on uncertain regions but only focus on the mask level, overlooking the value of the RGB domain. To address this, we propose a Reversible Unfolding Network (RUN) in this paper. |
Chunming He; Rihan Zhang; Fengyang Xiao; Chengyu Fang; Longxiang Tang; Yulun Zhang; Linghe Kong; Deng-Ping Fan; Kai Li; Sina Farsiu; | code |
| 137 | Measuring Diversity in Synthetic Datasets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce DCScore, a novel method for measuring synthetic dataset diversity from a classification perspective. |
Yuchang Zhu; Huizhe Zhang; Bingzhe Wu; Jintang Li; Zibin Zheng; Peilin Zhao; Liang Chen; Yatao Bian; | code |
| 138 | CodeSync: Synchronizing Large Language Models with Dynamic Code Evolution at Scale Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce CodeSync, a data engine to identify outdated code patterns and collect real-time code knowledge updates from Python third-party libraries. |
Chenlong Wang; Zhaoyang Chu; Zhengxiang Cheng; Xuyi Yang; Kaiyue Qiu; Yao Wan; Zhou Zhao; Xuanhua Shi; Hai Jin; Dongping Chen; | code |
| 139 | GuardAgent: Safeguard LLM Agents Via Knowledge-Enabled Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose GuardAgent, the first guardrail agent to protect target agents by dynamically checking whether their actions satisfy given safety guard requests.In addition, we propose two novel benchmarks: EICU-AC benchmark to assess the access control for healthcare agents and Mind2Web-SC benchmark to evaluate the safety policies for web agents. |
Zhen Xiang; Linzhi Zheng; Yanjie Li; Junyuan Hong; Qinbin Li; Han Xie; Jiawei Zhang; Zidi Xiong; Chulin Xie; Carl Yang; Dawn Song; Bo Li; | code |
| 140 | A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the mixture of LoRAs (MoE-LoRA) still exhibits low robustness during tuning and inference. Inspired by the Riemannian Preconditioners which train LoRA as a sub-space projector, we propose a new training strategy for MoE-LoRA, to stabilize and boost its feature learning by gate-rescaled multi-space projections. |
Mengyang Sun; Yihao Wang; Tao Feng; Dan Zhang; Yifan Zhu; Jie Tang; | code |
| 141 | Meta-Black-Box-Optimization Through Offline Q-function Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the online learning paradigms in existing works make the efficiency of MetaBBO problematic. To address this, we propose an offline learning-based MetaBBO framework in this paper, termed Q-Mamba, to attain both effectiveness and efficiency in MetaBBO. |
Zeyuan Ma; Zhiguang Cao; Zhou Jiang; Hongshu Guo; Yue-Jiao Gong; | code |
| 142 | Causality-Aware Contrastive Learning for Robust Multivariate Time-Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes Causality-Aware contrastive learning for RObust multivariate Time-Series (CAROTS), a novel MTSAD pipeline that incorporates the notion of causality into contrastive learning. |
HyunGi Kim; Jisoo Mok; Dongjun Lee; Jaihyun Lew; Sungjae Kim; Sungroh Yoon; | code |
| 143 | CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present CoSER, a collection of a high-quality dataset, open models, and an evaluation protocol towards effective RPLAs of established characters. |
Xintao Wang; Heng Wang; Yifei Zhang; Xinfeng Yuan; Rui Xu; Jen-tse Huang; Siyu Yuan; Haoran Guo; Jiangjie Chen; Shuchang Zhou; Wei Wang; Yanghua Xiao; | code |
| 144 | CodeSteer: Symbolic-Augmented Language Models Via Code/Text Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CodeSteer, an effective method for guiding LLM code/text generation. |
Yongchao Chen; Yilun Hao; Yueying Liu; Yang Zhang; Chuchu Fan; | code |
| 145 | Closed-Loop Long-Horizon Robotic Planning Via Equilibrium Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite recent advances in language model agents, they remain prone to planning errors and limited in their ability to plan ahead. To address these limitations in robotic planning, we advocate a self-refining scheme that iteratively refines a draft plan until an equilibrium is reached. |
Jinghan Li; Zhicheng Sun; Yadong MU; | code |
| 146 | Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose **Ca2-VDM**, an efficient autoregressive VDM with **Ca**usal generation and **Ca**che sharing. |
Kaifeng Gao; Jiaxin Shi; Hanwang Zhang; Chunping Wang; Jun Xiao; Long Chen; | code |
| 147 | MedRAX: Medical Reasoning Agent for Chest X-ray Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present MedRAX, the first versatile AI agent that seamlessly integrates state-of-the-art CXR analysis tools and multimodal large language models into a unified framework. |
Adibvafa Fallahpour; Jun Ma; Alif Munim; Hongwei Lyu; BO WANG; | code |
| 148 | Federated Incomplete Multi-view Clustering with Globally Fused Graph Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, the missing data problem in the federated multi-view clustering task is less explored. To address these problems, we propose a novel Federated Incomplete Multi-view Clustering method with globally Fused Graph guidance (FIMCFG). |
Guoqing Chao; Zhenghao Zhang; Lei Meng; Jie Wen; Dianhui Chu; | code |
| 149 | Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, noise is pervasive in real-world scenarios, leading to a significant degradation in performance. To tackle this problem, we propose a novel multi-view clustering framework for the automatic identification and rectification of noisy data, termed AIRMVC. |
Xihong Yang; Siwei Wang; Fangdi Wang; Jiaqi Jin; Suyuan Liu; Yue Liu; En Zhu; Xinwang Liu; Yueming Jin; | code |
| 150 | MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, the weight-wise trust ratio in LAMB is error-prone as it overlooks relationships of weight values within rows or columns. Building on these observations, we propose a novel optimizer, MERIT, which leverages the max-norm to calculate the trust ratio to constrain the max attention logit more effectively. |
Yang Luo; Zangwei Zheng; Ziheng Qin; Zirui Zhu; Yong Liu; Yang You; | code |
| 151 | From Passive to Active Reasoning: Can Large Language Models Ask The Right Questions Under Incomplete Information? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By contrast, active reasoning—where an LLM must interact with external systems to acquire missing evidence or data—has received little systematic attention. To address this shortfall, we present AR-Bench, a novel benchmark designed explicitly to evaluate an LLM’s active reasoning skills. |
Zhanke Zhou; Xiao Feng; Zhaocheng Zhu; Jiangchao Yao; Sanmi Koyejo; Bo Han; | code |
| 152 | Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study this alignment problem in text-to-image (T2I) generation and propose a prototype for proactive T2I agents equipped with an interface to (1) actively ask clarification questions when uncertain, and (2) present their uncertainty about user intent as an understandable and editable belief graph. We build simple prototypes for such agents and propose a new scalable and automated evaluation approach using two agents, one with a ground truth intent (an image) while the other tries to ask as few questions as possible to align with the ground truth. |
Meera Hahn; Wenjun Zeng; Nithish Kannen; Rich Galt; Kartikeya Badola; Been Kim; Zi Wang; | code |
| 153 | Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in SEMG Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit the problem from a short-term enhancement perspective to improve precision and robustness against various common noisy scenarios, with learnable denoising that uses intrinsic sEMG pattern information and sliding-window attention. |
Weiyu Guo; Ziyue Qiao; Ying Sun; Yijie Xu; Hui Xiong; | code |
| 154 | OmniAudio: Generating Spatial Audio from 360-Degree Video Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To generate spatial audio from 360-degree video, we propose a novel framework **OmniAudio**, which leverages self-supervised pre-training using both spatial audio data (in FOA format) and large-scale non-spatial data. |
Huadai Liu; Tianyi Luo; Kaicheng Luo; Qikai Jiang; Peiwen Sun; Jialei Wang; Rongjie Huang; Qian Chen; Wen Wang; Xiangtai Li; ShiLiang Zhang; Zhijie Yan; Zhou Zhao; Wei Xue; | code |
| 155 | Compositional Scene Understanding Through Inverse Generative Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate scene understanding as an inverse generative modeling problem, where we seek to find conditional parameters of a visual generative model to best fit a given natural image. |
Yanbo Wang; Justin Dauwels; Yilun Du; | code |
| 156 | Scaling Video-Language Models to 10K Frames Via Hierarchical Differential Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce differential distillation, a principled approach that systematically preserves task-relevant information while suppressing redundancy. |
Chuanqi Cheng; Jian Guan; Wei Wu; Rui Yan; | code |
| 157 | $S^2$FGL: Spatial Spectral Federated Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the challenges, we propose a global knowledge repository to mitigate label signal disruption and a frequency alignment strategy to address spectral client drift. |
Zihan Tan; Suyuan Huang; Guancheng Wan; Wenke Huang; He Li; Mang Ye; | code |
| 158 | Normalizing Flows Are Capable Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we demonstrate that NFs are more powerful than previously believed. |
Shuangfei Zhai; Ruixiang ZHANG; Preetum Nakkiran; David Berthelot; Jiatao Gu; Huangjie Zheng; Tianrong Chen; Miguel Ángel Bautista; Navdeep Jaitly; Joshua M. Susskind; | code |
| 159 | The Devil Is in The Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we find that MM-RMs trained on existing datasets often struggle to generalize to out-of-distribution data due to their reliance on unimodal spurious correlations, primarily text-only shortcuts within the training distribution, which prevents them from leveraging true multimodal reward functions. To address this, we introduce a Shortcut-aware MM-RM learning algorithm that mitigates this issue by dynamically reweighting training samples, shifting the distribution toward better multimodal understanding, and reducing dependence on unimodal spurious correlations. |
Zichao Li; Xueru Wen; Jie Lou; Yuqiu Ji; Yaojie Lu; Xianpei Han; Debing Zhang; Le Sun; | code |
| 160 | Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: 2) Visual Feature Diversity: The diversity of visual features makes it challenging to leverage naive image features directly for image-text alignment in downstream tasks. In this work, we propose Federated Disentangled Tuning with Textual Prior Decoupling and Visual Dynamic Adaptation (FedDDA) to overcome the above limitations. |
Yihao Yang; Wenke Huang; Guancheng Wan; Bin Yang; Mang Ye; | code |
| 161 | Flexible, Efficient, and Stable Adversarial Attacks on Machine Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing adversarial MU attacks suffer from three key limitations: inflexibility due to pre-defined attack targets, inefficiency in handling multiple attack requests, and instability caused by non-convex loss functions. To address these challenges, we propose a Flexible, Efficient, and Stable Attack (DDPA). |
Zihan Zhou; Yang Zhou; Zijie Zhang; Lingjuan Lyu; Da Yan; Ruoming Jin; Dejing Dou; | code |
| 162 | Efficient Robotic Policy Learning Via Latent Space Backward Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a critical question: Can robotic planning be both efficient and accurate enough for real-time control in long-horizon, multi-stage tasks? To address this, we propose a **B**ackward **P**lanning scheme in **L**atent space (**LBP**), which begins by grounding the task into final latent goals, followed by recursively predicting intermediate subgoals closer to the current state. |
Dongxiu Liu; Haoyi Niu; Zhihao Wang; Jinliang Zheng; Yinan Zheng; Zhonghong Ou; Jianming HU; Jianxiong Li; Xianyuan Zhan; | code |
| 163 | DMOSpeech: Direct Metric Optimization Via Distilled Diffusion Model in Zero-Shot Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, existing TTS approaches are limited by non-differentiable components or iterative sampling that prevent true end-to-end optimization with perceptual metrics. We introduce DMOSpeech, a distilled diffusion-based TTS model that uniquely achieves both faster inference and superior performance compared to its teacher model. |
Yinghao Aaron Li; Rithesh Kumar; Zeyu Jin; | code |
| 164 | LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose LangDAug, a novel **Lang**evin **D**ata **Aug**mentation for multi-source domain generalization in 2D medical image segmentation. |
Piyush Tiwary; Kinjawl Bhattacharyya; Prathosh AP; | code |
| 165 | Info-Coevolution: An Efficient Framework for Data Model Coevolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Info-Coevolution, a novel framework that efficiently enables models and data to coevolve through online selective annotation with no bias. |
Ziheng Qin; Hailun Xu; Wei Chee Yew; Qi Jia; Yang Luo; Kanchan Sarkar; Danhui Guan; Kai Wang; Yang You; | code |
| 166 | Concept-Centric Token Interpretation for Vector-Quantized Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Concept-Oriented Token Explanation (CORTEX), a novel approach for interpreting VQGMs by identifying concept-specific token combinations. |
Tianze Yang; Yucheng Shi; Mengnan Du; Xuansheng Wu; Qiaoyu Tan; Jin Sun; Ninghao Liu; | code |
| 167 | AffectGPT: A New Dataset, Model, and Benchmark for Emotion Understanding with Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the current community suffers from a lack of large-scale datasets with intensive, descriptive emotion annotations, as well as a multimodal-centric framework to maximize the potential of MLLMs for emotion understanding. To address this, we establish a new benchmark for MLLM-based emotion understanding with a novel dataset (MER-Caption) and a new model (AffectGPT). |
Zheng Lian; Haoyu Chen; Lan Chen; Haiyang Sun; Licai Sun; Yong Ren; Zebang Cheng; Bin Liu; Rui Liu; Xiaojiang Peng; Jiangyan Yi; Jianhua Tao; | code |
| 168 | OV-MER: Towards Open-Vocabulary Multimodal Emotion Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paradigm shift aims to enable models to predict emotions beyond a fixed label space, accommodating a flexible set of categories to better reflect the nuanced spectrum of human emotions. To achieve this, we propose a novel paradigm: *Open-Vocabulary MER (OV-MER)*, which enables emotion prediction without being confined to predefined spaces. |
Zheng Lian; Haiyang Sun; Licai Sun; Haoyu Chen; Lan Chen; Hao Gu; Zhuofan Wen; Shun Chen; Zhang Siyuan; Hailiang Yao; Bin Liu; Rui Liu; Shan Liang; Ya Li; Jiangyan Yi; Jianhua Tao; | code |
| 169 | Towards Efficient Online Tuning of VLM Agents Via Counterfactual Soft Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel online fine-tuning method, Counterfactual Soft Reinforcement Learning (CoSo), better suited to the textual output space of VLM agents. |
Lang Feng; Weihao Tan; Zhiyi Lyu; Longtao Zheng; Haiyang Xu; Ming Yan; Fei Huang; Bo An; | code |
| 170 | Long-Short Alignment for Effective Long-Context Modeling in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a fresh perspective on length generalization, shifting the focus from the conventional emphasis on input features such as positional encodings or data structures to the output distribution of the model. |
Tianqi Du; Haotian Huang; Yifei Wang; Yisen Wang; | code |
| 171 | Simplifying DINO Via Coding Rate Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we posit that we can remove most such-motivated idiosyncrasies in the pre-training pipelines, and only need to add an explicit coding rate term in the loss function to avoid collapse of the representations. |
Ziyang Wu; Jingyuan Zhang; Druv Pai; XuDong Wang; Chandan Singh; Jianwei Yang; Jianfeng Gao; Yi Ma; | code |
| 172 | Pretraining Generative Flow Networks with Inexpensive Rewards for Molecular Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model leveraging individual atoms as building blocks to explore drug-like chemical space more comprehensively. |
Mohit Pandey; Gopeshh Subbaraj; Artem Cherkasov; Martin Ester; Emmanuel Bengio; | code |
| 173 | Super Deep Contrastive Information Bottleneck for Multi-modal Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although there is a wealth of research on MMC, due to the complexity of datasets, a major challenge remains in how to deeply explore the complex latent information and interdependencies between modalities. To address this issue, this paper proposes a method called super deep contrastive information bottleneck (SDCIB) for MMC, which aims to explore and utilize all types of latent information to the fullest extent. |
Zhengzheng Lou; Ke Zhang; Yucong Wu; Shizhe Hu; | code |
| 174 | Large Continual Instruction Assistant Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a general continual instruction tuning framework to address the challenge. |
Jingyang Qiao; zhizhong zhang; Xin Tan; Yanyun Qu; Shouhong Ding; Yuan Xie; | code |
| 175 | Discriminative Policy Optimization for Token-Level Reward Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the conflict between generative language modeling and reward modeling may introduce instability and lead to inaccurate credit assignments. To address this challenge, we revisit token-level reward assignment by decoupling reward modeling from language generation and derive a token-level reward model through the optimization of a discriminative policy, termed the Q-function Reward Model (Q-RM). |
Hongzhan Chen; Tao Yang; Shiping Gao; Ruijun Chen; Xiaojun Quan; Hongtao Tian; Ting Yao; | code |
| 176 | BAME: Block-Aware Mask Evolution for Efficient N:M Sparse Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce BAME, a method that maintains consistent sparsity throughout the N:M sparse training process. |
Chenyi yang; Wenjie Nie; Yuxin Zhang; Yuhang Wu; Xiawu Zheng; GUANNAN JIANG; Rongrong Ji; | code |
| 177 | Griffin: Towards A Graph-Centric Relational Database Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Griffin, the first attempt at a foundation model designed specifically for Relational Databases (RDBs). |
Yanbo Wang; Xiyuan Wang; Quan Gan; Minjie Wang; Qibin Yang; David Wipf; Muhan Zhang; | code |
| 178 | LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs – No Silver Bullet for LC or RAG Routing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LaRA, a novel benchmark with 2326 test cases across four QA tasks and three long context types, for rigorous evaluation. |
Kuan Li; Liwen Zhang; Yong Jiang; Pengjun Xie; Fei Huang; Shuai Wang; Minhao Cheng; | code |
| 179 | Revolve: Optimizing AI Systems By Tracking Response Evolution in Textual Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce **REVOLVE**, an optimization method that tracks how **R**esponses **EVOLVE** across iterations in LLM systems. |
Peiyan Zhang; Haibo Jin; Leyang Hu; Xinnuo Li; Liying Kang; Man Luo; Yangqiu Song; Haohan Wang; | code |
| 180 | Textural or Textual: How Vision-Language Models Read Text in Images Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To disentangle orthographic form from meaning, we introduce the ToT dataset, which includes controlled word pairs that either share semantics with distinct appearances (synonyms) or share appearance with differing semantics (paronyms). |
Hanzhang Wang; Qingyuan Ma; | code |
| 181 | Do We Really Need Message Passing in Brain Network Modeling? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Surprisingly, this paper observes significant performance and efficiency gains from the Hadamard product over the matrix product (the matrix form of message passing) when processing brain networks. |
Liang Yang; Yuwei Liu; Jiaming Zhuo; Di Jin; Chuan Wang; Zhen Wang; Xiaochun Cao; | code |
| 182 | Wasserstein Flow Matching: Generative Modeling Over Families of Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Wasserstein flow matching (WFM), which lifts flow matching onto families of distributions using the Wasserstein geometry. |
Doron Haviv; Aram-Alexandre Pooladian; Dana Pe’er; Brandon Amos; | code |
| 183 | Retrieval-Augmented Perception: High-resolution Image Perception Meets Visual RAG Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we propose Retrieval-Augmented Perception (RAP), a training-free framework that retrieves and fuses relevant image crops while preserving spatial context using the proposed Spatial-Awareness Layout. |
Wenbin Wang; Yongcheng Jing; Liang Ding; Yingjie Wang; Li Shen; Yong Luo; Bo Du; Dacheng Tao; | code |
| 184 | BSLoRA: Enhancing The Parameter Efficiency of LoRA with Intra-Layer and Inter-Layer Sharing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing methods reduce stored parameters via parameter sharing, they fail to capture both local and global information simultaneously. To address this issue, we propose the Bi-Share LoRA (BSLoRA), which extends local LoRA with intra-LoRA and inter-LoRA parameter sharing to better capture local and global information. |
Yuhua Zhou; Ruifeng Li; Changhai Zhou; Fei Yang; Aimin Pan; | code |
| 185 | Preserving AUC Fairness in Learning with Noisy Protected Groups Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these studies often overlook the impact of noisy protected groups, leading to fairness violations in practice. To address this, we propose the first robust AUC fairness approach under noisy protected groups with fairness theoretical guarantees using distributionally robust optimization. |
Mingyang Wu; Li Lin; Wenbin Zhang; Xin Wang; Zhenhuan Yang; Shu Hu; | code |
| 186 | Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel ML algorithm that provably makes use of the full information from both value and demand queries, and we show via experiments that combining both query types results in significantly better learning performance in practice. |
Ermis Soumalias; Jakob Heiss; Jakob Weissteiner; Sven Seuken; | code |
| 187 | MF-LAL: Drug Compound Generation Using Multi-Fidelity Latent Space Active Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: More accurate methods for activity prediction exist, such as molecular dynamics based binding free energy calculations, but they are too computationally expensive to use in a generative model. To address this challenge, we propose Multi-Fidelity Latent space Active Learning (MF-LAL), a generative modeling framework that integrates a set of oracles with varying cost-accuracy tradeoffs. |
Peter Eckmann; Dongxia Wu; Germano Heinzelmann; Michael K Gilson; Rose Yu; | code |
| 188 | Graph World Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While multiple graph foundation models have been proposed, they focus on graph learning tasks and cannot extend to diverse multi-modal data and interdisciplinary tasks. To address these challenges, we propose the Graph World Model (GWM), a world model that supports both unstructured and graph-structured states with multi-modal information and represents diverse tasks as actions. |
Tao Feng; Yexin Wu; Guanyu Lin; Jiaxuan You; | code |
| 189 | Piloting Structure-Based Drug Design Via Modality-Specific Optimal Schedule Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A major bottleneck lies in the twisted probability path of multi-modalities—continuous 3D positions and discrete 2D topologies—which jointly determine molecular geometries. By establishing the fact that noise schedules decide the Variational Lower Bound (VLB) for the twisted probability path, we propose VLB-Optimal Scheduling (VOS) strategy in this under-explored area, which optimizes VLB as a path integral for SBDD. |
Keyue Qiu; Yuxuan Song; Zhehuan Fan; Peidong Liu; Zhe Zhang; Mingyue Zheng; Hao Zhou; Wei-Ying Ma; | code |
| 190 | Empower Structure-Based Molecule Optimization with Gradient Guided Bayesian Flow Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel backward correction strategy that optimizes within a sliding window of the past histories, allowing for a seamless trade-off between explore-and-exploit during optimization. |
Keyue Qiu; Yuxuan Song; Jie Yu; Hongbo Ma; Ziyao Cao; Zhilong Zhang; Yushuai Wu; Mingyue Zheng; Hao Zhou; Wei-Ying Ma; | code |
| 191 | Faster and Stronger: When ANN-SNN Conversion Meets Parallel Spiking Calculation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel parallel conversion learning framework, which establishes a mathematical mapping relationship between each time-step of the parallel spiking neurons and the cumulative spike firing rate. |
Zecheng Hao; Qichao Ma; Kang Chen; Yi Zhang; Zhaofei Yu; Tiejun Huang; | code |
| 192 | Differentiable Quadratic Optimization For The Maximum Independent Set Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the non-convexity of the objective, we propose optimizing several initializations in parallel using momentum-based gradient descent, complemented by an efficient MIS checking criterion derived from our theory. |
Ismail Alkhouri; Cedric Le Denmat; Yingjie Li; Cunxi Yu; Jia Liu; Rongrong Wang; Alvaro Velasquez; | code |
| 193 | FrameBridge: Improving Image-to-Video Generation with Bridge Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Diffusion models have achieved remarkable progress on image-to-video (I2V) generation, while their noise-to-data generation process is inherently mismatched with this task, which may lead to suboptimal synthesis quality. |
Yuji Wang; Zehua Chen; Chen Xiaoyu; Yixiang Wei; Jun Zhu; Jianfei Chen; | code |
| 194 | Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TimeFuse, a framework for collective time-series forecasting with sample-level adaptive fusion of heterogeneous models. |
Zhining Liu; Ze Yang; Xiao Lin; Ruizhong Qiu; Tianxin Wei; Yada Zhu; Hendrik Hamann; Jingrui He; Hanghang Tong; | code |
| 195 | CUPS: Improving Human Pose-Shape Estimators with Conformalized Deep Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CUPS, a novel method for learning sequence-to-sequence 3D human shapes and poses from RGB videos with uncertainty quantification. |
Harry Zhang; Luca Carlone; | code |
| 196 | Learning Distribution-wise Control in Representation Space for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we extend this approach to the distribution level, enabling the model to learn not only pointwise transformations but also the surrounding regions of the concept subspace. |
Chunyuan Deng; Ruidi Chang; Hanjie Chen; | code |
| 197 | Learning Policy Committees for Effective Personalization in MDPs with Diverse Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: On the other hand, approaches that aim to tackle task diversity, such as using task embedding as policy context and task clustering, typically lack performance guarantees and require a large number of training tasks. To address these challenges, we propose a novel approach for learning a policy committee that includes at least one near-optimal policy with high probability for tasks encountered during execution. |
Luise Ge; Michael Lanier; Anindya Sarkar; Bengisu Guresti; Chongjie Zhang; Yevgeniy Vorobeychik; | code |
| 198 | Discovering Latent Causal Graphs from Spatiotemporal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present SPACY (SPAtiotemporal Causal discoverY), a novel framework based on variational inference, designed to model latent time series and their causal relationships from spatiotemporal data. |
Kun Wang; Sumanth Varambally; Duncan Watson-Parris; Yian Ma; Rose Yu; | code |
| 199 | Rethinking Point Cloud Data Augmentation: Topologically Consistent Deformation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SinPoint, a novel method designed to preserve the topological structure of the original point cloud through a homeomorphism. |
Jian Bi; Qianliang Wu; Xiang Li; Shuo Chen; Jianjun Qian; Lei Luo; Jian Yang; | code |
| 200 | Scaling Trends in Language Model Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As both attackers and defenders gain access to more compute, and as models become larger, what will be the effect on robustness? We argue that to answer this question requires a *scaling lens*, which we adopt in an extensive study of language model robustness across several classification tasks, model families, and adversarial attacks. |
Nikolaus H. R. Howe; Ian R. McKenzie; Oskar John Hollinsworth; Michał Zając; Tom Tseng; Aaron David Tucker; Pierre-Luc Bacon; Adam Gleave; | code |
| 201 | MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose MuseControlLite, a lightweight mechanism designed to fine-tune text-to-music generation models for precise conditioning using various time-varying musical attributes and reference audio signals. |
Fang-Duo Tsai; Shih-Lun Wu; Weijaw Lee; Sheng-Ping Yang; Bo-Rui Chen; Hao-Chung Cheng; Yi-Hsuan Yang; | code |
| 202 | TAROT: Targeted Data Selection Via Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TAROT, a targeted data selection framework grounded in Optimal Transport theory. |
Lan Feng; Fan Nie; Yuejiang Liu; Alexandre Alahi; | code |
| 203 | Rethinking Causal Ranking: A Balanced Perspective on Uplift Model Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we identify a fundamental limitation in existing evaluation metrics, such as the uplift and Qini curves, which fail to rank individuals with binary negative outcomes accurately. |
Minqin Zhu; Zexu Sun; Ruoxuan Xiong; Anpeng Wu; Baohong Li; Caizhi Tang; Jun Zhou; Fei Wu; Kun Kuang; | code |
| 204 | AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces AKRMap, a new DR technique designed to visualize cross-modal embedding metrics with enhanced accuracy by learning kernel regression of the metric landscape in the projection space. |
Yilin Ye; Junchao Huang; Xingchen Zeng; Jiazhi Xia; Wei Zeng; | code |
| 205 | Patch-wise Structural Loss for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing forecasting models rely heavily on point-wise loss functions like Mean Squared Error, which treat each time step independently and neglect the structural dependencies inherent in time series data, making it challenging to capture complex temporal patterns accurately. To address these challenges, we propose a novel **P**atch-wise **S**tructural (**PS**) loss, designed to enhance structural alignment by comparing time series at the patch level. |
Dilfira Kudrat; Zongxia Xie; Yanru Sun; Tianyu Jia; Qinghua Hu; | code |
| 206 | PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation Via Few-Shot Private Data and Generative APIs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In practice, the few-shot private data challenge is particularly prevalent in specialized domains like healthcare and industry. To address this challenge, we propose a novel API-assisted algorithm, Private Contrastive Evolution (PCEvolve), which iteratively mines inherent inter-class contrastive relationships in few-shot private data beyond individual data points and seamlessly integrates them into an adapted Exponential Mechanism (EM) to optimize DP’s utility in an evolution loop. |
Jianqing Zhang; Yang Liu; Jie Fu; Yang Hua; Tianyuan Zou; Jian Cao; Qiang Yang; | code |
| 207 | RBench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a graduate-level, multi-disciplinary, English-Chinese benchmark, dubbed Reasoning Bench (RBench), for assessing the reasoning capability of both language and multimodal models. |
Meng-Hao Guo; Jiajun Xu; Yi Zhang; Jiaxi Song; Haoyang Peng; Yi-Xuan Deng; Xinzhi Dong; Kiyohiro Nakayama; Zhengyang Geng; Chen Wang; Bolin Ni; Guo-Wei Yang; Yongming Rao; Houwen Peng; Han Hu; Gordon Wetzstein; Shi-min Hu; | code |
| 208 | RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Code auditing is the process of reviewing code with the aim of identifying bugs. |
Jinyao Guo; Chengpeng Wang; Xiangzhe Xu; Zian Su; Xiangyu Zhang; | code |
| 209 | Determining Layer-wise Sparsity for Large Language Models Through A Theoretical Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the challenge of determining the layer-wise sparsity rates of large language models (LLMs) through a theoretical perspective. |
Weizhong Huang; Yuxin Zhang; Xiawu Zheng; Fei Chao; Rongrong Ji; | code |
| 210 | What Makes In-context Learning Effective for Mathematical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, in this paper, we aim to theoretically analyze the impact of in-context demonstrations on LLMs’ reasoning performance. |
Jiayu Liu; Zhenya Huang; Chaokun Wang; Xunpeng Huang; ChengXiang Zhai; Enhong Chen; | code |
| 211 | ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, we introduce the Time-Series Question Answering (Time-Series QA) task and release EngineMT-QA, the first large-scale, multi-task, temporal-textual QA dataset designed to capture complex interactions between time-series signals and natural language. Building on this resource, we propose the Instruct Time Transformer (ITFormer), a novel framework that bridges time-series encoders with frozen large language models (LLMs). |
Yilin Wang; Peixuan Lei; Jie Song; Yuzhe Hao; Tao Chen; Yuxuan Zhang; Lei Jia; Yuanxiang Li; Zhongyu Wei; | code |
| 212 | Reinforcement Learning with Adaptive Reward Modeling for Expensive-to-Evaluate Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Training reinforcement learning (RL) agents requires extensive trial and error, which becomes prohibitively time-consuming in systems with costly reward evaluations. To address this challenge, we propose adaptive reward modeling (AdaReMo), which accelerates RL training by decomposing the complicated reward function into multiple localized fast reward models that approximate direct reward evaluation with neural networks. |
Hongyuan Su; Yu Zheng; Yuan Yuan; Yuming Lin; Depeng Jin; Yong Li; | code |
| 213 | One Leaf Reveals The Season: Occlusion-Based Contrastive Learning with Semantic-Aware Views for Efficient Visual Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a scalable and straightforward pre-training paradigm for efficient visual conceptual representation called occluded image contrastive learning (OCL). |
Xiaoyu Yang; Lijian Xu; Hongsheng Li; Shaoting Zhang; | code |
| 214 | Unbiased Recommender Learning from Implicit Feedback Via Weakly Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This assumption risks misclassifying potential positive samples within the unlabeled data, thereby undermining model performance. To address this issue, we introduce PURL, a model-agnostic framework that reframes implicit feedback recommendation as a weakly supervised learning task, eliminating the need for negative samples. |
Hao Wang; Zhichao Chen; Haotian Wang; Yanchao Tan; Licheng Pan; Tianqiao Liu; Xu Chen; Haoxuan Li; Zhouchen Lin; | code |
| 215 | SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose **SongGen**, a fully open-source, single-stage auto-regressive transformer designed for controllable song generation. To foster community engagement and future research, we will release our model weights, training code, annotated data, and preprocessing pipeline. |
Zihan Liu; Shuangrui Ding; Zhixiong Zhang; Xiaoyi Dong; Pan Zhang; Yuhang Zang; Yuhang Cao; Dahua Lin; Jiaqi Wang; | code |
| 216 | In-Context Learning and Occam’s Razor Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In particular, we show that the next-token prediction loss used to train in-context learners is directly equivalent to a data compression technique called prequential coding, and that minimizing this loss amounts to jointly minimizing both the training error and the complexity of the model that was implicitly learned from context. Our theory and the empirical experiments we use to support it not only provide a normative account of in-context learning, but also elucidate the shortcomings of current in-context learning methods, suggesting ways in which they can be improved. |
Eric Elmoznino; Tom Marty; Tejas Kasetty; Leo Gagnon; Sarthak Mittal; Mahan Fathi; Dhanya Sridhar; Guillaume Lajoie; | code |
| 217 | Towards A Formal Theory of Representational Compositionality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, while we have strong intuitions about what compositionality is, we lack satisfying formal definitions for it. Here, we propose such a definition called representational compositionality that is conceptually simple, quantitative, and grounded in algorithmic information theory. |
Eric Elmoznino; Thomas Jiralerspong; Yoshua Bengio; Guillaume Lajoie; | code |
| 218 | Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores the mechanisms of external slow-thinking from a theoretical standpoint. |
Zeyu Gan; Yun Liao; Yong Liu; | code |
| 219 | Synthetic Face Datasets Generation Via Latent Space Exploration from Brownian Identity Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a new method, inspired by the physical motion of soft particles subjected to stochastic Brownian forces, allowing us to sample identity distributions in a latent space under various constraints. With this in hand, we generate several face datasets and benchmark them by training face recognition models, showing that data generated with our method exceeds the performance of previous GAN-based datasets and achieves competitive performance with state-of-the-art diffusion-based synthetic datasets. |
David Geissbühler; Hatef Otroshi Shahreza; Sébastien Marcel; | code |
| 220 | STAIR: Improving Safety Alignment with Introspective Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose **STAIR**, a novel framework that integrates **S**afe**T**y **A**lignment with **I**ntrospective **R**easoning. |
Yichi Zhang; Siyuan Zhang; Yao Huang; Zeyu Xia; Zhengwei Fang; Xiao Yang; Ranjie Duan; Dong Yan; Yinpeng Dong; Jun Zhu; | code |
| 221 | Adapting While Learning: Grounding LLMs for Scientific Problems with Tool Usage Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by how human experts assess problem complexity before selecting solutions, we propose a novel two-component fine-tuning method, *Adapting while Learning* (AWL). |
Bohan Lyu; Yadi Cao; Duncan Watson-Parris; Leon Bergen; Taylor Berg-Kirkpatrick; Rose Yu; | code |
| 222 | CSTrack: Enhancing RGB-X Tracking Via Compact Spatiotemporal Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: More critically, intra-modality spatial modeling within each dispersed space incurs substantial computational overhead, limiting resources for inter-modality spatial modeling and temporal modeling. To address this, we propose a novel tracker, CSTrack, which focuses on modeling Compact Spatiotemporal features to achieve simple yet effective tracking. |
Xiaokun Feng; Dailing Zhang; Shiyu Hu; Xuchen Li; Meiqi Wu; Jing Zhang; Xiaotang Chen; Kaiqi Huang; | code |
| 223 | Towards Universal Offline Black-Box Optimization Via Learning Language Model Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we discuss multiple potential approaches, including an end-to-end learning framework in the form of next-token prediction, as well as prioritizing the learning of latent spaces with strong representational capabilities. To validate the effectiveness of these methods, we collect offline BBO tasks and data from open-source academic works for training. |
Rong-Xi Tan; Ming Chen; Ke Xue; Yao Wang; Yaoyuan Wang; Fu Sheng; Chao Qian; | code |
| 224 | Boost-and-Skip: A Simple Guidance-Free Diffusion for Minority Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, existing diffusion-based minority generators often rely on computationally expensive guidance dedicated to minority generation. To address this, we present a simple yet powerful guidance-free approach called *Boost-and-Skip* for generating minority samples using diffusion models. |
Soobin Um; Beomsu Kim; Jong Chul Ye; | code |
| 225 | SPMC: Self-Purifying Federated Backdoor Defense Via Margin Contribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These attacks exploit FL's decentralized nature, while existing defenses, based on isolated behaviors and fixed rules, can be bypassed by adaptive attackers. To address these limitations, we propose **SPMC**, a marginal collaboration defense mechanism that leverages intrinsic consistency across clients to estimate inter-client marginal contributions. This allows the system to dynamically reduce the influence of clients whose behavior deviates from the collaborative norm, thus maintaining robustness even as the number of attackers changes. |
Wenwen He; Wenke Huang; Bin Yang; ShuKan Liu; Mang Ye; | code |
| 226 | Splitting with Importance-aware Updating for Heterogeneous Federated Learning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our key insight is decomposing client updates into consensus and divergence components, enabling the model to maintain core capabilities while adapting to domain-specific knowledge. We propose a novel federated learning framework called **FedICU** (Splitting with **I**mportan**C**e-aware **U**pdating for Heterogeneous **Fed**erated Learning with Large Language Models), which introduces an aggregation mechanism that dynamically balances these components based on their contribution to global model performance, while implementing an importance-aware parameter updating strategy to prevent catastrophic forgetting and domain overfitting. |
Yangxu Liao; Wenke Huang; Guancheng Wan; Jian Liang; Bin Yang; Mang Ye; | code |
| 227 | GHOST: Generalizable One-Shot Federated Graph Learning with Proxy-Based Topology Knowledge Retention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these issues, we introduce **GHOST**, an innovative one-shot FGL framework. In GHOST, we establish a proxy model for each client to leverage diverse local knowledge and integrate it to train the global model. |
Jiaru Qian; Guancheng Wan; Wenke Huang; Guibin Zhang; Yuxin Wu; Bo Du; Mang Ye; | code |
| 228 | Privacy Attacks on Image AutoRegressive Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the privacy risks associated with IARs remain unexplored, raising concerns regarding their responsible deployment. To address this gap, we conduct a comprehensive privacy analysis of IARs, comparing their privacy risks to the ones of DMs as reference points. |
Antoni Kowalczuk; Jan Dubiński; Franziska Boenisch; Adam Dziedzic; | code |
| 229 | Update Your Transformer to The Latest Release: Re-Basin of Task Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate how to transfer fine-tuning to a new checkpoint without having to re-train, in a data-free manner. |
Filippo Rinaldi; Giacomo Capitani; Lorenzo Bonicelli; Donato Crisostomi; Federico Bolelli; Elisa Ficarra; Emanuele Rodolà; Simone Calderara; Angelo Porrello; | code |
| 230 | Scaling Large Motion Models with Million-Level Human Motions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better integrate the motion modality, we propose Motionbook, an innovative motion encoding approach including (1) a compact yet lossless feature to represent motions; (2) a novel 2D lookup-free motion tokenizer that preserves fine-grained motion details while expanding codebook capacity, significantly enhancing the representational power of motion tokens. |
Ye Wang; Sipeng Zheng; Bin Cao; Qianshan Wei; Weishuai Zeng; Qin Jin; Zongqing Lu; | code |
| 231 | Regularized Langevin Dynamics for Combinatorial Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work proposes a simple yet effective sampling framework for combinatorial optimization (CO). |
Shengyu Feng; Yiming Yang; | code |
| 232 | EasyInv: Toward Fast and Better DDIM Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces EasyInv, an easy yet novel approach that significantly advances the field of DDIM Inversion by addressing the inherent inefficiencies and performance limitations of traditional iterative optimization methods. |
Ziyue Zhang; Mingbao Lin; Shuicheng Yan; Rongrong Ji; | code |
| 233 | Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, weight misalignment and complex gradient dynamics make it challenging to adopt SVD prior to the LoRA MoE architecture. To mitigate these issues, we propose **G**reat L**o**R**A** Mixture-of-Exper**t** (GOAT), a framework that (1) adaptively integrates relevant priors using an SVD-structured MoE, and (2) aligns optimization with full fine-tuned MoE by deriving a theoretical scaling factor. |
Chenghao Fan; Zhenyi Lu; Sichen Liu; Chengfeng Gu; Xiaoye Qu; Wei Wei; Yu Cheng; | code |
| 234 | Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The alignment of large language models (LLMs) often assumes that using more clean data yields better outcomes, overlooking the match between model capacity and example difficulty. Challenging this, we propose a new principle: *Preference data vary in difficulty, and overly difficult examples hinder alignment, by exceeding the model’s capacity*. |
Chengqian Gao; Haonan Li; Liu Liu; Zeke Xie; Peilin Zhao; Zhiqiang Xu; | code |
| 235 | CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: By introducing and analyzing the matching mechanism between Core Neurons and Core Tokens, we found that key neurons and tokens for inference mutually influence and reinforce each other. Building on this insight, we propose CoreMatching, a co-adaptive sparse inference framework, which leverages the synergy between token and neuron sparsity to enhance inference efficiency. |
Qinsi Wang; Hancheng Ye; Ming-Yu Chung; Yudong Liu; Yueqian Lin; Martin Kuo; Mingyuan Ma; Jianyi Zhang; Yiran Chen; | code |
| 236 | From RAG to Memory: Non-Parametric Continual Learning for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their performance on more basic factual memory tasks drops considerably below standard RAG. We address this unintended deterioration and propose HippoRAG 2, a framework that outperforms standard RAG comprehensively on factual, sense-making, and associative memory tasks. |
Bernal Jiménez Gutiérrez; Yiheng Shu; Weijian Qi; Sizhe Zhou; Yu Su; | code |
| 237 | Emotional Face-to-Speech Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a new task, termed *emotional face-to-speech*, aiming to synthesize emotional speech directly from expressive facial cues. |
Jiaxin Ye; Boyuan Cao; Hongming Shan; | code |
| 238 | Perception in Reflection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a perception in reflection paradigm designed to transcend the limitations of current large vision-language models (LVLMs), which are expected yet often fail to achieve perfect perception initially. |
Yana Wei; Liang Zhao; Kangheng Lin; En Yu; Yuang Peng; Runpei Dong; Jianjian Sun; Haoran Wei; Zheng Ge; Xiangyu Zhang; Vishal M. Patel; | code |
| 239 | DCTdiff: Intriguing Properties of Image Generative Modeling in The DCT Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper explores image modeling from the frequency space and introduces DCTdiff, an end-to-end diffusion generative paradigm that efficiently models images in the discrete cosine transform (DCT) space. |
Mang Ning; Mingxiao Li; Jianlin Su; Jia Haozhe; Lanmiao Liu; Martin Benes; Wenshuo Chen; Albert Ali Salah; Itir Onal Ertugrul; | code |
| 240 | Synthetic Text Generation for Training Large Language Models Via Gradient Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose the first theoretically rigorous approach for generating synthetic human-readable text that provides convergence, performance, and privacy guarantees for fine-tuning LLMs on a target task. |
Dang Nguyen; Zeman Li; Mohammadhossein Bateni; Vahab Mirrokni; Meisam Razaviyayn; Baharan Mirzasoleiman; | code |
| 241 | FlashTP: Fused, Sparsity-Aware Tensor Product for Machine Learning Interatomic Potentials Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While equivariant MLIPs achieve state-of-the-art accuracy, they face significant computational bottlenecks centered around their Tensor-Product layer, which accounts for up to 75% of training time and causes substantial memory overhead. We present FlashTP, a highly optimized tensor-product library that addresses these inefficiencies through kernel fusion, sparse computation, and path-aggregated execution. |
Seung Yul Lee; Hojoon Kim; Yutack Park; Dawoon Jeong; Seungwu Han; Yeonhong Park; Jae W. Lee; | code |
| 242 | Cross-Modal Alignment Via Variational Copula Modelling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Copula is a powerful statistical structure in modelling the interactions between variables, as it bridges the joint distribution and marginal distributions of multiple variables. In this paper, we propose a novel copula modelling-driven multimodal learning framework, which focuses on learning the joint distribution of various modalities to capture the complex interaction among them. |
Feng Wu; Tsai Hor Chan; Fuying Wang; Guosheng Yin; Lequan Yu; | code |
| 243 | Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate strategies for Vim and propose Stochastic Layer-Wise Shuffle (SLWS), a novel regularization method that can effectively improve the Vim training. |
Zizheng Huang; Haoxing Chen; Jiaqi Li; Jun Lan; Huijia Zhu; Weiqiang Wang; Limin Wang; | code |
| 244 | AdaptiveStep: Automatically Dividing Reasoning Step Through Model Confidence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These approaches overlook the fact that certain words don’t usually indicate true decision points. To address this, we propose AdaptiveStep, a method that divides reasoning steps based on the model’s confidence in predicting the next word, offering more information on decision-making at each step, improving downstream tasks like reward model training. |
Yuliang Liu; Junjie Lu; Chaofeng Qu; Zhaoling Chen; Zefan Cai; Jason Klein Liu; Chonghan Liu; Yunhui Xia; Li Zhao; Jiang Bian; Chuheng Zhang; Wei Shen; Zhouhan Lin; | code |
| 245 | Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that *sparse coding* offers a compelling alternative for achieving adaptive representation with minimal overhead and higher fidelity. |
Tiansheng Wen; Yifei Wang; Zequn Zeng; Zhong Peng; Yudi Su; Xinyang Liu; Bo Chen; Hongwei Liu; Stefanie Jegelka; Chenyu You; | code |
| 246 | LEAPS: A Discrete Neural Sampler Via Locally Equivariant Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose *LEAPS*, an algorithm to sample from discrete distributions known up to normalization by learning a rate matrix of a continuous-time Markov chain (CTMC). To derive these importance weights, we introduce a set of Radon-Nikodym derivatives of CTMCs over their path measures. |
Peter Holderrieth; Michael Samuel Albergo; Tommi Jaakkola; | code |
| 247 | Task-Gated Multi-Expert Collaboration Network for Degraded Multi-Modal Image Fusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, real-world imaging often suffers from degradation issues, such as noise, blur, and haze in visible imaging, as well as stripe noise in infrared imaging, which significantly degrades model performance. To address these challenges, we propose a task-gated multi-expert collaboration network (TG-ECNet) for degraded multi-modal image fusion. |
Yiming Sun; Xin Li; Pengfei Zhu; Qinghua Hu; Dongwei Ren; Huiying Xu; Xinzhong Zhu; | code |
| 248 | SMART-PC: Skeletal Model Adaptation for Robust Test-Time Training in Point Clouds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SMART-PC, a skeleton-based framework that enhances resilience to corruptions by leveraging the geometric structure of 3D point clouds. |
Ali Bahri; Moslem Yazdanpanah; Sahar Dastani; Mehrdad Noori; Gustavo Adolfo Vargas Hakim; David OSOWIECHI; Farzad Beizaee; Ismail Ben Ayed; Christian Desrosiers; | code |
| 249 | SpikeVideoFormer: An Efficient Spike-Driven Video Transformer with Hamming Attention and $\mathcal{O}(T)$ Complexity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SpikeVideoFormer, an efficient spike-driven video Transformer, featuring linear temporal complexity $\mathcal{O}(T)$. |
Shihao Zou; Qingfeng Li; Wei Ji; Jingjing Li; Yongkui Yang; Guoqi Li; Chao Dong; | code |
| 250 | Compressed Image Generation with Denoising Diffusion Codebook Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel generative approach based on Denoising Diffusion Models (DDMs), which produces high-quality image samples *along* with their losslessly compressed bit-stream representations. |
Guy Ohayon; Hila Manor; Tomer Michaeli; Michael Elad; | code |
| 251 | Unlocking Post-hoc Dataset Inference with Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Such in-distribution, held-out data is rarely available in practice, severely limiting the applicability of DI. In this work, we address this challenge by synthetically generating the required held-out set. |
Bihe Zhao; Pratyush Maini; Franziska Boenisch; Adam Dziedzic; | code |
| 252 | ADHMR: Aligning Diffusion-based Human Mesh Recovery Via Direct Preference Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Probabilistic methods have tried to solve this by generating numerous plausible 3D human mesh predictions, but they often exhibit misalignment with 2D image observations and weak robustness to in-the-wild images. To address these issues, we propose ADHMR, a framework that **A**ligns a **D**iffusion-based **HMR** model in a preference optimization manner. |
Wenhao Shen; Wanqi Yin; Xiaofeng Yang; Cheng Chen; Chaoyue Song; Zhongang Cai; Lei Yang; Hao Wang; Guosheng Lin; | code |
| 253 | Hybrid Batch Normalisation: Resolving The Dilemma of Batch Normalisation in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we resolve the dilemma of the BN layer in federated learning by developing a customised normalisation approach, Hybrid Batch Normalisation (HBN). |
Hongyao Chen; Tianyang Xu; Xiaojun Wu; Josef Kittler; | code |
| 254 | Curvature Enhanced Data Augmentation for Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, a novel manifold learning approach for generating synthetic data was proposed, utilizing a first-order approximation of the data manifold. Building on this foundation, we present a theoretical framework and practical tools for approximating and sampling general data manifolds. |
Ilya Kaufman; Omri Azencot; | code |
| 255 | Neural Graph Matching Improves Retrieval Augmented Generation in Molecular Machine Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We apply this approach to mass spectrum simulation and introduce MARASON, a novel model that incorporates neural graph matching to enhance a fragmentation-based neural network. |
Runzhong Wang; Rui-Xi Wang; Mrunali Manjrekar; Connor W. Coley; | code |
| 256 | Morse: Dual-Sampling for Lossless Acceleration of Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present *Morse*, a simple dual-sampling framework for accelerating diffusion models losslessly. |
Chao Li; Jiawei Fan; Anbang Yao; | code |
| 257 | Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we analyze and identify samples within benign datasets that contribute most to safety degradation, then fine-tune LLMs exclusively on these samples. We approach this problem from an outlier detection perspective and propose Self-Inf-N to detect and extract outliers for fine-tuning. |
Zihan Guan; Mengxuan Hu; Ronghang Zhu; Sheng Li; Anil Vullikanti; | code |
| 258 | ExpProof : Operationalizing Explanations for Confidential Models with ZKPs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. |
Chhavi Yadav; Evan Laufer; Dan Boneh; Kamalika Chaudhuri; | code |
| 259 | Learning Fused State Representations for Control from Multi-View Observations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose **M**ulti-view **F**usion **S**tate for **C**ontrol (**MFSC**), firstly incorporating bisimulation metric learning into MVRL to learn task-relevant representations. |
Zeyu Wang; Yao-Hui Li; Xin Li; Hongyu Zang; Romain Laroche; Riashat Islam; | code |
| 260 | Taming Knowledge Conflicts in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Previous works attribute this conflict to the interplay between memory heads and context heads, attention heads assumed to promote either memory or context exclusively. In this study, we go beyond this fundamental assumption by uncovering a critical phenomenon we term the *superposition of contextual information and parametric memory*, where highly influential attention heads simultaneously contribute to both memory and context. |
Gaotang Li; Yuzhong Chen; Hanghang Tong; | code |
| 261 | TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional models typically process all variables or time points uniformly, which limits their ability to capture complex variable relationships and obtain non-trivial time representations. To address this issue, we propose TimePro, an innovative Mamba-based model that constructs variate- and time-aware hyper-states. |
Xiaowen Ma; Zhen-Liang Ni; Shuai Xiao; Xinghao Chen; | code |
| 262 | The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models Via Visual Information Steering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the internal dynamics of hallucination by examining the tokens' logit rankings throughout the generation process, revealing three key patterns in how LVLMs process information: (1) *gradual visual information loss* — visually grounded tokens gradually become less favored throughout generation, and (2) *early excitation* — semantically meaningful tokens achieve peak activation in the layers earlier than the final layer. |
Zhuowei Li; Haizhou Shi; Yunhe Gao; Di Liu; Zhenting Wang; Yuxiao Chen; Ting Liu; Long Zhao; Hao Wang; Dimitris N. Metaxas; | code |
| 263 | On Explaining Equivariant Graph Networks Via Improved Relevance Propagation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current XAI techniques either struggle to adapt to equivariant GNNs or fail to effectively handle positional data and evaluate the significance of geometric features adequately. To address these challenges, we introduce a novel method, known as EquiGX, which uses the Deep Taylor decomposition framework to extend the layer-wise relevance propagation rules tailored for spherical equivariant GNNs. |
Hongyi Ling; Haiyang Yu; Zhimeng Jiang; Na Zou; Shuiwang Ji; | code |
| 264 | One Diffusion Step to Real-World Super-Resolution Via Flow Trajectory Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing one-step diffusion methods are constrained by the performance of the teacher model, where poor teacher performance results in image artifacts. To address this limitation, we propose FluxSR, a novel one-step diffusion Real-ISR technique based on flow matching models. |
Jianze Li; Jiezhang Cao; Yong Guo; Wenbo Li; Yulun Zhang; | code |
| 265 | Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we identify two distinct goals of loss reweighting, namely, Saturation and Importance—the former indicates that those insufficiently optimized data should be emphasized, while the latter stresses some critical data that are most influential for loss minimization. |
Puning Yang; Qizhou Wang; Zhuo Huang; Tongliang Liu; Chengqi Zhang; Bo Han; | code |
| 266 | NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue Via Next-Token-Pair Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we systematically explore the use of dual-channel speech data in the context of modern large language models, and introduce a novel generative modeling paradigm—Next-Token-Pair Prediction (NTPP)—to enable speaker-independent dual-channel spoken dialogue learning using decoder-only architectures for the first time. |
Qichao Wang; Ziqiao Meng; Wenqian Cui; Yifei Zhang; Pengcheng Wu; Bingzhe Wu; Irwin King; Liang Chen; Peilin Zhao; | code |
| 267 | Stabilizing Sample Similarity in Representation Via Mitigating Random Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify random consistency—an inherent bias in Euclidean distance metrics—as a key obstacle to reliable evaluation, affecting both fairness and discrimination. To address this, we derive the expected Euclidean distance under uniformly distributed label permutations and introduce its closed-form solution, the Pure Square Euclidean Distance (PSED), which provably eliminates random consistency. |
Jieting Wang; ZhangZelong; Feijiang Li; Yuhua Qian; Xinyan Liang; | code |
| 268 | Clients Collaborate: Flexible Differentially Private Federated Learning with Guaranteed Improvement of Utility-Privacy Trade-off Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel federated learning framework with rigorous privacy guarantees, named **FedCEO**, designed to strike a trade-off between model utility and user privacy by letting clients ***C**ollaborate with **E**ach **O**ther*. |
Yuecheng Li; Lele Fu; Tong Wang; Jian Lou; Bin Chen; Lei Yang; Jian Shen; Zibin Zheng; Chuan Chen; | code |
| 269 | Earley-Driven Dynamic Pruning for Efficient Structured Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, creating this mask requires checking the validity of all tokens in the LLM vocabulary at every decoding step, which often incurs significant overheads in existing constrained decoding engines. To address this challenge, we propose **ZapFormat**, a novel **dynamic pruning** strategy based on the Earley algorithm that identifies and eliminates invalid or redundant Earley states in real-time, significantly reducing memory occupation of the Earley algorithm’s states. |
Xintong Sun; Chi Wei; Minghao Tian; Shiwen Ni; | code |
| 270 | From Uncertain to Safe: Conformal Adaptation of Diffusion Models for Safe PDE Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods rarely consider safety requirements crucial in real-world applications. To address this limitation, we propose Safe Diffusion Models for PDE Control (SafeDiffCon), which introduce the uncertainty quantile as model uncertainty quantification to achieve optimal control under safety constraints through both post-training and inference phases. |
Peiyan Hu; Xiaowei Qian; Wenhao Deng; Rui Wang; Haodong Feng; Ruiqi Feng; Tao Zhang; Long Wei; Yue Wang; Zhi-Ming Ma; Tailin Wu; | code |
| 271 | ParallelComp: Parallel Long-Context Compressor for Length Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose ParallelComp, a parallel long-context compression method that effectively overcomes the memory bottleneck, enabling 8B-parameter LLMs to extrapolate from 8K to 128K tokens on a single A100 80GB GPU in a training-free setting. |
Jing Xiong; Jianghan Shen; Chuanyang Zheng; Zhongwei Wan; Chenyang Zhao; Chiwun Yang; Fanghua Ye; Hongxia Yang; Lingpeng Kong; Ngai Wong; | code |
| 272 | Incorporating Arbitrary Matrix Group Equivariance Into KANs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Equivariant Kolmogorov-Arnold Networks (EKAN), a method for incorporating arbitrary matrix group equivariance into KANs, aiming to broaden their applicability to more fields. |
Lexiang Hu; Yisen Wang; Zhouchen Lin; | code |
| 273 | Explicit Discovery of Nonlinear Symmetries from Dynamic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose LieNLSD, which is, to our knowledge, the first method capable of determining the number of infinitesimal generators with nonlinear terms and their explicit expressions. |
Lexiang Hu; Yikang Li; Zhouchen Lin; | code |
| 274 | Flat-LoRA: Low-Rank Adaptation Over A Flat Loss Landscape Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Flat-LoRA, which aims to identify a low-rank adaptation situated in a flat region of the full parameter space. |
Tao Li; Zhengbao He; Yujun Li; Yasheng Wang; Lifeng Shang; Xiaolin Huang; | code |
| 275 | TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, coarse-grained clustering struggles to capture complex, time-varying interactions effectively. To address these challenges, we propose TimeFilter, a GNN-based framework for adaptive and fine-grained dependency modeling. |
Yifan Hu; Guibin Zhang; Peiyuan Liu; Disen Lan; Naiqi Li; Dawei Cheng; Tao Dai; Shu-Tao Xia; Shirui Pan; | code |
| 276 | Non-stationary Diffusion For Probabilistic Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we innovatively utilize the Location-Scale Noise Model (LSNM) to relax the fixed uncertainty assumption of ANM. |
Weiwei Ye; Zhuopeng Xu; Ning Gui; | code |
| 277 | Hgformer: Hyperbolic Graph Transformer for Collaborative Filtering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite this remarkable progress, local structure modeling and embedding distortion still remain two notable limitations in the majority of GNN-based CF methods. Therefore, in this paper, we propose a novel Hyperbolic Graph Transformer architecture to tackle the long-tail problems in CF tasks. |
Xin Yang; Xingrun Li; Heng Chang; Yang jinze; Xihong Yang; Shengyu Tao; Maiko Shigeno; Ningkang Chang; Junfeng Wang; Dawei Yin; Erxue Min; | code |
| 278 | TeLoGraF: Temporal Logic Planning Via Graph-encoded Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose TeLoGraF, Temporal Logic Graph-encoded Flow, which utilizes Graph Neural Networks (GNN) encoder and flow-matching to learn solutions for general STL specifications. |
Yue Meng; Chuchu Fan; | code |
| 279 | EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While language-centric embodied agents have garnered substantial attention, MLLM-based embodied agents remain underexplored due to the lack of comprehensive evaluation frameworks. To bridge this gap, we introduce EmbodiedBench, an extensive benchmark designed to evaluate vision-driven embodied agents. |
Rui Yang; Hanyang Chen; Junyu Zhang; Mark Zhao; Cheng Qian; Kangrui Wang; Qineng Wang; Teja Venkat Koripella; Marziyeh Movahedi; Manling Li; Heng Ji; Huan Zhang; Tong Zhang; | code |
| 280 | Efficient Multi-modal Long Context Learning for Training-free Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces Efficient Multi-Modal Long Context Learning (EMLoC), a novel training-free alternative that embeds demonstration examples directly into the model input. |
Zehong Ma; Shiliang Zhang; Longhui Wei; Qi Tian; | code |
| 281 | Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we theoretically demonstrate that initial tokens in the draft sequence are more important than later ones. Building on this insight, we propose Gumiho, a hybrid model combining serial and parallel heads. |
Jinze Li; Yixing Xu; Haiduo Huang; Xuanwu Yin; Dong Li; Edith C. H. Ngai; Emad Barsoum; | code |
| 282 | UniDB: A Unified Diffusion Bridge Framework Via Stochastic Optimal Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these approaches frequently produce blurred or excessively smoothed image details and lack a comprehensive theoretical foundation to explain these shortcomings. To address these limitations, we propose UniDB, a unified framework for diffusion bridges based on Stochastic Optimal Control (SOC). |
Kaizhen Zhu; Mokai Pan; Yuexin Ma; Yanwei Fu; Jingyi Yu; Jingya Wang; Ye Shi; | code |
| 283 | Hierarchical Planning for Complex Tasks with Knowledge Graph-RAG and Symbolic Verification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose a neuro-symbolic approach that enhances LLM-based planners with Knowledge Graph-based RAG for hierarchical plan generation. |
Flavio Petruzzellis; Cristina Cornelio; Pietro Lio; | code |
| 284 | Overcoming Non-monotonicity in Transducer-based Streaming Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, its input-synchronous decoding mechanism presents challenges in tasks requiring non-monotonic alignments, such as simultaneous translation. In this research, we address this issue by integrating Transducer’s decoding with the history of input stream via a learnable monotonic attention. |
Zhengrui Ma; Yang Feng; Min Zhang; | code |
| 285 | OneForecast: A Universal Framework for Global and Regional Weather Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In recent years, deep learning models have made significant progress in weather forecasting, but challenges remain, such as balancing global and regional high-resolution forecasts, excessive smoothing in extreme event predictions, and insufficient dynamic system modeling. To address these issues, this paper proposes a global-regional nested weather forecasting framework (OneForecast) based on graph neural networks. |
Yuan Gao; Hao Wu; Ruiqi Shu; Huanshuo Dong; Fan Xu; Rui Ray Chen; Yibo Yan; Qingsong Wen; Xuming Hu; Kun Wang; Jiahao Wu; Li Qing; Hui Xiong; Xiaomeng Huang; | code |
| 286 | Efficient Motion Prompt Learning for Robust Visual Tracking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a lightweight and plug-and-play motion prompt tracking method. |
Jie Zhao; Xin Chen; Yongsheng Yuan; Michael Felsberg; Dong Wang; Huchuan Lu; | code |
| 287 | Imitation Learning from A Single Temporally Misaligned Video Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our key insight is that matching should instead be defined at the level of sequences. |
William Huey; Huaxiaoyue Wang; Anne Wu; Yoav Artzi; Sanjiban Choudhury; | code |
| 288 | AtlasD: Automatic Local Symmetry Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we formalize the notion of local symmetry as atlas equivariance. |
Manu Bhat; Jonghyun Park; Jianke Yang; Nima Dehmamy; Robin Walters; Rose Yu; | code |
| 289 | MCU: An Evaluation Framework for Open-Ended Game Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, evaluating such open-ended agents remains difficult, with current benchmarks facing scalability limitations. To address this, we introduce *Minecraft Universe* (MCU), a comprehensive evaluation framework set within the open-world video game Minecraft. |
Xinyue Zheng; Haowei Lin; Kaichen He; Zihao Wang; Qiang Fu; Haobo Fu; Zilong Zheng; Yitao Liang; | code |
| 290 | POQD: Performance-Oriented Query Decomposer for Multi-vector Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Even worse, jointly solving this problem and training the downstream retrieval-based systems, say, RAG systems, could be highly inefficient. To overcome these challenges, we propose Performance-Oriented Query Decomposer (POQD), a novel query decomposition framework for MVR. |
Yaoyang Liu; Junlin Li; Yinjun Wu; Zhen Chen; | code |
| 291 | FSTLLM: Spatio-Temporal LLM for Few Shot Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models typically require large volumes of training data and often struggle in data-scarce scenarios. To address this limitation, we propose a framework named Few-shot Spatio-Temporal Large Language Models (FSTLLM), aimed at enhancing model robustness and predictive performance in few-shot settings. |
Yue Jiang; Yile Chen; Xiucheng Li; Qin Chao; Shuai Liu; Gao Cong; | code |
| 292 | Interpreting CLIP with Hierarchical Sparse Autoencoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce Matryoshka SAE (MSAE), a new architecture that learns hierarchical representations at multiple granularities simultaneously, enabling a direct optimization of both metrics without compromise. |
Vladimir Zaigrajew; Hubert Baniecki; Przemyslaw Biecek; | code |
| 293 | Whoever Started The Interference Should End It: Guiding Data-Free Model Merging Via Task Vectors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although prior work has explored many merging strategies, resolving interference without additional data for retraining or test-time computation remains challenging. In this paper, we theoretically demonstrate that the task vectors of the linear layer constitute an approximate linear subspace for its corresponding input. |
Runxi Cheng; Feng Xiong; Yongxian Wei; Wanyun Zhu; Chun Yuan; | code |
| 294 | Directly Forecasting Belief for Reinforcement Learning with Delays Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: State-of-the-art (SOTA) methods typically employ recursive, step-by-step forecasting of states. |
Qingyuan Wu; Yuhui Wang; Simon Sinong Zhan; Yixuan Wang; Chung-Wei Lin; Chen Lv; Qi Zhu; Jürgen Schmidhuber; Chao Huang; | code |
| 295 | Action Dubber: Timing Audible Actions Via Inflectional Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the task of Audible Action Temporal Localization, which aims to identify the spatio-temporal coordinates of audible movements. To support this task, we introduce a new benchmark dataset, *Audible623*, derived from Kinetics and UCF101 by removing non-essential vocalization subsets. |
Wenlong Wan; Weiying Zheng; Tianyi Xiang; Guiqing Li; Shengfeng He; | code |
| 296 | VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods cannot scale up to extremely large scenes, due to the inefficient tracking and mapping strategies that need to optimize all 3D Gaussians in the limited GPU memories throughout the training to maintain the geometry and color consistency to previous RGBD observations. To resolve this issue, we propose novel tracking and mapping strategies to work with a novel 3D representation, dubbed view-tied 3D Gaussians, for RGBD SLAM systems. |
Pengchong Hu; Zhizhong Han; | code |
| 297 | What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing benchmarks face significant limitations, including uncontrollable task complexity, extensive manual annotation, and a lack of multidimensional evaluation. In response to these challenges, we introduce OmniBench, a self-generating, graph-based benchmark with an automated pipeline for synthesizing tasks of controllable complexity through subtask composition. |
Wendong Bu; Yang Wu; Qifan Yu; Minghe Gao; Bingchen Miao; Zhenkui Zhang; Kaihang Pan; liyunfei; Mengze Li; Wei Ji; Juncheng Li; Siliang Tang; Yueting Zhuang; | code |
| 298 | Feature Shift Localization Network Highlight: In this work, we introduce the Feature Shift Localization Network (FSL-Net), a neural network that can localize feature shifts in large and high-dimensional datasets in a fast and accurate manner. |
Míriam Barrabés; Daniel Mas Montserrat; Kapal Dev; Alexander G. Ioannidis; | code |
| 299 | LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models Highlight: This paper introduces LlavaGuard, a suite of VLM-based vision safeguards that address the critical need for reliable tools in the era of large-scale data and models. To teach a VLM safeguard about safety, we further create a multimodal safety dataset with high-quality human expert annotations, where each image is labeled with a safety rating, category, and rationale. |
Lukas Helff; Felix Friedrich; Manuel Brack; Kristian Kersting; Patrick Schramowski; | code |
| 300 | Enhancing Treatment Effect Estimation Via Active Learning: A Counterfactual Covering Perspective Highlight: To reduce the bound, we propose a greedy radius reduction algorithm, which excels under an idealized, balanced data distribution. |
Hechuan Wen; Tong Chen; Mingming Gong; Li Kheng Chai; Shazia Sadiq; Hongzhi Yin; | code |
| 301 | Latent Imputation Before Prediction: A New Computational Paradigm for De Novo Peptide Sequencing Highlight: However, the issue of missing fragmentation, attributable to factors such as suboptimal fragmentation efficiency and instrumental constraints, presents a formidable challenge in practical applications. To tackle this obstacle, we propose a novel computational paradigm called $\underline{\textbf{L}}$atent $\underline{\textbf{I}}$mputation before $\underline{\textbf{P}}$rediction (LIPNovo). |
Ye Du; Chen Yang; Nanxi Yu; Wanyu Lin; Qian Zhao; Shujun Wang; | code |
| 302 | TLLC: Transfer Learning-based Label Completion for Crowdsourcing Highlight: However, in real-world scenarios, workers typically annotate only a few instances, leading to insufficient worker modeling and thus limiting the improvement of label completion. To address this issue, we propose a novel transfer learning-based label completion (TLLC) method. |
Wenjun Zhang; Liangxiao Jiang; Chaoqun Li; | code |
| 303 | FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing Highlight: Though Rectified Flows (ReFlows) with distillation offer a promising way for fast sampling, fast inversion that transforms images back to structured noise for recovery and subsequent editing remains unsolved. This paper introduces FireFlow, an embarrassingly simple yet effective zero-shot approach that inherits the startling capacity of ReFlow-based models (such as FLUX) in generation while extending its capabilities to accurate inversion and editing in **8** steps. |
Yingying Deng; Xiangyu He; Changwang Mei; Peisong Wang; Fan Tang; | code |
| 304 | ROME Is Forged in Adversity: Robust Distilled Datasets Via Information Bottleneck Highlight: While adversarial robustness has been extensively studied in related fields, research on improving DD robustness is still limited. To address this, we propose ROME, a novel method that enhances the adversarial RObustness of DD by leveraging the InforMation BottlenEck (IB) principle. |
Zheng Zhou; Wenquan Feng; Qiaosheng Zhang; Shuchang Lyu; Qi Zhao; Guangliang Cheng; | code |
| 305 | Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation Highlight: Despite these advancements, most loss functions are still primarily pixel-wise, while regional and boundary-focused loss functions often incur high computational costs or are restricted to small-scale regions. To address this limitation, we propose the complex wavelet mutual information (CWMI) loss, a novel loss function that leverages mutual information from subband images decomposed by a complex steerable pyramid. |
Renhao Lu; | code |
| 306 | One Wave To Explain Them All: A Unifying Perspective On Feature Attribution Highlight: Feature attribution methods aim to improve the transparency of deep neural networks by identifying the input features that influence a model’s decision. |
Gabriel Kasmi; Amandine Brunetto; Thomas Fel; Jayneel Parekh; | code |
| 307 | Modeling All-Atom Glycan Structures Via Hierarchical Message Passing and Multi-Scale Pre-training Highlight: However, previous methods mainly focused on modeling the backbone structure of glycans as graphs of monosaccharides (i.e., sugar units), while neglecting the atomic structures underlying each monosaccharide, which are actually important indicators of glycan properties. We fill this gap by introducing the GlycanAA model for All-Atom-wise Glycan modeling. |
Minghao Xu; Jiaze Song; Keming Wu; Xiangxin Zhou; Bin CUI; Wentao Zhang; | code |
| 308 | Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing Highlight: To tackle this, we introduce a new metric, Balancing Preservation Modification (BPM), tailored for instruction-based image editing; it explicitly disentangles the image into editing-relevant and irrelevant regions so each can be assessed separately. |
Zhuoying Li; Zhu Xu; Yuxin Peng; Yang Liu; | code |
| 309 | RollingQ: Reviving The Cooperation Dynamics in Multimodal Transformer Highlight: To revive adaptability, we propose a simple yet effective method, Rolling Query (RollingQ), which balances attention allocation by rotating the query to break the self-reinforcing cycle and mitigate the key distribution gap. |
HaoTian Ni; Yake Wei; Hang Liu; Gong Chen; Chong Peng; Hao Lin; Di Hu; | code |
| 310 | Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning Highlight: In this paper, we investigate token quality from a noisy-label perspective and propose a generic token cleaning pipeline for SFT tasks. |
Jinlong Pang; Na Di; Zhaowei Zhu; Jiaheng Wei; Hao Cheng; Chen Qian; Yang Liu; | code |
| 311 | KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference Highlight: Experimental results show that we can achieve nearly lossless 3.25-bit mixed precision KV cache quantization for LLMs like Llama-3.1-8B-Instruct and 4.0-bit for sensitive models like Qwen2.5-7B-Instruct on mathematical reasoning tasks. |
Xing Li; Zeyu XING; Yiming Li; Linping Qu; Hui-Ling Zhen; Yiwu Yao; Wulong Liu; Sinno Jialin Pan; Mingxuan Yuan; | code |
| 312 | CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models Highlight: However, existing benchmarks and models are primarily limited to a small set of modalities and tasks, which hinders the development of large-scale multimodal methods that can make holistic assessments of patient health and well-being. To bridge this gap, we introduce Clinical Large-scale Integrative Multimodal Benchmark (CLIMB), a comprehensive clinical benchmark unifying diverse clinical data across imaging, language, temporal, and graph modalities. |
Wei Dai; Peilin Chen; Malinda Lu; Daniel A Li; Haowen Wei; Hejie Cui; Paul Pu Liang; | code |
| 313 | Posterior Inference with Diffusion Models for High-dimensional Black-box Optimization Highlight: However, those methods often underperform compared to BO methods due to limited expressivity and the difficulty of uncertainty estimation in high-dimensional spaces. To overcome these issues, we introduce \textbf{DiBO}, a novel framework for solving high-dimensional black-box optimization problems. |
Taeyoung Yun; Kiyoung Om; Jaewoo Lee; Sujin Yun; Jinkyoo Park; | code |
| 314 | Automatically Interpreting Millions of Features in Large Language Models Highlight: In this work, we build an open-source automated pipeline to generate and evaluate natural language interpretations for SAE latents using LLMs. |
Gonçalo Santos Paulo; Alex Troy Mallen; Caden Juang; Nora Belrose; | code |
| 315 | Gradient Aligned Regression Via Pairwise Losses Highlight: In this work, we propose GAR (Gradient Aligned Regression) as a competitive alternative method in label space, which combines a conventional regression loss with two pairwise label-difference losses that align gradients in both magnitude and direction. |
Dixian Zhu; Tianbao Yang; Livnat Jerby; | code |
| 316 | Diffusion on Language Model Encodings for Protein Sequence Generation Highlight: Here, we present *DiMA*, a latent diffusion framework that operates on protein language model representations. |
Viacheslav Meshchaninov; Pavel Strashnov; Andrey Shevtsov; Fedor Nikolaev; Nikita Ivanisenko; Olga Kardymon; Dmitry Vetrov; | code |
| 317 | GTR: A General, Multi-View, and Dynamic Framework for Trajectory Representation Learning Highlight: To this end, we propose GTR, a general, multi-view, and dynamic Trajectory Representation framework built on a pre-train and fine-tune architecture. |
Xiangheng Wang; Ziquan Fang; Chenglong Huang; Danlei Hu; Lu Chen; Yunjun Gao; | code |
| 318 | UDora: A Unified Red Teaming Framework Against LLM Agents By Dynamically Hijacking Their Own Reasoning Highlight: In this work, we present UDora, a unified red teaming framework designed for LLM agents that dynamically hijacks the agent’s reasoning processes to compel malicious behavior. |
Jiawei Zhang; Shuang Yang; Bo Li; | code |
| 319 | SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models Highlight: However, effectively embedding precise safety knowledge into MLLMs for autonomous driving remains a significant challenge. To address this, we propose SafeAuto, a framework that enhances MLLM-based autonomous driving by incorporating both unstructured and structured knowledge. |
Jiawei Zhang; Xuan Yang; Taiqi Wang; Yu Yao; Aleksandr Petiushko; Bo Li; | code |
| 320 | Improving LLM Video Understanding with 16 Frames Per Second Highlight: In this paper, we introduce F-16, the first multimodal LLM designed for high-frame-rate video understanding. We will release the source code, model checkpoints, and data at [https://github.com/bytedance/F-16](https://github.com/bytedance/F-16). |
Yixuan Li; Changli Tang; Jimin Zhuang; Yudong Yang; Guangzhi Sun; Wei Li; Zejun MA; Chao Zhang; | code |
| 321 | FlatQuant: Flatness Matters for LLM Quantization Highlight: In this paper, we propose FlatQuant (Fast and Learnable Affine Transformation), a new post-training quantization approach that enhances the flatness of weights and activations. |
Yuxuan Sun; Ruikang Liu; Haoli Bai; Han Bao; Kang Zhao; Yuening Li; JiaxinHu; Xianzhi Yu; Lu Hou; Chun Yuan; Xin Jiang; Wulong Liu; Jun Yao; | code |
| 322 | CAT: Contrastive Adversarial Training for Evaluating The Robustness of Protective Perturbations in Latent Diffusion Models Highlight: Extensive experiments demonstrate that our CAT method significantly reduces the effectiveness of protective perturbations in customization, urging the community to reconsider and improve the robustness of existing protective perturbations. |
Sen Peng; Mingyue Wang; Jianfei He; Jijia Yang; Xiaohua Jia; | code |
| 323 | Variational Control for Guidance in Diffusion Models Highlight: We introduce a new method within this framework that achieves state-of-the-art results on several linear, non-linear, and blind inverse problems without requiring additional model training or specificity to pixel or latent space diffusion models. |
Kushagra Pandey; Farrin Marouf Sofian; Felix Draxler; Theofanis Karaletsos; Stephan Mandt; | code |
| 324 | Can We Predict Performance of Large Models Across Vision-Language Tasks? Highlight: In this study, we propose a new framework for predicting unknown performance scores based on observed ones from other LVLMs or tasks. |
Qinyu Zhao; Ming Xu; Kartik Gupta; Akshay Asthana; Liang Zheng; Stephen Gould; | code |
| 325 | RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models Highlight: To effectively realize low-bit quantization of weights, activations and KV caches in LLMs, we propose an algorithm named Rotated Straight-Through-Estimator (RoSTE), which combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy that identifies an effective rotation configuration to reduce activation outliers. |
Quan Wei; Chung-Yiu Yau; Hoi To Wai; Yang Zhao; Dongyeop Kang; Youngsuk Park; Mingyi Hong; | code |
| 326 | Latent Thought Models with Variational Bayes Inference-Time Computation Highlight: We propose a novel class of language models, Latent Thought Models (LTMs), which incorporate explicit latent thought vectors that follow an explicit prior model in latent space. |
Deqian Kong; Minglu Zhao; Dehong Xu; Bo Pang; Shu Wang; Edouardo Honig; Zhangzhang Si; Chuan Li; Jianwen Xie; Sirui Xie; Ying Nian Wu; | code |
| 327 | From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining Highlight: As a result, these methods struggle to learn generalized representations due to their inability to model the hierarchical structure of ECG data. To address this gap, we introduce MELP, a novel Multi-scale ECG-Language Pretraining model that fully leverages hierarchical supervision from ECG-text pairs. |
Fuying Wang; Jiacheng Xu; Lequan Yu; | code |
| 328 | NICE Data Selection for Instruction Tuning in LLMs with Non-differentiable Evaluation Metric Highlight: This work aims to select training data for instruction tuning to improve the LLM performance on specific tasks. |
Jingtan Wang; Xiaoqiang Lin; Rui Qiao; Pang Wei Koh; Chuan-Sheng Foo; Bryan Kian Hsiang Low; | code |
| 329 | EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification Highlight: In this article, we propose a novel ensemble method, namely *EnsLoss*, which extends the ensemble learning concept to combine loss functions within the ERM framework. |
Ben Dai; | code |
| 330 | The Lock-in Hypothesis: Stagnation By Algorithm Highlight: The training and deployment of large language models (LLMs) create a feedback loop with human users: models learn human beliefs from data, reinforce these beliefs with generated content, reabsorb the reinforced beliefs, and feed them back to users again and again. |
Tianyi Qiu; Zhonghao He; Tejasveer Chugh; Max Kleiman-Weiner; | code |
| 331 | Empowering World Models with Reflection for Embodied Video Prediction Highlight: However, existing models often lack robust understanding, limiting their ability to perform multi-step predictions or handle Out-of-Distribution (OOD) scenarios. To address this challenge, we propose the Reflection of Generation (RoG), a set of intermediate reasoning strategies designed to enhance video prediction. |
Xiaowei Chi; Chun-Kai Fan; Hengyuan Zhang; Xingqun Qi; Rongyu Zhang; Anthony Chen; Chi-Min Chan; Wei Xue; Qifeng Liu; Shanghang Zhang; Yike Guo; | code |
| 332 | Label Distribution Propagation-based Label Completion for Crowdsourcing Highlight: However, WSLC considers only the correlation of the labels annotated by different workers on each individual instance, entirely ignoring the correlation of the labels annotated by different workers among similar instances. To fill this gap, we propose a novel label distribution propagation-based label completion (LDPLC) algorithm. |
Tong Wu; Liangxiao Jiang; Wenjun Zhang; Chaoqun Li; | code |
| 333 | Origin Identification for Text-Guided Image-to-Image Diffusion Models Highlight: However, due to *visual discrepancy* across generations produced by different diffusion models, this similarity-based approach fails when training on images from one model and testing on those from another, limiting its effectiveness in real-world applications. To solve this challenge of the proposed ID$^2$ task, we contribute the first dataset and a theoretically guaranteed method, both emphasizing generalizability. |
Wenhao Wang; Yifan Sun; Zongxin Yang; Zhentao Tan; Zhengdong Hu; Yi Yang; | code |
| 334 | Zero-Shot Offline Imitation Learning Via Optimal Transport Highlight: However, this framework can suffer from myopic behavior: the agent’s immediate actions towards achieving individual goals may undermine long-term objectives. We introduce a novel method that mitigates this issue by directly optimizing the occupancy matching objective that is intrinsic to imitation learning. |
Thomas Rupf; Marco Bagatella; Nico Gürtler; Jonas Frey; Georg Martius; | code |
| 335 | Can Transformers Learn Full Bayesian Inference in Context? Highlight: More specifically, we introduce a general framework that builds on ideas from prior fitted networks and continuous normalizing flows and enables us to infer complex posterior distributions for models such as generalized linear models and latent factor models. |
Arik Reuter; Tim G. J. Rudner; Vincent Fortuin; David Rügamer; | code |
| 336 | HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation Via Heterogeneous Knowledge Adaptation Highlight: We present **HealthGPT**, a powerful Medical Large Vision-Language Model (Med-LVLM) that integrates medical visual comprehension and generation capabilities within a unified autoregressive paradigm. |
Tianwei Lin; Wenqiao Zhang; SIJING LI; Yuqian Yuan; Binhe Yu; Haoyuan Li; Wanggui He; Hao Jiang; Mengze Li; Song xiaohui; Siliang Tang; Jun Xiao; Hui Lin; Yueting Zhuang; Beng Chin Ooi; | code |
| 337 | Stable Fair Graph Representation Learning with Lipschitz Constraint Highlight: In this work, we propose a stable fair Graph Neural Network (SFG) to maintain training stability while preserving accuracy and fairness performance. |
Qiang Chen; Zhongze Wu; Xiu Su; Xi Lin; Zhe Qu; Shan You; Shuo Yang; Chang Xu; | code |
| 338 | One Stone, Two Birds: Enhancing Adversarial Defense Through The Lens of Distributional Discrepancy Highlight: In this paper, we explore the strength of SADD-based methods by theoretically showing that minimizing distributional discrepancy can help reduce the expected loss on AEs. |
Jiacheng Zhang; Benjamin I. P. Rubinstein; Jingfeng Zhang; Feng Liu; | code |
| 339 | Bridging Layout and RTL: Knowledge Distillation Based Timing Prediction Highlight: Conversely, existing RTL-level approaches sacrifice accuracy due to the limited physical information available. We propose RTLDistil, a novel cross-stage knowledge distillation framework that bridges this gap by transferring precise physical characteristics from a layout-aware teacher model (Teacher GNN) to an efficient RTL-level student model (Student GNN), both implemented as graph neural networks (GNNs). |
Mingjun Wang; Yihan Wen; Bin Sun; Jianan Mu; Juan Li; Xiaoyi Wang; Jing Justin Ye; Bei Yu; Huawei Li; | code |
| 340 | L3A: Label-Augmented Analytic Adaptation for Multi-Label Class Incremental Learning Highlight: Multi-label CIL (MLCIL) extends CIL to a real-world scenario where each sample may belong to multiple classes, introducing several challenges: label absence, which leads to incomplete historical information due to missing labels, and class imbalance, which biases the model toward majority classes. To address these challenges, we propose Label-Augmented Analytic Adaptation (L3A), an exemplar-free approach that does not store past samples. |
Xiang Zhang; Run He; Chen Jiao; Di Fang; Ming Li; Ziqian Zeng; Cen Chen; Huiping Zhuang; | code |
| 341 | Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models Highlight: In this work, we propose a novel kernel-based method to align CLIP’s visual representation with that of DINOv2, ensuring that the resulting embeddings maintain compatibility with text embeddings while enhancing perceptual capabilities. |
Shizhan Gong; Yankai Jiang; Qi Dou; Farzan Farnia; | code |
| 342 | TMetaNet: Topological Meta-Learning Framework for Dynamic Link Prediction Highlight: Armed with the DZP ideas, we propose TMetaNet, a new meta-learning parameter update model based on dynamic topological features. |
Hao Li; Hao Wan; Yuzhou Chen; Dongsheng Ye; Yulia Gel; Hao Jiang; | code |
| 343 | Efficient Quantification of Multimodal Interaction at Sample Level Highlight: We first develop a redundancy estimation framework, employing an appropriate pointwise information measure to quantify this most decomposable and measurable interaction. Building upon this, we propose a general interaction estimation method that employs efficient entropy estimation, specifically tailored for sample-wise estimation in continuous distributions. |
Zequn Yang; Hongfa Wang; Di Hu; | code |
| 344 | Adapting Precomputed Features for Efficient Graph Condensation Highlight: To address the efficiency issue, we completely bypass trajectory matching and propose a novel two-stage framework. |
Yuan Li; Jun Hu; Zemin Liu; Bryan Hooi; Jia Chen; Bingsheng He; | code |
| 345 | Sample-specific Noise Injection for Diffusion-based Adversarial Purification Highlight: In this paper, we discover that the optimal $t^*$ can indeed differ from sample to sample. |
Yuhao Sun; Jiacheng Zhang; Zesheng Ye; Chaowei Xiao; Feng Liu; | code |
| 346 | WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving Highlight: However, such interaction analysis remains underexplored due to the lack of dedicated language datasets that address it. Therefore, we propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a comprehensive large-scale Q&A dataset built on WOMD that focuses on describing and reasoning about traffic-rule-induced interactions in driving scenarios. |
Yiheng Li; Cunxin Fan; Chongjian GE; Seth Z. Zhao; Chenran Li; Chenfeng Xu; Huaxiu Yao; Masayoshi Tomizuka; Bolei Zhou; Chen Tang; Mingyu Ding; Wei Zhan; | code |
| 347 | Enhancing Adversarial Robustness with Conformal Prediction: A Framework for Guaranteed Model Reliability Highlight: Correspondingly, we introduce OPSA-AT (Adversarial Training), a defense strategy that integrates OPSA within a novel conformal training paradigm. |
Jie Bao; Chuangyin Dang; Rui Luo; Hanwei Zhang; Zhixin Zhou; | code |
| 348 | Three-Dimensional Trajectory Prediction with 3DMoTraj Dataset Highlight: Mathematically, trajectory prediction becomes significantly more complex when transitioning from 2D to 3D. To tackle this challenge, we analyze the prediction complexity of 3D trajectories and propose a new method consisting of two key components: decoupled trajectory prediction and correlated trajectory refinement. |
Hao Zhou; Xu Yang; Mingyu Fan; Lu Qi; Xiangtai Li; Ming-Hsuan Yang; Fei Luo; | code |
| 349 | GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation Highlight: We present Graph Inspired Veracity Extrapolation (GIVE), a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input. |
Jiashu He; Mingyu Derek Ma; Jinxuan Fan; Dan Roth; Wei Wang; Alejandro Ribeiro; | code |
| 350 | Regress, Don’t Guess: A Regression-like Loss on Number Tokens for Language Models Highlight: In response, we here present a regression-like loss that operates purely at the token level. |
Jonas Zausinger; Lars Pennig; Anamarija Kozina; Sean Sdahl; Julian Sikora; Adrian Dendorfer; Timofey Kuznetsov; Mohamad Hagog; Nina Wiedemann; Kacper Chlodny; Vincent Limbach; Anna Ketteler; Thorben Prein; Vishwa Mohan Singh; Michael Danziger; Jannis Born; | code |
| 351 | EFDTR: Learnable Elliptical Fourier Descriptor Transformer for Instance Segmentation Highlight: In this paper, we introduce a novel vertex regression loss grounded in Fourier elliptic descriptors, which removes the need for rasterization or heuristic approximations and resolves ambiguities in boundary point assignment through frequency-domain matching. |
Jiawei Cao; Chaochen Gu; Hao Cheng; Xiaofeng Zhang; Kaijie Wu; Changsheng Lu; | code |
| 352 | Test-Time Canonicalization By Foundation Models for Robust Perception Highlight: We propose FoCal, a test-time, data-driven framework that achieves robust perception by leveraging internet-scale visual priors from foundation models. |
Utkarsh Singhal; Ryan Feng; Stella X. Yu; Atul Prakash; | code |
| 353 | Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation Highlight: This paper proposes a novel framework for estimating counterfactual outcomes with spatial-temporal attributes using the Transformer, exhibiting stronger estimation ability. |
He Li; Haoang Chi; Mingyu Liu; Wanrong Huang; Liyang Xu; Wenjing Yang; | code |
| 354 | Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning Highlight: These scenarios face significant challenges due to high variance and poor performance with low-quality propensity scores and heavy-tailed reward distributions. We address these issues by introducing a novel estimator based on the log-sum-exponential (LSE) operator, which outperforms traditional inverse propensity score estimators. |
Armin Behnamnia; Gholamali Aminian; Alireza Aghaei; Chengchun Shi; Vincent Y. F. Tan; Hamid R. Rabiee; | code |
| 355 | Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Highlight: To this end, we propose *federated full-parameter tuning at scale for LLMs* (Ferret), **the first first-order method with shared randomness** to enable scalable full-parameter tuning of LLMs across decentralized data sources while maintaining competitive model accuracy. |
Yao Shu; Wenyang Hu; See-Kiong Ng; Bryan Kian Hsiang Low; Fei Yu; | code |
| 356 | De-AntiFake: Rethinking The Protective Perturbations Against Voice Cloning Attacks Highlight: From this perspective, we propose a novel two-stage purification method: (1) Purify the perturbed speech; (2) Refine it using phoneme guidance to align it with the clean speech distribution. |
Wei Fan; Kejiang Chen; Chang Liu; Weiming Zhang; Nenghai Yu; | code |
| 357 | Learning from Sample Stability for Deep Clustering Highlight: Unstable representations across epochs often lead to mispredictions, indicating difficulty of memorization and atypicality. Leveraging these findings, we introduce, for the first time, supervision signals based on sample stability at the representation level. |
Zhixin Li; Yuheng Jia; Hui LIU; Junhui Hou; | code |
| 358 | LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models Highlight: In this paper, we propose a new KV cache optimization paradigm called LaCache, a training-free method for efficient and accurate generative inference of LLMs. |
Dachuan Shi; Yonggan Fu; Xiangchi Yuan; Zhongzhi Yu; Haoran You; Sixu Li; Xin Dong; Jan Kautz; Pavlo Molchanov; Yingyan Celine Lin; | code |
| 359 | Controlling Large Language Model with Latent Action Highlight: In this work, we apply **CoLA** to the Llama-3.1-8B model. |
Chengxing Jia; Ziniu Li; Pengyuan Wang; Yi-Chen Li; Zhenyu Hou; Yuxiao Dong; Yang Yu; | code |
| 360 | Demystifying Singular Defects in Large Language Models Highlight: In this paper, we provide both theoretical insights and empirical validation across a range of recent models, leading to the following observations: i) The layer-wise singular direction predicts the abrupt explosion of token norms in LLMs. |
Haoqi Wang; Tong Zhang; Mathieu Salzmann; | code |
| 361 | VCT: Training Consistency Models with Variational Noise Coupling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Variational Consistency Training (VCT), a flexible and effective framework compatible with various forward kernels, including those in flow matching. |
Gianluigi Silvestri; Luca Ambrogioni; Chieh-Hsin Lai; Yuhta Takida; Yuki Mitsufuji; | code |
| 362 | CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Furthermore, they rely solely on global connectivity, which fails to focus on key nodes and leads to inefficient representation learning. This paper introduces a \underline{C}onstraint-\underline{A}ware \underline{D}ual-\underline{A}ttention Model (CaDA), designed to address these limitations. |
Han Li; Fei Liu; Zhi Zheng; Yu Zhang; Zhenkun Wang; | code |
| 363 | Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, our study reveals that the gap in feature distribution between novel and existing tasks is primarily driven by differences in mean and covariance moments. Building on this insight, we propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration. |
Fangwen Wu; Lechao Cheng; Shengeng Tang; Xiaofeng Zhu; Chaowei Fang; Dingwen Zhang; Meng Wang; | code |
| 364 | PTTA: Purifying Malicious Samples for Test-Time Model Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although malicious samples that would undermine the model’s optimization should be filtered out, it also leads to a waste of test data. To alleviate this issue, we focus on how to make full use of the malicious test samples for TTA by transforming them into benign ones, and propose a plug-and-play method, PTTA. |
Jing Ma; Hanlin Li; Xiang Xiang; | code |
| 365 | Differential Coding for Training-Free ANN-to-SNN Conversion Highlight: However, many conversion methods are based on rate coding, which requires numerous spikes and longer time-steps compared to directly trained SNNs, leading to increased energy consumption and latency. This article introduces differential coding for ANN-to-SNN conversion, a novel coding scheme that reduces spike counts and energy consumption by transmitting changes in rate information rather than rates directly, and explores its application across various layers. |
Zihan Huang; Wei Fang; Tong Bu; Peng Xue; Zecheng Hao; Wenxuan Liu; Yuanhong Tang; Zhaofei Yu; Tiejun Huang; | code |
| 366 | Sable: A Performant, Efficient and Scalable Sequence Model for MARL Highlight: In this work, we introduce Sable, a performant, memory-efficient, and scalable sequence modelling approach to MARL. |
Omayma Mahjoub; Sasha Abramowitz; Ruan John de Kock; Wiem Khlifi; Simon Verster Du Toit; Jemma Daniel; Louay Ben Nessir; Louise Beyers; Juan Claude Formanek; Liam Clark; Arnu Pretorius; | code |
| 367 | Handling Imbalanced Pseudolabels for Vision-Language Models with Concept Alignment and Confusion-Aware Calibrated Margin Highlight: To fill this gap, we delve into imbalanced pseudolabels and identify two primary contributing factors: concept mismatch and concept confusion. To mitigate these two issues, we propose a novel framework incorporating concept alignment and confusion-aware calibrated margin mechanisms. |
Yuchen Wang; Xuefeng Bai; Xiucheng Li; Weili Guan; Liqiang Nie; Xinyang Chen; | code |
| 368 | BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution Highlight: Nonetheless, it remains impossible to deploy DM to resource-limited edge devices. To address this problem, we propose BiMaCoSR, which combines binarization and one-step distillation to obtain extreme compression and acceleration. |
Kai Liu; Kaicheng Yang; Zheng Chen; Zhiteng Li; Yong Guo; Wenbo Li; Linghe Kong; Yulun Zhang; | code |
| 369 | Speculate, Then Collaborate: Fusing Knowledge of Language Models During Decoding Highlight: Thus, enabling LLMs to solve problems collaboratively by integrating their complementary knowledge promises to improve their performance across domains. To realize this potential, we introduce a novel Collaborative Speculative Decoding (CoSD) algorithm that enables efficient LLM knowledge fusion at test time without requiring additional model training. |
Ziyao Wang; Muneeza Azmat; Ang Li; Raya Horesh; Mikhail Yurochkin; | code |
| 370 | Neural Interpretable PDEs: Harmonizing Fourier Insights with Attention for Scalable and Interpretable Physics Discovery Highlight: In this work, we introduce Neural Interpretable PDEs (NIPS), a novel neural operator architecture that builds upon and enhances Nonlocal Attention Operators (NAO) in both predictive accuracy and computational efficiency. |
Ning Liu; Yue Yu; | code |
| 371 | SkipGPT: Each Token Is One of A Kind Highlight: We introduce **SkipGPT**, a dynamic layer pruning framework designed to optimize computational resource allocation through two core innovations: (1) global token-aware routing to prioritize critical tokens and (2) decoupled pruning policies for MLP and self-attention components. |
Anhao Zhao; Fanghua Ye; Yingqi Fan; Junlong Tong; Jing Xiong; Zhiwei Fei; Hui Su; Xiaoyu Shen; | code |
| 372 | Human Body Restoration with One-Step Diffusion Model and A New Benchmark Highlight: In this study, we propose a high-quality dataset automated cropping and filtering (HQ-ACF) pipeline. |
Jue Gong; Jingkai Wang; Zheng Chen; Xing Liu; Hong Gu; Yulun Zhang; Xiaokang Yang; | code |
| 373 | NeuroTree: Hierarchical Functional Brain Pathway Decoding for Mental Health Disorders Highlight: Although existing fMRI-based graph neural networks (GNNs) have demonstrated significant potential in brain network feature extraction, they often fail to characterize complex relationships between brain regions and demographic information in mental disorders. To overcome these limitations, we propose a learnable NeuroTree framework that integrates a $k$-hop AGE-GCN with neural ordinary differential equations (ODEs) and contrastive masked functional connectivity (CMFC) to enhance similarities and dissimilarities of brain region distance. |
Jun-En Ding; Dongsheng Luo; Chenwei Wu; Feng Liu; | code |
| 374 | Understanding and Mitigating Memorization in Diffusion Models for Tabular Data Highlight: Additionally, we provide a theoretical explanation for why memorization occurs in tabular diffusion models. To address this issue, we propose TabCutMix, a simple yet effective data augmentation technique that exchanges randomly selected feature segments between random same-class training sample pairs. |
Zhengyu Fang; Zhimeng Jiang; Huiyuan Chen; Xiao Li; Jing Li; | code |
| 375 | Semantic Shift Estimation Via Dual-Projection and Classifier Reconstruction for Exemplar-Free Class-Incremental Learning Highlight: Specifically, the embeddings of old tasks shift in the embedding space after learning new tasks, and the classifier becomes biased towards new tasks due to training solely with new data, hindering the balance between old and new knowledge. To address these issues, we propose the Dual-Projection Shift Estimation and Classifier Reconstruction (DPCR) approach for EFCIL. |
Run He; Di Fang; Yicheng Xu; Yawen Cui; Ming Li; Cen Chen; Ziqian Zeng; Huiping Zhuang; | code |
| 376 | MAPLE: Many-Shot Adaptive Pseudo-Labeling for In-Context Learning Highlight: However, this approach is often hindered by the high cost of obtaining large amounts of labeled data. To address this challenge, we propose **M**any-Shot **A**daptive **P**seudo-**L**ab**E**ling, namely **MAPLE**, a novel influence-based many-shot ICL framework that utilizes pseudo-labeled samples to compensate for the lack of label information. |
Zihan Chen; Song Wang; Zhen Tan; Jundong Li; Cong Shen; | code |
| 377 | Does One-shot Give The Best Shot? Mitigating Model Inconsistency in One-shot Federated Learning Highlight: This work presents a novel OFL framework FAFI that enhances the one-shot training on the client side to essentially overcome inferior local uploading. |
Hui Zeng; Wenke Huang; Tongqing Zhou; Xinyi Wu; Guancheng Wan; Yingwen Chen; Zhiping Cai; | code |
| 378 | Generalized Category Discovery Via Reciprocal Learning and Class-Wise Distribution Regularization Highlight: However, recent parametric-based methods suffer from inferior base discrimination due to unreliable self-supervision. To address this issue, we propose a Reciprocal Learning Framework (RLF) that introduces an auxiliary branch devoted to base classification. |
Duo Liu; Zhiquan Tan; Linglan Zhao; Zhongqiang Zhang; Xiangzhong Fang; Weiran Huang; | code |
| 379 | Beyond One-Hot Labels: Semantic Mixing for Model Calibration Highlight: In this paper, we introduce calibration-aware data augmentation to create synthetic datasets of diverse samples and their ground-truth uncertainty. |
Haoyang Luo; Linwei Tao; Minjing Dong; Chang Xu; | code |
| 380 | Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning Highlight: Here, we introduce a novel PFL framework, called PHN-HVVS, which decomposes the design space into Voronoi grids and deploys a genetic algorithm (GA) for Voronoi grid partitioning within high-dimensional space. |
Mengmeng Chen; Xiaohu Wu; QIQI LIU; Tiantian He; Yew-Soon Ong; Yaochu Jin; Qicheng Lao; Han Yu; | code |
| 381 | EgoPrivacy: What Your First-Person Camera Says About You? Highlight: To further emphasize the privacy threats inherent to egocentric vision, we propose Retrieval-Augmented Attack, a novel attack strategy that leverages ego-to-exo retrieval from an external pool of exocentric videos to boost the effectiveness of demographic privacy attacks. |
Yijiang Li; Genpei Zhang; Jiacheng Cheng; Yi Li; Xiaojun Shan; Dashan Gao; Jiancheng Lyu; Yuan Li; Ning Bi; Nuno Vasconcelos; | code |
| 382 | Weakly-Supervised Contrastive Learning for Imprecise Class Labels Highlight: Instead of directly relying on imprecise class labels, we measure the semantic similarity between example pairs, which quantifies how closely they belong to the same category by iteratively refining weak supervisory signals. Based on this concept, we propose a graph-theoretic framework for weakly-supervised contrastive learning, where semantic similarity serves as the graph weights. |
Zi-Hao Zhou; Jun-Jie Wang; Tong Wei; Min-Ling Zhang; | code |
| 383 | TtBA: Two-third Bridge Approach for Decision-Based Adversarial Attack Highlight: In this paper, we propose a novel normal-vector-based method called Two-third Bridge Attack (TtBA). |
Feiyang Wang; Xingquan Zuo; Hai Huang; Gang Chen; | code |
| 384 | CEGA: A Cost-Effective Approach for Graph-Based Model Extraction and Acquisition Highlight: In this work, we evaluate the vulnerability of GNNs to MEAs and explore their potential for cost-effective model acquisition in non-adversarial research settings. |
Zebin Wang; Menghan Lin; Bolin Shen; Ken Anderson; Molei Liu; Tianxi Cai; Yushun Dong; | code |
| 385 | Beyond Zero Initialization: Investigating The Impact of Non-Zero Initialization on LoRA Fine-Tuning Dynamics Highlight: In this paper, we investigate the impact of non-zero initialization on LoRA’s fine-tuning dynamics from an infinite-width perspective. |
Shiwei Li; Xiandi Luo; Xing Tang; Haozhao Wang; Hao Chen; weihongluo; Yuhua Li; xiuqiang He; Ruixuan Li; | code |
| 386 | The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning Highlight: To improve the training efficiency of federated learning (FL), previous research has employed low-rank decomposition techniques to reduce communication overhead. In this paper, we seek to enhance the performance of these low-rank decomposition methods. |
Shiwei Li; Xiandi Luo; Haozhao Wang; Xing Tang; Shijie Xu; weihongluo; Yuhua Li; xiuqiang He; Ruixuan Li; | code |
| 387 | MATS: An Audio Language Model Under Text-only Supervision Highlight: In this paper, we propose **MATS**, an audio-language multimodal LLM designed to handle **M**ultiple **A**udio tasks using solely **T**ext-only **S**upervision. |
Wen Wang; RuiBing Hou; Hong Chang; Shiguang Shan; Xilin Chen; | code |
| 388 | Splitting & Integrating: Out-of-Distribution Detection Via Adversarial Gradient Attribution Highlight: In this paper, we propose a novel OOD detection method called \textbf{S \& I} based on layer \textbf{S}plitting and gradient \textbf{I}ntegration via Adversarial Gradient Attribution. |
Jiayu Zhang; Xinyi Wang; Zhibo Jin; Zhiyu Zhu; Jianlong Zhou; Fang Chen; Huaming Chen; | code |
| 389 | Towards An Explainable Comparison and Alignment of Feature Embeddings Highlight: In this work, we propose the Spectral Pairwise Embedding Comparison (SPEC) framework to compare embeddings and identify their differences in clustering a reference dataset. Furthermore, we introduce an optimization problem using this framework to align two embeddings, ensuring that clusters identified in one embedding are also captured in the other model. |
Mohammad Jalali; Bahar Dibaei Nia; Farzan Farnia; | code |
| 390 | Robot-Gated Interactive Imitation Learning with Adaptive Intervention Mechanism Highlight: We propose the Adaptive Intervention Mechanism (AIM), a novel robot-gated IIL algorithm that learns an adaptive criterion for requesting human demonstrations. |
Haoyuan Cai; Zhenghao Peng; Bolei Zhou; | code |
| 391 | PatchPilot: A Cost-Efficient Software Engineering Agent with Early Attempts on Formal Verification Highlight: In this paper, we propose PatchPilot, an agentic patcher that strikes a balance between patching efficacy, stability, and cost-efficiency. |
Hongwei Li; Yuheng Tang; Shiqi Wang; Wenbo Guo; | code |
| 392 | ESPFormer: Doubly-Stochastic Attention with Expected Sliced Transport Plans Highlight: In this paper, we introduce a novel, fully parallelizable doubly-stochastic attention mechanism based on sliced optimal transport, leveraging Expected Sliced Transport Plans (ESP). |
Ashkan Shahbazi; Elaheh Akbari; Darian Salehi; Xinran Liu; Navid NaderiAlizadeh; Soheil Kolouri; | code |
| 393 | On The Guidance of Flow Matching Highlight: In this paper, we propose the first framework of general guidance for flow matching. |
Ruiqi Feng; Chenglei Yu; Wenhao Deng; Peiyan Hu; Tailin Wu; | code |
| 394 | TimeDART: A Diffusion Autoregressive Transformer for Self-Supervised Time Series Representation Highlight: In this work, we propose \textbf{TimeDART}, a novel self-supervised time series pre-training framework that unifies two powerful generative paradigms to learn more transferable representations. |
Daoyu Wang; Mingyue Cheng; Zhiding Liu; Qi Liu; | code |
| 395 | Protein Structure Tokenization: Benchmarking and New Recipe Highlight: Compared to the leading model ESM3, our method achieves an average of 6.31\% performance improvement across 24 supervised tasks, with sensitivity and utilization rates increased by 12.83\% and 124.03\%, respectively. |
Xinyu Yuan; Zichen Wang; Marcus D. Collins; Huzefa Rangwala; | code |
| 396 | HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration Highlight: Remarkably, our *image-free* approach reduces training time by $25\%$ compared with the previous method. |
Yushi Huang; Zining Wang; Ruihao Gong; Jing Liu; Xinjie Zhang; Jinyang Guo; Xianglong Liu; Jun Zhang; | code |
| 397 | LETS Forecast: Learning Embedology for Time Series Forecasting Highlight: While deep learning has achieved major success in time series forecasting, many existing approaches do not explicitly model the dynamics. To bridge this gap, we introduce DeepEDM, a framework that integrates nonlinear dynamical systems modeling with deep neural networks. |
Abrar Majeedi; Viswanatha Reddy Gajjala; Satya Sai Srinath Namburi GNVV; Nada Magdi Elkordi; Yin Li; | code |
| 398 | When Do LLMs Help With Node Classification? A Comprehensive Analysis Highlight: Although many studies demonstrate the impressive performance of LLM-based methods, the lack of clear design guidelines may hinder their practical application. In this work, we aim to establish such guidelines through a fair and systematic comparison of these algorithms. |
Xixi Wu; Yifei Shen; Fangzhou Ge; Caihua Shan; Yizhu Jiao; Xiangguo Sun; Hong Cheng; | code |
| 399 | AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment Highlight: Many video-to-audio (VTA) methods have been proposed for dubbing silent AI-generated videos. |
Yuqin Cao; Xiongkuo Min; Yixuan Gao; Wei Sun; Guangtao Zhai; | code |
| 400 | Stacey: Promoting Stochastic Steepest Descent Via Accelerated $\ell_p$-Smooth Nonconvex Optimization Highlight: While popular optimization methods such as SGD, AdamW, and Lion depend on steepest descent updates in either $\ell_2$ or $\ell_\infty$ norms, there remains a critical gap in handling the non-Euclidean structure observed in modern deep networks training. In this work, we address this need by introducing a new accelerated $\ell_p$ steepest descent algorithm, called Stacey, which uses interpolated primal-dual iterate sequences to effectively navigate non-Euclidean smooth optimization tasks. |
Xinyu Luo; Site Bai; Bolian Li; Petros Drineas; Ruqi Zhang; Brian Bullins; | code |
| 401 | ConText: Driving In-context Learning for Text Removal and Segmentation Highlight: This paper presents the first study on adapting the visual in-context learning (V-ICL) paradigm to optical character recognition tasks, specifically focusing on text removal and segmentation. |
Fei Zhang; Pei Zhang; Baosong Yang; Fei Huang; Yanfeng Wang; Ya Zhang; | code |
| 402 | LIFT The Veil for The Truth: Principal Weights Emerge After Rank Reduction for Reasoning-Focused Supervised Fine-Tuning Highlight: In this work, we state that weights with the largest magnitude after low-rank approximation are critical weights for fine-tuning, which we call *Principal Weights*. |
Zihang Liu; Tianyu Pang; Oleg Balabanov; Chaoqun Yang; Tianjin Huang; Lu Yin; Yaoqing Yang; Shiwei Liu; | code |
| 403 | Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing Highlight: However, existing NAT approaches often rely on Connectionist Temporal Classification (CTC) loss, which presents significant optimization challenges due to CTC’s complexity and increases the risk of training failures. To address these issues, we propose an improved non-autoregressive peptide sequencing model that incorporates a structured protein sequence curriculum learning strategy. |
Xiang Zhang; Jiaqi Wei; Zijie Qiu; Sheng Xu; Nanqing Dong; ZhiQiang Gao; Siqi Sun; | code |
| 404 | Distillation of Discrete Diffusion Through Dimensional Correlations Highlight: The code used in the paper is available at https://github.com/sony/di4c. In this paper, (i) we propose mixture models for discrete diffusion that are capable of treating dimensional correlations while remaining scalable, and (ii) we provide a set of loss functions for distilling the iterations of existing models. |
Satoshi Hayakawa; Yuhta Takida; Masaaki Imaizumi; Hiromi Wakaki; Yuki Mitsufuji; | code |
| 405 | Open-Det: An Efficient Learning Framework for Open-Ended Detection Highlight: However, the existing OED models, such as GenerateU, require large-scale datasets for training, suffer from slow convergence, and exhibit limited performance. To address these issues, we present a novel and efficient Open-Det framework, consisting of four collaborative parts. |
Guiping Cao; Tao Wang; Wenjian Huang; Xiangyuan Lan; Jianguo Zhang; Dongmei Jiang; | code |
| 406 | FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks Highlight: While recent improvements to UNet have focused on enhancing encoder and decoder capabilities, these limitations remain overlooked. To overcome these challenges, we propose a novel multi-scale feature fusion method that reimagines the UNet decoding process as solving an initial value problem (IVP), treating skip connections as discrete nodes. |
Quansong He; Xiangde Min; Kaishen Wang; Tao He; | code |
| 407 | EmoGrowth: Incremental Multi-label Emotion Decoding with Augmented Emotional Relation Graph Highlight: We propose an Augmented Emotional Semantics Learning (AESL) framework to address two critical challenges: past- and future-missing partial label problems. |
Kaicheng Fu; Changde Du; Jie Peng; Kunpeng Wang; Shuangchen Zhao; Xiaoyu Chen; Huiguang He; | code |
| 408 | Concentration Distribution Learning from Label Distributions Highlight: Specifically, it’s impossible to obtain the total description degree of hidden labels that are not in the label space, which leads to the loss of information and confusion in instances. To solve the above problem, we come up with a new concept named background concentration to serve as the absolute description degree term of the label distribution and introduce it into the LDL process, forming the improved paradigm of concentration distribution learning. |
Jiawei Tang; Yuheng Jia; | code |
| 409 | Distributed Conformal Prediction Via Message Passing Highlight: Conformal Prediction (CP) offers a robust post-hoc calibration framework, providing distribution-free statistical coverage guarantees for prediction sets by leveraging held-out datasets. In this work, we address a decentralized setting where each device has limited calibration data and can communicate only with its neighbors over an arbitrary graph topology. |
Haifeng Wen; Hong Xing; Osvaldo Simeone; | code |
| 410 | Selective Prompt Anchoring for Code Generation Highlight: We hypothesize that this attention dilution issue is an important reason for code generation errors. To mitigate this issue, we propose ***S**elective **P**rompt **A**nchoring* (SPA) to guide code LLMs to pay more attention to user intent when generating code. |
Yuan Tian; Tianyi Zhang; | code |
| 411 | Learning The RoPEs: Better 2D and 3D Position Encodings with STRING Highlight: We introduce $\textbf{STRING}$: Separable Translationally Invariant Position Encodings. |
Connor Schenck; Isaac Reid; Mithun George Jacob; Alex Bewley; Joshua Ainslie; David Rendleman; Deepali Jain; Mohit Sharma; Kumar Avinava Dubey; Ayzaan Wahid; Sumeet Singh; René Wagner; Tianli Ding; Chuyuan Fu; Arunkumar Byravan; Jake Varley; Alexey A. Gritsenko; Matthias Minderer; Dmitry Kalashnikov; Jonathan Tompson; Vikas Sindhwani; Krzysztof Marcin Choromanski; | code |
| 412 | Active Learning for Efficient Discovery of Optimal Combinatorial Perturbations Highlight: We introduce NAIAD, an active learning framework that efficiently discovers optimal gene pairs by leveraging single-gene perturbation effects and adaptive gene embeddings that scale with the training data size, mitigating overfitting in small-sample learning while capturing complex gene interactions as more data is collected. |
Jason Qin; Hans-Hermann Wessels; Carlos Fernandez-Granda; Yuhan Hao; | code |
| 413 | Aligning LLMs By Predicting Preferences from User Writing Samples Highlight: This paper introduces PROSE, a method designed to enhance the precision of preference descriptions inferred from user writing samples. |
Stéphane Aroca-Ouellette; Natalie Mackraz; Barry-John Theobald; Katherine Metcalf; | code |
| 414 | Categorical Schrödinger Bridge Matching Highlight: In this paper, we provide a theoretical and algorithmic foundation for solving SB in discrete spaces using the recently introduced Iterative Markovian Fitting (IMF) procedure. |
Grigoriy Ksenofontov; Alexander Korotin; | code |
| 415 | From Thousands to Billions: 3D Visual Language Grounding Via Render-Supervised Distillation from 2D VLMs Highlight: 3D vision-language grounding faces a fundamental data bottleneck: while 2D models train on billions of images, 3D models have access to only thousands of labeled scenes–a six-order-of-magnitude gap that severely limits performance. We introduce \textbf{\emph{LIFT-GS}}, a practical distillation technique that overcomes this limitation by using differentiable rendering to bridge 3D and 2D supervision. |
Ang Cao; Sergio Arnaud; Oleksandr Maksymets; Jianing Yang; Ayush Jain; Ada Martin; Vincent-Pierre Berges; Paul McVay; Ruslan Partsey; Aravind Rajeswaran; Franziska Meier; Justin Johnson; Jeong Joon Park; Alexander Sax; | code |
| 416 | SADA: Stability-guided Adaptive Diffusion Acceleration Highlight: In this paper, we propose **Stability-guided Adaptive Diffusion Acceleration (SADA)**, a novel paradigm that unifies step-wise and token-wise sparsity decisions via a single stability criterion to accelerate sampling of ODE-based generative models (Diffusion and Flow-matching). |
Ting Jiang; Yixiao Wang; Hancheng Ye; Zishan Shao; Jingwei Sun; Jingyang Zhang; Zekai Chen; Jianyi Zhang; Yiran Chen; Hai Li; | code |
| 417 | Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks Highlight: In this paper, we provide a comprehensive analysis of their staleness and inferior performance on large-scale problems. |
Rui Xue; Tong Zhao; Neil Shah; Xiaorui Liu; | code |
| 418 | MaskTwins: Dual-form Complementary Masking for Domain-Adaptive Image Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we reframe masked reconstruction as a sparse signal reconstruction problem and theoretically prove that the dual form of complementary masks possesses superior capabilities in extracting domain-agnostic image features. |
Jiawen Wang; Yinda Chen; Xiaoyu Liu; Che Liu; Dong Liu; Jianqing Gao; Zhiwei Xiong; | code |
| 419 | Balancing Interference and Correlation in Spatial Experimental Designs: A Causal Graph Cut Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a surrogate function for the mean squared error (MSE) of the estimator, which facilitates the use of classical graph cut algorithms to learn the optimal design. |
Jin Zhu; Jingyi Li; Hongyi Zhou; Yinan Lin; Zhenhua Lin; Chengchun Shi; | code |
| 420 | GuidedQuant: Large Language Model Quantization Via Exploiting End Loss Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods either (1) fail to account for the varying importance of hidden features to the end loss or, when incorporating end loss, (2) neglect the critical interactions between model weights. To address these limitations, we propose GuidedQuant, a novel quantization approach that integrates gradient information from the end loss into the quantization objective while preserving cross-weight dependencies within output channels. |
Jinuk Kim; Marwa El Halabi; Wonpyo Park; Clemens JS Schaefer; Deokjae Lee; Yeonhong Park; Jae W. Lee; Hyun Oh Song; | code |
| 421 | UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing benchmarks often fall short in evaluating LLMs’ abilities on the breadth and depth of undergraduate-level physics, underscoring the need for a comprehensive evaluation. To fill this gap, we introduce UGPhysics, a large-scale and diverse benchmark specifically designed to evaluate **U**nder**G**raduate-level **Physics** (**UGPhysics**) reasoning with LLMs. |
Xin Xu; Qiyun Xu; Tong Xiao; Tianhao Chen; Yuchen Yan; Jiaxin ZHANG; Shizhe Diao; Can Yang; Yang Wang; | code |
| 422 | MoE-SVD: Structured Mixture-of-Experts LLMs Compression Via Singular Value Decomposition Highlight: In this paper, we present MoE-SVD, a new decomposition-based compression framework tailored for MoE LLMs without any extra training. |
Wei Li; Lujun Li; Hao Gu; You-Liang Huang; Mark G. Lee; Shengjie Sun; Wei Xue; Yike Guo; | code |
| 423 | From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning Via Bayesian Nash Equilibrium Highlight: We introduce Efficient Coordination via Nash Equilibrium (ECON), a hierarchical reinforcement-learning paradigm that marries distributed reasoning with centralized final output. |
Xie Yi; Zhanke Zhou; Chentao Cao; Qiyu Niu; Tongliang Liu; Bo Han; | code |
| 424 | KIND: Knowledge Integration and Diversion for Training Decomposable Models Highlight: However, traditional pre-trained models often face deployment challenges due to their fixed sizes, and are prone to negative transfer when discrepancies arise between training tasks and target tasks. To address this, we propose **KIND**, a novel pre-training method designed to construct decomposable models. |
Yucheng Xie; Fu Feng; Ruixiao Shi; Jing Wang; Yong Rui; Xin Geng; | code |
| 425 | CostFilter-AD: Enhancing Anomaly Detection Through Matching Cost Filtering Highlight: Often, such a matching process is inaccurate yet overlooked, leading to sub-optimal detection. To address this issue, we introduce the concept of cost filtering, borrowed from classical matching tasks, such as depth and flow estimation, into the UAD problem. |
Zhe Zhang; Mingxiu Cai; Hanxiao Wang; Gaochang Wu; Tianyou Chai; Xiatian Zhu; | code |
| 426 | AnalogGenie-Lite: Enhancing Scalability and Precision in Circuit Topology Discovery Through Lightweight Graph Modeling Highlight: This work proposes AnalogGenie-Lite, a decoder-only transformer that discovers novel analog IC topologies with significantly enhanced scalability and precision via lightweight graph modeling. |
Jian Gao; Weidong Cao; Xuan Zhang; | code |
| 427 | Balancing Model Efficiency and Performance: Adaptive Pruner for Long-tailed Data Highlight: This paper proposes a novel adaptive pruning strategy, LTAP (Long-Tailed Adaptive Pruner), aimed at balancing model efficiency and performance to better address the challenges posed by long-tailed data distributions. |
Zhe Zhao; HaiBin Wen; Pengkun Wang; Shuang Wang; Zhenkun Wang; Qingfu Zhang; Yang Wang; | code |
| 428 | TIMING: Temporality-Aware Integrated Gradients for Time Series Explanation Highlight: However, current evaluation metrics fail to assess this capability, as they inadvertently cancel out opposing feature contributions. To address this limitation, we propose novel evaluation metrics—Cumulative Prediction Difference (CPD) and Cumulative Prediction Preservation (CPP)—to systematically assess whether attribution methods accurately identify significant positive and negative points in time series XAI. |
Hyeongwon Jang; Changhun Kim; Eunho Yang; | code |
| 429 | Continuous Visual Autoregressive Generation Via Score Maximization Highlight: When applied to continuous modalities such as visual data, Visual AutoRegressive modeling (VAR) typically resorts to quantization-based approaches to cast the data into a discrete space, which can introduce significant information loss. To tackle this issue, we introduce a Continuous VAR framework that enables direct visual autoregressive generation without vector quantization. |
Chenze Shao; Fandong Meng; Jie Zhou; | code |
| 430 | How Effective Can Dropout Be in Multiple Instance Learning? Highlight: In this paper, we empirically explore how effective dropout can be in MIL. |
Wenhui Zhu; Peijie Qiu; Xiwen Chen; Zhangsihao Yang; Aristeidis Sotiras; Abolfazl Razi; Yalin Wang; | code |
| 431 | HyperNear: Unnoticeable Node Injection Attacks on Hypergraph Neural Networks Highlight: Through empirical analysis, we develop a relatively unnoticeable attack approach by monitoring changes in homophily and leveraging this self-regulating property to enhance stealth. Building on these insights, we introduce HyperNear, i.e., **N**ode inj**E**ction **A**ttacks on hype**R**graph neural networks, the first node injection attack framework specifically tailored for HNNs. |
Tingyi Cai; Yunliang Jiang; Ming Li; Lu Bai; Changqin Huang; Yi Wang; | code |
| 432 | Subspace Optimization for Large Language Models with Convergence Guarantees Highlight: However, their convergence guarantees remain unclear, particularly in stochastic settings. In this paper, we reveal that GaLore does not always converge to the optimal solution and provide an explicit counterexample to support this finding. |
Yutong He; Pengrui Li; Yipeng Hu; Chuyan Chen; Kun Yuan; | code |
| 433 | Physics-informed Temporal Alignment for Auto-regressive PDE Foundation Models Highlight: The challenge becomes particularly evident for out-of-distribution data, as the pretraining performance may approach random model initialization for downstream tasks with long-term dynamics. To deal with this problem, we propose physics-informed temporal alignment (PITA), a self-supervised learning framework inspired by inverse problem solving. |
Congcong Zhu; Xiaoyan Xu; Jiayue Han; Jingrun Chen; | code |
| 434 | Fast Large Language Model Collaborative Decoding Via Speculation Highlight: In this paper, we introduce **Collaborative decoding via Speculation (CoS)**, a novel framework that accelerates collaborative decoding without compromising performance. |
Jiale Fu; Yuchu Jiang; Junkai Chen; Jiaming Fan; Xin Geng; Xu Yang; | code |
| 435 | Training Diffusion-based Generative Models with Limited Data Highlight: In this paper, we present a novel theoretical insight for diffusion models that two factors, i.e., the denoiser function hypothesis space and the number of training samples, can affect the denoising score matching error of all training samples. |
Zhaoyu Zhang; Yang Hua; Guanxiong Sun; Hui Wang; Seán McLoone; | code |
| 436 | The Four Color Theorem for Cell Instance Segmentation Highlight: In this paper, we propose a novel cell instance segmentation method inspired by the four-color theorem. |
Ye Zhang; Yu Zhou; Yifeng Wang; Jun Xiao; Ziyue Wang; Yongbing Zhang; Jianxu Chen; | code |
| 437 | OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling Highlight: This data scarcity also contributes to the generalization difficulties experienced by learning-based methods. To address these challenges, we propose a scalable framework for synthesizing a high-quality dataset, named OptMATH. |
Hongliang Lu; Zhonglin Xie; Yaoyu Wu; Can Ren; Yuxuan Chen; Zaiwen Wen; | code |
| 438 | You Always Recognize Me (YARM): Robust Texture Synthesis Against Multi-View Corruption Highlight: Inspired by the use of warning colors and camouflage in the real world, we propose designing a robust appearance that can enhance model recognition of low-quality image data. |
Weihang Ran; Wei Yuan; Yinqiang Zheng; | code |
| 439 | SpikF: Spiking Fourier Network for Efficient Long-term Prediction Highlight: However, their application in long-term prediction tasks remains underexplored, which is primarily due to two critical challenges: (1) current SNN encoding methods are unable to effectively encode long temporal information, leading to increased computational complexity and energy consumption; (2) though Transformer-based models have achieved state-of-the-art accuracy in temporal prediction tasks, the absence of proper positional encoding for spiking self-attention restricts Spiking Transformer from effectively utilizing positional information, resulting in performance degradation. To address these challenges, we introduce an attention-free framework, **Spik**ing **F**ourier Network (**SpikF**), that encodes input sequences in patches and employs an innovative frequency domain selection mechanism to effectively utilize the sequential properties of time-series data. |
Wenjie Wu; Dexuan Huo; Hong Chen; | code |
| 440 | Stray Intrusive Outliers-Based Feature Selection on Intra-Class Asymmetric Instance Distribution or Multiple High-Density Clusters Highlight: In this paper, we propose a supervised FS method, Stray Intrusive Outliers-based FS (SIOFS), for data classification with intra-class ADMHC. |
Lixin Yuan; Yirui Wu; Wenxiao Zhang; Minglei Yuan; Jun Liu; | code |
| 441 | Foundation Molecular Grammar: Multi-Modal Foundation Models Induce Interpretable Molecular Graph Languages Highlight: We propose Foundation Molecular Grammar (FMG), which leverages multi-modal foundation models (MMFMs) to induce an interpretable molecular language. |
Michael Sun; Weize Yuan; Gang Liu; Wojciech Matusik; Jie Chen; | code |
| 442 | LensLLM: Unveiling Fine-Tuning Dynamics for LLM Selection Highlight: In this work, we propose a novel theoretical framework that provides a proper lens to assess the generalization capabilities of LLMs, thereby enabling accurate and efficient LLM selection for downstream applications. |
Xinyue Zeng; Haohui Wang; Junhong Lin; Jun Wu; Tyler Cody; Dawei Zhou; | code |
| 443 | AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting Highlight: However, several practical challenges persist, including managing intricate dependencies among features and quantifying uncertainty in predictions. This study aims to tackle these critical limitations by introducing **adapters**—feature-space transformations that facilitate the effective use of pre-trained univariate time series FMs for multivariate tasks. |
Abdelhakim Benechehab; Vasilii Feofanov; Giuseppe Paolo; Albert Thomas; Maurizio Filippone; Balázs Kégl; | code |
| 444 | Positional Encoding Meets Persistent Homology on Graphs Highlight: Our insights inform the design of a novel learnable method, PiPE (Persistence-informed Positional Encoding), which is provably more expressive than both PH and PE. |
Yogesh Verma; Amauri H Souza; Vikas K Garg; | code |
| 445 | Gamma Distribution PCA-Enhanced Feature Learning for Angle-Robust SAR Target Recognition Highlight: We validate the $\Gamma$PCA model based on two commonly used backbones, ResNet and ViT, and conduct multiple robustness experiments on the MSTAR benchmark dataset. |
Chong Zhang; Peng Zhang; Mengke Li; | code |
| 446 | Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models Highlight: In this paper, we present comprehensive safety evaluations across various mainstream quantization techniques and diverse calibration datasets, utilizing widely accepted safety benchmarks. |
Kejia Chen; Jiawen Zhang; Jiacong Hu; Yu Wang; Jian Lou; Zunlei Feng; Mingli Song; | code |
| 447 | Permutation-based Rank Test in The Presence of Discretization and Application in Causal Discovery with Mixed Data Highlight: For example, in psychometric studies, the continuous level of certain personality dimensions of a person can only be measured after being discretized into order-preserving options such as disagree, neutral, and agree. Motivated by this, we propose Mixed data Permutation-based Rank Test (MPRT), which properly controls the statistical errors even when some or all variables are discretized. |
Xinshuai Dong; Ignavier Ng; Boyang Sun; Haoyue Dai; Guang-Yuan Hao; Shunxing Fan; Peter Spirtes; Yumou Qiu; Kun Zhang; | code |
| 448 | Instance Correlation Graph-based Naive Bayes Highlight: At the same time, none of them takes into account the correlations among instances. To fill this gap, we propose a novel algorithm called instance correlation graph-based naive Bayes (ICGNB). |
Chengyuan Li; Liangxiao Jiang; Wenjun Zhang; Liangjun Yu; Huan Zhang; | code |
| 449 | Uncertainty-Based Extensible Codebook for Discrete Federated Learning in Heterogeneous Data Silos Highlight: Consequently, we propose an innovative yet straightforward iterative framework, termed *Uncertainty-Based Extensible-Codebook Federated Learning (UEFL)*. |
Tianyi Zhang; Yu Cao; Dianbo Liu; | code |
| 450 | CoastalBench: A Decade-Long High-Resolution Dataset to Emulate Complex Coastal Processes Highlight: Existing studies often focus on relatively small datasets and simple processes. To fill this gap, we introduce a decade-long, high-resolution (<100m) coastal circulation modeling dataset on a real-world 3D mesh in southwest Florida with around 6 million cells. |
Zelin Xu; Yupu Zhang; Tingsong Xiao; Maitane Olabarrieta Lizaso; Jose M. Gonzalez-Ondina; Zibo Liu; Shigang Chen; Zhe Jiang; | code |
| 451 | Maximum Entropy Reinforcement Learning with Diffusion Policy Highlight: In this paper, we employ the diffusion model, a powerful generative model capable of capturing complex multimodal distributions, as the policy representation to fulfill the MaxEnt RL objective, developing a method named MaxEnt RL with Diffusion Policy (MaxEntDP). |
Xiaoyi Dong; Jian Cheng; Xi Sheryl Zhang; | code |
| 452 | DLP: Dynamic Layerwise Pruning in Large Language Models Highlight: However, these approaches often rely on pre-defined values, which can result in suboptimal performance. To overcome these limitations, we propose a novel method called Dynamic Layerwise Pruning (DLP). |
Yuli Chen; Bo Cheng; Jiale Han; Yingying Zhang; Yingting Li; Shuhao Zhang; | code |
| 453 | GraphGPT: Generative Pre-trained Graph Eulerian Transformer Highlight: We introduce *GraphGPT*, a novel self-supervised *generative pre-trained* model for graph learning based on the *Graph Eulerian Transformer* (**GET**). |
Qifang Zhao; Weidong Ren; Tianyu Li; Hong Liu; Xingsheng He; Xiaoxiao Xu; | code |
| 454 | Robust Spatio-Temporal Centralized Interaction for OOD Learning Highlight: However, most models relying on node-to-node messaging interaction exhibit sensitivity to spatiotemporal shifts, encountering out-of-distribution (OOD) challenges. To address these issues, we introduce the **S**patio-**T**emporal **O**OD **P**rocessor (STOP), which employs a centralized messaging mechanism along with a message perturbation mechanism to facilitate robust spatiotemporal interactions. |
Jiaming Ma; Binwu Wang; Pengkun Wang; Zhengyang Zhou; Xu Wang; Yang Wang; | code |
| 455 | Physics-Informed Weakly Supervised Learning For Interatomic Potentials Highlight: However, machine-learned interatomic potentials (MLIPs) often struggle with generalization and robustness, leading to unphysical energy and force predictions in atomistic simulations. To address this, we propose a physics-informed, weakly supervised training framework for MLIPs. |
Makoto Takamoto; Viktor Zaverkin; Mathias Niepert; | code |
| 456 | FeatSharp: Your Vision Model Features, Sharper Highlight: We introduce a novel method to coherently and cheaply upsample the feature maps of low-resolution vision encoders while picking up on fine-grained details that would otherwise be lost due to resolution. |
Mike Ranzinger; Greg Heinrich; Pavlo Molchanov; Bryan Catanzaro; Andrew Tao; | code |
| 457 | Socialized Coevolution: Advancing A Better World Through Cross-Task Collaboration Highlight: Motivated by Social Learning (SL), this paper introduces a practical paradigm of Socialized Coevolution (SC). |
Xinjie Yao; Yu Wang; Pengfei Zhu; Wanyu Lin; Ruipu Zhao; Zhoupeng Guo; Weihao Li; Qinghua Hu; | code |
| 458 | SynEVO: A Neuro-inspired Spatiotemporal Evolutional Framework for Cross-domain Adaptation Highlight: In this paper, inspired by neuroscience theories, we theoretically derive the increased information boundary via learning cross-domain collective intelligence and propose a Synaptic EVOlutional spatiotemporal network, SynEVO, which breaks the model independence and enables cross-domain knowledge to be shared and aggregated. |
Jiayue Liu; Zhongchao Yi; Zhengyang Zhou; Qihe Huang; Kuo Yang; Xu Wang; Yang Wang; | code |
| 459 | Reflection-Bench: Evaluating Epistemic Agency in Large Language Models Highlight: Correspondingly, we propose Reflection-Bench, a cognitive-psychology-inspired benchmark consisting of seven tasks with long-term relevance and minimization of data leakage. |
Lingyu Li; Yixu Wang; Haiquan Zhao; Shuqi Kong; Yan Teng; Chunbo Li; Yingchun Wang; | code |
| 460 | QT-DoG: Quantization-Aware Training for Domain Generalization Highlight: In this work, we propose Quantization-aware Training for Domain Generalization (QT-DoG) and demonstrate that weight quantization effectively leads to flatter minima in the loss landscape, thereby enhancing domain generalization. |
Saqib Javed; Hieu Le; Mathieu Salzmann; | code |
| 461 | Identifying Neural Dynamics Using Interventional State Space Models Highlight: Here, we propose interventional state-space models (iSSM), a class of causal models that can predict neural responses to novel perturbations. |
Amin Nejatbakhsh; Yixin Wang; | code |
| 462 | HyperIMTS: Hypergraph Neural Network for Irregular Multivariate Time Series Forecasting Highlight: To represent and learn both dependencies from original observations in a unified form, we propose HyperIMTS, a **Hyper**graph neural network for **I**rregular **M**ultivariate **T**ime **S**eries forecasting. |
Boyuan Li; Yicheng Luo; Zhen Liu; Junhao Zheng; Jianming Lv; Qianli Ma; | code |
| 463 | L-Diffusion: Laplace Diffusion for Efficient Pathology Image Segmentation Highlight: In this work, we introduce the Laplace Diffusion Model, referred to as L-Diffusion, an innovative framework tailored for efficient pathology image segmentation. |
Weihan Li; Linyun Zhou; Yang Jian; Shengxuming Zhang; Xiangtong Du; Xiuming Zhang; Jing Zhang; Chaoqing Xu; Mingli Song; Zunlei Feng; | code |
| 464 | Taming Diffusion for Dataset Distillation with High Representativeness Highlight: In this paper, we systematically investigate issues present in current diffusion-based dataset distillation methods, including inaccurate distribution matching, distribution deviation with random noise, and separate sampling. Building on this, we propose D$^3$HR, a novel diffusion-based framework to generate distilled datasets with high representativeness. |
Lin Zhao; Yushu Wu; Xinru Jiang; Jianyang Gu; Yanzhi Wang; Xiaolin Xu; Pu Zhao; Xue Lin; | code |
| 465 | Are LLMs Prescient? A Continuous Evaluation Using Daily News As The Oracle Highlight: These benchmarks also fall short in assessing how LLM performance changes over time, as they consist of a static set of questions without a temporal dimension. To address these limitations, we propose using future event prediction as a continuous evaluation method to assess LLMs’ temporal generalization and forecasting abilities. |
Hui Dai; Ryan Teehan; Mengye Ren; | code |
| 466 | DiLQR: Differentiable Iterative Linear Quadratic Regulator Via Implicit Differentiation Highlight: This paper introduces DiLQR, a framework that facilitates differentiation through iLQR, allowing it to serve as a trainable and differentiable module, either as or within a neural network. |
Shuyuan Wang; Philip D Loewen; Michael Forbes; Bhushan Gopaluni; Wei Pan; | code |
| 467 | Rethinking Chain-of-Thought from The Perspective of Self-Training Highlight: Interestingly, we observe that both CoT reasoning and self-training share the core objective: iteratively leveraging model-generated information to progressively reduce prediction uncertainty. Building on this insight, we propose a novel CoT framework to improve reasoning performance. |
Zongqian Wu; Baoduo Xu; Ruochen Cui; Mengmeng Zhan; Xiaofeng Zhu; Lei Feng; | code |
| 468 | Efficient Logit-based Knowledge Distillation of Deep Spiking Neural Networks for Full-Range Timestep Deployment Highlight: Despite this, SNNs often suffer from accuracy degradation compared to ANNs and face deployment challenges due to fixed inference timesteps, which require retraining for adjustments, limiting operational flexibility. To address these issues, our work considers the spatio-temporal property inherent in SNNs, and proposes a novel distillation framework for deep SNNs that optimizes performance across full-range timesteps without specific retraining, enhancing both efficacy and deployment adaptability. |
Chengting Yu; Xiaochen Zhao; Lei Liu; Shu Yang; Gaoang Wang; Erping Li; Aili Wang; | code |
| 469 | One Arrow, Two Hawks: Sharpness-aware Minimization for Federated Learning Via Global Model Trajectory Highlight: However, most SAM-based methods do not directly consider the global objective and require two backward passes per iteration, resulting in diminished effectiveness. To overcome these two bottlenecks, we leverage the global model trajectory to directly measure sharpness for the global objective, requiring only a single backward pass. |
Yuhang Li; Tong Liu; Yangguang Cui; Ming Hu; Xiaoqiang Li; | code |
| 470 | UltraTWD: Optimizing Ultrametric Trees for Tree-Wasserstein Distance Highlight: To address it, we introduce UltraTWD, a novel unsupervised framework that simultaneously optimizes both ultrametric tree structures and edge weights to more faithfully approximate the cost matrix. |
Fangchen Yu; Yanzhen Chen; Jiaxing Wei; Jianfeng Mao; Wenye Li; Qiang Sun; | code |
| 471 | LLM-Assisted Semantically Diverse Teammate Generation for Efficient Multi-agent Coordination Highlight: However, such teammates lack semantic information, resulting in inefficient teammate generation and poor adaptability of the agents. To tackle these challenges, we propose Semantically Diverse Teammate Generation (SemDiv), a novel framework leveraging the capabilities of large language models (LLMs) to discover and learn diverse coordination behaviors at the semantic level. |
Lihe Li; Lei Yuan; Pengsen Liu; Tao Jiang; Yang Yu; | code |
| 472 | IMTS Is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction Highlight: While pre-trained foundation models show potential for addressing these challenges, they are typically designed for Regularly Sampled Time Series (RTS). Motivated by the visual Masked AutoEncoder’s (MAE) powerful capability for modeling sparse multi-channel information and its success in RTS forecasting, we propose **VIMTS**, a framework adapting **V**isual MAE for **IMTS** forecasting. |
Zhangyi Hu; Jiemin Wu; Hua Xu; Mingqian Liao; Ninghui Feng; Bo Gao; Songning Lai; Yutao Yue; | code |
| 473 | High Dynamic Range Novel View Synthesis with Single Exposure Highlight: While effective, this multiple-exposure HDR-NVS approach has significant limitations, including susceptibility to motion artifacts (e.g., ghosting and blurring) and high capture and storage costs. To overcome these challenges, we introduce, for the first time, the single-exposure HDR-NVS problem, where only single-exposure LDR images are available during training. |
Kaixuan Zhang; Hu Wang; Minxian Li; Mingwu Ren; Mao Ye; Xiatian Zhu; | code |
| 474 | Sample Efficient Demonstration Selection for In-Context Learning Highlight: In this paper, we formulate the exemplar selection task as a top-m best arms identification problem. We release our code and data (https://github.com/kiranpurohit/CASE). |
Kiran Purohit; Venktesh V; Sourangshu Bhattacharya; Avishek Anand; | code |
| 475 | Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech Highlight: In this paper, we address the new challenge of speaker identity unlearning for ZS-TTS systems. |
Taesoo Kim; Jinju Kim; Dong Chan Kim; Jong Hwan Ko; Gyeong-Moon Park; | code |
| 476 | BSO: Binary Spiking Online Optimization Algorithm Highlight: However, their training algorithms often require substantial memory overhead due to latent weights storage and temporal processing requirements. To address this issue, we propose the Binary Spiking Online (BSO) optimization algorithm, a novel online training algorithm that significantly reduces training memory. |
Yu Liang; Yu Yang; Wenjie Wei; Ammar Belatreche; Shuai Wang; Malu Zhang; Yang Yang; | code |
| 477 | BECAME: Bayesian Continual Learning with Adaptive Model Merging Highlight: To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. |
Mei Li; Yuxiang Lu; Qinyan Dai; Suizhi Huang; Yue Ding; Hongtao Lu; | code |
| 478 | PolyConf: Unlocking Polymer Conformation Generation Through Hierarchical Generative Models Highlight: In this work, we propose PolyConf, a pioneering tailored polymer conformation generation method that leverages hierarchical generative models to unlock new possibilities. Moreover, we develop the first benchmark with a high-quality polymer conformation dataset derived from molecular dynamics simulations to boost related research in this area. |
Fanmeng Wang; Wentao Guo; Qi Ou; Hongshuai Wang; Haitao Lin; Hongteng Xu; Zhifeng Gao; | code |
| 479 | LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently Highlight: Building on our theory, we propose a theory-driven algorithm, LoRA-One, where linear convergence (as well as generalization) is established, and incorporating preconditioners theoretically helps mitigate the effects of ill-conditioning. |
Yuanhe Zhang; Fanghui Liu; Yudong Chen; | code |
| 480 | ELoRA: Low-Rank Adaptation for Equivariant GNNs Highlight: In this paper, we introduce ELoRA (Equivariant Low-Rank Adaptation), a novel fine-tuning method designed specifically for SO(3) equivariant Graph Neural Networks (GNNs), the backbones in multiple pre-trained interatomic potentials. |
Chen Wang; Siyu Hu; Guangming Tan; Weile Jia; | code |
| 481 | HybridGS: High-Efficiency Gaussian Splatting Data Compression Using Dual-Channel Sparse Representation and Point Cloud Encoder Highlight: This paper presents a new 3DGS compression framework called HybridGS, which takes advantage of both compact generation and standardized point cloud data encoding. |
Qi Yang; Le Yang; Geert Van der Auwera; Zhu Li; | code |
| 482 | DragLoRA: Online Optimization of LoRA Adapters for Drag-based Image Editing in Diffusion Model Highlight: However, these approaches often suffer from limited accuracy due to the low representation ability of the feature in motion supervision, as well as inefficiencies caused by the large search space required for point tracking. To address these limitations, we present DragLoRA, a novel framework that integrates LoRA (Low-Rank Adaptation) adapters into the drag-based editing pipeline. |
Siwei Xia; Li Sun; Tiantian Sun; Qingli Li; | code |
| 483 | MUDDFormer: Breaking Residual Bottlenecks in Transformers Via Multiway Dynamic Dense Connections Highlight: We propose MUltiway Dynamic Dense (MUDD) connections, a simple yet effective method to address the limitations of residual connections and enhance cross-layer information flow in Transformers. |
Da Xiao; Qingye Meng; Shengping Li; Xingyuan Yuan; | code |
| 484 | Knowledge Swapping Via Learning and Unlearning Highlight: We introduce Knowledge Swapping, a novel task designed to selectively regulate knowledge of a pretrained model by enabling the forgetting of user-specified information, retaining essential knowledge, and acquiring new knowledge simultaneously. |
Mingyu Xing; Lechao Cheng; Shengeng Tang; Yaxiong Wang; Zhun Zhong; Meng Wang; | code |
| 485 | MoRAgent: Parameter Efficient Agent Tuning with Mixture-of-Roles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce three key strategies for PEFT in agent tasks: 1) Inspired by the increasingly dominant *Reason+Action* paradigm, we first decompose the capabilities necessary for the agent tasks into three distinct roles: reasoner, executor, and summarizer. |
Jing Han; Binwei Yan; Tianyu Guo; Zheyuan Bai; Mengyu Zheng; Hanting Chen; Ying Nie; | code |
| 486 | Instruct2See: Learning to Remove Any Obstructions Across Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Most existing methods address occlusions from specific elements like fences or raindrops, but are constrained by the wide range of real-world obstructions, making comprehensive data collection impractical. To overcome these challenges, we propose Instruct2See, a novel zero-shot framework capable of handling both seen and unseen obstacles. |
Junhang Li; Yu Guo; Chuhua XIAN; Shengfeng He; | code |
| 487 | Geometric Feature Embedding for Effective 3D Few-Shot Class Incremental Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing 3D FSCIL approaches primarily utilize multimodal pre-trained models to extract the semantic features, heavily dependent on meticulously designed high-quality prompts and fine-tuning strategies. To reduce this dependence, this paper proposes a novel method for **3D** **F**SCI**L** with **E**mbedded **G**eometric features (**3D-FLEG**). |
Xiangqi Li; Libo Huang; Zhulin An; Weilun Feng; Chuanguang Yang; Boyu Diao; Fei Wang; Yongjun Xu; | code |
| 488 | Channel Normalization for Time Series Channel Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we highlight the importance of CID and propose Channel Normalization (CN), a simple yet effective normalization strategy that enhances CID by assigning distinct affine transformation parameters to each channel. |
Seunghan Lee; Taeyoung Park; Kibok Lee; | code |
| 489 | Improving Consistency Models with Generator-Augmented Flows Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The related estimation error induces a discrepancy between consistency distillation and training that, we show, still holds in the continuous-time limit. To alleviate this issue, we propose a novel flow that transports noisy data towards their corresponding outputs derived from a consistency model. |
Thibaut Issenhuth; Sangchul Lee; Ludovic Dos Santos; Jean-Yves Franceschi; Chansoo Kim; Alain Rakotomamonjy; | code |
| 490 | Improving Out-of-Distribution Detection Via Dynamic Covariance Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we argue that the influence of ill-distributed samples can be corrected by dynamically adjusting the prior geometry in response to new data. |
Kaiyu Guo; Zijian Wang; Tan Pan; Brian C. Lovell; Mahsa Baktashmotlagh; | code |
| 491 | UnMORE: Unsupervised Multi-Object Segmentation Via Center-Boundary Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce unMORE, a novel two-stage pipeline designed to identify many complex objects in real-world images. |
Yafei YANG; Zihui Zhang; Bo Yang; | code |
| 492 | Policy Design for Two-sided Platforms with Participation Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper thus studies the dynamics and recommender policy design on two-sided platforms under the population effects for the first time. |
Haruka Kiyohara; Fan Yao; Sarah Dean; | code |
| 493 | Generalization Performance of Ensemble Clustering: From Theory to Algorithm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper examines the generalization performance of ensemble clustering, focusing on generalization error, excess risk and consistency. |
Xu Zhang; Haoye Qiu; Weixuan Liang; Hui LIU; Junhui Hou; Yuheng Jia; | code |
| 494 | Learning Input Encodings for Kernel-Optimal Implicit Neural Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We first formulate the optimal kernel that minimizes pointwise expected squared error, then demonstrate that the Neural Tangent Kernel of the composed function (INR with input encoding) can approximate any positive semidefinite dot-product kernels through input feature mapping adjustments. Building upon these insights, we propose a Kernel Alignment Regularizer (KAR) that naturally integrates with existing INR systems to enhance kernel alignment. |
Zhemin Li; Liyuan Ma; Hongxia Wang; Yaoyun Zeng; Xiaolong Han; | code |
| 495 | Learning Adaptive Lighting Via Channel-Aware Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we identify shared fundamental properties across these tasks: i) different color channels have different light properties, and ii) these channel differences manifest differently in the spatial and frequency domains. Leveraging these insights, we introduce the channel-aware Learning Adaptive Lighting Network (LALNet), a multi-task framework designed to handle multiple light-related tasks efficiently. |
Qirui Yang; Peng-Tao Jiang; Hao Zhang; Jinwei Chen; Bo Li; Huanjing Yue; Jingyu Yang; | code |
| 496 | PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PDE-Transformer, an improved transformer-based architecture for surrogate modeling of physics simulations on regular grids. |
Benjamin Holzschuh; Qiang Liu; Georg Kohl; Nils Thuerey; | code |
| 497 | Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Video-Enhanced Offline RL (VeoRL), a model-based method that constructs an interactive world model from diverse, unlabeled video data readily available online. |
Minting Pan; Yitao Zheng; Jiajian Li; Yunbo Wang; Xiaokang Yang; | code |
| 498 | I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents ThinkDiff, a novel alignment paradigm that empowers text-to-image diffusion models with multimodal in-context understanding and reasoning capabilities by integrating the strengths of vision-language models (VLMs). |
Zhenxing Mi; Kuan-Chieh Wang; Guocheng Qian; Hanrong Ye; Runtao Liu; Sergey Tulyakov; Kfir Aberman; Dan Xu; | code |
| 499 | Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This raises a question: “Does attention fail for graphs in natural language settings?” Motivated by these observations, we embarked on an empirical study from the perspective of attention mechanisms to explore how LLMs process graph-structured data. |
Zhong Guan; Likang Wu; Hongke Zhao; Ming He; Jianping Fan; | code |
| 500 | $\texttt{I$^2$MoE}$: Interpretable Multimodal Interaction-aware Mixture-of-Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose $\texttt{I$^2$MoE}$ ($\underline{I}$nterpretable Multimodal $\underline{I}$nteraction-aware $\underline{M}$ixture-$\underline{o}$f-$\underline{E}$xperts), an end-to-end MoE framework designed to enhance modality fusion by explicitly modeling diverse multimodal interactions, as well as providing interpretation on a local and global level. |
Jiayi Xin; Sukwon Yun; Jie Peng; Inyoung Choi; Jenna L. Ballard; Tianlong Chen; Qi Long; | code |
| 501 | FlexControl: Computation-Aware Conditional Control with Differentiable Router for Text-to-Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, most implementations (e.g., ControlNet) rely on ad-hoc heuristics to choose which network blocks to control — an approach that varies unpredictably with different tasks. To address this gap, we propose FlexControl, a novel framework that equips all diffusion blocks with control signals during training and employs a trainable gating mechanism to dynamically select which control signal to activate at each denoising step. |
Zheng Fang; Lichuan Xiang; Xu Cai; Kaicheng Zhou; Hongkai Wen; | code |
| 502 | Sorbet: A Neuromorphic Hardware-Compatible Transformer-Based Spiking Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, key operations like softmax and layer normalization (LN) are difficult to implement on neuromorphic hardware, and many of these early works sidestepped them. To address these challenges, we introduce Sorbet, a transformer-based spiking language model that is more neuromorphic hardware-compatible. |
Kaiwen Tang; Zhanglu Yan; Weng-Fai Wong; | code |
| 503 | WMarkGPT: Watermarked Image Understanding Via Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a meticulously designed three-stage learning pipeline to progressively equip WMarkGPT with the necessary abilities. |
Songbai Tan; Xuerui Qiu; Yao Shu; Gang Xu; Linrui Xu; Xiangyu Xu; Huiping Zhuang; Ming Li; Fei Yu; | code |
| 504 | Reaction Graph: Towards Reaction-Level Modeling for Chemical Reactions with 3D Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Reaction Graph (RG), a unified graph representation that encapsulates the 3D molecular structures within chemical reactions. |
Yingzhao Jian; Yue Zhang; Ying Wei; Hehe Fan; Yi Yang; | code |
| 505 | Few-Shot Learner Generalizes Across AI-Generated Image Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Besides, collecting adequate training data from online generative models is often expensive or infeasible. To overcome these issues, we propose Few-Shot Detector (FSD), a novel AI-generated image detector which learns a specialized metric space for effectively distinguishing unseen fake images using very few samples. |
Shiyu Wu; Jing Liu; Jing Li; Yequan Wang; | code |
| 506 | How Do Images Align and Complement LiDAR? Towards A Harmonized Multi-modal 3D Panoptic Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these approaches have shown promising results, they still face challenges, such as misalignment during data augmentation and the reliance on post-processing steps. To address these issues, we propose **I**mage-**A**ssists-**L**iDAR (**IAL**), a novel multi-modal 3D panoptic segmentation framework. |
Yining Pan; Qiongjie Cui; Xulei Yang; Na Zhao; | code |
| 507 | MixBridge: Heterogeneous Image-to-Image Backdoor Attack Through Mixture of Schrödinger Bridges Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing backdoor formulations mainly address single-attack scenarios and are limited to Gaussian noise input models. To fill this gap, we propose MixBridge, a novel diffusion Schrödinger bridge (DSB) framework to cater to arbitrary input distributions (taking I2I tasks as special cases). |
Shixi Qin; Zhiyong Yang; Shilong Bao; Shi Wang; Qianqian Xu; Qingming Huang; | code |
| 508 | Right Time to Learn: Promoting Generalization Via Bio-inspired Spacing Effect in Knowledge Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose an easy-to-use and compatible strategy named Spaced KD to improve the effectiveness of both online KD and self KD, in which the student model distills knowledge from a teacher model trained with a space interval ahead. |
Guanglong Sun; Hongwei Yan; Liyuan Wang; Qian Li; Bo Lei; Yi Zhong; | code |
| 509 | Cut Out and Replay: A Simple Yet Versatile Strategy for Multi-Label Online Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It not only enables models to simultaneously address catastrophic forgetting, missing labels, and class imbalance challenges, but also serves as an orthogonal solution that seamlessly integrates with existing approaches. |
Xinrui Wang; Shao-Yuan Li; Jiaqiang Zhang; Songcan Chen; | code |
| 510 | UniMate: A Unified Model for Mechanical Metamaterial Generation, Property Prediction, and Condition Confirmation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose a unified model named UniMate, which consists of a modality alignment module and a synergetic diffusion generation module. |
Wangzhi Zhan; Jianpeng Chen; Dongqi Fu; Dawei Zhou; | code |
| 511 | CFPT: Empowering Time Series Forecasting Through Cross-Frequency Interaction and Periodic-Aware Timestamp Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Long-term time series forecasting has been widely studied, yet two aspects remain insufficiently explored: the interaction learning between different frequency components and the exploitation of periodic characteristics inherent in timestamps. To address the above issues, we propose **CFPT**, a novel method that empowers time series forecasting through **C**ross-**F**requency Interaction (CFI) and **P**eriodic-Aware **T**imestamp Modeling (PTM). |
Feifei Kou; Jiahao Wang; Lei Shi; Yuhan Yao; Yawen Li; Suguo Zhu; Zhongbao Zhang; Junping Du; | code |
| 512 | Information Bottleneck-guided MLPs for Robust Spatial-temporal Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we investigate the problem: *can simple neural networks such as Multi-Layer Perceptrons (MLPs) achieve robust spatial-temporal forecasting while remaining efficient?* |
Min Chen; Guansong Pang; Wenjun Wang; Cheng Yan; | code |
| 513 | Better to Teach Than to Give: Domain Generalized Semantic Segmentation Via Agent Queries with Diffusion Model Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel agent **Query**-driven learning framework based on **Diff**usion model guidance for DGSS, named QueryDiff. |
Fan Li; Xuan Wang; Min Qi; Zhaoxiang Zhang; yuelei xu; | code |
| 514 | Retraining-free Merging of Sparse MoE Via Hierarchical Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the deployment of SMoE models faces constraints from extensive memory requirements of expert components in resource-limited environments. To address these limitations, this paper introduces Hierarchical Clustering for Sparsely activated Mixture of Experts (HC-SMoE), a task-agnostic expert merging framework for parameter reduction without retraining. |
I-Chun Chen; Hsu-Shen Liu; Wei-Fang Sun; Chen-Hao Chao; Yen-Chang Hsu; Chun-Yi Lee; | code |
| 515 | Weight Matrices Compression Based on PDB Model in Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, a novel **Population Double Bulk (PDB) model** is proposed to characterize the eigenvalue behavior of the weight matrix, which is more general than the existing Population Unit Bulk (PUB) model. |
Xiaoling Wu; Junpeng Zhu; Zeng Li; | code |
| 516 | SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce SAeUron, a novel method leveraging features learned by sparse autoencoders (SAEs) to remove unwanted concepts in text-to-image diffusion models. |
Bartosz Cywiński; Kamil Deja; | code |
| 517 | Structure-informed Risk Minimization for Robust Ensemble Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by Distributionally Robust Optimization (DRO), we propose Structure-informed Risk Minimization (SRM), a principled framework that learns robust ensemble weights without access to test data. |
Fengchun Qiao; Yanlin Chen; Xi Peng; | code |
| 518 | GenZSL: Generative Zero-Shot Learning Via Inductive Variational Autoencoder Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing generative ZSL methods merely generate (imagine) the visual features from scratch guided by the strong class semantic vectors annotated by experts, resulting in suboptimal generative performance and limited scene generalization. To address these and advance ZSL, we propose an inductive variational autoencoder for generative zero-shot learning, dubbed GenZSL. |
Shiming Chen; Dingjie Fu; Salman Khan; Fahad Shahbaz Khan; | code |
| 519 | WGFormer: An SE(3)-Transformer Driven By Wasserstein Gradient Flows for Molecular Ground-State Conformation Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel and effective method to bridge the energy-based simulation and the learning-based strategy, which designs and learns a Wasserstein gradient flow-driven SE(3)-Transformer, called WGFormer, for ground-state conformation prediction. |
Fanmeng Wang; Minjie Cheng; Hongteng Xu; | code |
| 520 | The Emperor’s New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we design a systematic and controlled pipeline along with two novel metrics—*fidelity* and *contamination resistance*—to provide a fine-grained and comprehensive assessment of existing BDC mitigation strategies. |
Yifan Sun; Han Wang; Dongbai Li; Gang Wang; Huan Zhang; | code |
| 521 | ArrayDPS: Unsupervised Blind Speech Separation with A Diffusion Prior Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose ArrayDPS to solve the BSS problem in an unsupervised, array-agnostic, and generative manner. |
Zhongweiyang Xu; Xulin Fan; Zhong-Qiu Wang; Xilin Jiang; Romit Roy Choudhury; | code |
| 522 | BoA: Attention-aware Post-training Quantization Without Backpropagation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel backpropagation-free PTQ algorithm that optimizes quantized weights by considering inter-layer dependencies. |
Junhan Kim; Ho-young Kim; Eulrang Cho; Chungman Lee; Joonyoung Kim; Yongkweon Jeon; | code |
| 523 | ML$^2$-GCL: Manifold Learning Inspired Lightweight Graph Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing works follow the basic principle of pulling positive pairs closer and pushing negative pairs far away, they still suffer from several critical problems, such as the underlying semantic disturbance introduced by augmentation strategies, the failure of GCNs to capture long-range dependence, and the rigidity and inefficiency of node sampling techniques. To address these issues, we propose Manifold Learning Inspired Lightweight Graph Contrastive Learning (ML$^2$-GCL), which inherits the merits of both manifold learning and GCN. |
Jianqing Liang; Zhiqiang Li; Xinkai Wei; Yuan Liu; Zhiqiang Wang; | code |
| 524 | Spherical-Nested Diffusion Model for Panoramic Image Outpainting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given that the majority of generative outpainting solutions operate on planar images, existing methods for panoramic images address the spherical nature through soft regularisation during end-to-end learning, which still fails to fully exploit the spherical content. In this paper, we make the first attempt to impose the spherical nature in the design of the diffusion model, so that the panoramic format is intrinsically ensured during the learning procedure; we name this the spherical-nested diffusion (SpND) model. |
Xiancheng Sun; Senmao Ma; Shengxi Li; Mai Xu; Jingyuan Xia; Lai Jiang; Xin Deng; Jiali Wang; | code |
| 525 | TINED: GNNs-to-MLPs By Teacher Injection and Dirichlet Energy Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present TINED, a novel approach that distills GNNs to MLPs on a layer-by-layer basis using Teacher Injection and Dirichlet Energy Distillation techniques. |
Ziang Zhou; Zhihao Ding; Jieming Shi; Li Qing; Shiqi Shen; | code |
| 526 | Model Steering: Learning with A Reference Model Improves Generalization Bounds and Scaling Laws Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a theory-driven framework for model steering called **DRRho risk minimization**, which is rooted in Distributionally Robust Optimization (DRO). |
Xiyuan Wei; Ming Lin; Fanjiang Ye; Fengguang Song; Liangliang Cao; My T. Thai; Tianbao Yang; | code |
| 527 | Discovering Global False Negatives On The Fly for Self-supervised Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this approach can result in the creation of negative pairs with similar semantics, referred to as false negatives, leading to their embeddings being falsely pushed apart. To address this issue, we introduce *GloFND*, an optimization-based approach that automatically learns on the fly the threshold for each anchor data to *identify* its false negatives during training. |
Vicente Balmaseda; Bokun Wang; Ching-Long Lin; Tianbao Yang; | code |
| 528 | R3DM: Enabling Role Discovery and Diversity Through Dynamics Models in Multi-agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we propose Role Discovery and Diversity through Dynamics Models (R3DM), a novel role-based MARL framework that learns emergent roles by maximizing the mutual information between agents’ roles, observed trajectories, and expected future behaviors. |
Harsh Goel; Mohammad Omama; Behdad Chalaki; Vaishnav Tadiparthi; Ehsan Moradi Pari; Sandeep P. Chinchali; | code |
| 529 | Hessian Geometry of Latent Space in Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel method for analyzing the latent space geometry of generative models, including statistical physics models and diffusion models, by reconstructing the Fisher information metric. |
Alexander Lobashev; Dmitry Guskov; Maria Larchenko; Mikhail Tamm; | code |
| 530 | InfoSAM: Fine-Tuning The Segment Anything Model from An Information-Theoretic Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing PEFT methods for SAM neglect the domain-invariant relations encoded in the pre-trained model. To bridge this gap, we propose InfoSAM, an information-theoretic approach that enhances SAM fine-tuning by distilling and preserving its pre-trained segmentation knowledge. |
yuanhong zhang; Muyao Yuan; Weizhan Zhang; Tieliang Gong; Wen Wen; Jiangyong Ying; Weijie Shi; | code |
| 531 | GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce GPTAQ, a novel finetuning-free quantization method for compressing large-scale transformer architectures. |
Yuhang Li; Ruokai Yin; Donghyun Lee; Shiting Xiao; Priyadarshini Panda; | code |
| 532 | Discriminative Finetuning of Generative Large Language Models Without Reward Models and Human Preference Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address the limitations of SFT by exploring one of the most successful techniques in conventional supervised learning: discriminative learning. |
Siqi Guo; Ilgee Hong; Vicente Balmaseda; Changlong Yu; Liang Qiu; Xin Liu; Haoming Jiang; Tuo Zhao; Tianbao Yang; | code |
| 533 | Devil Is in The Details: Density Guidance for Detail-Aware Generation with Flow Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we analyze an existing technique, Prior Guidance, which scales the latent code to influence image detail. |
Rafal Karczewski; Markus Heinonen; Vikas K Garg; | code |
| 534 | Circumventing Backdoor Space Via Weight Symmetry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Notably, recent studies have shown successful backdoor attacks across various learning paradigms, highlighting a critical security concern. To address this gap, we propose Two-stage Symmetry Connectivity (TSC), a novel backdoor purification defense that operates independently of data format and requires only a small fraction of clean samples. |
Jie Peng; Hongwei Yang; Jing Zhao; Hengji Dong; Hui He; Weizhe Zhang; Haoyu He; | code |
| 535 | Model Immunization from A Condition Number Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a framework, based on the condition number of a Hessian matrix, to analyze model immunization for linear models. |
Amber Yijia Zheng; Site Bai; Brian Bullins; Raymond A. Yeh; | code |
| 536 | Are High-Quality AI-Generated Images More Difficult for Models to Detect? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, our systematic study on cutting-edge text-to-image generators reveals a counterintuitive finding: AIGIs with higher quality scores, as assessed by human preference models, tend to be more easily detected by existing models. To investigate this, we examine how the text prompts for generation and image characteristics influence both quality scores and detector accuracy. |
Yao Xiao; Binbin Yang; Weiyan Chen; Jiahao Chen; Zijie Cao; ZiYi Dong; Xiangyang Ji; Liang Lin; Wei Ke; Pengxu Wei; | code |
| 537 | Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we propose **Similar**, a **s**tep-w**i**se **m**ult**i**-dimensiona**l** gener**a**list **r**eward model, which offers fine-grained signals for agent training and can choose better actions for inference-time scaling. Furthermore, we introduce the first benchmark in the virtual agent domain for step-wise, multi-dimensional reward model training and evaluation, named ***SRM***. |
Bingchen Miao; Yang Wu; Minghe Gao; Qifan Yu; Wendong Bu; Wenqiao Zhang; liyunfei; Siliang Tang; Tat-Seng Chua; Juncheng Li; | code |
| 538 | Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose CoTo, a progressive training strategy that gradually increases adapters’ activation probability over the course of fine-tuning. |
Zhan Zhuang; Xiequn Wang; Wei Li; Yulong Zhang; Qiushi Huang; Shuhao Chen; Xuehao Wang; Yanbin Wei; Yuhe Nie; Kede Ma; Yu Zhang; Ying Wei; | code |
| 539 | CellFlux: Simulating Cellular Morphology Changes Via Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CellFlux, an image-generative model that simulates cellular morphology changes induced by chemical and genetic perturbations using flow matching. |
Yuhui Zhang; Yuchang Su; Chenyu Wang; Tianhong Li; Zoe Wefers; Jeffrey J Nirschl; James Burgess; Daisy Ding; Alejandro Lozano; Emma Lundberg; Serena Yeung-Levy; | code |
| 540 | Optimizing Adaptive Attacks Against Watermarks for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formulate watermark robustness as an objective function and use preference-based optimization to tune *adaptive* attacks against the specific watermarking method. |
Abdulrahman Diaa; Toluwani Aremu; Nils Lukas; | code |
| 541 | Unnatural Languages Are Not Bugs But Features for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) have been observed to process non-human-readable text sequences, such as jailbreak prompts, often viewed as a bug for aligned LLMs. In this work, we present a systematic investigation challenging this perception, demonstrating that unnatural languages – strings that appear incomprehensible to humans but maintain semantic meanings for LLMs – contain latent features usable by models. |
Keyu Duan; Yiran Zhao; Zhili Feng; Jinjie Ni; Tianyu Pang; Qian Liu; Tianle Cai; Longxu Dou; Kenji Kawaguchi; Anirudh Goyal; J Zico Kolter; Michael Qizhe Shieh; | code |
| 542 | A Square Peg in A Square Hole: Meta-Expert for Long-Tailed Semi-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We observe that different experts are good at predicting different intervals of samples; e.g., the long-tailed expert is skilled at samples located in the head interval, while the uniform expert excels at samples located in the medium interval. Therefore, we propose a dynamic expert assignment module that estimates the class membership (i.e., head, medium, or tail class) of samples and dynamically assigns a suitable expert to each sample based on the estimated membership, producing high-quality pseudo-labels in the training phase and predictions in the testing phase. |
Yaxin Hou; Yuheng Jia; | code |
| 543 | Scalable Non-Equivariant 3D Molecule Generation Via Rotational Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, specialized equivariant architectures limit the scalability and efficiency of diffusion models. In this paper, we propose an approach that relaxes such equivariance constraints. |
Yuhui Ding; Thomas Hofmann; | code |
| 544 | Generalized Interpolating Discrete Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Leveraging a novel diffusion ELBO, we achieve compute-matched state-of-the-art performance in diffusion language modeling. Exploiting GIDD’s flexibility, we explore a hybrid approach combining masking and uniform noise, leading to improved sample quality and unlocking the ability for the model to correct its own mistakes, an area where autoregressive models have notoriously struggled. |
Dimitri von Rütte; Janis Fluri; Yuhui Ding; Antonio Orvieto; Bernhard Schölkopf; Thomas Hofmann; | code |
| 545 | Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ensuring fairness in medical image segmentation is critical due to biases in imbalanced clinical data acquisition caused by demographic attributes (e.g., age, sex, race) and clinical factors (e.g., disease severity). To address these challenges, we introduce Distribution-aware Mixture of Experts (dMoE), inspired by optimal control theory. |
Yujin Oh; Pengfei Jin; Sangjoon Park; Sekeun Kim; Siyeop Yoon; Jin Sung Kim; Kyungsang Kim; Xiang Li; Quanzheng Li; | code |
| 546 | TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce `TypyBench`, a benchmark designed to evaluate LLMs’ type inference across entire Python repositories. |
Honghua Dong; Jiacheng Yang; Xun Deng; Yuhe Jiang; Gennady Pekhimenko; Fan Long; Xujie Si; | code |
| 547 | EvoMesh: Adaptive Physical Simulation with Hierarchical Graph Evolutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose EvoMesh, a fully differentiable framework that jointly learns graph hierarchies and physical dynamics, adaptively guided by physical inputs. |
Huayu Deng; Xiangming Zhu; Yunbo Wang; Xiaokang Yang; | code |
| 548 | Perceptual-GS: Scene-adaptive Perceptual Densification for Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods struggle to adaptively optimize the distribution of Gaussian primitives based on scene characteristics, making it challenging to balance reconstruction quality and efficiency. Inspired by human perception, we propose scene-adaptive perceptual densification for Gaussian Splatting (Perceptual-GS), a novel framework that integrates perceptual sensitivity into the 3DGS training process to address this challenge. |
Hongbi Zhou; Zhangkai Ni; | code |
| 549 | LADA: Scalable Label-Specific CLIP Adapter for Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This requires selecting the expected parameters for input images during inference, which is prone to error that degrades performance. To address this problem, we introduce LADA (**L**abel-specific **ADA**pter). |
Mao-Lin Luo; Zi-Hao Zhou; Tong Wei; Min-Ling Zhang; | code |
| 550 | Efficiently Serving Large Multimodal Models Using EPD Disaggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Encode-Prefill-Decode (EPD) Disaggregation, a novel framework that separates the encoding, prefill, and decode stages onto dedicated resources. |
Gursimran Singh; Xinglu Wang; Yifan Hu; Timothy Tin Long Yu; Linzi Xing; Wei Jiang; Zhefeng Wang; Xiaolong Bai; Yi Li; Ying Xiong; Yong Zhang; Zhenan Fan; | code |
| 551 | SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we establish a connection between Koopman operator approximation and linear Recurrent Neural Networks (RNNs), which have recently demonstrated remarkable success in sequence modeling. |
Yitian Zhang; Liheng Ma; Antonios Valkanas; Boris N. Oreshkin; Mark Coates; | code |
| 552 | Learning Efficient Robotic Garment Manipulation with Standardization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents APS-Net, a novel approach to garment manipulation that combines unfolding and standardization in a unified framework. |
Changshi Zhou; Feng Luan; Jiarui Hu; Shaoqiang Meng; Zhipeng Wang; Yanchao Dong; Yanmin Zhou; Bin He; | code |
| 553 | Diffusion Sampling Correction Via Approximately 10 Parameters Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose **P**CA-based **A**daptive **S**earch (PAS), which optimizes existing solvers for DPMs with minimal additional costs. |
Guangyi Wang; Wei Peng; Lijiang Li; Wenyu Chen; Yuren Cai; Song-Zhi Su; | code |
| 554 | Long-Form Speech Generation with Spoken Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the generative modeling of speech over multiple minutes, a requirement for long-form multimedia generation and audio-native voice assistants. |
Se Jin Park; Julian Salazar; Aren Jansen; Keisuke Kinoshita; Yong Man Ro; RJ Skerry-Ryan; | code |
| 555 | Automated Hypothesis Validation with Agentic Sequential Falsifications Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose POPPER, an agentic framework for rigorous automated validation of free-form hypotheses. |
Kexin Huang; Ying Jin; Ryan Li; Michael Y. Li; Emmanuel Candes; Jure Leskovec; | code |
| 556 | Can Large Language Models Understand Intermediate Representations in Compilers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present an explorative empirical study evaluating the capabilities of six state-of-the-art LLMs—GPT-4, GPT-3, DeepSeek, Gemma 2, Llama 3, and Code Llama—in understanding IRs. |
Hailong Jiang; Jianfeng Zhu; Yao Wan; Bo Fang; Hongyu Zhang; Ruoming Jin; Qiang Guan; | code |
| 557 | IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce IMPACT, a text-to-audio generation framework that achieves high performance in audio quality and fidelity while ensuring fast inference. |
Kuan-Po Huang; Shu-wen Yang; HUY PHAN; Bo-Ru Lu; Byeonggeun Kim; Sashank Macha; Qingming Tang; Shalini Ghosh; Hung-yi Lee; Chieh-Chi Kao; Chao Wang; | code |
| 558 | QuRe: Query-Relevant Retrieval Through Hard Negative Sampling in Composed Image Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This may result in retrieving irrelevant images, reducing user satisfaction even when the target image is retrieved. To address this issue, we propose Query-Relevant Retrieval through Hard Negative Sampling (QuRe), which optimizes a reward model objective to reduce false negatives. |
Jaehyun Kwak; Ramahdani Muhammad Izaaz Inhar; Se-Young Yun; Sung-Ju Lee; | code |
| 559 | KoopSTD: Reliable Similarity Analysis Between Dynamical Systems Via Approximating Koopman Spectrum with Timescale Decoupling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose **KoopSTD**, a dynamical similarity measurement framework that precisely characterizes the underlying dynamics by approximating the Koopman spectrum with explicit timescale decoupling and spectral residual control. |
Shimin Zhang; Ziyuan Ye; Yinsong Yan; Zeyang Song; Yujie Wu; Jibin Wu; | code |
| 560 | Revisiting Continuity of Image Tokens for Cross-domain Few-shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This questions the role of image tokens’ continuity in ViT’s generalization under large domain gaps. In this paper, we delve into this phenomenon and offer an interpretation. |
Shuai Yi; Yixiong Zou; Yuhua Li; Ruixuan Li; | code |
| 561 | Random Registers for Cross-Domain Few-Shot Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although Vision Transformer (ViT) has shown superior capability in many vision tasks, its transferability against huge domain gaps in CDFSL is still under-explored. In this paper, we find an intriguing phenomenon: during the source-domain training, prompt tuning, as a common way to train ViT, could be harmful for the generalization of ViT in target domains, but setting them to random noises (i.e., random registers) could consistently improve target-domain performance. |
Shuai Yi; Yixiong Zou; Yuhua Li; Ruixuan Li; | code |
| 562 | Diffusion Instruction Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce *Lavender*, a simple supervised fine-tuning (SFT) method that boosts the performance of advanced vision-language models (VLMs) by leveraging state-of-the-art image generation models such as Stable Diffusion. |
Chen Jin; Ryutaro Tanno; Amrutha Saseendran; Tom Diethe; Philip Alexander Teare; | code |
| 563 | Towards Lifelong Model Editing Via Simulating Ideal Editor Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, this paper proposes a general framework, ***Sim**ulating **I**deal **E**ditor* (SimIE), which restores the strong performance of parameter-modifying methods from standard model editing in a lifelong context. |
Yaming Guo; Siyang Guo; Hengshu Zhu; Ying Sun; | code |
| 564 | Modified K-means Algorithm with Local Optimality Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first present conditions under which the K-means algorithm converges to a locally optimal solution. Based on this, we propose simple modifications to the K-means algorithm which ensure local optimality in both the continuous and discrete sense, with the same computational complexity as the original K-means algorithm. |
Mingyi Li; Michael R. Metel; Akiko Takeda; | code |
| 565 | A Variational Framework for Improving Naturalness in Generative Spoken Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, pitch alone cannot fully represent the range of paralinguistic attributes, and selecting the right features requires careful hand-engineering. To overcome this, we propose an end-to-end variational approach that automatically learns to encode these continuous speech attributes to enhance the semantic tokens. |
Li-Wei Chen; Takuya Higuchi; Zakaria Aldeneh; Ahmed Hussen Abdelaziz; Alexander Rudnicky; | code |
| 566 | Efficient and Separate Authentication Image Steganography Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: It is elegant to design an authentication mechanism for isolated reception. We explore such a mechanism through extensive experiments, and uncover that additional authentication information will affect the distribution of hidden information and occupy more hiding space of the cover image. |
Junchao Zhou; Yao Lu; Jie Wen; Guangming Lu; | code |
| 567 | Learning Monotonic Probabilities with A Generative Cost Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This perspective enables us to reformulate the monotonicity challenge into modeling the latent cost variable. To tackle this, we introduce a generative network for the latent cost variable, termed the Generative Cost Model (**GCM**), which inherently addresses the strict monotonic problem, and propose the Implicit Generative Cost Model (**IGCM**) to address the implicit monotonic problem. |
Yongxiang Tang; Yanhua Cheng; Xiaocheng Liu; Jiaochenchen; Yanxiang Zeng; Ning Luo; Pengjia Yuan; Xialong Liu; Peng Jiang; | code |
| 568 | Active Reward Modeling: Adaptive Preference Labeling for Large Language Model Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we highlight the insight that an ideal comparison dataset for reward modeling should balance *exploration of the representation space* and make *informative comparisons* between pairs with moderate reward differences. |
Yunyi Shen; Hao Sun; Jean-Francois Ton; | code |
| 569 | RePaViT: Scalable Vision Transformer Acceleration Via Structural Reparameterization on Feedforward Network Layers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel channel idle mechanism that facilitates post-training structural reparameterization for efficient FFN layers during testing. |
Xuwei Xu; Yang Li; Yudong Chen; Jiajun Liu; Sen Wang; | code |
| 570 | Optimal Information Retention for Time-Series Explanations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a practical framework, we introduce an explanation framework ORTE, learning a binary mask to eliminate redundant information while mining temporal patterns of explanations. |
Jinghang Yue; Jing Wang; Lu Zhang; Shuo Zhang; Da Li; Zhaoyang Ma; Youfang Lin; | code |
| 571 | BalancEdit: Dynamically Balancing The Generality-Locality Trade-off in Multi-modal Model Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, direct knowledge editing within the models presents a more viable solution. We develop a new model editing dataset named OKEDIT, specifically designed to effectively evaluate this trade-off. |
Dongliang Guo; Mengxuan Hu; Zihan Guan; Thomas Hartvigsen; Sheng Li; | code |
| 572 | Redundancy Undermines The Trustworthiness of Self-Interpretable GNNs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents a systematic investigation into the trustworthiness of explanations generated by self-interpretable graph neural networks (GNNs), revealing why models trained with different random seeds yield inconsistent explanations. |
Wenxin Tai; Ting Zhong; Goce Trajcevski; Fan Zhou; | code |
| 573 | MTL-UE: Learning to Learn Nothing for Multi-Task Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents MTL-UE, the first unified framework for generating unlearnable examples for multi-task data and MTL models. |
Yi Yu; Song Xia; Siyuan Yang; Chenqi Kong; Wenhan Yang; Shijian Lu; Yap-Peng Tan; Alex Kot; | code |
| 574 | CoCoA-Mix: Confusion-and-Confidence-Aware Mixture Model for Context Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, frozen encoders often produce misaligned features, leading to confusion between classes and limiting specialization. To overcome this issue, we propose a confusion-aware loss (CoA-loss) that improves specialization by refining the decision boundaries between confusing classes. |
Dasol Hong; Wooju Lee; Hyun Myung; | code |
| 575 | ITBench: Evaluating AI Agents Across Diverse Real-World IT Automation Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ITBench, a framework that offers a systematic methodology for benchmarking AI agents to address real-world IT automation tasks. |
Saurabh Jha; Rohan R. Arora; Yuji Watanabe; Takumi Yanagawa; Yinfang Chen; Jackson Clark; Bhavya Bhavya; Mudit Verma; Harshit Kumar; Hirokuni Kitahara; Noah Zheutlin; Saki Takano; Divya Pathak; Felix George; Xinbo Wu; Bekir O Turkkan; Gerard Vanloo; Michael Nidd; Ting Dai; Oishik Chatterjee; Pranjal Gupta; Suranjana Samanta; Pooja Aggarwal; Rong Lee; Jae-wook Ahn; Debanjana Kar; Amit Paradkar; Yu Deng; Pratibha Moogi; Prateeti Mohapatra; Naoki Abe; Chandrasekhar Narayanaswami; Tianyin Xu; Lav R. Varshney; Ruchi Mahindru; Anca Sailer; Laura Shwartz; Daby Sow; Nicholas C. M. Fuller; Ruchir Puri; | code |
| 576 | GrokFormer: Graph Fourier Kolmogorov-Arnold Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Some recent GT models help alleviate this issue, but their flexibility and expressiveness are still limited since the filters they learn are fixed on predefined graph spectrum or spectral order. To tackle this challenge, we propose a Graph Fourier Kolmogorov-Arnold Transformer (GrokFormer), a novel GT model that learns highly expressive spectral filters with adaptive graph spectrum and spectral order through a Fourier series modeling over learnable activation functions. |
Guoguo Ai; Guansong Pang; Hezhe Qiao; Yuan Gao; Hui Yan; | code |
| 577 | Online Conformal Prediction Via Online Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a family of algorithms for online conformal prediction with coverage guarantees for both adversarial and stochastic data. |
Felipe Areces; Christopher Mohri; Tatsunori Hashimoto; John Duchi; | code |
| 578 | Propagate and Inject: Revisiting Propagation-Based Feature Imputation for Graphs with Partially Observed Features Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address learning tasks on graphs with missing features, enhancing the applicability of graph neural networks to real-world graph-structured data. |
Daeho Um; Sunoh Kim; Jiwoong Park; Jongin Lim; Seong Jin Ahn; Seulki Park; | code |
| 579 | PEAKS: Selecting Key Training Examples Incrementally Via Prediction Error Anchored By Kernel Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We find that in IDS, the impact of a new sample on the model state depends fundamentally on both its geometric relationship in the feature space and its prediction error. Leveraging this insight, we propose PEAKS (Prediction Error Anchored by Kernel Similarity), an efficient data selection method tailored for IDS. |
Mustafa Burak Gurbuz; Xingyu Zheng; Constantine Dovrolis; | code |
| 580 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce a flexible and efficient data mixing framework, Chameleon, that employs leverage scores to quantify domain importance within a learned embedding space. |
Wanyun Xie; Francesco Tonin; Volkan Cevher; | code |
| 581 | Mixed-curvature Decision Trees and Random Forests Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our novel angular reformulation respects manifold geometry while preserving the algorithmic properties that make decision trees effective. In the special cases of single-component manifolds, our method simplifies to its Euclidean or hyperbolic counterparts, or introduces hyperspherical DT algorithms, depending on the curvature. |
Philippe Chlenski; Quentin Chu; Raiyan R. Khan; Kaizhu Du; Antonio Khalil Moretti; Itsik Pe’er; | code |
| 582 | Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose **Pi**voting **Fa**ctorization (**PIFA**), a novel **lossless** meta low-rank representation that unsupervisedly learns a **compact** form of any low-rank representation, effectively eliminating redundant information. |
Jialin Zhao; Yingtao Zhang; Carlo Vittorio Cannistraci; | code |
| 583 | Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces MUFFIN, a fully convolutional NAC framework that leverages psychoacoustically guided multi-band frequency reconstruction. |
Dianwen Ng; Kun Zhou; Yi-Wen Chao; Zhiwei Xiong; Bin Ma; Eng Siong Chng; | code |
| 584 | Playmate: Flexible Control of Portrait Animation Via 3D-Implicit Space Guided Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In the first stage, we introduce a decoupled implicit 3D representation along with a meticulously designed motion-decoupled module to facilitate more accurate attribute disentanglement and generate expressive talking videos directly from audio cues. |
Xingpei Ma; Jiaran Cai; Yuansheng Guan; Shenneng Huang; Qiang Zhang; Shunsi Zhang; | code |
| 585 | On Exact Bit-level Reversible Transformers Without Changing Architecture Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we present the BDIA-transformer, which is an exact bit-level reversible transformer that uses an unchanged standard architecture for inference. |
Guoqiang Zhang; JP Lewis; W. Bastiaan Kleijn; | code |
| 586 | Open Materials Generation with Stochastic Interpolants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Open Materials Generation (OMatG), a unifying framework for the generative design and discovery of inorganic crystalline materials. |
Philipp Höllmer; Thomas Egg; Maya Martirossyan; Eric Fuemmeler; Zeren Shui; Amit Gupta; Pawan Prakash; Adrian Roitberg; Mingjie Liu; George Karypis; Mark Transtrum; Richard Hennig; Ellad B. Tadmor; Stefano Martiniani; | code |
| 587 | Tackling Dimensional Collapse Toward Comprehensive Universal Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we identify that the failure of PDM for extreme UniDA stems from dimensional collapse (DC) in target representations. |
Hung-Chieh Fang; Po-Yi Lu; Hsuan-Tien Lin; | code |
| 588 | Continual Generalized Category Discovery: Learning and Forgetting from A Bayesian Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on this insight, we propose Variational Bayes C-GCD (VB-CGCD), a novel framework that integrates variational inference with covariance-aware nearest-class-mean classification. We also introduce a new challenging benchmark with only 10% labeled data and extended online phases; VB-CGCD achieves a 67.86% final accuracy, significantly higher than state-of-the-art (38.55%), demonstrating its robust applicability across diverse scenarios. |
Hao Dai; Jagmohan Chauhan; | code |
| 589 | Navigating Conflicting Views: Harnessing Trust for Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While prior work focuses on learning consistent and informative representations across views, it often assumes perfect alignment and equal importance of all views, an assumption rarely met in real-world scenarios, as some views may express distinct information. To address this, we develop a computational trust-based discounting method that enhances the Evidential Multi-view framework by accounting for the instance-wise reliability of each view through a probability-sensitive trust mechanism. |
Jueqing Lu; Wray Buntine; YUANYUAN QI; Joanna Dipnall; Belinda Gabbe; Lan Du; | code |
| 590 | Self-supervised Adversarial Purification for Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Defending Graph Neural Networks (GNNs) against adversarial attacks requires balancing accuracy and robustness, a trade-off often mishandled by traditional methods like adversarial training that intertwine these conflicting objectives within a single classifier. To overcome this limitation, we propose a self-supervised adversarial purification framework. |
Woohyun Lee; Hogun Park; | code |
| 591 | HyperIV: Real-time Implied Volatility Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose HyperIV, a novel approach for real-time implied volatility smoothing that eliminates the need for traditional calibration procedures. |
Yongxin Yang; Wenqi Chen; Chao Shu; Timothy Hospedales; | code |
| 592 | Pareto-Optimal Fronts for Benchmarking Symbolic Regression Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore absolute Pareto-optimal (APO) solutions instead, which have the optimal tradeoff between the multiple SR objectives, for 34 datasets in the widely-used SR benchmark, SRBench, by performing exhaustive search. |
Kei Sen Fong; Mehul Motani; | code |
| 593 | Fast Inference with Kronecker-Sparse Matrices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing GPU kernels for KS matrix multiplication suffer from high data movement costs, with up to 50% of time spent on memory-bound tensor permutations. We propose a fused, output-stationary GPU kernel that eliminates these overheads, reducing global memory traffic threefold. |
Antoine Gonon; Léon Zheng; Pascal Carrivain; Tung Quoc Le; | code |
| 594 | Reconstructing Cell Lineage Trees from Phenotypic Features with Metric Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce *CellTreeQM*, a novel deep learning method based on transformer architectures that learns an embedding space with geometric properties optimized for tree-graph inference. By formulating the lineage reconstruction problem as tree-metric learning, we systematically explore weakly supervised training settings at different levels of information and present the *Cell Lineage Reconstruction Benchmark* to facilitate comprehensive evaluation. |
Da Kuang; GuanWen Qiu; Junhyong Kim; | code |
| 595 | NegMerge: Sign-Consensual Weight Merging for Machine Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method that utilizes all fine-tuned models trained with varying hyperparameters instead of a single selection. |
Hyo Seo Kim; Dongyoon Han; Junsuk Choe; | code |
| 596 | AMPO: Active Multi Preference Optimization for Self-play Preference Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Active Multi-Preference Optimization (AMPO), which combines on-policy generation, a multi-preference group-contrastive loss, and active subset selection. We release our datasets [here](https://huggingface.co/Multi-preference-Optimization). |
Taneesh Gupta; Rahul Madhavan; Xuchao Zhang; Chetan Bansal; Saravan Rajmohan; | code |
| 597 | HYGMA: Hypergraph Coordination Networks with Dynamic Grouping for Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel framework that integrates dynamic spectral clustering with hypergraph neural networks to enable adaptive group formation and efficient information processing in multi-agent systems. |
Chiqiang Liu; Dazi Li; | code |
| 598 | Pixel-level Certified Explanations Via Randomized Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels. |
Alaa Anani; Tobias Lorenz; Mario Fritz; Bernt Schiele; | code |
| 599 | ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent work has shown that caching intermediate features to be reused in subsequent inferences is an effective method to reduce latency in diffusion models. We extend this idea to real-time rendering and present ReFrame, which explores different caching policies to optimize trade-offs between quality and performance in rendering workloads. |
Lufei Liu; Tor M. Aamodt; | code |
| 600 | Linear Mode Connectivity Between Multiple Models Modulo Permutation Symmetries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we conduct a more detailed empirical analysis. |
Akira Ito; Masanori Yamada; Atsutoshi Kumagai; | code |
| 601 | Test-Time Selective Adaptation for Uni-Modal Distribution Shift in Multi-Modal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this research, we investigate the under-explored practical scenario of *uni-modal distribution shift*, where the distribution shift influences only one modality, leaving the others unchanged. |
MingCai Chen; Baoming Zhang; Zongbo Han; Wenyu Jiang; Yanmeng Wang; Shuai Feng; Yuntao Du; Bingkun Bao; | code |
| 602 | Guardians of Image Quality: Benchmarking Defenses Against Adversarial Attacks on Image Quality Metrics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Modern neural-network-based Image Quality Assessment (IQA) metrics are vulnerable to adversarial attacks, which can be exploited to manipulate search engine rankings, benchmark results, and content quality assessments, raising concerns about the reliability of IQA metrics in critical applications. This paper presents the first comprehensive study of IQA defense mechanisms in response to adversarial attacks on these metrics to pave the way for safer use of IQA metrics. |
Aleksandr Gushchin; Khaled Abud; Georgii Bychkov; Ekaterina Shumitskaya; Anna Chistyakova; Sergey Lavrushkin; Bader Rasheed; Kirill Malyshev; Dmitriy S. Vatolin; Anastasia Antsiferova; | code |
| 603 | How to Move Your Dragon: Text-to-Motion Synthesis for Large-Vocabulary Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motion synthesis for diverse object categories holds great potential for 3D content creation but remains underexplored due to two key challenges: (1) the lack of comprehensive motion datasets that include a wide range of high-quality motions and annotations, and (2) the absence of methods capable of handling heterogeneous skeletal templates from diverse objects. To address these challenges, we contribute the following: First, we augment the Truebones Zoo dataset—a high-quality animal motion dataset covering over 70 species—by annotating it with detailed text descriptions, making it suitable for text-based motion synthesis. |
Wonkwang Lee; Jongwon Jeong; Taehong Moon; Hyeon-Jong Kim; Jaehyeon Kim; Gunhee Kim; Byeong-Uk Lee; | code |
| 604 | Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to substantial variations in data quality, the fixed regularization strength often leads to a dilemma: Weak regularization strength fails to address extrapolation errors and value overestimation, while strong regularization strength shifts policy learning toward behavior cloning, impeding potential performance enabled by Bellman updates. To address this issue, we propose the selective state-adaptive regularization method for offline RL. |
Qin-Wen Luo; Ming-Kun Xie; Ye-Wen Wang; Sheng-Jun Huang; | code |
| 605 | Improving Memory Efficiency for Training KANs Via Meta Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The proposed method provides an alternative technique for training KANs that allows for greater scalability and extensibility, and narrows the training cost gap with MLPs reported in the original KAN paper. |
Zhangchi Zhao; Jun Shu; Deyu Meng; Zongben Xu; | code |
| 606 | Beyond Entropy: Region Confidence Proxy for Wild Test-Time Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel region-integrated method **ReCAP** that bypasses the lengthy process. |
Zixuan Hu; Yichun Hu; Xiaotong Li; SHIXIANG TANG; LINGYU DUAN; | code |
| 607 | MOGIC: Metadata-infused Oracle Guidance for Improved Extreme Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. |
Suchith Chidananda Prabhu; Bhavyajeet Singh; Anshul Mittal; Siddarth Asokan; Shikhar Mohan; Deepak Saini; Yashoteja Prabhu; Lakshya Kumar; Jian Jiao; Amit S; Niket Tandon; Manish Gupta; Sumeet Agarwal; Manik Varma; | code |
| 608 | A Recipe for Causal Graph Regression: Confounding Effects Revisited Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we reflect on the predictive power of confounders in graph-level regression, and generalize classification-specific causal intervention techniques to regression through a lens of contrastive learning. |
Yujia Yin; Tianyi Qu; Zihao Wang; Yifan Chen; | code |
| 609 | TabFSBench: Tabular Benchmark for Feature Shifts in Open Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, this paper conducts the first comprehensive study on feature shifts in tabular data and introduces the first **tab**ular **f**eature-**s**hift **bench**mark (TabFSBench). |
Zi-Jian Cheng; Ziyi Jia; Zhi Zhou; Yu-Feng Li; Lan-Zhe Guo; | code |
| 610 | Text-to-LoRA: Instant Transformer Adaption Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Fine-tuning techniques enable practitioners to adapt foundation models for many new applications but require expensive and lengthy training while being notably sensitive to hyperparameter choices. To overcome these limitations, we introduce Text-to-LoRA (T2L), a model capable of adapting large language models (LLMs) on the fly solely based on a natural language description of the target task. |
Rujikorn Charakorn; Edoardo Cetin; Yujin Tang; Robert Tjarko Lange; | code |
| 611 | SE(3)-Equivariant Diffusion Policy in Spherical Fourier Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Diffusion Policies are effective at learning closed-loop manipulation policies from human demonstrations but generalize poorly to novel arrangements of objects in 3D space, hurting real-world performance. To address this issue, we propose Spherical Diffusion Policy (SDP), an SE(3) equivariant diffusion policy that adapts trajectories according to 3D transformations of the scene. |
Xupeng Zhu; Fan Wang; Robin Walters; Jane Shi; | code |
| 612 | Unisoma: A Unified Transformer-based Solver for Multi-Solid Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel explicit modeling paradigm that incorporates factors influencing solid deformation through structured modules. |
Shilong Tao; Zhe Feng; Haonan Sun; Zhanxing Zhu; Yunhuai Liu; | code |
| 613 | TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current mode-seeking methods identify modes by breaking some dependency connections, but they rely heavily on local data characteristics and require case-by-case threshold settings or human intervention to be effective on different datasets. To address this issue, we introduce a novel concept called typicality, which explores locally defined dependencies from a global perspective to quantify how confidently a point can be regarded as a mode. |
Haowen Ma; Zhiguo Long; Hua Meng; | code |
| 614 | Targeted Unlearning with Single Layer Unlearning Gradient Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Single Layer Unlearning Gradient (SLUG) as an efficient method to unlearn targeted information by updating a single critical layer using a one-time gradient computation. |
Zikui Cai; Yaoteng Tan; M. Salman Asif; | code |
| 615 | Complete-Tree Space Favors Data-Efficient Link Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the challenge of limited link samples, we propose leveraging hierarchical modularity as a prior structure. We introduce complete-tree (CT) space, a discrete metric space with latent complete-tree structures, to formalize hierarchical modularity with an emphasis on its hierarchical permutation symmetry. |
Chi Gao; Lukai Li; Yancheng Zhou; Shangqi Guo; | code |
| 616 | All-atom Inverse Protein Folding Through Discrete Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Many inverse folding methods struggle to predict sequences for complexes that contain non-protein components, and perform poorly with complexes that adopt multiple structural states. To address these challenges, we present ADFLIP (All-atom Discrete FLow matching Inverse Protein folding), a generative model based on discrete flow-matching for designing protein sequences conditioned on all-atom structural contexts. |
Kai Yi; Kiarash Jamali; Sjors HW Scheres; | code |
| 617 | Analytical Lyapunov Function Discovery: An RL-based Generative Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these, we propose an end-to-end framework using transformers to construct analytical Lyapunov functions (local), which simplifies formal verification, enhances interpretability, and provides valuable insights for control engineers. |
Haohan Zou; Jie Feng; Hao Zhao; Yuanyuan Shi; | code |
| 618 | Unveiling AI’s Blind Spots: An Oracle for In-Domain, Out-of-Domain, and Adversarial Errors Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, transformer-based mentor models excel at predicting errors across various mentee architectures. Subsequently, we draw insights from these observations and develop an oracle mentor model, dubbed SuperMentor, that can outperform baseline mentors in predicting errors across different error types from the ImageNet-1K dataset. |
Shuangpeng Han; Mengmi Zhang; | code |
| 619 | Expressive Score-Based Priors for Distribution Matching with Geometry-Preserving Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While likelihood-based methods are a promising alternative, they often impose unnecessary biases through fixed priors or require explicit density models (e.g., flows) that can be challenging to train. We address this limitation by introducing a novel approach to training likelihood-based DM using expressive score-based prior distributions. |
Ziyu Gong; Jim Lim; David I. Inouye; | code |
| 620 | An End-to-End Model for Logits-Based Large Language Models Watermarking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel end-to-end logits perturbation method for watermarking LLM-generated text. |
Ka Him Wong; Jicheng Zhou; Jiantao Zhou; Yain-Whar Si; | code |
| 621 | LipsNet++: Unifying Filter and Controller Into A Policy Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose LipsNet++, a novel policy network with Fourier filter layer and Lipschitz controller layer to separately address both causes. |
Xujie Song; Liangfa Chen; Tong Liu; Wenxuan Wang; Yinuo Wang; Shentao Qin; Yinsong Ma; Jingliang Duan; Shengbo Eben Li; | code |
| 622 | Enhancing Certified Robustness Via Block Reflector Orthogonal Layers and Logit Annealing Loss Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a novel, efficient Block Reflector Orthogonal (BRO) layer that enhances the capability of orthogonal layers on constructing more expressive Lipschitz neural architectures. |
Bo-Han Lai; Pin-Han Huang; Bo-Han Kung; Shang-Tse Chen; | code |
| 623 | Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Graph-Assisted Stitching (GAS), a novel framework that formulates subgoal selection as a graph search problem rather than learning an explicit high-level policy. |
Seungho Baek; Taegeon Park; Jongchan Park; Seungjun Oh; Yusung Kim; | code |
| 624 | DCBM: Data-Efficient Visual Concept Bottleneck Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Data-efficient CBMs (DCBMs), which reduce the need for large sample sizes during concept generation while preserving interpretability. |
Katharina Prasse; Patrick Knab; Sascha Marton; Christian Bartelt; Margret Keuper; | code |
| 625 | The Price of Freedom: Exploring Expressivity and Runtime Tradeoffs in Equivariant Tensor Products Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we provide a careful, systematic analysis of a number of tensor product operations. |
YuQing Xie; Ameya Daigavane; Mit Kotak; Tess Smidt; | code |
| 626 | Return Capping: Sample Efficient CVaR Policy Gradient Optimisation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When optimising for conditional value at risk (CVaR) using policy gradients (PG), current methods rely on discarding a large proportion of trajectories, resulting in poor sample efficiency. We propose a reformulation of the CVaR optimisation problem by capping the total return of trajectories used in training, rather than simply discarding them, and show that this is equivalent to the original problem if the cap is set appropriately. |
Harry Mead; Clarissa Costen; Bruno Lacerda; Nick Hawes; | code |
| 627 | Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We formalize this machine learning problem and introduce alpha-covariance, a metric for evaluating robustness to such transformations. To tackle this task, we propose a dual-part token embedding strategy: a shared component ensures semantic consistency, while a randomized component maintains token distinguishability. |
İlker Işık; Ramazan Gokberk Cinbis; Ebru Aydin Gol; | code |
| 628 | CoDy: Counterfactual Explainers for Dynamic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Counterfactual explanation methods provide a promising solution by illustrating how modifications to input graphs can influence model predictions. To address this challenge, we present CoDy—Counterfactual Explainer for Dynamic Graphs—a model-agnostic, instance-level explanation approach that identifies counterfactual subgraphs to interpret TGNN predictions. |
Zhan Qu; Daniel Gomm; Michael Färber; | code |
| 629 | A Physics-Augmented Deep Learning Framework for Classifying Single Molecule Force Spectroscopy Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we both apply state-of-the-art machine learning models and present a novel deep learning model tailored to SMFS data. |
Cailong Hua; Sivaraman Rajaganapathy; Rebecca A Slick; Joseph Vavra; Joseph M. Muretta; James M. Ervasti; Murti Salapaka; | code |
| 630 | Ad-Hoc Human-AI Coordination Challenge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce the Ad-Hoc Human-AI Coordination Challenge (AH2AC2) to overcome the constraints of costly and difficult-to-reproduce human evaluations. |
Tin Dizdarević; Ravi Hammond; Tobias Gessler; Anisoara Calinescu; Jonathan Cook; Matteo Gallici; Andrei Lupu; Jakob Nicolaus Foerster; | code |
| 631 | A Closer Look at Multimodal Representation Collapse Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We further prove that cross-modal knowledge distillation implicitly disentangles such representations by freeing up rank bottlenecks in the student encoder, denoising the fusion-head outputs without negatively impacting the predictive features from either modality. Based on the above findings, we propose an algorithm that prevents modality collapse through explicit basis reallocation, with applications in dealing with missing modalities. |
Abhra Chaudhuri; Anjan Dutta; Tu Bui; Serban Georgescu; | code |
| 632 | CombiMOTS: Combinatorial Multi-Objective Tree Search for Dual-Target Molecule Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose CombiMOTS, a Pareto Monte Carlo Tree Search (PMCTS) framework that generates dual-target molecules. |
Thibaud Southiratn; Bonil Koo; Yijingxiu Lu; Sun Kim; | code |
| 633 | BounDr.E: Predicting Drug-likeness Via Biomedical Knowledge Alignment and EM-like One-Class Boundary Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce BounDr.E: a novel modeling of drug-likeness as a compact space surrounding approved drugs through a dynamic one-class boundary approach. |
Dongmin Bang; Inyoung Sung; Yinhua Piao; Sangseon Lee; Sun Kim; | code |
| 634 | Diverse Prototypical Ensembles Improve Robustness to Subpopulation Shift Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, we propose using an ensemble of diverse classifiers to adaptively capture risk associated with subpopulations. |
Nguyen Nhat Minh To; Paul F R Wilson; Viet Nguyen; Mohamed Harmanani; Michael Cooper; Fahimeh Fooladgar; Purang Abolmaesumi; Parvin Mousavi; Rahul Krishnan; | code |
| 635 | What Limits Bidirectional Model’s Generative Capabilities? A Uni-Bi-Directional Mixture-of-Expert Method For Bidirectional Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through systematic Transformer module evaluations, we discover the FFN layer is least affected by such dependence. Leveraging this discovery, we propose UBMoE-LLM, a novel Uni-Bi-directional Mixture-of-Experts LLM, which integrates the original unidirectional FFN with a bidirectionally fine-tuned FFN via unsupervised contrastive learning. |
Zuchao Li; Yonghua Hei; Qiwei Li; Lefei Zhang; Ping Wang; Hai Zhao; Baoyuan Qi; Liu Guoming; | code |
| 636 | Test-time Correlation Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we provide a theoretical analysis to investigate the feasibility of **T**est-time **C**orrelation **A**lignment (**TCA**), demonstrating that correlation alignment between high-certainty instances and test instances can enhance test performances with a theoretical guarantee. Based on this, we propose two simple yet effective algorithms: LinearTCA and LinearTCA+. |
Linjing You; Jiabao Lu; Xiayuan Huang; | code |
| 637 | Tensor Product Neural Networks for Functional ANOVA Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel neural network which guarantees a unique functional ANOVA decomposition and thus is able to estimate each component stably. |
Seokhun Park; Insung Kong; Yongchan Choi; Chanmoo Park; Yongdai Kim; | code |
| 638 | Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet, a critical gap persists: ‘conceptualization’—the ability to recognize and reason about the same concept despite variations in visual form, a basic ability of human reasoning. To address this challenge, we introduce the Visual Graph Arena (VGA), a dataset featuring six graph-based tasks designed to evaluate and improve AI systems’ capacity for visual abstraction. |
Zahra Babaiee; Peyman Kiasari; Daniela Rus; Radu Grosu; | code |
| 639 | Attributes Shape The Embedding Space of Face Recognition Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we observe a multiscale geometric structure emerging in the embedding space, influenced by interpretable facial (e.g., hair color) and image attributes (e.g., contrast). We propose a geometric approach to describe the dependence or invariance of FR models to these attributes and introduce a physics-inspired alignment metric. |
Pierrick Leroy; Antonio Mastropietro; Marco Nurisso; Francesco Vaccarino; | code |
| 640 | ADIOS: Antibody Development Via Opponent Shaping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To demonstrate the value of ADIOS, we build a viral evolution simulator using the Absolut! |
Sebastian Rene Towers; Aleksandra Kalisz; Philippe A. Robert; Alicia Higueruelo; Francesca Vianello; Chloe Ming-Han Tsai; Harrison Steel; Jakob Nicolaus Foerster; | code |
| 641 | Flow-of-Options: Diversified and Improved LLM Reasoning By Thinking Through Options Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a novel reasoning approach called Flow-of-Options (FoO), designed to address intrinsic biases in Large Language Models (LLMs). |
Lakshmi Nair; Ian Trase; J. Mark Kim; | code |
| 642 | SCENIR: Visual Semantic Clarity Through Unsupervised Scene Graph Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recognizing the lack of semantic understanding as a key limitation, we propose a novel scene graph-based retrieval framework that emphasizes semantic content over superficial image characteristics. |
Nikolaos Chaidos; Angeliki Dimitriou; Maria Lymperaiou; Giorgos Stamou; | code |
| 643 | IN2V: Bringing Transductive Node Embeddings to Inductive Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Embedding methods like N2V are limited in their application on new nodes, which restricts them to the transductive setting where the entire graph, including the test nodes, is available during training. We propose inductive node2vec (iN2V), which combines a post-hoc procedure to compute embeddings for nodes unseen during training and modifications to the original N2V training procedure to prepare the embeddings for this post-hoc procedure. |
Nicolas Lell; Ansgar Scherp; | code |
| 644 | Aggregation Buffer: Revisiting DropEdge with A New Parameter Block Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on this analysis, we propose **Aggregation Buffer**, a parameter block specifically designed to improve the robustness of GNNs by addressing the limitation of DropEdge. |
Dooho Lee; Myeong Kong; Sagad Hamid; Cheonwoo Lee; Jaemin Yoo; | code |
| 645 | Enhancing Visual Localization with Cross-Domain Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel cross-domain data generation method to enhance visual localization methods. |
Yuanze Wang; Yichao Yan; Shiming Song; Songchang Jin; Yilan Huang; Xingdong Sheng; Dianxi Shi; | code |
| 646 | Diversifying Robot Locomotion Behaviors with Extrinsic Behavioral Curiosity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Imitation learning (IL) has shown promise in robot locomotion but is often limited to learning a single expert policy, constraining behavior diversity and robustness in unpredictable real-world scenarios. To address this, we introduce Quality Diversity Inverse Reinforcement Learning (QD-IRL), a novel framework that integrates quality-diversity optimization with IRL methods, enabling agents to learn diverse behaviors from limited demonstrations. |
Zhenglin Wan; Xingrui Yu; David Mark Bossens; Yueming Lyu; Qing Guo; Flint Xiaofeng Fan; Yew-Soon Ong; Ivor Tsang; | code |
| 647 | NTK-DFL: Enhancing Decentralized Federated Learning in Heterogeneous Settings Via Neural Tangent Kernel Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an approach leveraging the NTK to train client models in the decentralized setting, while introducing a synergy between NTK-based evolution and model averaging. |
Gabriel Thompson; Kai Yue; Chau-Wai Wong; Huaiyu Dai; | code |
| 648 | Reinforcement Learning for Quantum Control Under Physical Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We devise a physics-constrained Reinforcement Learning (RL) algorithm that restricts the space of possible solutions. |
Jan Ole Ernst; Aniket Chatterjee; Tim Franzmeyer; Axel Kuhn; | code |
| 649 | Improved Expressivity of Hypergraph Neural Networks Through High-Dimensional Generalized Weisfeiler-Leman Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Current algorithms for hypergraphs, like the 1-dimensional generalized Weisfeiler-Lehman test (1-GWL), lag behind advancements in graph isomorphism tests, limiting most hypergraph neural networks to 1-GWL’s expressive power. To address this, we propose the high-dimensional GWL (k-GWL), generalizing k-WL from graphs to hypergraphs. |
Detian Zhang; Chengqiang Zhang; Yanghui Rao; Li Qing; Chunjiang Zhu; | code |
| 650 | Reasoning Limitations of Multimodal Large Language Models. A Case Study of Bongard Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite some successes on real-world datasets, MLLMs struggle with synthetic BPs. To explore this gap, we introduce Bongard-RWR, a dataset representing synthetic BP concepts using real-world images. |
Mikołaj Małkiński; Szymon Pawlonka; Jacek Mańdziuk; | code |
| 651 | Global Curvature for Second-order Optimization of Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a theory that predicts the *exact* structure of the global curvature by leveraging the intrinsic symmetries of neural networks, such as invariance under parameter permutations. |
Alberto Bernacchia; | code |
| 652 | MTSTRec: Multimodal Time-Aligned Shared Token Recommender Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing item ID-based methods and multimodal models often overlook the temporal alignment of modalities like textual descriptions, visual content, and prices in user browsing sequences. To address this limitation, this paper proposes the Multimodal Time-aligned Shared Token Recommender (MTSTRec), a transformer-based framework with a single time-aligned shared token per product for efficient cross-modality fusion. |
Ming-Yi Hong; Yen-Jung Hsu; Miao-Chen Chiang; Che Lin; | code |
| 653 | Curvature-aware Graph Attention for PDEs on Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a Curvature-aware Graph Attention for PDEs on manifolds by exploring the important intrinsic geometric quantities such as curvature and discrete gradient operator. |
Yunfeng Liao; Jiawen Guan; Xiucheng Li; | code |
| 654 | Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional robust methods in multi-agent reinforcement learning (MARL) often struggle against coordinated adversarial attacks in cooperative scenarios. To address this limitation, we propose the Wolfpack Adversarial Attack framework, inspired by wolf hunting strategies, which targets an initial agent and its assisting agents to disrupt cooperation. |
Sunwoo Lee; Jaebak Hwang; Yonghyeon Jo; Seungyul Han; | code |
| 655 | STD-FD: Spatio-Temporal Distribution Fitting Deviation for AIGC Forgery Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Spatio-Temporal Distribution Fitting Deviation (STD-FD) for AIGC forgery detection, which explores the generative process in detail. |
Hengrui Lou; Zunlei Feng; Jinsong Geng; Erteng Liu; Jie Lei; Lechao Cheng; Jie Song; Mingli Song; Yijun Bei; | code |
| 656 | Fleet of Agents: Coordinated Problem Solving with Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Fleet of Agents (FoA), a novel and intuitive yet principled framework utilizing LLMs as agents to navigate through dynamic tree searches, employing a genetic-type particle filtering approach. |
Lars Henning Klein; Nearchos Potamitis; Roland Aydin; Robert West; Caglar Gulcehre; Akhil Arora; | code |
| 657 | Learning from True-False Labels Via Multi-modal Prompt Retrieving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel weakly supervised labeling setting, namely **T**rue-**F**alse **L**abels (TFLs), which can achieve high accuracy when generated by VLMs. |
Zhongnian Li; Jinghao Xu; Peng Ying; Meng Wei; Xinzheng Xu; | code |
| 658 | Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training Via Symmetric Policy Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel off-policy method that eliminates the need for additional environmental interactions by reformulating adversarial learning as a soft-constrained optimization problem. |
Kosuke Nakanishi; Akihiro Kubo; Yuji Yasui; Shin Ishii; | code |
| 659 | Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While context-based meta-RL methods improve task representation using task latents, they often struggle with out-of-distribution (OOD) tasks. To address this, we propose Task-Aware Virtual Training (TAVT), a novel algorithm that accurately captures task characteristics for both training and OOD scenarios using metric-based representation learning. |
Jeongmo Kim; Yisak Park; Minung Kim; Seungyul Han; | code |
| 660 | Hypo3D: Exploring Hypothetical Reasoning in 3D Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce *Hypothetical 3D Reasoning*, namely Hypo3D, a benchmark designed to evaluate models’ ability to reason without access to real-time scene data. |
Ye Mao; Weixun Luo; Junpeng Jing; Anlan Qiu; Krystian Mikolajczyk; | code |
| 661 | FlexiClip: Locality-Preserving Free-Form Character Animation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Similarly, text-to-video (T2V) and image-to-video (I2V) models struggle to handle clipart due to the mismatch in statistical properties between natural video and clipart styles. This paper introduces FlexiClip, a novel approach designed to overcome these limitations by addressing the intertwined challenges of temporal consistency and geometric integrity. |
Anant Khandelwal; | code |