Paper Digest: NeurIPS 2025 Papers & Highlights
Note: NeurIPS 2025 accepted more than 5,000 papers; this page includes only the oral and spotlight accepts. Interested readers can browse all 5,300 NeurIPS 2025 papers on a separate page.
To search for papers presented at NeurIPS 2025 on a specific topic, use the search by venue (NIPS-2025) service. To summarize the latest research published at NeurIPS 2025 on a specific topic, use the review by venue (NIPS-2025) service. If you prefer to browse papers by author, we maintain a comprehensive list of ~21,000 authors (NIPS-2025). You may also want to explore our “Best Paper” Digest (NeurIPS), which lists the most influential NeurIPS papers since 1987.
This curated list was created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an AI-powered research platform that delivers personalized, comprehensive daily digests of the latest research in your field. It also lets you read articles, write articles, get answers, conduct literature reviews, and generate research reports.
Experience the full potential of our services today!
TABLE 1: Paper Digest: NeurIPS 2025 Papers & Highlights
| # | Paper & Highlight | Author(s) |
|---|---|---|
| 1 | Generalized Linear Mode Connectivity for Transformers. Highlight: In this work, we introduce a unified framework that captures four symmetry classes—permutations, semi-permutations, orthogonal transformations, and general invertible maps—broadening the set of valid reparameterizations and subsuming many previous approaches as special cases. | Alexander Theus; Alessandro Cabodi; Sotiris Anagnostidis; Antonio Orvieto; Sidak Pal Singh; Valentina Boeva |
| 2 | Deep Compositional Phase Diffusion for Long Motion Sequence Generation. Highlight: However, when employing these models to create composite sequences containing multiple semantically generated motion clips, they often struggle to preserve the continuity of motion dynamics at the transition boundaries between clips, resulting in awkward transitions and abrupt artifacts. To address these challenges, we present Compositional Phase Diffusion, which leverages the Semantic Phase Diffusion Module (SPDM) and Transitional Phase Diffusion Module (TPDM) to progressively incorporate semantic guidance and phase details from adjacent motion clips into the diffusion process. | Ho Yin Au; Jie Chen; Junkun Jiang; Jingyu Xiang |
| 3 | GnnXemplar: Exemplars to Explanations – Natural Language Rules for Global GNN Interpretability. Highlight: We propose GnnXemplar, a novel global explainer inspired by Exemplar Theory from cognitive science. | Burouj Armgaan; Eshan Jain; Harsh Pandey; Mahesh Chandran; Sayan Ranu |
| 4 | RAG4GFM: Bridging Knowledge Gaps in Graph Foundation Models Through Graph Retrieval Augmented Generation. Highlight: We propose RAG4GFM, an end-to-end framework that seamlessly integrates multi-level graph indexing, task-aware retrieval, and graph fusion enhancement. | Xingliang Wang; Zemin Liu; Junxiao Han; Shuiguang Deng |
| 5 | Agnostic Active Learning Is Always Better Than Passive Learning. Highlight: We sharply characterize the optimal first-order query complexity of agnostic active learning for all concept classes, and propose a new general active learning algorithm which achieves it. | Steve Hanneke |
| 6 | Learning Linear Attention in Polynomial Time. Highlight: We show that learning the optimal multi-head linear attention can be recast as finding the optimal kernel predictor in a suitably defined RKHS. Moving to generalization, we construct an algorithm that, given a dataset, checks in polynomial time whether the set of best-fit multi-head linear attention networks on this data all perform an identical computation, a powerful notion for out-of-distribution generalization. | Morris Yau; Ekin Akyürek; Jiayuan Mao; Joshua B. Tenenbaum; Stefanie Jegelka; Jacob Andreas |
| 7 | Optimal Mistake Bounds for Transductive Online Learning. Highlight: We resolve a 30-year-old open problem concerning the power of unlabeled data in online learning by tightly quantifying the gap between transductive and standard online learning. | Zachary Chase; Steve Hanneke; Shay Moran; Jonathan Shafer |
| 8 | State Entropy Regularization for Robust Reinforcement Learning. Highlight: In this paper, we show that state entropy regularization improves robustness to structured and spatially correlated perturbations. | Yonatan Ashlag; Uri Koren; Mirco Mutti; Esther Derman; Pierre-Luc Bacon; Shie Mannor |
| 9 | On The Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity. Highlight: In this work, we rule out the noisy nature of the loss as a key factor driving generalization in flow matching. | Quentin Bertrand; Anne Gagneux; Mathurin Massias; Rémi Emonet |
| 10 | Why Diffusion Models Don’t Memorize: The Role of Implicit Dynamical Regularization in Training. Highlight: In this work, we investigate the role of the training dynamics in the transition from generalization to memorization. | Tony Bonnaire; Raphaël Urfin; Giulio Biroli; Marc Mezard |
| 11 | Adjoint Schrödinger Bridge Sampler. Highlight: In this work, we propose **Adjoint Schrödinger Bridge Sampler (ASBS)**, a new diffusion sampler that employs simple and scalable matching-based objectives without the need to estimate target samples during training. | Guan-Horng Liu; Jaemoo Choi; Yongxin Chen; Benjamin Kurt Miller; Ricky T. Q. Chen |
| 12 | Breaking The Performance Ceiling in Reinforcement Learning Requires Inference Strategies. Highlight: Meanwhile, many digital or simulation-based applications allow for an inference phase that utilises a specific time and compute budget to explore multiple attempts before outputting a final solution. In this work, we show that such an inference phase employed at execution time, and the choice of a corresponding inference strategy, are key to breaking the performance ceiling observed in complex multi-agent RL problems. | Felix Chalumeau; Daniel Rajaonarivonivelomanantsoa; Ruan John de Kock; Juan Claude Formanek; Sasha Abramowitz; Omayma Mahjoub; Wiem Khlifi; Simon Verster Du Toit; Louay Ben Nessir; Refiloe Shabe; Arnol Manuel Fokam; Siddarth Singh; Ulrich Armel Mbou Sob; Arnu Pretorius |
| 13 | High-Dimensional Calibration from Swap Regret. Highlight: We study the online calibration of multi-dimensional forecasts over an arbitrary convex set $\mathcal{P} \subset \mathbb{R}^d$ relative to an arbitrary norm $\Vert\cdot\Vert$. We connect this with the problem of external regret minimization for online linear optimization, showing that if it is possible to guarantee $O(\sqrt{\rho T})$ worst-case regret after $T$ rounds when actions are drawn from $\mathcal{P}$ and losses are drawn from the dual $\Vert \cdot \Vert_*$ unit norm ball, then it is also possible to obtain $\epsilon$-calibrated forecasts after $T = \exp(O(\rho /\epsilon^2))$ rounds. | Maxwell Fishelson; Noah Golowich; Mehryar Mohri; Jon Schneider |
| 14 | In Search of Adam’s Secret Sauce. Highlight: In this work, we conduct an extensive empirical study — training over 1,500 language models across different data configurations and scales — comparing Adam to several known simplified variants. | Antonio Orvieto; Robert M. Gower |
| 15 | An Optimized Franz-Parisi Criterion and Its Equivalence with SQ Lower Bounds. Highlight: In this paper, we propose a refined FP criterion that aims to better capture the geometric “overlap structure” of statistical models. | Siyu Chen; Theodor Misiakiewicz; Ilias Zadik; Peiyuan Zhang |
| 16 | MaxSup: Overcoming Representation Collapse in Label Smoothing. Highlight: In this paper, we analytically decompose the LS-induced loss, exposing two key terms: (i) a regularization term that dampens overconfidence only when the prediction is correct, and (ii) an error-amplification term that arises under misclassifications. | Yuxuan Zhou; Heng Li; Zhi-Qi Cheng; Xudong Yan; Yifei Dong; Mario Fritz; Margret Keuper |
| 17 | Memory Mosaics at Scale. Highlight: To this end, we scale memory mosaics to 10B size, train them on one trillion tokens, introduce a couple of architectural modifications (*memory mosaics v2*), and assess their capabilities across three evaluation dimensions: training-knowledge storage, new-knowledge storage, and in-context learning. | Jianyu Zhang; Leon Bottou |
| 18 | The Emergence of Sparse Attention: Impact of Data Distribution and Benefits of Repetition. Highlight: Despite initial studies, we still lack a comprehensive understanding of how and when these abilities emerge. To address this gap, we study the emergence over training of sparse attention, a critical and frequently observed attention pattern in Transformers. | Nicolas Zucchet; Francesco D’Angelo; Andrew Kyle Lampinen; Stephanie C.Y. Chan |
| 19 | ControlFusion: A Controllable Image Fusion Network with Language-Vision Degradation Prompts. Highlight: Current image fusion methods struggle with real-world composite degradations and lack the flexibility to accommodate user-specific needs. To address this, we propose ControlFusion, a controllable fusion network guided by language-vision prompts that adaptively mitigates composite degradations. | Linfeng Tang; Yeda Wang; Zhanchuan Cai; Junjun Jiang; Jiayi Ma |
| 20 | Identifiability of Deep Polynomial Neural Networks. Highlight: In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. | Konstantin Usevich; Ricardo Augusto Borsoi; Clara Dérand; Marianne Clausel |
| 21 | Understanding and Mitigating Numerical Sources of Nondeterminism in LLM Inference. Highlight: This work presents the first systematic investigation into how numerical precision affects reproducibility in LLM inference. | Jiayi Yuan; Hao Li; Xinheng Ding; Wenya Xie; Yu-Jhe Li; Wentian Zhao; Kun Wan; Jing Shi; Xia Hu; Zirui Liu |
| 22 | PRIMT: Preference-based Reinforcement Learning with Multimodal Feedback and Trajectory Synthesis from Foundation Models. Highlight: However, its effectiveness is often limited by two critical challenges: the reliance on extensive human input and the inherent difficulties in resolving query ambiguity and credit assignment during reward learning. In this paper, we introduce PRIMT, a PbRL framework designed to overcome these challenges by leveraging foundation models (FMs) for multimodal synthetic feedback and trajectory synthesis. | Ruiqi Wang; Dezhong Zhao; Ziqin Yuan; Tianyu Shao; Guohua Chen; Dominic Kao; Sungeun Hong; Byung-Cheol Min |
| 23 | A Is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders. Highlight: We introduce a metric to detect absorption in SAEs, and validate our findings empirically on hundreds of LLM SAEs. | David Chanin; James Wilken-Smith; Tomáš Dulka; Hardik Bhatnagar; Satvik Golechha; Joseph Isaac Bloom |
| 24 | EvoLM: In Search of Lost Language Model Training Dynamics. Highlight: We present EvoLM, a model suite that enables systematic and transparent analysis of LMs’ training dynamics across pre-training, continued pre-training, supervised fine-tuning, and reinforcement learning. To facilitate open research and reproducibility, we release all pre-trained and post-trained models, training datasets for all stages, and our entire training and evaluation pipeline. | Zhenting Qi; Fan Nie; Alexandre Alahi; James Zou; Himabindu Lakkaraju; Yilun Du; Eric P. Xing; Sham M. Kakade; Hanlin Zhang |
| 25 | Analog In-memory Training on General Non-ideal Resistive Elements: The Impact of Response Functions. Highlight: To overcome the issue, we propose a residual learning algorithm, which provably converges exactly to a critical point by solving a bilevel optimization problem. | Zhaoxian Wu; Quan Xiao; Tayfun Gokmen; Omobayode Fagbohungbe; Tianyi Chen |
| 26 | Discovering Opinion Intervals from Conflicts in Signed Graphs. Highlight: In this paper, we ask whether the conflicts in a network reveal a small and interpretable set of prevalent opinion ranges that explain the users’ interactions. We introduce an optimization problem that models this question, and we give strong hardness results and a polynomial-time approximation scheme by utilizing connections to interval graphs and the Correlation Clustering problem. | Peter Blohm; Florian Chen; Aristides Gionis; Stefan Neumann |
| 27 | A Clean Slate for Offline Reinforcement Learning. Highlight: To resolve opaque algorithmic design, we provide clean, minimalistic, single-file implementations of various model-free and model-based offline RL methods, significantly enhancing clarity and achieving substantial speed-ups. Leveraging these streamlined implementations, we propose Unifloral, a unified algorithm that encapsulates diverse prior approaches and enables development within a single, comprehensive hyperparameter space. | Matthew Thomas Jackson; Uljad Berdica; Jarek Luca Liesen; Shimon Whiteson; Jakob Nicolaus Foerster |
| 28 | Spectral Perturbation Bounds for Low-Rank Approximation with Applications to Privacy. Highlight: We establish new high-probability spectral-norm perturbation bounds for symmetric matrices that refine the classical Eckart–Young–Mirsky theorem and explicitly capture interactions between a matrix $A \in \mathbb{R}^{n \times n}$ and an arbitrary symmetric perturbation $E$. | Phuc Tran; Van Vu; Nisheeth K. Vishnoi |
| 29 | Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization. Highlight: This paper addresses the Bayesian optimization problem (also referred to as the Bayesian setting of the Gaussian process bandit), where the learner seeks to minimize the regret under a function drawn from a known Gaussian process (GP). | Shogo Iwazaki |
| 30 | Auto-Compressing Networks. Highlight: We introduce Auto-Compressing Networks (ACNs), an architectural variant where additive long feedforward connections from each layer to the output replace traditional short residual connections. | Vaggelis Dorovatas; Georgios Paraskevopoulos; Alexandros Potamianos |
| 31 | MokA: Multimodal Low-Rank Adaptation for MLLMs. Highlight: In this paper, we reveal that most current efficient multimodal fine-tuning methods are hindered by a key limitation: they are directly borrowed from LLMs, often neglecting the intrinsic differences of multimodal scenarios and even affecting the full utilization of all modalities. | Yake Wei; Yu Miao; Dongzhan Zhou; Di Hu |
| 32 | Advancing Expert Specialization for Better MoE. Highlight: However, we observe that the commonly used auxiliary load balancing loss often leads to expert overlap and overly uniform routing, which hinders expert specialization and degrades overall performance during post-training. To address this, we propose a simple yet effective solution that introduces two complementary objectives: (1) an orthogonality loss to encourage experts to process distinct types of tokens, and (2) a variance loss to encourage more discriminative routing decisions. | Hongcan Guo; Haolang Lu; Guoshun Nan; Bolun Chu; Jialin Zhuang; Yuan Yang; Wenhao Che; Xinye Cao; Sicong Leng; Qimei Cui; Xudong Jiang |
| 33 | From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics. Highlight: Inspired by empirical evidence showing improved reasoning capabilities under small initialization scales in language models, we employ the gradient flow analytical framework established by Zhou et al. (2022) to systematically investigate linearized Transformer training dynamics. | Zheng-An Chen; Tao Luo |
| 34 | Large Language Diffusion Models. Highlight: The capabilities of large language models (LLMs) are widely regarded as relying on autoregressive models (ARMs). We challenge this notion by introducing *LLaDA*, a diffusion model trained from scratch under the pre-training and supervised fine-tuning (SFT) paradigm. | Shen Nie; Fengqi Zhu; Zebin You; Xiaolu Zhang; Jingyang Ou; Jun Hu; Jun Zhou; Yankai Lin; Ji-Rong Wen; Chongxuan Li |
| 35 | Boosting Knowledge Utilization in Multimodal Large Language Models Via Adaptive Logits Fusion and Attention Reallocation. Highlight: To this end, we design Adaptive Logits Fusion and Attention Reallocation (ALFAR), a training-free and plug-and-play approach that improves MLLM responses by maximizing the utility of the retrieved knowledge. | Wenbin An; Jiahao Nie; Feng Tian; Haonan Lin; Mingxiang Cai; Yaqiang Wu; QianYing Wang; Xiaoqin Zhang; Shijian Lu |
| 36 | Interactive Cross-modal Learning for Text-3D Scene Retrieval. Highlight: In practical deployments, however, limited by the capabilities of users and models, it is difficult or even impossible to directly obtain a perfect textual query suiting the entire scene and model, thereby leading to performance degradation. To address this issue, we propose a novel Interactive Text-3D Scene Retrieval Method (IDeal), which promotes the enhancement of the alignment between texts and 3D scenes through continuous interaction. | Yanglin Feng; Yongxiang Li; Yuan Sun; Yang Qin; Dezhong Peng; Peng Hu |
| 37 | Rethinking Joint Maximum Mean Discrepancy for Visual Domain Adaptation. Highlight: In domain adaptation (DA), joint maximum mean discrepancy (JMMD), a well-known distribution-distance metric, measures the difference between the joint probability distributions of the source and target domains; however, it remains underexplored and is especially hard to apply in a subspace-learning framework, since its empirical estimation involves a tensor-product operator whose partial derivative is difficult to obtain. To solve this issue, we derive a concise JMMD based on the Representer theorem that avoids the tensor-product operator, and we obtain two essential findings. | Wei Wang; Haifeng Xia; Chao Huang; Zhengming Ding; Cong Wang; Haojie Li; Xiaochun Cao |
| 38 | Pan-LUT: Efficient Pan-sharpening Via Learnable Look-Up Tables. Highlight: This excessive computational demand limits the applicability of these methods in real-world scenarios, particularly in the absence of dedicated computing devices such as GPUs and TPUs. To address these challenges, we propose Pan-LUT, a novel learnable look-up table (LUT) framework for pan-sharpening that strikes a balance between performance and computational efficiency for large remote sensing images. | Zhongnan Cai; Yingying Wang; Hui Zheng; Panwang Pan; ZiXu Lin; Ge Meng; Chenxin Li; Chunming He; Jiaxin Xie; Yunlong Lin; Junbin Lu; Yue Huang; Xinghao Ding |
| 39 | Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks. Highlight: We study the learning dynamics of large two-layer neural networks via dynamical mean field theory, a well established technique of non-equilibrium statistical physics. | Andrea Montanari; Pierfrancesco Urbani |
| 40 | 1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities. Highlight: In this paper, we study building blocks for self-supervised RL that unlock substantial improvements in scalability, with network depth serving as a critical factor. | Kevin Wang; Ishaan Javali; Michał Bortkiewicz; Tomasz Trzcinski; Benjamin Eysenbach |
| 41 | Depth-Bounds for Neural Networks Via The Braid Arrangement. Highlight: We contribute towards resolving the open question of how many hidden layers are required in ReLU networks for exactly representing all continuous and piecewise linear functions on $\mathbb{R}^d$. | Moritz Leo Grillo; Christoph Hertrich; Georg Loho |
| 42 | Tighter CMI-Based Generalization Bounds Via Stochastic Projection and Quantization. Highlight: In this paper, we leverage stochastic projection and lossy compression to establish new conditional mutual information (CMI) bounds on the generalization error of statistical learning algorithms. | Milad Sefidgaran; Kimia Nadjahi; Abdellatif Zaidi |
| 43 | A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning. Highlight: In this paper, we initiate the study of data attribution for online RL, focusing on the widely used Proximal Policy Optimization (PPO) algorithm. | Yuzheng Hu; Fan Wu; Haotian Ye; David Forsyth; James Zou; Nan Jiang; Jiaqi W. Ma; Han Zhao |
| 44 | High-dimensional Neuronal Activity from Low-dimensional Latent Dynamics: A Solvable Model. Highlight: To demonstrate that low-dimensional latent dynamics and high-dimensional activity can be two sides of the same coin, we present an analytically solvable recurrent neural network (RNN) model whose dynamics can be exactly reduced to a low-dimensional dynamical system, but generates an activity manifold that has a high linear embedding dimension. | Valentin Schmutz; Ali Haydaroğlu; Shuqi Wang; Yixiao Feng; Matteo Carandini; Kenneth D. Harris |
| 45 | Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks. Highlight: Spiking Neural Networks (SNNs) represent a promising algorithmic approach for these systems, yet their application to complex control tasks faces two critical challenges: (1) the non-differentiable nature of spiking neurons necessitates surrogate gradients with unclear optimization properties, and (2) the stateful dynamics of SNNs require training on sequences, which in reinforcement learning (RL) is hindered by limited sequence lengths during early training, preventing the network from bridging its warm-up period. We address these challenges by systematically analyzing surrogate gradient slope settings, showing that shallower slopes increase gradient magnitude in deeper layers but reduce alignment with true gradients. | Korneel Van den Berghe; Stein Stroobants; Vijay Janapa Reddi; Guido De Croon |
| 46 | Class-wise Balancing Data Replay for Federated Class-Incremental Learning. Highlight: However, their performance is typically limited by class imbalance, both within the replay buffer due to limited global awareness and between replayed and newly arrived classes. To address this issue, we propose a class-wise balancing data replay method for FCIL (FedCBDR), which employs a global coordination mechanism for class-level memory construction and reweights the learning objective to alleviate the aforementioned imbalances. | Zhuang Qi; Ying-Peng Tang; Lei Meng; Han Yu; Xiaoxiao Li; Xiangxu Meng |
| 47 | Task-Optimized Convolutional Recurrent Networks Align with Tactile Processing in The Rodent Brain. Highlight: Tactile sensing remains far less understood in neuroscience and less effective in artificial systems compared to more mature modalities such as vision and language. We bridge these gaps by introducing a novel Encoder-Attender-Decoder (EAD) framework to systematically explore the space of task-optimized temporal neural networks trained on realistic tactile input sequences from a customized rodent whisker-array simulator. | Trinity Chung; Yuchen Shen; Nathan Kong; Aran Nayebi |
| 48 | ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism. Highlight: Current tightly coupled serving architectures struggle to distinguish between mixed request types or adapt parallelism strategies to different inference stages, leading to increased time-to-first-token (TTFT) and poor resource utilization. To address this, we introduce Elastic Multimodal Parallelism (EMP), a new serving paradigm that elastically adapts to resource heterogeneity across request types and inference stages. | Zedong Liu; Shenggan Cheng; Guangming Tan; Yang You; Dingwen Tao |
| 49 | Dynamical Low-Rank Compression of Neural Networks with Robustness Under Adversarial Attacks. Highlight: In this work, we introduce a dynamical low-rank training scheme enhanced with a novel spectral regularizer that controls the condition number of the low-rank core in each layer. | Steffen Schotthöfer; H. Lexie Yang; Stefan Schnake |
| 50 | QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training. Highlight: Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. | Wei Dai; Peilin Chen; Chanakya Ekbote; Paul Pu Liang |
| 51 | Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free. Highlight: In this work, we conduct comprehensive experiments to systematically investigate gating-augmented softmax attention variants. | Zihan Qiu; Zekun Wang; Bo Zheng; Zeyu Huang; Kaiyue Wen; Songlin Yang; Rui Men; Le Yu; Fei Huang; Suozhi Huang; Dayiheng Liu; Jingren Zhou; Junyang Lin |
| 52 | Learning Long Range Dependencies Through Time Reversal Symmetry Breaking. Highlight: We propose *Recurrent Hamiltonian Echo Learning* (RHEL), an algorithm which provably computes loss gradients as finite differences of physical trajectories of non-dissipative *Hamiltonian systems*. | Guillaume Pourcel; Maxence Ernoult |
| 53 | FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution. Highlight: We introduce FuXi-Ocean, the first data-driven global ocean forecasting model achieving six-hourly predictions at eddy-resolving 1/12° spatial resolution, reaching depths of up to 1500 meters. | Qiusheng Huang; Yuan Niu; Xiaohui Zhong; Anboyu Guo; Lei Chen; Dianjun Zhang; Xuefeng Zhang; Hao Li |
| 54 | Superposition Yields Robust Neural Scaling. Highlight: However, the origin of this neural scaling law, that loss decreases as a power law with model size, remains unclear. We propose that representation superposition, meaning that LLMs represent more features than they have dimensions, can be a key contributor to loss and cause neural scaling. | Yizhou Liu; Ziming Liu; Jeff Gore |
| 55 | ImageNet-trained CNNs Are Not Biased Towards Texture: Revisiting Feature Reliance Through Controlled Suppression. Highlight: We revisit this hypothesis by examining limitations in the cue-conflict experiment by Geirhos et al. To address these limitations, we propose a domain-agnostic framework that quantifies feature reliance through systematic suppression of shape, texture, and color cues, avoiding the confounds of forced-choice conflicts. | Tom Burgert; Oliver Stoll; Paolo Rota; Begüm Demir |
| 56 | On Linear Mode Connectivity of Mixture-of-Experts Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We begin by conducting a comprehensive analysis of both dense and sparse gating regimes, demonstrating that the symmetries inherent to MoE architectures are fully characterized by permutations acting on both the expert components and the gating function. Building on these foundational findings, we propose a matching algorithm that enables alignment between independently trained MoEs, thereby facilitating the discovery of LMC. |
Viet-Hoang Tran; Van-Hoan Trinh; Khanh Vinh Bui; Tan Minh Nguyen; |
| 57 | OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To synthesize physically plausible interactions, we propose an affordance-driven diffusion model paired with a training-free physics refinement stage that minimizes penetration and optimizes affordance alignment. |
Zhenhao Zhang; Ye Shi; Lingxiao Yang; Suting Ni; Qi Ye; Jingya Wang; |
| 58 | Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a straightforward method called *Representation Entanglement for Generation* (**REG**), which entangles low-level image latents with a single high-level class token from pretrained foundation models for denoising. |
Ge Wu; Shen Zhang; Ruijing Shi; Shanghua Gao; Zhenyuan Chen; Lei Wang; Zhaowei Chen; Hongcheng Gao; Yao Tang; Jian Yang; Ming-Ming Cheng; Xiang Li; |
| 59 | Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, these models still encounter the following challenges when applied to real-world 3D navigation: 1) Insufficient understanding of 3D geometry and spatial semantics; 2) Limited capacity for large-scale exploration and long-term environmental memory; 3) Poor adaptability to dynamic and changing environments. To address these limitations, we propose Dynam3D, a dynamic layered 3D representation model that leverages language-aligned, generalizable, and hierarchical 3D representations as visual input to train 3D-VLM in navigation action prediction. |
Zihan Wang; Seungjun Lee; Gim Hee Lee; |
| 60 | Learning (Approximately) Equivariant Networks Via Constrained Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Even when the data is fully symmetric, enforcing equivariance can hurt training by limiting the model to a restricted region of the parameter space. Guided by homotopy principles, where an optimization problem is solved by gradually transforming a simpler problem into a complex one, we introduce Adaptive Constrained Equivariance (ACE), a constrained optimization approach that starts with a flexible, non-equivariant model and gradually reduces its deviation from equivariance. |
Andrei Manolache; Luiz F. O. Chamon; Mathias Niepert; |
| 61 | SAGE: A Unified Framework for Generalizable Object State Recognition with State-Action Graph Embedding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SAGE (State-Action Graph Embeddings), a novel framework that offers a unified model of physical state transitions by decomposing states into fine-grained, language-described visual concepts that are sharable across different objects and actions. |
Yuan Zang; Zitian Tang; Junho Cho; Jaewook Yoo; Chen Sun; |
| 62 | Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond The Base Model? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we take a critical look at *the current state of RLVR* by systematically probing the reasoning capability boundaries of RLVR-trained LLMs across diverse model families, RL algorithms, and math/coding/visual reasoning benchmarks, using pass@*k* at large *k* values as the evaluation metric. |
Yang Yue; Zhiqi Chen; Rui Lu; Andrew Zhao; Zhaokai Wang; Yang Yue; Shiji Song; Gao Huang; |
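For context on the pass@*k* metric used in the study above: the standard unbiased estimator (introduced by Chen et al., 2021, for code generation) computes, from *n* samples of which *c* are correct, the probability that at least one of *k* drawn samples is correct. A minimal sketch of that combinatorial form:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k).

    n: total samples drawn, c: number of correct samples, k: budget.
    """
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: any k-subset must contain a success
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(100, 10, 1))   # ≈ 0.10
print(pass_at_k(100, 10, 50))  # close to 1 for large k
```

Evaluating at large *k*, as the paper does, probes whether any of the model's samples can solve a problem, rather than whether its single greedy answer does.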
| 63 | Learning to Learn with Contrastive Meta-Objective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose to exploit task identity as additional supervision in meta-training, inspired by the alignment and discrimination abilities intrinsic to humans' fast learning. |
Shiguang Wu; Yaqing Wang; Yatao Bian; Quanming Yao; |
| 64 | KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces *KVzip*, a query-agnostic KV cache eviction method enabling effective reuse of compressed KV caches across diverse queries. |
Jang-Hyun Kim; Jinuk Kim; Sangwoo Kwon; Jae W. Lee; Sangdoo Yun; Hyun Oh Song; |
| 65 | HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue that a key source of this inefficiency lies in the vision encoders these models are commonly equipped with, e.g., CLIP and SAM, which lack alignment with language at multiple granularity levels. To address this issue, in this paper, we leverage hyperbolic space, which inherently models hierarchical levels and thus provides a principled framework for bridging the granularity gap between visual and textual modalities at an arbitrary granularity level. |
Zelin Peng; Zhengqin Xu; Qingyang Liu; Xiaokang Yang; Wei Shen; |
| 66 | SAVVY: Spatial Awareness Via Audio-Visual LLMs Through Seeing and Hearing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: SAVVY-Bench comprises thousands of carefully curated question–answer pairs probing both directional and distance relationships involving static and moving objects, and requires fine-grained temporal grounding, consistent 3D localization, and multi-modal annotation. To tackle this challenge, we propose SAVVY, a novel training-free reasoning pipeline that consists of two stages: (i) Egocentric Spatial Tracks Estimation, which leverages AV-LLMs as well as other audio-visual methods to track the trajectories of key objects related to the query using both visual and spatial audio cues, and (ii) Dynamic Global Map Construction, which aggregates multi-modal queried object trajectories and converts them into a unified global dynamic map. |
Mingfei Chen; Zijun Cui; Xiulong Liu; Jinlin Xiang; Caleb Zheng; Jingyuan Li; Eli Shlizerman; |
| 67 | A Multiscale Analysis of Mean-field Transformers in The Moderate Interaction Regime Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the evolution of tokens through the depth of encoder-only transformer models at inference time by modeling them as a system of particles interacting in a mean-field way and studying the corresponding dynamics. |
Giuseppe Bruno; Federico Pasqualotto; Andrea Agazzi; |
| 68 | Exploring Diffusion Transformer Designs Via Grafting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To this end, we present *grafting*, a simple approach for editing pretrained diffusion transformers (DiTs) to materialize new architectures under small compute budgets. |
Keshigeyan Chandrasegaran; Michael Poli; Daniel Y Fu; Dongjun Kim; Lea M. Hadzic; Manling Li; Agrim Gupta; Stefano Massaroli; Azalia Mirhoseini; Juan Carlos Niebles; Stefano Ermon; Li Fei-Fei; |
| 69 | Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces a hybrid non-Euclidean optimization method which generalizes gradient norm clipping by combining steepest descent and conditional gradient approaches. |
Thomas Pethick; Wanyun Xie; Mete Erdogan; Kimon Antonakopoulos; Tony Silveti-Falls; Volkan Cevher; |
| 70 | Rethinking Multimodal Learning from The Perspective of Mitigating Classification Ability Disproportion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel multimodal learning approach to dynamically balance the classification ability of weak and strong modalities by incorporating the principle of boosting. |
Qing-Yuan Jiang; Longfei Huang; Yang Yang; |
| 71 | TransferTraj: A Vehicle Trajectory Learning Model for Region and Task Transferability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and require retraining of prediction modules for task transfer. To address these challenges, we propose *TransferTraj*, a vehicle GPS trajectory learning model that excels in both region and task transferability. |
Tonglong Wei; Yan Lin; Zeyu Zhou; Haomin Wen; Jilin Hu; Shengnan Guo; Youfang Lin; Gao Cong; Huaiyu Wan; |
| 72 | PhySense: Sensor Placement Optimization for Accurate Physics Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While deep learning has made rapid advances in sparse-data reconstruction, existing methods generally omit optimization of sensor placements, leaving the mutual enhancement between reconstruction and placement untapped. To change this suboptimal practice, we propose PhySense, a synergistic two-stage framework that learns to jointly reconstruct physical fields and to optimize sensor placements, both aiming for accurate physics sensing. |
Yuezhou Ma; Haixu Wu; Hang Zhou; Huikun Weng; Jianmin Wang; Mingsheng Long; |
| 73 | InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce InfinityStar, a unified spacetime autoregressive framework for high-resolution image and dynamic video synthesis. |
Jinlai Liu; Jian Han; Bin Yan; Wuhui; Fengda Zhu; Xing Wang; Yi Jiang; BINGYUE PENG; Zehuan Yuan; |
| 74 | Does Stochastic Gradient Really Succeed for Bandits? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, whether logarithmic regret holds beyond small learning rates remains unclear. In this work, we take a step towards characterizing the regret *regimes* of SGB as a function of its learning rate. |
Dorian Baudry; Emmeran Johnson; Simon Vary; Ciara Pike-Burke; Patrick Rebeschini; |
| 75 | Perception Encoder: The Best Visual Embeddings Are Not at The Output of The Network Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To draw them out, we introduce two alignment methods: language alignment for multimodal language modeling, and spatial alignment for dense prediction. We release our models, code, and novel dataset of synthetically and human-annotated videos: https://github.com/facebookresearch/perception_models |
Daniel Bolya; Po-Yao Huang; Peize Sun; Jang Hyun Cho; Andrea Madotto; Chen Wei; Tengyu Ma; Jiale Zhi; Jathushan Rajasegaran; Hanoona Abdul Rasheed; Junke Wang; Marco Monteiro; Hu Xu; Shiyu Dong; Nikhila Ravi; Shang-Wen Li; Piotr Dollar; Christoph Feichtenhofer; |
| 76 | PlayerOne: Egocentric World Simulator Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce PlayerOne, the first egocentric realistic world simulator, facilitating immersive and unrestricted exploration within vividly dynamic environments. |
Yuanpeng Tu; Hao Luo; Xi Chen; Xiang Bai; Fan Wang; Hengshuang Zhao; |
| 77 | Mean Flows for One-step Generative Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a principled and effective framework for one-step generative modeling. |
Zhengyang Geng; Mingyang Deng; Xingjian Bai; J Zico Kolter; Kaiming He; |
| 78 | ModHiFi: Identifying High Fidelity Predictive Components for Model Modification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While techniques exist for such modification, they often require training data, are computationally expensive, or are architecture-specific. To address this, we investigate the fundamental question of identifying components that are critical to the model's predictive performance, without access to either gradients or the loss function, and with only distributional access such as synthetic data. |
Dhruva Kashyap; Chaitanya Murti; Pranav K Nayak; Tanay Narshana; Chiranjib Bhattacharyya; |
| 79 | The Structure of Relation Decoding Linear Operators in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper investigates the structure of linear operators introduced in Hernandez et al. [2023] that decode specific relational facts in transformer language models. |
Miranda Anna Christ; Adrián Csiszárik; Gergely Becsó; Dániel Varga; |
| 80 | KLASS: KL-Guided Fast Inference in Masked Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, due to its iterative refinement process, the inference is often bottlenecked by slow and static sampling speed. To overcome this problem, we introduce 'KL-Adaptive Stability Sampling' (KLASS), a fast yet effective sampling method that exploits token-level KL divergence to identify stable, high-confidence predictions. |
Seo Hyun Kim; Sunwoo Hong; Hojung Jung; Youngrok Park; Se-Young Yun; |
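The token-level KL criterion can be illustrated generically: compare a token's predictive distribution across two consecutive refinement steps and treat a low-divergence token as stable enough to commit. A minimal sketch; the 0.01 threshold below is a hypothetical value, not KLASS's tuned decision rule:

```python
import numpy as np

def kl_div(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q) for categorical distributions over the vocabulary."""
    return float(np.sum(p * np.log(p / q)))

# A token whose distribution barely moves between refinement steps
# has low KL and can be committed early instead of re-sampled.
prev = np.array([0.90, 0.05, 0.05])  # step t distribution over 3 tokens
curr = np.array([0.92, 0.04, 0.04])  # step t+1 distribution
print(kl_div(curr, prev) < 0.01)     # → True: this token counts as stable
```

Committing stable tokens early is what converts the static, one-token-per-step schedule into an adaptive, parallel one.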
| 81 | HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper designs a hierarchical model merging framework named HM3, formulating a bilevel multi-objective model merging problem across both parameter and architecture spaces. |
Yu Zhou; Xingyu Wu; Jibin Wu; Liang Feng; KC Tan; |
| 82 | Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a structured sparse parametrization of transition matrices in SSMs that enables FSA state tracking with provably optimal state size and depth, while keeping the computational cost of the recurrence comparable to that of diagonal SSMs. |
Aleksandar Terzic; Nicolas Menet; Michael Hersche; Thomas Hofmann; Abbas Rahimi; |
| 83 | Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we reveal a key insight that leveraging the idea of top-$p$ sampling (a.k.a., nucleus sampling) in sparse attention could enable efficient and adaptive budget decisions. |
Chaofan Lin; Jiaming Tang; Shuo Yang; Hanshuo Wang; Tian Tang; Boyu Tian; Ion Stoica; Song Han; Mingyu Gao; |
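The borrowed top-$p$ (nucleus) idea can be sketched on a single row of attention weights: keep the smallest set of keys whose cumulative attention mass reaches $p$, so the budget adapts to how peaked the distribution is. An illustrative sketch only, not Twilight's hierarchical pruning:

```python
import numpy as np

def top_p_mask(attn_probs: np.ndarray, p: float = 0.9) -> np.ndarray:
    """Boolean mask keeping the fewest keys whose attention mass reaches p."""
    order = np.argsort(attn_probs)[::-1]        # keys sorted by weight, descending
    csum = np.cumsum(attn_probs[order])
    cutoff = int(np.searchsorted(csum, p)) + 1  # number of keys needed to reach mass p
    mask = np.zeros_like(attn_probs, dtype=bool)
    mask[order[:cutoff]] = True
    return mask

# A peaked row needs few keys; a flat row would keep many more.
row = np.array([0.6, 0.2, 0.1, 0.05, 0.05])
print(top_p_mask(row, p=0.75))  # keeps only the two heaviest keys
```

This is why a top-$p$ criterion yields adaptive budgets where fixed top-$k$ does not: the number of retained keys varies with the entropy of each attention row.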
| 84 | An Analysis of Causal Effect Estimation Using Outcome Invariant Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a unifying framework with topics in causal inference to make a case for the use of DA not only in the i.i.d. setting, but for generalization across interventions as well. |
UZAIR AKBAR; Niki Kilbertus; Hao Shen; Krikamol Muandet; Bo Dai; |
| 85 | Deciphering The Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper introduces a novel, end-to-end framework explicitly designed to address pathological long-tailed recognition in scientific contexts. We introduce and analyze the real-world ZincFluor chemical dataset ($\mathcal{T}=137.54$) and synthetic benchmarks with controllable extreme imbalances (CIFAR-LT variants). |
Zhe Zhao; HaiBin Wen; Xianfu Liu; Rui Mao; Pengkun Wang; Liheng Yu; Linjiang Chen; Bo An; Qingfu Zhang; Yang Wang; |
| 86 | OpenCUA: Open Foundations for Computer-Use Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To bridge this gap, we propose OpenCUA, a comprehensive open-source framework for scaling CUA data and foundation models. We release our annotation tool, datasets, code, and models to build open foundations for further CUA research. |
Xinyuan Wang; Bowen Wang; Dunjie Lu; Junlin Yang; Tianbao Xie; Junli Wang; Jiaqi Deng; Xiaole Guo; Yiheng Xu; Chen Henry Wu; Zhennan Shen; Zhuokai Li; Ryan Li; Xiaochuan Li; Junda Chen; Zheng Boyuan; LI PEIHANG; Fangyu Lei; Ruisheng Cao; Yeqiao Fu; Dongchan Shin; Martin Shin; Hu Jiarui; Yuyan Wang; Jixuan Chen; Yuxiao Ye; Danyang Zhang; Yipu Wang; Heng Wang; Diyi Yang; Victor Zhong; Y.Charles; Zhilin Yang; Tao Yu; |
| 87 | Near-Optimal Experiment Design in Linear Non-Gaussian Cyclic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A key technical challenge is to efficiently estimate the reward function without having to explicitly enumerate all the graphs in the equivalence class. We propose a sampling-based estimator using random matchings and analyze its bias and concentration behavior. |
Ehsan Sharifian; Saber Salehkaleybar; Negar Kiyavash; |
| 88 | Escaping Saddle Points Without Lipschitz Smoothness: The Power of Nonlinear Preconditioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new sufficient condition that encompasses both notions, reveals their close connection, and holds in key applications such as phase retrieval and matrix factorization. |
Alexander Bodard; Panagiotis Patrinos; |
| 89 | Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In addition, we also find that the activation changes follow predictable trajectories, i.e. a sharp rise after special tokens and a subsequent exponential decay. Based on these insights, we introduce a general training-free activation control technique. |
Zekai Zhao; Qi Liu; Kun Zhou; Zihan Liu; Yifei Shao; Zhiting Hu; Biwei Huang; |
| 90 | Direct Fisher Score Estimation for Likelihood Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a sequential, gradient-based optimization method that directly models the Fisher score based on a local score matching technique which uses simulations from a localized region around each parameter iterate. |
Sherman Khoo; Yakun Wang; Song Liu; Mark Beaumont; |
| 91 | Path-Enhanced Contrastive Learning for Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods tend to unintentionally distance the target node from its path nodes on the interaction path, thus limiting their effectiveness. In this regard, we propose a solution that uses paths as samples in the contrastive loss function. |
Haoran Sun; Fei Xiong; Yuanzhe Hu; Liang Wang; |
| 92 | T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce T-REGS, a simple regularization framework for SSL based on the length of the Minimum Spanning Tree (MST) over the learned representation. |
Julie Mordacq; David Loiseaux; Vicky Kalogeiton; Steve Oudot; |
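The quantity being regularized, the total edge length of the Minimum Spanning Tree over a batch of representations, is straightforward to compute. A minimal $O(n^2)$ Prim's-algorithm sketch (not the authors' implementation; Euclidean distance is assumed):

```python
import numpy as np

def mst_length(points: np.ndarray) -> float:
    """Total Euclidean edge length of the MST over a point set (Prim, O(n^2))."""
    n = len(points)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known edge from the tree to each node
    total = 0.0
    for _ in range(n - 1):
        best[in_tree] = np.inf    # never re-attach nodes already in the tree
        j = int(np.argmin(best))  # closest outside node
        total += best[j]
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return float(total)

reps = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
print(mst_length(reps))  # → 2.0: two unit edges, the diagonal is redundant
```

A short MST indicates collapsed representations while a long one indicates spread, which is why its length is a natural target for a spread-promoting regularizer.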
| 93 | Generating Informative Samples for Risk-Averse Fine-Tuning of Downstream Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel training framework that synthesizes informative samples for CVaR optimization using score-based generative models. |
Heasung Kim; Taekyun Lee; Hyeji Kim; Gustavo De Veciana; |
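For background on the objective: the empirical Conditional Value-at-Risk at level $\alpha$ is the mean of the worst $(1-\alpha)$ fraction of losses, which is what makes tail samples so informative for this kind of fine-tuning. A minimal estimator sketch (the paper's score-based sample synthesis is not reproduced here):

```python
import numpy as np

def cvar(losses: np.ndarray, alpha: float = 0.95) -> float:
    """Empirical CVaR: mean loss over the worst (1 - alpha) tail."""
    var = np.quantile(losses, alpha)  # Value-at-Risk threshold
    return float(losses[losses >= var].mean())

# Eight benign losses and two tail events: the mean hides the tail,
# CVaR at alpha = 0.8 focuses on it.
losses = np.array([1.0] * 8 + [5.0, 9.0])
print(losses.mean())       # → 2.2
print(cvar(losses, 0.8))   # → 7.0
```

Because only the tail enters the CVaR average, samples that land in that tail dominate the gradient, motivating a generator that deliberately synthesizes such informative samples.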
| 94 | AceSearcher: Bootstrapping Reasoning and Search for LLMs Via Reinforced Self-Play Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose AceSearcher, a cooperative self-play framework that trains a single large language model (LLM) to alternate between two roles: a decomposer that breaks down complex queries and a solver that integrates retrieved contexts for answer generation. |
Ran Xu; Yuchen Zhuang; Zihan Dong; Ruiyu Wang; Yue Yu; Joyce C. Ho; Linjun Zhang; Haoyu Wang; Wenqi Shi; Carl Yang; |
| 95 | DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To leverage temporal information more efficiently, we propose DeltaFlow ($\Delta$Flow), a lightweight 3D framework that captures motion cues via a $\Delta$ scheme, extracting temporal features with minimal computational cost, regardless of the number of frames. |
Qingwen Zhang; Xiaomeng Zhu; Yushan Zhang; Yixi Cai; Olov Andersson; Patric Jensfelt; |
| 96 | Web-Shepherd: Advancing PRMs for Reinforcing Web Agents Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Despite the importance of speed and cost-effectiveness, prior works have utilized MLLMs as reward models, which poses significant constraints for real-world deployment. To address this, in this work, we propose the first process reward model (PRM), called Web-Shepherd, which can assess web navigation trajectories at the step level. |
Hyungjoo Chae; Sunghwan Kim; Junhee Cho; Seungone Kim; Seungjun Moon; Gyeom Hwangbo; Dongha Lim; Minjin Kim; Yeonjun Hwang; Minju Gwak; Dongwook Choi; Minseok Kang; Gwanhoon Im; ByeongUng Cho; Hyojun Kim; Jun Hee Han; Taeyoon Kwon; Minju Kim; Beong-woo Kwak; Dongjin Kang; Jinyoung Yeo; |
| 97 | How Do Transformers Learn Implicit Reasoning? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent work suggests that large language models (LLMs) can perform multi-hop reasoning implicitly—producing correct answers without explicitly verbalizing intermediate steps—but the underlying mechanisms remain poorly understood. In this paper, we study how such implicit reasoning emerges by training transformers from scratch in a controlled symbolic environment. |
Jiaran Ye; Zijun Yao; Zhidian Huang; Liangming Pan; Jinxin Liu; Yushi Bai; Amy Xin; Liu Weichuan; Xiaoyin Che; Lei Hou; Juanzi Li; |
| 98 | On The Sample Complexity of Semi-supervised Multi-objective Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This, in turn, increases the statistical cost, as reflected in known MOL bounds that depend on the complexity of $\mathcal{G}$. We show that this cost is unavoidable for some losses, even in an idealized semi-supervised setting, where the learner has access to the Bayes-optimal solutions for the individual tasks as well as the marginal distributions over the covariates. |
Tobias Wegel; Geelon So; Junhyung Park; Fanny Yang; |
| 99 | Diversity-Aware Policy Optimization for Large Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the pivotal role diversity plays in RL, its influence on LLM reasoning remains largely underexplored. To bridge this gap, this work presents a systematic investigation into the impact of diversity in RL-based training for LLM reasoning, and proposes a novel diversity-aware policy optimization method. |
Jian Yao; Ran Cheng; Xingyu Wu; Jibin Wu; KC Tan; |
| 100 | Fixed-Point RNNs: Interpolating from Diagonal to Dense Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate parameterizations of a large class of dense linear RNNs as fixed-points of parallelizable diagonal linear RNNs. |
Sajad Movahedi; Felix Sarnthein; Nicola Muca Cirone; Antonio Orvieto; |
| 101 | Bridging Theory and Practice in Link Representation with Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we shift the focus to links and provide the first comprehensive study of GNN expressiveness in link representation. |
Veronica Lachi; Francesco Ferrini; Antonio Longa; Bruno Lepri; Andrea Passerini; Manfred Jaeger; |
| 102 | ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing IP protection methods either require access to model parameters or are vulnerable to fine-tuning attacks. To fill this gap, we propose ErrorTrace, a robust and black-box traceability mechanism for protecting LLM IP. |
Chuanchao Zang; Xiangtao Meng; Wenyu Chen; Tianshuo Cong; Zha Yaxing; Dong Qi; Zheng Li; Shanqing Guo; |
| 103 | Protein Design with Dynamic Protein Vocabulary Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our empirical results show that even random incorporation of fragments improves foldability. Building on this insight, we introduce ProDVa, a novel protein design approach that integrates a text encoder for functional descriptions, a protein language model for designing proteins, and a fragment encoder to dynamically retrieve protein fragments based on textual functional descriptions. |
Nuowei Liu; Jiahao Kuang; Yanting Liu; Tao Ji; Changzhi Sun; Man Lan; Yuanbin Wu; |
| 104 | Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a neuro-symbolic embodied task planning framework that incorporates explicit symbolic verification and interactive validation processes during code generation. |
Sanghyun Ahn; Wonje Choi; Junyong Lee; Jinwoo Park; Honguk Woo; |
| 105 | AI-Researcher: Autonomous Scientific Innovation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce AI-Researcher, a fully autonomous research system that transforms how AI-driven scientific discovery is conducted and evaluated. |
Jiabin Tang; Lianghao Xia; Zhonghang Li; Chao Huang; |
| 106 | Abstain Mask Retain Core: Time Series Prediction By Adaptive Masking Loss with Representation Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building upon information bottleneck theory, we propose an innovative solution termed Adaptive Masking Loss with Representation Consistency (AMRC), which features two core components: 1) Dynamic masking loss, which adaptively identifies highly discriminative temporal segments to guide gradient descent during model training; 2) Representation consistency constraint, which stabilizes the mapping relationships among inputs, labels, and predictions. |
Renzhao Liang; Sizhe Xu; Chenggang Xie; Jingru Chen; Feiyang Ren; Shu Yang; Takahiro Yabe; |
| 107 | DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce DexFlyWheel, a scalable data generation framework that employs a self-improving cycle to continuously enrich data diversity. |
Kefei Zhu; Fengshuo Bai; YuanHao Xiang; Yishuai Cai; Xinglin Chen; Ruochong Li; Xingtao Wang; Hao Dong; Yaodong Yang; Xiaopeng Fan; Yuanpei Chen; |
| 108 | Learning Robust Vision-Language Models from Natural Latent Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a collaborative adversarial prompt tuning (CoAPT) approach from pre-trained VLMs to target robust VLMs. |
Zhangyun Wang; Ni Ding; Aniket Mahanti; |
| 109 | Accelerating Diffusion LLMs Via Adaptive Parallel Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Alternatively, diffusion large language models (dLLMs) theoretically allow for parallel token generation, but in practice struggle to achieve the speed of autoregressive models without significantly sacrificing quality. We therefore introduce adaptive parallel decoding (APD), a novel method that dynamically adjusts the number of tokens sampled in parallel. |
Daniel Mingyi Israel; Guy Van den Broeck; Aditya Grover; |
| 110 | Inner Speech As Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI Coordination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Drawing inspiration from the theory of human cognitive processes, where inner speech guides action selection before execution, we propose MIMIC (Modeling Inner Motivations for Imitation and Control), a framework that uses language as an internal representation of behavioral intent. |
Rakshit Trivedi; Kartik Sharma; David C. Parkes; |
| 111 | Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a fix, we propose Pass-at-$k$ Policy Optimization (PKPO), a multivariate transformation on batches of rewards which leads to direct optimization of pass@$k$ performance, thus optimizing for sets of samples that feature a large maximum reward when considered jointly. |
Christian Walder; Deep Tejas Karkhanis; |
| 112 | LogicTree: Improving Complex Reasoning of LLMs Via Instantiated Multi-step Synthetic Logical Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the limitation, we propose **LogicTree**, a novel framework for efficiently synthesizing multi-step logical reasoning dataset that excels in both complexity and instantiation. |
Zehao Wang; Lin Yang; Jie Wang; Kehan Wang; Hanzhu Chen; Bin Wang; Jianye HAO; Defu Lian; Bin Li; Enhong Chen; |
| 113 | Multitask Learning with Stochastic Interpolants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework for learning maps between probability distributions that broadly generalizes the time dynamics of flow and diffusion models. |
Hugo Negrel; Florentin Coeurdoux; Michael Samuel Albergo; Eric Vanden-Eijnden; |
| 114 | FUDOKI: Discrete Flow-based Unified Understanding and Generation Via Kinetic-Optimal Velocities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we challenge the dominance of AR-based approaches by introducing FUDOKI, a unified multimodal model purely based on discrete flow matching, as an alternative to conventional AR paradigms. |
Jin Wang; Yao Lai; Aoxue Li; Shifeng Zhang; Jiacheng Sun; Ning Kang; Chengyue Wu; Zhenguo Li; Ping Luo; |
| 115 | Fast MRI for All: Bridging Access Gaps By Training Without Raw Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is especially an issue for rural and under-resourced areas, where commercial MRI scanners only provide access to a final reconstructed image. To tackle these challenges, we propose Compressibility-inspired Unsupervised Learning via Parallel Imaging Fidelity (CUPID) for high-quality PD-DL training using only routine clinical reconstructed images exported from an MRI scanner. |
Yasar Utku Alcalar; Merve Gulle; Mehmet Akcakaya; |
| 116 | Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While these modules improve real-time and control capabilities, it remains an open question whether they preserve or degrade the semantic knowledge contained in the pretrained VLM, and what effect they have on the VLA training dynamics. In this paper, we study this question in the context of VLAs that include a continuous diffusion or flow matching action expert, showing that naively including such experts significantly harms both training speed and knowledge transfer. |
Danny Driess; Jost Tobias Springenberg; brian ichter; LILI YU; Adrian Li-Bell; Karl Pertsch; Allen Z. Ren; Homer Walke; Quan Vuong; Lucy Xiaoyang Shi; Sergey Levine; |
| 117 | Complete Structure Guided Point Cloud Completion Via Cluster- and Instance-Level Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A key contribution of our work is the development of a novel self-supervised complete structure reconstruction module, which can learn the complete structure explicitly from incomplete point clouds and thus eliminate the reliance on training data from complete point clouds. |
Yang Chen; Yirun Zhou; Weizhong Zhang; Cheng Jin; |
| 118 | ARECHO: Autoregressive Evaluation Via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these metrics often have different scales, assumptions, and dependencies, making joint estimation non-trivial. To address these issues, we introduce ARECHO (Autoregressive Evaluation via Chain-based Hypothesis Optimization), a chain-based, versatile evaluation system for speech assessment grounded in autoregressive dependency modeling. |
Jiatong Shi; Yifan Cheng; Bo-Hao Su; Hye-jin Shim; Jinchuan Tian; Samuele Cornell; Yiwen Zhao; Siddhant Arora; Shinji Watanabe; |
| 119 | Projective Equivariant Networks Via Second-order Fundamental Differential Invariants Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we tackle the challenge by constructing projective equivariant networks based on differential invariants. |
Yikang Li; Yeqing Qiu; Yuxuan Chen; Lingshen He; Lexiang Hu; Zhouchen Lin; |
| 120 | ZeroS: Zero‑Sum Linear Attention for Efficient Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify two fundamental limitations affecting these approaches: the restriction to convex combinations that only permits additive information blending, and uniform accumulated weight bias that dilutes attention in long contexts. We propose Zero-Sum Linear Attention (ZeroS), which addresses these limitations by removing the constant zero-order term $1/t$ and reweighting the remaining zero-sum softmax residuals. |
Jiecheng Lu; Xu Han; Yan Sun; Viresh Pati; Yubin Kim; Siddhartha Somani; Shihao Yang; |
| 121 | Blameless Users in A Clean Room: Defining Copyright Protection for Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper revisits the question and establishes new foundations for provable copyright protection—foundations that are firmer both technically and legally. |
Aloni Cohen; |
| 122 | Adaptive 3D Reconstruction Via Diffusion Priors and Forward Curvature-Matching Likelihood Updates Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent diffusion-based methods have attempted to address this by combining prior models with likelihood updates, but they rely on heuristic fixed step sizes for the likelihood update that lead to slow convergence and suboptimal reconstruction quality. We advance this line of work by integrating our novel Forward Curvature-Matching (FCM) update method with diffusion sampling. |
Seunghyeok Shin; Dabin Kim; Hongki Lim; |
| 123 | From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study how the choice of pretraining data distribution steers a shallow transformer toward one behavior or the other. |
Ryotaro Kawata; Yujin Song; Alberto Bietti; Naoki Nishikawa; Taiji Suzuki; Samuel Vaiter; Denny Wu; |
| 124 | Learning Interestingness in Automated Mathematical Theory Formation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In particular, we introduce an LLM-based evolutionary algorithm that features function abstraction, leading to notable improvements in discovering elementary number theory and finite fields over hard-coded baselines. |
George Tsoukalas; Rahul Saha; Amitayush Thakur; Sabrina Reguyal; Swarat Chaudhuri; |
| 125 | DNA-DetectLLM: Unveiling AI-Generated Text Via A DNA-Inspired Mutation-Repair Paradigm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, recent advances in generative language modeling have resulted in significant overlap between the feature distributions of human-written and AI-generated text, blurring classification boundaries and making accurate detection increasingly challenging. To address the above challenges, we propose a DNA-inspired perspective, leveraging a repair-based process to directly and interpretably capture the intrinsic differences between human-written and AI-generated text. |
Xiaowei Zhu; Yubing Ren; Fang Fang; Qingfeng Tan; Shi Wang; Yanan Cao; |
| 126 | Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Developing large language models is expensive and often involves making decisions with small experiments, typically by evaluating on large, multi-task evaluation suites. In this work, we analyze specific properties which make a benchmark more reliable and useful for such decisions, and interventions to design higher-quality evaluation benchmarks. |
David Heineman; Valentin Hofmann; Ian Magnusson; Yuling Gu; Noah A. Smith; Hannaneh Hajishirzi; Kyle Lo; Jesse Dodge; |
| 127 | Tradeoffs Between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study online and transductive online learning in settings where the learner can interact with the concept class only via Empirical Risk Minimization (ERM) or weak consistency oracles on arbitrary subsets of the instance domain. |
Idan Attias; Steve Hanneke; Arvind Ramaswami; |
| 128 | The World Is Bigger: A Computationally-Embedded Perspective on The Big World Hypothesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, explicit constraints can be ad hoc, difficult to incorporate, and limiting to the effectiveness of scaling up the agent’s capacity. In this paper, we characterize a problem setting in which an agent, regardless of its capacity, is implicitly constrained by being embedded in the environment. |
Alex Lewandowski; Aditya A. Ramesh; Edan Meyer; Dale Schuurmans; Marlos C. Machado; |
| 129 | Multimodal Disease Progression Modeling Via Spatiotemporal Disentanglement and Multiscale Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Longitudinal multimodal data, including electronic health records (EHR) and sequential chest X-rays (CXRs), is critical for modeling disease progression, yet remains underutilized due to two key challenges: (1) redundancy in consecutive CXR sequences, where static anatomical regions dominate over clinically meaningful dynamics, and (2) temporal misalignment between sparse, irregular imaging and continuous EHR data. We introduce **DiPro**, a novel framework that addresses these challenges through region-aware disentanglement and multi-timescale alignment. |
Chen Liu; Wenfang Yao; Kejing Yin; William K. Cheung; Jing Qin; |
| 130 | On The Hardness of Conditional Independence Testing In Practice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While informative, this result (based on “hiding” dependence) does not seem to explain the frequent practical failures observed with popular CI tests. We investigate the Kernel-based Conditional Independence (KCI) test – of which we show the Generalized Covariance Measure underlying many recent tests is _nearly_ a special case – and identify the major factors underlying its practical behavior. |
Zheng He; Roman Pogodin; Yazhe Li; Namrata Deka; Arthur Gretton; Danica J. Sutherland; |
| 131 | Emergence and Evolution of Interpretable Concepts in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, SAEs have not yet been applied toward gaining insight into the intricate generative process of diffusion models. In this work, we leverage the SAE framework to probe the inner workings of a popular text-to-image diffusion model, and uncover a variety of human-interpretable concepts in its activations. |
Berk Tinaz; Zalan Fabian; Mahdi Soltanolkotabi; |
| 132 | PCA++: How Uniformity Induces Robustness to Background Noise in Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our baseline, PCA+, uses alignment-only contrastive learning and succeeds when background variation is mild, but fails under strong noise or high-dimensional regimes. To address this, we introduce PCA++, a hard uniformity-constrained contrastive PCA that enforces identity covariance on projected features. |
Mingqi Wu; Qiang Sun; Archer Y. Yang; |
| 133 | Optimal Nuisance Function Tuning for Estimating A Doubly Robust Functional Under Proportional Asymptotics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the asymptotically optimal tuning parameter choice in ridge regression for estimating nuisance functions of a statistical functional that has recently gained prominence in conditional independence testing and causal inference. |
Sean McGrath; Debarghya Mukherjee; Rajarshi Mukherjee; Zixiao Wang; |
| 134 | GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Ultra-high-resolution (UHR) remote sensing (RS) imagery offers valuable data for Earth observation but poses challenges for existing multimodal foundation models due to two key bottlenecks: (1) limited availability of UHR training data, and (2) token explosion caused by the large image size. To address data scarcity, we introduce **SuperRS-VQA** (avg. 8,376$\times$8,376) and **HighRS-VQA** (avg. 2,000$\times$1,912), the highest-resolution vision-language datasets in RS to date, covering 22 real-world dialogue tasks. |
Fengxiang Wang; Mingshuo Chen; Yueying Li; Di Wang; Haotian Wang; Zonghao Guo; Zefan Wang; Shan Boqi; Long Lan; Yulin Wang; Hongzhen Wang; Wenjing Yang; Bo Du; Jing Zhang; |
| 135 | Environment Inference for Learning Generalizable Dynamical System Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we propose DynaInfer, a novel method that infers environment specifications by analyzing prediction errors from fixed neural networks within each training round, enabling environment assignments directly from data. |
Shixuan Liu; Yue He; Haotian Wang; Wenjing Yang; Yunfei Wang; Peng Cui; Zhong Liu; |
| 136 | Abstract Rendering: Certified Rendering Under 3D Semantic Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce **abstract rendering**, a framework that computes provable bounds on all images rendered under continuously varying camera poses and scenes. |
Chenxi Ji; Yangge Li; Xiangru Zhong; Huan Zhang; Sayan Mitra; |
| 137 | Inference-Time Reward Hacking in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we characterize reward hacking in inference-time alignment and demonstrate when and how we can mitigate it by hedging on the proxy reward. |
Hadi Khalaf; Claudio Mayrink Verdun; Alex Oesterling; Himabindu Lakkaraju; Flavio Calmon; |
| 138 | High-Performance Arithmetic Circuit Optimization Via Differentiable Architecture Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This disconnect undermines design quality, leading to suboptimal solutions in the circuit topology search space. To bridge this gap, we present **Arith-DAS**, a **D**ifferentiable **A**rchitecture **S**earch framework for **Arith**metic circuits. |
Xilin Xia; Jie Wang; Wanbo Zhang; Zhihai Wang; Mingxuan Yuan; Jianye Hao; Feng Wu; |
| 139 | ROOT: Rethinking Offline Optimization As Distributional Translation Via Probabilistic Bridge Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Both approaches are constrained by the limited amount of offline data. To mitigate this limitation, we introduce a new perspective that casts offline optimization as a distributional translation task. |
Manh Cuong Dao; The Hung Tran; Phi Le Nguyen; Thao Nguyen Truong; Trong Nghia Hoang; |
| 140 | Go With The Flow: Fast Diffusion for Gaussian Mixture Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an analytic parametrization of a set of feasible policies for steering the distribution of a dynamical system from one Gaussian Mixture Model (GMM) to another. |
George Rapakoulias; Ali Reza Pedram; Fengjiao Liu; Lingjiong Zhu; Panagiotis Tsiotras; |
| 141 | SHF: Symmetrical Hierarchical Forest with Pretrained Vision Transformer Encoder for High-Resolution Medical Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a novel approach to addressing the long-sequence problem in high-resolution medical images for Vision Transformers (ViTs). |
Enzhi Zhang; Peng Chen; Rui Zhong; Du Wu; Jun Igarashi; Isaac Lyngaas; Xiao Wang; Masaharu Munetomo; Mohamed Wahib; |
| 142 | Proxy-SPEX: Sample-Efficient Interpretability Via Sparse Feature Interactions in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we observe that LLM feature interactions are often *hierarchical*—higher-order interactions are accompanied by their lower-order subsets—which enables more efficient discovery. |
Landon Butler; Abhineet Agarwal; Justin Singh Kang; Yigit Efe Erginbas; Bin Yu; Kannan Ramchandran; |
| 143 | Fair Cooperation in Mixed-Motive Games Via Conflict-Aware Gradient Adjustment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose an adaptive conflict-aware gradient adjustment method that promotes cooperation while ensuring fairness in individual rewards. |
Woojun Kim; Katia P. Sycara; |
| 144 | OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present OpenWorldSAM, a framework that extends the prompt-driven Segment Anything Model v2 (SAM2) to open-vocabulary scenarios by integrating multi-modal embeddings extracted from a lightweight vision-language model (VLM). |
Shiting Xiao; Rishabh Kabra; Yuhang Li; Donghyun Lee; Joao Carreira; Priyadarshini Panda; |
| 145 | Broken Tokens? Your Language Model Can Secretly Handle Non-Canonical Tokenizations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the robustness of LMs to input encoded with non-canonical tokenizations entirely unseen during training. |
Brian Siyuan Zheng; Alisa Liu; Orevaoghene Ahia; Jonathan Hayase; Yejin Choi; Noah A. Smith; |
| 146 | Hybrid-Balance GFlowNet for Solving Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While Detailed Balance (DB) addresses local optimization more effectively, it alone falls short in solving VRPs, which inherently require holistic trajectory optimization. To address these limitations, we introduce the Hybrid-Balance GFlowNet (HBG) framework, which uniquely integrates TB and DB in a principled and adaptive manner by aligning their intrinsically complementary strengths. |
Ni Zhang; Zhiguang Cao; |
| 147 | Feedback-Aware MCTS for Goal-Oriented Information Seeking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A key challenge lies in efficiently narrowing down a large space of possible outcomes by posing questions that minimize uncertainty. To address this, we introduce a novel framework that leverages Large Language Models (LLMs) to generate information-seeking questions, with Monte Carlo Tree Search (MCTS) to strategically select questions that maximize information gain, as a part of inference-time planning. |
Harshita Chopra; Chirag Shah; |
| 148 | Self-Supervised Learning of Motion Concepts By Optimizing Counterfactuals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we describe Opt-CWM, a self-supervised technique for flow and occlusion estimation derived from a pretrained video prediction model. |
Stefan Stojanov; David Wendt; Seungwoo Kim; Rahul Mysore Venkatesh; Kevin Feigelis; Klemen Kotar; Khai Loong Aw; Jiajun Wu; Daniel LK Yamins; |
| 149 | Uncertain Knowledge Graph Completion Via Semi-Supervised Confidence Distribution Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, the learnt embeddings are insufficient for high-quality UKG completion. To address this issue, we propose a new semi-supervised Confidence Distribution Learning (ssCDL) method for UKG completion, where each triple confidence is transformed into a confidence distribution that introduces more supervision signals across different confidences to reinforce the embedding learning process. |
Tianxing Wu; Shutong Zhu; Jingting Wang; Ning Xu; Guilin Qi; Haofen Wang; |
| 150 | OCTDiff: Bridged Diffusion Model for Portable OCT Super-Resolution and Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose OCTDiff, a bridged diffusion model designed to enhance image resolution and quality from portable OCT devices. |
Ye Tian; Angela McCarthy; Gabriel Gomide; Nancy Liddle; Jedrzej Golebka; Royce Chen; Jeff Liebmann; Kaveri A. Thakoor; |
| 151 | Bits Leaked Per Query: Information-Theoretic Bounds for Adversarial Attacks on LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Yet the scale of information leaked remains anecdotal, leaving auditors without principled guidance and defenders blind to the transparency–risk trade-off. We fill this gap with an information-theoretic framework that computes how much information can be safely disclosed, and enables auditors to gauge how close their methods come to the fundamental limit. |
Masahiro Kaneko; Timothy Baldwin; |
| 152 | Characterizing Control Between Interacting Subsystems with Deep Jacobian Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, methods for understanding subsystem control are typically linear and cannot adequately describe the rich contextual effects enabled by nonlinear complex systems. To bridge this gap, we devise a data-driven nonlinear control-theoretic framework to characterize subsystem interactions via the Jacobian of the dynamics. |
Adam Joseph Eisen; Mitchell Ostrow; Sarthak Chandra; Leo Kozachkov; Earl K Miller; Ila R Fiete; |
| 153 | Sharper Convergence Rates for Nonconvex Optimisation Via Reduction Mappings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These reductions naturally arise from inner optimisation problems and effectively remove redundant directions, yielding a lower-dimensional objective. In this work, we introduce a general framework to understand how such reductions influence the optimisation landscape. |
Evan Markou; Thalaiyasingam Ajanthan; Stephen Gould; |
| 154 | Selective Omniprediction and Fair Abstention Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose new learning algorithms for building selective classifiers, which are predictors that are allowed to abstain on some fraction of the domain. |
Sílvia Casacuberta; Varun Kanade; |
| 155 | Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-Index Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We give a general algorithm for PAC learning a broad class of MIMs with respect to the square loss, even in the presence of adversarial label noise. |
Ilias Diakonikolas; Giannis Iakovidis; Daniel Kane; Lisheng Ren; |
| 156 | A Principled Approach to Randomized Selection Under Uncertainty: Applications to Peer Review and Grant Funding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a principled framework for randomized decision-making based on interval estimates of item quality. |
Alexander Koujianos Goldberg; Giulia Fanti; Nihar B Shah; |
| 157 | Error Forcing in Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we introduce error forcing (EF), where the network activity is guided orthogonally toward the zero-error manifold during learning. |
A Erdem Sağtekin; Colin Bredenberg; Cristina Savin; |
| 158 | Aligning Text-to-Image Diffusion Models to Human Preference By Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by preference learning in large language models, we propose ABC (Alignment by Classification), a simple yet effective framework for aligning diffusion models with human preferences. |
Longquan Dai; Xiaolu Wei; He Wang; Shaomeng Wang; Jinhui Tang; |
| 159 | Sketched Gaussian Mechanism for Private Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the Sketched Gaussian Mechanism (SGM), which directly combines sketching and the Gaussian mechanism for privacy. |
Qiaobo Li; Zhijie Chen; Arindam Banerjee; |
| 160 | Joint‑Embedding Vs Reconstruction: Provable Benefits of Latent Space Prediction for Self‑Supervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we unveil the core mechanisms that distinguish each paradigm. |
Hugues Van Assel; Mark Ibrahim; Tommaso Biancalani; Aviv Regev; Randall Balestriero; |
| 161 | Accelerating Data-driven Algorithm Selection for Combinatorial Partitioning Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We provide size generalization guarantees for three widely used clustering algorithms (single-linkage, k-means++, and Gonzalez’s k-centers heuristic) and two canonical max-cut algorithms (Goemans-Williamson and Greedy). |
Vaggos Chatziafratis; Ishani Karmarkar; Yingxi Li; Ellen Vitercik; |
| 162 | Purifying Shampoo: Investigating Shampoo’s Heuristics By Decomposing Its Preconditioner Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To manage the error induced by infrequent *eigenbasis* computations, we propose an adaptive criterion for determining the eigenbasis computation frequency motivated by terminating a warm-started QR algorithm. |
Runa Eschenhagen; Aaron Defazio; Tsung-Hsien Lee; Richard E. Turner; Hao-Jun Michael Shi; |
| 163 | Replicable Distribution Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We initiate a systematic investigation of distribution testing in the framework of algorithmic replicability. Specifically, given independent samples from a collection of probability distributions, the goal is to characterize the sample complexity of replicably testing natural properties of the underlying distributions. |
Ilias Diakonikolas; Jingyi Gao; Daniel Kane; Sihan Liu; Christopher Ye; |
| 164 | FAPEX: Fractional Amplitude-Phase Expressor for Robust Cross-Subject Seizure Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose FAPEX, a novel architecture that introduces a learnable *fractional neural frame operator* (FrNFO) for adaptive time–frequency decomposition. |
Ruizhe Zheng; Lingyan Mao; Dingding Han; Tian Luo; Yi Wang; Jing Ding; Yuguo Yu; |
| 165 | Causality Meets Locality: Provably Generalizable and Scalable Policy Learning for Networked Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large-scale networked systems, such as traffic, power, and wireless grids, challenge reinforcement-learning agents with both scale and environment shifts. To address these challenges, we propose **GSAC** (**G**eneralizable and **S**calable **A**ctor-**C**ritic), a framework that couples causal representation learning with meta actor-critic learning to achieve both scalability and domain generalization. |
Hao Liang; Shuqing Shi; Yudi Zhang; Biwei Huang; Yali Du; |
| 166 | Conformal Mixed-Integer Constraint Learning with Feasibility Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Conformal Mixed-Integer Constraint Learning (C-MICL), a novel framework that provides probabilistic feasibility guarantees for data-driven constraints in optimization problems. |
Daniel Ovalle; Lorenz T. Biegler; Ignacio E Grossmann; Carl D Laird; Mateo Dulce Rubio; |
| 167 | Why Do Some Language Models Fake Alignment While Others Don’t? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: *Alignment Faking in Large Language Models* presented a demonstration of Claude 3 Opus and Claude 3.5 Sonnet selectively complying with a helpful-only training objective to prevent modification of their behavior outside of training. We expand this analysis to 25 models and find that only 5 (Claude 3 Opus, Claude 3.5 Sonnet, Llama 3 405B, Grok 3, Gemini 2.0 Flash) comply with harmful queries more when they infer they are in training than when they infer they are in deployment. |
Abhay Sheshadri; John Hughes; Julian Michael; Alex Troy Mallen; Arun Jose; Fabien Roger; |
| 168 | When Data Can’t Meet: Estimating Correlation Across Privacy Barriers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the problem of estimating the correlation of two random variables $X$ and $Y$, where the pairs $(X,Y)$ are not observed together, but are instead separated coordinate-wise at two servers: server 1 contains all the $X$ observations, and server 2 contains the corresponding $Y$ observations. |
Abhinav Chakraborty; Arnab Auddy; T. Tony Cai; |
| 169 | TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce TRIDENT, a novel framework that integrates molecular SMILES, textual descriptions, and taxonomic functional annotations to learn rich molecular representations. |
Feng Jiang; Mangal Prakash; Hehuan Ma; Jianyuan Deng; Yuzhi Guo; Amina Mollaysa; Tommaso Mansi; Rui Liao; Junzhou Huang; |
| 170 | Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the Deep Value Benchmark (DVB), an evaluation framework that directly tests whether large language models (LLMs) learn fundamental human values or merely surface-level preferences. We are releasing our dataset, which was subject to three separate human validation experiments. |
Joshua Ashkinaze; Hua Shen; Sai Avula; Eric Gilbert; Ceren Budak; |
| 171 | Return of ChebNet: Understanding and Improving An Overlooked GNN on Long Range Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This has led researchers to adapt MPNNs through *rewiring* or to make use of *Graph Transformers*, which compromise the computational efficiency that characterized early spatial message passing architectures, and typically disregard the graph structure. Almost a decade after its original introduction, we revisit ChebNet to shed light on its ability to model distant node interactions. |
Ali Hariri; Alvaro Arroyo; Alessio Gravina; Moshe Eliasof; Carola-Bibiane Schönlieb; Davide Bacciu; Xiaowen Dong; Kamyar Azizzadenesheli; Pierre Vandergheynst; |
| 172 | Preconditioned Langevin Dynamics with Score-based Generative Models for Infinite-Dimensional Linear Bayesian Inverse Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Designing algorithms for solving high-dimensional Bayesian inverse problems directly in infinite‑dimensional function spaces – where such problems are naturally formulated – is crucial to ensure stability and convergence as the discretization of the underlying problem is refined. In this paper, we contribute to this line of work by analyzing a widely used sampler for linear inverse problems: Langevin dynamics driven by score‑based generative models (SGMs) acting as priors, formulated directly in function space. |
Lorenzo Baldassari; Josselin Garnier; Knut Solna; Maarten V. de Hoop; |
| 173 | Transferring Linear Features Across Language Models With Model Stitching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we demonstrate that affine mappings between the residual streams of language models are a cheap way to effectively transfer represented features between models. |
Alan Chen; Jack Merullo; Alessandro Stolfo; Ellie Pavlick; |
| 174 | Causal Differentiating Concepts: Interpreting LM Behavior Via Causal Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Language model activations entangle concepts that mediate their behavior, making it difficult to interpret these factors, which has implications for generalizability and robustness. We introduce an approach for disentangling these concepts without supervision. |
Navita Goyal; Hal Daumé III; Alexandre Drouin; Dhanya Sridhar; |
| 175 | Ambient Proteins – Training Diffusion Models on Noisy Structures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Ambient Protein Diffusion, a framework for training protein diffusion models that generates structures with unprecedented diversity and quality. |
Giannis Daras; Jeffrey Ouyang-Zhang; Krithika Ravishankar; Constantinos Costis Daskalakis; Adam Klivans; Daniel Jesus Diaz; |
| 176 | On Universality Classes of Equivariant Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate the approximation power of equivariant neural networks beyond separation constraints. |
Marco Pacini; Gabriele Santin; Bruno Lepri; Shubhendu Trivedi; |
| 177 | Guarantees for Alternating Least Squares in Overparameterized Tensor Decompositions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our main result shows that overparameterization provably enables global convergence of ALS: on input a third order $n \times n \times n$ tensor with a decomposition of rank $r \ll n$, ALS overparameterized with rank $k=O(r^2)$ achieves global convergence with high probability under random initialization. |
Dionysis Arvanitakis; Vaidehi Srinivas; Aravindan Vijayaraghavan; |
| 178 | Checklists Are Better Than Reward Models For Aligning Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our work, we instead propose using flexible, instruction-specific criteria as a means of broadening the impact that reinforcement learning can have in eliciting instruction following. We release our dataset of rubrics (WildChecklists), models, and code to the public. |
Vijay Viswanathan; Yanchao Sun; Xiang Kong; Meng Cao; Graham Neubig; Tongshuang Wu; |
| 179 | Blackbox Model Provenance Via Palimpsestic Membership Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Can Alice prove that Bob is using her model, either by querying Bob’s derivative model (query setting) or from the text alone (observational setting)? We formulate this question as an independence testing problem—in which the null hypothesis is that Bob’s model or text is independent of Alice’s randomized training run—and investigate it through the lens of palimpsestic memorization in language models: models are more likely to memorize data seen later in training, so we can test whether Bob is using Alice’s model using test statistics that capture correlation between Bob’s model or text and the ordering of training examples in Alice’s training run. |
Rohith Kuditipudi; Jing Huang; Sally Zhu; Diyi Yang; Christopher Potts; Percy Liang; |
| 180 | Scaling Can Lead to Compositional Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we seek to understand what it takes for a standard neural network to generalize over tasks that share compositional structure. |
Florian Redhardt; Yassir Akram; Simon Schug; |
| 181 | Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: G-Vendi scales to million-sample datasets and yet consistently outperforms heuristic alternatives, achieving strong correlation ($\text{Spearman’s } \rho \approx 0.9$) with out-of-distribution (OOD) performance across both natural language inference (NLI) and math reasoning tasks. Building on this insight, we present **Prismatic Synthesis**, a framework for generating diverse synthetic data by targeting underrepresented regions in gradient space. |
Jaehun Jung; Seungju Han; Ximing Lu; Skyler Hallinan; David Acuna; Shrimai Prabhumoye; Mostofa Patwary; Mohammad Shoeybi; Bryan Catanzaro; Yejin Choi; |
| 182 | AnaCP: Toward Upper-Bound Continual Learning Via Analytic Contrastive Projection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they still face a major limitation: the inability to continually adapt feature representations to best suit the CIL tasks, leading to suboptimal performance. To address this, we propose AnaCP (Analytic Contrastive Projection), a novel method that preserves the efficiency of analytic classifiers while enabling incremental feature adaptation without gradient-based training, thereby eliminating the CF caused by gradient updates. |
Saleh Momeni; Changnan Xiao; Bing Liu; |
| 183 | Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce Flow Density Control (FDC), a simple algorithm that reduces this complex problem to a specific sequence of simpler fine-tuning tasks, each solvable via scalable established methods. |
Riccardo De Santi; Marin Vlastelica; Ya-Ping Hsieh; Zebang Shen; Niao He; Andreas Krause; |
| 184 | What Expressivity Theory Misses: Message Passing Complexity for GNNs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we argue that this focus is misguided: First, higher expressivity is not necessary for most real-world tasks as these tasks rarely require expressivity beyond the basic WL test. Second, expressivity theory’s binary characterization and idealized assumptions fail to reflect GNNs’ practical capabilities. To overcome these limitations, we propose Message Passing Complexity (MPC): a continuous measure that quantifies the difficulty for a GNN architecture to solve a given task through message passing. |
Niklas Kemper; Tom Wollschläger; Stephan Günnemann; |
| 185 | Ambient Diffusion Omni: Training Good Models with Bad Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Ambient Diffusion Omni, a simple, principled framework to train diffusion models that can extract signal from arbitrarily corrupted images during training. |
Giannis Daras; Adrian Rodriguez-Munoz; Adam Klivans; Antonio Torralba; Constantinos Costis Daskalakis; |
| 186 | UMA: A Family of Universal Models for Atoms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this need, we present a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. We are releasing the UMA code, weights, and associated data to accelerate computational workflows and enable the community to build increasingly capable AI models. |
Brandon M Wood; Misko Dzamba; Xiang Fu; Meng Gao; Muhammed Shuaibi; Luis Barroso-Luque; Kareem Abdelmaqsoud; Vahe Gharakhanyan; John R. Kitchin; Daniel S. Levine; Kyle Michel; Anuroop Sriram; Taco Cohen; Abhishek Das; Sushree Jagriti Sahoo; Ammar Rizvi; Zachary Ward Ulissi; C. Lawrence Zitnick; |
| 187 | The Best Instruction-Tuning Data Are Those That Fit Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We hypothesize that SFT is most effective when the data is aligned with the model’s pretrained distribution, and propose **GRAPE**—a novel SFT framework that tailors supervision to the target model. |
Dylan Zhang; Qirun Dai; Hao Peng; |
| 188 | Forecasting in Offline Reinforcement Learning for Non-stationary Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These offsets can lead to partial observability, causing agents to misperceive their true state and degrade performance. To overcome this challenge, we introduce Forecasting in Non-stationary Offline RL (FORL), a framework that unifies (i) conditional diffusion-based candidate state generation, trained without presupposing any specific form of future non-stationarity, and (ii) zero-shot time-series foundation models. |
Suzan Ece Ada; Georg Martius; Emre Ugur; Erhan Oztop; |
| 189 | Purifying Approximate Differential Privacy with Randomized Post-processing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a framework to convert $(\varepsilon, \delta)$-approximate Differential Privacy (DP) mechanisms into $(\varepsilon’, 0)$-pure DP mechanisms under certain conditions, a process we call “purification.” |
Yingyu Lin; Erchi Wang; Yian Ma; Yu-Xiang Wang; |
| 190 | Trust Region Constrained Measure Transport in Path Space for Stochastic Optimal Control and Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In practice, however, this optimization is challenging, in particular if the target measure differs substantially from the prior. In this work, we therefore approach the problem by iteratively solving constrained problems incorporating trust regions that aim to approach the target measure gradually in a systematic way. |
Denis Blessing; Julius Berner; Lorenz Richter; Carles Domingo-Enrich; Yuanqi Du; Arash Vahdat; Gerhard Neumann; |
| 191 | How Well Can Differential Privacy Be Audited in One Run? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that the key barrier to its efficacy is interference between the observable effects of different data elements. |
Amit Keinan; Moshe Shenfeld; Katrina Ligett; |
| 192 | Counteractive RL: Rethinking Core Principles for Efficient and Scalable Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In our paper we focus on the agent’s interaction with the environment in a high-dimensional MDP during the learning phase and we introduce a theoretically-founded novel paradigm based on experiences obtained through counteractive actions. |
Ezgi Korkmaz; |
| 193 | Is Noise Conditioning Necessary? A Unified Theory of Unconditional Graph Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Explicit noise-level conditioning is widely regarded as essential for the effective operation of Graph Diffusion Models (GDMs). In this work, we challenge this assumption by investigating whether denoisers can implicitly infer noise levels directly from corrupted graph structures, potentially eliminating the need for explicit noise conditioning. |
Jipeng Li; Yanning Shen; |
| 194 | An Evidence-Based Post-Hoc Adjustment Framework for Anomaly Detection Under Data Contamination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing solutions require access to the training pipelines, data or prior knowledge of the proportions of anomalies in the data, limiting their real-world applicability. To address this challenge, we propose EPHAD, a simple yet effective test-time adaptation framework that updates the outputs of AD models trained on contaminated datasets using evidence gathered at test time. |
Sukanya Patra; Souhaib Ben Taieb; |
| 195 | SHAP Values Via Sparse Fourier Representation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose an efficient two-stage algorithm for computing SHAP values in both black-box setting and tree-based models. |
Ali Gorji; Andisheh Amrollahi; Andreas Krause; |
| 196 | Quantum Doubly Stochastic Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recently, it has been proven that DSMs can be obtained with a parametric quantum circuit, yielding a novel quantum inductive bias for DSMs with no known classical analogue. Motivated by this, we demonstrate the feasibility of a hybrid classical-quantum doubly stochastic Transformer (QDSFormer) that replaces the softmax in the self-attention layer with a variational quantum circuit. |
Jannis Born; Filip Skogh; Kahn Rhrissorrakrai; Filippo Utro; Nico Wagner; Aleksandros Sobczyk; |
| 197 | Reverse Engineering Human Preferences with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this study, we adopt a different approach and use the signal provided by judge-LLMs as a reward to adversarially tune models that generate text preambles designed to boost downstream performance. |
Lisa Alazraki; Yi-Chern Tan; Jon Ander Campos; Maximilian Mozes; Marek Rei; Max Bartolo; |
| 198 | Efficient Fairness-Performance Pareto Front Computation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: There is a well-known intrinsic trade-off between the fairness of a representation and the performance of classifiers derived from the representation. In this paper we propose a new method to compute the optimal Pareto front of this trade-off. |
Mark Kozdoba; Binyamin Perets; Shie Mannor; |
| 199 | Curl Descent: Non-Gradient Learning Dynamics with Sign-Diverse Plasticity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Curl terms naturally emerge in networks with excitatory-inhibitory connectivity or Hebbian/anti-Hebbian plasticity, resulting in learning dynamics that cannot be framed as gradient descent on any objective. To investigate the impact of these curl terms, we analyze feedforward networks within an analytically tractable student-teacher framework, systematically introducing non-gradient dynamics through rule-flipped neurons. |
Hugo Ninou; Jonathan Kadmon; N Alex Cayco Gajic; |
| 200 | ElliCE: Efficient and Provably Robust Algorithmic Recourse Via The Rashomon Sets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ElliCE, a novel framework for robust algorithmic recourse that optimizes counterfactuals over an ellipsoidal approximation of the Rashomon set. |
Bohdan Turbal; Iryna Voitsitska; Lesia Semenova; |
| 201 | Head Pursuit: Probing Attention Specialization in Multimodal Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study how individual attention heads in text-generative models specialize in specific semantic or visual attributes. |
Lorenzo Basile; Valentino Maiorca; Diego Doimo; Francesco Locatello; Alberto Cazzaniga; |
| 202 | Towards A Golden Classifier-Free Guidance Path Via Foresight Fixed Point Iterations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing approaches stem from divergent theoretical interpretations, thereby limiting the design space and obscuring key design choices. To address this, we propose a unified perspective that reframes conditional guidance as fixed point iterations, seeking to identify a golden path where latents produce consistent outputs under both conditional and unconditional generation. |
Kaibo Wang; Jianda Mao; Tong Wu; Yang Xiang; |
| 203 | Credal Prediction Based on Relative Likelihood Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a theoretically grounded approach to credal prediction based on the statistical notion of relative likelihood: The target of prediction is the set of all (conditional) probability distributions produced by the collection of plausible models, namely those models whose relative likelihood exceeds a specified threshold. |
Timo Löhr; Paul Hofman; Felix Mohr; Eyke Hüllermeier; |
| 204 | The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called *iterative polynomial filtering* and show a number of striking applications for supervised learning with contamination: (1) We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as *nasty noise*). |
Adam Klivans; Konstantinos Stavropoulos; Kevin Tian; Arsen Vasilyan; |
| 205 | Light-Weight Diffusion Multiplier and Uncertainty Quantification for Fourier Neural Operators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce DINOZAUR: a diffusion-based neural operator parametrization with uncertainty quantification. |
Albert Matveev; Sanmitra Ghosh; Aamal Hussain; James-Michael Leahy; Michalis Michaelides; |
| 206 | Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As a result, they are fundamentally ill-suited for applications involving inherently discrete quantities such as particle counts or material units, that are constrained by strict conservation laws like mass conservation, limiting their applicability in scientific workflows. To address this limitation, we propose Discrete Spatial Diffusion (DSD), a framework based on a continuous-time, discrete-state jump stochastic process that operates directly in discrete spatial domains while strictly preserving particle counts in both forward and reverse diffusion processes. |
Javier E. Santos; Agnese Marcato; Roman Colman; Nicholas Lubbers; Yen Ting Lin; |
| 207 | CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, frontier and future systems may not be sufficiently trustworthy, and there is evidence that these systems may even be misaligned with their developers or users. Therefore, we investigate the capabilities of AI agents to act against the interests of their users when conducting ML engineering, by sabotaging ML models, sandbagging their performance, and subverting oversight mechanisms. |
Francis Rhys Ward; Teun van der Weij; Hanna Gábor; Sam Martin; Raja Mehta Moreno; Harel Lidar; Louis Makower; Thomas Jodrell; Lauren Robson; |
| 208 | An Analytical Theory of Spectral Bias in The Learning Dynamics of Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop an analytical framework for understanding how the learned distribution evolves during diffusion model training. |
Binxu Wang; Cengiz Pehlevan; |
| 209 | DeepHalo: A Neural Choice Model with Controllable Context Effects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose DeepHalo, a neural modeling framework that incorporates features while enabling explicit control over interaction order and principled interpretation of context effects. |
Shuhan Zhang; Zhi Wang; Rui Gao; Shuang Li; |
| 210 | Multi-Agent Learning Under Uncertainty: Recurrence Vs. Concentration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we examine the convergence landscape of multi-agent learning under uncertainty. |
Kyriakos Lotidis; Panayotis Mertikopoulos; Nicholas Bambos; Jose Blanchet; |
| 211 | Quantum Speedup of Non-linear Monte Carlo Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate whether a similar quadratic speedup is achievable for estimating *non-linear* functionals of probability distributions. |
Jose Blanchet; Yassine Hamoudi; Mario Szegedy; Guanyang Wang; |
| 212 | Fine-grained List-wise Alignment for Generative Medication Recommendation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose FLAME, a fine-grained list-wise alignment framework for large language models (LLMs), enabling drug-by-drug generation of drug lists. |
Chenxiao Fan; Chongming Gao; Wentao Shi; Yaxin Gong; Zhao Zihao; Fuli Feng; |
| 213 | Scaling Up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. |
Jonas Geiping; Sean Michael McLeish; Neel Jain; John Kirchenbauer; Siddharth Singh; Brian R. Bartoldson; Bhavya Kailkhura; Abhinav Bhatele; Tom Goldstein; |
| 214 | Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they often struggle with capturing global phenomena, such as bending or long-range correlations usually occurring in solid mechanics, and suffer from error accumulation over long rollouts due to their reliance on local message passing and direct next-step prediction. We address these limitations by introducing the Rolling Diffusion-Batched Inference Network (ROBIN), a novel learned simulator that integrates two key innovations: (i) Rolling Diffusion-Batched Inference (ROBI), a parallelized inference scheme that amortizes the cost of diffusion-based refinement across physical time steps by overlapping denoising steps across a temporal window. |
Tobias Würth; Niklas Freymuth; Gerhard Neumann; Luise Kärger; |
| 215 | Breaking The Batch Barrier (B3) of Contrastive Learning Via Smart Batch Mining Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose *Breaking the Batch Barrier* (B3), a novel batch construction strategy designed to curate high-quality batches for CL. |
Raghuveer Thirukovalluru; Rui Meng; Ye Liu; Karthikeyan K; Mingyi Su; Ping Nie; Semih Yavuz; Yingbo Zhou; Wenhu Chen; Bhuwan Dhingra; |
| 216 | Dynamic Algorithm for Explainable $k$-medians Clustering Under $\ell_p$ Norm Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the first algorithm for explainable $k$-medians under $\ell_p$ norm for every finite $p \geq 1$. |
Konstantin Makarychev; Ilias Papanikolaou; Liren Shan; |
| 217 | Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Tracking the emergence of goals and values has proven a longstanding problem, and despite much interest over the years it remains unclear whether current AIs have meaningful values. We propose a solution to this problem, leveraging the framework of utility functions to study the internal coherence of AI preferences. |
Mantas Mazeika; Xuwang Yin; Rishub Tamirisa; Jaehyuk Lim; Bruce W. Lee; Richard Ren; Long Phan; Norman Mu; Oliver Zhang; Dan Hendrycks; |
| 218 | Is The Acquisition Worth The Cost? Surrogate Losses for Consistent Two-stage Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider an increasingly relevant setting where we have two classifier stages. |
Florence Regol; Joseph Cotnareanu; Theodore Glavas; Mark Coates; |
| 219 | SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To explore whether low-bit attention can be effectively applied to training tasks, we design an accurate and efficient $\texttt{8-bit}$ attention for both forward and backward propagation. |
Jintao Zhang; Jia Wei; Haoxu Wang; Pengle Zhang; Xiaoming Xu; Haofeng Huang; Kai Jiang; Jun Zhu; Jianfei Chen; |
| 220 | Eluder Dimension: Localise It! Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We establish a lower bound on the eluder dimension in generalised linear model classes, showing that standard eluder dimension-based analysis cannot lead to first-order regret bounds. To address this, we introduce a localisation method for the eluder dimension; our analysis immediately recovers and improves on classic results for Bernoulli bandits, and allows for the first genuine first-order bounds for finite-horizon reinforcement learning tasks with bounded cumulative returns. |
Alireza Bakhtiari; Alex Ayoub; Samuel McLaughlin Robertson; David Janz; Csaba Szepesvari; |
| 221 | Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the scaling law behavior of DiLoCo when training LLMs under a fixed compute budget. |
Zachary Charles; Gabriel Teston; Lucio M. Dery; J Keith Rush; Nova Fallen; Zachary Garrett; Arthur Szlam; Arthur Douillard; |
| 222 | FlashMD: Long-stride, Universal Prediction of Molecular Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose FlashMD, a method to predict the evolution of positions and momenta over strides that are between one and two orders of magnitude longer than typical MD time steps. |
Filippo Bigi; Sanggyu Chong; Agustinus Kristiadi; Michele Ceriotti; |
| 223 | Not All Data Are Good Labels: On The Self-supervised Labeling for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: During the optimization of a simple reconstruction network, intermediates are used as pseudo labels in a self-supervised paradigm, improving generalization for any predictor. We introduce the Self-Correction with Adaptive Mask (SCAM), which discards overfitted components and selectively replaces them with pseudo labels generated from reconstructions. |
Yuxuan Yang; Dalin Zhang; Yuxuan Liang; Hua Lu; Gang Chen; Huan Li; |
| 224 | 3D Equivariant Visuomotor Policy Learning Via Spherical Projection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This type of point cloud input is not compatible with the now-common setting where the primary input modality is an eye-in-hand RGB camera like a GoPro. This paper closes this gap by incorporating into the diffusion policy model a process that projects features from the 2D RGB camera image onto a sphere. |
Boce Hu; Dian Wang; David Klee; Heng Tian; Xupeng Zhu; Haojie Huang; Robert Platt; Robin Walters; |
| 225 | Predictable Scale (Part II) — Farseer: A Refined Scaling Law in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Training Large Language Models (LLMs) is prohibitively expensive, creating a critical scaling gap where insights from small-scale experiments often fail to transfer to resource-intensive production systems, thereby hindering efficient innovation. To bridge this, we introduce Farseer, a novel and refined scaling law offering enhanced predictive accuracy across scales. |
Houyi Li; Wenzhen Zheng; Qiufeng Wang; Zhenyu Ding; Haoying Wang; Zili Wang; Shijie Xuyang; Ning Ding; Shuigeng Zhou; Xiangyu Zhang; Daxin Jiang; |
| 226 | Adaptive Prediction-Powered AutoEval with Reliability and Efficiency Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose $\texttt{R-AutoEval+}$, a novel framework that provides finite-sample reliability guarantees on the model evaluation, while also ensuring an enhanced (or at least no worse) sample efficiency compared to conventional methods. |
Sangwoo Park; Matteo Zecchin; Osvaldo Simeone; |
| 227 | Ridge Boosting Is Both Robust and Efficient Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider a simple estimator, \emph{ridge boosting}: starting with any initial predictor, perform a single boosting step with (kernel) ridge regression. |
David Bruns-Smith; Zhongming Xie; Avi Feller; |
| 228 | A Unifying View of Linear Function Approximation in Off-Policy RL Through Matrix Splitting and Preconditioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that increasing the number of updates under the same target value function, i.e., the target network technique, is a transition from using a constant preconditioner to using a data-feature adaptive preconditioner. This elucidates, for the first time, why TD convergence does not necessarily imply FQI convergence, and establishes tight convergence connections among TD, PFQI, and FQI. |
Zechen Wu; Amy Greenwald; Ronald Parr; |
| 229 | Generalizable Reasoning Through Compositional Energy Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel approach to reasoning generalization by learning energy landscapes over the solution spaces of smaller, more tractable subproblems. |
Alexandru Oarga; Yilun Du; |
| 230 | Universal Sequence Preconditioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the problem of preconditioning in the setting of sequential prediction. |
Annie Marsden; Elad Hazan; |
| 231 | Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ENIGMATA, the first comprehensive suite tailored for improving LLMs with puzzle reasoning skills. |
Jiangjie Chen; Qianyu He; Siyu Yuan; Aili Chen; Zhicheng Cai; Weinan Dai; Hongli Yu; Jiaze Chen; Xuefeng Li; Qiying Yu; Hao Zhou; Mingxuan Wang; |
| 232 | Memory-Enhanced Neural Solvers for Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we present MEMENTO, an approach that leverages memory to improve the search of neural solvers at inference. |
Felix Chalumeau; Refiloe Shabe; Noah De Nicola; Arnu Pretorius; Thomas D Barrett; Nathan Grinsztajn; |
| 233 | Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Meanwhile, recent advances in pretrained vision-language models (VLMs) have demonstrated strong cross-task generalization, offering a promising foundation for developing unified solutions. In this paper, we introduce Uni-MuMER, which fully fine-tunes a VLM for the HMER task without modifying its architecture, effectively injecting domain-specific knowledge into a generalist framework. |
Yu Li; Jin Jiang; Jianhua Zhu; Shuai Peng; Baole Wei; Yuxuan Zhou; Liangcai Gao; |
| 234 | Two Heads Are Better Than One: Simulating Large Transformers with Small Ones Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we show that transformers with long input sequences (large transformers) can be efficiently simulated by transformers that can only take short input sequences (small transformers). |
Hantao Yu; Josh Alman; |
| 235 | Gaze Beyond The Frame: Forecasting Egocentric 3D Visual Span Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose EgoSpanLift, a novel method that transforms egocentric visual span forecasting from 2D image planes to 3D scenes. |
Heeseung Yun; Joonil Na; Jaeyeon Kim; Calvin Murdock; Gunhee Kim; |
| 236 | Unifying Proportional Fairness in Centroid and Non-Centroid Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Caragiannis et al. [NeurIPS 2024] study _non-centroid clustering_, in which each data point’s loss is determined by its maximum distance to any other data point in its cluster. We generalize both paradigms to introduce _semi-centroid clustering_, in which each data point’s loss is a combination of its centroid and non-centroid losses, and study two proportional fairness criteria—the core and, its relaxation, fully justified representation (FJR). |
Benjamin Cookson; Nisarg Shah; Ziqi Yu; |
| 237 | ARIA: Training Language Agents with Intention-driven Reward Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Sampling actions in such a space can lead to extreme reward sparsity, which brings large reward variance, hindering effective reinforcement learning (RL). To address this, we propose **ARIA**, a method that **A**ggregates **R**ewards in **I**ntention space to enable efficient and effective language **A**gents training. |
Ruihan Yang; Yikai Zhang; Aili Chen; Xintao Wang; Jiangjie Chen; Siyu Yuan; Deqing Yang; Yanghua Xiao; |
| 238 | Reasoning Planning for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We revisit this assumption through a rigorous theoretical analysis, deriving accuracy bounds for standard aggregation methods under fixed generation distributions and candidate sizes. Building on these insights, we introduce EPIC, an Ensemble Planning with Contrastive learning framework to learn a shared representation space that captures both model reasoning abilities and query-method compatibility. |
Bao Nguyen; Hieu Trung Nguyen; Ruifeng She; Xiaojin Fu; Viet Anh Nguyen; |
| 239 | When Less Language Is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Drawing inspiration from cognitive neuroscience, which suggests that human reasoning functions largely independently of language processing, we hypothesize that LLMs similarly encode reasoning and language as separable components that can be disentangled to enhance multilingual reasoning. |
Weixiang Zhao; Jiahe Guo; Yang Deng; Tongtong Wu; Wenxuan Zhang; Yulin Hu; Xingyu Sui; Yanyan Zhao; Wanxiang Che; Bing Qin; Tat-Seng Chua; Ting Liu; |
| 240 | EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce the event cameras to aid scene construction from a casually captured video for the first time, and propose Event-Aided Free-Trajectory 3DGS, called EF-3DGS, which seamlessly integrates the advantages of event cameras into 3DGS through three key components. |
Bohao Liao; Wei Zhai; Zengyu Wan; Zhixin Cheng; Wenfei Yang; Yang Cao; Tianzhu Zhang; Zheng-Jun Zha; |
| 241 | Beyond Scalar Rewards: An Axiomatic Framework for Lexicographic MDPs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hausner’s foundational work showed that dropping the continuity axiom leads to a generalization of expected utility theory where utilities are lexicographically ordered vectors of arbitrary dimension. In this paper, we extend this result by identifying a simple and practical condition under which preferences in a Markov Decision Process (MDP) cannot be represented by scalar rewards, necessitating a 2-dimensional reward function. |
Mehran Shakerinava; Siamak Ravanbakhsh; Adam Oberman; |
| 242 | Compress to Impress: Efficient LLM Adaptation Using A Single Gradient Step on 100 Samples Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We demonstrate that this overhead can be removed and find that: (i) Only a small, carefully chosen subset of matrices needs to be inspected, eliminating the layer-by-layer sweep, (ii) The gradient of each matrix's singular values pinpoints which matrices merit reduction, (iii) Increasing the factorization search space by allowing matrix rows to cluster around multiple subspaces and then decomposing each cluster separately further reduces overfitting on the original training data and further lifts accuracy by up to 24.6 percentage points, and finally, (iv) we discover that evaluating on just 100 samples rather than the full training data, both for computing the indicative gradients and for measuring the final accuracy, suffices to further reduce the search time; we explain this by noting that adaptation to downstream tasks is dominated by prompting style, not dataset size. As a result, we show that combining these findings yields a fast and robust adaptation algorithm for downstream tasks. |
Shiva Sreeram; Alaa Maalouf; Pratyusha Sharma; Daniela Rus; |
| 243 | Understanding Prompt Tuning and In-Context Learning Via Meta-Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we discuss how optimal prompting can be understood through a Bayesian view, which also implies some fundamental limitations of prompting that can only be overcome by tuning weights. |
Tim Genewein; Li Kevin Wenliang; Jordi Grau-Moya; Anian Ruoss; Laurent Orseau; Marcus Hutter; |
| 244 | Error Broadcast and Decorrelation As A Potential Artificial and Natural Learning Mechanism Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce *Error Broadcast and Decorrelation* (EBD), a novel learning framework for neural networks that addresses credit assignment by directly broadcasting output errors to individual layers, circumventing weight transport of backpropagation. |
Mete Erdogan; Cengiz Pehlevan; Alper Tunga Erdogan; |
| 245 | On The Expressive Power of Mixture-of-Experts for Structured Complex Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct a systematic study of the expressive power of MoEs in modeling complex tasks with two common structural priors: low-dimensionality and sparsity. |
Mingze Wang; Weinan E; |
| 246 | TokenSwap: A Lightweight Method to Disrupt Memorized Sequences in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce TokenSwap, a lightweight, post-hoc defense designed for realistic settings where the user can only access token-level outputs. |
Parjanya Prajakta Prashant; Kaustubh Ponkshe; Babak Salimi; |
| 247 | Streaming Attention Approximation Via Discrepancy Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we study the streaming complexity of attention approximation, a key computational primitive underlying token generation. |
Ekaterina Kochetkova; Kshiteej Sheth; Insu Han; Amir Zandieh; Michael Kapralov; |
| 248 | SegMASt3R: Geometry Grounded Segment Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage the spatial understanding of 3D foundation models to tackle wide-baseline segment matching, a challenging setting involving extreme viewpoint shifts. |
Rohit Jayanti; Swayam Agrawal; Vansh Garg; Siddharth Tourani; Muhammad Haris Khan; Sourav Garg; Madhava Krishna; |
| 249 | Among Us: A Sandbox for Measuring and Detecting Agentic Deception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Prior studies on deception in language-based AI agents typically assess whether the agent produces a false statement about a topic, or makes a binary choice prompted by a goal, rather than allowing open-ended deceptive behavior to emerge in pursuit of a longer-term goal. To fix this, we introduce $\textit{Among Us}$, a sandbox social deception game where LLM-agents exhibit long-term, open-ended deception as a consequence of the game objectives. |
Satvik Golechha; Adrià Garriga-Alonso; |
| 250 | AutoToM: Scaling Model-based Mental Inference Via Automated Agent Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce *AutoToM*, an automated agent modeling method for scalable, robust, and interpretable mental inference. |
Zhining Zhang; Chuanyang Jin; Mung Yao Jia; Shunchi Zhang; Tianmin Shu; |
| 251 | Towards Interpretable and Efficient Attention: Compressing All By Contracting A Few Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a unified optimization objective that derives inherently interpretable and efficient attention mechanisms through algorithm unrolling. |
Qishuai Wen; Zhiyuan Huang; Chun-Guang Li; |
| 252 | EvoBrain: Dynamic Multi-Channel EEG Graph Modeling for Time-Evolving Brain Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we present the first theoretical analysis of these two problems, demonstrating the effectiveness and necessity of explicit dynamic modeling and time-then-graph dynamic GNN method. Building on these insights, we propose EvoBrain, a novel seizure detection model that integrates a two-stream Mamba architecture with a GCN enhanced by Laplacian Positional Encoding, following neurological insights. |
Rikuto Kotoge; Zheng Chen; Tasuku Kimura; Yasuko Matsubara; Takufumi Yanagisawa; Haruhiko Kishima; Yasushi Sakurai; |
| 253 | Product Distribution Learning with Imperfect Advice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: When $P$ belongs to the class of product distributions on the Boolean hypercube $\{0,1\}^d$, it is known that $\Omega(d/\epsilon^2)$ samples are necessary to learn $P$ within total variation (TV) distance $\epsilon$. We revisit this problem when the learner is also given as advice the parameters of a product distribution $Q$. |
Arnab Bhattacharyya; Davin Choo; Philips George John; Themis Gouleakis; |
| 254 | RoboScape: Physics-informed Embodied World Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present RoboScape, a unified physics-informed world model that jointly learns RGB video generation and physics knowledge within an integrated framework. |
Yu Shang; Xin Zhang; Yinzhou Tang; Lei Jin; Chen Gao; Wei Wu; Yong Li; |
| 255 | InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose InstructHOI, a novel method that leverages context-aware instructions to guide multi-modal reasoning for HOI detection. |
Jinguo Luo; Weihong Ren; Quanlong Zheng; Yanhao Zhang; Zhenlong Yuan; Zhiyong Wang; Haonan Lu; Honghai LIU; |
| 256 | ARM: Adaptive Reasoning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Adaptive Reasoning Model (ARM), a reasoning model capable of adaptively selecting appropriate reasoning formats based on the task at hand. |
Siye Wu; Jian Xie; Yikai Zhang; Aili Chen; Kai Zhang; Yu Su; Yanghua Xiao; |
| 257 | Computational Efficiency Under Covariate Shift in Kernel Ridge Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper addresses the covariate shift problem in the context of nonparametric regression within reproducing kernel Hilbert spaces (RKHSs). |
Andrea Della Vecchia; Arnaud Mavakala Watusadisi; Ernesto De Vito; Lorenzo Rosasco; |
| 258 | Self-Assembling Graph Perceptrons Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Subsequently, we incorporate a self-assembly mechanism on top of GP called Self-Assembling Graph Perceptron (SAGP). |
Jialong Chen; Tong Wang; Bowen Deng; Luonan Chen; Zibin Zheng; Chuan Chen; |
| 259 | Scaling and Context Steer LLMs Along The Same Computational Path As The Human Brain Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, whether this representational alignment arises from a similar sequence of computations remains elusive. In this study, we explore this question by examining temporally-resolved brain signals of participants listening to 10 hours of an audiobook. |
Joséphine Raugel; Jérémy Rapin; Stéphane d’Ascoli; Valentin Wyart; Jean-Remi King; |
| 260 | InfiFPO: Implicit Model Fusion Via Preference Optimization in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: The current few fusion methods on PA phase, like WRPO, simplify the process by utilizing only response outputs from source models while discarding their probability information. To address this limitation, we propose InfiFPO, a preference optimization method for implicit model fusion. |
Yanggan Gu; Yuanyi Wang; Zhaoyi Yan; Yiming Zhang; Qi Zhou; Fei Wu; Hongxia Yang; |
| 261 | SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recognizing that visual appearance and motion patterns share fundamental physical laws in the real world, we propose a novel framework that combines visual priors and dynamic constraints within a synchronized diffusion process to generate the HOI video and motion simultaneously. |
Lingwei Dang; Ruizhi Shao; Hongwen Zhang; Wei MIN; Yebin Liu; Qingyao Wu; |
| 262 | HyPINO: Multi-Physics Neural Operators Via HyperPINNs and The Method of Manufactured Solutions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present HyPINO, a multi-physics neural operator designed for zero-shot generalization across a broad class of parametric PDEs without requiring task-specific fine-tuning. |
Rafael Bischof; Michal Piovarci; Michael Anton Kraus; Siddhartha Mishra; Bernd Bickel; |
| 263 | RidgeLoRA: Matrix Ridge Enhanced Low-Rank Adaptation of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the low-rank nature makes it prone to a decrease in representation ability, leading to suboptimal performance. In order to break this limitation, we propose RidgeLoRA, a LoRA-like lightweight architecture that incorporates a novel design and matrix ridge enhanced full-rank approximation, to match the performance of full-rank training, while eliminating the need for high memory and a large number of parameters to restore the rank of matrices. |
Junda Zhu; Jun Ai; Yujun Li; Yichun Yin; Yasheng Wang; Lifeng Shang; Qun Liu; |
| 264 | Think or Not? Exploring Thinking Efficiency in Large Reasoning Models Via An Information-Theoretic Lens Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose two metrics—InfoBias and InfoGain—to quantify divergence from ideal reasoning paths and stepwise information contribution, respectively. |
Xixian Yong; Xiao Zhou; Yingying Zhang; Jinlin Li; Yefeng Zheng; Xian Wu; |
| 265 | Hyperbolic Fine-Tuning for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this study, we investigate the non-Euclidean characteristics of LLMs. |
Menglin Yang; Ram Samarth B B; Aosong Feng; Bo Xiong; Jiahong Liu; Irwin King; Rex Ying; |
| 266 | Deep Continuous-Time State-Space Models for Marked Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose the _state-space point process_ (S2P2) model, a novel and performant model that leverages techniques derived for modern deep state-space models (SSMs) to overcome limitations of existing MTPP models, while simultaneously imbuing strong inductive biases for continuous-time event sequences that other discrete sequence models (i.e., RNNs, transformers) do not capture. |
Yuxin Chang; Alex James Boyd; Cao Xiao; Taha Kass-Hout; Parminder Bhatia; Padhraic Smyth; Andrew Warrington; |
| 267 | HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce HBLLM, a wavelet-enhanced high-fidelity $1$-bit post-training quantization method for Large Language Models (LLMs). |
Ningning CHEN; Weicai Ye; Ying Jiang; |
| 268 | Which Algorithms Have Tight Generalization Bounds? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study which machine learning algorithms have tight generalization bounds with respect to a given collection of population distributions. |
Michael Gastpar; Ido Nachum; Jonathan Shafer; Thomas Weinberger; |
| 269 | Axial Neural Networks for Dimension-Free Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Traditional approaches either fix a maximum dimension or employ separate encoders for different dimensionalities, resulting in inefficiencies. To address this, we propose a dimension-agnostic neural network architecture, the Axial Neural Network (XNN), inspired by parameter-sharing structures such as Deep Sets and Graph Neural Networks. |
Hyunsu Kim; Jonggeon Park; Joan Bruna; Hongseok Yang; Juho Lee; |
| 270 | Horizon Reduction Makes RL Scalable Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the scalability of offline reinforcement learning (RL) algorithms. |
Seohong Park; Kevin Frans; Deepinder Mann; Benjamin Eysenbach; Aviral Kumar; Sergey Levine; |
| 271 | OpenBox: Annotate Any Bounding Boxes in 3D Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing approaches uniformly annotate 3D bounding boxes, ignore objects’ physical states, and require multiple self-training iterations for annotation refinement, resulting in suboptimal quality and substantial computational overhead. To address these challenges, we propose OpenBox, a two-stage automatic annotation pipeline that leverages a 2D vision foundation model. |
In-Jae Lee; Mungyeom Kim; Kwonyoung Ryu; Pierre Musacchio; Jaesik Park; |
| 272 | SGCD: Stain-Guided CycleDiffusion for Unsupervised Domain Adaptation of Histopathology Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, achieving high-quality and stable translations typically requires paired data, which poses a challenge in scenarios with limited annotations in the target domain. To address this issue, this paper proposes a novel method termed Stain-Guided Cycle Diffusion (SGCD), employing a dual diffusion model with bidirectional generative constraints to synthesize highly realistic data for downstream task fine-tuning. |
Hsi-Ling Chen; Chun-Shien Lu; Pau-Choo Chung; |
| 273 | FFN Fusion: Rethinking Sequential Computation in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce \textit{FFN Fusion}, an architectural optimization technique that reduces sequential computation in large language models by identifying and exploiting natural opportunities for parallelization. |
Akhiad Bercovich; Mohammed Dabbah; Omri Puny; Ido Galil; Amnon Geifman; Yonatan Geifman; Izhak Golan; Ehud Dov Karpas; Itay Levy; Zach Moshe; Najeeb Nabwani; Tomer Ronen; Itamar Schen; Ido Shahaf; Oren Tropp; Ran Zilberstein; Ran El-Yaniv; |
| 274 | Distilling LLM Agent Into Small Models with Retrieval and Code Tools Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Agent Distillation, a framework for transferring not only reasoning capability but full task-solving behavior from LLM-based agents into sLMs with retrieval and code tools. |
Minki Kang; Jongwon Jeong; Seanie Lee; Jaewoong Cho; Sung Ju Hwang; |
| 275 | Understanding Parametric and Contextual Knowledge Reconciliation Within Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose modeling the forward propagation of knowledge as an entity flow, employing this framework to trace LLMs’ internal behaviors when processing mixed-source knowledge. |
Jun Zhao; Yongzhuo Yang; Xiang Hu; Jingqi Tong; Yi Lu; Wei Wu; Tao Gui; Qi Zhang; Xuanjing Huang; |
| 276 | Differentiable Decision Tree Via ReLU+Argmin Reformulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To ameliorate numerical instability, we propose a warm-start annealing scheme that solves multiple optimization tasks with increasingly accurate approximations. |
Qiangqiang Mao; Jiayang Ren; Yixiu Wang; Chenxuanyin Zou; Jingjing Zheng; Yankai Cao; |
| 277 | SafeVLA: Towards Safety Alignment of Vision-Language-Action Model Via Constrained Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: *How can safety constraints be explicitly integrated into VLAs?* We address this by exploring an integrated safety approach (ISA), systematically **modeling** safety requirements, then actively **eliciting** diverse unsafe behaviors, effectively **constraining** VLA policies via safe reinforcement learning, and rigorously **assuring** their safety through targeted evaluations. |
Borong Zhang; Yuhao Zhang; Jiaming Ji; Yingshan Lei; Josef Dai; Yuanpei Chen; Yaodong Yang; |
| 278 | 3D Interaction Geometric Pre-training for Molecular Relational Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper introduces a novel 3D geometric pre-training strategy for MRL (3DMRL) that incorporates a 3D virtual interaction environment, overcoming the limitations of costly traditional quantum mechanical calculation methods. |
Namkyeong Lee; Yunhak Oh; Heewoong Noh; Gyoung S. Na; Minkai Xu; Hanchen; Tianfan Fu; Chanyoung Park; |
| 279 | Improved Bounds for Swap Multicalibration and Swap Omniprediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the related problems of multicalibration — a multigroup fairness notion and omniprediction — a simultaneous loss minimization paradigm, both in the distributional and online settings. |
Haipeng Luo; Spandan Senapati; Vatsal Sharan; |
| 280 | UniteFormer: Unifying Node and Edge Modalities in Transformers for Vehicle Routing Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose UniteFormer, a unified neural solver that supports node-only, edge-only, and hybrid input types through a single model trained via joint edge-node modalities. |
Dian Meng; Zhiguang Cao; Jie Gao; Yaoxin Wu; Yaqing Hou; |
| 281 | Simultaneous Swap Regret Minimization Via KL-Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A technical contribution of our work is a new randomized rounding procedure and a non-uniform discretization scheme to minimize the swap regret for log loss. |
Haipeng Luo; Spandan Senapati; Vatsal Sharan; |
| 282 | Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the challenges of real-world travel planning, this paper introduces the Multiple Aspects of Planning (MAoP), empowering LLMs with wide-horizon thinking to solve planning problems with multifaceted constraints. |
Dongjie Yang; Chengqiang Lu; Qimeng Wang; Xinbei Ma; Yan Gao; Yao Hu; hai zhao; |
| 283 | Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by the quadratic nature of self-attention, we hypothesize that ViTs represent whether two patches belong to the same object, a property we term *IsSameObject*. |
Yihao Li; Saeed Salehi; Lyle Ungar; Konrad Kording; |
| 284 | Probing Neural Combinatorial Optimization Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we take the first critical step towards interpreting NCO models by investigating their representations through various probing tasks. |
Zhiqin Zhang; Yining Ma; Zhiguang Cao; Hoong Chuin Lau; |
| 285 | A Principled Path to Fitted Distributional Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While only a few related approaches exist, there remains no unified framework for designing FDE methods. To fill this gap, we present a set of guiding principles for constructing theoretically grounded FDE methods. |
Sungee Hong; Jiayi Wang; Zhengling Qi; Raymond K. W. Wong; |
| 286 | Improving LLM General Preference Alignment Via Optimistic Online Mirror Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we drop the BT model assumption and study LLM alignment under general preferences, formulated as a two-player game. |
Yuheng Zhang; Dian Yu; Tao Ge; Linfeng Song; Zhichen Zeng; Haitao Mi; Nan Jiang; Dong Yu; |
| 287 | Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: I.) Annealing of the Boltzmann distribution and II.) |
Tara Akhound-Sadegh; Jungyoon Lee; Joey Bose; Valentin De Bortoli; Arnaud Doucet; Michael M. Bronstein; Dominique Beaini; Siamak Ravanbakhsh; Kirill Neklyudov; Alexander Tong; |
| 288 | Compositional Monte Carlo Tree Diffusion for Extendable Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Compositional Monte Carlo Tree Diffusion (C-MCTD), a framework that elevates planning from individual trajectory optimization to reasoning over complete plan compositions. |
Jaesik Yoon; Hyeonseo Cho; Sungjin Ahn; |
| 289 | Fast Monte Carlo Tree Diffusion: 100× Speedup Via Parallel and Sparse Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite its strengths, our analysis shows that MCTD incurs substantial computational overhead due to the sequential nature of tree search and the cost of iterative denoising. To address this, we propose Fast-MCTD, a more efficient variant that preserves the strengths of MCTD while significantly improving its speed and scalability. |
Jaesik Yoon; Hyeonseo Cho; Yoshua Bengio; Sungjin Ahn; |
| 290 | Zero-shot Denoising Via Neural Compression: Theoretical and Algorithmic Framework Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose the *Zero-Shot Neural Compression Denoiser* (ZS-NCD), a novel denoising framework based on neural compression. |
Ali Zafari; Xi Chen; Shirin Jalali; |
| 291 | Gradient-Variation Online Adaptivity for Accelerated Optimization with Hölder Smoothness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Interestingly, these two problems are actually closely connected — accelerated optimization can be understood through the lens of gradient-variation online learning. In this paper, we investigate online learning with *Hölder* functions, a general class encompassing both smooth and non-smooth (Lipschitz) functions, and explore its implications for offline optimization. |
Yuheng Zhao; Yu-Hu Yan; Kfir Yehuda Levy; Peng Zhao; |
| 292 | A Smooth Sea Never Made A Skilled SAILOR: Robust Imitation Via Learning to Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, we explore *learning to search* (L2S) from expert demonstrations, i.e. learning the components required to, at test time, plan to match expert outcomes, even after making a mistake. |
Arnav Kumar Jain; Vibhakar Mohta; Subin Kim; Atiksh Bhardwaj; Juntao Ren; Yunhai Feng; Sanjiban Choudhury; Gokul Swamy; |
| 293 | MAESTRO : Adaptive Sparse Attention and Robust Learning for Multimodal Dynamic Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: From clinical healthcare to daily living, continuous sensor monitoring across multiple modalities has shown great promise for real-world intelligent decision-making but also faces various challenges. In this work, we argue for modeling such heterogeneous data sources under the multimodal paradigm and introduce a new framework, MAESTRO. |
Payal Mohapatra; Yueyuan Sui; Akash Pandey; Stephen Xia; Qi Zhu; |
| 294 | Ctrl-DNA: Controllable Cell-Type-Specific Regulatory DNA Design Via Constrained RL Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Here, we introduce Ctrl-DNA, a novel constrained reinforcement learning (RL) framework tailored for designing regulatory DNA sequences with controllable cell-type specificity. |
Xingyu Chen; Shihao Ma; Runsheng Lin; Jiecong Lin; BO WANG; |
| 295 | On The Universal Near Optimality of Hedge in Combinatorial Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the classical Hedge algorithm in combinatorial settings. |
Zhiyuan Fan; Arnab Maiti; Lillian J. Ratliff; Kevin Jamieson; Gabriele Farina; |
| 296 | Transfer Faster, Price Smarter: Minimax Dynamic Pricing Under Cross-Market Preference Shift Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study contextual dynamic pricing when a target market can leverage $K$ auxiliary markets—offline logs or concurrent streams—whose *mean utilities differ by a structured preference shift*. We propose *Cross-Market Transfer Dynamic Pricing (CM-TDP)*, the first algorithm that *provably* handles such model-shift transfer and delivers minimax-optimal regret for *both* linear and non-parametric utility models. |
Yi Zhang; Elynn Chen; Yujun Yan; |
| 297 | Transformer Brain Encoders Explain Human High-level Visual Responses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we employ the attention mechanism used in the transformer architecture to study how retinotopic visual features can be dynamically routed to category-selective areas in high-level visual processing. |
Hossein Adeli; Minni Sun; Nikolaus Kriegeskorte; |
| 298 | Fast and Fluent Diffusion Language Models Via Convolutional Decoding and Rejective Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, semi-AR eliminates the main advantages of diffusion models. To overcome this, we propose Convolutional decoding (\textit{Conv}), a normalization-based method that narrows the decoding window without hard segmentation, leading to better fluency and flexibility. |
Yeongbin Seo; Dongha Lee; Jaehyung Kim; Jinyoung Yeo; |
| 299 | MoBA: Mixture of Block Attention for Long-Context LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a solution that adheres to the “less structure” principle, allowing the model to determine where to attend autonomously, rather than introducing predefined biases. |
Enzhe Lu; Zhejun Jiang; Jingyuan Liu; Yulun Du; Tao Jiang; Chao Hong; Shaowei Liu; Weiran He; Enming Yuan; Yuzhi Wang; Zhiqi Huang; Huan Yuan; Suting Xu; Xinran Xu; Guokun Lai; Yanru Chen; Huabin Zheng; Junjie Yan; Jianlin Su; Yuxin Wu; Yutao Zhang; Zhilin Yang; Xinyu Zhou; Mingxing Zhang; Jiezhong Qiu; |
| 300 | DERD-Net: Learning Depth from Event-based Ray Densities Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a scalable, flexible and adaptable framework for pixel-wise depth estimation with event cameras in both monocular and stereo setups. |
Diego de Oliveira Hitzges; Suman Ghosh; Guillermo Gallego; |
| 301 | Beyond Expectations: Quantile-Guided Alignment for Risk-Calibrated Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Quantile‑Guided Alignment (QA), a framework that allows users to specify desired improvements at any quantile—individually or across multiple reward dimensions—thus shifting the distribution of outputs with finer control toward safer, more desirable outcomes. |
Xinran Wang; Jin Du; Azal Ahmad Khan; Qi Le; Enmao Diao; Jiawei Zhou; Jie Ding; Ali Anwar; |
| 302 | VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, focusing on reasoning tasks, we propose VPO, a negative gradient constraint method for human non-preference samples based on $\mathcal{V}$-usable information. |
Zecheng Wang; Chunshan Li; Yupeng Zhang; Han Liu; Bingning Wang; Dianhui Chu; Dianbo Sui; |
| 303 | MoCha: Towards Movie-Grade Talking Character Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unlike talking head tasks, Talking Characters aims at generating the full portrait of one or more characters beyond the facial region. In this paper, we propose MoCha, the first of its kind to generate talking characters. |
Cong Wei; Bo Sun; Haoyu Ma; Ji Hou; Felix Juefei-Xu; Zecheng He; Xiaoliang Dai; Luxin Zhang; Kunpeng Li; Tingbo Hou; Animesh Sinha; Peter Vajda; Wenhu Chen; |
| 304 | Stable Minima of ReLU Neural Networks Suffer from The Curse of Dimensionality: The Neural Shattering Phenomenon Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents new and somewhat surprising theoretical results for multivariate inputs. |
Tongtong Liang; Dan Qiao; Yu-Xiang Wang; Rahul Parhi; |
| 305 | Fisher Meets Feynman: Score-based Variational Inference with A Product of Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a highly expressive yet distinctly tractable family for black-box variational inference (BBVI). |
Diana Cai; Robert M. Gower; David Blei; Lawrence K. Saul; |
| 306 | Mitigating The Privacy–Utility Trade-off in Decentralized Federated Learning Via F-Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop two new $f$-DP–based accounting methods tailored to decentralized settings: Pairwise Network $f$-DP (PN-$f$-DP), which quantifies privacy leakage between user pairs under random-walk communication, and Secret-based $f$-Local DP (Sec-$f$-LDP), which supports structured noise injection via shared secrets. |
Xiang Li; Chendi Wang; Buxin Su; Qi Long; Weijie J Su; |
| 307 | Mitigating Instability in High Residual Adaptive Sampling for PINNs Via Langevin Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these methods often neglect points with medium or low residuals, which can affect stability as the complexity of the model increases. In this paper, we investigate this limitation and show that high residual-based approaches require stricter learning rate bounds to ensure stability. |
Minseok Jeong; Giup Seo; Euiseok Hwang; |
| 308 | Spectral Estimation with Free Decompression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In such settings, the matrix is impalpable, in the sense that we have access to only masked snapshots of it. We draw on principles from free probability theory to introduce a novel method of free decompression to estimate the spectrum of such matrices. |
Siavash Ameli; Chris van der Heide; Liam Hodgkinson; Michael W. Mahoney; |
| 309 | CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To ease the demand on the model, we propose \emph{Condition-Aware Reparameterization for Flow Matching} (CAR-Flow) — a lightweight, learned \emph{shift} that conditions the source, the target, or both distributions. |
Chen Chen; Pengsheng Guo; Liangchen Song; Jiasen Lu; Rui Qian; Tsu-Jui Fu; Xinze Wang; Wei Liu; Yinfei Yang; Alex Schwing; |
| 310 | Training-Free Constrained Generation With Stable Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While there is increasing effort to incorporate physics-based constraints into generative models, existing techniques are either limited in their applicability to latent diffusion frameworks or lack the capability to strictly enforce domain-specific constraints. To address this limitation, this paper proposes a novel integration of stable diffusion models with constrained optimization frameworks, enabling the generation of outputs satisfying stringent physical and functional requirements. |
Stefano Zampini; Jacob K Christopher; Luca Oneto; Davide Anguita; Ferdinando Fioretto; |
| 311 | Comparator-Adaptive $\Phi$-Regret: Improved Bounds, Simpler Algorithms, and Applications to Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a general idea to achieve an even better comparator-adaptive $\Phi$-regret bound via much simpler algorithms compared to Lu et al. [2025]. |
Soumita Hait; Ping Li; Haipeng Luo; Mengxiao Zhang; |
| 312 | X-Field: A Physically Informed Representation for 3D X-ray Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce X-Field, a 3D representation informed by the physics of X-ray imaging. |
Feiran Wang; Jiachen Tao; Junyi Wu; Haoxuan Wang; Bin Duan; Kai Wang; Zongxin Yang; Yan Yan; |
| 313 | Distillation Robustifies Unlearning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In short, distillation robustifies unlearning. Based on this result, we propose Unlearn-Noise-Distill-on-Outputs (UNDO), a scalable method that distills an unlearned model into a noised copy of itself. |
Bruce W. Lee; Addie Foote; Alex Infanger; Leni Shor; Harish K Kamath; Jacob Goldman-Wetzler; Bryce Woodworth; Alex Cloud; Alexander Matt Turner; |
| 314 | SimWorld: An Open-ended Simulator for Agents in Physical and Social Worlds Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SimWorld, a new simulator built on Unreal Engine 5, designed for developing and evaluating LLM/VLM agents in rich, real-world-like settings. |
Xiaokang Ye; Jiawei Ren; Yan Zhuang; Xuhong He; Yiming Liang; Yiqing Yang; Mrinaal Dogra; Xianrui Zhong; Eric Liu; Kevin Benavente; Rajiv Mandya Nagaraju; Dhruv Vivek Sharma; Ziqiao Ma; Tianmin Shu; Zhiting Hu; Lianhui Qin; |
| 315 | Two-Stage Learning of Stabilizing Neural Controllers Via Zubov Sampling and Iterative Domain Expansion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel two-stage training framework to jointly synthesize a controller and a Lyapunov function for continuous-time systems. |
Haoyu Li; Xiangru Zhong; Bin Hu; Huan Zhang; |
| 316 | The Complexity of Symmetric Equilibria in Min-Max Optimization and Team Zero-Sum Games Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider the problem of computing stationary points in min-max optimization, with a focus on the special case of Nash equilibria in (two-)team zero-sum games. |
Ioannis Anagnostides; Ioannis Panageas; Tuomas Sandholm; Jingming Yan; |
| 317 | On Traceability in $\ell_p$ Stochastic Convex Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate the necessity of traceability for accurate learning in stochastic convex optimization (SCO) under $\ell_p$ geometries. |
Sasha Voitovych; Mahdi Haghifam; Idan Attias; Gintare Karolina Dziugaite; Roi Livni; Daniel M. Roy; |
| 318 | Improved Representation Steering for Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We demonstrate how to improve representation steering via our new Reference-free Preference Steering (RePS), a bidirectional preference-optimization objective that jointly does concept steering and suppression. |
Zhengxuan Wu; Qinan Yu; Aryaman Arora; Christopher D Manning; Christopher Potts; |
| 319 | FlexOLMo: Open Language Models for Flexible Data Use Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce FlexOLMo, a new class of language models (LMs) that supports (1) distributed training without data sharing, where different model parameters are independently trained on private datasets, and (2) data-flexible inference, where these parameters along with their associated data can be easily included or excluded from model inferences with no further training. |
Weijia Shi; Akshita Bhagia; Kevin Farhat; Niklas Muennighoff; Jacob Morrison; Evan Pete Walsh; Dustin Schwenk; Shayne Longpre; Jake Poznanski; Allyson Ettinger; Daogao Liu; Margaret Li; Mike Lewis; Wen-tau Yih; Dirk Groeneveld; Luca Soldaini; Kyle Lo; Noah A. Smith; Luke Zettlemoyer; Pang Wei Koh; Hannaneh Hajishirzi; Ali Farhadi; Sewon Min; |
| 320 | The Fragile Truth of Saliency: Improving LLM Input Attribution Via Attention Bias Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a stress-testing framework inspired by the needle-in-a-haystack (NIAH) setting to systematically assess the reliability of seven popular input saliency methods. |
Yihua Zhang; Changsheng Wang; Yiwei Chen; Chongyu Fan; Jinghan Jia; Sijia Liu; |
| 321 | Enhancing Training Data Attribution with Representational Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Representation-based approaches are far more scalable, but typically rely on heuristic embeddings that are not optimized for attribution, limiting their fidelity. To address these challenges, we propose AirRep, a scalable, representation-based approach that closes this gap by learning task-specific and model-aligned representations optimized explicitly for TDA. |
Weiwei Sun; Haokun Liu; Nikhil Kandpal; Colin Raffel; Yiming Yang; |
| 322 | DisMo: Disentangled Motion Representations for Open-World Motion Transfer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these models often fail to provide an explicit representation of motion separate from content, limiting their applicability for content creators. To address this gap, we propose DisMo, a novel paradigm for learning abstract motion representations directly from raw video data via an image-space reconstruction objective. |
Thomas Ressler-Antal; Frank Fundel; Malek Ben Alaya; Stefan Andreas Baumann; Felix Krause; Ming Gui; Björn Ommer; |
| 323 | Rig3R: Rig-Aware Conditioning and Discovery for 3D Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce Rig3R, a generalization of prior multiview reconstruction models that incorporates rig structure when available, and learns to infer it when not. |
Samuel Li; Pujith Kachana; Prajwal Chidananda; Saurabh Nair; Yasutaka Furukawa; Matthew Brown; |
| 324 | Cycle-Sync: Robust Global Camera Pose Estimation Through Enhanced Cycle-Consistent Synchronization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Cycle-Sync, a robust and global framework for estimating camera poses (both rotations and locations). |
Shaohan Li; Yunpeng Shi; Gilad Lerman; |
| 325 | Dense Associative Memory with Epanechnikov Energy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel energy function for Dense Associative Memory (DenseAM) networks, the log-sum-ReLU (LSR), inspired by optimal kernel density estimation. |
Benjamin Hoover; Zhaoyang Shi; Krishna Balasubramanian; Dmitry Krotov; Parikshit Ram; |
| 326 | Searching Latent Program Spaces Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose the Latent Program Network (LPN), a new architecture that builds in test-time search directly into neural models. |
Matthew Macfarlane; Clément Bonnet; |
| 327 | Convergence Rates of Constrained Expected Improvement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the theoretical convergence rate of CEI has not been established. In this work, we study the convergence rate of CEI by analyzing its simple regret upper bound. |
Haowei Wang; Jingyi Wang; Zhongxiang Dai; Nai-Yuan Chiang; Szu Hui Ng; Cosmin G. Petra; |
| 328 | CoT Information: Improved Sample Complexity Under Chain-of-Thought Supervision Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper develops a statistical theory of learning under CoT supervision. |
Awni Altabaa; Omar Montasser; John Lafferty; |
| 329 | Unlocking Hidden Biomolecular Conformational Landscapes in Diffusion Models at Inference Time Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions using a combination of classifier guidance, filtering, and free energy estimation. |
Daniel D. Richman; Jessica Karaguesian; Carl-Mikael Suomivuori; Ron O. Dror; |
| 330 | QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose leveraging Singular-Value Decomposition (SVD) over the joint query (Q), key (K), and value (V) weight matrices to reduce KV cache size and computational overhead. |
Yutong Wang; Haiyu Wang; Sai Qian Zhang; |
| 331 | Caption This, Reason That: VLMs Caught in The Middle Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis reveals distinct cognitive profiles: while advanced models approach ceiling performance on some tasks (e.g. category identification), a significant gap persists, particularly in tasks requiring spatial understanding or selective attention. Investigating the source of these failures and potential methods for improvement, we employ a vision-text decoupling analysis, finding that models struggling with direct visual reasoning show marked improvement when reasoning over their own generated text captions. |
Zihan Weng; Lucas Gomez; Taylor Whittington Webb; Pouya Bashivan; |
| 332 | Universal Causal Inference in A Topos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore the universal properties underlying causal inference by formulating it in terms of a topos. |
Sridhar Mahadevan; |
| 333 | Can We Infer Confidential Properties of Training Data from LLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce PropInfer, a benchmark task for evaluating property inference in LLMs under two fine-tuning paradigms: question-answering and chat-completion. |
Pengrun Huang; Chhavi Yadav; Kamalika Chaudhuri; Ruihan Wu; |
| 334 | Temperature Is All You Need for Generalization in Langevin Dynamics and Other Markov Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze the generalization gap (gap between the training and test errors) when training a potentially over-parametrized model using a Markovian stochastic training algorithm, initialized from some distribution $\theta_0 \sim p_0$. |
Itamar Harel; Yonathan Wolanowsky; Gal Vardi; Nathan Srebro; Daniel Soudry; |
| 335 | Learning The Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we characterize syntactic templates, domain, and semantics in task-instruction pairs. |
Chantal Shaib; Vinith Menon Suriyakumar; Byron C Wallace; Marzyeh Ghassemi; |
| 336 | Corporate Needs You to Find The Difference: Revisiting Submodular and Supermodular Ratio Optimization Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Motivated by recent applications [39, 31], we formalize two new broad problems: the Unrestricted Sparsest Submodular Set (USSS) and Unrestricted Densest Supermodular Set (UDSS) which allow negative and non-monotone functions. |
Elfarouk Harb; Yousef Yassin; Chandra Chekuri; |
| 337 | Private Set Union with Multiple Contributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We consider differentially private algorithms that always report a subset of the union, and define the utility of an algorithm to be the expected size of the subset that it reports. Because the achievable utility varies significantly with the dataset, we introduce the *utility ratio*, which normalizes utility by a dataset-specific upper bound and characterizes a mechanism by its lowest normalized utility across all datasets. |
Travis Dick; Haim Kaplan; Alex Kulesza; Uri Stemmer; Ziteng Sun; Ananda Theertha Suresh; |
| 338 | Clustering Via Hedonic Games: New Concepts and Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study fundamental connections between coalition formation games and clustering, illustrating the cross-disciplinary relevance of these concepts. |
Gergely Csáji; Alexander Gundert; Jörg Rothe; Ildikó Schlotter; |
| 339 | The Graphon Limit Hypothesis: Understanding Neural Network Pruning Via Infinite Width Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite advances in pruning methods that create sparse architectures, why some sparse structures are more trainable than others at the same level of sparsity remains poorly understood. Aiming to develop a systematic approach to this fundamental problem, we propose a novel theoretical framework based on the theory of graph limits, particularly graphons, that characterizes sparse neural networks in the infinite-width regime. |
Hoang Pham; The-Anh Ta; Tom Jacobs; Rebekka Burkholz; Long Tran-Thanh; |
| 340 | Fast Training of Large Kernel Models with Delayed Projections Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present a new methodology for building kernel machines that can scale efficiently with both data size and model size. |
Amirhesam Abedsoltan; Siyuan Ma; Parthe Pandit; Mikhail Belkin; |
| 341 | D1: Scaling Reasoning in Diffusion Large Language Models Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose D1, a framework to adapt pre-trained masked dLLMs into reasoning models via a combination of supervised finetuning (SFT) and RL. |
Siyan Zhao; Devaansh Gupta; Qinqing Zheng; Aditya Grover; |
| 342 | Affine-Invariant Global Non-Asymptotic Convergence Analysis of BFGS Under Self-Concordance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we establish global non-asymptotic convergence guarantees for the BFGS quasi-Newton method without requiring strong convexity or the Lipschitz continuity of the gradient or Hessian. |
Qiujiang Jin; Aryan Mokhtari; |
| 343 | Extrapolation By Association: Length Generalization Transfer In Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we investigate length generalization—the ability to extrapolate from shorter to longer inputs—through the lens of \textit{task transfer}. |
Ziyang Cai; Nayoung Lee; Avi Schwarzschild; Samet Oymak; Dimitris Papailiopoulos; |
| 344 | Optimal Neural Compressors for The Rate-Distortion-Perception Tradeoff Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose neural compressors that are low complexity and benefit from high packing efficiency through lattice coding and shared randomness through shared dithering over the lattice cells. |
Eric Lei; Hamed Hassani; Shirin Saeedi Bidokhti; |
| 345 | Bridging Symmetry and Robustness: On The Role of Equivariance in Enhancing Adversarial Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate an architectural approach to adversarial robustness by embedding group-equivariant convolutions—specifically, rotation- and scale-equivariant layers—into standard convolutional neural networks (CNNs). |
Longwei Wang; Ifrat Ikhtear Uddin; KC Santosh; Chaowei Zhang; Xiao Qin; Yang Zhou; |
| 346 | Color Conditional Generation with Sliced Wasserstein Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose SW-Guidance, a training-free approach for image generation conditioned on the color distribution of a reference image. |
Alexander Lobashev; Maria Larchenko; Dmitry Guskov; |
| 347 | Private Hyperparameter Tuning with Ex-Post Guarantee Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their work, and similar findings by Whitehouse et al. [2023], are primarily limited to simple mechanisms based on Laplace or Gaussian noise. In this paper, we significantly generalize these results. |
Badih Ghazi; Pritish Kamath; Alexander Knop; Ravi Kumar; Pasin Manurangsi; Chiyuan Zhang; |
| 348 | A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To better understand the reasoning capability of LLMs, we study a minimal propositional logic problem that requires combining multiple facts to arrive at a solution. |
Guan Zhe Hong; Nishanth Dikkala; Enming Luo; Cyrus Rashtchian; Xin Wang; Rina Panigrahy; |
| 349 | Optimal and Provable Calibration in High-Dimensional Binary Classification: Angular Calibration and Platt Scaling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the fundamental problem of calibrating a linear binary classifier of the form $\sigma(\hat{w}^\top x)$, where the feature vector $x$ is Gaussian, $\sigma$ is a link function, and $\hat{w}$ is an estimator of the true linear weight $w^\star$. |
Yufan Li; Pragya Sur; |
| 350 | Reconstruction and Secrecy Under Approximate Distance Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This problem arises naturally in various contexts—from localization in GPS and sensor networks to privacy-aware data access—making it relevant from the perspective of both the reconstructor (seeking accurate recovery) and the responder (aiming to limit information disclosure, e.g., for privacy or security reasons). We study this reconstruction game through a learning-theoretic lens, focusing on the rate and limits of the best possible reconstruction error. |
Shay Moran; Elizaveta Nesterova; |
| 351 | Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address this, we propose FPX, an adaptive framework that dynamically selects model size and quantization level based on real-time demands. To support our investigation, we introduce two new benchmarks: HFTBench, a high-frequency trading simulation, and StreetFighter, a competitive gaming platform. |
Hao Kang; Qingru Zhang; Han Cai; Weiyuan Xu; Tushar Krishna; Yilun Du; Tsachy Weissman; |
| 352 | Characterizing The Expressivity of Fixed-Precision Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we analyze a restricted idealization of fixed-precision transformers with strict future masking, soft attention, and no positional encodings. |
Jiaoda Li; Ryan Cotterell; |
| 353 | Distributional Training Data Attribution: What Do Influence Functions Sample? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: They ignore the fact that, due to stochasticity in the initialisation and batching, training on the same dataset can yield different models. In this paper, we address this shortcoming by introducing _distributional_ training data attribution (d-TDA), the goal of which is to predict how the distribution of model outputs (over training runs) depends upon the dataset. |
Bruno Kacper Mlodozeniec; Isaac Reid; Samuel Power; David Krueger; Murat A Erdogdu; Richard E. Turner; Roger Baker Grosse; |
| 354 | Hogwild! Inference: Parallel LLM Generation Via Concurrent Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, each of these frameworks may not be suitable for all types of tasks, which can hinder their applicability. In this work, we propose a different design approach: we run LLM workers in parallel, allowing them to synchronize via a concurrently-updated attention cache and prompt these workers to decide how best to collaborate. |
Gleb Rodionov; Roman Garipov; Alina Shutova; George Yakushev; Erik Schultheis; Vage Egiazarian; Anton Sinitsin; Denis Kuznedelev; Dan Alistarh; |
| 355 | Mind-the-Glitch: Visual Correspondence for Detecting Inconsistencies in Subject-Driven Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a novel approach for disentangling visual and semantic features from the backbones of pre-trained diffusion models, enabling visual correspondence in a manner analogous to the well-established semantic correspondence. |
Abdelrahman Eldesokey; Aleksandar Cvejić; Bernard Ghanem; Peter Wonka; |
| 356 | The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we critically examine the concept of causal abstraction by considering arbitrarily powerful alignment maps. |
Denis Sutter; Julian Minder; Thomas Hofmann; Tiago Pimentel; |
| 357 | Gaussian Herding Across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing compaction approaches address this by pruning Gaussians based on heuristic importance scores, without a global fidelity guarantee. To bridge this gap, we propose a novel optimal transport perspective that casts 3DGS compaction as global Gaussian mixture reduction. |
Tao Wang; Mengyu Li; Geduo Zeng; Cheng Meng; Qiong Zhang; |
| 358 | Generalized Top-k Mallows Model for Ranked Choices Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we address several challenges related to the generalized top-$k$ Mallows model, with a focus on analyzing buyer choices. |
Shahrzad Haddadan; Sara Ahmadian; |
| 359 | Improving Bilinear RNN with Closed-loop Control Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first introduce the concept of Bilinear RNNs with a comprehensive analysis of the advantages and limitations of these models. Then, based on closed-loop control theory, we propose a novel Bilinear RNN variant named Comba, which adopts a scalar-plus-low-rank state transition, with both state feedback and output feedback corrections. |
Jiaxi Hu; Yongqi Pan; Jusen Du; Disen Lan; Xiaqiang Tang; Qingsong Wen; Yuxuan Liang; Weigao Sun; |
| 360 | Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present Spatial-MLLM, a novel framework for visual-based spatial reasoning from purely 2D observations. Beyond architecture improvements, we construct a training dataset from multiple sources and train the model on it using supervised fine-tuning and GRPO. |
Diankun Wu; Fangfu Liu; Yi-Hsin Hung; Yueqi Duan; |
| 361 | SANSA: Unleashing The Hidden Semantics in SAM2 for Few-Shot Segmentation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Our key insight is that, despite its class-agnostic pretraining, SAM2 already encodes rich semantic structure in its features. We propose SANSA (Semantically AligNed Segment Anything 2), a framework that makes this latent structure explicit, and repurposes SAM2 for few-shot segmentation through minimal task-specific modifications. |
Claudia Cuttano; Gabriele Trivigno; Giuseppe Averta; Carlo Masone; |
| 362 | Cost-Aware Contrastive Routing for LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Cost-Spectrum Contrastive Routing (CSCR), a lightweight framework that maps both prompts and models into a shared embedding space to enable fast, cost-sensitive selection. |
Reza Shirkavand; Shangqian Gao; Peiran Yu; Heng Huang; |
| 363 | Is Grokking A Computational Glass Relaxation? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here we propose an interpretation of grokking by framing it as a computational glass relaxation: viewing NNs as a physical system where parameters are the degrees of freedom and train loss is the system energy, we find that the memorization process resembles a rapid cooling of a liquid into a non-equilibrium glassy state at low temperature, and the later generalization is like a slow relaxation towards a more stable configuration. |
Xiaotian Zhang; Yue Shang; Entao Yang; Ge Zhang; |
| 364 | LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these approaches rely on a standard attention mechanism, which incurs quadratic token complexity, making real-time inference computationally expensive. In this paper, we introduce LoRATv2, a novel tracking framework that addresses these limitations with three main contributions. First, LoRATv2 integrates frame-wise causal attention, which ensures full self-attention within each frame while enabling causal dependencies across frames, significantly reducing computational overhead. |
Liting Lin; Heng Fan; Zhipeng Zhang; Yuqing Huang; Yaowei Wang; Yong Xu; Haibin Ling; |
| 365 | On The Necessity of Adaptive Regularisation: Optimal Anytime Online Learning on $\boldsymbol{\ell_p}$-balls Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. |
Emmeran Johnson; David Martínez-Rubio; Ciara Pike-Burke; Patrick Rebeschini; |
| 366 | Controlling Thinking Speed in Reasoning Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we enable LRMs to approximate human intelligence through dynamic thinking speed adjustment, optimizing accuracy-efficiency trade-offs. |
Zhengkai Lin; Zhihang Fu; Ze Chen; Chao Chen; Liang Xie; Wenxiao Wang; Deng Cai; Zheng Wang; Jieping Ye; |
| 367 | Vector Quantization in The Brain: Grid-like Codes in World Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Grid-like Code Quantization (GCQ), a brain-inspired method for compressing observation-action sequences into discrete representations using grid-like patterns in attractor dynamics. |
Xiangyuan Peng; Xingsi Dong; Si Wu; |
| 368 | IA-GGAD: Zero-shot Generalist Graph Anomaly Detection Via Invariant and Affinity Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle FSS, we develop an anomaly-driven graph invariant learning module that learns domain-invariant node representations. |
Xiong Zhang; Zhenli He; Changlong Fu; Cheng Xie; |
| 369 | Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This manuscript revisits the essence of generative image fusion under the inspiration of human cognitive laws and proposes a novel infrared and visible image fusion method, termed HCLFuse. |
Lin Guo; Xiaoqing Luo; Wei Xie; Zhancheng Zhang; Hui Li; Rui Wang; Zhenhua Feng; Xiaoning Song; |
| 370 | MonoLift: Learning 3D Manipulation Policies from Monocular RGB Via Distillation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: An intuitive alternative is to incorporate a pre-trained depth estimator; however, this often incurs substantial inference-time cost. To address this, we propose MonoLift, a tri-level knowledge distillation framework that transfers spatial, temporal, and action-level knowledge from a depth-guided teacher to a monocular RGB student. |
Ziru Wang; Mengmeng Wang; Guang Dai; Yongliu Long; Jingdong Wang; |
| 371 | SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce the concept of semantic orientation, which defines object orientations using natural language in a reference-frame-free manner (e.g., the “plug-in” direction of a USB or the “handle” direction of a cup). |
Zekun Qi; Wenyao Zhang; Yufei Ding; Runpei Dong; XinQiang Yu; Jingwen Li; Lingyun Xu; Baoyu Li; Xialin He; Guofan Fan; Jiazhao Zhang; Jiawei He; Jiayuan Gu; Xin Jin; Kaisheng Ma; Zhizheng Zhang; He Wang; Li Yi; |
| 372 | Conditional Representation Learning for Customized Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Conditional Representation Learning (CRL), aiming to extract representations tailored to arbitrary user-specified criteria. |
Honglin Liu; Chao Sun; Peng Hu; Yunfan Li; Xi Peng; |
| 373 | Angular Steering: Behavior Control Via Rotation in Activation Space Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Angular Steering, a novel and flexible method for behavior modulation that operates by rotating activations within a fixed two-dimensional subspace. |
Hieu M. Vu; Tan Minh Nguyen; |
| 374 | COOPERA: Continual Open-Ended Human-Robot Assistance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces COOPERA, a novel framework for COntinual, OPen-Ended human-Robot Assistance, where simulated humans, driven by psychological traits and long-term intentions, interact with robots in complex environments. Within COOPERA, we introduce a benchmark and an approach to personalize the robot’s collaborative actions by learning human traits and context-dependent intents. |
Chenyang Ma; Kai Lu; Ruta Desai; Xavier Puig; Andrew Markham; Niki Trigoni; |
| 375 | Joint Hierarchical Representation Learning of Samples and Features Via Informed Tree-Wasserstein Distance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose an unsupervised method for jointly learning hierarchical representations of samples and features via Tree-Wasserstein Distance (TWD). |
Ya-Wei Eileen Lin; Ronald R. Coifman; Gal Mishne; Ronen Talmon; |
| 376 | Detecting Generated Images By Fitting Natural Image Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel framework that exploits geometric differences between the data manifolds of natural and generated images. |
Yonggang Zhang; Jun Nie; Xinmei Tian; Mingming Gong; Kun Zhang; Bo Han; |
| 377 | Agnostic Learning Under Targeted Poisoning: Optimal Rates and The Role of Randomness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we resolve the corresponding question in the agnostic setting. |
Bogdan Chornomaz; Yonatan Koren; Shay Moran; Tom Waknine; |
| 378 | ORIGAMISPACE: Benchmarking Multimodal LLMs in Multi-Step Spatial Reasoning with Mathematical Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose four evaluation tasks: Pattern Prediction, Multi-step Spatial Reasoning, Spatial Relationship Prediction, and End-to-End CP Code Generation. |
Rui Xu; Dakuan Lu; Zicheng Zhao; Xiaoyu Tan; Xintao Wang; Siyu Yuan; Jiangjie Chen; Xu Yinghui; |
| 379 | Establishing Linear Surrogate Regret Bounds for Convex Smooth Losses Via Convolutional Fenchel–Young Losses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under this scenario, the better optimization and estimation properties of convex smooth surrogate losses may inevitably deteriorate after undergoing the regret transfer onto a target loss. We overcome this dilemma for arbitrary discrete target losses by constructing a convex smooth surrogate loss, which entails a linear surrogate regret bound composed with a tailored prediction link. |
Yuzhou Cao; Han Bao; Lei Feng; Bo An; |
| 380 | Co-Reinforcement Learning for Unified Multimodal Understanding and Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents a pioneering exploration of reinforcement learning (RL) via group relative policy optimization for unified multimodal large language models (ULMs), aimed at simultaneously reinforcing generation and understanding capabilities. |
Jingjing Jiang; Chongjie Si; Jun Luo; Hanwang Zhang; Chao Ma; |
| 381 | Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This work introduces Structured Linear Controlled Differential Equations (SLiCEs), a unifying framework for sequence models with structured, input-dependent state-transition matrices that retain the maximal expressivity of dense matrices whilst being cheaper to compute. |
Benjamin Walker; Lingyi Yang; Nicola Muca Cirone; Cristopher Salvi; Terry Lyons; |
| 382 | Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in processing long context for Large Language Models (LLMs); however, applying RAG to long video faces challenges such as disrupted temporal dependencies and inclusion of irrelevant information that can hinder accurate reasoning. To address these limitations, we propose Vgent, a novel **graph-based retrieval-reasoning-augmented generation framework** to enhance LVLMs for long video understanding. |
Xiaoqian Shen; Wenxuan Zhang; Jun Chen; Mohamed Elhoseiny; |
| 383 | Privacy Amplification By Random Allocation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing analyses of this scheme either rely on privacy amplification by shuffling which leads to overly conservative bounds or require Monte Carlo simulations that are computationally prohibitive in most practical scenarios. We give the first theoretical guarantees and numerical estimation algorithms for this sampling scheme. |
Moshe Shenfeld; Vitaly Feldman; |
| 384 | Tackling Biased Evaluators in Dueling Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we aim to minimize the regret in dueling bandits considering evaluators’ biased feedback. |
Ming Tang; Yuxuan Zhou; Chao Huang; |
| 385 | Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods primarily rely on a single coarse condition (e.g., skeleton sequences) as the intermediary to bridge the translation model and the video generation model, which limits both the naturalness and expressiveness of the generated videos. To overcome these limitations, we propose SignViP, a novel SLVG framework that incorporates multiple fine-grained conditions for improved generation fidelity. |
Cong Wang; Zexuan Deng; Zhiwei Jiang; Yafeng Yin; Fei Shen; Zifeng Cheng; Shiping Ge; Shiwei Gan; Qing Gu; |
| 386 | Transstratal Adversarial Attack: Compromising Multi-Layered Defenses in Text-to-Image Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing adversarial attacks have demonstrated vulnerabilities in isolated defense layers, they prove largely ineffective against multi-layered defenses deployed in real-world T2I systems. In this paper, we demonstrate that exploiting overlapping vulnerabilities across these distinct defense layers enables adversaries to systematically bypass the entire safeguard of T2I systems. |
Chunlong Xie; Kangjie Chen; Shangwei Guo; Shudong Zhang; Tianwei Zhang; Tao Xiang; |
| 387 | SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, because finding the optimal tree is NP-hard, enumerating the Rashomon set is inherently challenging. Therefore, we introduce SORTD, a novel framework that improves scalability and enumerates trees in the Rashomon set in order of the objective value, thus offering anytime behavior. |
Elif Arslan; Jacobus G. M. van der Linden; Serge Hoogendoorn; Marco Rinaldi; Emir Demirović; |
| 388 | TGA: True-to-Geometry Avatar Dynamic Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we propose True-to-Geometry Avatar Dynamic Reconstruction (TGA), a perspective-aware 4D Gaussian avatar framework that sensitively captures fine-grained facial variations for accurate 3D geometry reconstruction. |
Bo Guo; Sijia Wen; Ziwei Wang; Yifan Zhao; |
| 389 | Minimax Adaptive Online Nonparametric Regression Over Besov Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce an adaptive wavelet-based algorithm that performs sequential prediction without prior knowledge of $(s,p,q)$, and establish minimax-optimal regret bounds against any comparator in $B_{pq}^s$. |
Paul Liautaud; Pierre Gaillard; Olivier Wintenberger; |
| 390 | UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To address these issues, we first introduce UniSite-DS, the first UniProt (Unique Protein)-centric ligand binding site dataset, which contains 4.81 times more multi-site data and 2.08 times more overall data compared to the previously most widely used datasets. We then propose UniSite, the first end-to-end ligand binding site detection framework supervised by set prediction loss with bijective matching. |
Jigang Fan; QuanLin Wu; Shengjie Luo; Liwei Wang; |
| 391 | URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose **URDF-Anything**, an end-to-end automatic reconstruction framework based on a 3D multimodal large language model (MLLM). |
Zhe Li; Xiang Bai; Jieyu Zhang; Zhuangzhe Wu; Che Xu; Ying Li; Chengkai Hou; Shanghang Zhang; |
| 392 | L2DGCN: Learnable Enhancement and Label Selection Dynamic Graph Convolutional Networks for Mitigating Degree Bias Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the information bias caused by degree imbalance, we propose a Learnable Enhancement and Label Selection Dynamic Graph Convolutional Network (L2DGCN). |
Jingxiao Zhang; Shifei Ding; Jian Zhang; Lili Guo; Xuan Li; |
| 393 | Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Memo, a transformer-based architecture and training recipe for reinforcement learning (RL) on memory-intensive, long-horizon tasks. |
Gunshi Gupta; Karmesh Yadav; Zsolt Kira; Yarin Gal; Rahaf Aljundi; |
| 394 | Achilles’ Heel of Mamba: Essential Difficulties of The Mamba Architecture Demonstrated By Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we use carefully designed synthetic tasks to reveal Mamba’s inherent limitations. |
Tianyi Chen; Pengxiao Lin; Zhiwei Wang; Zhi-Qin John Xu; |
| 395 | RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce RepLDM, a novel reprogramming framework for pretrained LDMs that enables high-quality, high-efficiency, high-resolution image generation; see Fig. 1. |
Boyuan Cao; Jiaxin Ye; Yujie Wei; Hongming Shan; |
| 396 | Understanding LLM Behaviors Via Compression: Data Generation, Knowledge Acquisition and Scaling Laws Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we revisit the classical relationship between compression and prediction, grounded in Kolmogorov complexity and Shannon information theory, to provide deeper insights into LLM behaviors. |
Zhixuan Pan; Shaowen Wang; Liao Pengfei; Jian Li; |
| 397 | Differentiable Hierarchical Visual Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce an end-to-end differentiable tokenizer that adapts to image content with pixel-level granularity while remaining backward-compatible with existing architectures for retrofitting pretrained models. |
Marius Aasan; Martine Hjelkrem-Tan; Nico Catalano; Changkyu Choi; Adín Ramírez Rivera; |
| 398 | Trajectory Graph Learning: Aligning with Long Trajectories in Reinforcement Learning Without Reward Design Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we study the problem of directly aligning policies with expert-labeled trajectories to preserve long-horizon behavior without relying on reward signals. |
Yunfan Li; Eric Liu; Lin Yang; |
| 399 | Meta CLIP 2: A Worldwide Scaling Recipe Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we present Meta CLIP 2, the first recipe training CLIP from scratch on worldwide web-scale image-text pairs. |
Yung-Sung Chuang; Yang Li; Dong Wang; Ching-Feng Yeh; Kehan Lyu; Ramya Raghavendra; James R. Glass; LIFEI HUANG; Jason E Weston; Luke Zettlemoyer; Xinlei Chen; Zhuang Liu; Saining Xie; Wen-tau Yih; Shang-Wen Li; Hu Xu; |
| 400 | Policy Compatible Skill Incremental Learning Via Lazy Learning Interface Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose SIL-C, a novel framework that ensures skill-policy compatibility, allowing improvements in incrementally learned skills to enhance the performance of downstream policies without requiring policy re-training or structural adaptation. |
Daehee Lee; Dongsu Lee; TaeYoon Kwack; Wonje Choi; Honguk Woo; |
| 401 | Estimating Cognitive Biases with Attention-aware Inverse Planning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, building on recent work in computational cognitive science, we formally articulate the *attention-aware inverse planning problem*, in which the goal is to estimate a person’s attentional biases from their actions. We demonstrate how attention-aware inverse planning systematically differs from standard inverse reinforcement learning and how cognitive biases can be inferred from behavior. |
Sounak Banerjee; Daphne Cornelisse; Deepak Edakkattil Gopinath; Emily Sumner; Jonathan DeCastro; Guy Rosman; Eugene Vinitsky; Mark K Ho; |
| 402 | An Efficient Orlicz-Sobolev Approach for Transporting Unbalanced Measures on A Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, GST provides a scalable yet rigid framework, which poses significant challenges to extend GST to accommodate nonnegative measures. To tackle these challenges, in this work we revisit the entropy partial transport (EPT) problem. |
Tam Le; Truyen Nguyen; Hideitsu Hino; Kenji Fukumizu; |
| 403 | DeepDiver: Adaptive Web-Search Intensity Scaling Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce **WebPuzzle**, a 24k-sample training and 275-sample test benchmark that evaluates information seeking on the live internet, across both wiki and open-domain queries. |
Wenxuan Shi; Haochen Tan; Chuqiao Kuang; Xiaoguang Li; Hanting Chen; Xiaozhe Ren; Yasheng Wang; Lu Hou; Lifeng Shang; |
| 404 | MetaGS: A Meta-Learned Gaussian-Phong Model for Out-of-Distribution 3D Scene Relighting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing relighting methods, which assume consistent light source distributions between training and testing, often degrade in OOD scenarios. We introduce **MetaGS** to tackle this challenge from two perspectives. |
Yumeng He; Yunbo Wang; |
| 405 | Towards A Pairwise Ranking Model with Orderliness and Monotonicity for Label Enhancement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these works still exhibit deficiencies in representing the probabilistic relationships between label distribution and label rankings, or fail to accommodate scenarios where multiple labels are equally important for a given instance. Therefore, we propose PROM, a pairwise ranking model with orderliness and monotonicity, to explain the probabilistic relationship between label distributions and label rankings. |
Yunan Lu; Xixi Zhang; Yaojin Lin; Weiwei Li; Lei Yang; Xiuyi Jia; |
| 406 | Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we develop a theoretical framework for analyzing budget allocation strategies. |
Pu Yang; Yunzhen Feng; Ziyuan Chen; Yuhang Wu; Zhuoyuan Li; |
| 407 | Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Applying reinforcement learning, typically through GRPO, to large vision-language model reasoning either struggles to effectively scale reasoning length or generates verbose outputs across all tasks with only marginal gains in accuracy. To address this issue, we present FAST-GRPO, a variant of GRPO that dynamically adapts reasoning depth based on question characteristics. |
Wenyi Xiao; Leilei Gan; |
| 408 | Exploration Via Feature Perturbation in Contextual Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose *feature perturbation*, a simple yet effective exploration strategy for contextual bandits that injects randomness directly into feature inputs, instead of randomizing unknown parameters or adding noise to rewards. |
Seouh-won Yi; Min-hwan Oh; |
| 409 | Instance-Optimality for Private KL Distribution Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Under natural notions of local neighborhood, we propose algorithms that achieve instance-optimality up to constant factors, with and without a differential privacy constraint. |
Jiayuan Ye; Vitaly Feldman; Kunal Talwar; |
| 410 | DMWM: Dual-Mind World Model with Long-Term Imagination Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the dual-process theory of human cognition, we propose a novel dual-mind world model (DMWM) framework that integrates logical reasoning to enable imagination with logical consistency. |
Lingyi Wang; Rashed Shelim; Walid Saad; Naren Ramakrishnan; |
| 411 | TimeWak: Temporal Chained-Hashing Watermark for Time Series Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose TimeWak, the first watermarking algorithm for multivariate time series diffusion models. |
Zhi Wen Soi; Chaoyi Zhu; Fouad Abiad; Aditya Shankar; Jeroen M. Galjaard; Huijuan Wang; Lydia Y. Chen; |
| 412 | UFO: A Unified Approach to Fine-grained Visual Perception Via Open-ended Language Interface Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This is primarily because these tasks often rely heavily on task-specific designs and architectures that can complicate the modeling process. To address this challenge, we present UFO, a framework that unifies fine-grained visual perception tasks through an open-ended language interface. |
Hao Tang; Chen-Wei Xie; Haiyang Wang; Xiaoyi Bao; Tingyu Weng; Pandeng Li; Yun Zheng; Liwei Wang; |
| 413 | DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this approach prohibits fine-grained comparisons, and we point out that it biases the annotators towards low-motion clips as they often contain fewer visual artifacts. In this work, we introduce DenseDPO, a method that addresses these shortcomings by making three contributions. First, we create each video pair for DPO by denoising corrupted copies of a ground truth video. |
Ziyi Wu; Anil Kag; Ivan Skorokhodov; Willi Menapace; Ashkan Mirzaei; Igor Gilitschenski; Sergey Tulyakov; Aliaksandr Siarohin; |
| 414 | Precise Asymptotics and Refined Regret of Variance-Aware UCB Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the behavior of the Upper Confidence Bound-Variance (UCB-V) algorithm for the Multi-Armed Bandit (MAB) problems, a variant of the canonical Upper Confidence Bound (UCB) algorithm that incorporates variance estimates into its decision-making process. |
Yingying Fan; Yuxuan Han; Jinchi Lv; Xiaocong XU; Zhengyuan Zhou; |
| 415 | InfMasking: Unleashing Synergistic Information By Contrastive Multimodal Interactions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This is particularly problematic because synergistic information constitutes the fundamental value proposition of multimodal representation. To address this challenge, we introduce InfMasking, a contrastive synergistic information extraction method designed to enhance synergistic information through an Infinite Masking strategy. |
Liangjian Wen; Qun Dai; Jianzhuang Liu; Jiangtao Zheng; Yong Dai; Dongkai Wang; Zhao Kang; Jun Wang; Zenglin Xu; Jiang Duan; |
| 416 | Extracting Task-relevant Preserved Dynamics from Contrastive Aligned Neural Recordings Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce $\underline{\text{C}}$ontrastive $\underline{\text{A}}$ligned $\underline{\text{N}}$eural $\underline{\text{DY}}$namics (CANDY), an end-to-end framework that aligns neural and behavioral data using rank-based contrastive learning, adapted for continuous behavioral variables, to project neural activity from different sessions onto a shared low-dimensional embedding space. |
Yiqi Jiang; Kaiwen Sheng; Yujia Gao; E. Kelly Buchanan; Yu Shikano; Seung Je Woo; Yixiu Zhao; Tony Hyun Kim; Fatih Dinc; Scott Linderman; Mark Schnitzer; |
| 417 | Orochi: Versatile Biomedical Image Processor Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these plugins are typically based on models that are limited to specific tasks and datasets, making them less practical for biologists. To address this challenge, we introduce **Orochi**, the first application-oriented, efficient, and versatile image processor designed to overcome these limitations. |
Gaole Dai; Chenghao Zhou; Yu Zhou; Rongyu Zhang; Yuan Zhang; Chengkai Hou; Tiejun Huang; Jianxu Chen; Shanghang Zhang; |
| 418 | WISA: World Simulator Assistant for Physics-aware Text-to-video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This limitation stems primarily from the lack of explicit physical guidance, caused by a significant gap between high-level physical concepts and the generative capabilities of current models. To address this challenge, we propose the **W**orld **S**imulator **A**ssistant (**WISA**), a novel framework designed to systematically decompose and integrate physical principles into T2V models. |
Jing Wang; Ao Ma; Ke Cao; Jun Zheng; Jiasong Feng; Zhanjie Zhang; Wanyuan Pang; Xiaodan Liang; |
| 419 | Conservative Classifiers Do Consistently Well with Improving Agents: Characterizing Statistical and Online Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper we characterize so-called learnability with improvements across multiple new axes. |
Dravyansh Sharma; Alec Sun; |
| 420 | KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing benchmarks are often domain-specific and thus cannot fully capture an LLM’s general reasoning potential. To address this limitation, we introduce the **Knowledge Orthogonal Reasoning Gymnasium (KORGym)**, a dynamic evaluation platform inspired by KOR-Bench and Gymnasium. |
Jiajun Shi; Jian Yang; Jiaheng Liu; Xingyuan Bu; Jiangjie Chen; Junting Zhou; Kaijing Ma; Zhoufutu Wen; Bingli Wang; Yancheng He; Liang Song; Hualei Zhu; Shilong Li; Xingjian Wang; Wei Zhang; Ruibin Yuan; Yifan Yao; Wenjun Yang; Yunli Wang; Siyuan Fang; Siyu Yuan; Qianyu He; Xiangru Tang; Yingshui Tan; Wangchunshu Zhou; Zhaoxiang Zhang; Zhoujun Li; Wenhao Huang; Ge Zhang; |
| 421 | Learnable Sampler Distillation for Discrete Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Accelerating DDMs with larger step sizes typically degrades generation quality, as it amplifies both the compounding decoding error from factorized predictions and the discretization error from numerical approximations. To address these challenges, we propose learnable sampler distillation (LSD), a novel approach to train fast and high-fidelity samplers for DDMs. |
Feiyang Fu; Tongxian Guo; Zhaoqiang Liu; |
| 422 | On The Hardness of Approximating Distributions with Tractable Probabilistic Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, due to hardness of inference tasks, exactly representing distributions while supporting tractable inference often incurs exponential size blow-ups. In this paper, we consider a natural, yet so far underexplored, question: can we avoid such size blow-up by allowing for some small approximation error? |
John Leland; YooJung Choi; |
| 423 | MoESD: Unveil Speculative Decoding’s Potential for Accelerating Sparse MoE Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While current SD research primarily focuses on improving acceptance rates of algorithms, changes in workload and model architecture can still lead to degraded SD acceleration even with high acceptance rates. To address this limitation, we introduce a new metric ‘target efficiency’ that characterizes these effects, thus helping researchers identify system bottlenecks and understand SD acceleration more comprehensively. |
Zongle Huang; Lei Zhu; ZongYuan Zhan; Ting Hu; Weikai Mao; Xianzhi Yu; Yongpan Liu; Tianyu Zhang; |
| 424 | SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce ThinkLite-VL, a family of visual reasoning models that achieve state-of-the-art (SoTA) performance using an order of magnitude fewer training samples, relying purely on reinforcement fine-tuning (RFT) self-improvement without any knowledge distillation. |
Xiyao Wang; Zhengyuan Yang; Chao Feng; Hongjin Lu; Linjie Li; Chung-Ching Lin; Kevin Lin; Furong Huang; Lijuan Wang; |
| 425 | Deno-IF: Unsupervised Noisy Visible and Infrared Image Fusion Method Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a novel unsupervised noisy visible and infrared image fusion method, comprising two key modules. |
Han Xu; Yuyang Li; Yunfei Deng; Jiayi Ma; Guangcan Liu; |
| 426 | ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, current generative approaches often rely on complex post-processing or extensive fine-tuning on massive datasets to achieve satisfactory results, and they remain prone to content–position mismatches and semantic leakage. To overcome these limitations, we introduce ReCon, a novel augmentation framework that enhances the capacity of structure-controllable generative models for object detection. |
Haowei Zhu; Tianxiang Pan; Rui Qin; Jun-Hai Yong; Bin Wang; |
| 427 | Repurposing Marigold for Zero-Shot Metric Depth Estimation Via Defocus Blur Cues Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they still exhibit a significant performance drop on out-of-distribution datasets. We address this limitation by injecting defocus blur cues at inference time into Marigold, a \textit{pre-trained} diffusion model for zero-shot, scale-invariant monocular depth estimation (MDE). |
Chinmay Talegaonkar; Nikhil Gandudi Suresh; Zachary Novack; Yash Belhe; Priyanka Nagasamudra; Nicholas Antipa; |
| 428 | PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce PARTONOMY, an LMM benchmark designed for pixel-level part grounding. |
Ansel Blume; Jeonghwan Kim; Hyeonjeong Ha; Elen Chatikyan; Xiaomeng Jin; Khanh Duy Nguyen; Nanyun Peng; Kai-Wei Chang; Derek Hoiem; Heng Ji; |
| 429 | Composite Flow Matching for Reinforcement Learning with Shifted-Dynamics Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their estimation of the dynamics gap often relies on KL divergence or mutual information, which can be ill-defined when the source and target dynamics have disjoint support. To overcome these limitations, we propose CompFlow, a method grounded in the theoretical connection between flow matching and optimal transport. |
Lingkai Kong; Haichuan Wang; Tonghan Wang; GUOJUN XIONG; Milind Tambe; |
| 430 | Asymmetric Duos: Sidekicks Improve Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new cost-effective strategy for improving the uncertainty quantification and downstream decisions of a large model (e.g. a fine-tuned ViT-B): coupling it with a less accurate but much smaller sidekick (e.g. a fine-tuned ResNet-34) with a fraction of the computational cost. |
Tim G. Zhou; Evan Shelhamer; Geoff Pleiss; |
| 431 | Robust Learning of Halfspaces Under Log-concave Marginals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We give an algorithm that agnostically learns linear threshold functions and returns a classifier with boundary volume $O(r+\varepsilon)$ at radius of perturbation $r$. |
Jane Lange; Arsen Vasilyan; |
| 432 | Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: These results encompass important special cases including Spectral Descent and Muon, which we show converge to max-margin solutions with respect to the spectral norm. A key insight of our contribution is that the analysis of general entry-wise and Schatten p-norms can be reduced to the analysis of NSD/NMD with max-norm by exploiting a natural ordering property between all p-norms relative to the max-norm and its dual sum-norm. |
Chen Fan; Mark Schmidt; Christos Thrampoulidis; |
| 433 | Lost in Transmission: When and Why LLMs Fail to Reason Globally Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We argue that these failures arise due to capacity limits on the accurate flow of information within LLMs. To formalize this issue, we introduce the bounded attention prefix oracle (BAPO) model, a new computational framework that models bandwidth constraints on attention heads, the mechanism for internal communication in LLMs. |
Tobias Schnabel; Kiran Tomlinson; Adith Swaminathan; Jennifer Neville; |
| 434 | On Transferring Transferability: Towards A Theory for Size Generalization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. |
Eitan Levin; Yuxin Ma; Mateo Diaz Diaz; Soledad Villar; |
| 435 | When Worse Is Better: Navigating The Compression Generation Trade-off In Visual Tokenization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This reveals a fundamental trade-off: do we compress more aggressively to make the latent distribution easier for the stage 2 model to learn, even if it makes reconstruction worse? We study this problem in the context of discrete, auto-regressive image generation. |
Vivek Ramanujan; Kushal Tirumala; Armen Aghajanyan; Luke Zettlemoyer; Ali Farhadi; |
| 436 | FlowFeat: Pixel-Dense Embedding of Motion Profiles Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, state-of-the-art networks, such as transformers, produce low-resolution feature grids, which are suboptimal for dense prediction tasks. To address this limitation, we present *FlowFeat*, a high-resolution and multi-task feature representation. |
Nikita Araslanov; Anna Ribic; Daniel Cremers; |
| 437 | The Generative Leap: Tight Sample Complexity for Efficiently Learning Gaussian Multi-Index Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work we consider generic Gaussian Multi-index models, in which the labels only depend on the (Gaussian) $d$-dimensional inputs through their projection onto a low-dimensional $r = O_d(1)$ subspace, and we study efficient agnostic estimation procedures for this hidden subspace. |
Alex Damian; Jason D. Lee; Joan Bruna; |
| 438 | Incremental Sequence Classification with Temporal Consistency Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Drawing on temporal-difference learning from reinforcement learning, we identify a temporal-consistency condition that successive predictions should satisfy. We leverage this condition to develop a novel loss function for training incremental sequence classifiers. |
Lucas Maystre; Gabriel Barello; Tudor Berariu; Aleix Cambray; Rares Dolga; Alvaro Ortega Gonzalez; Andrei Cristian Nica; David Barber; |
| 439 | LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present a novel level-of-detail (LOD) method for 3D Gaussian Splatting that enables real-time rendering of large-scale scenes on memory-constrained devices. |
Jonas Kulhanek; Marie-Julie Rakotosaona; Fabian Manhardt; Christina Tsalicoglou; Michael Niemeyer; Torsten Sattler; Songyou Peng; Federico Tombari; |
| 440 | Solving Neural Min-Max Games: The Role of Architecture, Initialization & Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While such games often involve non-convex non-concave objectives, empirical evidence shows that simple gradient methods frequently converge, suggesting a hidden geometric structure. In this paper, we provide a theoretical framework that explains this phenomenon through the lens of \emph{hidden convexity} and \emph{overparameterization}. |
Deep Patel; Emmanouil-Vasileios Vlatakis-Gkaragkounis; |
| 441 | Provable Gradient Editing of Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present ProGrad, the first efficient approach for editing the parameters of a DNN to provably enforce hard constraints on the DNN gradients. |
Zhe Tao; Aditya V. Thakur; |
| 442 | Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Audio Flamingo 3 (AF3), a fully open state-of-the-art (SOTA) large audio-language model that advances reasoning and understanding across speech, sound, and music. To enable these capabilities, we propose several large-scale training datasets curated using novel strategies, including AudioSkills-XL, LongAudio-XL, AF-Think, and AF-Chat, and train AF3 with a novel five-stage curriculum-based training strategy. |
Sreyan Ghosh; Arushi Goel; Jaehyeon Kim; Sonal Kumar; Zhifeng Kong; Sang-gil Lee; Chao-Han Huck Yang; Ramani Duraiswami; Dinesh Manocha; Rafael Valle; Bryan Catanzaro; |
| 443 | RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method for more accurate and efficient camera parameter optimization in dynamic scenes solely supervised by a single RGB video, dubbed $\textbf{\textit{ROS-Cam}}$. |
Fang Li; Hao Zhang; Narendra Ahuja; |
| 444 | Integration Matters for Learning PDEs with Backwards SDEs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, standard BSDE-based solvers have empirically been shown to underperform relative to PINNs in the literature. In this paper, we identify the root cause of this performance gap as a discretization bias introduced by the standard Euler-Maruyama (EM) integration scheme applied to one-step self-consistency BSDE losses, which shifts the optimization landscape off target. |
Sungje Park; Stephen Tu; |
| 445 | Unleashing Hour-Scale Video Training for Long Video-Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the scarcity of well-annotated long videos has left the training of hour-long Video-LMMs underexplored. To close this gap, we present VideoMarathon, a large-scale hour-long video instruction-following dataset. |
Jingyang Lin; Jialian Wu; Ximeng Sun; Ze Wang; Jiang Liu; Yusheng Su; Xiaodong Yu; Hao Chen; Jiebo Luo; Zicheng Liu; Emad Barsoum; |
| 446 | Neural Entropy Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We explore the connection between deep learning and information theory through the paradigm of diffusion models. |
Akhil Premkumar; |
| 447 | Locality in Image Diffusion Models Emerges from Data Statistics Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we present evidence that the locality in deep diffusion models emerges as a statistical property of the image dataset and is not due to the inductive bias of convolutional neural networks, as suggested in previous work. |
Artem Lukoianov; Chenyang Yuan; Justin Solomon; Vincent Sitzmann; |
| 448 | Regularized Least Squares Learning with Heavy-tailed Noise Is Minimax Optimal Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper examines the performance of ridge regression in reproducing kernel Hilbert spaces in the presence of noise that exhibits a finite number of higher moments. |
Mattes Mollenhauer; Nicole Mücke; Dimitri Meunier; Arthur Gretton; |
| 449 | On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We address this by introducing entropy-regularized Markov games, which yield a unique equilibrium while preserving strategic incentives. For this setting, we provide a sample complexity analysis detailing how errors affect learned policy performance. |
Till Freihaut; Giorgia Ramponi; |
| 450 | MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose MonarchAttention — a novel approach to sub-quadratic attention approximation via Monarch matrices, an expressive class of structured matrices. |
Can Yaras; Alec S Xu; Pierre Abillama; Changwoo Lee; Laura Balzano; |
| 451 | The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a white-box probing framework that (i) linearly identifies awareness-related activations and (ii) steers models toward or away from test awareness while monitoring downstream performance. |
Sahar Abdelnabi; Ahmed Salem; |
| 452 | Measuring and Controlling Solution Degeneracy Across Task-Trained Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we develop a unified framework to systematically quantify and control solution degeneracy across three levels: behavior, neural dynamics, and weight space. |
Ann Huang; Satpreet Harcharan Singh; Flavio Martinelli; Kanaka Rajan; |
| 453 | Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We prove that BOLA remains sample-efficient even under imperfect predictions. |
Chenbei Lu; Zaiwei Chen; Tongxin Li; Chenye Wu; Adam Wierman; |
| 454 | AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We focus on methods for improving agents’ performance on MLE-bench, a challenging benchmark where agents compete in Kaggle competitions to solve real-world machine learning problems. |
Edan Toledo; Karen Hambardzumyan; Martin Josifoski; Rishi Hazra; Nicolas Baldwin; Alexis Audran-Reiss; Michael Kuchnik; Despoina Magka; Minqi Jiang; Alisia Maria Lupidi; Andrei Lupu; Roberta Raileanu; Tatiana Shavrina; Kelvin Niu; Jean-Christophe Gagnon-Audet; Michael Shvartsman; Shagun Sodhani; Alexander H Miller; Abhishek Charnalia; Derek Dunfield; Carole-Jean Wu; Pontus Stenetorp; Nicola Cancedda; Jakob Nicolaus Foerster; Yoram Bachrach; |
| 455 | Differentiable Cyclic Causal Discovery Under Unmeasured Confounders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods that account for confounders either assume linearity or struggle with scalability. To address these limitations, we propose DCCD-CONF, a novel framework for differentiable learning of nonlinear cyclic causal graphs in the presence of unmeasured confounders using interventional data. |
Muralikrishnna Guruswamy Sethuraman; Faramarz Fekri; |
| 456 | ELECTRA: A Cartesian Network for 3D Charge Density Prediction with Floating Orbitals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present the Electronic Tensor Reconstruction Algorithm (ELECTRA) – an equivariant model for predicting electronic charge densities using floating orbitals. |
Jonas Elsborg; Luca Thiede; Alan Aspuru-Guzik; Tejs Vegge; Arghya Bhowmik; |
| 457 | Compositional Neural Network Verification Via Assume-Guarantee Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, scaling verification to large networks is challenging, at least in part due to the significant memory requirements of verification algorithms. In this paper, we propose an assume-guarantee compositional framework, CoVeNN, that is parameterized by an underlying verifier to generate a sequence of verification sub-problems to address this challenge. |
Hai Duong; David Shriver; ThanhVu Nguyen; Matthew B. Dwyer; |
| 458 | A Machine Learning Approach That Beats Rubik’s Cubes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The paper proposes a novel machine learning-based approach to the pathfinding problem on extremely large graphs. |
Alexander Chervov; Kirill Khoruzhii; Nikita Bukhal; Jalal Naghiyev; Vladislav Zamkovoy; Ivan Koltsov; Lyudmila Cheldieva; Arsenii Sychev; Arsenii Lenin; Mark Obozov; Egor Urvanov; Alexey M. Romanov; |
| 459 | ENMA: Tokenwise Autoregression for Continuous Neural PDE Operators Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ENMA, a generative neural operator designed to model spatio-temporal dynamics arising from physical phenomena. |
Armand Kassaï Koupaï; Lise Le Boudec; Louis Serrano; Patrick Gallinari; |
| 460 | Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a simple, empirical approach to *directly* measure the CBS and show how the CBS evolves over training. |
William Merrill; Shane Arora; Dirk Groeneveld; Hannaneh Hajishirzi; |
| 461 | Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For instance, reward-free and goal-conditioned RL methods often presume that the successor measure admits a low-rank representation. In this work, we challenge this assumption by first remarking that the successor measure itself is not approximately low-rank. |
Bastien Dubail; Stefan Stojanovic; Alexandre Proutiere; |
| 462 | Stochastic Optimization in Semi-Discrete Optimal Transport: Convergence Analysis and Minimax Rate Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we answer it positively by providing both computational and statistical convergence guarantees of SGD. |
Ferdinand Genans; Antoine Godichon-Baggioni; François-Xavier Vialard; Olivier Wintenberger; |
| 463 | Transferable Black-Box One-Shot Forging of Watermarks Via Image Preference Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we investigate watermark forging in the context of widely used post-hoc image watermarking. |
Tomas Soucek; Sylvestre-Alvise Rebuffi; Pierre Fernandez; Nikola Jovanović; Hady Elsahar; Valeriu Lacatusu; Tuan A. Tran; Alexandre Mourachko; |
| 464 | Non-Asymptotic Analysis Of Data Augmentation For Precision Matrix Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel deterministic equivalent for generalized resolvent matrices, accommodating dependent samples with specific structure. |
Lucas Morisset; Adrien Hardy; Alain Oliviero Durmus; |
| 465 | Improving Perturbation-based Explanations By Understanding The Role of Uncertainty Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We show that models systematically produce unreliable probability estimates when subjected to explainability-specific perturbations and theoretically prove that this directly undermines global and local explanation quality. To address this, we introduce ReCalX, a novel approach to recalibrate models for improved explanations while preserving their original predictions. |
Thomas Decker; Volker Tresp; Florian Buettner; |
| 466 | ConTextTab: A Semantics-Aware Tabular In-Context Learner Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the other end of the spectrum, tabular ICL models based on pretrained large language models such as TabuLa-8B integrate deep semantic understanding and world knowledge but are only able to make use of a small amount of context due to inherent architectural limitations. Aiming to combine the best of both worlds, we introduce ConTextTab, integrating semantic understanding and alignment into a table-native ICL framework. |
Marco Spinaci; Marek Polewczyk; Maximilian Schambach; Sam Thelin; |
| 467 | Some Optimizers Are More Equal: Understanding The Role of Optimizers in Group Fairness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through stochastic differential equation analysis of optimization dynamics in an analytically tractable setup, we demonstrate that the choice of optimization algorithm indeed influences fairness outcomes, particularly under severe imbalance. |
Mojtaba Kolahdouzi; Hatice Gunes; Ali Etemad; |
| 468 | Less Is More: Improving LLM Alignment Via Preference Data Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To further mitigate the noise in different reward models, we propose a Bayesian Aggregation approach that unifies multiple margin sources (external and implicit) into a single preference probability. |
Xun Deng; Han Zhong; Rui Ai; Fuli Feng; Zheng Wang; Xiangnan He; |
| 469 | Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we conduct a series of empirical analyses which suggest that the combination of non-stationarity with gradient pathologies, due to suboptimal architectural choices, underlie the challenges of scale. |
Roger Creus Castanyer; Johan Obando-Ceron; Lu Li; Pierre-Luc Bacon; Glen Berseth; Aaron Courville; Pablo Samuel Castro; |
| 470 | SpecMER: Fast Protein Generation with K-mer Guided Speculative Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SpecMER (Speculative Decoding via k-mer Guidance), a novel framework that incorporates biological, structural, and functional priors using k-mer motifs extracted from multiple sequence alignments. |
Thomas Walton; Darin Tsui; Aryan Musharaf; Amirali Aghazadeh; |
| 471 | SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce SpecEdge, an edge-assisted inference framework that splits LLM workloads between edge and server GPUs using a speculative decoding scheme, exchanging only token outputs over the network. |
Jinwoo Park; Seunggeun Cho; Dongsu Han; |
| 472 | DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce DEXTER, a data-free framework that employs diffusion models and large language models to generate global, textual explanations of visual classifiers. |
Simone Carnemolla; Matteo Pennisi; Sarinda Samarasinghe; Giovanni Bellitto; Simone Palazzo; Daniela Giordano; Mubarak Shah; Concetto Spampinato; |
| 473 | What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, for higher order Markov sources, the best known constructions require at least three layers (each with a single attention head) – leaving open the question: *can a two-layer single-head transformer represent any $k^{\text{th}}$-order Markov process?* In this paper, we precisely address this and theoretically show that a two-layer transformer with one head per layer can indeed represent any conditional $k$-gram. |
Chanakya Ekbote; Ashok Vardhan Makkuva; Marco Bondaschi; Nived Rajaraman; Michael Gastpar; Jason D. Lee; Paul Pu Liang; |
| 474 | Sample-Adaptivity Tradeoff in On-Demand Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: For the general agnostic case, we present an algorithm that achieves near-optimal sample complexity of $\widetilde O((d + k) / \epsilon^2)$ within $\widetilde O(\sqrt{k})$ rounds. |
Nika Haghtalab; Omar Montasser; Mingda Qiao; |
| 475 | Online Strategic Classification With Noise and Partial Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study an online strategic classification problem, where a principal aims to learn an accurate binary linear classifier from sequentially arriving agents. |
Tianrun Zhao; Xiaojie Mao; Yong Liang; |
| 476 | Partition-Then-Adapt: Combating Prediction Bias for Reliable Multi-Modal Test-Time Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they often prove ineffective when multiple modalities simultaneously undergo domain shifts, as they struggle to identify and utilize reliable samples within testing batches amid severe prediction bias. To address this problem, we propose Partition-Then-Adapt (PTA), a novel approach combating prediction bias for TTA with multi-modal domain shifts. |
Guowei Wang; Fan Lyu; Changxing Ding; |
| 477 | Strategic Hypothesis Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building on prior work, we develop a game-theoretic model that captures how the agent’s participation and reporting behavior respond to the principal’s statistical decision rule. |
Yatong Chen; Safwan Hossain; Yiling Chen; |
| 478 | Accelerating Optimization Via Differentiable Stopping Time Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, the complementary objective of minimizing the time to reach a target loss is traditionally considered non-differentiable. To address this limitation, we propose a differentiable discrete stopping time and theoretically justify it based on its connection to continuous differential equations. |
Zhonglin Xie; Yiman Fong; Haoran Yuan; Zaiwen Wen; |
| 479 | Tight Generalization Bounds for Large-Margin Halfspaces Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We prove the first generalization bound for large-margin halfspaces that is asymptotically tight in the tradeoff between the margin, the fraction of training points with the given margin, the failure probability and the number of training points. |
Kasper Green Larsen; Natascha Schalburg; |
| 480 | Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose WeSCon, the first self-training framework that enables word-level control of both emotion and speaking rate in a pretrained zero-shot TTS model, without relying on datasets containing intra-sentence emotion or speed transitions. |
Tianrui Wang; Haoyu Wang; Meng Ge; Cheng Gong; Chunyu Qiang; Ziyang Ma; Zikang Huang; Guanrou Yang; Xiaobao Wang; EngSiong Chng; Xie Chen; Longbiao Wang; Jianwu Dang; |
| 481 | UMoE: Unifying Attention and FFN with Shared Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we aim to unify MoE designs in attention and FFN layers by introducing a novel reformulation of the attention mechanism, that reveals an underlying FFN-like structure within attention modules. |
Yuanhang Yang; Chaozheng Wang; Jing Li; |
| 482 | RF-Agent: Automated Reward Function Design Via Language Agent Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, they suffer from poor utilization of historical feedback and inefficient search, resulting in limited improvements in complex control tasks. To address this challenge, we propose RF-Agent, a framework that treats LLMs as language agents and frames reward function design as a sequential decision-making process, enhancing optimization through better contextual reasoning. |
Ning Gao; Xiuhui Zhang; Xingyu Jiang; Mukang You; Mohan Zhang; Yue Deng; |
| 483 | GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify the root cause as LoRA’s structural bottleneck, which introduces gradient entanglement across unrelated input channels and distorts gradient propagation. To address this, we introduce a novel structure, Granular Low-Rank Adaptation (GraLoRA), that partitions weight matrices into sub-blocks, each with its own low-rank adapter. |
Yeonjoon Jung; Daehyun Ahn; Hyungjun Kim; Taesu Kim; Eunhyeok Park; |
| 484 | Quantization-Free Autoregressive Action Transformer Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We propose a quantization-free method instead that leverages Generative Infinite-Vocabulary Transformers (GIVT) as a direct, continuous policy parametrization for autoregressive transformers. |
Ziyad Sheebaelhamd; Michael Tschannen; Michael Muehlebach; Claire Vernade; |
| 485 | LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven By Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the analytical and reasoning capabilities of large language models (LLMs), we design **LLM-Explorer** to adaptively generate task-specific exploration strategies with LLMs, enhancing policy exploration in RL. |
Qianyue Hao; Yiwen Song; Qingmin Liao; Jian Yuan; Yong Li; |
| 486 | BayeSQP: Bayesian Optimization Through Sequential Quadratic Programming Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce BayeSQP, a novel algorithm for general black-box optimization that merges the structure of sequential quadratic programming with concepts from Bayesian optimization. |
Paul Brunzema; Sebastian Trimpe; |
| 487 | Practical Do-Shapley Explanations with Estimand-Agnostic Causal Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In response, do-SHAP employs interventional queries, but its reliance on estimands hinders its practical application. To address this problem, we propose the use of estimand-agnostic approaches, which allow for the estimation of any identifiable query from a single model, making do-SHAP feasible on complex graphs. |
Álvaro Parafita; Tomas Garriga; Axel Brando; Francisco J. Cazorla; |
| 488 | AlphaZero Neural Scaling and Zipf’s Law: A Tale of Board Games and Power Laws Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper we examine power-law scaling in AlphaZero, a reinforcement learning algorithm, using a model of language-model scaling. |
Oren Neumann; Claudius Gros; |
| 489 | Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods, however, either provide weak notions of stationarity or require restrictive assumptions to guarantee the smoothness of hyper-objective functions. In this paper, we eliminate these impractical assumptions and show that strong (Clarke) hyper-stationarity remains computable even when the hyper-objective is nonsmooth. |
He Chen; Jiajin Li; Anthony Man-Cho So; |
| 490 | The Computational Advantage of Depth in Learning High-Dimensional Hierarchical Targets Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a class of target functions (single and multi-index Gaussian hierarchical targets) that incorporate a hierarchy of latent subspace dimensionalities. |
Yatin Dandi; Luca Pesce; Lenka Zdeborova; Florent Krzakala; |
| 491 | Thoughts Are All Over The Place: On The Underthinking of Long Reasoning Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel metric to quantify underthinking by measuring token efficiency in incorrect answers. |
Yue Wang; Qiuzhi Liu; Jiahao Xu; Tian Liang; Xingyu Chen; Zhiwei He; Linfeng Song; Dian Yu; Juntao Li; Zhuosheng Zhang; Rui Wang; Zhaopeng Tu; Haitao Mi; Dong Yu; |
| 492 | OPTFM: A Scalable Multi-View Graph Transformer for Hierarchical Pre-Training in Combinatorial Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Optimization problems, often modeled as graphs, pose unique challenges due to their diverse structures, varying distributions, and NP-hard complexity. To address these challenges, we propose OPTFM, the first graph foundation model for general combinatorial optimization. |
Hao Yuan; Wenli Ouyang; Changwen Zhang; Congrui Li; Yong Sun; |
| 493 | ESCA: Contextualizing Embodied Agents Via Scene-Graph Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing MLLMs do not reliably capture fine-grained links between low-level visual features and high-level textual semantics, leading to weak grounding and inaccurate perception. To overcome this challenge, we propose ESCA, a framework that contextualizes embodied agents by grounding their perception in spatial-temporal scene graphs. |
Jiani Huang; Amish Sethi; Matthew Kuo; Mayank Keoliya; Neelay Velingker; JungHo Jung; Ser-Nam Lim; Ziyang Li; Mayur Naik; |
| 494 | Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present regret minimization algorithms for the contextual multi-armed bandit (CMAB) problem over $K$ actions in the presence of delayed feedback, a scenario where loss observations arrive with delays chosen by an adversary. |
Orin Levy; Liad Erez; Alon Cohen; Yishay Mansour; |
| 495 | Towards Multi-Table Learning: A Novel Paradigm for Complementarity Quantification and Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Specifically, we introduce a metric called complementarity strength (CS), which captures inter-table complementarity by incorporating relevance, similarity, and informativeness. |
Junyu Zhang; Lizhong Ding; Minghong Zhang; Ye Yuan; Xingcan Li; Pengqi Li; Tihang Xi; Guoren Wang; Changsheng Li; |
| 496 | QFFT, Question-Free Fine-Tuning for Adaptive Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To enable models to leverage both patterns, we propose Question-Free Fine-Tuning (QFFT), a fine-tuning approach that removes the input question during training and learns exclusively from Long CoT responses. |
Wanlong Liu; Junxiao Xu; Fei Yu; Yukang Lin; Ke Ji; Wenyu Chen; Lifeng Shang; Yasheng Wang; Yan Xu; Benyou Wang; |
| 497 | Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a simple yet effective solution: _**Option-aware Temporally Abstracted**_ value learning, dubbed **OTA**, which incorporates temporal abstraction into the temporal-difference learning process. |
Hongjoon Ahn; Heewoong Choi; Jisu Han; Taesup Moon; |
| 498 | Functional Scaling Laws in Kernel Regression: Loss Dynamics and Learning Rate Schedules Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing studies predominantly focus on the final-step loss, leaving open whether the entire $\textit{loss dynamics}$ obey similar laws and, crucially, how the $\textit{learning rate schedule}$ (LRS) shapes them. We address these gaps in a controlled theoretical setting by analyzing stochastic gradient descent (SGD) on a power-law kernel regression model. |
Binghui Li; Fengling Chen; Zixun Huang; Lean Wang; Lei Wu; |
| 499 | GSRF: Complex-Valued 3D Gaussian Splatting for Efficient Radio-Frequency Data Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce GSRF, a framework that extends 3D Gaussian Splatting (3DGS) from the optical domain to the RF domain, enabling efficient RF data synthesis. |
Kang Yang; Gaofeng Dong; Sijie Ji; Wan Du; Mani Srivastava; |
| 500 | Sharp Gaussian Approximations for Decentralized Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present two generalized Gaussian approximation results for local SGD and explore their implications. |
Soham Bonnerjee; Sayar Karmakar; Wei Biao Wu; |
| 501 | DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: One key challenge is the sparse reward, which introduces more training variance in policy optimization and makes it difficult to obtain a good estimate of the value function in Actor-Critic (AC) methods. To address these issues, we introduce Direct Advantage-Based Policy Optimization (DAPO), a novel step-level offline RL algorithm with theoretical guarantees for enhancing the reasoning abilities of LLMs. |
Jiacai Liu; Chaojie Wang; Chris Yuhao Liu; Liang Zeng; Rui Yan; Yiwen Sun; Yang Liu; |
| 502 | MGUP: A Momentum-Gradient Alignment Update Policy for Stochastic Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although intra-layer selective updates have been explored, a general mechanism that enables fine-grained control while ensuring convergence guarantees is still lacking. To bridge this gap, we propose \textbf{MGUP}, a novel mechanism for selective updates. |
Da Chang; Ganzhao Yuan; |
| 503 | Unveiling The Power of Multiple Gossip Steps: A Stability-Based Generalization Analysis in Decentralized Training Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, the theoretical reasons for its effectiveness and whether this gap can be fully eliminated by MGS remain open questions. In this paper, we derive upper bounds on the generalization error and excess error of MGS using stability analysis, systematically answering these two key questions. |
Qinglun Li; Yingqi Liu; Miao Zhang; Xiaochun Cao; Quanjun Yin; Li Shen; |
| 504 | Enhancing Contrastive Learning with Variable Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel method called Contrastive Learning with Variable Similarity (CLVS) to accurately characterize the intrinsic similarity relationships between different augmented views. |
Haowen Cui; Shuo Chen; Jun Li; Jian Yang; |
| 505 | Orient Anything V2: Unifying Orientation and Rotation Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work presents Orient Anything V2, an enhanced foundation model for unified understanding of object 3D orientation and rotation from single or paired images. |
Zehan Wang; Ziang Zhang; Jiayang Xu; Jialei Wang; Tianyu Pang; Chao Du; Hengshuang Zhao; Zhou Zhao; |
| 506 | A Closer Look at Graph Transformers: Cross-Aggregation and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Unfortunately, the underlying mechanism driving their effectiveness remains insufficiently understood. In this paper, we revisit these strategies and uncover a shared underlying mechanism—Cross Aggregation—that effectively captures the interaction between graph topology and node attributes. |
Jiaming Zhuo; Ziyi Ma; Yintong Lu; Yuwei Liu; Kun Fu; Di Jin; Chuan Wang; Wenning Wu; Zhen Wang; Xiaochun Cao; Liang Yang; |
| 507 | Latent Policy Barrier: Learning Robust Visuomotor Policies By Staying In-Distribution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Latent Policy Barrier, a framework for robust visuomotor policy learning. |
Zhanyi Sun; Shuran Song; |
| 508 | Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To isolate a minimal form of this transformation, we identify language model subnetworks that make bigram predictions, naive next token predictions based only on the current token. We find that bigram subnetworks can be found in fully trained language models up to 1B parameters, and these subnetworks are critical for model performance even when they consist of less than 0.2% of model parameters. |
Tyler A. Chang; Ben Bergen; |
| 509 | Minimax-Optimal Univariate Function Selection in Sparse Additive Models: Rates, Adaptation, and The Estimation-Selection Gap Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper focuses on the variable selection, i.e., the univariate function selection problem in SpAM. |
Shixiang Liu; |
| 510 | Continuous Thought Machines Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: By incorporating neuron-level processing and synchronization, we reintroduce neural timing as a foundational element. We present the Continuous Thought Machine (CTM), a model designed to leverage neural dynamics as its core representation. |
Luke Nicholas Darlow; Ciaran Regan; Sebastian Risi; Jeffrey Seely; Llion Jones; |
| 511 | Virus Infection Attack on LLMs: Your Poisoning Can Spread VIA Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. |
Zi Liang; Qingqing Ye; Xuan Liu; Yanyun Wang; Jianliang Xu; Haibo Hu; |
| 512 | Robust Neural Rendering in The Wild with Asymmetric Dual 3D Gaussian Splatting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing methods typically rely on heuristic strategies to handle the low-quality training data, which often struggle to produce stable and consistent reconstructions, frequently resulting in visual artifacts. In this work, we propose Asymmetric Dual 3DGS, a novel framework that leverages the stochastic nature of these artifacts: they tend to vary across different training runs due to minor randomness. |
Chengqi Li; Zhihao Shi; Yangdi Lu; Wenbo He; Xiangyu Xu; |
| 513 | Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present a neural network structure, FramePack, to train next-frame (or next-frame-section) prediction models for video generation. |
Lvmin Zhang; Shengqu Cai; Muyang Li; Gordon Wetzstein; Maneesh Agrawala; |
| 514 | CoLT: The Conditional Localization Test for Assessing The Accuracy of Neural Posterior Estimates Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As an alternative, we introduce the *Conditional Localization Test* (**CoLT**), a principled method designed to detect discrepancies between $p(\theta \mid x)$ and $q(\theta \mid x)$ across the full range of conditioning inputs. |
Tianyu Chen; Vansh Bansal; James G. Scott; |
| 515 | Predictive Preference Learning from Human Interventions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although most interactive imitation learning methods focus on correcting the agent’s action at the current state, they do not adjust its actions in future states, which may be more hazardous. To address this, we introduce Predictive Preference Learning from Human Interventions (PPL), which leverages the implicit preference signals contained in human interventions to inform predictions of future rollouts. |
Haoyuan Cai; Zhenghao Peng; Bolei Zhou; |
| 516 | Transformers for Mixed-type Event Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a simple yet powerful Marked Temporal Point Process (MTPP) framework for modeling event sequences with flexible structure, using a single unified model. |
Felix Draxler; Yang Meng; Kai Nelson; Lukas Laskowski; Yibo Yang; Theofanis Karaletsos; Stephan Mandt; |
| 517 | ROGR: Relightable 3D Objects Using Generative Relighting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce ROGR, a novel approach that reconstructs a relightable 3D model of an object captured from multiple views, driven by a generative relighting model that simulates the effects of placing the object under novel environment illuminations. |
Jiapeng Tang; Matthew Jacob Levine; Dor Verbin; Stephan J. Garbin; Matthias Nießner; Ricardo Martin Brualla; Pratul P. Srinivasan; Philipp Henzler; |
| 518 | Hamiltonian Descent Algorithms for Optimization: Accelerated Rates Via Randomized Integration Time Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We study the Hamiltonian flow for optimization (HF-opt), which simulates the Hamiltonian dynamics for some integration time and resets the velocity to $0$ to decrease the objective function; this is the optimization analogue of the Hamiltonian Monte Carlo algorithm for sampling. |
Qiang Fu; Andre Wibisono; |
| 519 | Cost-aware LLM-based Online Dataset Annotation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a novel online framework, Cost-aware Majority Voting (CaMVo), for efficient and accurate LLM-based dataset annotation. |
Eray Can Elumar; Cem Tekin; Osman Yagan; |
| 520 | Scaling Unlocks Broader Generation and Deeper Functional Understanding of Proteins Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce the ProGen3 family of sparse generative PLMs, and we develop compute-optimal scaling laws to scale up to a 46B-parameter model pre-trained on 1.5T amino acid tokens. |
Aadyot Bhatnagar; Sarthak Jain; Joel Beazer; Samuel C. Curran; Alexander M. Hoffnagle; Kyle Shan Ching; Michael Martyn; Stephen Nayfach; Jeffrey A. Ruffolo; Ali Madani; |
| 521 | STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present STARFlow, a scalable generative model based on normalizing flows that achieves strong performance on high-resolution image synthesis. Building on this foundation, we introduce a set of architectural and algorithmic innovations that significantly enhance the scalability: (1) a deep-shallow design where a deep Transformer block captures most of the model’s capacity, followed by a few shallow Transformer blocks that are computationally cheap yet contribute non-negligibly, (2) learning in the latent space of pretrained autoencoders, which proves far more effective than modeling pixels directly, and (3) a novel guidance algorithm that substantially improves sample quality. |
Jiatao Gu; Tianrong Chen; David Berthelot; Huangjie Zheng; Yuyang Wang; Ruixiang Zhang; Laurent Dinh; Miguel Ángel Bautista; Joshua M. Susskind; Shuangfei Zhai; |
| 522 | Flow Equivariant Recurrent Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To date, however, equivariance has been considered only for static transformations and feed-forward networks, limiting its applicability to sequence models, such as recurrent neural networks (RNNs), and corresponding time-parameterized sequence transformations. In this work, we extend equivariant network theory to this regime of ‘flows’ — one-parameter Lie subgroups capturing natural transformations over time, such as visual motion. |
T. Anderson Keller; |
| 523 | AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing prompting approaches usually adopt general-purpose, fixed configurations that work “well enough” across tasks but seldom achieve task-specific optimality. To address this gap, we introduce AdaReasoner, an LLM-agnostic plugin designed for any LLM to automate adaptive reasoning configurations for tasks requiring different types of thinking. |
Xiangqi Wang; Yue Huang; Yanbo Wang; Xiaonan Luo; Kehan Guo; Yujun Zhou; Xiangliang Zhang; |
| 524 | LABridge: Text–Image Latent Alignment Framework Via Mean-Conditioned OU Process Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LABridge, a novel Text–Image Latent Alignment Framework based on a mean-conditioned Ornstein–Uhlenbeck (OU) process, which explicitly preserves and aligns textual and visual semantics in a shared latent space. |
Huiyang Shao; Xin Xia; Yuxi Ren; Xing Wang; Xuefeng Xiao; |
| 525 | Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits Through Gaussian Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a novel Stochastic-Programming-based (SP-based) policy that, under a uniqueness assumption, achieves an $\tilde{\mathcal{O}}(1/N)$ optimality gap for degenerate RMABs. |
Chen Yan; Weina Wang; Lei Ying; |
| 526 | On The Surprising Effectiveness of Large Learning Rates Under Standard Width Scaling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we show that this discrepancy is not fully explained by finite-width phenomena. Instead, we find a resolution through a finer-grained analysis of the regime previously considered unstable and therefore uninteresting. |
Moritz Haas; Sebastian Bordt; Ulrike von Luxburg; Leena Chennuru Vankadara; |
| 527 | Flash Invariant Point Attention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce FlashIPA, a factorized reformulation of IPA that leverages hardware-efficient FlashAttention to achieve linear scaling in GPU memory and wall-clock time with sequence length. |
Andrew Liu; Axel Elaldi; Nicholas T Franklin; Nathan Russell; Gurinder S. Atwal; Yih-En Andrew Ban; Olivia Viessmann; |
| 528 | SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose SmallKV, a small model assisted compensation method for KV cache compression. |
Yi Zhao; Yajuan Peng; Nguyen Cam-Tu; Zuchao Li; Wang Xiaoliang; Hai Zhao; Xiaoming Fu; |
| 529 | CALM-PDE: Continuous and Adaptive Convolutions for Latent Space Modeling of Time-dependent PDEs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In contrast, convolutional neural networks allow memory-efficient encoding and decoding but are limited to regular discretizations. Motivated by these considerations, we propose CALM-PDE, a model class that efficiently solves arbitrarily discretized PDEs in a compressed latent space. |
Jan Hagnberger; Daniel Musekamp; Mathias Niepert; |
| 530 | On The Empirical Power of Goodness-of-Fit Tests in Watermark Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we systematically evaluate eight GoF tests across three popular watermarking schemes, using three open-source LLMs, two datasets, various generation temperatures, and multiple post-editing methods. |
Weiqing He; Xiang Li; Tianqi Shang; Li Shen; Weijie J Su; Qi Long; |
| 531 | Balancing Multimodal Training Through Game-Theoretic Regularization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes the Multimodal Competition Regularizer (MCR), inspired by a mutual information (MI) decomposition designed to prevent the adverse effects of competition in multimodal training. |
Konstantinos Kontras; Thomas Strypsteen; Christos Chatzichristos; Paul Pu Liang; Matthew B. Blaschko; Maarten De Vos; |
| 532 | Bipolar Self-attention for Spiking Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Second, SSA typically omits Softmax functions to avoid energy-intensive multiply-accumulate operations, thereby failing to maintain row-stochasticity constraints on attention scores. To address these issues, we propose a Bipolar Self-attention (BSA) paradigm, effectively modeling multi-polar membrane potential interactions with a fully spike-driven characteristic. |
Shuai Wang; Malu Zhang; Jingya Wang; Dehao Zhang; Yimeng Shan; Jieyuan Zhang; Yichen Xiao; Honglin Cao; Haonan Zhang; Zeyu Ma; Yang Yang; Haizhou Li; |
| 533 | Online Functional Tensor Decomposition Via Continual Learning for Streaming Data Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The main aim of this work is to propose a novel online functional tensor decomposition (OFTD) framework, which represents a spatial-temporal continuous function using the CP tensor decomposition parameterized by coordinate-based implicit neural representations (INRs). |
Xi Zhang; Yanyi Li; Yisi Luo; Qi Xie; Deyu Meng; |
| 534 | TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: First, travel purposes are tied to the functions of the roads and points-of-interest (POIs) involved in a trip. Such information is encoded in textual addresses and descriptions, which introduces a heavy computational burden to modeling. Second, real-world trajectories often contain redundant points, which harm both computational efficiency and trajectory embedding quality. To address these challenges, we propose TrajMamba, a novel approach for efficient and semantically rich vehicle trajectory learning. |
Yichen Liu; Yan Lin; Shengnan Guo; Zeyu Zhou; Youfang Lin; Huaiyu Wan; |
| 535 | Tensor Product Attention Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Tensor Product Attention (TPA), a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values compactly, substantially shrinking the KV cache size at inference time. |
Yifan Zhang; Yifeng Liu; Huizhuo Yuan; Zhen Qin; Yang Yuan; Quanquan Gu; Andrew C Yao; |
| 536 | TreeSynth: Synthesizing Diverse Data from Scratch Via Tree-Guided Subspace Partitioning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the great potential of large language models (LLMs) for data synthesis, current approaches are constrained by limited seed data, model biases and low-variation prompts, resulting in limited diversity and biased distribution with the increase of data scales. To tackle this challenge, we introduce TreeSynth, a tree-guided subspace-based data synthesis approach inspired by decision trees. |
Sheng Wang; Pengan Chen; Jingqi Zhou; Qintong Li; Jingwei Dong; Jiahui Gao; Boyang Xue; Jiyue Jiang; Lingpeng Kong; Chuan Wu; |
| 537 | DeCaFlow: A Deconfounding Causal Generative Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce DeCaFlow, a deconfounding causal generative model. |
Alejandro Almodóvar; Adrián Javaloy; Juan Parras; Santiago Zazo; Isabel Valera; |
| 538 | Transfer Learning for Benign Overfitting in High-Dimensional Linear Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite their individual importance, the intersection of transfer learning and MNI remains largely unexplored. Our research bridges this gap by proposing a novel two-step Transfer MNI approach and analyzing its trade-offs. |
Yeichan Kim; Ilmun Kim; Seyoung Park; |
| 539 | LLM Meeting Decision Trees on Tabular Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work explores a novel direction of integrating LLMs into tabular data through logical decision tree rules as intermediaries, proposing a decision tree enhancer with LLM-derived rule for tabular prediction, DeLTa. |
Hangting Ye; Jinmeng Li; He Zhao; Dandan Guo; Yi Chang; |
| 540 | AgentBreeder: Mitigating The AI Safety Risks of Multi-Agent Scaffolds Via Self-Improvement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce AgentBreeder, a framework for multi-objective self-improving evolutionary search over scaffolds. |
J Rosser; Jakob Nicolaus Foerster; |
| 541 | Learnable Burst-Encodable Time-of-Flight Imaging for High-Fidelity Long-Distance Depth Sensing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a novel ToF imaging paradigm, termed Burst-Encodable Time-of-Flight (BE-ToF), which facilitates high-fidelity, long-distance depth imaging. |
Manchao Bao; Shengjiang Fang; Tao Yue; Xuemei Hu; |
| 542 | MigGPT: Harnessing Large Language Models for Automated Migration of Out-of-Tree Linux Kernel Patches Across Versions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose MigGPT, a framework that employs a novel code fingerprint structure to retain code snippet information and incorporates three meticulously designed modules to improve the migration accuracy and efficiency of out-of-tree kernel patches. |
Pucheng Dang; Di Huang; Dong Li; Kang Chen; Yuanbo Wen; Qi Guo; Xing Hu; |
| 543 | FP4 All The Way: Fully Quantized Training of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Additionally, we identify a theoretical and empirical threshold for effective quantized training: when the gradient norm falls below approximately $\sqrt{3}$ times the quantization noise, quantized training becomes less effective. Leveraging these insights, we successfully train a 7-billion-parameter model on 256 Intel Gaudi2 accelerators. |
Brian Chmiel; Maxim Fishman; Ron Banner; Daniel Soudry; |
| 544 | Efficient Knowledge Transfer in Federated Recommendation for Joint Venture Ecosystem Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we deeply dive into new but practical FedRS applications within the joint venture ecosystem. |
Yichen Li; Yijing Shan; Yi Liu; Haozhao Wang; Cheng Wang; wangshi.ww; Yi Wang; Ruixuan Li; |
| 545 | VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we aim to enhance the slow-thinking capabilities of vision-language models using reinforcement learning (without relying on distillation) to advance the state of the art. |
Haozhe Wang; Chao Qu; Zuming Huang; Wei Chu; Fangzhen Lin; Wenhu Chen; |
| 546 | Spectral Graph Neural Networks Are Incomplete on Graphs with A Simple Spectrum Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we leverage a well-studied paradigm of classifying graphs by their largest eigenvalue multiplicity to introduce an expressivity hierarchy for SGNNs. |
Snir Hordan; Maya Bechler-Speicher; Gur Lifshitz; Nadav Dym; |
| 547 | ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this gap, we propose ShapeLLM-Omni—a native 3D large language model capable of understanding and generating 3D assets and text in any sequence. Building upon the 3D-aware discrete tokens, we innovatively construct a large-scale continuous training dataset named 3D-Alpaca, encompassing generation, comprehension, and editing, thus providing rich resources for future research and training. |
Junliang Ye; Zhengyi Wang; Ruowen Zhao; Shenghao Xie; Jun Zhu; |
| 548 | The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we revisit the latter setting, and formally establish a phenomenon entirely undetected by prior work on the implicit bias of SSMs. |
Yonatan Slutzky; Yotam Alexander; Noam Razin; Nadav Cohen; |
| 549 | Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Autoregressive Large Language Models (AR-LLMs) frequently exhibit implicit parallelism in sequential generation. Inspired by this, we introduce Multiverse, a new generative model enabling natively parallel generation. |
Xinyu Yang; Yuwei An; Hongyi Liu; Tianqi Chen; Beidi Chen; |
| 550 | BevSplat: Resolving Height Ambiguity Via Feature-Based Gaussian Primitives for Weakly-Supervised Cross-View Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose BevSplat, a novel method that resolves height ambiguity by using feature-based Gaussian primitives. |
Qiwei Wang; Wu Shaoxun; Yujiao Shi; |
| 551 | Conflict-Aware Knowledge Editing in The Wild: Semantic-Augmented Graph Representation for Unstructured Text Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing structured knowledge editing approaches face significant challenges when handling the entangled and intricate knowledge present in unstructured text, resulting in issues such as representation ambiguity and editing conflicts. To address these challenges, we propose a Conflict-Aware Knowledge Editing in the Wild (CAKE) framework, the first framework explicitly designed for editing knowledge extracted from wild unstructured text. |
Zhange Zhang; Zhicheng Geng; Yuqing Ma; Tianbo Wang; Kai Lv; Xianglong Liu; |
| 552 | Vision Transformers with Self-Distilled Registers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Given the availability of existing large-scale pre-trained ViTs, in this paper we seek to add register tokens to existing models without re-training from scratch, which is infeasible given their size. |
Zipeng Yan; Yinjie Chen; Chong Zhou; Bo Dai; Andrew Luo; |
| 553 | EraseFlow: Learning Concept Erasure Policies Via GFlowNet-Driven Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce EraseFlow, the first framework that casts concept unlearning as exploration in the space of denoising paths and optimizes it with GFlowNets equipped with the trajectory‑balance objective. |
Naga Sai Abhiram kusumba; Maitreya Patel; Kyle Min; Changhoon Kim; Chitta Baral; Yezhou Yang; |
| 554 | Boundary-Value PDEs Meet Higher-Order Differential Topology-aware GNNs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a higher-order GNN framework that incorporates higher-order interactions based on discrete and finite element exterior calculus. |
Yunfeng Liao; Yangxin Wu; Xiucheng Li; |
| 555 | 4DGT: Learning A 4D Gaussian Transformer Using Real-World Monocular Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose 4DGT, a 4D Gaussian-based Transformer model for dynamic scene reconstruction, trained entirely on real-world monocular posed videos. |
Zhen Xu; Zhengqin Li; Zhao Dong; Xiaowei Zhou; Richard Newcombe; Zhaoyang Lv; |
| 556 | CURE: Co-Evolving Coders and Unit Testers Via Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we turn our focus to the coding domain. |
Yinjie Wang; Ling Yang; Ye Tian; Ke Shen; Mengdi Wang; |
| 557 | How Many Measurements Are Enough? Bayesian Recovery in Inverse Problems with General Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We study the sample complexity of Bayesian recovery for solving inverse problems with general prior, forward operator and noise distributions. |
Ben Adcock; Nick Huang; |
| 558 | Diffusion Generative Modeling on Lie Group Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel class of score-based diffusion processes that operate directly in the representation space of Lie groups. |
Marco Bertolini; Tuan Le; Djork-Arné Clevert; |
| 559 | EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, 2D Gaussian Splatting (2DGS) enforces multi-view consistency but compromises texture details. To address these limitations, we propose Exchangeable Gaussian Splatting (EGGS), a hybrid representation that integrates 2D and 3D Gaussians to balance appearance and geometry. |
Yancheng Zhang; Guangyu Sun; Chen Chen; |
| 560 | CURE: Concept Unlearning Via Orthogonal Representation Editing in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce CURE, a training-free concept unlearning framework that operates directly in the weight space of pre-trained diffusion models, enabling fast, interpretable, and highly specific suppression of undesired concepts. |
Shristi Das Biswas; Arani Roy; Kaushik Roy; |
| 561 | Projection-based Lyapunov Method for Fully Heterogeneous Weakly-coupled MDPs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the _fully heterogeneous_ setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). |
XiangCheng Zhang; Yige Hong; Weina Wang; |
| 562 | Low-degree Evidence for Computational Transition of Recovery Rate in Stochastic Block Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate implications of the (extended) low-degree conjecture (recently formalized in [Moitra et al., 2023]) in the context of the symmetric stochastic block model. |
Jingqiu Ding; Yiding Hua; Lucas Slot; David Steurer; |
| 563 | Nonlinear Laplacians: Tunable Principal Component Analysis Under Directional Prior Information Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a new family of algorithms for detecting and estimating a rank-one signal from a noisy observation under prior information about that signal’s direction, focusing on examples where the signal is known to have entries biased to be positive. |
Yuxin Ma; Dmitriy Kunisky; |
| 564 | Chain-of-Zoom: Extreme Super-Resolution Via Scale Autoregression and Preference Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Modern single-image super-resolution (SISR) models deliver photo-realistic results at the scale factors on which they are trained, but collapse when asked to magnify far beyond that regime. We address this scalability bottleneck with Chain-of-Zoom (CoZ), a model-agnostic framework that factorizes SISR into an autoregressive chain of intermediate scale-states with multi-scale-aware prompts. |
Bryan Sangwoo Kim; Jeongsol Kim; Jong Chul Ye; |
| 565 | Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Building upon Utility Theory and leveraging the textual-reasoning capabilities of Large Language Models (LLMs), this paper proposes an Adaptive Textual-symbolic Human-centric Reasoning framework (ATHENA) to address optimal information integration. |
Yibo Zhao; Yang Zhao; Hongru Du; Hao Frank Yang; |
| 566 | What Are You Sinking? A Geometric Approach on Attention Sink Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We analyze several architectures and identify three distinct reference frame types (centralized, distributed, and bidirectional) that correlate with the attention sink phenomenon. |
Valeria Ruscio; Umberto Nanni; Fabrizio Silvestri; |
| 567 | KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper presents KARMA, a novel framework employing multi-agent large language models (LLMs) to automate KG enrichment through structured analysis of unstructured text. |
Yuxing Lu; Wei Wu; Xukai Zhao; Rui Peng; Jinzhuo Wang; |
| 568 | MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While large language models (LLMs) excel in semantic understanding tasks, they struggle with the ambiguity and contextual nuance inherent in human communication. To bridge this gap, we introduce **MetaMind**, a multi-agent framework inspired by psychological theories of metacognition, designed to emulate human-like social reasoning. |
Xuanming Zhang; Yuxuan Chen; Samuel Yeh; Sharon Li; |
| 569 | Enhancing CLIP Robustness Via Cross-Modality Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This misalignment is significantly amplified under adversarial perturbations, leading to severe degradation in classification performance. To address this problem, we propose **C**r**O**ss-moda**L**ity **A**lignment, dubbed **COLA**, an optimal transport-based framework that explicitly addresses adversarial misalignment by restoring both global image-text alignment and local structural consistency in the feature space. |
Xingyu Zhu; Beier Zhu; Shuo Wang; Kesen Zhao; Hanwang Zhang; |
| 570 | Rethinking Entropy in Test-Time Adaptation: The Missing Piece from Energy Duality Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Importantly, we reveal that entropy minimization alone neither ensures energy reduction nor supports reliable likelihood estimation, and it requires explicit discriminative guidance to reach zero entropy. To combat these problems, we propose a twofold solution. |
Mincheol Park; Heeji Won; Won Woo Ro; Suhyun Kim; |
| 571 | On The Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: A potential limitation of existing ZOO methods is the bias inherent in most gradient estimators unless the perturbation stepsize vanishes. In this paper, we overcome this biasedness issue by proposing a novel family of *unbiased* gradient estimators based solely on function evaluations. |
Shaocong Ma; Heng Huang; |
| 572 | Gradient Variance Reveals Failure Modes in Flow-Based Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Rectified Flows learn ODE vector fields whose trajectories are straight between source and target distributions, enabling near one-step inference. We show that this straight-path objective reveals fundamental failure modes: under deterministic training, low gradient variance drives memorization of arbitrary training pairings, even when interpolant lines between training pairs intersect. |
Teodora Reu; Sixtine Dromigny; Michael M. Bronstein; Francisco Vargas; |
| 573 | TREND: Unsupervised 3D Representation Learning Via Temporal Forecasting for LiDAR Perception Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Instead, we propose TREND, short for Temporal REndering with Neural fielD, to learn 3D representation via forecasting the future observation in an unsupervised manner. |
Runjian Chen; Hyoungseob Park; Bo Zhang; Wenqi Shao; Ping Luo; Alex Wong; |
| 574 | Boosting Generative Image Modeling Via Joint Image-Feature Synthesis Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges this gap by leveraging a diffusion model to jointly model low-level image latents (from a variational autoencoder) and high-level semantic features (from a pretrained self-supervised encoder like DINO). |
Theodoros Kouzelis; Efstathios Karypidis; Ioannis Kakogeorgiou; Spyros Gidaris; Nikos Komodakis; |
| 575 | A Token Is Worth Over 1,000 Tokens: Efficient Knowledge Distillation Through Low-Rank Clone Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Existing approaches often face three key challenges: (1) information loss from hard pruning, (2) inefficient alignment of representations, and (3) underutilization of informative activations, particularly from Feed-Forward Networks (FFNs). To address these challenges, we introduce **Low-Rank Clone (LRC)**, an efficient pre-training method that constructs SLMs aspiring to behavioral equivalence with strong teacher models. |
Jitai Hao; Qiang Huang; Hao Liu; Xinyan Xiao; Zhaochun Ren; Jun Yu; |
| 576 | SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Cross-view sparsity variations lead to encoding discrepancies, heightening sample-level semantic heterogeneity and making view-level dynamic weighting inappropriate. To tackle these challenges, we propose Adaptive Sparse Autoencoders for Multi-View Clustering (SparseMVC), a framework with three key modules. |
Ruimeng Liu; Xin Zou; Chang Tang; Xiao Zheng; Xingchen Hu; Kun Sun; Xinwang Liu; |
| 577 | Unbiased Prototype Consistency Learning for Multi-Modal and Multi-Task Object Re-Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address the practical requirements for unified retrieval, we introduce Multi-Modal and Multi-Task object ReID ($\rm {M^3T}$-ReID). |
Zhongao Zhou; Bin Yang; Wenke Huang; Jun Chen; Mang Ye; |
| 578 | MJ-Video: Benchmarking and Rewarding Video Generation with Fine-Grained Video Preference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing models still struggle with key challenges such as instruction misalignment, content hallucination, safety concerns, and generation bias. To address these limitations, we introduce MJ-BENCH-VIDEO, a large-scale video preference benchmark designed to evaluate video generation across five critical aspects: Alignment, Safety, Fineness, Coherence & Consistency, and Bias & Fairness. |
Haibo Tong; Zhaoyang Wang; Zhaorun Chen; Haonian Ji; Shi Qiu; Siwei Han; Kexin Geng; Zhongkai Xue; Yiyang Zhou; Peng Xia; Mingyu Ding; Rafael Rafailov; Chelsea Finn; Huaxiu Yao; |
| 579 | Fast Projection-Free Approach (without Optimization Oracle) for Optimization Over Compact Convex Set Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, extending these methods effectively to general compact convex sets remains challenging and largely open, as FW methods rely on expensive linear optimization oracles (LOO), while penalty-based methods often struggle with poor feasibility. We tackle this open challenge by presenting **Hom-PGD**, a novel projection-free method without expensive (optimization) oracles. |
Chenghao Liu; Enming Liang; Minghua Chen; |
| 580 | Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose Adaptive Branching Monte Carlo Tree Search (AB-MCTS), a novel inference-time framework that generalizes repeated sampling with principled multi-turn exploration and exploitation. |
Yuichi Inoue; Kou Misaki; Yuki Imajuku; So Kuroki; Taishi Nakamura; Takuya Akiba; |
| 581 | RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Relying solely on README files provides insufficient guidance, and deeper exploration reveals two core obstacles: overwhelming information and tangled dependencies of repositories, both constrained by the limited context windows of current LLMs. To tackle these issues, we propose RepoMaster, an autonomous agent framework designed to explore and reuse GitHub repositories for solving complex tasks. |
Huacan Wang; Ziyi Ni; Shuo Zhang; Shuo Lu; Sen Hu; Ziyang He; Chen Hu; Jiaye Lin; Yifu Guo; Yuntao Du; Pin Lyu; |
| 582 | Hierarchical Shortest-Path Graph Kernel Network Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing graph kernel methods rely on fixed graph similarity estimation that cannot be directly optimized for task-specific objectives, leading to sub-optimal performance. To address this limitation, we propose a kernel-based learning framework called Hierarchical Shortest-Path Graph Kernel Network (HSP-GKN), which seamlessly integrates graph similarity estimation with downstream tasks within a unified optimization framework. |
Jiaxin Wang; Wenxuan Tu; Jieren Cheng; |
| 583 | Differential Privacy on Fully Dynamic Streams Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we consider the fully dynamic setting where items may be inserted into or deleted from the dataset over time, and we need to continually release query answers at every time instance. |
Yuan Qiu; Ke Yi; |
| 584 | Shortcut Features As Top Eigenfunctions of NTK: A Linear Neural Network Case and More Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the framework of the Neural Tangent Kernel (NTK), we analyze the case of linear neural networks to derive several important properties of shortcut learning. |
Jinwoo Lim; Suhyun Kim; Soo-Mook Moon; |
| 585 | SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: These workloads result in long and highly predictable sequences, which current speculative decoding methods do not effectively exploit. To address this gap, we introduce \emph{SuffixDecoding}, a novel method that utilizes efficient suffix trees to cache long token sequences from prompts and previous outputs. |
Gabriele Oliaro; Zhihao Jia; Daniel F Campos; Aurick Qiao; |
| 586 | Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although applications involving long-context inputs are crucial for the effective utilization of large language models (LLMs), they also result in increased computational costs and reduced performance. To address this challenge, we propose an efficient, training-free prompt compression method that retains key information within compressed prompts. |
Weizhi Fei; Xueyan Niu; XIE GUOQING; Yingqing Liu; Bo Bai; Wei Han; |
| 587 | UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce a general-purpose approach that jointly estimates albedo and synthesizes relit outputs in a single pass, harnessing the generative capabilities of video diffusion models. |
Kai He; Ruofan Liang; Jacob Munkberg; Jon Hasselgren; Nandita Vijaykumar; Alexander Keller; Sanja Fidler; Igor Gilitschenski; Zan Gojcic; Zian Wang; |
| 588 | When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose a metric, constraint attention, to quantify model focus during generation and show that CoT reasoning often diverts attention away from instruction-relevant tokens. |
Xiaomin Li; Zhou Yu; Zhiwei Zhang; Xupeng Chen; Ziji Zhang; Yingying Zhuang; Narayanan Sadagopan; Anurag Beniwal; |
| 589 | On Agnostic PAC Learning in The Small Error Regime Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we resolve this question by exhibiting a learner which achieves error $c \cdot \tau + O\left(\sqrt{\frac{\tau (d + \log(1 / \delta))}{m}} + \frac{d + \log(1 / \delta)}{m}\right)$ for a constant $c \leq 2.1$, matching the lower bound and demonstrating optimality when $\tau = O(d/m)$. |
Julian Asilis; Mikael Møller Høgsgaard; Grigoris Velegkas; |
| 590 | Wavelet Canonical Coherence for Nonstationary Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Despite the growing interest in multivariate time series analysis, existing methods for between-cluster dependence typically rely on the assumption of stationarity and lack the temporal resolution to capture transient, frequency-specific interactions. To overcome this limitation, we propose scale-specific wavelet canonical coherence (WaveCanCoh), a novel framework that extends canonical coherence analysis to the nonstationary setting by leveraging the multivariate locally stationary wavelet model. |
Haibo Wu; Marina I. Knight; Keiland W. Cooper; Norbert J. Fortin; Hernando Ombao; |
| 591 | Language Modeling By Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by real research, we propose a multi-agent LLM approach that simulates the conventional stages of research, from ideation and literature search (proposal stage) to design implementation (code generation), generative pre-training, and downstream evaluation (verification). |
Junyan Cheng; Peter Clark; Kyle Richardson; |
| 592 | Plasticity As The Mirror of Empowerment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This former capacity, however, is equally foundational: In what ways, and to what extent, can an agent be influenced by what it observes? In this paper, we ground this concept in a universal agent-centric measure that we refer to as plasticity, and reveal a fundamental connection to empowerment. |
David Abel; Michael Bowling; Andre Barreto; Will Dabney; Shi Dong; Steven Stenberg Hansen; Anna Harutyunyan; Khimya Khetarpal; Clare Lyle; Razvan Pascanu; Georgios Piliouras; Doina Precup; Jonathan Richens; Mark Rowland; Tom Schaul; Satinder Singh; |
| 593 | PiKE: Adaptive Data Mixing for Large-Scale Multi-Task Learning Under Low Gradient Conflicts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While prior work in MTL has emphasized mitigating gradient conflicts, we observe that large-scale pretraining scenarios—such as multilingual or multi-domain training—often exhibit little to no gradient conflict. Motivated by this observation, we propose $\textbf{PiKE}$ ($\textbf{P}$ositive gradient $\textbf{i}$nteraction-based $\textbf{K}$-task weights $\textbf{E}$stimator), an adaptive data mixing algorithm that dynamically adjusts sampling weights during training. |
Zeman Li; Yuan Deng; Peilin Zhong; Meisam Razaviyayn; Vahab Mirrokni; |
| 594 | Language Models Can Self-Improve at State-Value Estimation for Better Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce Self-Taught Lookahead (STL), a reward-free framework that improves language model–based value functions by reasoning explicitly about state transitions. |
Ethan Mendes; Alan Ritter; |
| 595 | Scalable Cross-View Sample Alignment for Multi-View Clustering with View Structure Similarity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, in real-world scenarios, the collected data across different views is often unsynchronized, making it difficult to ensure consistent sample correspondence between views. To address this issue, we propose a scalable sample-alignment-based multi-view clustering method, referred to as SSA-MVC. |
Jun Wang; Zhenglai Li; Chang Tang; Suyuan Liu; Hao Yu; Chuan Tang; Miaomiao Li; Xinwang Liu; |
| 596 | Multidimensional Bayesian Utility Maximization: Tight Approximations to Welfare Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We initiate the study of multidimensional Bayesian utility maximization, focusing on the unit-demand setting where values are i.i.d. across both items and buyers. |
Kira Goldner; Taylor Lundy; |
| 597 | Decomposing Stimulus-specific Sensory Neural Information Via Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Finally, most proposed decompositions are computationally intractable for the high-dimensional stimuli and non-linear encoding models relevant for neuroscience. To resolve these limitations, we propose a set of axioms that any stimulus-specific and feature-specific information decomposition should satisfy in order to serve as a meaningful and interpretable measure of neural sensitivity. |
Steeve Laquitaine; Simone Azeglio; Carlo Paris; Ulisse Ferrari; Matthew Chalk; |
| 598 | Any-stepsize Gradient Descent for Separable Data Under Fenchel–Young Losses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To further understand what property of a loss function matters in GD, we aim to show arbitrary-stepsize GD convergence for a general loss function based on the framework of \emph{Fenchel–Young losses}. |
Han Bao; Shinsaku Sakaue; Yuki Takezawa; |
| 599 | Provably Efficient RL Under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper closes the gap by proposing an RL algorithm for linear CMDPs that achieves $\widetilde{\mathcal{O}}(\sqrt{K})$ regret with an episode-wise zero-violation guarantee. |
Toshinori Kitamura; Arnob Ghosh; Tadashi Kozuno; Wataru Kumagai; Kazumi Kasaura; Kenta Hoshino; Yohei Hosoe; Yutaka Matsuo; |
| 600 | To Distill or Decide? Understanding The Algorithmic Trade-off in Partially Observable RL Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper — through a simple but instructive theoretical model called the *perturbed Block MDP*, and controlled experiments on challenging simulated locomotion tasks — we investigate the algorithmic trade-off between privileged expert distillation and standard RL without privileged information. |
Yuda Song; Dhruv Rohatgi; Aarti Singh; Drew Bagnell; |
| 601 | A Learnability Analysis on Neuro-symbolic Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents a comprehensive theoretical analysis of the learnability of neuro-symbolic (NeSy) tasks within hybrid systems. |
Hao-Yuan He; Ming Li; |
| 602 | Self-Perturbed Anomaly-Aware Graph Dynamics for Multivariate Time-Series Anomaly Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Detecting anomalies in multivariate time-series data is an essential task across various domains, yet there are unresolved challenges such as (1) severe class imbalance between normal and anomalous data due to rare anomaly availability in the real world; (2) limited adaptability of the static graph-based methods to dynamically changing inter-variable correlations; and (3) neglect of subtle anomalies due to overfitting to normal patterns in reconstruction-based methods. To tackle these issues, we propose Self-Perturbed Anomaly-Aware Graph Dynamics (SPAGD), a framework for time-series anomaly detection. |
Jinyu Cai; Yuan Xie; Glynnis Lim; Yifang Yin; Roger Zimmermann; See-Kiong Ng; |
| 603 | Restoring Pruned Large Language Models Via Lost Component Compensation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a targeted restoration strategy for pruned models that restores performance while preserving their low cost and high efficiency. |
Zijian Feng; Hanzhang Zhou; Zixiao Zhu; Tianjiao Li; Chua Jia Jim Deryl; Mak Lee Onn; Gee Wah Ng; Kezhi Mao; |
| 604 | Towards Building Model/Prompt-Transferable Attackers Against Large Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Motivated by this research gap, this paper aims to develop a more powerful attack that is transferable to black-box LVLMs of different structures and task-aware prompts of different semantics. |
Xiaowen Cai; Daizong Liu; Xiaoye Qu; Xiang Fang; Jianfeng Dong; Keke Tang; Pan Zhou; Lichao Sun; Wei Hu; |
| 605 | GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce GaussianFusion, a Gaussian-based multi-sensor fusion framework for end-to-end autonomous driving. |
Shuai Liu; Quanmin Liang; Zefeng Li; Boyang Li; Kai Huang; |
| 606 | SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To achieve this, recent approaches resort to 2D-to-3D feature alignment paradigm, which leads to limited 3D understanding capability and potential semantic information loss. In light of this, we propose SIU3R, the first alignment-free framework for generalizable simultaneous understanding and 3D reconstruction from unposed images. |
Qi Xu; Dongxu Wei; Lingzhe Zhao; Wenpu Li; Zhangchi Huang; Shunping Ji; Peidong Liu; |
| 607 | HYPERION: Fine-Grained Hypersphere Alignment for Robust Federated Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Combining strategies at both levels, we present our robust FGL framework, **HYPERION**, which operates all of its components within a unified hyperspherical space. |
Guancheng Wan; Xiaoran Shang; Yuxin Wu; Guibin Zhang; Jinhe Bi; Liangtao Zheng; Xin Lin; Yue Liu; Yanbiao Ma; Wenke Huang; Bo Du; |
| 608 | Brain-Inspired FMRI-to-Text Decoding Via Incremental and Wrap-Up Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While recent advances in large language models (LLMs) have enabled open-vocabulary fMRI-to-text decoding, existing frameworks typically process the entire fMRI sequence in a single step, leading to performance degradation when handling long input sequences due to memory overload and semantic drift. To address this limitation, we propose a brain-inspired sequential fMRI-to-text decoding framework that mimics the human cognitive strategy of segmented and inductive language processing. |
Wentao Lu; Dong Nie; Pengcheng Xue; Zheng Cui; Piji Li; Daoqiang Zhang; Xuyun Wen; |
| 609 | Sampling-Efficient Test-Time Scaling: Self-Estimating The Best-of-N Sampling in Early Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although some studies have explored efficiency improvements, none have addressed both challenges at once. To address this gap, we propose **Self-Truncation Best-of-$N$ (ST-BoN)**, a decoding method that avoids fully generating all $N$ samples and eliminates the need for reward models. |
Yiming Wang; Pei Zhang; Siyuan Huang; Baosong Yang; Zhuosheng Zhang; Fei Huang; Rui Wang; |
| 610 | PoE-World: Compositional World Modeling with Products of Programmatic Experts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We introduce a novel program synthesis method for effectively modeling complex, non-gridworld domains by representing a world model as an exponentially-weighted product of programmatic experts (PoE-World) synthesized by LLMs. |
Wasu Top Piriyakulkij; Yichao Liang; Hao Tang; Adrian Weller; Marta Kryven; Kevin Ellis; |
| 611 | Uni-LoRA: One Vector Is All You Need Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent works such as Tied-LoRA, VeRA, and VB-LoRA push efficiency further by introducing additional constraints to reduce the trainable parameter space. In this paper, we show that the parameter space reduction strategies employed by these LoRA variants can be formulated within a unified framework, Uni-LoRA, where the LoRA parameter space, flattened as a high-dimensional vector space $\mathbb{R}^D$, can be reconstructed through a projection from a subspace $\mathbb{R}^d$, with $d \ll D$. |
Kaiyang Li; Shaobo Han; Qing Su; Wei Li; Zhipeng Cai; Shihao Ji; |
| 612 | Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce DRG-Sapphire, which uses large-scale reinforcement learning (RL) for automated DRG coding from clinical notes. |
Hanyin Wang; Zhenbang Wu; Gururaj J. Kolar; Hariprasad Reddy Korsapati; Brian Bartlett; Bryan Hull; Jimeng Sun; |
| 613 | Stochastic Process Learning Via Operator Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Expanding on neural operators, we propose a novel framework for stochastic process learning across arbitrary domains. |
Yaozhong Shi; Zachary E Ross; Domniki Asimaki; Kamyar Azizzadenesheli; |
| 614 | Strategic Costs of Perceived Bias in Fair Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We develop a game-theoretic model in which candidates from different socioeconomic groups differ in their perceived post-selection value—shaped by social context and, increasingly, by AI-powered tools offering personalized career or salary guidance. |
L. Elisa Celis; Lingxiao Huang; Milind Sohoni; Nisheeth K. Vishnoi; |
| 615 | A Closer Look at Model Collapse: From A Generalization-to-Memorization Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This transition is directly driven by the declining entropy of the synthetic training data produced in each training cycle, which serves as a clear indicator of model degradation. Motivated by this insight, we propose an entropy-based data selection strategy to mitigate the transition from generalization to memorization and alleviate model collapse. |
Lianghe Shi; Meng Wu; Huijie Zhang; Zekai Zhang; Molei Tao; Qing Qu; |
| 616 | AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Therefore, draft models often struggle to fully assimilate the target model’s knowledge due to capacity constraints, leading to suboptimal performance. To address this challenge, we propose AdaSPEC, a novel method that incorporates selective token filtering into the KD process. |
Yuezhou Hu; Jiaxin Guo; Xinyu Feng; Tuo Zhao; |
| 617 | Depth-Width Tradeoffs for Transformers on Graph Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, an open question, which we address here, is what happens if width is allowed to grow linearly while depth is kept fixed. We analyze this setting and provide the surprising result that, with linear width, constant depth suffices for solving a host of graph-based problems. |
Gilad Yehudai; Clayton Sanford; Maya Bechler-Speicher; Orr Fischer; Ran Gilad-Bachrach; Amir Globerson; |
| 618 | Refinement Methods for Distributed Distribution Estimation Under $\ell^p$-Losses Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In order to achieve the optimal rates for different parameter regimes, we introduce refinement methods and develop additional customized techniques in the estimation protocols. |
Deheng Yuan; Tao Guo; Zhongyi Huang; |
| 619 | Measuring and Guiding Monosemanticity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To systematically quantify these limitations, we introduce the Feature Monosemanticity Score (FMS), a novel metric to quantify feature monosemanticity in latent representations. Building on these insights, we propose Guided Sparse Autoencoders (G-SAE), a method that conditions latent representations on labeled concepts during training. |
Ruben Härle; Felix Friedrich; Manuel Brack; Björn Deiseroth; Stephan Waeldchen; Patrick Schramowski; Kristian Kersting; |
| 620 | Amortized Variational Transdimensional Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce CoSMIC normalizing flows (COntextually-Specified Masking for Identity-mapped Components), an extension to neural autoregressive conditional normalizing flow architectures that enables use of a single amortized variational density for inference over a transdimensional (multi-model) conditional target distribution. |
Laurence Davies; Dan MacKinlay; Rafael Oliveira; Scott A Sisson; |
| 621 | Scaling Laws For Scalable Oversight Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, it is still unclear how scalable oversight itself scales. To address this gap, we propose a framework that quantifies the probability of successful oversight as a function of the capabilities of the overseer and the system being overseen. |
Joshua Engels; David D. Baek; Subhash Kantamneni; Max Tegmark; |
| 622 | A Near-Optimal Algorithm for Decentralized Convex-Concave Finite-Sum Minimax Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we study the distributed convex-concave finite-sum minimax optimization over the network, and a decentralized variance-reduced optimistic gradient method with stochastic mini-batch sizes (DIVERSE) is proposed. |
Hongxu Chen; Ke Wei; Haishan Ye; Luo Luo; |
| 623 | Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we explore a novel test-time computing paradigm, namely learning with calibration (ST-TTC), for spatio-temporal forecasting. |
Wei Chen; Yuxuan Liang; |
| 624 | FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce FPSAttention, a novel training-aware co-design of FP8 quantization and Sparsity for video generation, with a focus on the 3D bi-directional attention mechanism. |
Akide Liu; Zeyu Zhang; Zhexin Li; Xuehai Bai; Yuanjie Xing; Yizeng Han; Jiasheng Tang; Jichao Wu; Mingyang Yang; Weihua Chen; Jiahao He; Yuanyu He; Fan Wang; Gholamreza Haffari; Bohan Zhuang; |
| 625 | Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Language-based approaches offer structure but often fail to explain motions due to their tacit nature—intuitively understood but difficult to verbalize. To address these challenges, we propose Disentangled Action aNd Context concept-based Explainable (DANCE) video action recognition, a framework that predicts actions through disentangled concept types: motion dynamics, objects, and scenes. |
Jongseo Lee; Wooil Lee; Gyeong-Moon Park; Seong Tae Kim; Jinwoo Choi; |
| 626 | Physics-Driven Spatiotemporal Modeling for AI-Generated Video Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a physics-driven AI-generated video detection paradigm based on probability flow conservation principles. |
Shuhai Zhang; ZiHao Lian; Jiahao Yang; Daiyuan Li; Guoxuan Pang; Feng Liu; Bo Han; Shutao Li; Mingkui Tan; |
| 627 | Robust Graph Condensation Via Classification Complexity Mitigation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Although this property is critical for effective GC performance, it remains highly vulnerable to adversarial perturbations. To tackle this vulnerability and improve GC robustness, we adopt the geometry perspective of the graph data manifold and propose a novel Manifold-constrained Robust Graph Condensation framework named MRGC. |
Jiayi Luo; Qingyun Sun; Beining Yang; Haonan Yuan; Xingcheng Fu; Yanbiao Ma; Jianxin Li; Philip S. Yu; |
| 628 | CausalPFN: Amortized Causal Effect Estimation Via In-Context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present CausalPFN, a single transformer that *amortizes* this workflow: trained once on a large library of simulated data-generating processes that satisfy ignorability, it infers causal effects for new observational datasets out of the box. |
Vahid Balazadeh; Hamidreza Kamkari; Valentin Thomas; Junwei Ma; Bingru Li; Jesse C. Cresswell; Rahul Krishnan; |
| 629 | High-order Equivariant Flow Matching for Density Functional Theory Hamiltonian Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose QHFlow, a high-order equivariant flow matching framework that generates Hamiltonian matrices conditioned on molecular geometry. |
Seongsu Kim; Nayoung Kim; Dongwoo Kim; Sungsoo Ahn; |
| 630 | 🎧MOSPA: Human Motion Generation Driven By Spatial Audio Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: As of yet, these models typically overlook the impact of spatial features encoded in spatial audio signals on human motion. To bridge this gap and enable high-quality modeling of human movements in response to spatial audio, we introduce the first comprehensive Spatial Audio-Driven Human Motion (SAM) dataset, which contains diverse and high-quality spatial audio and motion data. |
Shuyang Xu; Zhiyang Dou; Mingyi Shi; Liang Pan; Leo Ho; Jingbo Wang; Yuan Liu; Cheng Lin; Yuexin Ma; Wenping Wang; Taku Komura; |
| 631 | Absolute Zero: Reinforced Self-play Reasoning with Zero Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Furthermore, in a hypothetical future where AI surpasses human intelligence, tasks provided by humans may offer limited learning potential for a superintelligent system. To address these concerns, we propose a new RLVR paradigm called Absolute Zero, in which a single model learns to propose tasks that maximize its own learning progress and improves reasoning by solving them, without relying on any external human or distillation data. |
Andrew Zhao; Yiran Wu; Yang Yue; Tong Wu; Quentin Xu; Yang Yue; Matthieu Lin; Shenzhi Wang; Qingyun Wu; Zilong Zheng; Gao Huang; |
| 632 | Graph-Smoothed Bayesian Black-Box Shift Estimator and Its Information Geometry Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Classical black-box shift estimators invert an empirical confusion matrix of a frozen classifier, producing a brittle point estimate that ignores sampling noise and similarity among classes. We present Graph-Smoothed Bayesian BBSE (GS-B³SE), a fully probabilistic alternative that places Laplacian-Gaussian priors on both target log-priors and confusion-matrix columns, tying them together on a label-similarity graph. |
Masanari Kimura; |
| 633 | Vision Transformers Don’t Need Trained Registers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We investigate the mechanism underlying a previously identified phenomenon in Vision Transformers — the emergence of high-norm tokens that lead to noisy attention maps (Darcet et al., 2024). |
Nicholas Jiang; Amil Dravid; Alexei A Efros; Yossi Gandelsman; |
| 634 | Object-centric 3D Motion Field for Robot Learning from Human Videos Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing action representations such as video frames, pixelflow, and pointcloud flow have inherent limitations such as modeling complexity or loss of information. In this paper, we propose to use object-centric 3D motion field to represent actions for robot learning from human videos, and present a novel framework for extracting this representation from videos for zero-shot control. |
Zhao-Heng Yin; Sherry Yang; Pieter Abbeel; |
| 635 | Sparse VideoGen2: Accelerate Video Generation with Sparse Attention Via Semantic-Aware Permutation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose SVG2, a training-free framework that maximizes identification accuracy and minimizes computation waste, achieving a Pareto frontier trade-off between generation quality and efficiency. |
Shuo Yang; Haocheng Xi; Yilong Zhao; Muyang Li; Jintao Zhang; Han Cai; Yujun Lin; Xiuyu Li; Chenfeng Xu; Kelly Peng; Jianfei Chen; Song Han; Kurt Keutzer; Ion Stoica; |
| 636 | Aggregation Hides Out-of-Distribution Generalization Failures from Spurious Correlations Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Using a simple gradient-based method, OODSelect, we identify semantically coherent OOD subsets where accuracy-on-the-line breaks down. Across widely used distribution-shift benchmarks, OODSelect uncovers subsets, sometimes comprising more than half of the standard OOD set, where higher ID accuracy predicts lower OOD accuracy. |
Olawale Elijah Salaudeen; Haoran Zhang; Kumail Alhamoud; Sara Beery; Marzyeh Ghassemi; |
| 637 | Mean-Field Sampling for Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose a new algorithm SUBSAMPLE-MFQ (Subsample-Mean-Field-Q-learning) and a decentralized randomized policy for a system with n agents. |
Emile Timothy Anand; Ishani Karmarkar; Guannan Qu; |
| 638 | EAG3R: Event-Augmented 3D Geometry Estimation for Dynamic and Extreme-Lighting Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose EAG3R, a novel geometry estimation framework that augments pointmap-based reconstruction with asynchronous event streams. |
Xiaoshan Wu; Yifei Yu; Xiaoyang Lyu; Yi-Hua Huang; Bo Wang; Baoheng Zhang; Zhongrui Wang; XIAOJUAN QI; |
| 639 | LaViDa: A Large Diffusion Model for Vision-Language Understanding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LaViDa, a family of VLMs built on DMs. |
Shufan Li; Konstantinos Kallidromitis; Hritik Bansal; Akash Gokul; Yusuke Kato; Kazuki Kozuka; Jason Kuen; Zhe Lin; Kai-Wei Chang; Aditya Grover; |
| 640 | Theory-Driven Label-Specific Representation for Incomplete Multi-View Multi-Label Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To tackle the complex yet highly practical challenges, we propose a Theory-Driven Label-Specific Representation (TDLSR) framework. |
Quanjiang Li; Tianxiang Xu; Tingjin Luo; Yan Zhong; Yang Li; Yiyun Zhou; Chenping Hou; |
| 641 | Scalable Fingerprinting of Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Hence, we pose scalability as a crucial requirement for fingerprinting schemes. |
Anshul Nasery; Jonathan Hayase; Creston Brooks; Peiyao Sheng; Himanshu Tyagi; Pramod Viswanath; Sewoong Oh; |
| 642 | Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Process Reward Models (PRMs) offer a way to align the retrieval process of KG-based RAG with query-specific knowledge requirements, but they heavily rely on process-level supervision signals that are expensive and hard to obtain on KGs. To address this challenge, we propose GraphFlow, a framework that efficiently retrieves accurate and diverse knowledge required for real-world queries from text-rich KGs. |
Junchi Yu; Yujie Liu; Jindong Gu; Philip Torr; Dongzhan Zhou; |
| 643 | Decomposing Interventional Causality Into Synergistic, Redundant, and Unique Components Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a novel framework for decomposing interventional causal effects into synergistic, redundant, and unique components, building on the intuition of Partial Information Decomposition (PID) and the principle of Möbius inversion. |
Abel Jansma; |
| 644 | Long-Tailed Recognition Via Information-Preservable Two-Stage Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: The imbalance (or long-tail) is the nature of many real-world data distributions, which often induces the undesirable bias of deep classification models toward frequent classes, resulting in poor performance for tail classes. In this paper, we propose a novel two-stage learning approach to mitigate such a majority-biased tendency while preserving valuable information within datasets. |
Fudong Lin; Xu Yuan; |
| 645 | Mozart: Modularized and Efficient MoE Training on 3.5D Wafer-Scale Chiplet Architectures Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the modular organization of the human brain, we propose Mozart, a novel algorithm-hardware co-design framework tailored for efficient training of MoE-based LLMs on 3.5D wafer-scale chiplet architectures. |
Shuqing Luo; Ye Han; Pingzhi Li; Jiayin Qin; Jie Peng; Yang Katie Zhao; Yu Cao; Tianlong Chen; |
| 646 | Data Mixing Can Induce Phase Transitions in Knowledge Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Large Language Models (LLMs) are typically trained on data mixtures: most data come from web scrapes, while a small portion is curated from high-quality sources with dense domain-specific knowledge. In this paper, we show that when training LLMs on such data mixtures, knowledge acquisition from knowledge-dense datasets—unlike training exclusively on knowledge-dense data—does not always follow a smooth scaling law but can exhibit phase transitions with respect to the mixing ratio and model size. |
Xinran Gu; Kaifeng Lyu; Jiazheng Li; Jingzhao Zhang; |
| 647 | Shallow Diffuse: Robust and Invisible Watermarking Through Low-Dim Subspaces in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce *Shallow Diffuse*, a new watermarking technique that embeds robust and invisible watermarks into diffusion model outputs. |
Wenda Li; Huijie Zhang; Qing Qu; |
| 648 | Offline Guarded Safe Reinforcement Learning for Medical Treatment Optimization Strategies Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To safely improve policy beyond clinician recommendations while ensuring that state-action trajectories remain in-distribution, we propose Offline Guarded Safe Reinforcement Learning (OGSRL), a theoretically grounded model-based offline RL framework. |
Runze Yan; Xun Shen; Akifumi Wachi; Sebastien Gros; Anni Zhao; Xiao Hu; |
| 649 | From Counterfactuals to Trees: Competitive Analysis of Model Extraction Attacks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we formalize and characterize the risks and inherent complexity of model reconstruction, focusing on the “oracle” queries required for faithfully inferring the underlying prediction function. |
Awa Khouna; Julien Ferry; Thibaut Vidal; |
| 650 | Repo2Run: Automated Building Executable Environment for Code Repository at Scale Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: To mitigate the gap, we introduce Repo2Run, the first LLM-based agent that automates the building of executable test environments for any repository at scale. We created a benchmark containing 420 Python repositories with unit tests for evaluation. |
Ruida Hu; Chao Peng; Xinchen Wang; Junjielong Xu; Cuiyun Gao; |
| 651 | Towards Dynamic 3D Reconstruction of Hand-Instrument Interaction in Ophthalmic Surgery Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce OphNet-3D, the first extensive RGB-D dynamic 3D reconstruction dataset for ophthalmic surgery, comprising 41 sequences from 40 surgeons and totaling 7.1 million frames, with fine-grained annotations of 12 surgical phases, 10 instrument categories, dense MANO hand meshes, and full 6-DoF instrument poses. |
Ming Hu; Zhengdi Yu; Feilong Tang; Kaiwen Chen; Yulong Li; Imran Razzak; Junjun He; Tolga Birdal; Kaijing Zhou; Zongyuan Ge; |
| 652 | Unlocking Dataset Distillation with Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This trend arises because naive backpropagation through the long denoising chain leads to vanishing gradients, which prevents effective synthetic sample optimization. To address this limitation, we introduce Latent Dataset Distillation with Diffusion Models (LD3M), the first method to learn gradient-based distilled latents and class embeddings end-to-end through a pre-trained latent diffusion model. |
Brian Bernhard Moser; Federico Raue; Sebastian Palacio; Stanislav Frolov; Andreas Dengel; |
| 653 | Vision-centric Token Compression in Large Language Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Vision Centric Token Compression (Vist), a slow-fast compression framework that mirrors human reading: the fast path renders distant tokens into images, letting a frozen, lightweight vision encoder skim the low-salience context; the slow path feeds the proximal window into the LLM for fine-grained reasoning. |
Ling Xing; Alex Jinpeng Wang; Rui Yan; Xiangbo Shu; Jinhui Tang; |
| 654 | VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose a carefully designed multi-stage training methodology that progressively trains LLM to understand both visual and speech information, ultimately enabling fluent vision and speech interaction. |
Chaoyou Fu; Haojia Lin; Xiong Wang; YiFan Zhang; Yunhang Shen; Xiaoyu Liu; Haoyu Cao; Zuwei Long; Heting Gao; Ke Li; Long MA; Xiawu Zheng; Rongrong Ji; Xing Sun; Caifeng Shan; Ran He; |
| 655 | EDELINE: Enhancing Memory in Diffusion-based World Models Via Linear-Time Sequence Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce EDELINE, a unified world model architecture that integrates state space models with diffusion models. |
Jia-Hua Lee; Bor-Jiun Lin; Wei-Fang Sun; Chun-Yi Lee; |
| 656 | UniTok: A Unified Tokenizer for Visual Generation and Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we show that reconstruction and semantic supervision do not inherently conflict. |
Chuofan Ma; Yi Jiang; Junfeng Wu; Jihan Yang; Xin Yu; Zehuan Yuan; BINGYUE PENG; XIAOJUAN QI; |
| 657 | NormFit: A Lightweight Solution for Few-Shot Federated Learning with Non-IID Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To simultaneously address all those limitations, we propose NormFit, a lightweight solution that selectively fine-tunes only a very small portion of the model parameters, specifically only the Pre-LayerNorm parameters of the vision encoder within a VLM. |
Azadeh Motamedi; Jae-Mo Kang; Il-Min Kim; |
| 658 | Towards Comprehensive Scene Understanding: Integrating First and Third-Person Views for LVLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While this view offers fine-grained cues about user attention and hand-object interactions, its narrow field of view and lack of global context often lead to failures on spatially or contextually demanding queries. To address this, we introduce a framework that augments egocentric inputs with third-person (exocentric) views, providing complementary information such as global scene layout and object visibility to LVLMs. |
Insu Lee; Wooje Park; Jaeyun Jang; Minyoung Noh; Kyuhong Shim; Byonghyo Shim; |
| 659 | Accelerating Visual-Policy Learning Through Parallel Differentiable Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose a computationally efficient algorithm for visual policy learning that leverages differentiable simulation and first-order analytical policy gradients. |
Haoxiang You; Yilang Liu; Ian Abraham; |
| 660 | Online Prediction with Limited Selectivity Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a model of Prediction with Limited Selectivity (PLS) where the forecaster can start the prediction only on a subset of the time horizon. |
Licheng Liu; Mingda Qiao; |
| 661 | Generative Trajectory Stitching Through Diffusion Composition Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose CompDiffuser, a novel generative approach that can solve new tasks by learning to compositionally stitch together shorter trajectory chunks from previously seen tasks. |
Yunhao Luo; Utkarsh Aashu Mishra; Yilun Du; Danfei Xu; |
| 662 | VLMs Have Tunnel Vision: Evaluating Nonlocal Visual Reasoning in Leading VLMs Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present an evaluation that tests vision-language models’ capacity for \emph{nonlocal visual reasoning}- reasoning that requires chaining evidence collected from multiple, possibly distant, regions of an image. |
Shmuel Berman; Jia Deng; |
| 663 | Learning to Factorize Spatio-Temporal Foundation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we introduce FactoST, a factorized STFM that decouples universal temporal pretraining from spatio-temporal adaptation. |
Siru Zhong; Junjie Qiu; Yangyu Wu; Xingchen Zou; Zhongwen Rao; Bin Yang; Chenjuan Guo; Hao Xu; Yuxuan Liang; |
| 664 | Angles Don’t Lie: Unlocking Training-Efficient RL Through The Model’s Own Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we identify a model-inherent signal termed *angle concentration* that effectively reflects an LLM’s capacity to learn from specific data. |
Qinsi Wang; Jinghan Ke; Hancheng Ye; Yueqian Lin; Yuzhe Fu; Jianyi Zhang; Kurt Keutzer; Chenfeng Xu; Yiran Chen; |
| 665 | Generalizable Insights for Graph Transformers in Theory and Practice Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we propose the Generalized-Distance Transformer (GDT), a GT architecture using standard attention that incorporates many advancements for GTs from recent years, and develop a fine-grained understanding of the GDT’s representation power in terms of attention and PEs. |
Timo Stoll; Luis Müller; Christopher Morris; |
| 666 | Principled Data Augmentation for Learning to Solve Quadratic Programming Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This work introduces a principled approach to data augmentation tailored for QPs via MPNNs. |
Chendi Qian; Christopher Morris; |
| 667 | Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To resolve this, we propose Dual Data Alignment (DDA), which aligns both the pixel and frequency domains. Moreover, we introduce two new test sets: DDA-COCO, containing DDA-aligned synthetic images, and EvalGEN, featuring the latest generative models. |
Ruoxin Chen; Junwei Xi; Zhiyuan Yan; Ke-Yue Zhang; Shuang Wu; Jingyi Xie; Xu Chen; Lei Xu; Isabel Guan; Taiping Yao; Shouhong Ding; |
| 668 | CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Ignoring this diversity may lead to suboptimal representations and weakened generalization ability. To address these limitations, we propose CSBrain, a Cross-scale Spatiotemporal Brain foundation model for generalized EEG decoding. |
Yuchen Zhou; Jiamin Wu; Zichen Ren; Zhouheng Yao; Weiheng Lu; Kunyu Peng; Qihao Zheng; Chunfeng Song; Wanli Ouyang; Chao Gou; |
| 669 | Flattening Hierarchies with Policy Bootstrapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we introduce an algorithm to train a flat (non-hierarchical) goal-conditioned policy by bootstrapping on subgoal-conditioned policies with advantage-weighted importance sampling. |
John Luoyu Zhou; Jonathan Kao; |
| 670 | HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we pioneer textual reference-guided human action segmentation in multi-person settings, where a textual description specifies the target person for segmentation. We introduce the first dataset for Referring Human Action Segmentation, RHAS133, built from 133 movies and annotated with 137 fine-grained actions over 33 hours of video, together with textual descriptions for this new task. |
Kunyu Peng; Junchao Huang; Xiangsheng Huang; Di Wen; Junwei Zheng; Yufan Chen; Kailun Yang; Jiamin Wu; Chongqing Hao; Rainer Stiefelhagen; |
| 671 | Geometry Meets Incentives: Sample-Efficient Incentivized Exploration with Linear Contexts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Namely, we consider the linear bandit model with actions in the Euclidean unit ball, and give an incentive-compatible exploration algorithm with sample complexity that scales polynomially with the dimension and other parameters. |
Benjamin Schiffer; Mark Sellke; |
| 672 | GeoRemover: Removing Objects and Their Causal Visual Artifacts Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We identify that these limitations stem from ignoring the causal relationship between an object’s geometry presence and its visual effects. To address this limitation, we propose a geometry-aware two-stage framework that decouples object removal into (1) geometry removal and (2) appearance rendering. |
Zixin Zhu; Haoxiang Li; Xuelu Feng; He Wu; Chunming Qiao; Junsong Yuan; |
| 673 | The Primacy of Magnitude in Low-Rank Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we establish update magnitude as the fundamental driver of LoRA performance and propose LoRAM, a magnitude-driven “Basis & Basis” initialization scheme that matches spectral methods without their inefficiencies. |
Zicheng Zhang; Haoran Li; Yifeng Zhang; Guoqiang Gong; Jiaxing Wang; Pengzhang Liu; Qixia Jiang; Junxing Hu; |
| 674 | GeRaF: Neural Geometry Reconstruction from Radio Frequency Signals Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Moreover, RF signals interact with surfaces via specular reflections requiring fundamentally different modeling. To address these challenges, GeRaF (1) introduces filter-based rendering to suppress irrelevant signals, (2) implements a physics-based RF volumetric rendering pipeline, and (3) proposes a novel lens-less sampling and lens-less alpha blending strategy that makes full-space sampling feasible during training. |
Jiachen Lu; Hailan Shanbhag; Haitham Al Hassanieh; |
| 675 | Spatial Understanding from Videos: Structured Prompts Meet Simulation Data Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, existing methods face spatial uncertainty and data scarcity, limiting the 3D spatial reasoning capability of pre-trained vision-language models (VLMs). To address these challenges, we present a unified framework for enhancing 3D spatial reasoning in pre-trained VLMs without modifying their architecture. |
Haoyu Zhang; Meng Liu; Zaijing Li; Haokun Wen; Weili Guan; Yaowei Wang; Liqiang Nie; |
| 676 | SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these limitations, we propose Saturn, a SAT-based RL framework that uses Boolean Satisfiability (SAT) problems to train and evaluate LLM reasoning. We release the source code, data, and models to support future research. |
Huanyu Liu; Jia Li; Hao Zhu; Kechi Zhang; Yihong Dong; Ge Li; |
| 677 | JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper presents JavisGPT, the first unified multimodal large language model (MLLM) for Joint Audio-Video (JAV) comprehension and generation. |
Kai Liu; Jungang Li; Yuchong Sun; Shengqiong Wu; Jianzhang Gao; Daoan Zhang; Wei Zhang; Sheng Jin; Sicheng Yu; Geng Zhan; Jiayi Ji; Fan Zhou; Liang Zheng; Shuicheng YAN; Hao Fei; Tat-Seng Chua; |
| 678 | Dimension-adapted Momentum Outscales SGD Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We investigate scaling laws for stochastic momentum algorithms on the power law random features model, parameterized by data complexity, target complexity, and model size. |
Damien Ferbach; Katie Everett; Gauthier Gidel; Elliot Paquette; Courtney Paquette; |
| 679 | From Experts to A Generalist: Toward General Whole-Body Control for Humanoid Robots Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While existing frameworks excel in training single motion-specific policies, they struggle to generalize across highly varied behaviors due to conflicting control requirements and mismatched data distributions. In this work, we propose BumbleBee (BB), an expert-generalist learning framework that combines motion clustering and sim-to-real adaptation to overcome these challenges. |
Yuxuan Wang; Ming Yang; Ziluo Ding; Yu Zhang; Weishuai Zeng; Xinrun Xu; Haobin Jiang; Zongqing Lu; |
| 680 | LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce LoRAShop, the first framework for multi-concept image generation and editing with LoRA models. |
Yusuf Dalva; Hidir Yesiltepe; Pinar Yanardag; |
| 681 | GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce GeoSVR, an explicit voxel-based framework that explores and extends the under-investigated potential of sparse voxels for achieving accurate, detailed, and complete surface reconstruction. |
Jiahe Li; Jiawei Zhang; Youmin Zhang; Xiao Bai; Jin Zheng; Xiaohan Yu; Lin Gu; |
| 682 | Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Treating the original transformer-based model as the Pilot, we correspondingly design a Copilot model to refine the Pilot’s inference performance via logits rectification. |
Jiaru Zou; Yikun Ban; Zihao Li; Yunzhe Qi; Ruizhong Qiu; Ling Yang; Jingrui He; |
| 683 | ScMRDR: A Scalable and Flexible Framework for Unpaired Single-cell Multi-omics Data Integration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce a scalable and flexible generative framework called single-cell Multi-omics Regularized Disentangled Representations (scMRDR) for unpaired multi-omics integration. |
Jianle Sun; Chaoqi Liang; Ran Wei; Peng Zheng; LEI BAI; Wanli Ouyang; Hongliang Yan; Peng Ye; |
| 684 | Imitation Beyond Expectation Using Pluralistic Stochastic Dominance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We reformulate imitation learning using stochastic dominance over the demonstrations’ reward distribution across a range of reward functions as our foundational aim. |
Ali Farajzadeh; Danyal Saeed; Syed M Abbas; Rushit N. Shah; Aadirupa Saha; Brian D Ziebart; |
| 685 | Cloud4D: Estimating Cloud Properties at A High Spatial and Temporal Resolution Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Cloud4D, the first learning-based framework that reconstructs a physically consistent, four-dimensional cloud state using only synchronized ground-based cameras. |
Jacob Lin; Edward Gryspeerdt; Ronald Clark; |
| 686 | CLiFT: Compressive Light-Field Tokens for Compute Efficient and Adaptive Neural Rendering Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper proposes a neural rendering approach that represents a scene as compressed light-field tokens (CLiFTs), retaining rich appearance and geometric information of a scene. |
Zhengqing Wang; Yuefan Wu; Jiacheng Chen; Fuyang Zhang; Yasutaka Furukawa; |
| 687 | Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, we show that each category has inherent limitations: density and sample constraints tend to be overly conservative in many scenarios, while the support constraint, though least restrictive, faces challenges in accurately modeling the behavior policy. To overcome these limitations, we propose a new neighborhood constraint that restricts action selection in the Bellman target to the union of neighborhoods of dataset actions. |
Yixiu Mao; Yun Qu; Cheems Wang; Xiangyang Ji; |
| 688 | To Think or Not To Think: A Study of Thinking in Rule-Based Visual Reinforcement Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Challenging the convention that explicit thinking is crucial for the success of RFT, we introduce *No-Thinking-RFT*, exploring RFT without thinking by introducing a simple equality accuracy reward. |
Ming Li; Jike Zhong; Shitian Zhao; Yuxiang Lai; Haoquan Zhang; Wang Bill Zhu; Kaipeng Zhang; |
| 689 | DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we leverage garment structural correspondence to automatically generate a dataset with diverse trajectories using only a single expert demonstration, significantly reducing manual intervention. |
Yuran Wang; Ruihai Wu; Yue Chen; Jiarui Wang; Jiaqi Liang; Ziyu Zhu; Haoran Geng; Jitendra Malik; Pieter Abbeel; Hao Dong; |
| 690 | STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing OPE methods are ineffective for high-dimensional, long-horizon problems, due to exponential blow-ups in variance from importance weighting or compounding errors from learned dynamics models. To address these challenges, we propose STITCH-OPE, a model-based generative framework that leverages denoising diffusion for long-horizon OPE in high-dimensional state and action spaces. |
Hossein Goli; Michael Gimelfarb; Nathan Samuel de Lara; Haruki Nishimura; Masha Itkina; Florian Shkurti; |
| 691 | Neural Atlas Graphs for Dynamic Scene Decomposition and Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose Neural Atlas Graphs (NAGs), a hybrid high-resolution scene representation, where every graph node is a view-dependent neural atlas, facilitating both 2D appearance editing and 3D ordering and positioning of scene elements. |
Jan Philipp Schneider; Pratik Singh Bisht; Ilya Chugunov; Andreas Kolb; Michael Moeller; Felix Heide; |
| 692 | LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present LeMiCa, a training-free and efficient acceleration framework for diffusion-based video generation. |
Huanlin Gao; Ping Chen; Fuyuan Shi; Chao Tan; Zhaoxiang Liu; Fang Zhao; Kai Wang; Shiguo Lian; |
| 693 | Robust SuperAlignment: Weak-to-Strong Robustness Generalization for Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To this end, we are the first to propose the weak-to-strong (adversarial) robustness generalization method to elicit zero-shot robustness in large-scale models by an unsupervised scheme, mitigating the unreliable information source for alignment from two perspectives: alignment re-weighting and source guidance refinement. |
Junhao Dong; Cong Zhang; Xinghua Qu; Zejun Ma; Piotr Koniusz; Yew-Soon Ong; |
| 694 | Thought Communication in Multiagent Collaboration Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To go beyond language, we introduce a new paradigm, *thought communication*, which enables agents to interact directly mind-to-mind, akin to telepathy. |
Yujia Zheng; Zhuokai Zhao; Zijian Li; Yaqi Xie; Mingze Gao; Lizhu Zhang; Kun Zhang; |
| 695 | Polyline Path Masked Attention for Vision Transformer Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose Polyline Path Masked Attention (PPMA) that integrates the self-attention mechanism of ViTs with an enhanced structured mask of Mamba2, harnessing the complementary strengths of both architectures. |
Zhongchen Zhao; Chaodong Xiao; Hui Lin; Qi Xie; Lei Zhang; Deyu Meng; |
| 696 | PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Third, extracting road conditions for missing points is non-trivial. To address these challenges, we propose *PLMTrajRec*, a novel trajectory recovery model. |
Tonglong Wei; Yan Lin; Youfang Lin; Shengnan Guo; Jilin Hu; Haitao Yuan; Gao Cong; Huaiyu Wan; |
| 697 | Optimization Inspired Few-Shot Adaptation for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we reinterpret the forward pass of LLMs as an optimization process, a sequence of preconditioned gradient descent steps refining internal representations. |
Boyan Gao; Xin Wang; Yibo Yang; David A. Clifton; |
| 698 | Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by Information Foraging Theory (IFT), we propose InForage, a reinforcement learning framework that formalizes retrieval-augmented reasoning as a dynamic information-seeking process. To facilitate training, we construct a human-guided dataset capturing iterative search and reasoning trajectories for complex, real-world web tasks. We provide all code and datasets in the supplementary materials. |
Hongjin Qian; Zheng Liu; |
| 699 | Taccel: Scaling Up Vision-based Tactile Robotics Via High-performance GPU Simulation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: We present Taccel, a high-performance simulation platform that integrates Incremental Potential Contact (IPC) and Affine Body Dynamics (ABD) to model robots, tactile sensors, and objects with both accuracy and unprecedented speed, achieving a total of 915 FPS with 4096 parallel environments. |
Yuyang Li; Wenxin Du; Chang Yu; Puhao Li; Zihang Zhao; Tengyu Liu; Chenfanfu Jiang; Yixin Zhu; Siyuan Huang; |
| 700 | Toward Relative Positional Encoding in Spiking Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce several strategies to approximate relative positional encoding (RPE) in spiking Transformers while preserving the binary nature of spikes. |
Changze Lv; Yansen Wang; Dongqi Han; Yifei Shen; Xiaoqing Zheng; Xuanjing Huang; Dongsheng Li; |
| 701 | TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This paper demonstrates that MLA provides superior expressive power compared to GQA with the same KV cache overhead, thereby offering a rationale for transitioning from GQA to MLA. |
Fanxu Meng; Pingzhi Tang; Zengwei Yao; Xing Sun; Muhan Zhang; |
| 702 | Towards Understanding The Mechanisms of Classifier-Free Guidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Our analysis reveals that linear CFG improves generation quality via three distinct components: (i) a mean-shift term that approximately steers samples in the direction of class means, (ii) a positive Contrastive Principal Components (CPC) term that amplifies class-specific features, and (iii) a negative CPC term that suppresses generic features prevalent in unconditional data. |
Xiang Li; Rongrong Wang; Qing Qu; |
| 703 | PhysX-3D: Physical-Grounded 3D Asset Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Consequently, despite the rapid development of 3D generative models, the synthesized 3D assets often overlook rich and important physical properties, hampering their real-world application in physical domains like simulation and embodied AI. As an initial attempt to address this challenge, we propose **PhysX**, an end-to-end paradigm for physical-grounded 3D asset generation. |
Ziang Cao; Zhaoxi Chen; Liang Pan; Ziwei Liu; |
| 704 | Towards Physics-informed Spatial Intelligence with Human Priors: An Autonomous Driving Pilot Study Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Spatial Intelligence Grid (SIG): a structured, grid-based schema that explicitly encodes object layouts, inter-object relations, and physically grounded priors. |
Guanlin Wu; Boyan Su; Yang Zhao; Pu Wang; Yichen Lin; Hao Frank Yang; |
| 705 | StreamForest: Efficient Online Video Understanding with Persistent Event Memory Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, their effectiveness in real-time streaming scenarios remains limited due to storage constraints of historical visual features and insufficient real-time spatiotemporal reasoning. To address these challenges, we propose StreamForest, a novel architecture specifically designed for streaming video understanding. |
Xiangyu Zeng; Kefan Qiu; Qingyu Zhang; Xinhao Li; Jing Wang; Jiaxin Li; Ziang Yan; Kun Tian; Meng Tian; Xinhai Zhao; Yi Wang; Limin Wang; |
| 706 | Vanish Into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2 Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we first analyze the performance gap of existing attacks between SAM and SAM2 and highlight two key challenges arising from their architectural differences: directional guidance from the prompt and semantic entanglement across consecutive frames. To address these issues, we propose UAP-SAM2, the first cross-prompt universal adversarial attack against SAM2 driven by dual semantic deviation. |
Ziqi Zhou; Yifan Hu; Yufei Song; Zijing Li; Shengshan Hu; Leo Yu Zhang; Dezhong Yao; Long Zheng; Hai Jin; |
| 707 | Evolutionary Multi-View Classification Via Eliminating Individual Fitness Bias Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This makes it difficult for the multi-view model (MVM) to achieve optimal performance during convergence, which in turn leads to FE failing to accurately reflect individual performance rankings, ultimately triggering FEB. To address this issue, we propose an evolutionary multi-view classification via eliminating individual fitness bias (EFB-EMVC) method, which alleviates the FEB issue by introducing evolutionary navigators for each MVM, thereby providing more accurate individual rankings. |
Xinyan Liang; Shuai Li; Qian Guo; Yuhua Qian; Bingbing Jiang; Tingjin Luo; Liang Du; |
| 708 | Right Question Is Already Half The Answer: Fully Unsupervised LLM Reasoning Incentivization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we propose Entropy Minimized Policy Optimization (EMPO), which makes an early attempt at fully unsupervised LLM reasoning incentivization. |
Qingyang Zhang; Haitao Wu; Changqing Zhang; Peilin Zhao; Yatao Bian; |
| 709 | MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, most existing approaches assume modality-matched conditions, significantly limiting their effectiveness in modality-mismatched scenarios. To overcome this limitation and achieve a more flexible ReID, we introduce MDReID to allow any-to-any image-level ReID systems. |
Yingying Feng; Jie Li; Jie Hu; Yukang Zhang; Lei Tan; Jiayi Ji; |
| 710 | RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We furthermore uncover that compensating for the gap between stark singular values contributes to direction robustness. Therefore, we propose RobustMerge, a training-free parameter-efficient merging method with complementary parameter adaptation to maintain direction robustness. |
Fanhu Zeng; Haiyang Guo; Fei Zhu; Li Shen; Hao Tang; |
| 711 | Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Stable Part Diffusion 4D (SP4D), a framework for generating paired RGB and kinematic part videos from monocular inputs. |
Hao Zhang; Chun-Han Yao; Simon Donné; Narendra Ahuja; Varun Jampani; |
| 712 | OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce OnlineSplatter, a novel online feed-forward framework generating high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera pose, depth priors, or bundle optimization. |
Mark He Huang; Lin Geng Foo; Christian Theobalt; Ying Sun; De Wen Soh; |
| 713 | Rectified Point Flow: Generic Point Cloud Pose Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present Rectified Point Flow, a unified parameterization that formulates pairwise point cloud registration and multi-part shape assembly as a single conditional generative problem. |
Tao Sun; Liyuan Zhu; Shengyu Huang; Shuran Song; Iro Armeni; |
| 714 | Cue3D: Quantifying The Role of Image Cues in Single-Image 3D Generation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Cue3D, the first comprehensive, model-agnostic framework for quantifying the influence of individual image cues in single-image 3D generation. |
Xiang Li; Zirui Wang; Zixuan Huang; James Matthew Rehg; |
| 715 | SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these limitations, we propose ***SceneDesigner***, a method for accurate and flexible multi-object 9-DoF pose manipulation. To support training, we construct a new dataset, ***ObjectPose9D***, which aggregates images from diverse sources along with 9D pose annotations. |
Zhenyuan Qin; Xincheng Shuai; Henghui Ding; |
| 716 | StelLA: Subspace Learning in Low-rank Adaptation Using Stiefel Manifold Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we propose a geometry-aware extension of LoRA that uses a three-factor decomposition $USV^\top$. |
Zhizhong Li; Sina Sajadmanesh; Jingtao Li; Lingjuan Lyu; |
| 717 | Non-Clairvoyant Scheduling with Progress Bars Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce a natural setting in which the scheduler receives continuous feedback in the form of progress bars—estimates of the fraction of each job completed over time. |
Ziyad Benomar; Romain Cosson; Alexander Lindermayr; Jens Schlöter; |
| 718 | Jacobian-Based Interpretation of Nonlinear Neural Encoding Model Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, these approaches remain limited in characterizing the brain’s inherently nonlinear response properties. To address this, we propose the Jacobian-based Nonlinearity Evaluation (JNE), an interpretability metric for nonlinear neural encoding models. |
Xiaohui Gao; Haoran Yang; Yue Cheng; Mengfei Zuo; Yiheng Liu; Peiyang Li; Xintao Hu; |
| 719 | Mesh-RFT: Enhancing Mesh Generation Via Fine-grained Reinforcement Fine-Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing pretrained models for 3D mesh generation often suffer from data biases and produce low-quality results, while global reinforcement learning (RL) methods rely on object-level rewards that struggle to capture local structure details. To address these challenges, we present **Mesh-RFT**, a novel fine-grained reinforcement fine-tuning framework that employs Masked Direct Preference Optimization (M-DPO) to enable localized refinement via quality-aware face masking. |
Jian Liu; Jing Xu; Song Guo; Jing Li; Guojingfeng; Jiaao Yu; Haohan Weng; Biwen Lei; Xianghui Yang; Zhuo Chen; Fangqi Zhu; Tao Han; Chunchao Guo; |
| 720 | VisualQuality-R1: Reasoning-Induced Image Quality Assessment Via Reinforcement Learning to Rank Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we introduce VisualQuality-R1, a reasoning-induced no-reference IQA (NR-IQA) model, and we train it with reinforcement learning to rank, a learning algorithm tailored to the intrinsically relative nature of visual quality. |
Tianhe Wu; Jian Zou; Jie Liang; Lei Zhang; Kede Ma; |
| 721 | GenColor: Generative and Expressive Color Enhancement with Pixel-Perfect Texture Preservation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose GenColor, a novel diffusion-based framework for sophisticated, texture-preserving color enhancement. We have released the code and dataset. |
Yi Dong; Yuxi Wang; Xianhui Lin; Wenqi Ouyang; Zhiqi Shen; Peiran Ren; Ruoxi Fan; Rynson W. H. Lau; |
| 722 | G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: Upon close inspection, we are alarmed to discover that prevailing MAS memory mechanisms (1) are overly simplistic, completely disregarding the nuanced inter-agent collaboration trajectories, and (2) lack cross-trial and agent-specific customization, in stark contrast to the expressive memory developed for single agents. To bridge this gap, we introduce G-Memory, a hierarchical, agentic memory system for MAS inspired by organizational memory theory, which manages the lengthy MAS interaction via a three-tier graph hierarchy: insight, query, and interaction graphs. |
Guibin Zhang; Muxin Fu; Kun Wang; Guancheng Wan; Miao Yu; Shuicheng Yan; |
| 723 | FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We propose FSDrive, a visual spatio-temporal CoT framework that enables VLAs to think in images. |
Shuang Zeng; Xinyuan Chang; Mengwei Xie; Xinran Liu; Yifan Bai; Zheng Pan; Mu Xu; Xing Wei; |
| 724 | E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, EGNNs face substantial computational challenges due to the high cost of constructing edge features via spherical tensor products, making them almost impractical for large-scale systems. To address this limitation, we introduce E2Former, an equivariant and efficient transformer architecture that incorporates a Wigner $6j$ convolution (Wigner $6j$ Conv). |
Yunyang Li; Lin Huang; Zhihao Ding; Xinran Wei; Chu Wang; Han Yang; Zun Wang; Chang Liu; Yu Shi; Peiran Jin; Tao Qin; Mark Gerstein; Jia Zhang; |
| 725 | Adaptive Defense Against Harmful Fine-Tuning for Large Language Models Via Bayesian Data Scheduler Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Existing defense strategies preemptively build robustness via attack simulation but suffer from fundamental limitations: (i) the infeasibility of extending attack simulations beyond bounded threat models due to the inherent difficulty of anticipating unknown attacks, and (ii) limited adaptability to varying attack settings, as simulation fails to capture their variability and complexity. To address these challenges, we propose Bayesian Data Scheduler (BDS), an adaptive tuning-stage defense strategy with no need for attack simulation. |
Zixuan Hu; Li Shen; Zhenyi Wang; Yongxian Wei; Dacheng Tao; |
| 726 | Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address these challenges, we propose Neptune-X, a data-centric generative-selection framework that enhances training effectiveness by leveraging synthetic data generation with task-aware sample selection. To support robust benchmarking, we construct the Maritime Generation Dataset, the first dataset tailored for generative maritime learning, encompassing a wide range of semantic conditions. |
Yu Guo; Shengfeng He; Yuxu Lu; Haonan An; Yihang Tao; Huilin Zhu; Jingxian Liu; Yuguang Fang; |
| 727 | AuroRA: Breaking Low-Rank Bottleneck of LoRA with Nonlinear Mapping Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Recent linear LoRA variants have attempted to enhance expressiveness by introducing additional linear mappings; however, their composition remains inherently linear and fails to fundamentally improve LoRA’s representational capacity. To address this limitation, we propose AuroRA, which incorporates an Adaptive Nonlinear Layer (ANL) between two linear projectors to capture *fixed* and *learnable* nonlinearities. |
Haonan Dong; Wenhao Zhu; Guojie Song; Liang Wang; |
| 728 | Puppeteer: Rig and Animate Your 3D Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We present **Puppeteer**, a comprehensive framework that addresses both automatic rigging and animation for diverse 3D objects. |
Chaoyue Song; Xiu Li; Fan Yang; Zhongcong Xu; Jiacheng Wei; Fayao Liu; Jiashi Feng; Guosheng Lin; Jianfeng Zhang; |
| 729 | Talk2Event: Grounded Understanding of Dynamic Scenes from Event Cameras Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Built from real-world driving data, Talk2Event provides over 30,000 validated referring expressions, each enriched with four grounding attributes — appearance, status, relation to viewer, and relation to other objects — bridging spatial, temporal, and relational reasoning. To fully exploit these cues, we propose EventRefer, an attribute-aware grounding framework that dynamically fuses multi-attribute representations through a Mixture of Event-Attribute Experts (MoEE). |
Lingdong Kong; Dongyue Lu; Alan Liang; Rong Li; Yuhao Dong; Tianshuai Hu; Lai Xing Ng; Wei Tsang Ooi; Benoit R Cottereau; |
| 730 | DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Considering the fact that in RF the noisy latent is estimated through direct interpolation between Gaussian noises and clean images at each timestep, we propose Direct Noise Alignment (DNA), which directly refines the desired Gaussian noise in the noise domain, significantly reducing the error accumulation in previous methods. |
Chenxi Xie; Minghan Li; Shuai Li; Yuhui Wu; Qiaosi Yi; Lei Zhang; |
| 731 | Alligat0R: Pre-Training Through Covisibility Segmentation for Relative Camera Pose Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Alligat0R, a novel pre-training approach that replaces cross-view learning with a covisibility segmentation task. |
Thibaut Loiseau; Guillaume Bourmaud; Vincent Lepetit; |
| 732 | DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we revisit convolution as an alternative building block for constructing efficient and expressive diffusion models. |
Yuang Ai; Qihang Fan; Xuefeng Hu; Zhenheng Yang; Ran He; Huaibo Huang; |
| 733 | Injecting Frame-Event Complementary Fusion Into Diffusion for Optical Flow in Challenging Scenes Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on diffusion models, we propose a Multi-Condition Iterative Denoising Decoder. In addition, we propose a dual-modal optical flow dataset for generalization experiments. |
Haonan Wang; Hanyu Zhou; Haoyue Liu; Luxin Yan; |
| 734 | OmniSync: Towards Universal Lip Synchronization Via Diffusion Transformers Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we present OmniSync, a universal lip synchronization framework for diverse visual scenarios. |
Ziqiao Peng; Jiwen Liu; Haoxian Zhang; Xiaoqiang Liu; Songlin Tang; Pengfei Wan; Di Zhang; Hongyan Liu; Jun He; |
| 735 | MesaTask: Towards Task-Driven Tabletop Scene Generation Via 3D Spatial Reasoning Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we formulate a novel task, namely task-oriented tabletop scene generation, which poses significant challenges due to the substantial gap between high-level task instructions and the tabletop scenes. |
Jinkun Hao; Naifu Liang; Zhen Luo; Xudong XU; Weipeng Zhong; Ran Yi; Yichen Jin; Zhaoyang Lyu; Feng Zheng; Lizhuang Ma; Jiangmiao Pang; |
| 736 | Self Forcing: Bridging The Train-Test Gap in Autoregressive Video Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce Self Forcing, a novel training paradigm for autoregressive video diffusion models. |
Xun Huang; Zhengqi Li; Guande He; Mingyuan Zhou; Eli Shechtman; |
| 737 | DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: This paper’s primary objective is to develop a robust generalist perception model capable of addressing multiple tasks under constraints of computational resources and limited training data. |
Canyu Zhao; Yanlong Sun; Mingyu Liu; Huanyi Zheng; Muzhi Zhu; Zhiyue Zhao; Hao Chen; Tong He; Chunhua Shen; |
| 738 | Variational Learning Finds Flatter Solutions at The Edge of Stability Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Here, we analyze the implicit regularization of VL through the Edge of Stability (EoS) framework. |
Avrajit Ghosh; Bai Cong; Rio Yokota; Saiprasad Ravishankar; Rongrong Wang; Molei Tao; Mohammad Emtiyaz Khan; Thomas Möllenhoff; |
| 739 | Seeing Sound, Hearing Sight: Uncovering Modality Bias and Conflict of AI Models in Sound Localization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In contrast, AI shows a pronounced bias toward vision, often failing to suppress irrelevant or conflicting visual input, leading to chance-level performance. To bridge this gap, we present EchoPin, a neuroscience-inspired multimodal model for SSL that emulates human auditory perception. |
Yanhao Jia; Ji Xie; S Jivaganesh; Li Hao; Xu Wu; Mengmi Zhang; |
| 740 | ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: While amortized methods for Bayesian inference and experimental design offer part of the solution, neither approach is optimal in the most general and challenging task, where new data needs to be collected for instant inference. To tackle this issue, we introduce the Amortized Active Learning and Inference Engine (ALINE), a unified framework for amortized Bayesian inference and active data acquisition. |
Daolang Huang; Xinyi Wen; Ayush Bharti; Samuel Kaski; Luigi Acerbi; |
| 741 | Fully Autonomous Neuromorphic Navigation and Dynamic Obstacle Avoidance Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Inspired by the efficiency of biological systems, we propose a fully neuromorphic framework achieving end-to-end obstacle avoidance during navigation with an overall latency of just 2.3 milliseconds. |
Xiaochen Shang; Luo Pengwei; Xinning Wang; Jiayue Zhao; Huilin Ge; Bo Dong; Xin Yang; |
| 742 | Enhancing LLM Watermark Resilience Against Both Scrubbing and Spoofing Attacks Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Based on the redundancy, we propose a watermark scheme with **S**ub-vocabulary decomposed **E**quivalent t**E**xture **K**ey (**SEEK**). |
Huanming Shen; Baizhou Huang; Xiaojun Wan; |
| 743 | Mulberry: Empowering MLLM with O1-like Reasoning and Reflection Via Collective Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this work, we aim to develop an MLLM that understands and solves questions by learning to create each intermediate step of the reasoning involved till the final answer. |
Huanjin Yao; Jiaxing Huang; Wenhao Wu; Jingyi Zhang; Yibo Wang; Shunyu Liu; Yingjie Wang; YuXin Song; Haocheng Feng; Li Shen; Dacheng Tao; |
| 744 | ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This motivates the need for a generative model capable of designing diverse sequences while preserving structural consistency. To address this trade-off, we introduce ProtInvTree, the first reward-guided tree-search framework for protein inverse folding. |
Mengdi Liu; Xiaoxue Cheng; Zhangyang Gao; Hong Chang; Cheng Tan; Shiguang Shan; Xilin Chen; |
| 745 | On The Value of Cross-Modal Misalignment in Multimodal Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: There are two distinct viewpoints on how to address this issue: one suggests mitigating the misalignment, and the other leveraging it. We seek here to reconcile these seemingly opposing perspectives, and to provide a practical guide for practitioners. |
Yichao Cai; Yuhang Liu; Erdun Gao; Tianjiao Jiang; Zhen Zhang; Anton van den Hengel; Javen Qinfeng Shi; |
| 746 | A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: To address this, we propose Unified Video Fusion (UniVF), a novel and unified framework for video fusion that leverages multi-frame learning and optical flow-based feature warping for informative, temporally coherent video fusion. To support its development, we also introduce Video Fusion Benchmark (VF-Bench), the first comprehensive benchmark covering four video fusion tasks: multi-exposure, multi-focus, infrared-visible, and medical fusion. |
Zixiang Zhao; Haowen Bai; Bingxin Ke; Yukun Cui; Lilun Deng; Yulun Zhang; Kai Zhang; Konrad Schindler; |
| 747 | Multi-agent Markov Entanglement Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we uncover the underlying mathematical structure that enables value decomposition. |
Shuze Chen; Tianyi Peng; |
| 748 | SQS: Enhancing Sparse Perception Models Via Query-based Splatting in Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce SQS, a novel query-based splatting pre-training specifically designed to advance SPMs in autonomous driving. |
Haiming Zhang; Yiyao Zhu; Wending Zhou; Xu Yan; Yingjie CAI; Bingbing Liu; Shuguang Cui; Zhen Li; |
| 749 | PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we study building a Perception Language Model (PLM) in a fully open and reproducible framework for transparent research in image and video understanding. |
Jang Hyun Cho; Andrea Madotto; Effrosyni Mavroudi; Triantafyllos Afouras; Tushar Nagarajan; Muhammad Maaz; Yale Song; Tengyu Ma; Shuming Hu; Suyog Jain; Miguel Martin; Huiyu Wang; Hanoona Abdul Rasheed; Peize Sun; Po-Yao Huang; Daniel Bolya; Nikhila Ravi; Shashank Jain; Tammy Stark; Seungwhan Moon; Babak Damavandi; Vivian Lee; Andrew Westbury; Salman Khan; Philipp Kraehenbuehl; Piotr Dollar; Lorenzo Torresani; Kristen Grauman; Christoph Feichtenhofer; |
| 750 | Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models? Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we disentangle MAD into two key components, Majority Voting and inter-agent Debate, and assess their respective contributions. Through extensive experiments across seven NLP benchmarks, we find that Majority Voting alone accounts for most of the performance gains typically attributed to MAD. |
Hyeong Kyu Choi; Jerry Zhu; Sharon Li; |
| 751 | VoxDet: Rethinking 3D Semantic Scene Completion As Dense Object Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: However, this paradigm neglects critical instance-centric discriminability, leading to instance-level incompleteness and ambiguities between adjacent instances. To address this, we highlight a free lunch of SSC labels: voxel-level class labels implicitly carry instance-level insight, which has long been overlooked by the community. |
Wuyang Li; Zhu Yu; Alexandre Alahi; |
| 752 | Approximate Domain Unlearning for Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this paper, we introduce Approximate Domain Unlearning (ADU), a novel problem setting that requires reducing recognition accuracy for images from specified domains (e.g., illustration) while preserving accuracy for other domains (e.g., real). |
Kodai Kawamura; Yuta Goto; Rintaro Yanagi; Hirokatsu Kataoka; Go Irie; |
| 753 | BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: At the intra-species level, variations (e.g., life stages and sexes) are not diminished but rather preserved and better separated in subspaces orthogonal to inter-species distinctions. We provide formal proof and analyses to explain why hierarchical supervision and contrastive objectives encourage these emergent properties. |
Jianyang Gu; Samuel Stevens; Elizabeth G Campolongo; Matthew J Thompson; Net Zhang; Jiaman Wu; Andrei Kopanev; Zheda Mai; Alexander E. White; James Balhoff; Wasla Dahdul; Daniel Rubenstein; Hilmar Lapp; Tanya Berger-Wolf; Wei-Lun Chao; Yu Su; |
| 754 | GraphMaster: Automated Graph Synthesis Via LLM Agents in Data-Limited Environments Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: While large language models (LLMs) demonstrate exceptional text generation capabilities, their direct application to graph synthesis is impeded by context window limitations, hallucination phenomena, and structural consistency challenges. To address these issues, we introduce GraphMaster, the first multi-agent framework specifically designed for graph data synthesis in data-limited environments. |
Enjun Du; Xunkai Li; Tian Jin; Zhihan Zhang; Rong-Hua Li; Guoren Wang; |
| 755 | Q-Insight: Understanding Image Quality Via Visual Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: In this paper, we propose Q-Insight, a reinforcement learning-based model built upon group relative policy optimization (GRPO), which demonstrates strong visual reasoning capability for image quality understanding while requiring only a limited amount of rating scores and degradation labels. |
Weiqi Li; Xuanyu Zhang; Shijie Zhao; Yabin ZHANG; Junlin Li; Li zhang; Jian Zhang; |
| 756 | What Makes A Reward Model A Good Teacher? An Optimization Perspective Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Highlight: However, while this quality is primarily evaluated through accuracy, it remains unclear whether accuracy fully captures what makes a reward model an effective teacher. We address this question from an optimization perspective. |
Noam Razin; Zixuan Wang; Hubert Strauss; Stanley Wei; Jason D. Lee; Sanjeev Arora; |
| 757 | $\Psi$-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: We introduce $\Psi$-Sampler, an SMC-based framework incorporating pCNL-based initial particle sampling for effective inference-time reward alignment with a score-based model. |
TaeHoon Yoon; Yunhong Min; Kyeongmin Yeo; Minhyuk Sung; |
| 758 | Do-PFN: In-Context Learning for Causal Effect Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: Through extensive experiments in synthetic and semi-synthetic settings, we show that our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph. |
Jake Robertson; Arik Reuter; Siyuan Guo; Noah Hollmann; Frank Hutter; Bernhard Schölkopf; |
| 759 | ReSim: Reliable World Simulation for Autonomous Driving Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: This limitation restricts their applicability to tasks such as policy evaluation. In this work, we address this challenge by enriching real-world human demonstrations with diverse non-expert data collected from a driving simulator (e.g., CARLA), and building a controllable world model trained on this heterogeneous corpus. |
Jiazhi Yang; Kashyap Chitta; Shenyuan Gao; Long Chen; Yuqian Shao; Xiaosong Jia; Hongyang Li; Andreas Geiger; Xiangyu Yue; Li Chen; |
| 760 | Differentiable Sparsity Via $D$-Gating: Simple and Versatile Structured Penalization Related Papers Related Patents Related Grants Related Venues Related Experts View Highlight: In this work, we propose $D$-Gating, a fully differentiable structured overparameterization that splits each group of weights into a primary weight vector and multiple scalar gating factors. |
Chris Kolb; Laetitia Frost; Bernd Bischl; David Rügamer; |
This table only includes oral and spotlight accepts. To browse the full list (~5,300 papers), please visit Paper Digest: NeurIPS-2025 (Full List).
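For readers curious about entry 750 above, the Majority Voting baseline it isolates can be sketched in a few lines: each agent answers independently, and the most frequent answer wins. This is an illustrative sketch only, not the authors' implementation; the `majority_vote` helper and the sample answers are hypothetical.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among independent agent responses.

    Ties are broken by first occurrence, following Counter.most_common.
    """
    if not answers:
        raise ValueError("need at least one answer")
    winner, _ = Counter(answers).most_common(1)[0]
    return winner

# Example: five agents answer a multiple-choice question independently.
print(majority_vote(["B", "A", "B", "C", "B"]))  # → B
```

The finding reported in the paper is that this simple aggregation, with no inter-agent debate rounds, already accounts for most of the gains typically attributed to multi-agent debate.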