Paper Digest: AISTATS 2025 Papers & Highlights
To search for papers presented at AISTATS-2025 on a specific topic, please use the search by venue (AISTATS-2025) service. To summarize the latest research published at AISTATS 2025 on a specific topic, you can use the review by venue (AISTATS-2025) service. If you are interested in browsing papers by author, we have a comprehensive list of all authors (AISTATS-2025).
This curated list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that delivers personalized and comprehensive daily updates on the latest research, discussions & news in your field. It also empowers you to read and write articles, get answers, conduct literature reviews, and generate research reports.
Experience the full potential of our services today!
TABLE 1: Paper Digest: AISTATS 2025 Papers & Highlights
| # | Paper | Author(s) |
|---|---|---|
| 1 | Training LLMs with MXFP4 Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present the first near-lossless training recipe that uses MXFP4 GEMMs, which are $2\times$ faster than FP8 on supported hardware. |
Albert Tseng; Tao Yu; Youngsuk Park; |
| 2 | Beyond Size-Based Metrics: Measuring Task-Specific Complexity in Symbolic Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a new complexity measure designed to quantify the difficulty of conducting single-feature global perturbation analysis (SGPA)—a type of analysis commonly applied in fields like physics and risk scoring to understand the global impact of perturbing individual input features. |
Krzysztof Kacprzyk; Mihaela van der Schaar; |
| 3 | Visualizing Token Importance for Black-box Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the problem of auditing \emph{black-box} large language models (LLMs) to ensure they behave reliably when deployed in production settings, particularly in high-stakes domains such as legal, medical, and regulatory compliance. |
Paulius Rauba; Qiyao Wei; Mihaela van der Schaar; |
| 4 | Towards Regulatory-Confirmed Adaptive Clinical Trials: Machine Learning Opportunities and Solutions Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Are there discrepancies in the treatment effectiveness across diverse and under-served populations? We introduce two new objectives for future clinical trials that integrate regulatory constraints and treatment policy value for both the entire population and under-served populations, thus answering some of the questions above in advance. |
Omer Noy Klein; Alihan Hüyük; Ron Shamir; Uri Shalit; Mihaela van der Schaar; |
| 5 | Active Feature Acquisition for Personalised Treatment Assignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing active feature acquisition (AFA) methods, developed for supervised learning, fail to address the unique challenges of CATE, such as confounding, overlap, and the structural similarities of potential outcomes under different treatments. To tackle these challenges, we propose specialised feature acquisition metrics and estimation strategies tailored to the CATE setting. |
Julianna Piskorz; Nicolás Astorga; Jeroen Berrevoets; Mihaela van der Schaar; |
| 6 | How Well Can Transformers Emulate In-Context Newton's Method? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study whether Transformers can perform higher order optimization methods, beyond the case of linear regression. |
Angeliki Giannou; Liu Yang; Tianhao Wang; Dimitris Papailiopoulos; Jason D. Lee; |
| 7 | Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we highlight the following pitfall of prefilling: for batches containing high-varying prompt lengths, significant computation is wasted by the standard practice of padding sequences to the maximum length. |
Siyan Zhao; Daniel Mingyi Israel; Guy Van den Broeck; Aditya Grover; |
| 8 | $f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces $f$-divergence Preference Optimization ($f$-PO), a novel framework that generalizes and extends existing approaches. |
Jiaqi Han; Mingjian Jiang; Yuxuan Song; Stefano Ermon; Minkai Xu; |
| 9 | On The Sample Complexity of Next-Token Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we provide an analysis of empirical risk minimization for sequential inputs generated by order-$k$ Markov chains. |
Oguz Kaan Yüksel; Nicolas Flammarion; |
| 10 | Generalization Lower Bounds for GD and SGD in Smooth Stochastic Convex Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: More specifically, we focus on how training steps $T$ and step-size $\eta$ might affect generalization in smooth stochastic convex optimization (SCO) problems. |
Peiyuan Zhang; Jiaye Teng; Jingzhao Zhang; |
| 11 | Evaluating Prediction-based Interventions with Human Decision Makers In Mind Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we formalize and investigate various models of human decision-making in the presence of a predictive model aid. |
Inioluwa Deborah Raji; Lydia T. Liu; |
| 12 | SteinDreamer: Variance Reduction for Text-to-3D Score Distillation Via Stein Identity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we reveal that gradient estimation in score distillation inherently suffers from high variance. |
Peihao Wang; Zhiwen Fan; Dejia Xu; Dilin Wang; Sreyas Mohan; Forrest Iandola; Rakesh Ranjan; Yilei Li; Qiang Liu; Zhangyang Wang; Vikas Chandra; |
| 13 | Looped ReLU MLPs May Be All You Need As Practical Programmable Computers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we provide an affirmative answer that a looped 23-layer $\mathsf{ReLU}$-$\mathsf{MLP}$ is capable of performing the basic necessary operations, more efficiently and effectively functioning as a programmable computer than a looped Transformer. |
Yingyu Liang; Zhizhou Sha; Zhenmei Shi; Zhao Song; Yufa Zhou; |
| 14 | Every Call Is Precious: Global Optimization of Black-Box Functions with Unknown Lipschitz Constants Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce Every Call is Precious (ECP), a novel global optimization algorithm that minimizes unpromising evaluations by strategically focusing on potentially optimal regions. |
Fares Fourati; Salma Kharrat; Vaneet Aggarwal; Mohamed-Slim Alouini; |
| 15 | Decision from Suboptimal Classifiers: Excess Risk Pre- and Post-Calibration Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we quantify the excess risk (a.k.a. regret) incurred using approximate posterior probabilities in batch binary decision-making. |
Alexandre Perez-Lebel; Gael Varoquaux; Sanmi Koyejo; Matthieu Doutreligne; Marine Le Morvan; |
| 16 | On The Power of Multitask Representation Learning with Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite its growing use, our understanding of the underlying mechanisms remains limited. In this paper, we provide a theoretical analysis elucidating why multi-task representation learning outperforms its single-task counterpart in scenarios involving over-parameterized two-layer convolutional neural networks trained by gradient descent. |
Qiaobo Li; Zixiang Chen; Yihe Deng; Yiwen Kou; Yuan Cao; Quanquan Gu; |
| 17 | Composition and Control with Distilled Energy Diffusion Models and Sequential Monte Carlo Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Diffusion models may be formulated as a time-indexed sequence of energy-based models, where the score corresponds to the negative gradient of an energy function. |
James Thornton; Louis Béthune; Ruixiang ZHANG; Arwen Bradley; Preetum Nakkiran; Shuangfei Zhai; |
| 18 | Choice Is What Matters After Attention Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by the concept of loss aversion from prospect theory in behavioral economics, and the endowment effect as highlighted by Richard H. Thaler, winner of the 2017 Nobel Memorial Prize in Economic Sciences — particularly the principle that "the negative utility of an equivalent loss is approximately twice the positive utility of a comparable gain" — we have developed a new decoding strategy called Loss Sampling. |
Chenhan Fu; Guoming Wang; Juncheng Li; Rongxing Lu; Siliang Tang; |
| 19 | Balls-and-Bins Sampling for DP-SGD Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we show that Balls-and-Bins sampling achieves the "best-of-both" samplers: its implementation is similar to that of Shuffling; models trained using DP-SGD with Balls-and-Bins sampling achieve utility comparable to those trained using DP-SGD with Shuffling at the same noise multiplier; and yet, Balls-and-Bins sampling enjoys similar-or-better privacy amplification compared to Poisson subsampling in practical regimes. |
Lynn Chua; Badih Ghazi; Charlie Harrison; Pritish Kamath; Ravi Kumar; Ethan Jacob Leeman; Pasin Manurangsi; Amer Sinha; Chiyuan Zhang; |
| 20 | Cross-modality Matching and Prediction of Perturbation Responses with Labeled Gromov-Wasserstein Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we extend two Gromov-Wasserstein optimal transport methods to incorporate the perturbation label for cross-modality alignment. |
Jayoung Ryu; Charlotte Bunne; Luca Pinello; Aviv Regev; Romain Lopez; |
| 21 | A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we take a step towards bridging the gap between theory and practice by analyzing an action-conditional self-predictive objective (BYOL-AC) using the ODE framework. |
Khimya Khetarpal; Zhaohan Daniel Guo; Bernardo Avila Pires; Yunhao Tang; Clare Lyle; Mark Rowland; Nicolas Heess; Diana L Borsa; Arthur Guez; Will Dabney; |
| 22 | A Random Matrix Theory Perspective on The Spectrum of Learned Features and Asymptotic Generalization Capabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we provide a random matrix analysis of how fully-connected two-layer neural networks adapt to the target function after a single, but aggressive, gradient descent step. |
Yatin Dandi; Luca Pesce; Hugo Cui; Florent Krzakala; Yue Lu; Bruno Loureiro; |
| 23 | SemlaFlow – Efficient 3D Molecular Generation with Latent Attention and Equivariant Flow Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable E(3)-equivariant message passing architecture. |
Ross Irwin; Alessandro Tibo; Jon Paul Janet; Simon Olsson; |
| 24 | DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a more realistic graph data generation model using Structural Causal Models (SCMs), allowing us to redefine distribution shifts by pinpointing their origins within the generation process. |
Xiaoxue Han; Huzefa Rangwala; Yue Ning; |
| 25 | M$^2$AD: Multi-Sensor Multi-System Anomaly Detection Through Global Scoring and Calibrated Thresholding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, most existing anomaly detection methods are designed for either univariate or single-system multivariate data, making them insufficient for these complex scenarios. To address this, we introduce M$^2$AD, a framework for unsupervised anomaly detection in multivariate time series data from multiple systems. |
Sarah Alnegheimish; Zelin He; Matthew Reimherr; Akash Chandrayan; Abhinav Pradhan; Luca D'Angelo; |
| 26 | Towards Fair Graph Learning Without Demographic Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a novel method for fair graph learning without demographic information. |
Zichong Wang; Nhat Hoang; Xingyu Zhang; Kevin Bello; Xiangliang Zhang; Sundararaja Sitharama Iyengar; Wenbin Zhang; |
| 27 | Fundamental Computational Limits of Weak Learnability in High-dimensional Multi-index Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Multi-index models – functions which only depend on the covariates through a non-linear transformation of their projection on a subspace – are a useful benchmark for investigating feature learning with neural networks. |
Emanuele Troiani; Yatin Dandi; Leonardo Defilippis; Lenka Zdeborova; Bruno Loureiro; Florent Krzakala; |
| 28 | Prior-Fitted Networks Scale to Larger Datasets When Treated As Weak Learners Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To address these challenges, we investigate the fitting assumption for PFNs and input samples. Building on this understanding, we propose \emph{BoostPFN} designed to enhance the performance of these networks, especially for large-scale datasets. |
Yuxin Wang; Botian Jiang; Yiran Guo; Quan Gan; David Wipf; Xuanjing Huang; Xipeng Qiu; |
| 29 | Superiority of Multi-Head Attention: A Theoretical Study in Shallow Transformers in In-Context Linear Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a theoretical analysis of the performance of transformers with softmax attention in in-context learning with linear regression tasks. |
Yingqian Cui; Jie Ren; Pengfei He; Hui Liu; Jiliang Tang; Yue Xing; |
| 30 | A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we theoretically show that, compared to Stepwise ICL, the transformer gains better error correction ability and more accurate predictions if the reasoning from earlier steps (Coherent CoT) is integrated. |
Yingqian Cui; Pengfei He; Xianfeng Tang; Qi He; Chen Luo; Jiliang Tang; Yue Xing; |
| 31 | Fundamental Limits of Perfect Concept Erasure Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, there seems to be an inherent tradeoff between erasure and retaining utility, making it unclear how to achieve perfect concept erasure while maintaining high utility. In this paper, we offer a fresh perspective toward solving this problem by quantifying the fundamental limits of concept erasure through an information-theoretic lens. |
Somnath Basu Roy Chowdhury; Kumar Avinava Dubey; Ahmad Beirami; Rahul Kidambi; Nicholas Monath; Amr Ahmed; Snigdha Chaturvedi; |
| 32 | RetroDiff: Retrosynthesis As Multi-stage Distribution Interpolation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Interestingly, this generation process mirrors the reverse of the widely adopted semi-template retrosynthesis workflow, i.e., from reaction center identification to synthon completion. Based on these designs, we introduce Retrosynthesis Diffusion (RetroDiff), a novel diffusion-based method for the retrosynthesis task. |
Yiming Wang; Yuxuan Song; Yiqun Wang; Minkai Xu; Rui Wang; Hao Zhou; Wei-Ying Ma; |
| 33 | All or None: Identifiable Linear Properties of Next-Token Predictors in Language Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We analyze identifiability as a possible explanation for the ubiquity of linear properties across language models, such as the vector difference between the representations of “easy” and “easiest” being parallel to that between “lucky” and “luckiest”. |
Emanuele Marconato; Sebastien Lachapelle; Sebastian Weichwald; Luigi Gresele; |
| 34 | Exposing Privacy Gaps: Membership Inference Attack on Preference Data for LLM Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate the vulnerability of LLMs aligned using two widely used methods – DPO and PPO – to membership inference attacks (MIAs). |
Qizhang Feng; Siva Rajesh Kasa; SANTHOSH KUMAR KASA; Hyokun Yun; Choon Hui Teo; Sravan Babu Bodapati; |
| 35 | Max-Rank: Efficient Multiple Testing for Conformal Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Some conformal applications give rise to simultaneous testing, and positive dependencies among tests typically exist. We introduce max-rank, a novel correction that exploits these dependencies whilst efficiently controlling the family-wise error rate. |
Alexander Timans; Christoph-Nikolas Straehle; Kaspar Sakmann; Christian A. Naesseth; Eric Nalisnick; |
| 36 | Restructuring Tractable Probabilistic Circuits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing multiplication algorithms require that the circuits respect the same structure, i.e., variable scopes decompose according to the same vtree. In this work, we propose and study the task of restructuring structured(-decomposable) PCs, that is, transforming a structured PC such that it conforms to a target vtree. |
Honghua Zhang; Benjie Wang; Marcelo Arenas; Guy Van den Broeck; |
| 37 | Invariant Link Selector for Spatial-Temporal Out-of-Distribution Problem Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: With the Information Bottleneck (IB) method, we propose an error-bounded Invariant Link Selector that can distinguish invariant components and variant components during the training process to make the deep learning model generalizable for different testing scenarios. |
Katherine Tieu; Dongqi Fu; Jun Wu; Jingrui He; |
| 38 | Primal-Dual Spectral Representation for Off-policy Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the major bottleneck of applying DICE estimators lies in the difficulty of solving the saddle-point optimization involved, especially with neural network implementations. In this paper, we tackle this challenge by establishing a \emph{linear representation} of value function and stationary distribution correction ratio, \emph{i.e.}, primal and dual variables in the DICE framework, using the spectral decomposition of the transition operator. |
Yang Hu; Tianyi Chen; Na Li; Kai Wang; Bo Dai; |
| 39 | What and How Does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we conduct a comprehensive study to understand ICL from a statistical perspective. |
Yufeng Zhang; Fengzhuo Zhang; Zhuoran Yang; Zhaoran Wang; |
| 40 | Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We direct our focus to the complex algebraic learning task of modular addition involving $k$ inputs. Our research presents a thorough analytical characterization of the features learned by stylized one-hidden layer neural networks and one-layer Transformers in addressing this task. |
Chenyang Li; Yingyu Liang; Zhenmei Shi; Zhao Song; Tianyi Zhou; |
| 41 | Optimal Time Complexity Algorithms for Computing General Random Walk Graph Kernels on Sparse Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the first linear time complexity randomized algorithms for unbiased approximation of the celebrated family of general random walk kernels (RWKs) for sparse graphs. |
Krzysztof Marcin Choromanski; Isaac Reid; Arijit Sehanobish; Kumar Avinava Dubey; |
| 42 | Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Traditional methods struggle with geometry-aware data generation, interpolation along meaningful trajectories, and transporting populations via feasible paths. To address these issues, we introduce Geometry-Aware Generative Autoencoder (GAGA), a novel framework that combines extensible manifold learning with generative modeling. |
Xingzhi Sun; Danqi Liao; Kincaid MacDonald; Yanlei Zhang; Guillaume Huguet; Guy Wolf; Ian Adelstein; Tim G. J. Rudner; Smita Krishnaswamy; |
| 43 | When Can We Solve The Weighted Low Rank Approximation Problem in Truly Subquadratic Time? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we show that there is a certain regime in which, even if $A$ and $W$ are dense, we can still hope to solve the weighted low-rank approximation problem in almost linear $n^{1+o(1)}$ time. |
Chenyang Li; Yingyu Liang; Zhenmei Shi; Zhao Song; |
| 44 | Fine-Tuning with Uncertainty-Aware Priors Makes Vision and Language Foundation Models More Reliable Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we improve uncertainty quantification in fine-tuned models by constructing a data-driven uncertainty-aware fine-tuning prior that assigns high probability density to parameters that induce predictive functions with high uncertainty on input points that are meaningfully different from the data. |
Tim G. J. Rudner; Xiang Pan; Yucen Lily Li; Ravid Shwartz-Ziv; Andrew Gordon Wilson; |
| 45 | Implicit Diffusion: Efficient Optimization Through Stochastic Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a general framework and a new algorithm for first-order optimization of parameterized stochastic diffusions, performing jointly, in a single loop, optimization and sampling steps. |
Pierre Marion; Anna Korba; Peter Bartlett; Mathieu Blondel; Valentin De Bortoli; Arnaud Doucet; Felipe Llinares-López; Courtney Paquette; Quentin Berthet; |
| 46 | Offline Multi-task Transfer RL with Representational Penalization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an algorithm to compute pointwise uncertainty measures for the learnt representation in low-rank MDPs, and establish a data-dependent upper bound for the suboptimality of the learnt policy for the target task. |
Avinandan Bose; Simon Shaolei Du; Maryam Fazel; |
| 47 | Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on the connection, we establish a novel framework, \textbf{WIN} rate \textbf{D}ominance (WIND), with a series of efficient algorithms for regularized win rate dominance optimization that approximates iterative BOND in the parameter space. |
Tong Yang; Jincheng Mei; Hanjun Dai; Zixin Wen; Shicong Cen; Dale Schuurmans; Yuejie Chi; Bo Dai; |
| 48 | Bypassing The Exponential Dependency: Looped Transformers Efficiently Learn In-context By Multi-step Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study the in-context learning of linear looped Transformers on linear vector generation tasks. |
Bo Chen; Xiaoyu Li; Yingyu Liang; Zhenmei Shi; Zhao Song; |
| 49 | Rate of Model Collapse in Recursive Training Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we ask how fast model collapse occurs for some well-studied distribution families under maximum likelihood (ML or near ML) estimation during recursive training. |
Ananda Theertha Suresh; Andrew Thangaraj; Aditya Nanda Kishore Khandavally; |
| 50 | ChronosX: Adapting Pretrained Time Series Models with Exogenous Variables Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a new method to incorporate covariates into pretrained time series forecasting models. In order to evaluate our approach, we introduce a benchmark composed of 32 different synthetic datasets with varying dynamics to evaluate the effectiveness of forecasting models with covariates. |
Sebastian Pineda Arango; Pedro Mercado; Shubham Kapoor; Abdul Fatir Ansari; Lorenzo Stella; Huibin Shen; Hugo Henri Joseph Senetaire; Ali Caner Turkmen; Oleksandr Shchur; Danielle C. Maddix; Michael Bohlke-Schneider; Bernie Wang; Syama Sundar Rangapuram; |
| 51 | Gated Recurrent Neural Networks with Weighted Time-Delay Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a novel approach to modeling long-term dependencies in sequential data by introducing a gated recurrent unit (GRU) with a weighted time-delay feedback mechanism. |
N. Benjamin Erichson; Soon Hoe Lim; Michael W. Mahoney; |
| 52 | A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a tractable mathematical model where the interplay between the data and adversarial attacker geometries can be studied, while capturing the core phenomenology observed in the adversarial robustness literature. |
Kasimir Tanner; Matteo Vilucchio; Bruno Loureiro; Florent Krzakala; |
| 53 | Quantifying The Optimization and Generalization Advantages of Graph Neural Networks Over Multilayer Perceptrons Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although existing works have demonstrated the benefits of graph convolution through Laplacian smoothing, expressivity or separability, there remains a lack of quantitative analysis comparing GNNs and MLPs from an optimization and generalization perspective. This study aims to address this gap by examining the role of graph convolution through feature learning theory. |
Wei Huang; Yuan Cao; Haonan Wang; Xin Cao; Taiji Suzuki; |
| 54 | Multi-level Advantage Credit Assignment for Cooperative Multi-Agent Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we formalize the credit assignment level as the number of agents cooperating to obtain a reward, and address scenarios with multiple coexisting levels. |
Xutong Zhao; Yaqi Xie; |
| 55 | Spectral Representation for Causal Estimation with Hidden Confounders Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We study the problem of causal effect estimation in the presence of unobserved confounders, focusing on two settings: instrumental variable (IV) regression with additional observed confounders, and proxy causal learning. |
Haotian Sun; Antoine Moulin; Tongzheng Ren; Arthur Gretton; Bo Dai; |
| 56 | Your Finetuned Large Language Model Is Already A Powerful Out-of-distribution Detector Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We revisit the likelihood ratio between a pretrained large language model (LLM) and its finetuned variant as a criterion for out-of-distribution (OOD) detection. |
Andi Zhang; Tim Z. Xiao; Weiyang Liu; Robert Bamler; Damon Wischik; |
| 57 | Improving Stochastic Cubic Newton with Momentum Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose using a special version of momentum to stabilize the stochastic gradient and Hessian estimates in Newton’s method. |
El Mahdi Chayti; Nikita Doikov; Martin Jaggi; |
| 58 | Level Set Teleportation: An Optimization Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To evaluate teleportation in practice, we develop a projected-gradient method requiring only Hessian-vector products. |
Aaron Mishkin; Alberto Bietti; Robert M. Gower; |
| 59 | Locally Private Estimation with Public Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We also explore scenarios where users have the flexibility to select features for protection manually. In such cases, we propose an estimator and a data-driven parameter tuning strategy, leading to analogous theoretical and empirical results. |
Yuheng Ma; Ke Jia; Hanfang Yang; |
| 60 | Relating Piecewise Linear Kolmogorov Arnold Networks to ReLU Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we explore the connection between Kolmogorov Arnold Networks (KANs) with piecewise linear (univariate real) functions and ReLU networks. |
Nandi Schoots; Mattia Jacopo Villani; Niels uit de Bos; |
| 61 | Unifying Feature-Based Explanations with Functional ANOVA and Cooperative Game Theory Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce a unified framework for local and global feature-based explanations using two well-established concepts: functional ANOVA (fANOVA) from statistics, and the notion of value and interaction from cooperative game theory. |
Fabian Fumagalli; Maximilian Muschalik; Eyke Hüllermeier; Barbara Hammer; Julia Herbinger; |
| 62 | Nonparametric Factor Analysis and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a general framework for identifying latent variables in the nonparametric noisy settings. |
Yujia Zheng; Yang Liu; Jiaxiong Yao; Yingyao Hu; Kun Zhang; |
| 63 | Density Ratio Estimation Via Sampling Along Generalized Geodesics on Statistical Manifolds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that these methods can be regarded as iterating on the Riemannian manifold along a particular curve between the two probability distributions. Making use of the geometry of the manifold, we propose to consider incremental density ratio estimation along generalized geodesics on this manifold. |
Masanari Kimura; Howard Bondell; |
| 64 | Global Optimization of Gaussian Process Acquisition Functions Using A Piecewise-Linear Kernel Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We analyze the theoretical regret bounds of the proposed approximation, and empirically demonstrate the framework on synthetic functions, constrained benchmarks, and a hyperparameter tuning task. |
Yilin Xie; Shiqiang Zhang; Joel Paulson; Calvin Tsay; |
| 65 | A Causal Framework for Evaluating Deferring Systems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We evaluate our approach on synthetic and real datasets for seven deferring systems from the literature. |
Filippo Palomba; Andrea Pugnana; Jose Manuel Alvarez; Salvatore Ruggieri; |
| 66 | LMEraser: Large Model Unlearning Via Adaptive Prompt Tuning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the growing demand for privacy protection in machine learning, we propose an efficient and exact machine unlearning method for Large Models, called LMEraser. |
Jie Xu; Zihan Wu; Cong Wang; Xiaohua Jia; |
| 67 | Leveraging Frozen Batch Normalization for Co-Training in Source-Free Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose co-training the source model with frozen Batch Normalization layers as part of the domain adaptation process. |
Xianwen Deng; Yijun Wang; Zhi Xue; |
| 68 | A Graphical Global Optimization Framework for Parameter Estimation of Statistical Models with Nonconvex Regularization Functions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we develop a novel graph-based method to globally solve optimization problems that contain a generalization of norm-bounding constraints. |
Danial Davarnia; Mohammadreza Kiaghadi; |
| 69 | Partial Information Decomposition for Data Interpretability and Feature Selection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce Partial Information Decomposition of Features (PIDF), a new paradigm for simultaneous data interpretability and feature selection. |
Charles Westphal; Stephen Hailes; Mirco Musolesi; |
| 70 | Optimal Stochastic Trace Estimation in Generative Modeling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate the need for frequent and costly QR decompositions, we propose practical schemes that balance frequency and accuracy, backed by theoretical guarantees. |
Xinyang Liu; Hengrong Du; Wei Deng; Ruqi Zhang; |
| 71 | Bayesian Off-Policy Evaluation and Learning for Large Action Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this framework, we propose sDM, a generic Bayesian approach for OPE and OPL, grounded in both algorithmic and theoretical foundations. |
Imad Aouali; Victor-Emmanuel Brunel; David Rohde; Anna Korba; |
| 72 | Survival Models: Proper Scoring Rule and Stochastic Optimization with Competing Risks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: When dealing with right-censored data, where some outcomes are missing due to a limited observation period, survival analysis, known as \emph{time-to-event analysis}, focuses on predicting the time until an event of interest occurs. |
Julie Alberge; Vincent Maladiere; Olivier Grisel; Judith Abécassis; Gael Varoquaux; |
| 73 | Diffusion Models As Constrained Samplers for Optimization with Unknown Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Depending on the differentiability of the objective function, we propose two different sampling methods. |
Lingkai Kong; Yuanqi Du; Wenhao Mu; Kirill Neklyudov; Valentin De Bortoli; Dongxia Wu; Haorui Wang; Aaron M Ferber; Yian Ma; Carla P Gomes; Chao Zhang; |
| 74 | Quantifying Knowledge Distillation Using Partial Information Decomposition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel multi-level optimization to incorporate redundant information as a regularizer, leading to our framework of Redundant Information Distillation (RID). |
Pasan Dissanayake; Faisal Hamman; Barproda Halder; Ilia Sucholutsky; Qiuyi Zhang; Sanghamitra Dutta; |
| 75 | Protein Fitness Landscape: Spectral Graph Theory Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we present a novel theoretical framework for analyzing and modeling protein fitness landscapes using spectral graph theory. |
Hao Zhu; Daniel M. Steinberg; Piotr Koniusz; |
| 76 | Variational Schrödinger Momentum Diffusion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To obtain a trade-off between transport properties and scalability, we introduce variational Schrödinger momentum diffusion (VSMD), which employs linearized forward score functions (variational scores) to eliminate the dependence on simulated forward trajectories. |
Kevin Rojas; Yixin Tan; Molei Tao; Yuriy Nevmyvaka; Wei Deng; |
| 77 | Policy Teaching Via Data Poisoning in Learning from Human Preferences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: More specifically, we consider the problem of teaching/enforcing a target policy $\pi^\dagger$ by synthesizing preference data. |
Andi Nika; Jonathan Nöther; Debmalya Mandal; Parameswaran Kamalaruban; Adish Singla; Goran Radanovic; |
| 78 | Fair Resource Allocation in Weakly Coupled Markov Decision Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For more general settings, we introduce a count-proportion-based deep reinforcement learning approach. |
Xiaohui Tu; Yossiri Adulyasak; Nima Akbarzadeh; Erick Delage; |
| 79 | FLIPHAT: Joint Differential Privacy for High Dimensional Linear Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To further address the problem, we design a computationally efficient bandit algorithm, \textbf{F}orgetfu\textbf{L} \textbf{I}terative \textbf{P}rivate \textbf{HA}rd \textbf{T}hresholding (FLIPHAT). |
Saptarshi Roy; Sunrit Chakraborty; Debabrota Basu; |
| 80 | Hyperboloid GPLVM for Discovering Continuous Hierarchies Via Nonparametric Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes hyperboloid Gaussian process latent variable models (hGP-LVMs) to embed high-dimensional hierarchical data while preserving the implicit continuity via nonparametric estimation. |
Koshi Watanabe; Keisuke Maeda; Takahiro Ogawa; Miki Haseyama; |
| 81 | Effective Bayesian Causal Inference Via Structural Marginalisation and Autoregressive Orders Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: While principled, this marginalisation over entire causal models, i.e., both causal structures (graphs) and mechanisms, poses a tremendous computational challenge. In this work, we address this challenge by decomposing structure marginalisation into the marginalisation over (i) causal orders and (ii) directed acyclic graphs (DAGs) given an order. |
Christian Toth; Christian Knoll; Franz Pernkopf; Robert Peharz; |
| 82 | Optimising Clinical Federated Learning Through Mode Connectivity-based Model Aggregation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This issue is particularly pronounced in non-IID settings, common in clinical contexts, where variations in data distribution, class imbalance, and training sample sizes result in client heterogeneity. To address this issue, we propose a mode connectivity-based FL framework that ensures the global model resides within the overlapping low-loss regions of all clients in the parameter space. |
Anshul Thakur; Soheila Molaei; Patrick Schwab; Danielle Belgrave; Kim Branson; David A. Clifton; |
| 83 | Information Transfer Across Clinical Tasks Via Adaptive Parameter Optimisation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents Adaptive Parameter Optimisation (APO), a novel framework for optimising shared models across multiple clinical tasks, addressing the challenges of balancing strict parameter sharing—often leading to task conflicts—and soft parameter sharing, which may limit effective cross-task information exchange. |
Anshul Thakur; Elena Gal; Soheila Molaei; Xiao Gu; Patrick Schwab; Danielle Belgrave; Kim Branson; David A. Clifton; |
| 84 | Towards A Mathematical Theory for Consistency Training in Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Consistency models, which were proposed to mitigate the high computational overhead during the sampling phase of diffusion models, facilitate single-step sampling while attaining state-of-the-art empirical performance. When integrated into the training phase, consistency models attempt to train a sequence of consistency functions capable of mapping any point at any time step of the diffusion process to its starting point. |
Gen Li; Zhihan Huang; Yuting Wei; |
| 85 | Regularity in Canonicalized Models: A Theoretical Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Focusing on the importance of end-to-end regularity rather than the projection mapping itself, this paper explores the continuity and regularity of canonicalized models from a theoretical perspective. |
Behrooz Tahmasebi; Stefanie Jegelka; |
| 86 | Sampling in High-Dimensions Using Stochastic Interpolants and Forward-Backward Stochastic Differential Equations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a class of diffusion-based algorithms to draw samples from high-dimensional probability distributions given their unnormalized densities. |
Anand Jerry George; Nicolas Macris; |
| 87 | On The Identifiability of Causal Abstractions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we instead assume interventions on arbitrary subsets of latent variables, which is more realistic. |
Xiusi Li; Sékou-Oumar Kaba; Siamak Ravanbakhsh; |
| 88 | Learning in Herding Mean Field Games: Single-Loop Algorithm with Finite-Time Convergence Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a new class of solvable MFGs, named the "fully herding class", which expands the known solvable class of MFGs and for the first time includes problems with multiple equilibria. We propose a direct policy optimization method, Accelerated Single-loop Actor Critic Algorithm for Mean Field Games (ASAC-MFG), that provably finds a global equilibrium for MFGs within this class, under suitable access to a single trajectory of Markovian samples. |
Sihan Zeng; Sujay Bhatt; Alec Koppel; Sumitra Ganesh; |
| 89 | Poisoning Bayesian Inference Via Data Deletion and Replication Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Research in adversarial machine learning (AML) has shown that statistical models are vulnerable to maliciously altered data. |
Matthieu Carreau; Roi Naveiro; William N. Caballero; |
| 90 | Recursive Learning of Asymptotic Variational Objectives Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To enable online VI in SSMs when the observations are received in real time, we propose maximising an IWAE-type variational lower bound on the asymptotic contrast function, rather than the standard IWAE ELBO, using stochastic approximation. |
Alessandro Mastrototaro; Mathias Müller; Jimmy Olsson; |
| 91 | Performative Reinforcement Learning with Linear Markov Decision Process Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we generalize the results to \emph{linear Markov decision processes}, which are the primary theoretical model of large-scale MDPs. |
Debmalya Mandal; Goran Radanovic; |
| 92 | Corruption Robust Offline Reinforcement Learning with Human Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We aim to design algorithms that identify a near-optimal policy from the corrupted data, with provable guarantees. |
Debmalya Mandal; Andi Nika; Parameswaran Kamalaruban; Adish Singla; Goran Radanovic; |
| 93 | Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present two Policy Gradient-based algorithms with general parametrization in the context of infinite-horizon average reward Markov Decision Process (MDP). |
Swetha Ganesh; Washim Uddin Mondal; Vaneet Aggarwal; |
| 94 | Two-Timescale Linear Stochastic Approximation: Constant Stepsizes Go A Long Way Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we investigate {\it constant} stepsize schemes through the lens of Markov processes, proving that the iterates of both timescales converge to a unique joint stationary distribution in Wasserstein metric. |
Jeongyeol Kwon; Luke Dotson; Yudong Chen; Qiaomin Xie; |
| 95 | HR-Bandit: Human-AI Collaborated Linear Recourse Bandit Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Human doctors frequently recommend actionable recourses that allow patients to modify their conditions to access more effective treatments. Inspired by such healthcare scenarios, we propose the Recourse Linear UCB (\textsf{RLinUCB}) algorithm, which optimizes both action selection and feature modifications by balancing exploration and exploitation. |
Junyu Cao; Ruijiang Gao; Esmaeil Keyvanshokooh; |
| 96 | Steering No-Regret Agents in MFGs Under Model Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our work presents an effective framework for steering agents' behaviors in large-population systems under uncertainty. |
Leo Widmer; Jiawei Huang; Niao He; |
| 97 | Feasible Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce Feasible Learning (FL), a sample-centric learning paradigm where models are trained by solving a feasibility problem that bounds the loss for each training sample. |
Juan Ramirez; Ignacio Hounie; Juan Elenter; Jose Gallego-Posada; Meraj Hashemizadeh; Alejandro Ribeiro; Simon Lacoste-Julien; |
| 98 | Selecting The Number of Communities for Weighted Degree-Corrected Stochastic Block Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We investigate how to select the number of communities for weighted networks without a full likelihood modeling. |
Yucheng Liu; Xiaodong Li; |
| 99 | Randomized Iterative Solver As Iterative Refinement: A Simple Fix Towards Backward Stability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a new perspective that interprets Iterative Sketching and Sketching-and-Precondition as forms of Iterative Refinement. |
Ruihan Xu; Yiping Lu; |
| 100 | Near-Optimal Sample Complexity for Iterated CVaR Reinforcement Learning with A Generative Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the sample complexity problem of risk-sensitive Reinforcement Learning (RL) with a generative model, where we aim to maximize the Conditional Value at Risk (CVaR) with risk tolerance level $\tau$ at each step, named Iterated CVaR. |
Zilong Deng; Simon Khan; Shaofeng Zou; |
| 101 | Conditional Simulation Via Entropic Optimal Transport: Toward Non-parametric Estimation of Conditional Brenier Maps Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we propose a non-parametric estimator for conditional Brenier maps based on the computational scalability of \emph{entropic} optimal transport. |
Ricardo Baptista; Aram-Alexandre Pooladian; Michael Brennan; Youssef Marzouk; Jonathan Niles-Weed; |
| 102 | Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We conduct the convergence analysis of parameter estimation in the contaminated mixture of experts. |
Fanqi Yan; Huy Nguyen; Le Quang Dung; Pedram Akbarian; Nhat Ho; |
| 103 | Cost-Aware Optimal Pairwise Pure Exploration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing works mostly focus on specific pure exploration tasks, without a holistic view of the general pure exploration problem. This work fills this gap by introducing a versatile framework to study pure exploration, with a focus on identifying the pairwise relationships between targeted arm pairs. |
Di Wu; Chengshuai Shi; Ruida Zhou; Cong Shen; |
| 104 | Conditional Diffusions for Amortized Neural Posterior Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we demonstrate the effectiveness of conditional diffusions coupled with high-capacity summary networks for amortized NPE. |
Tianyu Chen; Vansh Bansal; James G. Scott; |
| 105 | Causal Discovery-Driven Change Point Detection in Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The conditional relative Pearson divergence quantifies the distribution difference between consecutive segments in the time series, while the causal discovery method allows a focus on the causal mechanism, facilitating access to independent and identically distributed (IID) samples. |
Shanyun Gao; Raghavendra Addanki; Tong Yu; Ryan A. Rossi; Murat Kocaoglu; |
| 106 | Sequential Kernelized Stein Discrepancy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a sequential version of the kernelized Stein discrepancy goodness-of-fit test, which allows for conducting goodness-of-fit tests for unnormalized densities that are continuously monitored and adaptively stopped. |
Diego Martinez-Taboada; Aaditya Ramdas; |
| 107 | Time-series Attribution Maps with Regularized Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Here, we propose a method to generate attribution maps with identifiability guarantees by developing a regularized contrastive learning algorithm trained on time-series data plus a new attribution method called Inverted Neuron Gradient (collectively named xCEBRA). |
Steffen Schneider; Rodrigo González Laiz; Anastasiia Filippova; Markus Frey; Mackenzie W Mathis; |
| 108 | Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work focuses on the gradient flow dynamics of a neural network model that uses correlation loss to approximate a multi-index function on high-dimensional standard Gaussian data. |
Berfin Simsek; Amire Bendjeddou; Daniel Hsu; |
| 109 | The Cost of Local and Global Fairness in Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes a framework that investigates the minimum accuracy lost for enforcing a specified level of global and local fairness in multi-class FL settings. |
Yuying Duan; Gelei Xu; Yiyu Shi; Michael Lemmon; |
| 110 | The Polynomial Iteration Complexity for Variance Exploding Diffusion Models: Elucidating SDE and ODE Samplers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, only a few works analyze the iteration complexity of VE-based models, and most focus on SDE-based implementation with strong assumptions. In this work, we prove the first polynomial iteration complexity under the realistic bounded support assumption for these two implementations. |
Ruofeng Yang; Bo Jiang; Shuai Li; |
| 111 | Fairness Risks for Group-Conditionally Missing Demographics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The key challenge we will address is the group dependency of the unavailability, e.g., people of some age range may be more reluctant to reveal their age. |
Kaiqi Jiang; Wenzhe Fan; Mao Li; Xinhua Zhang; |
| 112 | Causal Representation Learning from General Environments Under Nonparametric Mixing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, we formalize a set of desiderata for causal representation learning that applies to a broader class of environments, referred to as general environments. |
Ignavier Ng; Shaoan Xie; Xinshuai Dong; Peter Spirtes; Kun Zhang; |
| 113 | Sampling from The Random Linear Model Via Stochastic Localization Up to The AMP Threshold Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on sampling from the posterior in the linear inverse problem, with an i.i.d. random design matrix, and show that the threshold for sampling coincides with that of posterior mean estimation. |
Han Cui; Zhiyuan Yu; Jingbo Liu; |
| 114 | Differentially Private Graph Data Release: Inefficiencies & Unfairness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper investigates the effects of DP on bias and fairness when releasing network edge weights. |
Ferdinando Fioretto; Diptangshu Sen; Juba Ziani; |
| 115 | Adversarial Vulnerabilities in Large Language Models for Time Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce a targeted adversarial attack framework for LLM-based time series forecasting. |
Fuqiang Liu; Sicong Jiang; Luis Miranda-Moreno; Seongjin Choi; Lijun Sun; |
| 116 | A Multi-Task Learning Approach to Linear Multivariate Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose to view multivariate forecasting as a multi-task learning problem, facilitating the analysis of forecasting by considering the angle between task gradients and their balance. |
Liran Nochumsohn; Hedi Zisling; Omri Azencot; |
| 117 | Batch, Match, and Patch: Low-rank Approximations for Score-based Variational Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Black-box variational inference (BBVI) scales poorly to high-dimensional problems when it is used to estimate a multivariate Gaussian approximation with a full covariance matrix. In this paper, we extend the \emph{batch-and-match} (BaM) framework for score-based BBVI to problems where it is prohibitively expensive to store such covariance matrices, let alone to estimate them. |
Chirag Modi; Diana Cai; Lawrence K. Saul; |
| 118 | Sampling From Multiscale Densities With Delayed Rejection Generalized Hamiltonian Monte Carlo Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, HMC still struggles to sample from hierarchical models that induce densities with multiscale geometry: a large step size is needed to efficiently explore low curvature regions while a small step size is needed to accurately explore high curvature regions. We introduce the delayed rejection generalized HMC (DR-G-HMC) sampler that overcomes this challenge by employing dynamic step size selection, inspired by differential equation solvers. |
Gilad Turok; Chirag Modi; Bob Carpenter; |
| 119 | On The Computational Tractability of The (Many) Shapley Values Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these studies primarily focused on a specific variant called Conditional SHAP, though many other variants exist and address different limitations. In this work, we analyze the complexity of computing a much broader range of such variants, including Conditional, Interventional, and Baseline SHAP, while exploring both local and global computations. |
Reda Marzouk; Shahaf Bassan; Guy Katz; De la Higuera; |
| 120 | Understanding Inverse Reinforcement Learning Under Overparameterization: Non-Asymptotic Analysis and Global Optimality Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Meanwhile, conventional IRL algorithms usually adopt a nested structure, leading to computational inefficiency, especially in high-dimensional settings. To address this problem, we propose the first two-timescale single-loop IRL algorithm under neural network parameterized reward and provide a non-asymptotic convergence analysis under overparameterization. |
Ruijia Zhang; Siliang Zeng; Chenliang Li; Alfredo Garcia; Mingyi Hong; |
| 121 | Functional Stochastic Gradient MCMC for Bayesian Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce novel functional MCMC schemes, including stochastic gradient versions, based on newly designed diffusion dynamics that can incorporate more informative functional priors. |
Mengjing Wu; Junyu Xuan; Jie Lu; |
| 122 | Multi-Agent Credit Assignment with Pretrained Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Moreover, the diversity-promoting nature of existing ASG methods can lead to the "over-representation" of subgoals, generating numerous spurious subgoals of limited relevance to the actual task reward and thus decreasing the sample efficiency of the algorithm. To address this problem, and inspired by disentangled representation learning, we propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA), which prompts pretrained language models with chain-of-thought to suggest potential goals, provide suitable goal decomposition and subgoal allocation, and perform self-reflection-based replanning. |
Wenhao Li; Dan Qiao; Baoxiang Wang; Xiangfeng Wang; Wei Yin; Hao Shen; Bo Jin; Hongyuan Zha; |
| 123 | Personalizing Low-Rank Bayesian Neural Networks Via Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Furthermore, different clients may exhibit heterogeneous uncertainty levels owing to varying local dataset sizes and distributions. To address these challenges, we propose LR-BPFL, a novel BPFL method that learns a global deterministic model along with personalized low-rank Bayesian corrections. |
Boning Zhang; Dongzhu Liu; Osvaldo Simeone; Guanchu Wang; Dimitrios Pezaros; Guangxu Zhu; |
| 124 | ROTI-GCV: Generalized Cross-Validation for Right-ROTationally Invariant Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a first step towards modeling structured sample dependence and heavy tails, we use right-rotationally invariant covariate distributions — a crucial concept from compressed sensing. In the proportional asymptotics regime where the number of features and samples grow comparably, which is known to better reflect the empirical behavior in moderately sized datasets, we introduce a new framework, ROTI-GCV, for reliably performing cross-validation under these challenging conditions. |
Kevin Luo; Yufan Li; Pragya Sur; |
| 125 | Distributional Counterfactual Explanations With Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes distributional counterfactual explanation (DCE), shifting focus to the distributional properties of observed and counterfactual data, thus providing broader insights. |
Lei You; Lele Cao; Mattias Nilsson; Bo Zhao; Lei Lei; |
| 126 | Causal Temporal Regime Structure Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: CASTOR optimizes the data log-likelihood using an expectation-maximization algorithm, alternating between assigning regime indices (expectation step) and inferring causal relationships in each regime (maximization step). |
Abdellah Rahmani; Pascal Frossard; |
| 127 | Credal Two-Sample Tests of Epistemic Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce credal two-sample testing, a new hypothesis testing framework for comparing credal sets—convex sets of probability measures where each element captures aleatoric uncertainty and the set itself represents epistemic uncertainty that arises from the modeller’s partial ignorance. |
Siu Lun Chau; Antonin Schrab; Arthur Gretton; Dino Sejdinovic; Krikamol Muandet; |
| 128 | On Preference-based Stochastic Linear Contextual Bandits with Knapsacks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose budget-aware optimistic and randomized exploration algorithms that achieve a regret of $O((\kappa+\frac{T\nu^*}{B})\sqrt{T}\log T)$ for any total budget $B=\Omega(\sqrt{T})$. |
Xin Liu; |
| 129 | Theory of Agreement-on-the-Line in Linear Models and Gaussian Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we discover that agreement-on-the-line emerges even in linear classifiers over Gaussian class conditional distributions. |
Christina Baek; Aditi Raghunathan; J Zico Kolter; |
| 130 | On The Power of Adaptive Weighted Aggregation in Heterogeneous Federated Learning and Beyond Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: These results reveal an inconsistency between FL theory and practice that is not fully explained. In this paper, we show that common heterogeneity measures contribute to this inconsistency based on rigorous convergence analysis. |
Dun Zeng; Zenglin Xu; SHIYU LIU; Yu Pan; Qifan Wang; Xiaoying Tang; |
| 131 | Stochastic Rounding for LLM Training: Theory and Practice Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we leverage stochastic rounding (SR) to address numerical errors of training with low-precision representation. |
Kaan Ozkara; Tao Yu; Youngsuk Park; |
| 132 | Data Reconstruction Attacks and Defenses: A Systematic Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose to view the problem as an inverse problem, enabling us to theoretically and systematically evaluate the data reconstruction attack. |
Sheng Liu; Zihan Wang; Yuxiao Chen; Qi Lei; |
| 133 | On Tradeoffs in Learning-Augmented Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, in some settings, this comes at the expense of smoothness. In this paper, we explore the tradeoffs between all the mentioned criteria and show how they can be balanced. |
Ziyad Benomar; Vianney Perchet; |
| 134 | Performative Prediction on Games and Mechanism Design Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This effect is ubiquitous in scenarios ranging from pandemic predictions to election polls, but existing work has ignored interdependencies among predicted agents. As a first step in this direction, we study a collective risk dilemma where agents dynamically decide whether to trust predictions based on past accuracy. |
António Góis; Mehrnaz Mofakhami; Fernando P. Santos; Gauthier Gidel; Simon Lacoste-Julien; |
| 135 | InfoNCE: Identifying The Gap Between Theory and Practice Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Hence, a more realistic assumption is that all latent factors change with a continuum of variability across all factors. We introduce AnInfoNCE, a generalization of InfoNCE that can provably uncover the latent factors in this anisotropic setting, broadly generalizing previous identifiability results in CL. |
Evgenia Rusak; Patrik Reizinger; Attila Juhos; Oliver Bringmann; Roland S. Zimmermann; Wieland Brendel; |
| 136 | HACSurv: A Hierarchical Copula-Based Approach for Survival Analysis with Dependent Competing Risks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce HACSurv, a survival analysis method that learns Hierarchical Archimedean Copulas structures and cause-specific survival functions from data with competing risks. |
Xin Liu; Weijia Zhang; Min-Ling Zhang; |
| 137 | On The Difficulty of Constructing A Robust and Publicly-Detectable Watermark Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, no existing scheme combines robustness, unforgeability, and public-detectability. In this work, we formally define such a scheme and establish its existence. |
Jaiden Fairoze; Guillermo Ortiz-Jimenez; Mel Vecerik; Somesh Jha; Sven Gowal; |
| 138 | Pure Exploration with Feedback Graphs Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We study the sample complexity of pure exploration in an online learning problem with a feedback graph. |
Alessio Russo; Yichen Song; Aldo Pacchiano; |
| 139 | Variational Inference on The Boolean Hypercube with The Quantum Entropy Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we derive variational inference upper bounds on the log-partition function of pairwise Markov random fields on the Boolean hypercube, based on quantum relaxations of the Kullback-Leibler divergence. |
Eliot Beyler; Francis Bach; |
| 140 | A Safe Exploration Approach to Constrained Markov Decision Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we employ the LB-SGD algorithm proposed in (Usmanova et al., 2024), which utilizes an interior-point approach based on the log-barrier function of the CMDP. |
Tingting Ni; Maryam Kamgarpour; |
| 141 | Differentially Private Continual Release of Histograms and Related Queries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our first contribution is an output-sensitive mechanism in the insertions-only model ($\chi = \{0,1\}$) for maintaining (i) the histogram or (ii) queries that do not require maintaining the entire histogram, such as the maximum or minimum column sum, the median, or any quantiles. |
Monika Henzinger; A. R. Sricharan; Teresa Anna Steiner; |
| 142 | Semiparametric Conformal Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we aim to construct the conformal prediction set accounting for the joint correlation structure of the vector-valued non-conformity scores. |
Ji Won Park; Kyunghyun Cho; |
| 143 | Powerful Batch Conformal Prediction for Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a uniformly more powerful solution, based on specific combinations of conformal $p$-values that exploit the Simes inequality. |
Ulysse Gazin; Ruth Heller; Etienne Roquain; Aldo Solari; |
| 144 | Keeping Up with Dynamic Attackers: Certifying Robustness to Adaptive Online Data Poisoning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Indeed, it has been shown in prior work that online dynamic adversaries can be significantly more powerful than static ones. We present a novel framework for computing certified bounds on the impact of dynamic poisoning, and use these certificates to design robust learning algorithms. |
Avinandan Bose; Laurent Lessard; Maryam Fazel; Krishnamurthy Dj Dvijotham; |
| 145 | AxlePro: Momentum-Accelerated Batched Training of Kernel Machines Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we derive a novel iterative algorithm for learning kernel machines. |
Yiming Zhang; Parthe Pandit; |
| 146 | DDEQs: Distributional Deep Equilibrium Models Through Wasserstein Gradient Flows Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we present Distributional Deep Equilibrium Models (DDEQs), extending DEQs to discrete measure inputs, such as sets or point clouds. |
Jonathan Geuter; Clément Bonet; Anna Korba; David Alvarez-Melis; |
| 147 | Proximal Sampler with Adaptive Step Size Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an \textbf{adaptive} proximal sampler that can utilize the local geometry to adjust step sizes and is guaranteed to converge to the target distribution. |
Bo Yuan; Jiaojiao Fan; Jiaming Liang; Yongxin Chen; |
| 148 | Elastic Representation: Mitigating Spurious Correlations for Group Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We hereby propose Elastic Representation (ElRep) to learn features by imposing Nuclear- and Frobenius-norm penalties on the representation from the last layer of a neural network. |
Tao Wen; Zihan Wang; Quan Zhang; Qi Lei; |
| 149 | Adversarial Training in High-Dimensional Regression: Generated Data and Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we perform a theoretical analysis of the asymptotic behavior of this method in high-dimensional regression problem when using two-layer neural networks. |
Yue Xing; |
| 150 | Robust Multi-fidelity Bayesian Optimization with Deep Kernel and Partition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate the dependency on cross-fidelity assumptions while maintaining the advantages of low-fidelity queries, we introduce a random sampling and partition-based MFBO framework with deep kernel learning. |
Fengxue Zhang; Thomas Desautels; Yuxin Chen; |
| 151 | An Iterative Algorithm for Rescaled Hyperbolic Functions Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose an iterative algorithm to solve a rescaled version of the slightly different formulation of the softmax regression problem that arises in attention mechanisms of large language models. |
Yeqi Gao; Zhao Song; Junze Yin; |
| 152 | Estimating The Spectral Moments of The Kernel Integral Operator from Finite Sample Matrices Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel algorithm that provides unbiased estimates of the spectral moments of the kernel integral operator in the limit of infinite inputs and features from finitely sampled measurement matrices. |
Chanwoo Chun; SueYeon Chung; Daniel Lee; |
| 153 | The Sample Complexity of Stackelberg Games Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we revise the sample complexity of learning an optimal strategy to commit to in SGs. |
Francesco Bacchiocchi; Matteo Bollini; Matteo Castiglioni; Alberto Marchesi; Nicola Gatti; |
| 154 | Accuracy on The Wrong Line: On The Pitfalls of Noisy Data for Out-of-distribution Generalisation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: But when does this useful relationship break down? In this work, we explore its robustness. |
Amartya Sanyal; Yaxi Hu; Yaodong Yu; Yian Ma; Yixin Wang; Bernhard Schölkopf; |
| 155 | A Theoretical Framework for Preventing Class Collapse in Supervised Contrastive Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present theoretically grounded guidelines for SupCL to prevent class collapse in learned representations. |
Chungpa Lee; Jeongheon Oh; Kibok Lee; Jy-yong Sohn; |
| 156 | A Multi-Armed Bandit Approach to Online Selection and Evaluation of Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an online evaluation and selection framework to find the generative model that maximizes a standard assessment score among a group of available models. |
Xiaoyan Hu; Ho-fung Leung; Farzan Farnia; |
| 157 | Almost Linear Time Differentially Private Release of Synthetic Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we give almost linear time and space algorithms to sample from an exponential mechanism with an $\ell_1$-score function defined over an exponentially large non-convex set. |
Zongrui Zou; Jingcheng Liu; Jalaj Upadhyay; |
| 158 | Variational Inference in Location-Scale Families: Exact Recovery of The Mean and Correlation Matrix Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In practice, $\mathcal Q$ is not rich enough to contain $p$, and the approximation is misspecified even when it is a unique global minimizer of $\text{KL}(q||p)$. In this paper, we analyze the robustness of VI to these misspecifications when $p$ exhibits certain symmetries and $\mathcal Q$ is a location-scale family that shares these symmetries. |
Charles Margossian; Lawrence K. Saul; |
| 159 | Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Second, online experience may not reflect the true demand due to the lost-sales phenomenon typical in IC, which makes the learning process more challenging. To address the above challenges, we propose a training framework that combines reinforcement learning with feedback graph (RLFG) and intrinsically motivated exploration (IME) to boost sample efficiency. |
Zifan LIU; Xinran Li; Shibo Chen; Gen Li; Jiashuo Jiang; Jun Zhang; |
| 160 | Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the “Node-Edge approach”, in which the model first computes an embedding vector for each variable-node, then uses these variable-wise representations to concurrently and independently predict for each directed causal-edge. |
Jiaru Zhang; Rui Ding; Qiang Fu; Huang Bojun; Zizhen Deng; Yang Hua; Haibing Guan; Shi Han; Dongmei Zhang; |
| 161 | Ant Colony Sampling with GFlowNets for Combinatorial Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present the Generative Flow Ant Colony Sampler (GFACS), a novel meta-heuristic method that hierarchically combines amortized inference and parallel stochastic search. |
Minsu Kim; Sanghyeok Choi; Hyeonah Kim; Jiwoo Son; Jinkyoo Park; Yoshua Bengio; |
| 162 | Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces Dynamic Importance Sampling for Constrained Decoding (DISC) with GPU-based Parallel Prefix-Verification (PPV), a novel algorithm that leverages dynamic importance sampling to achieve theoretically guaranteed asymptotic unbiasedness and overcomes the inefficiency of prefix-tree. |
Haotian Ye; Himanshu Jain; Chong You; Ananda Theertha Suresh; Haowei Lin; James Zou; Felix Yu; |
| 163 | Truncated Inverse-Lévy Measure Representation of The Beta Process Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce the truncated inverse-Lévy measure representation (TILe-Rep) that extends the decreasing atom weights representation of the beta process to general hyperparameters. |
Junyi Zhang; Angelos Dassios; Zhong Chong; Qiufei Yao; |
| 164 | M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization Via Multiplier Induced Loss Landscape Scheduling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A probabilistic graphical model is proposed that models the joint evolution of model parameters and multipliers, with a hypervolume-based likelihood promoting multi-objective descent in structural risk minimization. |
Xudong Sun; Nutan Chen; Alexej Gossmann; Yu Xing; Matteo Wohlrapp; Emilio Dorigatti; Carla Feistner; Felix Drost; Daniele Scarcella; Lisa Helen Beer; Carsten Marr; |
| 165 | Constrained Multi-objective Bayesian Optimization Through Optimistic Constraints Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a novel constrained multi-objective Bayesian optimization algorithm \textbf{COMBOO} that balances active learning of the level-set defined on multiple unknowns with multi-objective optimization within the feasible region. |
Diantong Li; Fengxue Zhang; Chong Liu; Yuxin Chen; |
| 166 | Robust Kernel Hypothesis Testing Under Data Corruption Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a general method for constructing robust permutation tests under data corruption. |
Antonin Schrab; Ilmun Kim; |
| 167 | Memory-Efficient Optimization with Factorized Hamiltonian Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these algorithms typically experience high memory overhead caused by the accumulation of optimization states, leading to a critical challenge in training large-scale network models. In this study, we introduce a novel adaptive optimizer, H-Fac, which incorporates a memory-efficient factorization approach to address this challenge. |
Son Nguyen; Lizhang Chen; Bo Liu; Qiang Liu; |
| 168 | Inverse Optimization with Prediction Market: A Characterization of Scoring Rules for Eliciting System States Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This is relevant when we want to identify the underlying state of a system or to design a system with desirable outcomes. Whereas inverse optimization has been investigated from the algorithmic perspective over the past two decades, its formulation is intimately tied to the principal’s subjective choice of a desirable state—indeed, this choice is crucial to making the inverse problem well-posed. |
Han Bao; Shinsaku Sakaue; |
| 169 | Diffusion Models Under Group Transformations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper focuses on structure-preserving diffusion models (SPDM), a specific subset of diffusion processes tailored for distributions with inherent structures, such as group symmetries. |
Haoye Lu; Spencer Szabados; Yaoliang Yu; |
| 170 | Evidential Uncertainty Probes for Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a plug-and-play framework for uncertainty quantification in GNNs that works with pre-trained models without the need for retraining. |
Linlin Yu; Kangshuo Li; Pritom Kumar Saha; Yifei Lou; Feng Chen; |
| 171 | Reward Maximization for Pure Exploration: Minimax Optimal Good Arm Identification for Nonparametric Multi-Armed Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The former focuses on exploiting arms with the highest means, while the latter may require constant exploration across all arms. In this work, we focus on good arm identification (GAI), a pure exploration objective that aims to label arms with means above a threshold as quickly as possible. |
Brian M Cho; Dominik Meier; Kyra Gan; Nathan Kallus; |
| 172 | Variation Due to Regularization Tractably Recovers Bayesian Deep Learning Uncertainty Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing methods quantify uncertainty at the last layer or other approximations of the network which may miss some sources of uncertainty in the model. To address this gap, we propose an uncertainty quantification method for large networks based on variation due to regularization. |
James McInerney; Nathan Kallus; |
| 173 | Anytime-Valid A/B Testing of Counting Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Motivated by monitoring the arrival of incoming adverse events such as customer support calls or crash events from users exposed to an experimental product change, we consider sequential hypothesis testing of continuous-time counting processes. |
Michael Lindon; Nathan Kallus; |
| 174 | Continuous Structure Constraint Integration for Robust Causal Discovery Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such rigidity can lead to significant inaccuracies, especially when the priors are flawed. In response to these challenges, this work introduces the Edge Constraint Adaptive (ECA) method, a novel approach that softly represents the presence of edges, allowing for a differentiable representation of prior constraint loss. |
Lyuzhou Chen; Taiyu Ban; Derui Lyu; Yijia Sun; Kangtao Hu; Xiangyu Wang; Huanhuan Chen; |
| 175 | SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With An Unknown Graph Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on identifying causal effects between target variables in a computationally and statistically efficient way. |
Mátyás Schubert; Tom Claassen; Sara Magliacane; |
| 176 | Global Group Fairness in Federated Learning Via Function Tracking Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Ensuring global fairness in distributed training presents unique challenges, as fairness regularizers typically involve probability metrics between distributions across all clients and are not naturally separable by client. To address this, we introduce a function-tracking scheme for the global fairness regularizer based on a Maximum Mean Discrepancy (MMD), which incurs a small communication overhead. |
Yves Rychener; Daniel Kuhn; Yifan Hu; |
| 177 | Linear Submodular Maximization with Bandit Feedback Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider the problem of linear submodular maximization under bandit feedback in the pure-exploration setting, where the submodular objective function $f:2^U \rightarrow \mathbb{R}_{\ge 0}$ is defined as $f=\sum_{i=1}^d w_i F_{i}$. |
Wenjing Chen; Victoria G. Crawford; |
| 178 | From Gradient Clipping to Normalization for Heavy Tailed SGD Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Lastly, even with this knowledge, current sample complexity upper bounds for the method are sub-optimal in nearly all parameters. To address these issues and motivated by practical observations, we make the connection of gradient clipping to its close relative — Normalized SGD (NSGD) — and study its convergence properties. |
Florian Hübler; Ilyas Fatkhullin; Niao He; |
| 179 | Clustered Invariant Risk Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this scenario, where a given set of environments exhibits an unknown clustered structure, our objective is to identify a single invariant feature extractor and per-cluster regressors (or classifiers) built on top of the feature extractor. To achieve this, we propose a new framework called Clustered IRM for simultaneously identifying the cluster structure and the invariant features. |
Tomoya Murata; Atsushi Nitanda; Taiji Suzuki; |
| 180 | Analyzing The Role of Permutation Invariance in Linear Mode Connectivity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This phenomenon has sparked significant attention due to both its theoretical interest and practical relevance in applications such as model merging. In this paper, we provide a fine-grained analysis of this phenomenon for two-layer ReLU networks under a teacher-student setup. |
Keyao Zhan; Puheng Li; Lei Wu; |
| 181 | Rethinking Neural-based Matrix Inversion: Why Can’t, and Where Can Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents a theoretical analysis demonstrating the fundamental limitations of neural networks in developing a generalized matrix inversion model. |
Yuliang Ji; Jian Wu; Yuanzhe Xi; |
| 182 | To Give or Not to Give? The Impacts of Strategically Withheld Recourse Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We show that this tension leads rational utility-maximizing systems to frequently withhold recourse, resulting in decreased population utility, particularly impacting sensitive groups. To mitigate these effects, we explore the role of recourse subsidies, finding them effective in increasing the provision of recourse actions by rational systems, as well as lowering the potential social cost and mitigating unfairness caused by recourse withholding. |
Yatong Chen; Andrew Estornell; Yevgeniy Vorobeychik; Yang Liu; |
| 183 | Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we provide the first non-asymptotic convergence analysis for a general class of probability flow ODE samplers in 2-Wasserstein distance, assuming accurate score estimates and smooth log-concave data distributions. |
Xuefeng Gao; Lingjiong Zhu; |
| 184 | Differentiable Causal Structure Learning with Identifiability By NOTIME Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address the issue, we propose NOTIME (\emph{Non-combinatorial Optimization of Trace exponential and Independence MEasures}), the first differentiable DAG learning algorithm with \emph{provable} identifiability guarantees under the LiNGAM by building on a measure of (joint) independence. |
Jeroen Berrevoets; Jakob Raymaekers; Mihaela van der Schaar; Tim Verdonck; Ruicong Yao; |
| 185 | Koopman-Equivariant Gaussian Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a family of Gaussian processes (GP) for dynamical systems with linear time-invariant responses, which are nonlinear only in initial conditions. |
Petar Bevanda; Max Beier; Alexandre Capone; Stefan Georg Sosnowski; Sandra Hirche; Armin Lederer; |
| 186 | Offline RL Via Feature-Occupancy Gradient Ascent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Starting from the classic linear-program formulation of the optimal control problem in MDPs, we develop a new algorithm that performs a form of gradient ascent in the space of feature occupancies, defined as the expected feature vectors that can potentially be generated by executing policies in the environment. |
Gergely Neu; Nneka Okolo; |
| 187 | Gaussian Smoothing in Saliency Maps: The Stability-Fidelity Trade-Off in Neural Network Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the role of Gaussian smoothing in the well-known Smooth-Grad algorithm in the stability of the gradient-based maps to the randomness of training samples. |
Zhuorui Ye; Farzan Farnia; |
| 188 | Fully Dynamic Adversarially Robust Correlation Clustering in Polylogarithmic Update Time Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the dynamic correlation clustering problem with \emph{adaptive} edge label flips. |
Vladimir Braverman; Prathamesh Dharangutte; Shreyas Pai; Vihan Shah; Chen Wang; |
| 189 | On Distributional Discrepancy for Experimental Design with General Assignment Probabilities Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We investigate experimental design for randomized controlled trials (RCTs) with both equal and unequal treatment-control assignment probabilities. |
Anup Rao; Peng Zhang; |
| 190 | Quantile Additive Trend Filtering Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a practical algorithm for implementing quantile additive trend filtering using dimension-wise backfitting. |
Zhi Zhang; Kyle Ritscher; OSCAR HERNAN MADRID PADILLA; |
| 191 | Statistical Guarantees for Lifelong Reinforcement Learning Using PAC-Bayes Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose EPIC (Empirical PAC-Bayes that Improves Continuously), a novel algorithm designed for lifelong RL using PAC-Bayes theory. |
Zhi Zhang; Chris Chow; Yasi Zhang; Yanchao Sun; Haochen Zhang; Eric Hanchen Jiang; Han Liu; Furong Huang; Yuchen Cui; OSCAR HERNAN MADRID PADILLA; |
| 192 | Multi-marginal Schrödinger Bridges with Iterative Reference Refinement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: So we propose a new method that (1) learns the unobserved trajectories from sample snapshots across multiple time points and (2) requires specification only of a class of reference dynamics, not a single fixed one. |
Yunyi Shen; Renato Berlinghieri; Tamara Broderick; |
| 193 | Learning Laplacian Positional Encodings for Heterophilous Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we theoretically demonstrate that current graph positional encodings (PEs) are not beneficial and could potentially hurt performance in tasks involving heterophilous graphs, where nodes that are close tend to have different labels. |
Michael Ito; Jiong Zhu; Dexiong Chen; Danai Koutra; Jenna Wiens; |
| 194 | Understanding GNNs and Homophily in Dynamic Node Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Focusing on graph convolutional networks (GCNs), we demonstrate theoretically that in dynamic settings, current GCN discriminative performance is characterized by the probability that a node’s future label is the same as its neighbors’ current labels. Based on this insight, we propose dynamic homophily, a new measure of homophily that applies in the dynamic setting. |
Michael Ito; Danai Koutra; Jenna Wiens; |
| 195 | Characterizing The Accuracy-Communication-Privacy Trade-off in Distributed Stochastic Convex Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The objective is to design an algorithm to minimize a convex population loss using a collaborative effort across $M$ clients, while ensuring the privacy of the local datasets. In this work, we investigate the accuracy-communication-privacy trade-off for this problem. |
Sudeep Salgia; Nikola Pavlovic; Yuejie Chi; Qing Zhao; |
| 196 | A Shared Low-Rank Adaptation Approach to Personalized RLHF Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This assumption overlooks the inherent diversity and heterogeneity across individuals, limiting the adaptability of RLHF to personalized scenarios and risking misalignments that can diminish user satisfaction and trust in AI systems. In this paper, we address these challenges by introducing Low-Rank Adaptation (LoRA) into the personalized RLHF framework. |
Renpu Liu; Peng Wang; Donghao Li; Cong Shen; Jing Yang; |
| 197 | Computing High-dimensional Optimal Transport By Flow Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our method learns the dynamic OT by finding an invertible flow that minimizes the transport cost. |
Chen Xu; Xiuyuan Cheng; Yao Xie; |
| 198 | Federated Communication-Efficient Multi-Objective Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose FedCMOO, a novel communication-efficient federated multi-objective optimization (FMOO) algorithm that improves the error convergence performance of the model compared to existing approaches. |
Baris Askin; Pranay Sharma; Gauri Joshi; Carlee Joe-Wong; |
| 199 | Flexible and Efficient Probabilistic PDE Solvers Through Gaussian Markov Random Fields Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: It has been shown that such priors are solutions to stochastic PDEs (SPDEs) which when discretized allow for highly efficient GP regression through sparse linear algebra. In this work, we show how to leverage this prior class to make probabilistic PDE solvers practical, even for large-scale nonlinear PDEs, through greatly accelerated inference mechanisms. |
Tim Weiland; Marvin Pförtner; Philipp Hennig; |
| 200 | Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose a framework based on symmetry-based structured matrices to build approximately equivariant NNs with fewer parameters. |
Ashwin Samudre; Mircea Petrache; Brian Nord; Shubhendu Trivedi; |
| 201 | Type Information-Assisted Self-Supervised Knowledge Graph Denoising Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose to exploit the consistency between entity and relation type information for noise detection, resulting in a novel self-supervised knowledge graph denoising method that avoids those problems. |
Jiaqi Sun; Yujia Zheng; Xinshuai Dong; Haoyue Dai; Kun Zhang; |
| 202 | Statistical Guarantees for Unpaired Image-to-Image Cross-Domain Analysis Using GANs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a framework for analyzing the generalization error in cross-domain deep generative models. |
Saptarshi Chakraborty; Peter Bartlett; |
| 203 | The Local Learning Coefficient: A Singularity-Aware Complexity Measure Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Recognizing the limitations of traditional complexity measures, the LLC leverages Singular Learning Theory (SLT), which has long recognized the significance of singularities in the loss landscape geometry. This paper provides an extensive exploration of the LLC’s theoretical underpinnings, offering both a clear definition and intuitive insights into its application. |
Edmund Lau; Zach Furman; George Wang; Daniel Murfet; Susan Wei; |
| 204 | Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: It remains unclear how to effectively use such data in the target task to provably enhance learning and sample efficiency. To address this, we propose a hybrid transfer RL (HTRL) setting, where an agent learns in a target environment while accessing offline data from a source environment with shifted dynamics. |
Chengrui Qu; Laixi Shi; Kishan Panaganti; Pengcheng You; Adam Wierman; |
| 205 | Locally Optimal Descent for Dynamic Stepsize Scheduling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel dynamic learning-rate scheduling scheme grounded in theory with the goal of simplifying the manual and time-consuming tuning of schedules in practice. |
Gilad Yehudai; Alon Cohen; Amit Daniely; Yoel Drori; Tomer Koren; Mariano Schain; |
| 206 | All Models Are Wrong, Some Are Useful: Model Selection with Limited Labels Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce MODEL SELECTOR, a framework for label-efficient selection of pretrained classifiers. |
Patrik Okanovic; Andreas Kirsch; Jannes Kasper; Torsten Hoefler; Andreas Krause; Nezihe Merve Gürel; |
| 207 | Behavior-Inspired Neural Networks for Relational Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce a level of abstraction between the observable behavior of agents and the latent categories that determine their behavior. |
Yulong Yang; Bowen Feng; Keqin Wang; Naomi Leonard; Adji Bousso Dieng; Christine Allen-Blanchette; |
| 208 | Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional approaches often assume uniformly bounded stochastic gradients, which may not hold in practice. To address this issue, we propose differentially \textbf{Pri}vate \textbf{S}tochastic recursive \textbf{M}omentum with gr\textbf{A}dient clipping (PriSMA) that judiciously integrates clipping and momentum to enhance utility while guaranteeing privacy. |
Yue Huang; Jiaojiao Zhang; Qing Ling; |
| 209 | A Likelihood Based Approach for Watermark Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a statistical detection approach that improves the power of watermark detection, particularly in shorter texts. |
Xingchi Li; Guanxun Li; Xianyang Zhang; |
| 210 | Transfer Learning for High-dimensional Reduced Rank Time Series Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a new transfer learning algorithm tailored for estimating high-dimensional VAR models characterized by low-rank and sparse structures. |
Mingliang Ma; Abolfazl Safikhani; |
| 211 | FreqMoE: Enhancing Time Series Forecasting Through Frequency Decomposition Mixture of Experts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, these methods often use filtering techniques to remove certain frequency signals as noise, which may unintentionally discard important information and reduce prediction accuracy. To address this, we propose the Frequency Decomposition Mixture-of-Experts (FreqMoE) model, which dynamically decomposes time series data into frequency bands, each processed by a specialized expert. |
Ziqi Liu; |
| 212 | Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We tackle average-reward infinite-horizon POMDPs with an unknown transition model but a known observation model, a setting that has been previously addressed in two limiting ways: (i) frequentist methods relying on suboptimal stochastic policies having a minimum probability of choosing each action, and (ii) Bayesian approaches employing the optimal policy class but requiring strong assumptions about the consistency of employed estimators. Our work removes these limitations by proving convenient estimation guarantees for the transition model and introducing an optimistic algorithm that leverages the optimal class of deterministic belief-based policies. |
Alessio Russo; Alberto Maria Metelli; Marcello Restelli; |
| 213 | ScoreFusion: Fusing Score-based Generative Models Via Kullback–Leibler Barycenters Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce ScoreFusion, a theoretically grounded method for fusing multiple pre-trained diffusion models that are assumed to generate from auxiliary populations. |
Hao Liu; Tony Junze Ye; Jose Blanchet; Nian Si; |
| 214 | Approximate Equivariance in Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we develop approximately equivariant algorithms in reinforcement learning (RL). |
Jung Yeon Park; Sujay Bhatt; Sihan Zeng; Lawson L.S. Wong; Alec Koppel; Sumitra Ganesh; Robin Walters; |
| 215 | Robust Offline Policy Learning with Observational Data from Multiple Sources Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the problem of using observational bandit feedback data from multiple heterogeneous data sources to learn a personalized decision policy that robustly generalizes across diverse target settings. To achieve this, we propose a minimax regret optimization objective to ensure uniformly low regret under general mixtures of the source distributions. |
Aldo Gael Carranza; Susan Athey; |
| 216 | Invertible Fourier Neural Operators for Tackling Both Forward and Inverse Problems Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose an invertible Fourier Neural Operator (iFNO) for jointly tackling the forward and inverse problems. |
Da Long; Zhitong Xu; Qiwei Yuan; Yin Yang; Shandian Zhe; |
| 217 | Domain Adaptation and Entanglement: An Optimal Transport Perspective Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To this end, we derive new bounds based on optimal transport that analyze the UDA problem. |
Okan Koc; Alexander Soen; Chao-Kai Chiang; Masashi Sugiyama; |
| 218 | On The Geometry and Optimization of Polynomial Convolutional Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: By leveraging on tools from algebraic geometry, we explore the geometric properties of the image in function space of this map – typically referred to as neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. |
Vahid Shahverdi; Giovanni Luca Marchetti; Kathlén Kohn; |
| 219 | A Shapley-value Guided Rationale Editor for Rationale Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Accordingly, we propose Shapley-value Guided Rationale Editor (SHARE), an unsupervised approach that refines editable rationales while predicting task outcomes. |
Zixin Kuang; Meng-Fen Chiang; Wang-Chien Lee; |
| 220 | Separation-Based Distance Measures for Causal Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose additional measures of distance that capture the difference in separations of two causal graphs which link-based distances are not fit to assess. |
Jonas Wahl; Jakob Runge; |
| 221 | LITE: Efficiently Estimating Gaussian Probability of Maximality Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce LITE, the first approach for estimating Gaussian PoM with \emph{almost-linear time and memory} complexity. |
Nicolas Menet; Jonas Hübotter; Parnian Kassraie; Andreas Krause; |
| 222 | Training Neural Samplers with Reverse Diffusive KL Divergence Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the mode-seeking behavior of reverse KL hinders effective approximation of multi-modal target distributions. To address this, we propose to minimize the reverse KL along diffusion trajectories of both model and target densities. |
Jiajun He; Wenlin Chen; Mingtian Zhang; David Barber; José Miguel Hernández-Lobato; |
| 223 | Learning Signals Defined on Graphs with Optimal Transport and Gaussian Process Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an innovative strategy for Gaussian process regression where inputs are large and sparse graphs with continuous node attributes and outputs are signals defined on the nodes of the associated inputs. |
Raphael Carpintero Perez; Sébastien Da Veiga; Josselin Garnier; Brian Staber; |
| 224 | Heterogeneous Graph Structure Learning Through The Lens of Data-generating Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While significant advancements have been made in learning the structure of homogeneous graphs, many real-world graphs exhibit heterogeneous patterns where nodes and edges have multiple types. This paper fills this gap by introducing the first approach for heterogeneous graph structure learning (HGSL). |
Keyue Jiang; Bohan Tang; Xiaowen Dong; Laura Toni; |
| 225 | Deep Generative Quantile Bayes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop a multivariate posterior sampling procedure through deep generative quantile learning. |
Jungeum Kim; Percy S. Zhai; Veronika Rockova; |
| 226 | Scalable Implicit Graphon Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose Scalable Implicit Graphon Learning (SIGL), a scalable method that combines implicit neural representations (INRs) and graph neural networks (GNNs) to estimate a graphon from observed graphs. |
Ali Azizpour; Nicolas Zilberstein; Santiago Segarra; |
| 227 | Approximate Information Maximization for Bandit Games Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Notable examples include modeling decision-making within the brain using the free-energy principle, optimizing the accuracy-complexity trade-off when accessing hidden variables with the information bottleneck principle (Tishby et al. 2000), and navigation in random environments using information maximization (Vergassola et al. 2007). Building on this principle, we propose a new class of bandit algorithms that maximize an approximation to the information of a key variable within the system. |
Alex Barbier Chebbah; Christian L. Vestergaard; Jean-Baptiste Masson; Etienne Boursier; |
| 228 | Loss Gradient Gaussian Width Based Generalization and Optimization Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present generalization and optimization guarantees in terms of the complexity of the gradients, as measured by the Loss Gradient Gaussian Width (LGGW). |
Arindam Banerjee; Qiaobo Li; Yingxue Zhou; |
| 229 | Statistical Learning of Distributionally Robust Stochastic Control in Continuous State Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a distributionally robust stochastic control paradigm that accommodates possibly adaptive adversarial perturbation to the noise distribution within a prescribed ambiguity set. |
Shengbo Wang; Nian Si; Jose Blanchet; Zhengyuan Zhou; |
| 230 | UNHaP: Unmixing Noise from Hawkes Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work introduces UNHaP (Unmix Noise from Hawkes Processes), a novel approach addressing the joint learning of temporal structures in events and the removal of spurious detections. |
Virginie Loison; Guillaume Staerman; Thomas Moreau; |
| 231 | Infinite-Horizon Reinforcement Learning with Multinomial Logit Function Approximation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop a provably efficient discounted value iteration-based algorithm that works for both infinite-horizon average-reward and discounted-reward settings. |
Jaehyun Park; Junyeop Kwon; Dabeen Lee; |
| 232 | Consistent Amortized Clustering Via Generative Flow Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce GFNCP, a novel framework for amortized clustering. |
Irit Chelly; Roy Uziel; Oren Freifeld; Ari Pakman; |
| 233 | When The Universe Is Too Big: Bounding Consideration Probabilities for Plackett-Luce Rankings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We apply the consider-then-choose framework to top-$k$ rankings, where we assume rankings are constructed according to a Plackett-Luce model after sampling a consideration set. |
Ben Aoki-Sherwood; Catherine Bregou; David Liben-Nowell; Kiran Tomlinson; Thomas Zeng; |
| 234 | Independent Learning in Performative Markov Potential Games Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we study multi-agent PRL by incorporating performative effects into Markov Potential Games (MPGs). |
Rilind Sahitaj; Paulius Sasnauskas; Yigit Yalin; Debmalya Mandal; Goran Radanovic; |
| 235 | Sparse Causal Effect Estimation Using Two-Sample Summary Statistics in The Presence of Unmeasured Confounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We provide two methods, based on L0- and L1-penalization, respectively. |
Shimeng Huang; Niklas Pfister; Jack Bowden; |
| 236 | Analysis of Two-Stage Rollout Designs with Clustering for Causal Inference Under Network Interference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, the required extrapolation can lead to prohibitively high variance. To address this, we propose a two-stage experiment that selects a sub-population in the first stage and restricts treatment rollout to this sub-population in the second stage. |
Mayleen Cortez-Rodriguez; Matthew Eichhorn; Christina Yu; |
| 237 | Stochastic Compositional Minimax Optimization with Provable Convergence Guarantees Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a formal definition of the stochastic compositional minimax problem, which involves optimizing a minimax loss with a compositional structure either in primal, dual, or both primal and dual variables. |
Yuyang Deng; Fuli Qiao; Mehrdad Mahdavi; |
| 238 | General Staircase Mechanisms for Optimal Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We derive the optimal differentially private additive noise mechanism for queries in $\mathbb{R}^d$ when sensitivity and error are defined by an arbitrary norm $||\cdot||_K$. |
Alex Kulesza; Ananda Theertha Suresh; Yuyan Wang; |
| 239 | Enhanced Adaptive Gradient Algorithms for Nonconvex-PL Minimax Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the paper, we study a class of nonconvex-nonconcave minimax optimization with nonsmooth regularization, where the objective function is possibly nonconvex on primal variable $x$, and it is nonconcave and satisfies the Polyak-Lojasiewicz (PL) condition on dual variable $y$. |
Feihu Huang; Chunyu Xuan; Xinrui Wang; Siqi Zhang; Songcan Chen; |
| 240 | From Deep Additive Kernel Learning to Last-Layer Bayesian Neural Networks Via Induced Prior Approximation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: From the computational perspective, however, DKL becomes challenging when the input dimension of the GP layer is high. To address this challenge, we propose the Deep Additive Kernel (DAK) model, which incorporates i) an additive structure for the last-layer GP; and ii) induced prior approximation for each GP unit. |
Wenyuan Zhao; Haoyuan Chen; Tie Liu; Rui Tuo; Chao Tian; |
| 241 | On The Inherent Privacy of Zeroth-Order Projected Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, since the search direction in the zeroth-order methods is inherently random, researchers including Tang et al. (2024) and Zhang et al. (2024a) have raised an important question: is the inherent noise in zeroth-order estimators sufficient to ensure the overall differential privacy of the algorithm? This work settles this question for a class of oracle-based optimization algorithms where the oracle returns zeroth-order gradient estimates. |
Devansh Gupta; Meisam Razaviyayn; Vatsal Sharan; |
| 242 | Models That Are Interpretable But Not Transparent Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This work provides an approach, FaithfulDefense, that creates model explanations for logical models that are completely faithful, yet reveal as little as possible about the decision boundary. |
Chudi Zhong; Panyu Chen; Cynthia Rudin; |
| 243 | Kernel Single Proxy Control for Deterministic Confounding Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This generalizes existing work on causal methods with a single proxy variable to the continuous treatment setting. We propose two kernel-based methods for this setting: the first based on the two-stage regression approach, and the second based on a maximum moment restriction approach. |
Liyuan Xu; Arthur Gretton; |
| 244 | Unbiased and Sign Compression in Distributed Learning: Comparing Noise Resilience Via SDEs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Distributed methods are essential for handling machine learning pipelines comprising large-scale models and datasets. However, their benefits often come at the cost of increased … |
Enea Monzio Compagnoni; Rustem Islamov; Frank Norbert Proske; Aurelien Lucchi; |
| 245 | Infinite-dimensional Diffusion Bridge Simulation Via Operator Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a method that merges score-matching techniques with operator learning, enabling a direct approach to learn the infinite-dimensional bridge and achieving a discretization equivariant bridge simulation. |
Gefan Yang; Elizabeth Louise Baker; Michael Lind Severinsen; Christy Anna Hipsley; Stefan Sommer; |
| 246 | Bridging Domains with Approximately Shared Features Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Inspired by our theory, we introduce ProjectionNet, a practical method to distinguish content features from environmental features via \textit{explicit feature space control}, further consolidating our theoretical findings. |
Ziliang Samuel Zhong; Xiang Pan; Qi Lei; |
| 247 | Changepoint Estimation in Sparse Dynamic Stochastic Block Models Under Near-Optimal Signal Strength Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the offline changepoint estimation problem in the context of multilayer stochastic block models. |
Shirshendu Chatterjee; Soumendu Sundar Mukherjee; Tamojit Sadhukhan; |
| 248 | Geometric Collaborative Filtering with Convergence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In summary, our work proposes a theoretically sound method which paves a way to better understand generalization of collaborative filtering at large. |
Hisham Husain; Julien Monteil; |
| 249 | Sample Compression Unleashed: New Generalization Bounds for Real Valued Losses Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a general framework for deriving new sample compression bounds that hold for real-valued unbounded losses. |
Mathieu Bazinet; Valentina Zantedeschi; Pascal Germain; |
| 250 | Bridging The Theoretical Gap in Randomized Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, a persistent gap remains between theoretical certified robustness and empirical robustness accuracy. This paper introduces a new framework that bridges this gap by leveraging Lipschitz continuity for certification and proposing a novel, less conservative method for computing confidence intervals in randomized smoothing. |
Blaise Delattre; Paul Caillon; Quentin Barthélemy; Erwan Fagnou; Alexandre Allauzen; |
| 251 | Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm ($\texttt{Fed-UCBVI}$), a novel extension of the $\texttt{UCBVI}$ algorithm (Azar et al., 2017) tailored for the federated learning framework. |
Safwan Labbi; Daniil Tiapkin; Lorenzo Mancini; Paul Mangold; Eric Moulines; |
| 252 | Bilevel Reinforcement Learning Via The Development of Hyper-gradient Without Lower-Level Convexity Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The inherent non-convexity of the lower-level RL problem is, however, an impediment to developing bilevel optimization methods. |
Yan Yang; Bin Gao; Ya-xiang Yuan; |
| 253 | Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To further balance the exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that utilizes $\sigma^2_t$, i.e., an upper bound of the reward noise variance at round $t$, to enhance the uncertainty quantification quality of the UCB, resulting in a regret performance improvement. |
Ha Manh Bui; Enrique Mallada; Anqi Liu; |
| 254 | Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Motivated by engineering applications such as resource allocation in networks and inventory systems, we consider average-reward Reinforcement Learning with unbounded state space and reward function. |
Shaan Ul Haque; Siva Theja Maguluri; |
| 255 | FedBaF: Federated Learning Aggregation Biased By A Foundation Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce Federated Learning Aggregation Biased by a Foundation Model (FedBaF), a novel method for dynamically integrating pre-trained foundation model weights during the FL aggregation phase. |
Jong-Ik Park; Srinivasa Pranav; Jose M F Moura; Carlee Joe-Wong; |
| 256 | A Differential Inclusion Approach for Learning Heterogeneous Sparsity in Neuroimaging Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new method based on differential inclusion, which generates a sparse regularized solution path on multiple parameters that are enforced with heterogeneous sparsity to capture lesion features and the procedural bias separately. |
Wenjing Han; Yueming Wu; Xinwei Sun; Lingjing Hu; Yizhou Wang; |
| 257 | Calm Composite Losses: Being Improper Yet Proper Composite Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we identify several losses as improper, calling into question the validity of class probability estimates derived from their simplex-projected outputs. |
Han Bao; Nontawat Charoenphakdee; |
| 258 | Your Copula Is A Classifier in Disguise: Classification-based Copula Density Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose reinterpreting copula density estimation as a discriminative task. |
David Huk; Mark Steel; Ritabrata Dutta; |
| 259 | HAR-former: Hybrid Transformer with An Adaptive Time-Frequency Representation Matrix for Long-Term Series Forecasting Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional methods, which often depend on high-dimensional embeddings, can obscure multivariate relationships and struggle with performance limitations, especially when handling complex temporal patterns. To address these issues, we propose HAR-former, a Hybrid Transformer with an Adaptive Time-Frequency Representation Matrix, which combines the strengths of Multi-Layer Perceptrons (MLPs) and Transformers to process trend and seasonal components, respectively. |
Kenghao Zheng; Zi Long; Shuxin Wang; |
| 260 | Synthetic Potential Outcomes and Causal Mixture Identifiability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes grouping with respect to the causal response of an intervention or perturbation on the system. |
Bijan Mazaheri; Chandler Squires; Caroline Uhler; |
| 261 | Consistent Validation for Predictive Methods in Spatial Settings Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This mismatch is often not an instance of covariate shift (as commonly formalized) because the validation and test locations are fixed (e.g., on a grid or at select points) rather than i.i.d. from two distributions. In the present work, we formalize a check on validation methods: that they become arbitrarily accurate as validation data becomes arbitrarily dense. |
David R. Burt; Yunyi Shen; Tamara Broderick; |
| 262 | Differentially Private Kernelized Contextual Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel algorithm that improves upon the state of the art and achieves an error rate of $\mathcal{O}\left(\sqrt{\dfrac{\gamma_T}{T}} + \dfrac{\gamma_T}{T \varepsilon}\right)$ after $T$ queries for a large class of kernel families, where $\gamma_T$ represents the effective dimensionality of the kernel and $\varepsilon > 0$ is the privacy parameter. |
Nikola Pavlovic; Sudeep Salgia; Qing Zhao; |
| 263 | Order-Optimal Regret in Distributed Kernel Bandits Using Uniform Sampling with Shared Randomness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop the first algorithm that achieves the optimal regret order (as defined by centralized learning) with a communication cost that is sublinear in both $N$ and $T$. |
Nikola Pavlovic; Sudeep Salgia; Qing Zhao; |
| 264 | Subspace Recovery in Winsorized PCA: Insights Into Accuracy and Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we explore the theoretical properties of subspace recovery using Winsorized Principal Component Analysis (WPCA), utilizing a common data transformation technique that caps extreme values to mitigate the impact of outliers. |
Sangil Han; Kyoowon Kim; Sungkyu Jung; |
| 265 | Tight Analysis of Difference-of-Convex Algorithm (DCA) Improves Convergence Rates for Proximal Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We examine the precise behavior of a single iteration of the difference-of-convex algorithm (DCA), providing a tight characterization of the objective function decrease, distinguishing between six distinct parameter regimes. |
Teodor Rotaru; Panagiotis Patrinos; François Glineur; |
| 266 | Time-varying Gaussian Process Bandits with Unknown Prior Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: If the problem is stationary, one could rely on the Regret Balancing scheme to conduct the optimisation, but in the case of time-varying problems, such a scheme cannot be used. To address this gap in existing research, we propose a novel algorithm, PE-GP-UCB, which is capable of solving time-varying Bayesian optimisation problems even without the exact knowledge of the function’s prior. |
Juliusz Ziomek; Masaki Adachi; Michael A Osborne; |
| 267 | Causal Discovery on Dependent Binary Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a decorrelation-based approach for causal graph learning on dependent binary data, where the local conditional distribution is defined by a latent utility model with dependent errors across units. |
Alex Chen; Qing Zhou; |
| 268 | Learning High-dimensional Gaussians from Censored Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: We provide efficient algorithms for the problem of distribution learning from high-dimensional Gaussian data where in each sample, some of the variable values are missing. We … |
Arnab Bhattacharyya; Constantinos Costis Daskalakis; Themis Gouleakis; Yuhao Wang; |
| 269 | Safe Exploration in Reproducing Kernel Hilbert Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. |
Abdullah Tokmak; Kiran G. Krishnan; Thomas B. Schön; Dominik Baumann; |
| 270 | A Unified Evaluation Framework for Epistemic Predictions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel unified evaluation framework for uncertainty-aware classifiers, applicable to a wide range of model classes, which allows users to tailor the trade-off between accuracy and precision of predictions via a suitably designed performance metric. |
Shireen Kudukkil Manchingal; Muhammad Mubashar; Kaizheng Wang; Fabio Cuzzolin; |
| 271 | Scalable Spectral Representations for Multiagent Reinforcement Learning in Network MDPs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, utilizing the exponential decay property of network dynamics, we first derive scalable spectral local representations for multiagent reinforcement learning in network MDPs, which induces a network linear subspace for the local $Q$-function of each agent. Building on these local spectral representations, we design a scalable algorithmic framework for multiagent reinforcement learning in continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm. |
Zhaolin Ren; Runyu Zhang; Bo Dai; Na Li; |
| 272 | Microfoundation Inference for Strategic Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we model agents’ responses as a cost-adjusted utility maximization problem and propose estimates for said cost. |
Daniele Bracale; Subha Maity; Felipe Maia Polo; Seamus Somerstep; Moulinath Banerjee; Yuekai Sun; |
| 273 | Hyperbolic Prototypical Entailment Cones for Image Classification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we focus on hyperbolic manifolds and introduce a novel framework, Hyperbolic Prototypical Entailment Cones (HPEC). |
Samuele Fonio; Roberto Esposito; Marco Aldinucci; |
| 274 | HAVER: Instance-Dependent Error Bounds for Maximum Mean Estimation and Applications to Q-Learning and Monte Carlo Tree Search Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While there have been a few proposed algorithms, their performance analyses have been limited to their biases rather than a precise error metric. In this paper, we propose a novel algorithm called HAVER (Head AVERaging) and analyze its mean squared error. |
Tuan Nguyen; Jay Barrett; Kwang-Sung Jun; |
| 275 | Amortized Probabilistic Conditioning for Optimization, Simulation and Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce the Amortized Conditioning Engine (ACE), a new transformer-based meta-learning model that explicitly represents latent variables of interest. |
Paul Edmund Chang; Nasrulloh Ratu Bagus Satrio Loka; Daolang Huang; Ulpu Remes; Samuel Kaski; Luigi Acerbi; |
| 276 | Learning Geometrically-Informed Lyapunov Functions with Deep Diffeomorphic RBF Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: To this end, we introduce a novel approach to construct diffeomorphic maps based on RBF networks, which facilitate precise, local transformations around data. |
Samuel Tesfazgi; Leonhard Sprandl; Sandra Hirche; |
| 277 | Efficient Optimization Algorithms for Linear Adversarial Training Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we propose tailored optimization algorithms for the adversarial training of linear models, which render large-scale regression and classification problems more tractable. |
Antonio H. Ribeiro; Thomas B. Schön; Dave Zachariah; Francis Bach; |
| 278 | A Generalized Theory of Mixup for Structure-Preserving Synthetic Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Despite its success and popularity, limited attention has been given to understanding the statistical properties of the synthetic data it generates. In this paper, we delve into the theoretical underpinnings of mixup, specifically its effects on the statistical structure of synthesized data. |
Chungpa Lee; Jongho Im; Joseph H.T. Kim; |
| 279 | A Robust Kernel Statistical Test of Invariance: Detecting Subtle Asymmetries Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our first contribution is to show that, while detecting subtle asymmetries is computationally intractable, a randomized method can be used to robustly estimate closeness measures to invariance within constant factors. |
Ashkan Soleymani; Behrooz Tahmasebi; Stefanie Jegelka; Patrick Jaillet; |
| 280 | Structure Based SAT Dataset for Analysing GNN Generalisation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To bridge the gap between structural graph properties (e.g., modularity, self-similarity) and the generalisability (or lack thereof) of GNN based SAT solvers, we present StructureSAT: a curated dataset, along with code to further generate novel examples, containing a diverse set of SAT problems from well known problem domains. |
Yi Fu; Anthony Tompkins; Yang Song; Maurice Pagnucco; |
| 281 | Lower Bounds for Time-Varying Kernelized Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider non-stationary scenarios, which are crucial for certain applications but are currently less well-understood. |
Xu Cai; Jonathan Scarlett; |
| 282 | Multi-Player Approaches for Dueling Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As user numbers grow and tasks shift to complex datasets like images or videos, distributed approaches become essential for efficiently gathering feedback. To address this, we introduce a multiplayer dueling bandit problem, highlighting that exploring non-informative candidate pairs becomes especially challenging in a collaborative environment. |
Or Raveh; Junya Honda; Masashi Sugiyama; |
| 283 | Planning and Learning in Risk-Aware Restless Multi-Arm Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we generalize the traditional restless multi-arm bandit problem with a risk-neutral objective by incorporating risk-awareness. |
Nima Akbarzadeh; Yossiri Adulyasak; Erick Delage; |
| 284 | Computation-Aware Kalman Filtering and Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: But since they do not model the error introduced by the computational approximation, their predictive uncertainty estimates can be overly optimistic. In this work, we propose a probabilistic numerical method for inference in high-dimensional Gauss-Markov models which mitigates these scaling issues. |
Marvin Pförtner; Jonathan Wenger; Jon Cockayne; Philipp Hennig; |
| 285 | Logarithmic Neyman Regret for Adaptive Estimation of The Average Treatment Effect Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Existing non-asymptotic methods are limited by poor empirical performance and exponential dependence on problem parameters. In order to address these gaps, we propose and analyze the Clipped Second Moment Tracking (ClipSMT) algorithm, a variant of an existing algorithm with strong asymptotic optimality guarantees, and provide finite sample bounds on its Neyman regret. |
Ojash Neopane; Aaditya Ramdas; Aarti Singh; |
| 286 | Learning to Negotiate Via Voluntary Commitment Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Based on MCGs, we propose a learnable commitment protocol via policy gradients. |
Shuhui Zhu; Baoxiang Wang; Sriram Ganapathi Subramanian; Pascal Poupart; |
| 287 | Parabolic Continual Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a new approach to continual learning by imposing the properties of a parabolic partial differential equation (PDE) to regularize the expected behavior of the loss over time. |
Haoming Yang; Ali Hasan; Vahid Tarokh; |
| 288 | Empirical Error Estimates for Graph Sparsification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Although it is possible for users to obtain conceptual guidance from theoretical error bounds in the literature, such results are typically impractical at a numerical level. Taking an alternative approach, we propose to address these issues from a data-driven perspective by computing empirical error estimates. |
Siyao Wang; Miles E. Lopes; |
| 289 | Nonparametric Estimation of Hawkes Processes with RKHSs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Based on it, we propose an estimation method, that relies on two common approximations (of the ReLU function and of the integral operator). |
Anna Bonnet; Maxime Sangnier; |
| 290 | Wasserstein Gradient Flow Over Variational Parameter Space for Variational Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we reframe VI as the optimization of an objective that concerns probability distributions defined over a variational parameter space. |
Dai Hai Nguyen; Tetsuya Sakurai; Hiroshi Mamitsuka; |
| 291 | Wasserstein Distributionally Robust Bayesian Optimization with Continuous Context Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a novel algorithm for Wasserstein Distributionally Robust Bayesian Optimization that can handle continuous context distributions while maintaining computational tractability. |
Francesco Micheli; Efe C. Balta; Anastasios Tsiamis; John Lygeros; |
| 292 | Recurrent Neural Goodness-of-Fit Test for Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose the REcurrent NeurAL (RENAL) Goodness-of-Fit test, a novel and statistically rigorous framework for evaluating generative time series models. |
Aoran Zhang; Wenbin Zhou; Liyan Xie; Shixiang Zhu; |
| 293 | Prediction-Centric Uncertainty Quantification Via MMD Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In the context of a misspecified deterministic mathematical model, this has the undesirable consequence that posterior predictions become deterministic and certain, while being incorrect. Taking this observation as a starting point, we propose \emph{Prediction-Centric Uncertainty Quantification}, where a mixture distribution based on the deterministic model confers improved uncertainty quantification in the predictive context. |
Zheyang Shen; Jeremias Knoblauch; Samuel Power; Chris J. Oates; |
| 294 | Is Merging Worth It? Securely Evaluating The Information Gain for Causal Dataset Acquisition Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For causal estimation this is particularly challenging as the value of a merge depends not only on reduction in epistemic uncertainty but also on improvement in overlap. To address this challenge, we introduce the first \emph{cryptographically secure} information-theoretic approach for quantifying the value of a merge in the context of heterogeneous treatment effect estimation. |
Jake Fawkes; Lucile Ter-Minassian; Desi R. Ivanova; Uri Shalit; Christopher C. Holmes; |
| 295 | The Hardness of Validating Observational Studies with Experimental Data Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Recent works attempt to remove this bias by supplementing observational data with experimental data, which, when available, is typically on a smaller scale due to the time and cost involved in running a randomised controlled trial. In this work, we prove a theorem that places fundamental limits on this “best of both worlds” approach. |
Jake Fawkes; Michael O'Riordan; Athanasios Vlontzos; Oriol Corcoll; Ciarán Mark Gilligan-Lee; |
| 296 | Calibrated Computation-Aware Gaussian Processes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We thus propose a new CAGP framework, CAGP-GS, based on using Gauss-Seidel iterations for the underlying probabilistic linear solver. |
Disha Hegde; Mohamed Adil; Jon Cockayne; |
| 297 | Adapting to Online Distribution Shifts in Deep Learning: A Black-Box Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a meta-algorithm that takes any network architecture and any Online Learner (OL) algorithm as input and produces a new algorithm which provably enhances the performance of the given OL under non-stationarity. |
Dheeraj Baby; Boran Han; Shuai Zhang; Cuixiong Hu; Bernie Wang; Yu-Xiang Wang; |
| 298 | Emergence of Globally Attracting Fixed Points in Deep Neural Networks With Nonlinear Activations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we introduce a theoretical framework for the evolution of the kernel sequence, which measures the similarity between the hidden representation for two different inputs. |
Amir Joudaki; Thomas Hofmann; |
| 299 | Towards Cost Sensitive Decision Making Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we consider RL models that may actively acquire features from the environment to improve the decision quality and certainty, while automatically balancing the cost of the feature acquisition process against the reward of the task decision process. |
Yang Li; Junier Oliva; |
| 300 | Conformal Prediction Under Generalized Covariate Shift with Posterior Drift Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, we study a particular type of classification problem, called conformal prediction, under a new distributional assumption for transfer learning. |
Baozhen Wang; Xingye Qiao; |
| 301 | Information-Theoretic Causal Discovery in Topological Order Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we develop a general information-theoretic framework called TOPIC for causal discovery in topological order. |
Sascha Xu; Sarah Mameche; Jilles Vreeken; |
| 302 | Theoretically Grounded Pruning of Large Ground Sets for Constrained, Discrete Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we develop light-weight pruning algorithms to quickly discard elements that are unlikely to be part of an optimal solution. |
Ankur Nath; Alan Kuhnle; |
| 303 | Bayesian Gaussian Process ODEs Via Double Normalizing Flows Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the use of standard GPs with basic kernels like squared exponential kernels has been common in GP-ODE research, limiting the model’s ability to represent complex scenarios. To address this limitation, we introduce normalizing flows to reparameterize the ODE vector field, resulting in a data-driven prior distribution, thereby increasing flexibility and expressive power. |
Jian Xu; Shian Du; Junmei Yang; Xinghao Ding; Delu Zeng; John Paisley; |
| 304 | Statistical Test for Auto Feature Engineering By Selective Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Unfortunately, because most AFE problems are formulated as combinatorial search problems and solved by heuristic algorithms, it has been challenging to theoretically quantify the reliability of generated features. To address this issue, we propose a new statistical test for generated features by AFE algorithms based on a framework called selective inference. |
Tatsuya Matsukawa; Tomohiro Shiraishi; Shuichi Nishino; Teruyuki Katsuoka; Ichiro Takeuchi; |
| 305 | What Ails Generative Structure-based Drug Design: Expressivity Is Too Little or Too Much? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Several generative models with elaborate training and sampling procedures have been proposed to accelerate structure-based drug design (SBDD); however, their empirical performance turns out to be suboptimal. |
Rafal Karczewski; Samuel Kaski; Markus Heinonen; Vikas K Garg; |
| 306 | Copula Based Trainable Calibration Error Estimator of Multi-Label Classification with Label Interdependencies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: A key challenge in calibrating Multi-Label Classification(MLC) problems is to consider the interdependencies among labels. To address this, in this research we propose an unbiased, differentiable, trainable calibration error estimator for MLC problems by using Copula. |
Arkapal Panda; Utpal Garain; |
| 307 | Disentangling Impact of Capacity, Objective, Batchsize, Estimators, and Step-size on Flow VI Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We provide specific recommendations for different factors and propose a flow VI recipe that matches or surpasses leading turnkey Hamiltonian Monte Carlo (HMC) methods. |
Abhinav Agrawal; Justin Domke; |
| 308 | RTD-Lite: Scalable Topological Analysis for Comparing Weighted Graphs in Learning Tasks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We introduce RTD-Lite, a scalable algorithm that efficiently compares topological features, specifically connectivity or cluster structures at arbitrary scales, of two weighted graphs with one-to-one correspondence between vertices. |
Eduard Tulchinskii; Daria Voronkova; Ilya Trofimov; Evgeny Burnaev; Serguei Barannikov; |
| 309 | Nonparametric Distributional Regression Via Quantile Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a new approach to estimating the distribution of a response variable conditioned on factors. |
Cheng Peng; Stan Uryasev; |
| 310 | Testing Conditional Independence with Deep Neural Network Based Binary Expansion Testing (DeepBET) Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel test procedure, DeepBET. |
Yang Yang; Kai Zhang; Ping-Shou Zhong; |
| 311 | Optimal Downsampling for Imbalanced Classification with Generalized Linear Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study optimal downsampling for imbalanced classification using generalized linear models (GLMs). We propose a pseudo maximum likelihood estimator and study its asymptotic normality in the context of increasingly imbalanced populations relative to an increasingly large sample size. |
Yan Chen; Jose Blanchet; Krzysztof Dembczynski; Laura Fee Nern; Aaron Eliasib Flores; |
| 312 | Pareto Set Identification With Posterior Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Building on posterior sampling in both the stopping and the sampling rules, we propose the \hyperlink{PSIPS}{PSIPS} algorithm that deals simultaneously with structure and correlation without paying the computational cost of existing oracle-based algorithms. |
Cyrille Kone; Marc Jourdan; Emilie Kaufmann; |
| 313 | A Primer on Linear Classification with Missing Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we theoretically analyze how three classical linear classifiers, namely perceptron, logistic regression and linear discriminant analysis (LDA), behave with Missing Completely At Random (MCAR) data, depending on the strategy (imputation or P-b-P) to handle missing values. |
Angel David Reyero Lobo; Alexis Ayme; Claire Boyer; Erwan Scornet; |
| 314 | Riemann$^2$: Learning Riemannian Submanifolds from Riemannian Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, when dealing with constrained data such as unit-norm vectors or symmetric positive-definite matrices, existing approaches ignore the underlying geometric constraints or fail to provide meaningful metrics in the latent space. To address these limitations, we propose to learn Riemannian latent representations of such geometric data. |
Leonel Rozo; Miguel González-Duque; Noémie Jaquier; Søren Hauberg; |
| 315 | Variance-Dependent Regret Bounds for Nonstationary Linear Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, such a quantity only measures the non-stationarity with respect to the expectation of the reward distribution, which makes existing algorithms sub-optimal under the general non-stationary distribution setting. In this work, we propose algorithms that utilize the variance of the reward distribution as well as the $B_K$, and show that they can achieve tighter regret upper bounds. |
Zhiyong Wang; Jize Xie; Yi Chen; John C.S. Lui; Dongruo Zhou; |
| 316 | Reinforcement Learning for Infinite-Horizon Average-Reward Linear MDPs Via Approximation By Discounted-Reward MDPs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose the first algorithm that achieves $\widetilde{\mathcal{O}}(\sqrt{T})$ regret with computational complexity polynomial in the problem parameters, without making strong assumptions on dynamics. |
Kihyuk Hong; Woojin Chae; Yufan Zhang; Dabeen Lee; Ambuj Tewari; |
| 317 | The Size of Teachers As A Measure of Data Complexity: PAC-Bayes Excess Risk Bounds and Scaling Laws Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the generalization properties of neural networks through the lens of data complexity. |
Gintare Karolina Dziugaite; Daniel M. Roy; |
| 318 | From Learning to Optimize to Learning Optimization Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Towards designing learned optimization algorithms that are usable beyond their training setting, we identify key principles that classical algorithms obey but that have, up to now, not been used for Learning to Optimize (L2O). Following these principles, we provide a general design pipeline, taking into account data, architecture and learning strategy, and thereby enabling a synergy between classical optimization and L2O, resulting in a philosophy of Learning Optimization Algorithms. |
Camille Castera; Peter Ochs; |
| 319 | Refined Analysis of Constant Step Size Federated Averaging and Federated Richardson-Romberg Extrapolation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present a novel analysis of $\texttt{FedAvg}$ with constant step size, relying on the Markov property of the underlying process. |
Paul Mangold; Alain Oliviero Durmus; Aymeric Dieuleveut; Sergey Samsonov; Eric Moulines; |
| 320 | MODL: Multilearner Online Deep Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we introduce an alternative paradigm through a hybrid multilearner approach. |
Antonios Valkanas; Boris N. Oreshkin; Mark Coates; |
| 321 | Parameter Estimation in State Space Models Using Particle Importance Sampling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, two SMC algorithms are proposed based on an importance sampling weight function to use each set of generated particles more efficiently. |
Yuxiong Gao; Wentao Li; Rong Chen; |
| 322 | On Subjective Uncertainty Quantification and Calibration in Natural Language Generation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We illustrate the methods on question answering and machine translation tasks. |
Ziyu Wang; Christopher C. Holmes; |
| 323 | Conditional Prediction ROC Bands for Graph Classification Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Empirically, to establish local exchangeability for TGNNs, we introduce a data-driven approach to construct local calibration sets for graphs. |
Yujia Wu; Bo Yang; Elynn Chen; Yuzhou Chen; Zheshi Zheng; |
| 324 | Bridging Multiple Worlds: Multi-marginal Optimal Transport for Causal Partial-identification Problem Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we cast the causal partial identification problem in the framework of MOT with $K$ margins and $d$-dimensional outcomes and obtain the exact partial identified set. |
Zijun Gao; Shu Ge; Jian Qian; |
| 325 | Learning from Biased Positive-unlabeled Data Via Threshold Calibration Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Learning from positive and unlabeled data (PU learning) aims to train a binary classification model when only positive and unlabeled examples are available. |
Pawel Teisseyre; Timo Martens; Jessa Bekker; Jesse Davis; |
| 326 | Integer Programming Based Methods and Heuristics for Causal Graph Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we develop the first exact methods – based on integer programming – to find score-maximizing Bow- free and Arid ADMGs. |
Sanjeeb Dash; Joao Goncalves; Tian Gao; |
| 327 | Automatically Adaptive Conformal Risk Control Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Building on the recent work of Gibbs et al. [2023], we propose a methodology for achieving approximate conditional control of statistical risks—the expected value of loss functions—by adapting to the difficulty of test samples. |
Vincent Blot; Anastasios Nikolas Angelopoulos; Michael Jordan; Nicolas J-B. Brunel; |
| 328 | Hierarchical Bias-Driven Stratification for Interpretable Causal Effect Estimation Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Here, we present BICauseTree: an interpretable balancing method that identifies clusters where natural experiments occur locally. |
Lucile Ter-Minassian; Liran Szlak; Ehud Karavani; Christopher C. Holmes; Yishai Shimoni; |
| 329 | Legitimate Ground-truth-free Metrics for Deep Uncertainty Classification Scoring Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Equipped with those new results, and given the applicability of those metrics in the usual supervised paradigm, we argue that our contributions will help promote a broader use of UQ in deep learning. |
Arthur Pignet; Chiara Regniez; John Klein; |
| 330 | Causal Discovery in Mixed Additive Noise Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our contribution is a structural causal model designed to handle mixed-type data through a general function class. |
Ruicong Yao; Tim Verdonck; Jakob Raymaekers; |
| 331 | Understanding The Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite the empirical success of Low-Rank Adaptation (LoRA) in fine-tuning pre-trained models, there is little theoretical understanding of how first-order methods with carefully crafted initialization adapt models to new tasks. In this work, we take the first step towards bridging this gap by theoretically analyzing the learning dynamics of LoRA for matrix factorization (MF) under gradient flow (GF), emphasizing the crucial role of initialization. |
Ziqing Xu; Hancheng Min; Lachlan Ewen MacDonald; Jinqi Luo; Salma Tarmoun; Enrique Mallada; Rene Vidal; |
| 332 | Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a scalable kinetic Langevin dynamics algorithm for sampling parameter spaces of big data and AI applications. |
Daniel Paulin; Peter A. Whalley; Neil K. Chada; Benedict J. Leimkuhler; |
| 333 | Revisiting Online Learning Approach to Inverse Linear Optimization: A Fenchel-Young Loss Perspective and Gap-Dependent Regret Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As a byproduct, we present an offline guarantee on the \emph{suboptimality loss}, which measures how well predicted objective vectors explain the agent’s choices, without assuming the optimality of the agent’s choices. |
Shinsaku Sakaue; Han Bao; Taira Tsuchiya; |
| 334 | On The Convergence of Locally Adaptive and Scalable Diffusion-Based Sampling Methods for Deep Bayesian Neural Network Posteriors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Over the past years, several papers have introduced sampling algorithms with corresponding theorems stating that they achieve this property. In this paper, we demonstrate that these methods can have a substantial bias in the distribution they sample, even in the limit of vanishing step sizes and at full batch size. |
Tim Rensmeyer; Oliver Niggemann; |
| 335 | LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We investigate the \emph{linear contextual bandit problem} with independent and identically distributed (i.i.d.) contexts. In this problem, we aim to develop a \emph{Best-of-Both-Worlds} (BoBW) algorithm with regret upper bounds in both stochastic and adversarial regimes. |
Masahiro Kato; Shinji Ito; |
| 336 | Application of Structured State Space Models to High Energy Physics with Locality Sensitive Hashing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Modern high-energy physics (HEP) experiments are increasingly challenged by the vast size and complexity of their datasets, particularly regarding large-scale point cloud processing and long sequences. In this study, to address these challenges, we explore the application of structured state space models (SSMs), proposing one of the first trials to integrate local-sensitive hashing into either a hybrid or pure Mamba Model. |
Cheng Jiang; Sitian Qian; |
| 337 | Epistemic Uncertainty and Excess Risk in Variational Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite their practical importance, these metrics lack comprehensive theoretical analysis. In this paper, we investigate these EU metrics by providing their novel relationship to excess risk, which allows for a convergence analysis based on PAC-Bayesian theory. |
Futoshi Futami; |
| 338 | Common Learning Constraints Alter Interpretations of Direct Preference Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Abstract: Large language models in the past have typically relied on some form of reinforcement learning with human feedback (RLHF) to better align model responses with human preferences. … |
Lemin Kong; Xiangkun Hu; Tong He; David Wipf; |
| 339 | Conditioning Diffusion Models By Explicit Forward-backward Bridging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we express \emph{exact} conditional simulation within the \emph{approximate} diffusion model as an inference problem on an augmented space corresponding to a partial SDE bridge. |
Adrien Corenflos; Zheng Zhao; Thomas B. Schön; Simo Särkkä; Jens Sjölund; |
| 340 | Adaptive Extragradient Methods for Root-finding Problems Under Relaxed Assumptions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop a new class of self-tuning algorithms to solve a root-finding problem involving a Lipschitz continuous operator, with applications in convex optimization, minimax saddle point problems and variational inequalities. |
Yang Luo; Michael J O'Neill; |
| 341 | Online-to-PAC Generalization Bounds Under Graph-mixing Dependencies Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Both approaches have their own limitations, the former requiring a temporal ordered structure, and the latter lacking a way to quantify the strength of inter-dependencies. In this work, we bridge these two lines of work by proposing a framework where dependencies decay with graph distance. |
Baptiste Abélès; Gergely Neu; Eugenio Clerico; |
| 342 | Dissecting The Impact of Model Misspecification in Data-Driven Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Although intuitive, the statistical benefit of the latter approach is not well understood, yet it is important for guiding the prescriptive usage of machine learning. In this paper, we dissect the performance comparisons between these approaches in terms of the amount of model misspecification. |
Adam N. Elmachtoub; Henry Lam; Haixiang Lan; Haofeng Zhang; |
| 343 | Density Ratio-based Proxy Causal Learning Without Density Ratios Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the present work, we propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments. |
Bariscan Bozkurt; Ben Deaner; Dimitri Meunier; Liyuan Xu; Arthur Gretton; |
| 344 | Cubic Regularized Subspace Newton for Non-convex Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper addresses the optimization problem of minimizing non-convex continuous functions, a problem highly relevant in high-dimensional machine learning scenarios, particularly those involving over-parameterization. |
Jim Zhao; Nikita Doikov; Aurelien Lucchi; |
| 345 | Credibility-Aware Multimodal Fusion Using Probabilistic Circuits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a combination function that uses probabilistic circuits (PCs) to combine predictive distributions over individual modalities. |
Sahil Sidheekh; Pranuthi Tenali; Saurabh Mathur; Erik Blasch; Kristian Kersting; Sriraam Natarajan; |
| 346 | Parallel Backpropagation for Inverse of A Convolution with Application to Normalizing Flows Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose to use Inverse of Convolutions in the forward (image to latent vector) pass of the Normalizing flow. |
Sandeep Nagar; Girish Varma; |
| 347 | High Dimensional Bayesian Optimization Using Lasso Variable Selection Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Although this approach can mitigate the high-dimensional challenge in BO, it still leads to sample inefficiency. To address this issue, we introduce a novel method that identifies important variables by estimating the length scales of Gaussian process kernels. |
Vu Viet Hoang; Hung The Tran; Sunil Gupta; Vu Nguyen; |
| 348 | MING: A Functional Approach to Learning Molecular Generative Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we propose Molecular Implicit Neural Generation (MING), a diffusion-based model that learns molecular distributions in the function space. |
Van Khoa Nguyen; Maciej Falkiewicz; Giangiacomo Mercatali; Alexandros Kalousis; |
| 349 | Energy-consistent Neural Operators for Hamiltonian and Dissipative Partial Differential Equations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes Energy-consistent Neural Operators (ENOs), a general framework for learning solution operators of PDEs that follows the energy conservation or dissipation law from observed solution trajectories. |
Yusuke Tanaka; Takaharu Yaguchi; Tomoharu Iwata; Naonori Ueda; |
| 350 | Learning Infinite-Horizon Average-Reward Linear Mixture MDPs of Bounded Span Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper proposes a computationally tractable algorithm for learning infinite-horizon average-reward linear mixture Markov decision processes (MDPs) under the Bellman optimality condition. |
Woojin Chae; Kihyuk Hong; Yufan Zhang; Ambuj Tewari; Dabeen Lee; |
| 351 | Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a black-box reduction method that, by employing properties of uniformly convex regularizers, turns an optimization algorithm for Hölder smooth convex losses into a uniformly stable learning algorithm with optimal statistical risk bounds on the excess risk, up to a constant factor depending on $p$. |
Simon Vary; David Martínez-Rubio; Patrick Rebeschini; |
| 352 | Cross-Modal Imputation and Uncertainty Estimation for Spatial Transcriptomics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here, we propose an attention-based cross-modal framework that simultaneously imputes gene expression for ST and recovers spatial locations for SC, while also providing uncertainty estimates for the expression of the imputed genes. |
Xiangyu Guo; Ricardo Henao; |
| 353 | Trustworthy Assessment of Heterogeneous Treatment Effect Estimator Via Analysis of Relative Error Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose incorporating uncertainty quantification into HTE estimator comparisons. |
Zijun Gao; |
| 354 | Near-optimal Algorithms for Private Estimation and Sequential Testing of Collision Probability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present new algorithms for estimating and testing \emph{collision probability}, a fundamental measure of the spread of a discrete distribution that is widely used in many scientific fields. |
Robert Istvan Busa-Fekete; Umar Syed; |
| 355 | Post-processing for Fair Regression Via Explainable SVD Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper presents a post-processing algorithm for training fair neural network regression models that satisfy statistical parity, utilizing an explainable singular value decomposition (SVD) of the weight matrix. |
Zhiqun Zuo; Ding Zhu; Mohammad Mahdi Khalili; |
| 356 | Q-function Decomposition with Intervention Semantics for Factored Action Spaces Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider Q-functions defined over a lower dimensional projected subspace of the original action space, and study the condition for the unbiasedness of decomposed Q-functions using causal effect estimation from the no unobserved confounder setting in causal statistics. |
Junkyu Lee; Tian Gao; Elliot Nelson; Miao Liu; Debarun Bhattacharjya; Songtao Lu; |
| 357 | Permutation Invariant Functions: Statistical Testing, Density Estimation, and Metric Entropy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, less attention is given to: (1) how to statistically test for the assumption of permutation invariance of coordinates in a random vector where the dimension is allowed to grow with the sample size; (2) how to estimate permutation invariant density functions; (3) how much “smaller” is the class of smooth functions with permutation invariance compared to that without permutation invariance. In this paper, we take a step back and examine these fundamental questions. |
Wee Chaimanowong; Ying Zhu; |
| 358 | Knowledge Graph Completion with Mixed Geometry Tensor Factorization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a new geometric approach for knowledge graph completion via low rank tensor approximation. |
Viacheslav Yusupov; Maxim Rakhuba; Evgeny Frolov; |
| 359 | A Computation-Efficient Method of Measuring Dataset Quality Based on The Coverage of The Dataset Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a computationally efficient method for quantifying dataset quality. |
Beomjun Kim; Jaehwan Kim; Kangyeon Kim; Sunwoo Kim; Heejin Ahn; |
| 360 | Optimizing Neural Network Training and Quantization with Rooted Logistic Objectives Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce Rooted Logistic Objectives (RLO) to improve practical convergence behavior with benefits for downstream tasks. |
Zhu Wang; Praveen Raj Veluswami; Harsh Mishra; Sathya N. Ravi; |
| 361 | Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose an algorithm that uses fixed allocations based on the prior information and the structure of the environment. |
Nicolas Nguyen; Imad Aouali; András György; Claire Vernade; |
| 362 | Large Covariance Matrix Estimation With Nonnegative Correlations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a positive definite thresholding covariance estimation problem that includes nonconvex sparsity penalties and nonnegative correlation constraints. To address this problem, we introduce a multistage adaptive estimation algorithm based on majorization-minimization (MM). |
Yixin Yan; Qiao Yang; Ziping Zhao; |
| 363 | Tighter Confidence Bounds for Sequential Kernel Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we use martingale tail inequalities to establish new confidence bounds for sequential kernel regression. |
Hamish Flynn; David Reeb; |
| 364 | Theoretical Analysis of Leave-one-out Cross Validation for Non-differentiable Penalties Under High-dimensional Settings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the hyperparameter tuning problem in the proportional high dimensional regime where both the sample size $n$ and number of features $p$ are large, and $n/p$ and the signal-to-noise ratio (per observation) remain finite. |
Haolin Zou; Arnab Auddy; Kamiar Rahnama Rad; Arian Maleki; |
| 365 | Linearized Wasserstein Barycenters: Synthesis, Analysis, Representational Capacity, and Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the linear barycentric coding model (LBCM) which utilizes the linear optimal transport (LOT) metric for analysis and synthesis of probability measures. |
Matthew Werenski; Brendan Mallery; Shuchin Aeron; James M. Murphy; |
| 366 | Explaining ViTs Using Information Flow Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we define a theoretical approach to creating explanations for ViTs called InFlow. |
Chase Walker; Md Rubel Ahmed; Sumit Kumar Jha; Rickard Ewetz; |
| 367 | Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with strong convergence and performance guarantees. |
Jia Lin Hau; Erick Delage; Esther Derman; Mohammad Ghavamzadeh; Marek Petrik; |
| 368 | Counting Graphlets of Size K Under Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a non-interactive, locally differentially private algorithm capable of counting graphlets of any size $k$. |
Vorapong Suppakitpaisarn; Donlapark Ponnoprat; Nicha Hirankarn; Quentin Hillebrand; |
| 369 | Strategic Conformal Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: With such alterations in mind, existing approaches to uncertainty quantification break. In this work we propose a new framework, Strategic Conformal Prediction, which is capable of robust uncertainty quantification in such a setting. |
Daniel Csillag; Claudio Jose Struchiner; Guilherme Tegoni Goedert; |
| 370 | Variational Adversarial Training Towards Policies with Improved Robustness Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Thus, we propose to apply variational optimization to optimize over the worst-case distribution of the adversary instead of a single worst-case adversary. |
Juncheng Dong; Hao-Lun Hsu; Qitong Gao; Vahid Tarokh; Miroslav Pajic; |
| 371 | Learning Pareto Manifolds in High Dimensions: How Can Regularization Help? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we discuss how the application of vanilla regularization approaches can fail, and propose a two-stage MOL framework that can successfully leverage low-dimensional structure. |
Tobias Wegel; Filip Kovacevic; Alexandru Tifrea; Fanny Yang; |
| 372 | On Adaptivity and Minimax Optimality of Two-sided Nearest Neighbors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Nearest neighbor (NN) algorithms have been extensively used for missing data problems in recommender systems and sequential decision-making systems. |
Tathagata Sadhukhan; Manit Paul; Raaz Dwivedi; |
| 373 | High-Dimensional Differential Parameter Inference in Exponential Family Using Time Score Matching Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Instead of estimating a high-dimensional model at each time and estimating changes later, we directly learn the differential parameter, i.e., the time derivative of the parameter. |
Daniel James Williams; Leyang Wang; Qizhen Ying; Song Liu; Mladen Kolar; |
| 374 | Generalized Criterion for Identifiability of Additive Noise Models Using Majorization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel identifiability criterion for DAGs that places constraints on the conditional variances of additive noise models. |
Aramayis Dallakyan; Yang Ni; |
| 375 | Learning A Single Index Model from Anisotropic Data with Vanilla Stochastic Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we analyze the learning dynamics of vanilla SGD under the SIM with anisotropic input data, demonstrating that vanilla SGD automatically adapts to the data’s covariance structure. |
Guillaume Braun; Minh Ha Quang; Masaaki Imaizumi; |
| 376 | Near-Polynomially Competitive Active Logistic Regression Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present the first algorithm that is polynomially competitive with the optimal algorithm on every input instance, up to factors polylogarithmic in the error and domain size. |
Yihan Zhou; Eric Price; Trung Nguyen; |
| 377 | Function-Space MCMC for Bayesian Wide Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Bayesian Neural Networks represent a fascinating confluence of deep learning and probabilistic reasoning, offering a compelling framework for understanding uncertainty in complex predictive models. In this paper, we investigate the use of the preconditioned Crank-Nicolson algorithm and its Langevin version to sample from a reparametrised posterior distribution of the neural network’s weights, as the widths grow larger. |
Lucia Pezzetti; Stefano Favaro; Stefano Peluchetti; |
| 378 | ADEPT: Hierarchical Bayes Approach to Personalized Federated Unsupervised Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We develop adaptive algorithms that discover the balance between using limited local data and collaborative information. |
Kaan Ozkara; Bruce Huang; Ruida Zhou; Suhas Diggavi; |
| 379 | Infinite Width Limits of Self Supervised Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we bridge the gap between the NTK and self-supervised learning, focusing on two-layer neural networks trained under the Barlow Twins loss. |
Maximilian Fleissner; Gautham Govind Anil; Debarghya Ghoshdastidar; |
| 380 | Minimum Empirical Divergence for Sub-Gaussian Linear Bandits Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a novel linear bandit algorithm called LinMED (Linear Minimum Empirical Divergence), which is a linear extension of the MED algorithm that was originally designed for multi-armed bandits. |
Kapilan Balagopalan; Kwang-Sung Jun; |
| 381 | Decoupling Epistemic and Aleatoric Uncertainties with Possibility Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we show that an alternative representation of epistemic uncertainty, based on possibility theory, maintains many of the convenient features of standard Bayesian inference while displaying specific behaviours and properties that closely match the ones of an intuitive notion of information. |
Nong Minh Hieu; Jeremie Houssineau; Neil K. Chada; Emmanuel Delande; |
| 382 | Advancing Fairness in Precision Medicine: A Universal Framework for Optimal Treatment Estimation in Censored Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: The problem presents two key challenges: measuring heterogeneous treatment effects (HTE) under fairness constraints and dealing with censoring mechanisms. We propose a general framework for estimating HTE using nonparametric methods and integrating user-controllable fairness constraints to address these problems. |
Hongni Wang; Junxi Zhang; Na Li; Linglong Kong; Bei Jiang; Xiaodong Yan; |
| 383 | Conditional Generative Learning from Invariant Representations in Multi-Source: Robustness and Efficiency Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, existing approaches often struggle with limitations such as negative transfer and an over-reliance on large pre-trained models. To address these challenges, we propose a novel method that effectively handles scenarios with outlier source domains, while making weaker assumptions about the data, thus ensuring broader applicability. |
Guojun Zhu; Sanguo Zhang; Mingyang Ren; |
| 384 | A Tight Regret Analysis of Non-Parametric Repeated Contextual Brokerage Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For both feedback types, we propose algorithms achieving tight regret bounds. |
François Bachoc; Tommaso Cesari; Roberto Colomboni; |
| 385 | Improved Dependence on Coherence in Eigenvector and Eigenvalue Estimation Error Bounds Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Using a new matrix concentration result that may be of independent interest, we establish estimation error bounds for eigenvector and eigenvalue recovery whose dependence on coherence significantly improves upon prior work. |
Hao Yan; Keith Levin; |
| 386 | On The Asymptotic Mean Square Error Optimality of Diffusion Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper proposes a novel denoising strategy inspired by the structure of the MSE-optimal conditional mean estimator (CME). |
Benedikt Fesl; Benedikt Böck; Florian Strasser; Michael Baur; Michael Joham; Wolfgang Utschick; |
| 387 | Fast Convergence of Softmax Policy Mirror Ascent Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Recently, Vaswani et al. (2021) introduced a policy gradient method that corresponds to mirror ascent in the dual space of logits. |
Reza Asad; Reza Babanezhad Harikandeh; Issam H. Laradji; Nicolas Le Roux; Sharan Vaswani; |
| 388 | Perfect Recovery for Random Geometric Graph Matching with Shallow Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the graph matching problem in the presence of vertex feature information using shallow graph neural networks. |
Suqi Liu; Morgane Austern; |
| 389 | Noisy Low-Rank Matrix Completion Via Transformed $L_1$ Regularization and Its Theoretical Properties Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper focuses on recovering an underlying matrix from its noisy partial entries, a problem commonly known as matrix completion. |
Kun Zhao; Jiayi Wang; Yifei Lou; |
| 390 | Efficient Estimation of A Gaussian Mean with Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study the problem of estimating the unknown mean $\theta$ of a unit variance Gaussian distribution in a locally differentially private (LDP) way. |
Kalinin Nikita; Lukas Steinberger; |
| 391 | $\beta$-th Order Acyclicity Derivatives for DAG Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce $\beta$-th Order Taylor Series Expansion Based Local Search ($\beta$-LS) which yields actionable descent directions for any $\beta \in \mathbb{N}$. |
Madhumitha Shridharan; Garud Iyengar; |
| 392 | Online Student-$t$ Processes with An Overall-local Scale Structure for Modelling Non-stationary Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a mixture of Student-$t$ processes with an adaptive structure for the covariance and noise behaviour for each mixture. |
Taole Sha; Michael Minyi Zhang; |
| 393 | Unbiased Quantization of The $L_1$ Ball for Communication-Efficient Distributed Mean Estimation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the problem of unbiased minimum mean squared error quantization of the $L_1$ ball, with applications to distributed mean estimation and federated learning. |
Nithish Suresh Babu; Ritesh Kumar; Shashank Vatedka; |
| 394 | An Adaptive Method for Weak Supervision with Drifting Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce an adaptive method with formal quality guarantees for weak supervision in a non-stationary setting. |
Alessio Mazzetto; Reza Esfandiarpoor; Akash Singirikonda; Eli Upfal; Stephen Bach; |
| 395 | Distributional Adversarial Loss Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Furthermore, we investigate the role of randomness in achieving robustness against adversarial attacks. We show a general derandomization technique that preserves the extent of a randomized classifier’s robustness against adversarial attacks and show its effectiveness empirically. |
Saba Ahmadi; Siddharth Bhandari; Avrim Blum; Chen Dan; Prabhav Jain; |
| 396 | Robust Fair Clustering with Group Membership Uncertainty Sets Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we consider a setting where the assigned group memberships are noisy. |
Sharmila Duppala; Juan Luque; John P Dickerson; Seyed A. Esmaeili; |
| 397 | BudgetIV: Optimal Partial Identification of Causal Effects with Mostly Invalid Instruments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: An IV must affect $Y$ exclusively through $X$ and be unconfounded with $Y$. We present a framework for relaxing these assumptions with tuneable and interpretable "budget constraints". |
Jordan Penn; Lee M. Gunderson; Gecia Bravo-Hermsdorff; Ricardo Silva; David Watson; |
| 398 | Synthesis and Analysis of Data As Probability Measures With Entropy-Regularized Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We consider synthesis and analysis of probability measures using the entropy-regularized Wasserstein-2 cost and its unbiased version, the Sinkhorn divergence. |
Brendan Mallery; James M. Murphy; Shuchin Aeron; |
| 399 | Improving N-Glycosylation and Biopharmaceutical Production Predictions Using AutoML-Built Residual Hybrid Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a residual hybrid modeling approach that integrates mechanistic modeling with machine learning to produce significantly more accurate predictions for N-glycosylation and bioproduction. |
Pedro Seber; Richard Braatz; |
| 400 | Approximate Global Convergence of Independent Learning in Multi-Agent Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study two representative algorithms—independent $Q$-learning and independent natural actor-critic—within both value-based and policy-based frameworks, and provide the first finite-sample analysis for approximate global convergence. |
Ruiyang Jin; Zaiwei Chen; Yiheng Lin; Jie Song; Adam Wierman; |
| 401 | Stochastic Gradient Descent for Bézier Simplex Representation of Pareto Set in Multi-Objective Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: While various multi-objective optimization algorithms have been proposed so far, most of them aim to find finite solutions as an approximation of the Pareto set, which may not adequately capture the entire structure of the Pareto set, especially when the number of variables is large. To overcome this limitation, we propose a method to obtain a parametric hypersurface representing the entire Pareto set instead of a finite set of points. |
Yasunari Hikima; Ken Kobayashi; Akinori Tanaka; Akiyoshi Sannai; Naoki Hamada; |
| 402 | TRADE: Transfer of Distributions Between External Conditions with Normalizing Flows Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Despite the relevance of these applications, existing solutions are unsatisfactory as they require severely restricted model architectures or rely on energy-based training, which is prone to instability. We introduce TRADE, which overcomes these limitations by formulating the learning process as a boundary value problem. |
Stefan Wahl; Armand Rousselot; Felix Draxler; Ullrich Koethe; |
| 403 | Vecchia Gaussian Process Ensembles on Internal Representations of Deep Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an alternative solution, the deep Vecchia ensemble (DVE), which allows deterministic UQ to work in the presence of feature collapse, negating the need for network retraining. |
Felix Jimenez; Matthias Katzfuss; |
| 404 | Task Shift: From Classification to Regression in Overparameterized Linear Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In the few-shot case, wherein limited regression data is available, we propose a simple postprocessing algorithm which asymptotically recovers the ground-truth predictor. |
Tyler LaBonte; Kuo-Wei Lai; Vidya Muthukumar; |
| 405 | On Local Posterior Structure in Deep Ensembles Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Similarly, deep ensembles (DEs) are also known to improve calibration, and therefore, it is natural to hypothesize that deep ensembles of BNNs (DE-BNNs) should provide even further improvements. In this work, we systematically investigate this across a number of datasets, neural network architectures, and BNN approximation methods and surprisingly find that when the ensembles grow large enough, DEs consistently outperform DE-BNNs on in-distribution data. |
Mikkel Jordahn; Jonas Vestergaard Jensen; Mikkel N. Schmidt; Michael Riis Andersen; |
| 406 | Score Matching for Bridges Without Learning Time-reversals Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a new algorithm for learning a bridged diffusion process using score-matching methods. |
Elizabeth Louise Baker; Moritz Schauer; Stefan Sommer; |
| 407 | Active Bipartite Ranking with Smooth Posterior Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this article, bipartite ranking, a statistical learning problem involved in many applications and widely studied in the passive context, is approached in a much more general active setting than the discrete one previously considered in the literature. In addition, we provide a problem-dependent upper bound on the expected sampling time of smooth-rank and establish a problem-dependent lower bound on the expected sampling time of any PAC$(\epsilon,\delta)$ algorithm. |
James Cheshire; Stephan Clémençon; |
| 408 | ClusterSC: Advancing Synthetic Control with Donor Selection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: As they contain a greater number of observed units, this shift introduces the curse of dimensionality to SC. To address this, we propose Cluster Synthetic Control (ClusterSC), based on the idea that groups of individuals may exist where behavior aligns internally but diverges between groups. |
Saeyoung Rho; Andrew Tang; Noah Bergam; Rachel Cummings; Vishal Misra; |
| 409 | Deep Optimal Sensor Placement for Black Box Stochastic Simulations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel and robust approach, modelling the joint distribution over input parameters and solution with a joint energy-based model, trained on simulation data. |
Paula Cordero Encinar; Tobias Schröder; Peter Yatsyshin; Andrew B. Duncan; |
| 410 | Scalable Out-of-Distribution Robustness in The Presence of Unobserved Confounders Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the task of out-of-distribution (OOD) generalization, where the distribution shift is due to an unobserved confounder ($Z$) affecting both the covariates ($X$) and the labels ($Y$). |
Parjanya Prajakta Prashant; Seyedeh Baharan Khatami; Bruno Ribeiro; Babak Salimi; |
| 411 | Graph Machine Learning Based Doubly Robust Estimator for Network Causal Effects Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These assumptions often fail to hold in high-dimensional networks, limiting the applicability of such approaches. To address this, we propose a novel methodology that integrates graph machine learning techniques with the double machine learning framework, facilitating accurate and efficient estimation of both direct and peer effects in a single observational social network. |
Seyedeh Baharan Khatami; Harsh Parikh; Haowei Chen; Sudeepa Roy; Babak Salimi; |
| 412 | Learning The Distribution Map in Reverse Causal Performative Prediction Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Inspired by a microeconomic model that adeptly characterizes agents’ behavior within labor markets, we introduce a novel approach to learning the distribution shift. |
Daniele Bracale; Subha Maity; Yuekai Sun; Moulinath Banerjee; |
| 413 | Decision-Point Guided Safe Policy Improvement Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we introduce Decision Points RL (DPRL), an algorithm that restricts the set of state-action pairs (or regions for continuous states) considered for improvement. |
Abhishek Sharma; Leo Benac; Sonali Parbhoo; Finale Doshi-Velez; |
| 414 | InnerThoughts: Disentangling Representations and Predictions in Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose instead to learn a small separate neural network predictor module on a collection of training questions, which takes the hidden states from all layers at the last temporal position as input and outputs predictions. |
Didier Chételat; Joseph Cotnareanu; Rylee Thompson; Yingxue Zhang; Mark Coates; |
| 415 | On The Relationship Between Robustness and Expressivity of Graph Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We investigate the vulnerability of Graph Neural Networks (GNNs) to bit-flip attacks (BFAs) by introducing an analytical framework to study the influence of architectural features, graph properties, and their interaction. |
Lorenz Kummer; Wilfried N. Gansterer; Nils Morten Kriege; |
| 416 | Multimodal Learning with Uncertainty Quantification Based on Discounted Belief Fusion Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Moreover, the state-of-the-art evidence averaging strategy is not order invariant and fails to scale to multiple modalities. To address these challenges, we propose a novel multimodal learning method with order-invariant evidence fusion and introduce a conflict-based discounting mechanism that reallocates uncertain mass when unreliable modalities are detected. |
Grigor Bezirganyan; Sana Sellami; Laure Berti-Equille; Sébastien Fournier; |
| 417 | Theoretical Convergence Guarantees for Variational Autoencoders Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper aims to bridge that gap by providing non-asymptotic convergence guarantees for VAE trained using both Stochastic Gradient Descent and Adam algorithms. |
Sobihan Surendran; Antoine Godichon-Baggioni; Sylvain Le Corff; |
| 418 | Robust Estimation in Metric Spaces: Achieving Exponential Concentration with A Fréchet Median Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: More recent activity has focused on extending such ideas beyond Euclidean spaces to Hilbert spaces and Riemannian manifolds. In this work, we show that such exponential concentration in the presence of heavy tails can be achieved over a broader class of parameter spaces called CAT($\kappa$) spaces, a very general metric space equipped with the minimal essential geometric structure for our purpose, while being sufficiently broad to encompass most typical examples encountered in statistics and machine learning. |
Jakwang Kim; Jiyoung Park; Anirban Bhattacharya; |
| 419 | Learning to Forget: Bayesian Time Series Forecasting Using Recurrent Sparse Spectrum Signature Gaussian Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, this property can quickly become a curse when local information is essential and forgetting is required; so far this has only been addressed with ad-hoc methods such as slicing the time series into smaller segments. To overcome this, we propose a principled and data-driven approach by introducing a novel forgetting mechanism for signature features. |
Csaba Tóth; Masaki Adachi; Michael A Osborne; Harald Oberhauser; |
| 420 | Federated Causal Inference: Multi-Study ATE Estimation Beyond Meta-Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study Federated Causal Inference, an approach to estimate treatment effects from decentralized data across centers. |
Rémi Khellaf; Aurélien Bellet; Julie Josse; |
| 421 | Understanding The Effect of GCN Convolutions in Regression Tasks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Despite their widespread success across various applications, their statistical properties (e.g., consistency, convergence rates) remain ill-characterized. To begin addressing this knowledge gap, we consider networks for which the graph structure implies that neighboring nodes exhibit similar signals and provide statistical theory for the impact of convolution operators. |
Juntong Chen; Johannes Schmidt-Hieber; Claire Donnat; Olga Klopp; |
| 422 | Near-Optimal Sample Complexity in Reward-Free Kernel-based Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We first explore this question assuming a generative model, then relax this assumption at the cost of increasing the sample complexity by a factor of $H$, the episode length. We tackle this fundamental problem using a broad class of kernels and a simpler algorithm compared to prior work. |
Aya Kayal; Sattar Vakili; Laura Toni; Alberto Bernacchia; |
| 423 | Information-Theoretic Measures on Lattices for Higher-Order Interactions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Here we present a systematic framework based on lattice theory to derive higher-order information-theoretic measures for multivariate data. |
Zhaolu Liu; Mauricio Barahona; Robert Peach; |
| 424 | An Empirical Bernstein Inequality for Dependent Data in Hilbert Spaces and Applications Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce data-dependent Bernstein inequalities tailored for vector-valued processes in Hilbert space. |
Erfan Mirzaei; Andreas Maurer; Vladimir R Kostic; Massimiliano Pontil; |
| 425 | Nyström Kernel Stein Discrepancy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, the typical U- and V-statistic-based KSD estimators suffer from a quadratic runtime complexity, which hinders their application in large-scale settings. In this work, we propose a Nyström-based KSD acceleration (with runtime $\mathcal{O} \left(mn+m^3\right)$ for $n$ samples and $m\ll n$ Nyström points), show its $\sqrt{n}$-consistency with a classical sub-Gaussian assumption, and demonstrate its applicability for goodness-of-fit testing on a suite of benchmarks. |
Florian Kalinke; Zoltán Szabó; Bharath Sriperumbudur; |
| 426 | Neural Point Processes for Pixel-wise Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Traditional regression methods based on mean squared error emphasize pixels with labels, leading to distorted predictions in unlabeled areas. To address this limitation, we introduce Neural Point Processes, a novel approach that combines 2D Gaussian Processes with neural networks to leverage spatial correlations between sparse labels on images. |
Chengzhi Shi; Gözde Özcan; Miquel Sirera Perelló; Yuanyuan Li; Nina Iftikhar Shamsi; Stratis Ioannidis; |
| 427 | Transformers Are Provably Optimal In-context Estimators for Wireless Communications Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The optimal solution to the ICE problem is a non-linear function of the underlying context. In this paper, we prove that, for a subclass of such problems, a single-layer softmax attention transformer (SAT) computes the optimal solution of the above estimation problem in the limit of large prompt length. |
Vishnu Teja Kunde; Vicram Rajagopalan; Chandra Shekhara Kaushik Valmeekam; Krishna Narayanan; Jean-Francois Chamberland; Dileep Kalathil; Srinivas Shakkottai; |
| 428 | Distance Estimation for High-Dimensional Discrete Distributions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper presents the first polynomial query distance estimator in the conditional sampling model ($\mathsf{COND}$). |
Kuldeep S. Meel; Gunjan Kumar; Yash Pote; |
| 429 | Paths and Ambient Spaces in Neural Loss Landscapes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we propose a novel approach to directly embed loss tunnels into the loss landscape of neural networks. |
Daniel Dold; Julius Kobialka; Nicolai Palm; Emanuel Sommer; David Rügamer; Oliver Dürr; |
| 430 | A Bias-Variance Decomposition for Ensembles Over Multiple Synthetic Datasets Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: As our theory predicts, multiple synthetic datasets often improve accuracy, while a single large synthetic dataset gives at best minimal improvement, showing that our insights are practically relevant. |
Ossi Räisä; Antti Honkela; |
| 431 | Statistical Inference for Feature Selection After Optimal Transport-based Domain Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel statistical method to statistically test FS reliability under DA, named SFS-DA (statistical FS-DA). |
Nguyen Thang Loi; Duong Tan Loc; Vo Nguyen Le Duy; |
| 432 | Pick-to-Learn and Self-Certified Gaussian Process Approximations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, P2L comes with limitations, including computational overhead, reliance on consistent data, and restriction to non-Bayesian settings. In this work, we overcome these challenges in general settings and employ the corresponding results to show that classical Gaussian process (GP) training procedures can be interpreted as instantiations of P2L, thus inheriting tight, self-certified bounds. |
Daniel Marks; Dario Paccagnan; |
| 433 | Do Regularization Methods for Shortcut Mitigation Work As Intended? Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, we demonstrate that these methods can sometimes overregularize, inadvertently suppressing causal features along with spurious ones. In this work, we analyze the theoretical mechanisms by which regularization mitigates shortcuts and explore the limits of its effectiveness. |
Haoyang Hong; Ioanna Papanikolaou; Sonali Parbhoo; |
| 434 | I-trustworthy Models. A Framework for Trustworthiness Evaluation of Probabilistic Classifiers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Grounded in the competence-based theory of trust, this work formalizes the I-trustworthy framework, a novel framework for assessing the trustworthiness of probabilistic classifiers for inference tasks by linking conditional calibration to trustworthiness. |
Ritwik Vashistha; Arya Farahi; |
| 435 | Noise-Aware Differentially Private Variational Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel method for noise-aware approximate Bayesian inference based on stochastic gradient variational inference which can also be applied to high-dimensional and non-conjugate models. |
Talal Alrawajfeh; Joonas Jälkö; Antti Honkela; |
| 436 | Importance-weighted Positive-unlabeled Learning for Distribution Shift Adaptation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a distribution shift adaptation method for PU learning without assuming shift types by using a few PU data in the test distribution and PU data in the training distribution. |
Atsutoshi Kumagai; Tomoharu Iwata; Hiroshi Takahashi; Taishi Nishiyama; Yasuhiro Fujiwara; |
| 437 | Meta-learning from Heterogeneous Tensors for Few-shot Tensor Completion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose neural network-based models for tensor completion in few observation settings. |
Tomoharu Iwata; Atsutoshi Kumagai; |
| 438 | Generalization Bounds for Dependent Data Using Online-to-Batch Conversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we give generalization bounds of statistical learning algorithms trained on samples drawn from a dependent data source both in expectation and with high probability, using the Online-to-Batch conversion paradigm. |
Sagnik Chatterjee; Manuj Mukherjee; Alhad Sethi; |
| 439 | Best-Arm Identification in Unimodal Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose modifications of Track-and-Stop and a Top Two algorithm that leverage the unimodal structure. |
Riccardo Poiani; Marc Jourdan; Emilie Kaufmann; Rémy Degenne; |
| 440 | No-Regret Bayesian Optimization with Stochastic Observation Failures Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose two algorithms that have a trade-off relation between regret bounds and practical performance. |
Shogo Iwazaki; Tomohiko Tanabe; Mitsuru Irie; Shion Takeno; Kota Matsui; Yu Inatsu; |
| 441 | Tamed Langevin Sampling Under Weaker Conditions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This set of assumptions greatly exceeds the operational limits of the "vanilla" ULA, making sampling from such distributions a highly involved affair. To account for this, we introduce a taming scheme which is tailored to the growth and decay properties of the target distribution, and we provide explicit non-asymptotic guarantees for the proposed sampler in terms of the KL divergence, total variation, and Wasserstein distance to the target distribution. |
Iosif Lytras; Panayotis Mertikopoulos; |
| 442 | Provable Benefits of Task-Specific Prompts for In-context Learning Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we consider a novel setting where the global task distribution can be partitioned into a union of conditional task distributions. |
Xiangyu Chang; Yingcong Li; Muti Kara; Samet Oymak; Amit Roy-Chowdhury; |
| 443 | Locally Private Sampling with Public Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Most LDP methods assume that users possess only a single data record, which is a significant limitation since users often gather extensive datasets (e.g., images, text, time-series data) and frequently have access to public datasets. To address this limitation, we propose a locally private sampling framework that leverages both the private and public datasets of each user. |
Behnoosh Zamanlooy; Mario Diaz; Shahab Asoodeh; |
| 444 | TempTest: Local Normalization Distortion and The Detection of Machine-generated Text Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: As language models mimic the distribution of human text ever closer, this will limit our ability to build effective detection algorithms. To combat this, we introduce a method for detecting machine-generated text that is entirely agnostic of the generating language model. |
Tom Kempton; Stuart Burrell; Connor J Cheverall; |
| 445 | Zero-Shot Action Generalization with Limited Observations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a novel zero-shot framework, Action Generalization from Limited Observations (AGLO). |
Abdullah Alchihabi; Hanping Zhang; Yuhong Guo; |
| 446 | Local Stochastic Sensitivity Analysis For Dynamical Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our main contribution is the extension of adjoint-based a posteriori analysis for differential operators of generic dynamical systems acting on states to the Liouville operator acting on probability densities of the states. |
Nishant Panda; Jehanzeb H Chaudhry; Natalie Klein; James Carzon; Troy Butler; |
| 447 | SINE: Scalable MPE Inference for Probabilistic Graphical Models Using Advanced Neural Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce two methods to overcome discretization challenges: (1) an external oracle-based approach that infers uncertain variables using additional evidence from confidently predicted ones, and (2) a technique that identifies and selects the highest-scoring discrete solutions near the continuous output. |
Shivvrat Arya; Tahrima Rahman; Vibhav Giridhar Gogate; |
| 448 | Robust Classification By Coupling Data Mollification with Label Smoothing Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Inspired by the success of generative diffusion models, we propose a novel approach of coupling data mollification, in the form of image noising and blurring, with label smoothing to align predicted label confidences with image degradation. |
Markus Heinonen; Ba-Hien Tran; Michael Kampffmeyer; Maurizio Filippone; |
| 449 | Entropic Matching for Expectation Propagation of Markov Jump Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel, tractable latent state inference scheme for Markov jump processes, for which exact inference is often intractable. |
Yannick Eich; Bastian Alt; Heinz Koeppl; |
| 450 | Near-Optimal Algorithm for Non-Stationary Kernelized Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, this existing algorithm suffers from feasibility issues due to its huge computational cost. Therefore, we propose a novel near-optimal algorithm called restarting phased elimination with random permutation (R-PERP), which bypasses the huge computational cost. |
Shogo Iwazaki; Shion Takeno; |
| 451 | Efficient Trajectory Inference in Wasserstein Space Using Consecutive Averaging Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this work, we propose methods for B-spline approximation and interpolation of point clouds through consecutive averaging that is intrinsic to the Wasserstein space. |
Amartya Banerjee; Harlin Lee; Nir Sharon; Caroline Moosmüller; |
| 452 | Algorithmic Accountability in Small Data: Sample-Size-Induced Bias Within Classification Metrics Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We provide analyses of the bias that appears in several commonly applied metrics and propose a model-agnostic assessment and correction technique. |
Jarren Briscoe; Garrett Kepler; Daryl Robert DeFord; Assefaw Gebremedhin; |
| 453 | DPFL: Decentralized Personalized Federated Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce decentralized personalized FL (DPFL), a bi-level optimization framework that enhances personalized FL by leveraging combinatorial relationships among clients, enabling fine-grained and targeted collaborations. |
Salma Kharrat; Marco Canini; Samuel Horváth; |
| 454 | Optimal Multi-Objective Best Arm Identification with Fixed Confidence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an algorithm that uses the novel idea of {\em surrogate proportions} to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step. |
Zhirui Chen; P. N. Karthik; Yeow Meng Chee; Vincent Y. F. Tan; |
| 455 | Optimistic Safety for Online Convex Optimization with Unknown Linear Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the problem of online convex optimization (OCO) under unknown linear constraints that are either static, or stochastically time-varying. For this problem, we introduce an algorithm that we term Optimistically Safe OCO (OSOCO) and show that it enjoys $\tilde{O}(\sqrt{T})$ regret and no constraint violation. |
Spencer Hutchinson; Tianyi Chen; Mahnoosh Alizadeh; |
| 456 | Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a double machine learning (DML) algorithm for mediation analysis that supports continuous treatments. |
Houssam Zenati; Judith Abécassis; Julie Josse; Bertrand Thirion; |
| 457 | Contractivity and Linear Convergence in Bilinear Saddle-point Problems: An Operator-theoretic Approach Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the convex-concave bilinear saddle-point problem $\min_x \max_y f(x) + y^\top Ax - g(y)$, where both, only one, or none of the functions $f$ and $g$ are strongly convex, and suitable rank conditions on the matrix $A$ hold. |
Colin Dirren; Mattia Bianchi; Panagiotis D. Grontas; John Lygeros; Florian Dorfler; |
| 458 | Some Targets Are Harder to Identify Than Others: Quantifying The Target-dependent Membership Leakage Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper explains the target-dependent hardness of membership attacks by studying the powers of the optimal attacks in a \emph{fixed-target} MI game. |
Achraf Azize; Debabrota Basu; |
| 459 | Differentially Private Range Queries with Correlated Input Perturbation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This work proposes a class of differentially private mechanisms for linear queries, in particular range queries, that leverages correlated input perturbation to simultaneously achieve unbiasedness, consistency, statistical transparency, and control over utility requirements in terms of accuracy targets expressed either in certain query margins or as implied by the hierarchical database structure. |
Prathamesh Dharangutte; Jie Gao; Ruobin Gong; Guanyang Wang; |
| 460 | Variational Combinatorial Sequential Monte Carlo for Bayesian Phylogenetics in Hyperbolic Space Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Our approach introduces consistent and unbiased estimators, along with variational inference methods (\textsc{H-Vcsmc} and \textsc{H-Vncsmc}), which outperform their Euclidean counterparts. |
Alex Chen; Philippe Chlenski; Kenneth Munyuza; Antonio Khalil Moretti; Christian A. Naesseth; Itsik Pe'er; |
| 461 | Adversarially-Robust TD Learning with Markovian Data: Finite-Time Rates and Fundamental Limits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, these guarantees hinge on the reward observations being always generated from a well-behaved (e.g., sub-Gaussian) true reward distribution. Motivated by harsh, real-world environments where such an idealistic assumption may no longer hold, we revisit the policy evaluation problem from the perspective of \emph{adversarial robustness}. |
Sreejeet Maity; Aritra Mitra; |
| 462 | Robust Score Matching Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we utilize the geometric median of means to develop a robust score matching procedure that yields consistent parameter estimates in settings where the observed data has been contaminated. |
Richard Schwank; Andrew McCormack; Mathias Drton; |
| 463 | Bayesian Principles Improve Prompt Learning In Vision-Language Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, most existing methods suffer from overfitting to fine-tuning data, yielding poor generalizability. To address this, we propose a new training objective function based on a Bayesian learning principle to balance adaptability and generalizability. |
Mingyu Kim; Jongwoo Ko; Mijung Park; |
| 464 | S-CFE: Simple Counterfactual Explanations Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we tackle the canonical formulation using the accelerated proximal gradient (APG) method, a simple yet efficient first-order procedure capable of handling smooth non-convex objectives and non-smooth $\ell_p$ (where $0 \leq p < 1$) regularizers. |
Shpresim Sadiku; Moritz Wagner; Sai Ganesh Nagarajan; Sebastian Pokutta; |
| 465 | Strong Screening Rules for Group-based SLOPE Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We develop strong screening rules for group-based Sorted L-One Penalized Estimation (SLOPE) models: Group SLOPE and Sparse-group SLOPE. |
Fabio Feser; Marina Evangelou; |
| 466 | Improving Pre-trained Self-Supervised Embeddings Through Effective Entropy Maximization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we motivate an effective entropy maximization criterion (E2MC), defined in terms of easy-to-estimate, low-dimensional constraints. |
Deep Chakraborty; Yann LeCun; Tim G. J. Rudner; Erik Learned-Miller; |
| 467 | Signature Isolation Forest Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Such a linear inner product and dictionary are a priori choices that strongly influence the algorithm’s performance and might lead to unreliable results, particularly with complex datasets. This work aims to address such challenges by introducing Signature Isolation Forest, a novel class of AD algorithms leveraging the signature transform arising from rough path theory. |
Marta Campi; Guillaume Staerman; Gareth W. Peters; Tomoko Masui; |
| 468 | Reinforcement Learning for Adaptive MCMC Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The aim of this paper is to set out a general framework, called \emph{Reinforcement Learning Metropolis–Hastings}, that is theoretically supported and empirically validated. |
Congye Wang; Wilson Ye Chen; Heishiro Kanagawa; Chris J. Oates; |
| 469 | Harnessing Causality in Reinforcement Learning with Bagged Decision Times Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our goal is to develop an online RL algorithm to maximize the discounted sum of the bag-specific rewards. |
Daiqi Gao; Hsin-Yu Lai; Predrag Klasnja; Susan Murphy; |
| 470 | Stein Boltzmann Sampling: A Variational Approach for Global Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we present a deterministic particle-based method for global optimization of continuous Sobolev functions, called \emph{Stein Boltzmann Sampling} (SBS). |
Gaëtan Serré; Argyris Kalogeratos; Nicolas Vayatis; |
| 471 | Bandit Pareto Set Identification in A Multi-Output Linear Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce and analyze the first optimal design-based algorithms for PSI, providing nearly optimal guarantees in both the fixed-budget and the fixed-confidence settings. |
Cyrille Kone; Emilie Kaufmann; Laura Richert; |
| 472 | Axiomatic Explainer Globalness Via Optimal Transport Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we define a complexity measure for explainers, globalness, which enables deeper understanding of the distribution of explanations produced by feature attribution and feature selection methods for a given dataset. |
Davin Hill; Joshua Bone; Aria Masoomi; Max Torop; Jennifer Dy; |
| 473 | High-probability Convergence Bounds for Online Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study high-probability convergence in online learning, in the presence of heavy-tailed noise. |
Aleksandar Armacki; Shuhua Yu; Pranay Sharma; Gauri Joshi; Dragana Bajovic; Dusan Jakovetic; Soummya Kar; |
| 474 | Factor Analysis with Correlated Topic Model for Multi-Modal Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, FA is not suited for structured data modalities, such as text or single cell sequencing data, where multiple data points are measured per each sample and exhibit a clustering structure. To overcome this challenge, we introduce FACTM, a novel, multi-view and multi-structure Bayesian model that combines FA with correlated topic modeling and is optimized using variational inference. |
Malgorzata Lazecka; Ewa Maria Szczurek; |
| 475 | Personalized Convolutional Dictionary Learning of Physiological Time Series Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In particular, we propose Personalized CDL (PerCDL), in which a local dictionary models local information as a personalized spatiotemporal transformation of a global dictionary. |
Axel Roques; Samuel Gruffaz; Kyurae Kim; Alain Oliviero Durmus; Laurent Oudre; |
| 476 | QuACK: A Multipurpose Queuing Algorithm for Cooperative $k$-Armed Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Rather than adapting a specific single-agent algorithm, we propose a general-purpose black-box reduction that extends any single-agent algorithm to the multi-agent setting. |
Benjamin Howson; Sarah Lucie Filippi; Ciara Pike-Burke; |
| 477 | Distribution-Aware Mean Estimation Under User-level Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the problem of mean estimation under user-level local differential privacy, where $n$ users are contributing through their local pool of data samples. |
Corentin Pla; Maxime Vono; Hugo Richard; |
| 478 | A Subquadratic Time Approximation Algorithm for Individually Fair K-Center Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the $k$-center problem in the context of individual fairness. |
Matthijs Ebbens; Nicole Funk; Jan Höckendorff; Christian Sohler; Vera Weil; |
| 479 | Spectral Differential Network Analysis for High-Dimensional Time Series Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We illustrate the method on synthetic data experiments and on experiments with electroencephalography data. |
Michael Hellstern; Byol Kim; Zaid Harchaoui; Ali Shojaie; |
| 480 | Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we revisit the convergence properties of LocalSGD and SCAFFOLD under a variety of existing or weaker conditions, including gradient similarity, Hessian similarity, weak convexity, and Lipschitz continuity of the Hessian. |
Ruichen Luo; Sebastian U Stich; Samuel Horváth; Martin Takáč; |
| 481 | Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the problem of Non-Stationary Reinforcement Learning (NS-RL) without prior knowledge about the system’s non-stationarity. |
Argyrios Gerogiannis; Yu-Han Huang; Venugopal Veeravalli; |
| 482 | New User Event Prediction Through The Lens of Causal Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose a novel discrete event prediction framework for new users with limited history, without needing to know the user’s category. |
Henry Yuchi; Shixiang Zhu; Li Dong; Yigit M. Arisoy; Matthew C. Spencer; |
| 483 | TVineSynth: A Truncated C-Vine Copula Generator of Synthetic Tabular Data to Balance Privacy and Utility Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose TVineSynth, a vine copula based synthetic tabular data generator, which is designed to balance privacy and utility, using the vine tree structure and its truncation to do the trade-off. |
Elisabeth Griesbauer; Claudia Czado; Arnoldo Frigessi; Ingrid Hobæk Haff; |
| 484 | QPOTS: Efficient Batch Multiobjective Bayesian Optimization Via Pareto Optimal Thompson Sampling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This, ultimately, affects their sample efficiency. To overcome these challenges, we propose a Thompson sampling (TS) based approach ($q\texttt{POTS}$). |
Ashwin Renganathan; Kade Carlson; |
| 485 | Privacy in Metalearning and Multitask Learning: Modeling and Separations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we undertake a systematic study of differentially private personalized learning. |
Maryam Aliakbarpour; Konstantina Bairaktari; Adam Smith; Marika Swanberg; Jonathan Ullman; |
| 486 | Posterior Mean Matching: Generative Modeling Through Online Bayesian Inference Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces posterior mean matching (PMM), a new method for generative modeling that is grounded in Bayesian inference. |
Sebastian Salazar; Michal Kucer; Yixin Wang; Emily Casleton; David Blei; |
| 487 | On The Consistent Recovery of Joint Distributions from Conditionals Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These approaches learn conditional distributions $p(x_T \mid x_S)$ simultaneously, where $x_S$ and $x_T$ are subsets of the observed variables. In this paper, we examine the core problem of when all these conditional distributions are consistent with some joint distribution, and whether common models used in practice can learn consistent conditionals. |
Mahbod Majid; Rattana Pukdee; Vishwajeet Agrawal; Burak Varici; Pradeep Kumar Ravikumar; |
| 488 | Transfer Neyman-Pearson Algorithm for Outlier Detection Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a general algorithmic approach which is shown theoretically to yield strong guarantees w.r.t. a range of changes in the abnormal distribution, while at the same time being amenable to practical implementation. |
Mohammadreza Mousavi Kalan; Eitan J. Neugut; Samory Kpotufe; |
| 489 | Deep Clustering Via Probabilistic Ratio-Cut Optimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose a novel approach for optimizing the graph ratio-cut by modeling the binary assignments as random variables. |
Ayoub Ghriss; Claire Monteleoni; |
| 490 | Certifiably Quantisation-Robust Training and Inference of Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In particular, we pose the problem of bounding the worst-case discrepancy between the original neural network and all possible quantised ones parametrised by a given maximum quantisation diameter $\epsilon > 0$ over a finite dataset. |
Hue Dang; Matthew Robert Wicker; Goetz Botterweck; Andrea Patane; |
| 491 | Clustering Context in Off-Policy Evaluation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose an alternative estimator that shares information across similar contexts using clustering. |
Daniel Guzman Olivares; Philipp Schmidt; Jacek Golebiowski; Artur Bekasov; |
| 492 | A Convex Relaxation Approach to Generalization Analysis for Parallel Positively Homogeneous Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a general framework for deriving generalization bounds for parallel positively homogeneous neural networks–a class of neural networks whose input-output map decomposes as the sum of positively homogeneous maps. |
Uday Kiran Reddy Tadipatri; Benjamin David Haeffele; Joshua Agterberg; Rene Vidal; |
| 493 | Additive Model Boosting: New Insights and Path(ologie)s Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study the solution paths of BAMs and establish connections with other approaches for certain classes of problems. |
Rickmer Schulte; David Rügamer; |
| 494 | Cost-aware Simulation-based Inference Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: A significant challenge in this context is the large computational cost of simulating data from complex models, and the fact that this cost often depends on parameter values. We therefore propose \emph{cost-aware SBI methods} which can significantly reduce the cost of existing sampling-based SBI methods, such as neural SBI and approximate Bayesian computation. |
Ayush Bharti; Daolang Huang; Samuel Kaski; Francois-Xavier Briol; |
| 495 | A Family of Distributions of Random Subsets for Controlling Positive and Negative Dependence Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this study, we introduce a new family of distributions, named the discrete kernel point process (DKPP), which includes determinantal point processes and parts of Boltzmann machines. |
Takahiro Kawashima; Hideitsu Hino; |
| 496 | Flexible Copula-Based Mixed Models in Deep Learning: A Scalable Approach to Arbitrary Marginals Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce copula-based neural networks (COPNN), a novel framework that extends beyond the limitations of Gaussian marginals for random effects in mixed models. |
Giora Simchoni; Saharon Rosset; |
| 497 | Bayesian Inference in Recurrent Explicit Duration Switching Linear Dynamical Systems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel model called Recurrent Explicit Duration Switching Linear Dynamical Systems (REDSLDS) that incorporates recurrent explicit duration variables into the rSLDS model. |
Mikolaj Slupinski; |
| 498 | Estimation of Large Zipfian Distributions with Sort and Snap Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, we assume observations arrive from a known alphabet, and with a known decay rate parametrizing the Zipfian, but we do not know a priori which alphabet elements have larger probability than others. We present a novel Sort and Snap estimator, which uses the empirical proportions to sort the alphabet, and then snaps them to the associated term from the Zipfian distribution. |
Peter Matthew Jacobs; Anirban Bhattacharya; Debdeep Pati; Lekha Patel; Jeff M. Phillips; |
| 499 | Harnessing The Power of Vicinity-Informed Analysis for Classification Under Covariate Shift Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper introduces a novel dissimilarity measure that utilizes vicinity information, i.e., the local structure of data points, to analyze the excess error in classification under covariate shift, a transfer learning setting where marginal feature distributions differ but conditional label distributions remain the same. |
Mitsuhiro Fujikawa; Youhei Akimoto; Jun Sakuma; Kazuto Fukuchi; |
| 500 | The Pivoting Framework: Frank-Wolfe Algorithms with Active Set Size Control Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the pivoting meta algorithm (PM) to enhance optimization algorithms that generate iterates as convex combinations of vertices of a feasible region $C\subseteq \mathbb{R}^n$, including Frank-Wolfe (FW) variants. |
Mathieu Besançon; Sebastian Pokutta; Elias Samuel Wirth; |
| 501 | Accelerated Methods for Riemannian Min-Max Optimization Ensuring Bounded Geometric Penalties Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we study optimization problems of the form $\min_x \max_y f(x, y)$, where $f(x, y)$ is defined on a product Riemannian manifold $\mathcal{M} \times \mathcal{N}$ and is $\mu_x$-strongly geodesically convex (g-convex) in $x$ and $\mu_y$-strongly g-concave in $y$, for $\mu_x, \mu_y \geq 0$. |
David Martínez-Rubio; Christophe Roux; Christopher Criscitiello; Sebastian Pokutta; |
| 502 | Scalable Inference for Bayesian Multinomial Logistic-Normal Dynamic Linear Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Each observation is a multivariate count vector, where the total counts are arbitrary, and the information lies in the relative frequency of the counts. Multiple authors have proposed Bayesian Multinomial Logistic-Normal Dynamic Linear Models (MLN-DLMs) as a flexible approach to modeling these data. |
Manan Saxena; Tinghua Chen; Justin D Silverman; |
| 503 | Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To address this gap, we propose the Signed Graph Archetypal Autoencoder (SGAAE) framework. Additionally, we introduce the 2-level network polarization problem and show how SGAAE is able to characterize such a setting. |
Nikolaos Nakis; Chrysoula Kosma; Giannis Nikolentzos; Michail Chatzianastasis; Iakovos Evdaimon; Michalis Vazirgiannis; |
| 504 | Reliable and Scalable Variable Importance Estimation Via Warm-start and Early Stopping Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: When the number of variables is large, estimating variable importance presents a significant challenge because re-training neural networks or other black-box algorithms requires significant additional computation. In this paper, we address this challenge for algorithms using gradient descent and gradient boosting (e.g. neural networks, gradient-boosted decision trees). |
Zexuan Sun; Garvesh Raskutti; |
| 505 | Learning The Pareto Front Using Bootstrapped Observation Samples Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our key contribution is a new estimator that in every round updates the estimate for the unknown parameter along \emph{multiple} context directions – in contrast to the conventional estimator that only updates the parameter estimate along the chosen context. |
Wonyoung Kim; Garud Iyengar; Assaf Zeevi; |
| 506 | Optimal Estimation of Linear Non-Gaussian Structural Equation Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Specifically, it introduces a structure recovery algorithm using distance covariance that achieves the optimal sample complexity, $n = \Theta(d_{in} \log \frac{p}{d_{in}})$, without assuming faithfulness or a known indegree. |
Sunmin Oh; Seungsu Han; Gunwoong Park; |
| 507 | Ordered $\mathcal{V}$-information Growth: A Fresh Perspective on Shared Information Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Focusing on the mixed setting (continuous $X$ and discrete $Y$), here we highlight the upward bias of empirical $\mathcal{V}$-information, $\hat I_{\mathcal{V}}(X \rightarrow Y)$, even when $\mathcal{V}$ is low-complexity (e.g., shallow neural networks). To mitigate this bias, we introduce $\mathcal{V}$-Information Growth (VI-Growth), defined as $\hat I_{\mathcal{V}}(X \rightarrow Y) - \hat I_{\mathcal{V}}(X' \rightarrow Y')$, where $X', Y' \sim P_X P_Y$ represent independent variables. |
Rohan Ghosh; Mehul Motani; |
| 508 | Get Rid of Your Constraints and Reparametrize: A Study in NNLS and Implicit Bias Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Several works observed that when training linear diagonal networks on the square loss for regression tasks (which corresponds to overparametrized linear regression) gradient descent converges to special solutions, e.g., non-negative ones. |
Hung-Hsu Chou; Johannes Maly; Claudio Mayrink Verdun; Bernardo Freitas Paulo da Costa; Heudson Mirandola; |
| 509 | Collaborative Non-parametric Two-sample Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose the non-parametric collaborative two-sample testing (CTST) framework that efficiently leverages the graph structure and minimizes the assumptions over $p_v$ and $q_v$. |
Alejandro David De la Concha Duarte; Nicolas Vayatis; Argyris Kalogeratos; |
| 510 | Bayesian Decision Theory on Decision Trees: Uncertainty Evaluation and Interpretability Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a novel MH algorithm where the leaf parameters and the tree shape are marginalized out by using the meta-trees and only the inner parameters are sampled. |
Yuta Nakahara; Shota Saito; Naoki Ichijo; Koki Kazama; Toshiyasu Matsushima; |
| 511 | Learning-Augmented Algorithms for Online Concave Packing and Convex Covering Problems Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present learning-augmented algorithmic frameworks for two fundamental optimization settings, extending and generalizing prior works. |
Elena Grigorescu; Young-San Lin; Maoyuan Song; |
| 512 | Model Selection for Behavioral Learning Data and Applications to Contextual Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose two model selection methods: a general hold-out procedure and an AIC-type criterion, both adapted to non-stationary dependent data. |
Julien Aubert; Louis Köhler; Luc Lehéricy; Giulia Mezzadri; Patricia Reynaud-Bouret; |
| 513 | Posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: When designing a new inference algorithm, whether it involves Monte Carlo sampling or variational approximation, the fundamental problem is evaluating its accuracy and efficiency across a range of representative target posteriors. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. |
Måns Magnusson; Jakob Torgander; Paul-Christian Bürkner; Lu Zhang; Bob Carpenter; Aki Vehtari; |
| 514 | Class Imbalance in Anomaly Detection: Learning from An Exactly Solvable Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We provide a theoretical framework to analyze, interpret and address CI. |
Francesco Saverio Pezzicoli; Valentina Ros; François P. Landes; Marco Baity-Jesi; |
| 515 | SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models Via Subgraph Search Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose SubSearch, an algorithm for robustly estimating SBM parameters by exploring the space of subgraphs in search of one that closely aligns with the model’s assumptions. |
Leonardo Bianco; Christine Keribin; Zacharie Naulet; |
| 516 | Representer Theorems for Metric and Preference Learning: Geometric Insights and Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We develop a mathematical framework to address a broad class of metric and preference learning problems within a Hilbert space. |
Peyman Morteza; |
| 517 | The VampPrior Mixture Model Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Widely used deep latent variable models (DLVMs), in particular Variational Autoencoders (VAEs), employ overly simplistic priors on the latent space. |
Andrew A. Stirn; David A. Knowles; |
| 518 | MEDUSA: Medical Data Under Shadow Attacks Via Hybrid Model Inversion Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce MEDUSA (Medical Data Under Shadow Attacks), a novel hybrid model inversion framework that leverages gradient-based optimization and TCNNs to reconstruct high-fidelity medical images from model outputs in a gray-box setting. |
Asfandyar Azhar; Paul Thielen; Curtis Langlotz; |
| 519 | Differentially Private Algorithms for Linear Queries Via Stochastic Convex Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This article establishes a method to answer a finite set of linear queries on a given dataset while ensuring differential privacy. |
Giorgio Micali; Clement LEZANE; Annika Betken; |
| 520 | Adaptive Convergence Rates for Log-Concave Maximum Likelihood Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the task of estimating a log-concave density in $\mathbb{R}^d$ using the Maximum Likelihood Estimator, known as the log-concave MLE. |
Gil Kur; Aditya Guntuboyina; |
| 521 | Differentiable Calibration of Inexact Stochastic Simulation Models Via Kernel Score Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose to learn differentiable input parameters of stochastic simulation models using output-level data via kernel score minimization with stochastic gradient descent. |
Ziwei Su; Diego Klabjan; |
| 522 | Dynamic DBSCAN with Euler Tour Sequences Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a fast and dynamic algorithm for Density-Based Spatial Clustering of Applications with Noise (DBSCAN) that efficiently supports online updates. |
Seiyun Shin; Ilan Shomorony; Peter Macgregor; |
| 523 | Multi-agent Multi-armed Bandit Regret Complexity and Optimality Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: These lower bounds are made possible through our newly constructed instances. In the numerical study, we assess the performance of various algorithms on these hard instances. |
Mengfan Xu; Diego Klabjan; |
| 524 | Bayesian Circular Regression with Von Mises Quasi-Processes Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we explore a family of expressive and interpretable distributions over circle-valued random functions related to Gaussian processes targeting two Euclidean dimensions conditioned on the unit circle. |
Yarden Cohen; Alexandre Khae Wu Navarro; Jes Frellsen; Richard E. Turner; Raziel Riemer; Ari Pakman; |
| 525 | Cross Validation for Correlated Data in Classification Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a methodology for model evaluation and selection in binary classification models with the presence of correlations in the data, where the sampling mechanism violates the i.i.d. assumption. |
Oren Yuval; Saharon Rosset; |
| 526 | Adaptive RKHS Fourier Features for Compositional Gaussian Process Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In particular, we introduce Ordinary Differential Equation (ODE)-based RKHS Fourier features that allow for adaptive amplitude and phase modulation through convolution operations. |
Xinxing Shi; Thomas Baldwin-McDonald; Mauricio A Álvarez; |
| 527 | On Tractability of Learning Bayesian Networks with Ancestral Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Taking the path cover number of the constraint graph as a parameter, we extend earlier results to the problems of sampling and weighted counting of network structures. |
Juha Harviainen; Pekka Parviainen; |
| 528 | Learning Graph Node Embeddings By Smooth Pair Sampling Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: Inspired by observations on real data, we take a different approach and propose a new regularization technique. |
Konstantin Kutzkov; |
| 529 | Approximating The Total Variation Distance Between Gaussians Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, somewhat surprisingly, questions about computing it \emph{algorithmically} appear not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying this question in the important special case of multivariate Gaussians. |
Arnab Bhattacharyya; Weiming Feng; Piyush Srivastava; |
| 530 | Model Evaluation in The Dark: Robust Classifier Metrics with Missing Labels Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a multiple imputation technique for evaluating classifiers using metrics such as precision, recall, and ROC-AUC. |
Danial Dervovic; Michael Cashmore; |
| 531 | Change Point Detection in Hadamard Spaces By Alternating Minimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a computationally efficient two-step iterative optimization algorithm called HOP (Hadamard Optimal Partitioning) that detects changes in the sequence of so-called Fréchet means. |
Anica Kostic; Vincent Runge; Charles Truong; |
| 532 | StableMDS: A Novel Gradient Descent-Based Method for Stabilizing and Accelerating Weighted Multidimensional Scaling Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: To mitigate the computational challenge, we introduce StableMDS, a novel gradient descent-based method that reduces the computational complexity to $\mathcal{O}(n^2 p)$ per iteration. |
Zhongxi Fang; Xun Su; Tomohisa Tabuchi; Jianming Huang; Hiroyuki Kasai; |
| 533 | Bayes Without Underfitting: Fully Correlated Deep Learning Posteriors Via Alternating Projections Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: For linearized models, the null space of the generalized Gauss-Newton matrix corresponds to parameters that preserve the training predictions of the point estimate. We propose to build Bayesian approximations in this null space, thereby guaranteeing that the Bayesian predictive does not underfit. |
Marco Miani; Hrittik Roy; Søren Hauberg; |
| 534 | Robust Gradient Descent for Phase Retrieval Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we investigate an approach that leverages robust gradient descent techniques to improve the Wirtinger Flow algorithm’s ability to simultaneously cope with fourth moment bounded noise and adversarial contamination in both the inputs (covariates) and outputs (responses). |
Alex Buna; Patrick Rebeschini; |
| 535 | Data-Driven Upper Confidence Bounds with Near-Optimal Regret for Heavy-Tailed Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose a new distribution-free, data-driven UCB algorithm for symmetric reward distributions, which needs no moment information. |
Ambrus Tamás; Szabolcs Szentpéteri; Balázs Csáji; |
| 536 | Disentangling Interactions and Dependencies in Feature Attributions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we derive DIP, a new mathematical decomposition of individual feature importance scores that disentangles three components: the standalone contribution and the contributions stemming from interactions and dependencies. |
Gunnar König; Eric Günther; Ulrike von Luxburg; |
| 537 | Tensor Network-Constrained Kernel Machines As Gaussian Processes Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper we establish a new connection between Tensor Network (TN)-constrained kernel machines and Gaussian Processes (GPs). |
Frederiek Wesel; Kim Batselier; |
| 538 | Classification of High-dimensional Time Series in Spectral Domain Using Explainable Features with Applications to Neuroimaging Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a model-based approach for classifying high-dimensional stationary time series by assuming sparsity in the difference between spectra. |
Sarbojit Roy; Malik Shahid Sultan; Tania Reyes Vallejo; Leena Ali Ibrahim; Hernando Ombao; |
| 539 | Incremental Uncertainty-aware Performance Monitoring with Active Labeling Intervention Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We study the problem of monitoring machine learning models under gradual distribution shifts, where circumstances change slowly over time, often leading to unnoticed yet significant declines in accuracy. To address this, we propose Incremental Uncertainty-aware Performance Monitoring (IUPM), a novel label-free method that estimates performance changes by modeling gradual shifts using optimal transport. |
Alexander Koebler; Thomas Decker; Ingo Thon; Volker Tresp; Florian Buettner; |
| 540 | Efficient Exploitation of Hierarchical Structure in Sparse Reward Reinforcement Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Under the assumption of a sparse reward function and known hierarchical decomposition, we propose a new algorithm to learn optimal hierarchical policies. |
Gianluca Drappo; Arnaud Robert; Marcello Restelli; Aldo A. Faisal; Alberto Maria Metelli; Ciara Pike-Burke; |
| 541 | MDP Geometry, Normalization and Reward Balancing Solvers Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present a new geometric interpretation of Markov Decision Processes (MDPs) with a natural normalization procedure that allows us to adjust the value function at each state without altering the advantage of any action with respect to any policy. |
Arsenii Mustafin; Aleksei Pakharev; Alex Olshevsky; Ioannis Paschalidis; |
| 542 | Asynchronous Decentralized Optimization with Constraints: Achievable Speeds of Convergence for Directed Graphs Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a novel decentralized convex optimization algorithm called ASY-DAGP, where each agent has its own distinct objective function and constraint set. |
Firooz Shahriari-Mehr; Ashkan Panahi; |
| 543 | Sparse Activations As Conformal Predictors Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we uncover a novel connection between conformal prediction and sparse "softmax-like" transformations, such as sparsemax and $\gamma$-entmax (with $\gamma> 1$), which assign nonzero probability only to some labels. |
Margarida M Campos; João Cálem; Sophia Sklaviadis; Mario A. T. Figueiredo; Andre Martins; |
| 544 | A Safe Bayesian Learning Algorithm for Constrained MDPs with Bounded Constraint Violation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In the recent literature, progress has been made on this very challenging problem, but existing approaches either rely on unsatisfactory assumptions, such as knowledge of a safe policy, or incur high cumulative regret. We propose the Safe-PSRL (posterior sampling-based RL) algorithm that does not need such assumptions and yet performs very well, both in terms of theoretical regret bounds and empirically. |
Krishna C Kalagarla; Rahul Jain; Pierluigi Nuzzo; |
| 545 | Is Gibbs Sampling Faster Than Hamiltonian Monte Carlo on GLMs? Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper we challenge the presumed domination of HMC for the Bayesian analysis of GLMs. |
Son Luu; Zuheng Xu; Nikola Surjanovic; Miguel Biron-Lattes; Trevor Campbell; Alexandre Bouchard-Cote; |
| 546 | Meta-learning Task-specific Regularization Weights for Few-shot Linear Regression Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a few-shot learning method for linear regression, which learns how to choose regularization weights from multiple tasks with different feature spaces, and uses the knowledge for unseen tasks. |
Tomoharu Iwata; Atsutoshi Kumagai; Yasutoshi Ida; |
| 547 | Out-of-distribution Robustness for Multivariate Analysis Via Causal Regularisation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose a regularisation strategy of classical machine learning algorithms rooted in causality that ensures robustness against distribution shifts. |
Homer Durand; Gherardo Varando; Nathan Mankovich; Gustau Camps-Valls; |
| 548 | Memorization in Attention-only Transformers Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We present a novel proof for language-based Transformers that extends the current hypothesis to any context size. |
Léo Dana; Muni Sreenivas Pydi; Yann Chevaleyre; |
| 549 | Safety in The Face of Adversity: Achieving Zero Constraint Violation in Online Learning with Slowly Changing Constraints Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We present the first theoretical guarantees for zero constraint violation in Online Convex Optimization (OCO) across all rounds, addressing dynamic constraint changes. |
Bassel Hamoud; Ilnura Usmanova; Kfir Yehuda Levy; |
| 550 | Sketch-and-Project Meets Newton Method: Global $O(1/k^2)$ Convergence with Low-Rank Updates Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we propose the first sketch-and-project Newton method with the fast $O(1/k^2)$ global convergence rate for self-concordant functions. |
Slavomir Hanzely; |
| 551 | Fixed-Budget Change Point Identification in Piecewise Constant Bandits Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner’s aim is to locate the change point. |
Joseph Lazzaro; Ciara Pike-Burke; |
| 552 | Tensor Network Based Feature Learning Model Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, another important aspect of model training — identifying optimal feature hyperparameters — has not been addressed and is typically handled using the standard cross-validation approach. In this paper, we introduce the Feature Learning (FL) model, which addresses this issue by representing tensor-product features as a learnable Canonical Polyadic Decomposition (CPD). |
Albert Saiapin; Kim Batselier; |
| 553 | Global Ground Metric Learning with Applications to ScRNA Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Yet, predefined metrics typically cannot account for the inherent structure and varying significance of different features in the data, and existing supervised ground metric learning methods often fail to generalize across multiple classes or are limited to distributions with shared supports. To address this issue, this paper introduces a novel approach for learning metrics for arbitrary distributions over a shared metric space. |
Damin Kühn; Michael T Schaub; |
| 554 | Narrowing The Gap Between Adversarial and Stochastic MDPs Via Policy Optimization Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an algorithm, called APO-MVP, that achieves a regret bound of order $\tilde{\mathcal{O}}(\mathrm{poly}(H)\sqrt{SAT})$, where $S$ and $A$ are sizes of the state and action spaces, respectively. |
Daniil Tiapkin; Evgenii Chzhen; Gilles Stoltz; |
| 555 | Covariance Selection Over Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we study covariance selection in a distributed setting, where data is spread across a network of agents. |
Wenfu Xia; Fengpei Li; Ying Sun; Ziping Zhao; |
| 556 | Natural Language Counterfactual Explanations for Graphs Using Large Language Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, these "what-if" explanations are frequently complex and technical, making them difficult for non-experts to understand and, more broadly, challenging for humans to interpret. To bridge this gap, in this work, we exploit the power of open-source Large Language Models to generate natural language explanations when prompted with valid counterfactual instances produced by state-of-the-art explainers for graph-based models. |
Flavio Giorgi; Cesare Campagnano; Fabrizio Silvestri; Gabriele Tolomei; |
| 557 | Weighted Sum of Gaussian Process Latent Variable Models Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Our key contribution is to augment Gaussian Process Latent Variable Models (GPLVMs) for the case where each data point comprises the weighted sum of a known number of pure component signals, observed across several input locations. |
James A C Odgers; Ruby Sedgwick; Chrysoula Dimitra Kappatou; Ruth Misener; Sarah Lucie Filippi; |
| 558 | Density-Dependent Group Testing Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work we initiate the study of a generalized model of group testing that accommodates the physical effects of dilution of infected samples in large pools. |
Rahil Morjaria; Saikiran Bulusu; Venkata Gandikota; Sidharth Jaggi; |
| 559 | A Novel Convex Gaussian Min Max Theorem for Repeated Features Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We prove a generalization of the CGMT to a family of problems in machine learning (ML) with correlated entries in the data matrix. |
David Bosch; Ashkan Panahi; |
| 560 | Unveiling The Role of Randomization in Multiclass Adversarial Classification: Insights from Graph Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unfortunately, much of the theoretical analysis so far has focused on binary classification, providing only limited insights into the more complex multiclass setting. In this paper, we take a step toward closing this gap by drawing inspiration from the field of graph theory. |
Lucas Gnecco Heredia; Matteo Sammut; Muni Sreenivas Pydi; Rafael Pinot; Benjamin Negrevergne; Yann Chevaleyre; |
| 561 | Learning Visual-Semantic Subspace Representations Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we introduce a nuclear norm-based loss function, grounded in the same information theoretic principles that have proved effective in self-supervised learning. |
Gabriel Moreira; Manuel Marques; Joao Costeira; Alexander G Hauptmann; |
| 562 | Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: This paper introduces a general framework for risk-sensitive bandits that integrates the notions of risk-sensitive objectives by adopting a rich class of {\em distortion riskmetrics}. |
Meltem Tatli; Arpan Mukherjee; Prashanth L. A.; Karthikeyan Shanmugam; Ali Tajer; |
| 563 | Steinmetz Neural Networks for Complex-Valued Data Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. |
Shyam Venkatasubramanian; Ali Pezeshki; Vahid Tarokh; |
| 564 | Distributional Off-policy Evaluation with Bellman Residual Minimization Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In contrast, we study the more manageable expectation-extended statistical distances and provide a novel theoretical justification on their validity for learning the return distribution. Based on this attractive property, we propose a new method called Energy Bellman Residual Minimizer (EBRM) for distributional OPE. |
Sungee Hong; Zhengling Qi; Raymond K. W. Wong; |
| 565 | Enhancing Feature-Specific Data Protection Via Bayesian Coordinate Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, LDP applies uniform protection to all data features, including less sensitive ones, which degrades performance of downstream tasks. To overcome this limitation, we propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification. |
Maryam Aliakbarpour; Syomantak Chaudhuri; Thomas Courtade; Alireza Fallah; Michael Jordan; |
| 566 | Signal Recovery from Random Dot-Product Graphs Under Local Differential Privacy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the problem of recovering latent information from graphs under $\varepsilon$-edge local differential privacy where the presence of relationships/edges between two users/vertices remains confidential, even from the data curator. |
Siddharth Vishwanath; Jonathan Hehir; |
| 567 | Online Assortment and Price Optimization Under Contextual Choice Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The seller observes which, if any, item is chosen at the end of each round, with the goal of maximizing cumulative revenue over a selling horizon of length $T$. For this problem, we propose an algorithm that learns from user feedback and achieves a revenue regret of order $\widetilde{\mathcal{O}}(d \sqrt{K T} / L_0 )$ where $L_0$ is the minimum price sensitivity parameter. |
Yigit Efe Erginbas; Thomas Courtade; Kannan Ramchandran; |
| 568 | Stochastic Weight Sharing for Bayesian Neural Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we reinterpret weight-sharing quantization techniques from a stochastic perspective in the context of training and inference with Bayesian Neural Networks (BNNs). |
Moule Lin; Shuhao Guan; Weipeng Jing; Goetz Botterweck; Andrea Patane; |
| 569 | Beyond Discretization: Learning The Optimal Solution Path Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We propose an alternative approach that parameterizes the solution path with a set of basis functions and solves a \emph{single} stochastic optimization problem to learn the entire solution path. |
Qiran Dong; Paul Grigas; Vishal Gupta; |
| 570 | The Uniformly Rotated Mondrian Kernel Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: The Mondrian kernel is one such example of a fast random feature approximation of the Laplace kernel, generated by a computationally efficient hierarchical random partition of the input space known as the Mondrian process. In this work, we study a variation of this random feature map by applying a uniform random rotation to the input space before running the Mondrian process to approximate a kernel that is invariant under rotations. |
Calvin Osborne; Eliza O'Reilly; |
| 571 | Graph-based Complexity for Causal Effect By Empirical Plug-in Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: This paper focuses on the computational complexity of computing empirical plug-in estimates for causal effect queries. |
Rina Dechter; Anna K Raichev; Jin Tian; Alexander Ihler; |
| 572 | Learning Stochastic Nonlinear Dynamics with Embedded Latent Transfer Operators Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: We consider an operator-based latent Markov representation of a stochastic nonlinear dynamical system, where the stochastic evolution of the latent state embedded in a reproducing kernel Hilbert space is described with the corresponding transfer operator, and develop a spectral method to learn this representation based on the theory of stochastic realization. |
Naichang Ke; Ryogo Tanaka; Yoshinobu Kawahara; |
| 573 | Gaussian Mean Testing Under Truncation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We consider the task of Gaussian mean testing, that is, of testing whether a high-dimensional vector perturbed by white noise has large magnitude, or is the zero vector. |
Clement Louis Canonne; Themis Gouleakis; Yuhao Wang; Qiping Yang; |
| 574 | Mixed-Feature Logistic Regression Robust to Distribution Shifts Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: In this paper, we study a distributionally robust logistic regression problem that seeks the model that will perform best against adversarial realizations of the data distribution drawn from a suitably constructed Wasserstein ambiguity set. |
Qingshi Sun; Nathan Justin; Andres Gomez; Phebe Vayanos; |
| 575 | Unconditionally Calibrated Priors for Beta Mixture Density Networks Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We propose an extension that allows modeling correlations in the covariates via Gaussian copulas, potentially reducing the necessary number of mixture components. |
Alix Lhéritier; Maurizio Filippone; |
| 576 | Weighted Euclidean Distance Matrices Over Mixed Continuous and Categorical Inputs for Gaussian Process Models Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: However, standard GP models are limited to continuous variables due to the difficulties in establishing correlation structures for categorical variables. To overcome this limitation, we introduce \textbf{WE}ighted Euclidean distance matrices \textbf{G}aussian \textbf{P}rocess (WEGP). |
Mingyu Pu; Wang Songhao; Haowei Wang; Szu Hui Ng; |
| 577 | On The Convergence of Continual Federated Learning Using Incrementally Aggregated Gradients Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we propose \emph{Continual Federated Learning with Aggregated Gradients} (C-FLAG), a novel replay-memory based federated strategy consisting of edge-based gradient updates on memory and aggregated gradients on the current data. |
Satish Kumar Keshri; Nazreen Shah; Ranjitha Prasad; |
| 578 | Analyzing Generative Models By Manifold Entropic Metrics Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: However, it is difficult to measure objectively if and to what degree desirable properties of disentangled representations have been achieved. Inspired by the principle of independent mechanisms, we address this difficulty by introducing a novel set of tractable information-theoretic evaluation metrics. |
Daniel Galperin; Ullrich Koethe; |
| 579 | AlleNoise – Large-scale Text Classification Benchmark Dataset with Real-world Label Noise Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this paper, we present AlleNoise, a new curated text classification dataset with real-world instance-dependent label noise, containing over 500,000 examples across approximately 5600 classes, complemented with a meaningful, hierarchical taxonomy of categories. |
Alicja Raczkowska; Aleksandra Osowska-Kurczab; Jacek Szczerbinski; Kalina Jasinska-Kobus; Klaudia Nazarko; |
| 580 | Hypernym Bias: Unraveling Deep Classifier Training Dynamics Through The Lens of Class Hierarchy Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce a novel framework to track the evolution of the feature manifold during training, revealing how the hierarchy of class relations emerges and refines across the network layers. |
Roman Malashin; Yachnaya Valeria; Alexandr V. Mullin; |
| 581 | Mean-Field Microcanonical Gradient Descent Related Papers Related Patents Related Grants Related Venues Related Experts Related Code View Save Highlight: As a remedy, we propose a mean-field microcanonical gradient descent that samples several weakly coupled data points simultaneously, allowing for better control of the entropy loss while paying little in terms of likelihood fit. |
Marcus Häggbom; Morten Karlsmark; Joakim Andén; |
| 582 | Task-Driven Discrete Representation Learning Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: In this work, we adopt a practical approach that examines DRL from a task-driven perspective. |
Long Tung Vuong; |
| 583 | The Strong Product Model for Network Inference Without Independence Assumptions Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Unfortunately, current methods are not able to make this distinction. In this paper, we address this problem by introducing the strong product model for Gaussian graphical modelling. |
Bailey Andrew; David Robert Westhead; Luisa Cutillo; |