Paper Digest: COLT 2025 Papers & Highlights
To search for papers presented at COLT 2025 on a specific topic, use the search by venue (COLT-2025) service. To summarize the latest research published at COLT 2025 on a specific topic, use the review by venue (COLT-2025) service. If you are interested in browsing papers by author, we have a comprehensive list of all authors (COLT-2025).
This curated list is created by the Paper Digest Team. Experience the cutting-edge capabilities of Paper Digest, an innovative AI-powered research platform that delivers personalized, comprehensive daily updates on the latest research, discussions, and news in your field. It also empowers you to read and write articles, get answers, conduct literature reviews, and generate research reports.
Experience the full potential of our services today!
TABLE 1: Paper Digest: COLT 2025 Papers & Highlights
| # | Paper | Author(s) |
|---|---|---|
| 1 | Learning Algorithms in The Limit. Highlight: Complementary to the traditional Input-Output Observations, we introduce Time-Bound Observations and Policy-Trajectory Observations to study the learnability of general recursive functions under more realistic constraints. | Hristo Papazov; Nicolas Flammarion |
| 2 | Metric Embeddings Beyond Bi-Lipschitz Distortion Via Sherali-Adams. Highlight: In this paper, we focus on Multi-dimensional Scaling (MDS), where we are given a set of non-negative dissimilarities $\{d_{i,j}\}_{i,j\in[n]}$ over $n$ points, and the goal is to find an embedding $\{x_1,\ldots,x_n\}\subset\mathbb{R}^k$ that minimizes $\mathrm{OPT} = \min_{x_1,\ldots,x_n} \mathbb{E}_{i,j\in[n]} \left[ \left(1-\frac{\|x_i - x_j\|}{d_{i,j}}\right)^2 \right]$. | Ainesh Bakshi; Vincent Cohen-Addad; Rajesh Jayaram; Samuel B. Hopkins; Silvio Lattanzi |
| 3 | Learning Shallow Quantum Circuits with Many-qubit Gates. Highlight: In this work, we present the first algorithm for efficient average-case learning of QAC0 circuits with logarithmic ancilla. | Francisca Vasconcelos; Hsin-Yuan Huang |
| 4 | The Pitfalls of Imitation Learning When Actions Are Continuous. Highlight: We study the problem of imitating an expert demonstrator in a discrete-time, continuous state-and-action space control system. | Max Simchowitz; Daniel Pfrommer; Ali Jadbabaie |
| 5 | Learning General Gaussian Mixtures with Efficient Score Matching. Highlight: We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions. | Sitan Chen; Vasilis Kontonis; Kulin Shah |
| 6 | Computational Intractability of Strategizing Against Online Learners. Highlight: In this paper, we establish a strong computational hardness result: unless $\mathsf{P} = \mathsf{NP}$, no polynomial-time optimizer can compute a near-optimal strategy against a learner using a standard no-regret algorithm, specifically Multiplicative Weights Update (MWU). | Angelos Assos; Yuval Dagan; Nived Rajaraman |
| 7 | Linear Convergence of Diffusion Models Under The Manifold Hypothesis. Highlight: Score-matching generative models have proven successful at sampling from complex high-dimensional data distributions. | Peter Potaptchik; Iskander Azangulov; George Deligiannidis |
| 8 | Computational-Statistical Tradeoffs at The Next-Token Prediction Barrier: Autoregressive and Imitation Learning Under Misspecification (Extended Abstract). Highlight: From a theoretical perspective, this phenomenon should not appear in \emph{well-specified} settings, and, indeed, a growing body of empirical work hypothesizes that \emph{misspecification}, where the learner is not sufficiently expressive to represent the target distribution, may be the root cause. Under misspecification, where the goal is to learn as well as the best-in-class model up to a multiplicative approximation factor $C \geq 1$, we confirm that $C$ indeed grows with $H$ for next-token prediction, lending theoretical support to this empirical hypothesis. | Dhruv Rohatgi; Adam Block; Audrey Huang; Akshay Krishnamurthy; Dylan J. Foster |
| 9 | Necessary and Sufficient Oracles: Toward A Computational Taxonomy for Reinforcement Learning. Highlight: In this work, we clarify the impact of the choice of supervised learning oracle on the computational complexity of RL, as quantified by the oracle strength. | Dhruv Rohatgi; Dylan J. Foster |
| 10 | Computing Optimal Regularizers for Online Linear Optimization. Highlight: However, the choice of regularizer can significantly impact dimension-dependent factors in the regret bound. We present an algorithm that takes as input convex and symmetric action sets and loss sets for a specific OLO instance, and outputs a regularizer such that running FTRL with this regularizer guarantees regret within a universal constant factor of the best possible regret bound. | Khashayar Gatmiry; Jon Schneider; Stefanie Jegelka |
| 11 | Predicting Quantum Channels Over General Product Distributions. Highlight: In this work, we propose a new approach that achieves accurate prediction over essentially any product distribution $\mathcal{D}$, provided it is not “classical” (in which case there is a trivial exponential lower bound). | Sitan Chen; Jaume de Dios Pont; Jun-Ting Hsieh; Hsin-Yuan Huang; Jane Lange; Jerry Li |
| 12 | Learning Compositional Functions with Transformers from Easy-to-Hard Data. Highlight: Transformer-based language models have demonstrated impressive capabilities across a range of complex reasoning tasks. | Zixuan Wang; Eshaan Nichani; Alberto Bietti; Alex Damian; Daniel Hsu; Jason D Lee; Denny Wu |
| 13 | Learning Mixtures of Gaussians Using Diffusion Models. Highlight: We give a new algorithm for learning mixtures of $k$ Gaussians (with identity covariance in $\mathbb{R}^n$) to TV error $\varepsilon$, with quasi-polynomial ($O\big(n^{\mathrm{poly}\log\left(\frac{n+k}{\varepsilon}\right)}\big)$) time and sample complexity, under a minimum weight assumption. | Khashayar Gatmiry; Jonathan Kelner; Holden Lee |
| 14 | Gradient Methods with Online Scaling. Highlight: We introduce a framework to accelerate the convergence of gradient-based methods with online learning. | Wenzhi Gao; Ya-Chi Chu; Yinyu Ye; Madeleine Udell |
| 15 | The Role of Environment Access in Agnostic Reinforcement Learning (Extended Abstract). Highlight: We also show algorithm-specific lower bounds for PSDP and CPI under the weaker condition of \emph{policy class realizability}. In light of these lower bounds, we introduce a new model of access called \emph{hybrid resets}, which subsumes both local simulators (which are weaker than generative access) and $\mu$-resets. | Akshay Krishnamurthy; Gene Li; Ayush Sekhari |
| 16 | Low-rank Fine-tuning Lies Between Lazy Training and Feature Learning. Highlight: In this work we study low-rank fine-tuning in a student-teacher setting. | Arif Kerem Dayi; Sitan Chen |
| 17 | Mixing Time of The Proximal Sampler in Relative Fisher Information Via Strong Data Processing Inequality (Extended Abstract). Highlight: In this work, we show that when $\nu$ is $\alpha$-strongly log-concave, the Proximal Sampler also has an exponential convergence in relative Fisher information. | Andre Wibisono |
| 18 | Optimal Scheduling of Dynamic Transport. Highlight: Though many popular methods seek straight-line (i.e., zero-acceleration) trajectories, we show here that a specific class of “curved” trajectories can significantly improve approximation and learning. | Panos Tsimpos; Ren Zhi; Jakob Zech; Youssef Marzouk |
| 19 | Online Covariance Estimation in Nonsmooth Stochastic Approximation. Highlight: In this paper, we study an online batch-means covariance matrix estimator introduced in Zhu et al. (2023). | Liwei Jiang; Abhishek Roy; Krishnakumar Balasubramanian; Damek Davis; Dmitriy Drusvyatskiy; Sen Na |
| 20 | Accelerating Proximal Gradient Descent Via Silver Stepsizes. Highlight: An open question raised by several papers is whether this phenomenon of stepsize-based acceleration holds more generally for constrained and/or composite convex optimization via projected and/or proximal versions of gradient descent. We answer this in the affirmative by proving that the silver stepsize schedule yields analogously accelerated rates in these settings. | Jinho Bok; Jason M. Altschuler |
| 21 | Computing High-dimensional Confidence Sets for Arbitrary Distributions. Highlight: We study the problem of learning a high-density region of an arbitrary distribution over $\mathbb{R}^d$. | Chao Gao; Liren Shan; Vaidehi Srinivas; Aravindan Vijayaraghavan |
| 22 | Sample Efficient Omniprediction and Downstream Swap Regret for Non-Linear Losses. Highlight: We define “decision swap regret” which generalizes both prediction for downstream swap regret and omniprediction, and give algorithms for obtaining it for arbitrary multi-dimensional Lipschitz loss functions in online adversarial settings. | Jiuyao Lu; Aaron Roth; Mirah Shi |
| 23 | A Polynomial-time Algorithm for Online Sparse Linear Regression with Improved Regret Bound Under Weaker Conditions. Highlight: In this paper, we study the problem of online sparse linear regression (OSLR), where the algorithms are restricted to accessing only $k$ out of $d$ attributes per instance for prediction, a setting which was proved to be NP-hard. | Junfan Li; Shizhong Liao; Zenglin Xu; Liqiang Nie |
| 24 | Community Detection with The Bethe-Hessian. Highlight: We provide the first rigorous analysis of the Bethe-Hessian spectral method in the SBM under both the bounded expected degree and the growing degree regimes. | Ludovic Stephan; Yizhe Zhu |
| 25 | Mean-field Analysis of Polynomial-width Two-layer Neural Network Beyond Finite Time Horizon. Highlight: We apply our results to the canonical feature learning problem of estimating a well-specified single-index model; we permit the information exponent to be arbitrarily large, leading to convergence times that grow polynomially in the ambient dimension $d$. | Margalit Glasgow; Denny Wu; Joan Bruna |
| 26 | Exploring Facets of Language Generation in The Limit. Highlight: The recent work of Kleinberg and Mullainathan provides a concrete model for language generation in the limit: given a sequence of examples from an unknown target language, the goal is to generate new examples from the target language such that no incorrect examples are generated beyond some point. | Moses Charikar; Chirag Pabbaraju |
| 27 | Faster Low-Rank Approximation and Kernel Ridge Regression Via The Block-Nyström Method. Highlight: Yet, when the data exhibits heavy-tailed spectral decay, the effective dimension of the problem often becomes so large that even the Nyström method may be outside of our computational budget. To address this, we propose Block-Nyström, an algorithm that injects a block-diagonal structure into the Nyström method, thereby significantly reducing its computational cost while recovering strong approximation guarantees. | Sachin Garg; Michal Derezinski |
| 28 | Characterizing Dependence of Samples Along The Langevin Dynamics and Algorithms Via Contraction of $F$-Mutual Information (Extended Abstract). Highlight: In this paper, we study the question of how fast the samples become approximately independent along popular Markov chains for continuous-space sampling: the Langevin dynamics in continuous time, and the Unadjusted Langevin Algorithm and the Proximal Sampler in discrete time. | Jiaming Liang; Siddharth Mitra; Andre Wibisono |
| 29 | Agnostic Learning of Arbitrary ReLU Activation Under Gaussian Marginals. Highlight: We consider the problem of learning an arbitrarily-biased ReLU activation (or neuron) over Gaussian marginals with the squared loss objective. | Anxin Guo; Aravindan Vijayaraghavan |
| 30 | Stochastic Block Models with Many Communities and The Kesten–Stigum Bound (Extended Abstract). Highlight: Conversely, recent work provides evidence for the hardness part using the low-degree paradigm. In this paper we investigate community recovery in the regime $q=q_n \to \infty$ as $n\to\infty$, where no such predictions exist. | Byron Chin; Elchanan Mossel; Youngtak Sohn; Alexander S. Wein |
| 31 | Rate-Preserving Reductions for Blackwell Approachability. Highlight: Abernethy et al. (2011) showed that Blackwell approachability and no-regret learning are equivalent, in the sense that any algorithm that solves a specific Blackwell approachability instance can be converted to a sublinear regret algorithm for a specific no-regret learning instance, and vice versa. In this paper, we study a more fine-grained form of such reductions and ask when this translation between problems preserves not only a sublinear rate of convergence but also the optimal rate of convergence. | Christoph Dann; Yishay Mansour; Mehryar Mohri; Jon Schneider; Balasubramanian Sivan |
| 32 | Fundamental Limits of Matrix Sensing: Exact Asymptotics, Universality, and Applications. Highlight: We consider this model in the high-dimensional limit: while previous works on this model primarily focused on the recovery of low-rank matrices, we consider in this work more general classes of structured signal matrices with potentially large rank, e.g. a product of two matrices of sizes proportional to the dimension. | Yizhou Xu; Antoine Maillard; Lenka Zdeborová; Florent Krzakala |
| 33 | Sparsity-Based Interpolation of External, Internal and Swap Regret. Highlight: With $d$ experts and $T\gg d$ rounds in total, we present a single algorithm achieving the instance-adaptive $\phi$-regret bound $\tilde O\left(\min\left\{\sqrt{d-d^{\mathrm{unif}}_\phi+1},\sqrt{d-d^{\mathrm{self}}_\phi}\right\}\cdot\sqrt{T}\right)$, where $d^{\mathrm{unif}}_\phi$ is the maximum number of experts modified identically by $\phi$, and $d^{\mathrm{self}}_\phi$ is the number of experts that $\phi$ trivially maps to themselves. | Zhou Lu; Y Jennifer Sun; Zhiyu Zhang |
| 34 | Fast and Furious Symmetric Learning in Zero-Sum Games: Gradient Descent As Fictitious Play. Highlight: In this work, we obtain strong new regret guarantees for both algorithms on a class of symmetric zero-sum games that generalize the classic three-strategy Rock-Paper-Scissors to a weighted, $n$-dimensional regime. | John Lazarsfeld; Georgios Piliouras; Ryann Sim; Andre Wibisono |
| 35 | Provable Complexity Improvement of AdaGrad Over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization. Highlight: In fact, under standard assumptions of Lipschitz gradients and bounded noise variance, it is known that SGD is worst-case optimal (up to absolute constants) in terms of finding a near-stationary point with respect to the $\ell_2$-norm, making further improvements impossible. Motivated by this limitation, we introduce refined assumptions on the smoothness structure of the objective and the gradient noise variance, which better suit the coordinate-wise nature of adaptive gradient methods. | Ruichen Jiang; Devyani Maladkar; Aryan Mokhtari |
| 36 | A Theory of Learning with Autoregressive Chain of Thought. Highlight: We present a simple base class that allows for universal representability and computationally tractable chain-of-thought learning. | Nirmit Joshi; Gal Vardi; Adam Block; Surbhi Goel; Zhiyuan Li; Theodor Misiakiewicz; Nathan Srebro |
| 37 | Spike-and-Slab Posterior Sampling in High Dimensions. Highlight: We give the first provable algorithms for spike-and-slab posterior sampling that apply for any SNR, and use a measurement count sublinear in the problem dimension. | Symantak Kumar; Purnamrita Sarkar; Kevin Tian; Yusong Zhu |
| 38 | Taking A Big Step: Large Learning Rates in Denoising Score Matching Prevent Memorization. Highlight: Yet, in practice, only a moderate degree of memorization is observed, even without explicit regularization. In this paper, we investigate this phenomenon by uncovering an implicit regularization mechanism driven by large learning rates. | Yu-Han Wu; Pierre Marion; Gérard Biau; Claire Boyer |
| 39 | On The Minimax Regret of Sequential Probability Assignment Via Square-Root Entropy. Highlight: We study the problem of sequential probability assignment under logarithmic loss, both with and without side information. | Zeyu Jia; Alexander Rakhlin; Yury Polyanskiy |
| 40 | On The Convergence of Min-Max Langevin Dynamics and Algorithm. Highlight: We study zero-sum games in the space of probability distributions over the Euclidean space $\mathbb{R}^d$ with entropy regularization, in the setting when the interaction function between the players is smooth and strongly convex-strongly concave. | Yang Cai; Siddharth Mitra; Xiuyuan Wang; Andre Wibisono |
| 41 | Truthfulness of Decision-Theoretic Calibration Measures. Highlight: We introduce a new calibration measure termed subsampled step calibration, $\mathrm{StepCE}^{\mathrm{sub}}$, that is both decision-theoretic and truthful. | Mingda Qiao; Eric Zhao |
| 42 | Stability and List-Replicability for Agnostic Learners. Highlight: In this paper, we characterize the classes that are learnable under their proposed relaxed conditions, resolving the two open problems raised in their work. | Ari Blondal; Gao Shan; Hamed Hatami; Pooya Hatami |
| 43 | Improved Sample Upper and Lower Bounds for Trace Estimation of Quantum State Powers. Highlight: In this paper, we significantly improve the sample complexity of estimating $\operatorname{tr}(\rho^q)$ in both the upper and lower bounds. | Kean Chen; Qisheng Wang |
| 44 | Time-Uniform Self-Normalized Concentration for Vector-Valued Processes (Extended Abstract). Highlight: In this work, we construct a general, self-normalized inequality for multivariate processes that satisfy a simple yet broad “sub-$\psi$” tail condition, which generalizes assumptions based on cumulant generating functions. | Justin Whitehouse; Zhiwei Steven Wu; Aaditya Ramdas |
| 45 | Orthogonal Causal Calibration (Extended Abstract). Highlight: In this work, we develop general algorithms for reducing the task of causal calibration to that of calibrating a standard (non-causal) predictive model. | Justin Whitehouse; Christopher Jung; Vasilis Syrgkanis; Bryan Wilder; Zhiwei Steven Wu |
| 46 | Private Realizable-to-Agnostic Transformation with Near-Optimal Sample Complexity. Highlight: In this work, we give an improved construction that eliminates the dependence on $\varepsilon$, thereby achieving a near-optimal extra sample complexity of $\widetilde{O}(\mathrm{VC}(\mathcal{C})/\alpha^2)$ for any $\varepsilon\le 1$. | Bo Li; Wei Wang; Peng Ye |
| 47 | Low-dimensional Adaptation of Diffusion Models: Convergence in Total Variation (Extended Abstract). Highlight: This paper presents new theoretical insights into how diffusion generative models adapt to low-dimensional structure in data distributions. | Jiadong Liang; Zhihan Huang; Yuxin Chen |
| 48 | DiscQuant: A Quantization Method for Neural Networks Inspired By Discrepancy Theory. Highlight: In this paper, we study the problem of rounding optimally given any quantization grid. | Jerry Chee; Arturs Backurs; Rainie Heck; Li Zhang; Janardhan Kulkarni; Thomas Rothvoss; Sivakanth Gopi |
| 49 | Lower Bounds for Greedy Teaching Set Constructions. Highlight: In each iteration, this greedy algorithm chooses to add to the teaching set the $k$ labeled points that restrict the concept class the most. In this work, we prove lower bounds on the performance of this greedy approach for small $k$. | Spencer Compton; Chirag Pabbaraju; Nikita Zhivotovskiy |
| 50 | Beyond Worst-Case Online Classification: VC-Based Regret Bounds for Relaxed Benchmarks. Highlight: We revisit online binary classification by shifting the focus from competing with the best-in-class binary loss to competing against relaxed benchmarks that capture smoothed notions of optimality. | Omar Montasser; Abhishek Shetty; Nikita Zhivotovskiy |
| 51 | Conference on Learning Theory 2025: Preface | Nika Haghtalab; Ankur Moitra |
| 52 | Optimistically Optimistic Exploration for Provably Efficient Infinite-Horizon Reinforcement and Imitation Learning. Highlight: We study the problem of reinforcement learning in infinite-horizon discounted linear Markov decision processes (MDPs), and propose the first computationally efficient algorithm achieving rate-optimal regret guarantees in this setting. | Antoine Moulin; Gergely Neu; Luca Viano |
| 53 | Can A Calibration Metric Be Both Testable and Actionable? Highlight: Conversely, the recently proposed Distance from Calibration (dCE) is testable, but it is not actionable since it lacks decision-theoretic guarantees needed for high-stakes applications. To resolve this question, we consider Cutoff Calibration Error, a calibration measure that bridges this gap by assessing calibration over intervals of forecasted probabilities. | Raphael Rossellini; Jake A. Soloff; Rina Foygel Barber; Zhimei Ren; Rebecca Willett |
| 54 | Online Convex Optimization with A Separation Oracle. Highlight: In this paper, we introduce a new projection-free algorithm for Online Convex Optimization (OCO) with a state-of-the-art regret guarantee among separation-based algorithms. | Zakaria Mhammedi |
| 55 | Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions. Highlight: In this paper, we advance this effort by presenting an efficient algorithm for Markov Decision Processes (MDPs) where the state-action value function of any policy is linear in a given feature map. | Zakaria Mhammedi |
| 56 | Heavy-tailed Estimation Is Easier Than Adversarial Contamination. Highlight: Surprisingly, despite these distinct motivations, the algorithmic approaches to both these settings have converged, prompting questions on the relationship between the corruption models. In this paper, we investigate and provide a principled explanation for this phenomenon. | Yeshwanth Cherapanamjeri; Daniel Lee |
| 57 | A Proof of The Changepoint Detection Threshold Conjecture in Preferential Attachment Models. Highlight: In this paper, we resolve the conjecture affirmatively, proving that detection is indeed impossible if the change occurs at time $n-o(\sqrt{n})$. | Hang Du; Shuyang Gong; Jiaming Xu |
| 58 | Decision Making in Changing Environments: Robustness, Query-Based Learning, and Differential Privacy. Highlight: We propose a framework, which we call \textit{hybrid Decision Making with Structured Observations} (hybrid DMSO), that provides an interpolation between the stochastic and adversarial settings of decision making. | Fan Chen; Alexander Rakhlin |
| 59 | Structure-agnostic Optimality of Doubly Robust Learning for Treatment Effect Estimation (Extended Abstract). Highlight: In this paper, we adopt the recently introduced structure-agnostic framework of statistical lower bounds, which imposes no structural assumptions on the nuisance functions other than access to black-box estimators that achieve some statistical estimation rate. | Jikai Jin; Vasilis Syrgkanis |
| 60 | Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback. Highlight: We address the question of whether acceleration is possible under bandit feedback only and provide an affirmative answer for two-player zero-sum normal-form games. | Shinji Ito; Haipeng Luo; Taira Tsuchiya; Yue Wu |
| 61 | Anytime Acceleration of Gradient Descent. Highlight: For smooth (non-strongly) convex optimization, we propose a stepsize schedule that allows gradient descent to achieve convergence guarantees of $O\big(T^{-\frac{2\log_2\rho}{1+\log_2\rho}}\big) \approx O(T^{-1.119})$ for any stopping time $T$, where $\rho=\sqrt{2}+1$ is the silver ratio and the stepsize schedule is predetermined without prior knowledge of the stopping time. | Zihan Zhang; Jason Lee; Simon Du; Yuxin Chen |
| 62 | Universal Rates for Multiclass Learning with Bandit Feedback. Highlight: The seminal work of Daniely et al. (COLT 2011) introduced the problem of multiclass learning under bandit feedback and provided a combinatorial characterization of its learnability within the framework of PAC learning. In the present work, we study the problem of multiclass learning under bandit feedback within the framework of \emph{universal learning} (Bousquet et al., STOC 2021). | Steve Hanneke; Amirreza Shaeiri; Qian Zhang |
| 63 | Multi-Pass Memory Lower Bounds for Learning Problems. Highlight: The authors conjectured that similar lower bounds should apply but left it as an open problem. In this paper, we resolve this open problem by proving that any $L$-pass streaming algorithm using $N$ samples requires $\tilde{\Omega}\big(d\kappa\cdot \frac{\kappa}{NL}\big)$ bits of memory. | Qian Li; Shuo Wang; Jiapeng Zhang |
| 64 | Quantum State and Unitary Learning Implies Circuit Lower Bounds. Highlight: We establish connections between state tomography, pseudorandomness, quantum state synthesis, and circuit lower bounds. | Nai-Hui Chia; Daniel Liang; Fang Song |
| 65 | Black-Box Reductions for Decentralized Online Convex Optimization in Changing Environments. Highlight: However, none of them has been extended to D-OCO, possibly due to the difficulty in handling their commonly used two-level structure. To fill the gap, in this paper we propose black-box reductions from minimizing these two metrics in D-OCO to minimizing them in the centralized setting. | Yuanyu Wan |
| 66 | The Oracle Complexity of Simplex-based Matrix Games: Linear Separability and Nash Equilibria. Highlight: We study the problem of solving matrix games of the form $\max_{\mathbf{w}\in\mathcal{W}}\min_{\mathbf{p}\in\Delta}\mathbf{p}^{\top}A\mathbf{w}$, where $A$ is some matrix and $\Delta$ is the probability simplex. | Guy Kornowski; Ohad Shamir |
| 67 | The Sample Complexity of Distributed Simple Binary Hypothesis Testing Under Information Constraints. Highlight: In this paper, we show that sequential interaction does not help. | Hadi Kazemi; Ankit Pensia; Varun Jog |
| 68 | Robust Random Graph Matching in Gaussian Models Via Vector Approximate Message Passing. Highlight: In this paper, we focus on the matching recovery problem between a pair of correlated Gaussian Wigner matrices with a latent vertex correspondence. | Zhangsong Li |
| 69 | Robustly Learning Monotone Generalized Linear Models Via Data Augmentation Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We give the first polynomial-time algorithm that achieves a constant-factor approximation for {\em any} monotone Lipschitz activation. |
Nikos Zarifis; Puqian Wang; Ilias Diakonikolas; Jelena Diakonikolas; |
| 70 | Generation Through The Lens of Learning Theory Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We study generation through the lens of learning theory. |
Vinod Raman; Jiaxun Li; Ambuj Tewari; |
| 71 | PREM: Privately Answering Statistical Queries with Relative Error Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a {\em relative} error guarantee for statistical queries under $(\varepsilon, \delta)$-differential privacy (DP). |
Badih Ghazi; Crist�bal Guzm�n; Pritish Kamath; Alexander Knop; Ravi Kumar; Pasin Manurangsi; Sushant Sachdeva; |
| 72 | Fast and Multiphase Rates for Nearest Neighbor Classifiers. Highlight: We study the scaling of classification error rates with respect to the size of the training dataset. |
Pengkun Yang; Jingzhao Zhang;
| 73 | Of Dice and Games: A Theory of Generalized Boosting. Highlight: In this work we extend the celebrated theory of boosting to incorporate both cost-sensitive and multi-objective losses. |
Marco Bressan; Nataly Brukhim; Nicolò Cesa-Bianchi; Emmanuel Esposito; Yishay Mansour; Shay Moran; Maximilian Thiessen;
| 74 | A Fine-grained Characterization of PAC Learnability. Highlight: In a nutshell, standard PAC learnability precludes a fine-grained exploration of learnability. To overcome this limitation, we develop a fine-grained theory of PAC learnability. |
Marco Bressan; Nataly Brukhim; Nicolò Cesa-Bianchi; Emmanuel Esposito; Yishay Mansour; Shay Moran; Maximilian Thiessen;
| 75 | Low Coordinate Degree Algorithms II: Categorical Signals and Generalized Stochastic Block Models. Highlight: This complements recent results studying the power of LCDF in testing for continuous structure like real-valued signals corrupted by additive noise. We study a general form of stochastic block model (SBM), where a population is assigned random labels and every $p$-tuple generates an observation according to an arbitrary probability measure associated with the $p$ labels of its members. |
Dmitriy Kunisky;
| 76 | Open Problem: Structure-Agnostic Minimax Risk for Partial Linear Model. Highlight: Double machine learning is a theoretically grounded and practically efficient procedure for a variety of causal estimands and functional estimation problems when adopting black-box machine learning models for estimating nuisance parameters. |
Yihong Gu;
| 77 | Trade-offs in Data Memorization Via Strong Data Processing Inequalities. Highlight: In this work, we develop a general approach for proving lower bounds on excess data memorization, which relies on a new connection between strong data processing inequalities and data memorization. |
Vitaly Feldman; Guy Kornowski; Xin Lyu;
| 78 | Testing Juntas and Junta Subclasses with Relative Error. Highlight: In relative-error testing we measure the distance from $f$ to $g$, where $f,g: \{0,1\}^n \to \{0,1\}$, by the ratio of $|f^{-1}(1) \triangle g^{-1}(1)|$ (the number of inputs on which $f$ and $g$ disagree) to $|f^{-1}(1)|$ (the number of satisfying assignments of $f$), and we give the testing algorithm both black-box access to $f$ and also access to independent uniform samples from $f^{-1}(1)$. |
Xi Chen; William Pires; Toniann Pitassi; R. A. Servedio;
| 79 | Approximating The Total Variation Distance Between Spin Systems. Highlight: We propose a new reduction that connects the problem of approximating the TV-distance to sampling and approximate counting. |
Weiming Feng; Hongyang Liu; Minji Yang;
| 80 | Private List Learnability Vs. Online List Learnability. Highlight: This work explores the connection between differential privacy (DP) and online learning in the context of PAC list learning. |
Steve Hanneke; Shay Moran; Hilla Schefler; Iska Tsubari;
| 81 | Logarithmic Regret of Exploration in Average Reward Markov Decision Processes. Highlight: In this work, without modifying EVI, we show that there is a significant advantage in replacing the doubling trick by another simple rule, which we call the Vanishing Multiplicative (VM) rule. |
Victor Boone; Bruno Gaujal;
| 82 | Partial and Exact Recovery of A Random Hypergraph from Its Graph Projection. Highlight: An earlier work of Bresler, Guo, and Polyanskiy (COLT 2024) showed that exact recovery for $d=3$ is possible if and only if $\delta < 2/5$. |
Guy Bresler; Chenghao Guo; Yury Polyanskiy; Andrew Yao;
| 83 | Polynomial Low Degree Hardness for Broadcasting on Trees (Extended Abstract). Highlight: Specifically, they proved that any function expressed as a linear combination of functions of at most $O(\log N)$ leaves has vanishing correlation with the root. In this work, we obtain an exponential improvement of this lower bound by establishing an $N^{\Omega(1)}$ degree lower bound, for any broadcast process in the whole regime below the Kesten-Stigum bound. |
Han Huang; Elchanan Mossel;
| 84 | Corrupted Learning Dynamics in Games. Highlight: However, this acceleration is limited to the \textit{honest regime}, in which all players fully adhere to a prescribed algorithm—a situation that may not be realistic in practice. To address this issue, we present \textit{corrupted learning dynamics} that adaptively find an equilibrium at a rate that depends on the extent to which each player deviates from the strategy suggested by the prescribed algorithm. |
Taira Tsuchiya; Shinji Ito; Haipeng Luo;
| 85 | Regularized Dikin Walks for Sampling Truncated Logconcave Measures, Mixed Isoperimetry and Beyond Worst-Case Analysis. Highlight: Our contributions include: (1) proving that the soft-threshold Dikin walk mixes in $\widetilde{O}(mn+\kappa n)$ iterations for logconcave distributions with condition number $\kappa$, dimension $n$ and $m$ linear constraints, without requiring bounded polytopes. |
Minhui Jiang; Yuansi Chen;
| 86 | Experimental Design for Semiparametric Bandits. Highlight: We propose the first experimental-design approach that simultaneously offers a sharp regret bound, a PAC bound, and a best-arm identification guarantee. |
Seok-Jin Kim; Gi-Soo Kim; Min-hwan Oh;
| 87 | Low-dimensional Functions Are Efficiently Learnable Under Randomly Biased Distributions. Highlight: In this work, we show that high-complexity cases are rare. |
Elisabetta Cornacchia; Dan Mikulincer; Elchanan Mossel;
| 88 | Universal Rates of ERM for Agnostic Learning. Highlight: In this paper, we consider the problem of universal learning by ERM for binary classification in the agnostic setting, where the "learning curve" reflects the decay of the excess risk as the sample size increases. |
Steve Hanneke; Mingyue Xu;
| 89 | Alternating Regret for Online Convex Optimization. Highlight: We show that this implies an alternating learning dynamic that finds a Nash equilibrium for any convex-concave zero-sum game or a coarse correlated equilibrium for any convex two-player general-sum game at a rate of $\tilde{\mathcal{O}}(d^{\frac{2}{3}}/T^{\frac{2}{3}})$. To further improve the time complexity and/or the dimension dependence, we propose another simple algorithm, Follow-the-Regularized-Leader with a regularizer whose convex conjugate is 3rd-order smooth, for OCO with smooth and self-concordant loss functions (such as linear or quadratic losses). |
Soumita Hait; Ping Li; Haipeng Luo; Mengxiao Zhang;
| 90 | Capacity-Constrained Online Learning with Delays: Scheduling Frameworks and Regret Trade-offs. Highlight: We study online learning with oblivious losses and delays under a novel “capacity constraint” that limits how many past rounds can be tracked simultaneously for delayed feedback. |
Alexander Ryabchenko; Idan Attias; Daniel M. Roy;
| 91 | From Fairness to Infinity: Outcome-Indistinguishable (Omni)Prediction in Evolving Graphs. Highlight: This is of independent interest; for example, we obtain efficient outcome indistinguishability for some interesting infinite collections of tests, as well as for any bounded function — including those computable by deep (graph) neural networks. We apply these techniques to evolving graphs by designing efficient kernel functions that capture socially meaningful features of nodes and their neighborhoods. |
Cynthia Dwork; Chris Hays; Nicole Immorlica; Juan C. Perdomo; Pranay Tankala;
| 92 | Is A Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of The Base Model in Exploration. Highlight: To better understand how to leverage access to powerful pre-trained generative models to improve the efficiency of exploration, we introduce a new computational framework for RL with language models, in which the learner interacts with the model through a sampling oracle. |
Dylan J Foster; Zakaria Mhammedi; Dhruv Rohatgi;
| 93 | Non-Monetary Mechanism Design Without Distributional Information: Using Scarce Audits Wisely (Extended Abstract). Highlight: To do so, we impose future punishments and introduce a \emph{flagging} component, allowing agents to flag any biased estimate (we show that doing so aligns with individual incentives). |
Yan Dai; Moïse Blanchard; Patrick Jaillet;
| 94 | Solving Convex-Concave Problems with $\mathcal{O}(\epsilon^{-4/7})$ Second-Order Oracle Complexity. Highlight: In this work, we show an improved upper bound of $\tilde{\mathcal{O}}(\epsilon^{-4/7})$ by generalizing the optimal second-order method for convex optimization to solve the convex-concave minimax problem. |
Lesi Chen; Chengchang Liu; Luo Luo; Jingzhao Zhang;
| 95 | A Gap Between The Gaussian RKHS and Neural Networks: An Infinite-Center Asymptotic Analysis. Highlight: We establish the following fundamental result: Certain functions that lie in the Gaussian RKHS have infinite norm in the neural network Banach space. |
Akash Kumar; Rahul Parhi; Mikhail Belkin;
| 96 | Quantifying Overfitting Along The Regularization Path for Two-Part-Code MDL in Supervised Classification. Highlight: We provide a complete characterization of the entire regularization curve of a modified two-part-code Minimum Description Length (MDL) learning rule for binary classification, based on an arbitrary prior or description language. |
Xiaohan Zhu; Nathan Srebro;
| 97 | Data Selection for ERMs. Highlight: In this paper, we adopt a complementary data-centric perspective, whereby we fix a natural learning rule and focus on optimizing the training data. |
Steve Hanneke; Shay Moran; Alexander Shlimovich; Amir Yehudayoff;
| 98 | Open Problem: Data Selection for Regression Tasks. Highlight: This note proposes a set of open problems concerning data selection in regression tasks. |
Steve Hanneke; Shay Moran; Alexander Shlimovich; Amir Yehudayoff;
| 99 | Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries. Highlight: In this paper, we study the online shortest path problem in directed acyclic graphs (DAGs) under bandit feedback against an adaptive adversary. |
Arnab Maiti; Zhiyuan Fan; Kevin Jamieson; Lillian J. Ratliff; Gabriele Farina;
| 100 | Optimistic Q-learning for Average Reward and Episodic Reinforcement Learning (Extended Abstract). Highlight: We present a simple, optimistic Q-learning algorithm for regret minimization in a tabular RL setting that encompasses \textit{both average reward and episodic} settings. |
Priyank Agrawal; Shipra Agrawal;
| 101 | Optimal Robust Estimation Under Local and Global Corruptions: Stronger Adversary and Smaller Error. Highlight: Perhaps surprisingly, we show that information-theoretically optimal error can indeed be achieved in polynomial time, under an even stronger local perturbation model (the sliced-Wasserstein metric as opposed to the Wasserstein metric). |
Thanasis Pittas; Ankit Pensia;
| 102 | Regret Bounds for Robust Online Decision Making. Highlight: We propose a framework which generalizes “decision making with structured observations” from Foster et al. (2023) by allowing \emph{robust} (i.e. multivalued) models. |
Alexander Appel; Vanessa Kosoy;
| 103 | Improved Algorithms for Learning Quantum Hamiltonians, Via Flat Polynomials. Highlight: Our main technical contribution is a new flat polynomial approximation to the exponential function, with significantly lower degree than the flat polynomial approximation used in Bakshi et al. |
Shyam Narayanan;
| 104 | Differentially Private Synthetic Graphs Preserving Triangle-Motif Cuts. Highlight: We study the problem of releasing a differentially private (DP) synthetic graph $G'$ that well approximates the triangle-motif sizes of all cuts of any given graph $G$, where a motif in general refers to a frequently occurring subgraph within complex networks. |
Pan Peng; Hangyu Xu;
| 105 | Spherical Dimension. Highlight: We introduce and study the \emph{spherical dimension}, a natural topological relaxation of the VC dimension that unifies several results in learning theory where topology plays a key role in the proofs. |
Bogdan Chornomaz; Shay Moran; Tom Waknine;
| 106 | On The Query Complexity of Sampling from Non-log-concave Distributions (Extended Abstract). Highlight: Our results are in sharp contrast with the recent work of Huang et al. (COLT’24), where an algorithm with quasi-polynomial query complexity was proposed for sampling from a non-log-concave distribution when $M=\mathrm{poly}(d)$. |
Yuchen He; Chihao Zhang;
| 107 | Algorithms for Sparse LPN and LSPN Against Low-noise (Extended Abstract). Highlight: Our main contribution is a new algorithmic framework that provides learning algorithms against low noise for both the Learning Sparse Parities (LSPN) problem and the sparse LPN problem. |
Xue Chen; Wenxuan Shu; Zhaienhe Zhou;
| 108 | Information-theoretic Reduction of Deep Neural Networks to Linear Models in The Overparametrized Proportional Regime. Highlight: We prove an information-theoretic equivalence between the Bayesian deep neural network model trained on data generated by a teacher with matching architecture, and a simpler model of optimal inference in a generalized linear model. |
Francesco Camilli; Daria Tieplova; Eleonora Bergamin; Jean Barbier;
| 109 | Optimal Graph Reconstruction By Counting Connected Components in Induced Subgraphs. Highlight: In this paper, we propose a new query model regarding the number of connected components, which is one of the most basic and fundamental graph parameters. |
Hadley Black; Arya Mazumdar; Barna Saha; Yinzhan Xu;
| 110 | Faster Algorithms for Agnostically Learning Disjunctions and Their Implications. Highlight: This complexity bound is known to be nearly best possible within the class of Correlational Statistical Query (CSQ) algorithms. In this work, we develop an agnostic learner for this concept class with complexity $2^{\tilde{O}(n^{1/3})}$. |
Ilias Diakonikolas; Daniel M. Kane; Lisheng Ren;
| 111 | How to Safely Discard Features Based on Aggregate SHAP Values. Highlight: Recently, SHAP has been increasingly used for \textit{global} insights: practitioners average the absolute SHAP values over many data points to compute global feature importance scores, which are then used to discard “unimportant” features. In this work, we investigate the soundness of this practice by asking whether small aggregate SHAP values necessarily imply that the corresponding feature does not affect the function. |
Robi Bhattacharjee; Karolin Frohnapfel; Ulrike von Luxburg;
| 112 | Bayes Correlated Equilibria, No-regret Dynamics in Bayesian Games, and The Price of Anarchy. Highlight: In this paper, we identify a natural extension of correlated equilibria that can be computed efficiently and is guaranteed to have bounds on the price of anarchy in various games. |
Kaito Fujii;
| 113 | What Makes Treatment Effects Identifiable? Characterizations and Estimators Beyond Unconfoundedness (Extended Abstract). Highlight: In this paper, we initiate the study of general conditions that enable the \emph{identification} of the average treatment effect, extending beyond unconfoundedness and overlap. |
Yang Cai; Alkis Kalavasis; Katerina Mamali; Anay Mehrotra; Manolis Zampetakis;
| 114 | Learning Sparse Generalized Linear Models with Binary Outcomes Via Iterative Hard Thresholding. Highlight: In this work, we propose to use and analyze an iterative hard thresholding (projected gradient descent on the ReLU loss) algorithm, called binary iterative hard thresholding (BIHT), for parameter estimation in sparse GLMs with binary outcomes. |
Namiko Matsumoto; Arya Mazumdar;
| 115 | Generalization Error Bound for Denoising Score Matching Under Relaxed Manifold Assumption. Highlight: We examine theoretical properties of the denoising score matching estimate. |
Konstantin Yakovlev; Nikita Puchkin;
| 116 | An Uncertainty Principle for Linear Recurrent Neural Networks. Highlight: We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability for stable and effective long-range modeling. In this paper, we aim at characterizing this ability on the simple but core copy task, whose goal is to build a linear filter of order $S$ that approximates the filter that looks $K$ time steps into the past (which we refer to as the shift-$K$ filter), where $K$ is larger than $S$. |
Alexandre François; Antonio Orvieto; Francis Bach;
| 117 | A Distributional-Lifting Theorem for PAC Learning. Highlight: Distributional assumptions facilitate the design of efficient algorithms but also limit their reach and relevance. Towards addressing this, we prove a {\sl distributional-lifting theorem}: this upgrades a learner that succeeds with respect to a limited distribution family $\mathcal{D}$ to one that succeeds with respect to {\sl any} distribution $D^\star$, with an efficiency overhead that scales with the complexity of expressing $D^\star$ as a mixture of distributions in $\mathcal{D}$. |
Guy Blanc; Jane Lange; Carmen Strassle; Li-Yang Tan;
| 118 | Existence of Adversarial Examples for Random Convolutional Networks Via Isoperimetric Inequalities on $\mathbb{SO}(d)$. Abstract: We show that adversarial examples exist for various random convolutional networks, and furthermore, that this is a relatively simple consequence of the isoperimetric inequality on … |
Amit Daniely;
| 119 | The Fundamental Limits of Recovering Planted Subgraphs (Extended Abstract). Highlight: Our focus in this work is to understand the minimum mean-squared error (MMSE) in terms of recovering the edges of $H^*$, as a function of $p$ and $H$. |
Daniel Z. Lee; Francisco Pernice; Amit Rajaraman; Ilias Zadik;
| 120 | Learning Partitions with Optimal Query and Round Complexities. Highlight: In many of these applications, it is of paramount interest to reduce adaptivity, a.k.a. the number of rounds, while minimizing the query complexity. In this paper, we give a complete characterization of the deterministic query complexity of this problem as a function of the number of rounds, $r$, which interpolates smoothly between the non-adaptive and adaptive settings: for any constant $r \geq 1$, the query complexity is $\smash{\Theta(n^{1+\frac{1}{2^r-1}}k^{1-\frac{1}{2^r-1}})}$. |
Hadley Black; Arya Mazumdar; Barna Saha;
| 121 | Noisy Group Testing in The Linear Regime: Exact Thresholds and Efficient Algorithms. Highlight: Thus, we completely solve the problem of binary noisy group testing in the studied setting. |
Lukas Hintze; Lena Krieg; Olga Scheftelowitsch; Haodong Zhu;
| 122 | Complexity of Injectivity and Verification of ReLU Neural Networks (Extended Abstract). Highlight: On the positive side, we show that the problem for a single ReLU layer is still tractable for small input dimension; more precisely, we present a parameterized algorithm which yields fixed-parameter tractability with respect to the input dimension. |
Vincent Froese; Moritz Grillo; Martin Skutella;
| 123 | Open Problem: Fixed-Parameter Tractability of Zonotope Problems. Abstract: Neural networks with ReLU activation play a key role in modern machine learning. Understanding the functions represented by ReLU networks is a major topic in current research. … |
Vincent Froese; Moritz Grillo; Christoph Hertrich; Martin Skutella;
| 124 | Faster Acceleration for Steepest Descent. Highlight: In this paper, we propose a new accelerated first-order method for convex optimization under non-Euclidean smoothness assumptions. |
Cedar Site Bai; Brian Bullins;
| 125 | Testing (Conditional) Mutual Information (Extended Abstract). Highlight: For the special case of mutual information testing (when $B$ is trivial), we establish the necessary and sufficient number of samples required up to polylogarithmic terms. Our technical contributions include a novel method to efficiently simulate weakly correlated samples from the conditionally independent distribution $P_{A|B} P_{C|B} P_B$ given access to samples from an unknown distribution $P_{ABC}$, and a new estimator for equivalence testing that can handle such correlated samples, which might be of independent interest. |
Jan Seyfried; Sayantan Sen; Marco Tomamichel;
| 126 | Detecting Arbitrary Planted Subgraphs in Random Graphs. Highlight: However, prior work has largely focused on specific ad hoc planted structures and inferential settings, while a general theory has remained elusive. In this paper, we bridge this gap by investigating the detection of an \emph{arbitrary} planted subgraph $\Gamma = \Gamma_n$ in an Erdős-Rényi random graph $\mathcal{G}(n, q_n)$, where the edge probability within $\Gamma$ is $p_n$. |
Dor Elimelech; Wasim Huleihel;
| 127 | Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery. Highlight: In this paper, we focus on recovering the subspace spanned by the signals via spectral estimators – a family of methods routinely used in practice, often as a warm-start for iterative algorithms. |
Filip Kovacevic; Yihan Zhang; Marco Mondelli;
| 128 | Depth Separations in Neural Networks: Separating The Dimension from The Accuracy. Highlight: We prove an exponential separation between depth 2 and depth 3 neural networks, when approximating an $\mathcal{O}(1)$-Lipschitz target function to constant accuracy, with respect to a distribution with support in the unit ball, under the mild assumption that the weights of the depth 2 network are exponentially bounded. |
Itay Safran; Daniel Reichman; Paul Valiant;
| 129 | Improved Offline Contextual Bandits with Second-Order Bounds: Betting and Freezing. Highlight: We consider off-policy selection and learning in contextual bandits, where the learner aims to select or train a reward-maximizing policy using data collected by a fixed behavior policy. |
J. Jon Ryu; Jeongyeol Kwon; Benjamin Koppe; Kwang-Sung Jun;
| 130 | Computational Equivalence of Spiked Covariance and Spiked Wigner Models Via Gram-Schmidt Perturbation. Highlight: In this work, we show the first average-case reduction transforming the sparse Spiked Covariance Model into the sparse Spiked Wigner Model and, as a consequence, obtain the first computational equivalence result between two well-studied high-dimensional statistics models. |
Guy Bresler; Alina Harbuzova;
| 131 | Span-Agnostic Optimal Sample Complexity and Oracle Inequalities for Average-Reward RL. Highlight: The minimax optimal span-based complexity of $\widetilde{O}(SAH/\varepsilon^2)$, where $H$ is the span of the optimal bias function, has only been achievable with prior knowledge of the value of $H$. Prior-knowledge-free algorithms have been the objective of intensive research, but several natural approaches provably fail to achieve this goal. |
Matthew Zurek; Yudong Chen;
| 132 | Non-Euclidean High-Order Smooth Convex Optimization (Extended Abstract). Highlight: We develop algorithms for the optimization of convex objectives that have Hölder continuous $q$-th derivatives by using a $q$-th order oracle, for any $q \geq 1$. |
Juan Pablo Contreras; Cristóbal Guzmán; David Martínez-Rubio;
| 133 | Simplifying Adversarially Robust PAC Learning With Tolerance. Highlight: We prove, for the first time, the existence of a simpler learner that achieves a sample complexity linear in the VC-dimension without requiring additional assumptions on $\mathcal{H}$. |
Hassan Ashtiani; Vinayak Pathak; Ruth Urner;
| 134 | Tight Bounds for Noisy Computation of High-Influence Functions, Connectivity, and Threshold. Highlight: In this paper, we obtain tight bounds on the noisy query complexity of several fundamental problems. |
Yuzhou Gu; Xin Li; Yinzhan Xu;
| 135 | Learning Intersections of Two Margin Halfspaces Under Factorizable Distributions. Highlight: In this work, we introduce a novel algorithm that provably circumvents the CSQ hardness barrier. |
Ilias Diakonikolas; Mingchen Ma; Lisheng Ren; Christos Tzamos;
| 136 | Are All Models Wrong? Fundamental Limits in Distribution-free Empirical Model Falsification Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: Particularly in the setting of interpolation learning where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible—or are we unable to detect that “all models are wrong”? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and examine its implications on specific model classes such as tree-based methods and linear regression. |
Manuel M. M�ller; Yuetian Luo; Rina Foygel Barber; |
| 137 | Some Easy Optimization Problems Have The Overlap-gap Property Related Papers Related Patents Related Grants Related Venues Related Experts View Save Highlight: We show that the shortest $s$-$t$ path problem has the overlap-gap property in (i) sparse $\mathbb{G}(n,p)$ graphs and (ii) complete graphs with i.i.d. Exponential edge weights. |
Shuangping Li; Tselil Schramm; |
| 138 | Computable Learning of Natural Hypothesis Classes Highlight: We use the on-a-cone machinery from computability theory to prove that, under certain assumptions on the hypothesis class, any “natural” hypothesis class which is learnable must be computably learnable. |
Syed Akbari; Matthew Harrison-Trainor; |
| 139 | Deterministic Apple Tasting Highlight: In this work, we provide the first widely-applicable deterministic apple tasting learner, and show that in the realizable case, a hypothesis class is learnable if and only if it is deterministically learnable, confirming a conjecture of Raman, Subedi, Raman, and Tewari (2024). |
Zachary Chase; Idan Mehalel; |
| 140 | Robust Algorithms for Recovering Planted $r$-Colorable Graphs Highlight: A natural question that arises then is: what other planted structures can be efficiently recovered? In this work, we investigate this question by considering random planted and semirandom models for the $r$-coloring problem. |
Anand Louis; Rameesh Paul; Prasad Raghavendra; |
| 141 | Identifiability and Estimation in High-Dimensional Nonparametric Latent Structure Models Highlight: We introduce an identifiability theorem that generalizes existing conditions, establishing a unified framework applicable to diverse statistical settings. |
Yichen Lyu; Pengkun Yang; |
| 142 | Open Problem: Optimal Instance-Dependent Sample Complexity for Finding Nash Equilibrium in Two Player Zero-Sum Matrix Games Highlight: However, the analogous question in the setting of two-player zero-sum matrix games, where the payoff matrix can only be accessed through noisy samples, remains largely unexplored despite being a natural generalization of the multi-armed bandit problem. In this write-up, we pose a simple open question: What is the optimal instance-dependent sample complexity to find an approximate Nash equilibrium in two-player zero-sum matrix games? |
Arnab Maiti; |
| 143 | Open Problem: Regret Minimization in Heavy-Tailed Bandits with Unknown Distributional Parameters Abstract: The heavy-tailed bandit problem (Bubeck et al., 2013) is a variant of the stochastic multi-armed bandit problem where the reward distributions have finite absolute raw moments of … |
Gianmarco Genalti; Alberto Maria Metelli; |
| 144 | The Planted Spanning Tree Problems: Exact Overlap Characterization Via Local Weak Convergence (Extended Abstract) Highlight: We study the problem of detecting and recovering a planted spanning tree $M_n^*$ hidden within a complete, randomly weighted graph $G_n$. |
Mehrdad Moharrami; Cristopher Moore; Jiaming Xu; |
| 145 | Data-dependent Bounds with $T$-Optimal Best-of-Both-Worlds Guarantees in Multi-Armed Bandits Using Stability-Penalty Matching Highlight: Existing data-dependent and best-of-both-worlds regret bounds for multi-armed bandit problems have limited adaptivity as they are either data-dependent but not best-of-both-worlds (BOBW), BOBW but not data-dependent, or have a sub-optimal $O(\sqrt{T\ln{T}})$ worst-case guarantee in the adversarial regime. To overcome these limitations, we propose real-time stability-penalty matching (SPM), a new method for obtaining regret bounds that are simultaneously data-dependent, best-of-both-worlds and $T$-optimal for multi-armed bandit problems. |
Quan Nguyen; Shinji Ito; Junpei Komiyama; Nishant Mehta;
| 146 | Logarithmic Width Suffices for Robust Memorization Highlight: In this paper, we consider the natural question of how well feedforward ReLU neural networks can memorize \emph{robustly}, namely while being able to withstand adversarial perturbations of a given radius. |
Amitsour Egosi; Gilad Yehudai; Ohad Shamir; |
| 147 | Non-convex Matrix Sensing: Breaking The Quadratic Rank Barrier in The Sample Complexity (Extended Abstract) Highlight: In contrast, while non-convex approaches are computationally less expensive, existing recovery guarantees assume that the number of samples scales at least quadratically with the rank $r$ of the ground-truth matrix. In this paper, we close this gap by showing that the non-convex approaches can be as efficient as nuclear norm minimization in terms of sample complexity. |
Dominik Stöger; Yizhe Zhu;
| 148 | Decision Making in Hybrid Environments: A Model Aggregation Approach Highlight: For the hybrid regime where the dynamics of the world are fixed while the reward arbitrarily changes, they only give pessimistic bounds on the decision complexity. In this work, we propose a general extension of DEC that more precisely characterizes this case. |
Haolin Liu; Chen-Yu Wei; Julian Zimmert;
| 149 | Metric Clustering and Graph Optimization Problems Using Weak Comparison Oracles Highlight: We adopt oracle-based methods as defined by Galhotra et al. (2024), focusing on two types of oracles: the quadruplet oracle, a weak and inexpensive comparator that answers binary queries of the form "Is A closer to B or C closer to D?" |
Rahul Raychaudhury; Wen-Zhi Li; Syamantak Das; Sainyam Galhotra; Stavros Sintos; |
| 150 | Efficiently Learning and Sampling Multimodal Distributions with Data-based Initialization Highlight: We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure. |
Frederic Koehler; Holden Lee; Thuy-Duong Vuong; |
| 151 | Learning DNF Through Generalized Fourier Representations Highlight: Lower bounds are presented to show that such constraints are necessary. Combining these contributions, the paper shows learnability of DNF with membership queries under difference-bounded tree BN. |
Mohsen Heidari; Roni Khardon; |
| 152 | Thompson Sampling for Bandit Convex Optimisation Highlight: For general high-dimensional problems we show that Thompson sampling can fail catastrophically. |
Alireza Bakhtiari; Tor Lattimore; Csaba Szepesvári;
| 153 | Towards Fundamental Limits for Active Multi-distribution Learning Highlight: In this paper, we develop new algorithms for active multi-distribution learning and establish improved label complexity upper and lower bounds, in distribution-dependent and distribution-free settings. |
Chicheng Zhang; Yihan Zhou; |
| 154 | New Lower Bounds for Non-Convex Stochastic Optimization Through Divergence Decomposition Highlight: We study fundamental limits of first-order stochastic optimization in a range of non-convex settings, including $L$-smooth functions satisfying Quasar-Convexity (QC), Quadratic Growth (QG), and Restricted Secant Inequalities (RSI). |
El Mehdi Saad; Wei-Cheng Lee; Francesco Orabona; |
| 155 | The Space Complexity of Learning-Unlearning Algorithms (Extended Abstract) Highlight: In this paper, we ask how many bits of storage are needed to be able to delete certain training samples at a later time. |
Yeshwanth Cherapanamjeri; Sumegha Garg; Nived Rajaraman; Ayush Sekhari; Abhishek Shetty;
| 156 | Testing Thresholds and Spectral Properties of High-Dimensional Random Toroidal Graphs Via Edgeworth-Style Expansions Highlight: The main reason for this is that RGGs under $L_q$-distances cannot easily be represented as the logical “AND” of their 1-dimensional counterparts, as is the case for $L_\infty$ geometries. To overcome this difficulty, we devise a novel technique for quantifying the dependence between edges based on a modified version of Edgeworth expansions. |
Samuel Baguley; Andreas Göbel; Marcus Pappik; Leon Schiller;
| 157 | Towards Fair Representation: Clustering and Consensus Highlight: Consensus clustering, a fundamental task in machine learning and data analysis, aims to aggregate multiple input clusterings of a dataset, potentially based on different non-sensitive attributes, into a single clustering that best represents the collective structure of the data. In this work, we study this fundamental problem through the lens of fair clustering, as introduced by Chierichetti et al. [NeurIPS’17], which incorporates the disparate impact doctrine to ensure proportional representation of each protected group in the dataset within every cluster. |
Diptarka Chakraborty; Kushagra Chatterjee; Debarati Das; Tien Long Nguyen; Romina Nobahari; |
| 158 | Optimization, Isoperimetric Inequalities, and Sampling Via Lyapunov Potentials Highlight: In this paper, we prove that optimizability of any function $F$ using Gradient Flow from all initializations implies a Poincaré Inequality for Gibbs measures $\mu_{\beta}\propto e^{-\beta F}$ at low temperature. |
August Y Chen; Karthik Sridharan; |
| 159 | Universality of High-Dimensional Logistic Regression and A Novel CGMT Under Dependence with Applications to Data Augmentation Highlight: However, these results rely on the assumption that the data consists of independent random vectors, an assumption that significantly limits its applicability to many practical setups. In this paper, we address this limitation by generalizing both results to the dependent setting. |
Matthew Esmaili Mallory; Kevin Han Huang; Morgane Austern; |
| 160 | “All-Something-Nothing” Phase Transitions in Planted $k$-Factor Recovery (Extended Abstract) Highlight: This paper studies the problem of inferring a $k$-factor, specifically a spanning $k$-regular graph, planted within an Erdős–Rényi random graph $\mathcal{G}(n,\lambda/n)$. |
Julia Gaudio; Colin Sandon; Jiaming Xu; Dana Yang; |
| 161 | Optimal Differentially Private Sampling of Unbounded Gaussians Highlight: We provide the first $\widetilde{\mathcal{O}}(d)$-sample algorithm for sampling from unbounded Gaussian distributions under the constraint of $(\varepsilon, \delta)$-differential privacy. |
Valentio Iverson; Gautam Kamath; Argyris Mouzakis; |
| 162 | Estimating Stationary Mass, Frequency By Frequency Highlight: Suppose we observe a trajectory of length $n$ from an exponentially $\alpha$-mixing stochastic process over a finite but potentially large state space. We consider the problem of estimating the probability mass placed by the stationary distribution of any such process on elements that occur with a certain frequency in the observed sequence. |
Milind Nakul; Vidya Muthukumar; Ashwin Pananjady; |
| 163 | Better Private Distribution Testing By Leveraging Unverified Auxiliary Data Highlight: We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. |
Maryam Aliakbarpour; Arnav Burudgunte; Clément Canonne; Ronitt Rubinfeld;
| 164 | Proofs As Explanations: Short Certificates for Reliable Predictions Highlight: So, a set $S'$ of size $d+1$ could be released as an explanation for a positive prediction, and would serve as a short proof of correctness of the prediction under the assumption of perfect realizability. In this work, we consider this problem more generally, for general hypothesis classes $\mathcal{H}$ and general values $b\geq 0$. |
Avrim Blum; Steve Hanneke; Chirag Pabbaraju; Donya Saless; |
| 165 | Market Making Without Regret Highlight: Our main technical contribution is a lower bound for the i.i.d. case with Lipschitz distributions and independence between market prices and takers’ valuations. |
Nicolò Cesa-Bianchi; Tommaso Cesari; Roberto Colomboni; Luigi Foscari; Vinayak Pathak;
| 166 | Learning Augmented Graph $k$-Clustering Highlight: Previous research has focused on learning-augmented $k$-means in Euclidean metrics, limiting its applicability to complex data representations. In this paper, we generalize learning-augmented $k$-clustering to operate on general metrics, enabling its application to graph-structured and non-Euclidean domains. |
Chenglin Fan; Kijun Shin; |
| 167 | Blackwell’s Approachability with Approximation Algorithms Highlight: We revisit Blackwell’s celebrated approachability problem which considers a repeated vector-valued game between a player and an adversary. |
Dan Garber; Mhna Massalha;
| 168 | Model Predictive Control Is Almost Optimal for Restless Bandits Highlight: We propose a model predictive control based non-stationary policy with a rolling computational horizon $\tau$. |
Nicolas Gast; Dheeraj Narasimha; |
| 169 | Compression Barriers in Autoregressive Transformers Highlight: This raises the following natural question: Can truly sublinear space utilization be achieved without such assumptions? In this work, we answer this question in the negative. |
Themistoklis Haris; Krzysztof Onak; |
| 170 | Improved Margin Generalization Bounds for Voting Classifiers Highlight: In this paper we establish a new margin-based generalization bound for voting classifiers, refining existing results and yielding tighter generalization guarantees for widely used boosting algorithms such as AdaBoost (Freund and Schapire, 1997). |
Mikael Møller Høgsgaard; Kasper Green Larsen;
| 171 | Local Regularizers Are Not Transductive Learners Highlight: We outline challenges in extending our negative result to the PAC model, leaving open the tantalizing possibility of a PAC/transductive separation with respect to local regularization. |
Sky Jafar; Julian Asilis; Shaddin Dughmi; |
| 172 | Learning Constant-Depth Circuits in Malicious Noise Models Highlight: The seminal work of Linial, Mansour, and Nisan gave a quasipolynomial-time algorithm for learning constant-depth circuits ($\mathsf{AC}^0$) with respect to the uniform distribution on the hypercube. |
Adam Klivans; Konstantinos Stavropoulos; Arsen Vasilyan; |
| 173 | Sharper Bounds for Chebyshev Moment Matching, with Applications Highlight: We study the problem of approximately recovering a probability distribution given noisy measurements of its Chebyshev polynomial moments. |
Cameron Musco; Christopher Musco; Lucas Rosenblatt; Apoorv Vikram Singh; |
| 174 | On The Hardness of Bandit Learning Highlight: We study the task of bandit learning, also known as best-arm identification, under the assumption that the true reward function $f$ belongs to a known, but arbitrary, function class $\mathcal{F}$. |
Nataly Brukhim; Aldo Pacchiano; Miroslav Dudik; Robert Schapire; |
| 175 | Recovering Labels from Crowdsourced Data: An Optimal and Polynomial-Time Method Highlight: In this work, we study a permutation-based model where each worker $i$ has an ability $M_{ik}$ to recover a binary label $x_k^*\in\{-1,1\}$ for task $k$. |
Emmanuel Pilliat; |
| 176 | Lower Bounds for Private Estimation of Gaussian Covariance Matrices Under All Reasonable Parameter Regimes Highlight: We prove lower bounds on the number of samples needed to privately estimate the covariance matrix of a Gaussian distribution. |
Victor S. Portella; Nicholas J. A. Harvey; |
| 177 | The Late-stage Training Dynamics of (stochastic) Subgradient Descent on Homogeneous Neural Networks Highlight: We analyze the implicit bias of constant step stochastic subgradient descent (SGD). |
Sholom Schechtman; Nicolas Schreuder; |
| 178 | Optimal Online Bookmaking for Any Number of Outcomes Highlight: We develop an efficient algorithm that computes the optimal bookmaking strategy: when facing an optimal gambler, the algorithm achieves the optimal loss, and in rounds where the gambler is suboptimal, it reduces the achieved loss to the \emph{optimal opportunistic} loss, a notion that is related to subgame perfect Nash equilibrium. |
Hadar Tal; Oron Sabag; |
| 179 | Beyond Propagation of Chaos: A Stochastic Algorithm for Mean Field Optimization Highlight: In this work, we study the virtual particle stochastic approximation, originally introduced for Stein Variational Gradient Descent. |
Chandan Tankala; Dheeraj Nagaraj; Anant Raj; |
| 180 | Improved Algorithms for Effective Resistance Computation on Graphs Highlight: In this paper, we address the problem of efficiently approximating ER on a graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$ with $n$ vertices and $m$ edges. |
Yichun Yang; Rong-Hua Li; Meihao Liao; Guoren Wang;
| 181 | Linear Bandits on Ellipsoids: Minimax Optimal Algorithms Highlight: We consider linear stochastic bandits where the set of actions is an ellipsoid. We provide the first known minimax optimal algorithm for this problem. |
Raymond Zhang; Hédi Hadiji; Richard Combes;
| 182 | The Adaptive Complexity of Finding A Stationary Point Highlight: In this work, we study the adaptive complexity of finding a stationary point, which is the minimal number of sequential rounds required to achieve stationarity given polynomially many queries executed in parallel at each round. |
Huanjian Zhou; Andi Han; Akiko Takeda; Masashi Sugiyama;